
Data Scientist, Data Analyst, Business Analyst, and Data Engineer. These are the typical roles in the data science world. Do you know which position you want? Will this activity change your intentions for the future? Did you guess right?
A data science career may be right for you if you are a methodical person and like to solve problems.
You are considering a data science career, so you want to know: what the heck do data scientists do anyway? As with most questions, it depends on the context, but I lay out some possibilities.
A data science career may be right for you if you have the data science skills mentioned in the lecture, or if you are willing to work on developing these skills. For example, does statistics seem interesting to you even though you are not currently skilled at it? Great. Passion and willingness to learn are all that matter.
A data science career may be in your future, but you're not sure where to start. A common question for newcomers is what the best language for data science actually is. My answers depend on your specific context.
If I didn't know anything about you or your plans, the best language for data science would be Python. It is the ONLY programming language for data science that I can suggest to students no matter their particular plans or location around the world. Find out why in this lecture.
I have taught SAS programming for many years. One thing that still surprises me is how rarely SAS is mentioned as a candidate for the best language for data science. Discover my argument for SAS in this lecture.
R is a very popular language for data science and data analysis in academic settings.
SQL is a must-learn.
An introduction to data science methodology.
The first and arguably the most important aspect of data science methodology is business understanding.
The second aspect of data science methodology is data understanding.
The third part of data science methodology is data preparation. It is often seen as unglamorous work, but it is nevertheless important.
Modeling is one of the more fun aspects of data science methodology.
Evaluation of model performance is an essential part of data science methodology.
An often overlooked aspect of the data science process is deployment.
Here you can download the Jupyter Notebooks and Datasets used in the course.
Welcome! Nice to have you. I'm certain that by the end you will have learned a lot and earned a valuable skill. You can think of the course as comprising three parts, and I present the material in each part differently. For example, in the last section, the essential math for data science is presented almost entirely via whiteboard presentation.
The opening section of Data Science 101 examines common questions asked by passionate learners like you, such as what data scientists actually do and what the best language for data science is. It also addresses common terms (big data, data mining) and compares related terms such as machine learning vs. deep learning.
Following that, you will explore data science methodology via a Healthcare Insurance case study. You will see the typical data science steps and techniques utilized by data professionals. You might be surprised to hear that roles other than data scientist do exist. Next, if machine learning and natural language processing are of interest, we will build a simple chatbot so you can get a clear sense of what is involved. One day you might be building such systems.
The following section is an introduction to Data Science in Python. You will have an opportunity to master Python for data science, as each section is followed by an assignment that allows you to practice your skills. By the end of the section, you will understand Python fundamentals, decision and looping structures, Python functions, how to work with nested data, and list comprehensions. The final part will show you how to use the two most popular libraries for data science, NumPy and pandas.
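To give a flavor of the Python topics listed above, here is a minimal sketch that combines a function, nested data, and comprehensions. The patient records, field names, and the `total_claims` helper are made up for illustration and are not taken from the course notebooks.

```python
# Hypothetical nested data: a list of dictionaries, each holding a list of claims.
# All names and amounts below are invented for this example.
patients = [
    {"name": "Ana", "age": 34, "claims": [120.0, 80.5]},
    {"name": "Ben", "age": 51, "claims": []},
    {"name": "Cleo", "age": 47, "claims": [300.0]},
]


def total_claims(record):
    """Return the sum of all claim amounts in one patient record."""
    return sum(record["claims"])


# List comprehension with a condition: names of patients with at least one claim.
with_claims = [p["name"] for p in patients if p["claims"]]

# Nested comprehension: flatten every claim amount into a single list.
all_claims = [amount for p in patients for amount in p["claims"]]

# Dictionary comprehension that reuses the function defined above.
totals = {p["name"]: total_claims(p) for p in patients}

print(with_claims)
print(all_claims)
print(totals)
```

Each comprehension replaces a short explicit loop, which is why they show up so often in data preparation code.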
The final section delves into essential math for data science. You will get the hang of linear algebra for data science, along with probability and statistics. My goal for the linear algebra part was to introduce all the necessary concepts and intuition so that you can understand least squares, a frequently used technique for fitting data. I also wanted to spend a lot of time on probability, both classical and Bayesian, as reasoning about problems is a much more difficult aspect of data science than simply running statistics.
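As a rough illustration of the least squares idea mentioned above, the sketch below fits a straight line y = slope * x + intercept to a handful of points using the standard closed-form solution for simple linear regression. The function name `fit_line` and the sample points are assumptions made for this example; the course itself develops the technique through linear algebra on the whiteboard.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Uses the closed-form solution for a single predictor:
    slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample points that lie exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With noisy data the points no longer sit on one line, and the same formula returns the line that minimizes the sum of squared vertical distances to the points.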
So don't wait: start Data Science 101 and develop modern-day skills. If you do not enjoy the course for any reason, Udemy offers a 30-day money-back guarantee.