Use the R Programming Language to execute data science projects and become a data scientist. Implement business solutions, using machine learning and predictive analytics.
The R language provides a way to tackle day-to-day data science tasks, and this course will teach you how to apply the R programming language and useful statistical techniques to everyday business situations.
With this course, you'll be able to use the visualizations, statistical models, and data manipulation tools that modern data scientists rely upon daily to recognize trends and suggest courses of action.
Understand Data Science to Be a More Effective Data Analyst
●Use R and RStudio
●Master Modeling and Machine Learning
●Load, Visualize, and Interpret Data
Use R to Analyze Data and Come Up with Valuable Business Solutions
This course is designed for those who are analytically minded and are familiar with basic statistics and programming or scripting. Some familiarity with R is strongly recommended; otherwise, you can learn R as you go.
You'll learn applied predictive modeling methods, as well as how to explore and visualize data, how to use and understand common machine learning algorithms in R, and how to relate machine learning methods to business problems.
All of these skills will combine to give you the ability to explore data, ask the right questions, execute predictive models, and communicate your informed recommendations and solutions to company leaders.
Contents and Overview
This course begins with a walk-through of a template data science project before diving into the R statistical programming language.
You will be guided through modeling and machine learning. You'll use machine learning methods to create algorithms for a business, and you'll validate and evaluate models.
You'll learn how to load data into R, and how to interpret and visualize it while dealing with variables and missing values. You’ll be taught how to come to sound conclusions about your data, despite some real-world challenges.
By the end of this course, you'll be a better data analyst because you'll have an understanding of applied predictive modeling methods, and you'll know how to use existing machine learning methods in R. This will allow you to work with team members on a data science project, find problems, and come up with solutions.
You’ll complete this course with the confidence to correctly analyze data from a variety of sources, while sharing conclusions that will make a business more competitive and successful.
The course will teach students how to use existing machine learning methods in R, but will not teach them how to implement these algorithms from scratch. Students should be familiar with basic statistics and basic scripting/programming.
|Section 1: Course Overview|
The course introduction describes what to expect from Introduction to Data Science and helps you decide if the course is for you. The examples (available here: http://winvector.github.io/IntroductionToDataScience/ ) are mostly worked using R and RStudio, which are freely available from http://cran.r-project.org and http://www.rstudio.com . We do require some familiarity with R and statistics (though a later lesson will discuss starting with R and RStudio).
Walk-through of a data science project
Starting with R and data
|Section 2: Modeling and Machine Learning|
Mapping Business to Machine Learning Tasks
|Lecture 6||1 page|
Your feedback is valuable, both for us developing courses and for helping other students pick courses.
Before we move on to the machine learning parts of data science, we ask that you consider adding a fair review of the course. To do this, use the "back to course" link (should be on the top left when viewing the course) and click the "write review" link (should be in the top right corner).
This is just an ask. If you would prefer not to review the course until the end (or at all), we understand.
Naive Bayes: background
Naive Bayes: practice
Linear Regression: background
Linear Regression: practice
Logistic Regression: background
Logistic Regression: practice
Decision Trees and Random Forest: background
Random Forest: practice
Generalized Additive Models
Support Vector Machines
Regularization for Linear and Logistic Regression
|Section 3: Data|
Loading Data in R
The Shape of Data
Dealing with Categorical Variables
Useful Data Transformations
|Section 4: Moving On|
Nina Zumel, PhD, has over 10 years of experience in research, machine learning, and data science. She is a co-author of the popular book Practical Data Science with R, co-author of the EMC data scientist certification program, and blogs often on statistics, data science, and data visualization.
I am a principal at the data science consulting firm Win-Vector LLC. Win-Vector LLC specializes in data science research, implementation, and training. I have over 10 years of experience in research, teaching, machine learning, and data science.
I am co-author of the popular book Practical Data Science with R, and I blog often on mathematics, programming, machine learning, and optimization on the Win-Vector blog.
My professional experience includes managing a data science group for Shopping dot com (an eBay company), working in price optimization for Rapt (acquired by Microsoft), and applying machine learning at web scale for Kosmix (acquired by Walmart online). My original fields of study were mathematics (AB UC Berkeley) and computer science (Ph.D. Carnegie Mellon) with a heavy emphasis on probability theory.