Bite-Sized Data Science with Python and Pandas: Introduction
4.3 (12 ratings)
64 students enrolled
Follow along as we analyze a real-life dataset and learn data science with Python and Pandas
Created by Troy Shu
Last updated 12/2015
English
Includes:
  • 1 hour on-demand video
  • 6 Articles
  • 3 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Manipulate and transform data series and tables in Python
  • Build a multiple regression model in Python
  • Use IPython Notebook for research and analysis in Python
  • Visualize data to glean insights from it, in Python
Requirements
  • Students should have experience writing, at a minimum, basic programs in Python
Description

Learn the basics of data science with Python in this short follow-along course built around a concrete, real-world dataset.

Listening to theoretical examples is never fun, and I've always preferred applying what I learn to concrete problems, so this course is built around analyzing a real-life dataset together. The dataset we'll be using is the "Parkinson's Disease Telemedicine dataset". Our goal is to see whether we can predict the severity of Parkinson's Disease in patients from just a dozen simple measurements, which would be a vast improvement over the time-consuming process that doctors and patients currently go through.

This course introduces several aspects of data science, all in Python, one of the most popular and powerful languages used by data scientists today.

You'll learn how to:

- Set up your data analysis research environment (in an IPython notebook)

- Visualize the data to understand it better

- Manipulate and transform data to prepare it for modeling

- Apply a statistical model to the data
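
As a rough illustration of those steps, here is a minimal sketch in pandas and NumPy. It is not taken from the course materials: the data is synthetic, the column names (`measure_a`, `measure_b`, `severity`) are invented stand-ins for the real dataset's variables, plotting is left out, and plain least squares is used as one possible way to fit a multiple regression.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: the column names and values below are invented
# for illustration and are NOT the real Parkinson's telemedicine dataset.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "measure_a": rng.normal(0.0, 1.0, n),
    "measure_b": rng.normal(5.0, 2.0, n),
})
# Hypothetical severity score: a known linear signal plus a little noise.
df["severity"] = (
    2.0 + 1.5 * df["measure_a"] - 0.5 * df["measure_b"] + rng.normal(0.0, 0.1, n)
)

# 1. Understand the data: summary statistics for every column.
summary = df.describe()

# 2. Manipulate: slice out the predictor columns and the target.
predictors = ["measure_a", "measure_b"]
X = df[predictors].to_numpy()
y = df["severity"].to_numpy()

# 3. Model: ordinary least squares with an explicit intercept column.
X_design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
fitted = pd.Series(coef, index=["intercept"] + predictors)

print(summary.loc[["mean", "std"]])
print(fitted.round(2))
```

Because the synthetic target is built from known coefficients, the fitted values should land close to 2.0, 1.5, and -0.5, which makes it easy to check that the pipeline is wired up correctly before moving to real data.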

The course consists of short lectures that walk you through the data analysis as you follow along. There are also several coding exercises throughout to test your knowledge!

Check out the course to learn data science with Python today!

Who is the target audience?
  • This course is best suited for students who already have a basic understanding of both Python and statistics
  • This course is for students who like learning with real-life, concrete examples, and following along by programming on their own computers
  • This course is for students who want to learn the basics of data manipulation and visualization, and statistical model building in Python
Curriculum For This Course
24 Lectures 01:03:25
Welcome, information about this course
1 Lecture 01:59
Setting up Python and Libraries
5 Lectures 06:41

File and command to install all necessary libraries at once, with pip
00:06

Links to help you install pip
00:13


Our data set: the Parkinson's Telemedicine Dataset
2 Lectures 04:44

A quick explanation of the dataset
02:12
Starting our analysis
2 Lectures 09:31
Starting a new IPython Notebook
05:44

Loading the data into our IPython Notebook
03:47
Manipulating data with pandas, the data analysis library
5 Lectures 14:04
DataFrames are data tables
02:26

Series are single rows or columns of data
04:17

Slicing DataFrames to get the data we need
02:53

Keeping track of the variable names we need
03:57

Coding exercise: summary statistics
00:31
Visualizing the data to understand it better before modeling
3 Lectures 10:08
Looking at the data's distributions with box plots and histograms
06:26

Seeing multicollinearity with a scatter plot matrix
03:22

Coding exercise: a single correlation
00:20
Transforming the data to prepare it for modeling
3 Lectures 09:44
Taking care of multicollinearity
01:55

Log transforming data to take care of skewed distributions
07:31

Coding exercise: practicing apply()
00:18
Modeling the data
1 Lecture 04:41
Applying a multiple regression to answer the ultimate question
04:41
Conclusion
2 Lectures 01:43
Thank you
01:32

Download the data and IPython notebook that were used throughout this course
00:11
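
The data-preparation lectures in the curriculum above mention a single correlation, multicollinearity, log transforms, and apply(). The sketch below shows one way these could fit together. It is an illustration under assumptions, not the course's own code: the data is randomly generated, the column names (`feat_1`, `feat_2`, `feat_skewed`) are hypothetical, and dropping one column of a highly correlated pair at a 0.9 threshold is just one simple choice among several.

```python
import numpy as np
import pandas as pd

# Invented data for illustration only; all column names are hypothetical.
rng = np.random.default_rng(1)
n = 300
base = rng.normal(0.0, 1.0, n)
df = pd.DataFrame({
    "feat_1": base,
    "feat_2": base + rng.normal(0.0, 0.05, n),  # nearly a copy of feat_1
    "feat_skewed": rng.lognormal(0.0, 1.0, n),  # right-skewed, strictly positive
})

# A single correlation between two columns.
r = df["feat_1"].corr(df["feat_2"])

# One simple way to handle a collinear pair: drop one of the two columns.
# The 0.9 cutoff is an arbitrary choice for this sketch.
reduced = df.drop(columns=["feat_2"]) if abs(r) > 0.9 else df

# Log-transform the skewed column with Series.apply().
reduced = reduced.assign(feat_log=reduced["feat_skewed"].apply(np.log))

# Compare skewness before and after the transform.
print(round(r, 3))
print(reduced[["feat_skewed", "feat_log"]].skew().round(2))
```

The log transform only makes sense here because the skewed column is strictly positive; a column containing zeros or negative values would need a different transform or an offset.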
About the Instructor
Troy Shu
4.3 Average rating
12 Reviews
64 Students
1 Course
Founder of Terragon, building data-driven products

Troy Shu has worked on Wall Street, at a startup, and has now started his own company, building lots of data-driven products and doing tons of data analysis in Python along the way.

He currently runs his own consulting business, building data-driven products for other companies. Before that, Troy worked at a lending startup called Bond Street, where he built the company's risk models and developed the "MVP" (minimum viable product) for the automated loan underwriting platform. He has also worked at a hedge fund where he built stock picking algorithms and launched a new hedge fund. Troy double majored in Computer Science and Economics, with concentrations in Statistics and Finance, at the University of Pennsylvania.