2017-03-26 04:59:01

Learn By Example: Statistics and Data Science in R

A gentle yet thorough introduction to Data Science, Statistics and R using real life examples

Bestselling

1,512 students enrolled

Current price: $10
Original price: $50
Discount: 80% off

30-Day Money-Back Guarantee

- 9 hours on-demand video
- 132 Supplemental Resources
- Full lifetime access
- Access on mobile and TV
- Certificate of Completion

What Will I Learn?

- Harness R and R packages to read, process and visualize data
- Understand linear regression and use it confidently to build models
- Understand the intricacies of all the different data structures in R
- Use Linear regression in R to overcome the difficulties of LINEST() in Excel
- Draw inferences from data and support them using tests of significance
- Use descriptive statistics to perform a quick study of some data and present results

Requirements

- No prerequisites: we start from the basics and cover everything you need to know. We will install R and RStudio as part of the course and use them for most of the examples. Excel is used for one of the examples, and basic knowledge of Excel is assumed.

Description

**Taught by** a Stanford-educated ex-Googler and an IIT- and IIM-educated ex-Flipkart lead analyst. This team has decades of practical experience in quant trading, analytics and e-commerce.

**This course is a gentle yet thorough introduction to Data Science, Statistics and R using real life examples.**

Let’s parse that.

**Gentle, yet thorough:** This course does not require a prior quantitative or mathematics background. It starts by introducing basic concepts such as the mean and the median, and eventually covers all aspects of an analytics or data science career, from analysing and preparing raw data to visualising your findings.

**Data Science, Statistics and R:** This course is an introduction to Data Science and Statistics using the R programming language. It covers both the theory behind statistical concepts and their practical implementation in R.

**Real life examples:** Every concept is explained with the help of examples, case studies and source code in R wherever necessary. The examples cover a wide array of topics and range from A/B testing in an Internet company context to the Capital Asset Pricing Model in a quant finance context.

**What's Covered:**

**Data Analysis with R:** Datatypes and Data structures in R, Vectors, Arrays, Matrices, Lists, Data Frames, Reading data from files, Aggregating, Sorting & Merging Data Frames

**Linear Regression:** Regression, Simple Linear Regression in Excel, Simple Linear Regression in R, Multiple Linear Regression in R, Categorical variables in regression, Robust regression, Parsing regression diagnostic plots

**Data Visualization in R:** Line plot, Scatter plot, Bar plot, Histogram, Scatterplot matrix, Heat map, Packages for Data Visualisation: RColorBrewer, ggplot2

**Descriptive Statistics:** Mean, Median, Mode, IQR, Standard Deviation, Frequency Distributions, Histograms, Boxplots

**Inferential Statistics:** Random Variables, Probability Distributions, Uniform Distribution, Normal Distribution, Sampling, Sampling Distribution, Hypothesis testing, Test statistic, Test of significance

**Using discussion forums**

Please use the discussion forums on this course to engage with other students and to help each other out. Unfortunately, much as we would like to, it is not possible for us at Loonycorn to respond to individual questions from students :-(

**We're super small and self-funded, with only 2-3 people developing technical video content. Our mission is to make high-quality courses available at super low prices.**

The only way to keep our prices this low is to ***NOT offer additional technical support over email or in person***. The truth is, direct support is hugely expensive and just does not scale.

We understand that this is not ideal and that a lot of students might benefit from this additional support. Hiring resources for additional support would make our offering much more expensive, thus defeating our original purpose.

**It is a hard trade-off.**

Thank you for your patience and understanding!

Who is the target audience?

- Yep! MBA graduates or business professionals who are looking to move to a heavily quantitative role
- Yep! Engineers who want to understand basic statistics and lay a foundation for a career in Data Science
- Yep! Analytics professionals who have mostly worked in Descriptive analytics and want to make the shift to being modelers or data scientists
- Yep! Folks who've worked mostly with tools like Excel and want to learn how to use R for statistical analysis

Curriculum For This Course

82 Lectures
09:07:16

Introduction
3 Lectures
20:53

Preview
02:32

Q. How do companies make decisions?

A. Using data

We talk about what it takes to go from data to a decision. This sets the agenda for the rest of the course: each step on this journey is covered in the upcoming sections.

Preview
13:11

Get set up with R and RStudio. All the examples that follow in this course will have source code attached. Download and run them in RStudio.

R and RStudio installed

05:10

The 10 second answer : Descriptive Statistics
8 Lectures
49:34

Bosses are impatient. They often want you to cut to the chase and give them an answer that is good enough, quickly. Descriptive statistics are the first place to start: they are often the 10-second answer to any question about the data.

Descriptive Statistics : Mean, Median, Mode

10:07

Computing a frequency distribution using R

Our first foray into R : Frequency Distributions

06:06

A histogram is a good visual summary of your data.

Draw your first plot : A Histogram

03:11

Computing the Mean, Median, Mode in R

Computing Mean, Median, Mode in R

02:21
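
As a rough sketch of the idea (the `scores` data and the `stat_mode` helper are made up for illustration and are not taken from the course material): base R provides `mean()` and `median()`, but its built-in `mode()` reports how a value is stored rather than the most frequent value, so the statistical mode is typically derived from a frequency table.

```r
# Hypothetical sample data, for illustration only
scores <- c(2, 3, 3, 5, 7, 3, 9, 5)

mean_score   <- mean(scores)    # arithmetic mean
median_score <- median(scores)  # middle value of the sorted data

# Hypothetical helper: most frequent value, found via a frequency table.
# If several values tie, this returns the first one in sorted order.
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}
mode_score <- stat_mode(scores)

print(c(mean = mean_score, median = median_score, mode = mode_score))
```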

The mean, median and mode each summarize your data with a single representative value. The IQR, in contrast, measures the spread of the data.

What is IQR (Inter-quartile Range)?

08:08

Visualize the IQR and outliers using box and whisker plots

Box and Whisker Plots

03:11

The standard deviation measures the spread of a dataset and, as it happens, it turns out to be a surprisingly profound quantity.

The Standard Deviation

10:24

Computing IQR and Standard Deviation in R

06:06
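
A minimal sketch of these computations, assuming a small made-up `heights` vector (not course data). It uses base R's `quantile()`, `IQR()` and `sd()`, and applies the common 1.5 × IQR rule that box and whisker plots use to flag outliers.

```r
# Hypothetical measurements, for illustration only
heights <- c(150, 152, 155, 160, 161, 165, 170, 198)

quartiles <- quantile(heights, c(0.25, 0.75))  # first and third quartiles
iqr_value <- IQR(heights)                      # third quartile minus first quartile
sd_value  <- sd(heights)                       # sample standard deviation (n - 1 denominator)

# Boxplot convention: points beyond 1.5 * IQR from the quartiles are outliers
lower_fence <- quartiles[[1]] - 1.5 * iqr_value
upper_fence <- quartiles[[2]] + 1.5 * iqr_value
outliers <- heights[heights < lower_fence | heights > upper_fence]

print(iqr_value)
print(sd_value)
print(outliers)
```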

Inferential Statistics
5 Lectures
45:29

Drawing inferences from data is key to being able to take decisions using data. There is a science to this, whose foundation is in random variables, probability distributions, and performing tests of statistical significance.

Drawing inferences from data

03:25

Random variables are everywhere. Any data that you'll study is a random variable whose behaviour is determined by a probability distribution.

Random Variables are ubiquitous

16:54

The Normal Distribution is arguably the most well-known and commonly seen probability distribution. Its probability density function is fully determined by two parameters: the mean and the standard deviation.

The Normal Probability Distribution

09:31

Sampling is a little like fishing. Sampling is crucial to induction - drawing conclusions about something by looking at some evidence.

Sampling is like fishing

06:14

A sample is described by sample statistics such as the sample mean. The sampling distribution of the sample mean is the probability distribution of that statistic across repeated samples.

Sample Statistics and Sampling Distributions

09:25
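
One way to get a feel for a sampling distribution is to simulate it. The sketch below is an illustration under stated assumptions, not the course's own code: the population, its parameters, the sample size and the number of repetitions are all made up. It draws many samples, records each sample mean, and compares the spread of those means with the standard error formula sd / sqrt(n).

```r
set.seed(42)  # fixed seed so the simulation is repeatable

# Hypothetical population: 100,000 values with mean 50 and sd 10
population <- rnorm(100000, mean = 50, sd = 10)

n <- 25  # size of each sample
# Draw 2,000 samples and keep the mean of each one
sample_means <- replicate(2000, mean(sample(population, n)))

# The sample means cluster around the population mean ...
print(c(mean(sample_means), mean(population)))
# ... and their spread is close to sd(population) / sqrt(n)
print(c(sd(sample_means), sd(population) / sqrt(n)))
```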

Case studies in Inferential Statistics
6 Lectures
01:07:25

Find a point estimate for the average weight of all football players using a sample of football players in 1 college team.

Case Study 1 : Football Players (Estimating Population Mean from a Sample)

06:45

Find a point estimate for the % of voters in favor of a candidate.

Case Study 2 : Election Polling (Estimating Population Proportion from a Sample)

07:50

A test of significance is an important step in building support for your findings and inferences. Here is the first example of a test of significance - is the population mean equal to a given value?

Case Study 3 : A Medical Study (Hypothesis Test for the Population Mean)

13:53
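
As an illustrative sketch only: the `readings` values and the hypothesized mean of 98.6 are invented, and the course may well use a different test (for example a z-test). A one-sample test for the population mean can be run with base R's `t.test()`, and the test statistic can be reproduced by hand.

```r
# Hypothetical measurements, for illustration only
readings <- c(98.2, 98.6, 99.1, 98.4, 98.9, 99.3, 98.7, 98.8, 99.0, 98.5)
mu0 <- 98.6  # hypothesized population mean (an assumption for this example)

# Null hypothesis: population mean equals mu0 (two-sided alternative by default)
result <- t.test(readings, mu = mu0)

# The same t statistic computed by hand:
# (sample mean - hypothesized mean) / (sample sd / sqrt(n))
t_manual <- (mean(readings) - mu0) / (sd(readings) / sqrt(length(readings)))

# Reject the null at the 5% significance level if the p-value is below 0.05
reject_null <- result$p.value < 0.05

print(result$statistic)
print(result$p.value)
print(reject_null)
```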

Perform a test of significance to check whether the population % is equal to a certain value

Case Study 4 : Employee Behavior (Hypothesis Test for the Population Proportion)

09:49

Perform a test of significance to compare 2 population means. The example used is A/B Testing - which is pretty widely used in internet companies to test out product features.

Preview
17:18

Perform a test of significance to compare two population proportions

Preview
11:50

Diving into R
6 Lectures
45:34

The next few sections dive deep into all the data processing, slicing and dicing ability that R provides. The wide variety of R packages available is one reason why R is popular among many data scientists.

Harnessing the power of R

07:26

Let's start with the basics. What are variables and how do we assign variables in R?

Assigning Variables

08:47

print(), show(), message() and cat() are different ways to print something to the screen.

Printing an output

13:03

Numbers in R are of type numeric.

Numbers are of type numeric

05:24

R has built-in datatypes for dates and timestamps.

Characters and Dates

07:30

Logical is a datatype that is the result of conditional tests in R

Logicals

03:24

Vectors
15 Lectures
01:02:35

The wide variety of built-in data structures is what sets R apart from other standard programming languages. These include vectors, arrays, matrices, data frames and lists.

Data Structures are the building blocks of R

08:24

Creating a Vector

02:22

The mode of a vector is the datatype of all its elements.

The Mode of a Vector

04:18

Vectors are Atomic

02:24

Doing something with each element of a Vector

03:09

Finding the sum, product, or mean of a vector

Aggregating Vectors

01:28

Operations between vectors of the same length

05:39

Operations between vectors of different length

05:30
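
A small sketch of what happens when the lengths differ (the vectors are made up for illustration): R recycles the shorter vector, repeating its elements until it matches the longer one.

```r
long_vec  <- c(1, 2, 3, 4, 5, 6)
short_vec <- c(10, 20)

# short_vec is recycled as 10, 20, 10, 20, 10, 20
recycled_sum <- long_vec + short_vec

# A single number is a vector of length 1, so it is recycled too
doubled <- long_vec * 2

print(recycled_sum)
print(doubled)
```

If the longer length is not a multiple of the shorter one, R still recycles but issues a warning.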

Generate sequences using the : operator, rep() and seq()

Generating Sequences

06:25
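
For a quick feel of these three tools, here is a minimal sketch (the specific values are arbitrary choices for illustration):

```r
# The colon operator: consecutive integers
one_to_five <- 1:5

# seq(): control the step size ...
by_step <- seq(0, 1, by = 0.25)
# ... or the number of points
by_length <- seq(10, 50, length.out = 5)

# rep(): repeat the whole vector, or each element in turn
repeated_whole <- rep(c("x", "y"), times = 3)
repeated_each  <- rep(c("x", "y"), each = 2)

print(one_to_five)
print(by_step)
print(by_length)
print(repeated_whole)
print(repeated_each)
```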

Using conditions with Vectors

02:04

Find the lengths of multiple strings using Vectors

02:22

Generate a complex sequence (using recycling)

02:49

Access elements based on their position in the vector.

Vector Indexing (using numbers)

06:56

Access elements based on whether they pass a conditional test.

Vector Indexing (using conditions)

06:18

Assign names to the elements of a vector

Vector Indexing (using names)

02:27
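
The three indexing styles from the last few lectures can be sketched side by side. The `temps` vector and its names are hypothetical, chosen only for illustration.

```r
# A named numeric vector (made-up values)
temps <- c(mon = 18, tue = 21, wed = 25, thu = 19, fri = 23)

# By position: R indexes from 1, and negative positions exclude elements
second        <- temps[2]
first_and_3rd <- temps[c(1, 3)]
all_but_first <- temps[-1]

# By condition: keep the elements for which the test is TRUE
warm_days <- temps[temps > 20]

# By name
wednesday <- temps["wed"]

print(second)
print(warm_days)
print(wednesday)
```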

Arrays
5 Lectures
30:31

An array is created by taking a vector and arranging its elements along dimensions.

Creating an Array

11:36
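
A minimal sketch of that idea, assuming an arbitrary 2 x 3 x 4 shape chosen for illustration: `array()` takes a vector and a `dim` argument, and fills the array with the first dimension varying fastest.

```r
values <- 1:24

# Arrange the 24 values into 2 rows, 3 columns and 4 layers
arr <- array(values, dim = c(2, 3, 4))

first_layer  <- arr[, , 1]   # a 2 x 3 matrix holding the values 1 to 6
one_element  <- arr[1, 1, 2] # first element of the second layer
last_element <- arr[2, 3, 4] # the final value in the vector

print(dim(arr))
print(first_layer)
print(one_element)
print(last_element)
```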

Indexing an Array

07:38

Operations between 2 Arrays

02:09

Operations between an Array and a Vector

02:45

Outer products apply an operation to every pair of elements drawn from two arrays.

Outer Products

06:23
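
As a small illustration (the inputs are arbitrary), base R's `outer()` builds a result with one cell per pair of elements. It multiplies by default, and the `FUN` argument swaps in another operation.

```r
x <- 1:3
y <- 1:4

# Default operation is multiplication: cell [i, j] holds x[i] * y[j]
times_table <- outer(x, y)

# Any vectorized two-argument function can be used instead
pair_sums <- outer(x, y, FUN = "+")

print(times_table)
print(pair_sums)
```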

Matrices
5 Lectures
16:58

A matrix is a 2-dimensional array, but it has special meaning and can be interpreted in several different ways.

A Matrix is a 2-Dimensional Array

07:58

Creating a Matrix

02:00

Matrix Multiplication

02:48

rbind() and cbind() to merge matrices.

Merging Matrices

02:06
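
The matrix operations in this section can be sketched briefly. The values below are made up for illustration; `matrix()`, `%*%`, `t()`, `rbind()` and `cbind()` are all base R.

```r
# matrix() fills column by column: rows are (1, 3, 5) and (2, 4, 6)
m <- matrix(1:6, nrow = 2)

# Matrix multiplication uses %*% (plain * multiplies element by element)
product <- m %*% t(m)        # 2 x 3 times 3 x 2 gives 2 x 2

# rbind() stacks rows, cbind() appends columns
stacked <- rbind(m, c(7, 8, 9))  # now 3 x 3
widened <- cbind(m, c(0, 0))     # now 2 x 4

print(product)
print(stacked)
print(widened)
```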

Factors
5 Lectures
17:20

A factor is a special type of vector used to represent categorical variables

What is a factor?

06:48

Find the distinct values in a dataset (using factors)

01:28

Replace the levels of a factor

02:18

Aggregate factors with table()

01:39

Aggregate factors with tapply()

05:07
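
A short sketch tying the factor lectures together, using invented `sizes` and `prices` data: `levels()` lists the distinct categories, `table()` counts observations per category, and `tapply()` applies a function to another vector within each category.

```r
# Hypothetical categorical data and a matching numeric vector
sizes  <- factor(c("small", "large", "medium", "small", "large", "small"))
prices <- c(5, 12, 8, 6, 14, 4)

# Distinct values, sorted alphabetically by default
size_levels <- levels(sizes)

# How many observations fall in each category
size_counts <- table(sizes)

# Mean price within each category
avg_price <- tapply(prices, sizes, mean)

print(size_levels)
print(size_counts)
print(avg_price)
```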

Lists and Data Frames
6 Lectures
30:06

Lists are fundamentally different from vectors, arrays and matrices, which are all homogeneous data structures.

Introducing Lists

05:11

Data Frames are how R stores data read from files and databases.

Introducing Data Frames

04:28

Reading Data from files

04:52

Indexing a Data Frame

05:38

Using the aggregate() and order() functions

Aggregating and Sorting a Data Frame

06:28

Merge data frames based on one or more common columns

Merging Data Frames

03:29
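
To make the last two lectures concrete, here is a minimal sketch with two made-up data frames (`orders` and `customers`, with hypothetical columns). It totals amounts per customer with `aggregate()`, sorts with `order()`, and joins on the shared column with `merge()`.

```r
# Hypothetical data, for illustration only
orders <- data.frame(
  customer_id = c(1, 2, 1, 3, 2),
  amount      = c(20, 35, 15, 50, 5)
)
customers <- data.frame(
  customer_id = c(1, 2, 3),
  city        = c("Pune", "Delhi", "Mumbai")
)

# Total amount per customer
totals <- aggregate(amount ~ customer_id, data = orders, FUN = sum)

# order() returns row positions; the minus sign sorts largest first
totals_sorted <- totals[order(-totals$amount), ]

# Join the two data frames on their common column
report <- merge(totals, customers, by = "customer_id")

print(totals_sorted)
print(report)
```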

4 More Sections

About the Instructor

A 4-person team; ex-Google; Stanford, IIM Ahmedabad, IIT

Loonycorn is us: Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi and Navdeep Singh. Between the four of us, we have studied at Stanford, IIM Ahmedabad and the IITs, and have spent years (decades, actually) working in tech in the Bay Area, New York, Singapore and Bangalore.

Janani: 7 years at Google (New York, Singapore); Studied at Stanford; also worked at Flipkart and Microsoft

Vitthal: Also Google (Singapore) and studied at Stanford; Flipkart, Credit Suisse and INSEAD too

Swetha: Early Flipkart employee, IIM Ahmedabad and IIT Madras alum

Navdeep: longtime Flipkart employee too, and IIT Guwahati alum

We think we might have hit upon a neat way of teaching complicated tech courses in a funny, practical, engaging way, which is why we are so excited to be here on Udemy!

We hope you will try our offerings, and think you'll like them :-)

- Copyright © 2017 Udemy, Inc.