Introduction to R

Learn the core fundamentals of the R language for interactive use as well as programming
4.0 (128 ratings)
3,414 students enrolled
$100
  • Lectures 103
  • Contents Video: 10 hours
    Other: 5 hours
  • Skill Level Beginner Level
  • Languages English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion

About This Course

Published 5/2013 English

Course Description

With "Introduction to R", you will gain a solid grounding in the fundamentals of the R language!

This course has about 90 videos and 140+ exercise questions, spread over 10 chapters. To begin with, you will learn to download and install R (and RStudio) on your computer. Then I show you some basics in your first R session.

From there, you will review topics in increasing order of difficulty, starting with Data/Object Types and Operations, Importing into R, and Loops and Conditions.

Next, you will be introduced to the use of R in Analytics, where you will learn a little about each object type in R and use that in Data Mining/Analytical Operations.

After that, you will learn the use of R in Statistics, where you will see about using R to evaluate Descriptive Statistics, Probability Distributions, Hypothesis Testing, Linear Modeling, Generalized Linear Models, Non-Linear Regression, and Trees.

Following that, the next topic will be Graphics, where you will learn to create 2-dimensional Univariate and Multi-variate plots. You will also learn about formatting various parts of a plot, covering a range of topics like Plot Layout, Region, Points, Lines, Axes, Text, Color and so on.

At that point, the course finishes off with two topics: Exporting out of R, and Creating Functions.

Each chapter is designed to teach you several concepts, and these have been grouped into sub-sections. A sub-section usually has the following:

  • A Concept Video
  • An Exercise Sheet
  • An Exercise Video (with answers)


Why take a course to learn R?

When I look to advancing my R knowledge today, I still face the same sort of situation as when I originally started to use R. Back when I was learning R, my approach was learn by doing. There was a lot of free material out there (and I refer to that early in the course) that gave me a framework, but the wording was highly technical in nature. Even with the R help and the free material, it took me up to a couple of months of experimentation to gain a certain level of proficiency. What I would have liked at that time was a way to learn the fundamentals quicker. I have designed this course with exactly that in mind.

Why my course?

For those of you who are new to R, this course will cover enough breadth/depth in R to give you a solid grounding. I use simple language to explain the concepts. Also, I give you 140+ exercise questions, many of which are based on real-world data, so that you can practice and get up and running quickly, all in a single package. This course is designed to get you functional with R in a little over a week.

For those beginners with some experience that have learnt R through experimentation, this course is designed to complement what you know, and round out your understanding of the same.

What are the requirements?

  • Windows/Mac/Linux
  • Basic proficiency in math - vectors, matrices, algebra
  • Basic proficiency in statistics - probability distributions, linear modeling, etc
  • A high speed internet connection

What am I going to get from this course?

  • 90 videos (15+ hours)
  • To educate you on the fundamentals of R
  • 140+ exercise problems
  • To accelerate your learning of R through practice

What is the target audience?

  • Enterprise Data Analysts
  • Students
  • Anyone interested in Data Mining, Statistics, Data Visualization

Curriculum

Section 1: Getting Started
13:40
Hi everyone! Welcome to my course on R fundamentals. We start slow in Chapters 2 to 4 with some basics, pick up steam in Chapters 5 to 8, and cool down in Chapters 9 and 10. Lots of material coming your way. Pace yourselves...or not!
03:51
In this lecture, a brief outline of the course website is provided, showing you how to navigate through the course and the curriculum.
Section 1: Material
7 pages
Section 2: Your first R Session
09:17
In Section 2, you will be introduced to the R software and will install and work with it for the first time. In this video, you will find instructions on downloading/installing R, as well as simple navigation, accessing help etc.
02:44
In this video, you will find answers to exercise questions on "Finding your way around R".
08:07

This lecture gets you started with using R by discussing basic commands such as assignment, case sensitivity, comments etc

02:40
In this video, you will find answers to exercise questions on "Basic Commands".
04:36
In this lecture, you will be introduced to operators in R - Arithmetic and Logical.
02:08
In this video, you will find answers to exercise questions on "Operators".
09:22
This lecture deals with different items, including 
  • finding and removing objects
  • infinite, missing and indefinite values
  • working with packages, and
  • R preferences
02:07
In this video, you will find answers to exercise questions on "Miscellaneous".
03:31
This lecture gives you a brief introduction to the RStudio IDE.
Section 2: Material
15 pages
Section 3: Basics - Objects and Data Types
12:05
In Section 3, you will be introduced to Object/Data types in R. In this lecture, you will review different Data Types supported in R, including integer, double, complex, logical, date and character. You will also learn how to work with Data Types.
03:41
In this video, you will find answers to exercise questions on "Data Types".
15:44
In this lecture, you will be introduced to Object types in R, including vectors, arrays, matrices, factors, lists, data frames and tables. You will learn about attributes of an object - intrinsic and non-intrinsic. You will be introduced to the class of an object. 
01:30
In this video, you will find answers to exercise questions on "Object Types".
11:19
This lecture focusses on Vectors. You will learn to create a numeric vector, and perform arithmetic and mathematical operations. You will learn to replicate a vector, and create sequences. Finally you will see about logical and character vectors.
01:43
In this video, you will find answers to exercise questions on "Vectors".
14:50
This lecture focusses on Arrays and Matrices. You will learn about creation, subsections, and operations on Arrays and Matrices. You will review transposing. Then you will see some special operations that are matrix-specific.
02:58
In this video, you will find answers to exercise questions on "Arrays and Matrices".
07:17
This lecture focusses on Factors, where you will learn about creation, levels and ordering a factor. It also focusses on Lists, where you will see about list values, names, and modifying a list.
06:34
In this video, you will find answers to exercise questions on "Factors and Lists".
09:49
This lecture focusses on Data Frames where you will learn about creation, referencing, and working with data frames. It also deals with Tables, where you will see about creation, and the underlying tabulate() function.
05:14
In this video, you will find answers to the exercise questions for the preceding lecture on Data Frames and Tables.
Section 3: Material
33 pages
Section 4: Importing Data into R
12:21
In Section 4, you will learn about Importing into R. In this lecture, you learn about importing text files as a data frame, and then as a vector.
01:31
In this video, you will find answers to exercise questions on "Importing Text Files".
04:25
In this lecture, you will learn about importing data in an Excel file into R as a data frame. As part of the exercise, an Excel file "phones.xls" has been provided to you.
02:27
In this video, you will find answers to exercise questions on "Importing Excel Files".
Section 4: Material
8 pages
Section 5: Data Mining/Manipulation
14:26
Section 5 is your first "heavy" section; it covers data mining/manipulation by object Type. In the first lecture, you will start with Vectors - subscripts, ordering, statistics, applying functions, subdivision and sampling.
03:12
In this video, you will find answers to exercise questions on "Vector Operations".
10:49
This lecture focusses on Array Operations: subscripts, Outer Product, and applying functions.
03:14
In this video, you will find answers to exercise questions on "Array Operations".
11:53
This lecture deals with Matrix Operations: Subscripts, diagonal matrix, Matrix multiplication, cross product, inverse of a matrix, solving linear equations, and least squares regression.
03:30
In this video, you will find answers to exercise questions on "Matrix Operations".
14:05
In this lecture, you will learn about Data Frame Operations: accessing a subset, adding rows/columns, combining data frames, obtaining summaries, and modifying the data frame. 
03:49
In this video, you will find answers to exercise questions on "Data Frame Operations".
11:12
This lecture introduces you to Factor Operations: summarizing data at different levels of a factor, creating a factor out of numeric data, generating a factor out of patterns, re-ordering levels based on data, and unclass(). 
03:32
In this video, you will find answers to exercise questions on "Factor Operations".
11:47
This lecture deals with Text Operations: length, parts of a string, Concatenation, Pattern recognition/replacement.
02:42
In this video, you will find answers to exercise questions on "Operations on Text".
12:19
This lecture deals with Operations on Dates: creation, formatting, arithmetic, System date/time, POSIX time, and some built in date constants
03:23
In this video, you will find answers to exercise questions on "Date Operations".
Section 5: Material
41 pages
Section 6: Loops and Conditions
07:35
This Section introduces Loops and Conditions, where you will learn about the If-then conditional statement and about the While, Repeat and For loops. The contents of this section apply in Section 10 where you will learn about Functions. There is no Exercise in this Section; you will practice Loops/Conditions in Section 10.
Section 6: Material
5 pages
Section 7: Statistics
06:58
Section 7 deals with the use of R in Statistics, where you will see about using R to evaluate Descriptive Statistics, Probability Distributions, Hypothesis Testing, Linear Modeling, Generalized Linear Models, Non-Linear Regression, and Trees. In this lecture, you will focus on Descriptive Statistics, including Mean, Quantile, MAD, Variance, Range, Covariance and Correlation. 
03:28
In this video, you will find answers to exercise questions on "Descriptive Statistics".
10:52
In this lecture, you will focus on Probability Distributions, where you will learn about working with the PDF, CDF and quantile function, and about generating a random sample, for a variety of Probability Distributions including Normal, T, Binomial, Uniform, Exponential, etc.
01:26
In this video, you will find answers to exercise questions on "Probability Distributions".
12:28
In this lecture, you will focus on Hypothesis Testing with One and Two Sample T-tests, where you will learn about One and Two Sided Tests.
03:21
In this video, you will find answers to exercise questions on "Hypothesis Testing - One and Two Sample Tests".
06:11
In this lecture, you will focus on the KS-test, which checks whether two samples could have come from the same distribution. You will also learn about the F-test, which compares two samples based on their variances.
01:37
In this video, you will find answers to exercise questions on "Hypothesis Testing - KS-test and F-test".
08:24
This lecture deals with the creation of Formula Objects to be used in R linear models. It discusses the use of operators in a Formula and how they are different from their typical mathematical meaning. 
01:52
In this video, you will find answers to exercise questions on "Linear Modeling - Working with Formula Objects".
10:35
In this lecture, you will learn how to use a formula object, and a data set to generate a linear model. Then you will see about mining the model, getting information out of it, and performing an ANOVA.
04:19
In this video, you will find answers to exercise questions on "Linear Modeling - Generating a Linear Model".
04:30
In this lecture, you will learn about updating a Linear Model: simulating the addition and deletion of model terms. You will also see about making a permanent change to the Model Formula.
01:36
In this video, you will find answers to exercise questions on "Linear Modeling - Updating a Linear Model".
08:00
In this lecture, you will review Generalized Linear Models, in situations where the response variable has a non-Normal Probability Distribution. You will learn about generating the model, mining it for information, and performing an ANOVA. The lecture will also show you how to generate a Logistic Regression model of Low Birth Weight data. The exercise that follows is based on the same Low Birth Weight data set; there will be no exercise video for the same.
08:09
In this lecture, you will learn about using R for Non-Linear Regression. You will see the creation of Formula Objects, as well as Model Generation. You will review how to mine the model.
02:22
In this video, you will find answers to exercise questions on "Non-Linear Regression".
08:15
In this Lecture, you will learn how to generate a Tree Model out of data that has discrete classes in it. You will review how to fine-tune and control the tree structure. 
04:10
In this video, you will find answers to exercise questions on "Tree Models".
Section 7: Material
71 pages
Section 8: Graphics
14:01
Section 8 deals in Graphics, where you will learn to create 2-dimensional Univariate and Multi-variate plots. You will also learn about formatting various parts of a plot, covering a range of topics like Plot Layout, Region, Points, Lines, Axes, Text, Color and so on. In this lecture, you will see about using function plot() to plot a vector, time-series object and a bar-plot for factor data. You will also see about using function barplot() to generate a bar plot.
03:52
In this video, you will find answers to exercise questions on "Univariate Plots - I".
13:24
In this lecture, you will review more Univariate plots: Piecharts, Histograms, Boxplots, Quantile-Quantile plots and the Kernel Density Estimation Plots.
02:30
In this video, you will find answers to exercise questions on "Univariate Plots - II".
14:32
The first lecture on Multi-variate plots deals mainly with Scatterplots. You will first learn to use function plot() to generate a scatter plot of two variables. Then you will use plot() to generate a matrix of scatterplots in the same figure - 3 different flavors. Next, you will see about using function pairs() to generate a matrix of scatterplots.

In this lecture, in addition, you will use plot() to generate multiple boxplots in the same figure. You will also learn about qqplot() to generate a quantile-quantile plot of two sample quantiles.
04:25
In this video, you will find answers to exercise questions on "Multivariate Plots - I".
11:41
In the second lecture on Multivariate Plots, you will learn to generate a Coplot, a Stars/Segment diagram, and a Cleveland Dot Plot.
03:57
In this video, you will find answers to exercise questions on "Multivariate Plots - II".
09:37
The remaining lectures in Section 8 focus on formatting a plot. You will start in this lecture with Points: using function par() to change the point type, adding Points, and identifying and labeling points on a plot through user input.
03:56
In this video, you will find answers to exercise questions on "Formatting a Plot - Points".
09:02
In this Lecture, you will learn to format Lines in a Plot - Line type, Line width. You will then learn to add lines through existing points, draw lines through the Plot, and finally to add segments/arrows to a Plot.
02:38
In this video, you will find answers to exercise questions on "Formatting a Plot - Lines".
12:57
In this Lecture, you will learn to format the plot region if it is a single Plot, and the Plot layout if it is a grid of plots. You will learn concepts such as Device Region, Figure Region, Plot Region, and Margins. The exercise in this lecture is a repeat of what you see in the concept video and as a result, there is no exercise video. 
10:40
In this Lecture, you will learn to format the axes of a plot, including box type, axis scale, and axis display. You will see about adding an axis to a plot.
01:29
In this video, you will find answers to exercise questions on "Formatting a Plot - Axes".
10:22
In this Lecture, you will see about formatting Text on a plot: Titles, Adding text, Margin Text, Text Position, Annotation, Text size, and Font/Style.
02:06
In this video, you will find answers to exercise questions on "Formatting a Plot - Text".
06:09
In this Lecture, you will learn about working with Plot Color: use of functions such as colors() and palette(), and adding contiguous colors from the spectrum. You will see about adding color to the text, foreground and background of a plot.
02:06
In this video, you will find answers to exercise questions on "Formatting a Plot - Color".
05:12
Section 8 comes to a close with a discussion on miscellaneous items such as global vs. local changes to plot parameters. You will also learn how to add a polygon and shapes - circles, squares, rectangles etc - to a plot. 
01:03
In this video, you will find answers to exercise questions on "Miscellaneous".
Section 8: Material
82 pages
Section 9: Exporting Data out of R
06:12
Section 9 deals with Exporting out of R - Text and Graphics. In this lecture, you will learn about exporting a vector. You will also see about exporting a data frame into a text file.

Instructor Biography

Jagannath Rajagopal, Entrepreneur and Data Scientist

Hi! You can call me Jag. I have spent most of the past 10 years implementing Statistical Forecasting Systems at major companies in North America and Asia. I graduated from Georgia Tech [Atlanta, GA, USA] with a Masters in Industrial Engineering and so have a statistics background.

As part of my prior job, I have had to work with data extensively - mining, analyzing and summarizing. I have developed routines to cleanse historical sales data for input to Statistical Forecasting algorithms. I have also had to teach Statistical Forecasting and the use of said techniques and algorithms to every client I have been at.

These days, I am an entrepreneur and am based in Mississauga, ON, Canada. I am focussed on a couple of areas, one of which is online education.

If you want to connect with me, you can find me on LinkedIn - just mention that you found me on Udemy. Also check out my Deep Learning YouTube channel, Facebook page and Twitter page.

How Packages Make R Complete

Users of R, the statistical programming language, are a surprisingly passionate bunch. If you're skeptical of the devotion, just search "love #rstats" in Twitter and you will see an outpouring of ecstatic declarations of love for the language.

For the outsider, it can be difficult to understand why someone would feel such fondness for a statistical programming language. It's just a way to work with data and do statistical analysis, so why would people care so much?

The answer likely lies in R's history as a collaborative project. R is an open source language, which means that no one "owns" R, and anyone can contribute to it. R has two million users globally, and many of these users have contributed to the language's development. R's main competitors in the statistical programming space (SPSS, SAS, and Stata) are all privately owned, and more difficult to modify.

For many people, R is not just statistical software, but a vibrant community. The strong Meetup culture of R users is evidence of this. Even though R and SAS have similar numbers of users, well over 100,000 people have signed up for R Meetups across the world, while only 5,000 people have signed up for SAS Meetups.

The main way R users can contribute to the "R Project" is by creating packages. Packages are programming tools that simplify the code necessary to complete common tasks such as aggregating and plotting data. R users have become accustomed to the idea that if they can't figure out how to do something, it won't be a problem, because, as statistician Roger Peng says, "There's an R Package for That."

This article explores the rise of the package and some examples of the coolest and most unique packages available to R users.

In many ways, the history of R is a story of the effort to make statistical programming easier and easier.

R was created in 1993 by the statisticians Ross Ihaka and Robert Gentleman at the University of Auckland in New Zealand. The language is a modification of S, a programming language developed at Bell Labs in the 1970s. S was created to make programming simple enough for statisticians without computer science skills - the vast majority of them.

Prior to the development of S, statisticians who wanted to use computers for statistics would have had to learn to code in Fortran, an early programming language with a high barrier to entry. Instead of having to write out complicated code in Fortran to find the average of a set of numbers, a simple one-word function could be used in S. R is very similar to S, but with a few key technical differences that make it even easier to use.
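
As a small illustration of that "one-word function" idea, here is a minimal sketch in R. The scores vector is made-up sample data, not something from the original article.

```r
# Hypothetical sample data: five exam scores.
scores <- c(72, 85, 90, 66, 78)

# A single function call computes the arithmetic mean.
average <- mean(scores)
print(average)
```

The same pattern applies to other summaries, for example median(scores) or sd(scores).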

Even so, R is not the easiest language to learn. There are lots of tasks for which writing R code would be quite challenging for a non-expert or someone without a computer science background. And this is definitely relevant, since less than 15% of students who take Udemy's R courses have an academic background in computer science.

This is where packages come in. Packages are add-ons users can choose to download that simplify the code necessary for a task. Any R user who figures out some nifty code to solve a problem can "package" it up and share the code with other R users. The creators of these packages make no money off them, and likely do it out of their impulse to share and gain prestige within the community.

A simple example of how packages make life easier for the data analyst comes from the gdata package. Using "base R", the language without any add-ons, it can be a nuisance to import an Excel file into R. You have to save your file in another format (like a text file or a csv) and then when you read that new file into R, you have to specify whether the first row is a header.
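
The base R route described above can be sketched as follows. This is only an illustration: the file contents and the names csv_path and phones are assumptions, and a temporary file stands in for a spreadsheet that has already been saved as CSV.

```r
# Stand-in for a spreadsheet saved as CSV (made-up data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("model,price", "Alpha,199", "Beta,349"), csv_path)

# read.table() assumes there is no header row unless told otherwise,
# so header = TRUE is needed to treat the first row as column names.
# (read.csv() is a wrapper that sets header = TRUE and sep = "," by default.)
phones <- read.table(csv_path, header = TRUE, sep = ",")
print(phones)
```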

Using the gdata package, created by R user Gregory Warnes, getting your data from Excel to R is a cinch. All you do is write "read.xls" and put the name of your file, in quotes, inside the parentheses that follow it.

The number of R packages has grown exponentially over the last 10 years. In 2005, there were only a couple hundred packages, but there are now well over 6,000. A list of every package is available here. The chart below displays this package explosion.




Data: r4stats

Hundreds of R users have "authored" packages. In order to have a package listed officially on the R website, the package must be scrupulously documented so it is clear to R users how to use it. One of the authors of the package must be put forward as the "maintainer", the person who will fix any errors that appear.

The next chart highlights the individuals who maintain the most packages.




Data: datacamp

Packages these individuals develop improve the coding lives of the community and users are thankful. At the top of the list of maintainers is the influential package developer Hadley Wickham. His contributions to R have led to his being "nerd-famous" in the R community.

The following chart displays the most popular R packages, as defined by the number of downloads in August, 2015. Five of the ten most popular packages were authored by Wickham.




The most popular package is Rcpp. Rcpp is what allows many other R packages to run quickly (it lets R code call C++). Without it, R users would be spending a lot more time waiting around. The second most popular package, plyr, makes it easy to summarize data. It's the equivalent of pivot tables in Excel, but much more flexible. Other packages are for making cool charts (ggplot2), simplifying working with text variables (stringr), and providing the perfect color scheme for graphics (RColorBrewer).
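
To make the pivot-table comparison concrete without installing anything, the sketch below uses base R's aggregate() as a stand-in for plyr. It is not plyr's own interface, and the sales data frame is made up for illustration.

```r
# Made-up sales records.
sales <- data.frame(
  region = c("East", "West", "East", "West", "East"),
  amount = c(100, 250, 50, 300, 25)
)

# Total amount per region, similar to a one-field pivot table.
totals <- aggregate(amount ~ region, data = sales, FUN = sum)
print(totals)
```

Packages such as plyr offer a more uniform way to express this kind of split, summarize and combine step across different data structures.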

Most of the 20 most popular packages are attempts to uncomplicate common tasks, but users also make packages for endeavors that are surprisingly specific. We wanted to highlight a few of these more eccentric packages. They exhibit the breadth of cool stuff R users have created.

The "gender" Package

Have you ever needed to predict the gender of individuals in your dataset based on their name? You are in luck. The R programmer Lincoln Mullen created the package gender to make this a simple process for R users. With some straightforward code, you can find out the probability of a name belonging to a man or woman based on historical datasets.

For example, to find out the probability that a person named Hillary born in the United States in 1942 was male, you can input the following code.

library(gender)  # load the package before calling its functions
gender("Hillary", years = 1942)

The answer is 66% male. By 1970, Hillary was an exclusively female name.

This tool has come in handy for data scientists studying historical documents in which gender was not listed.

The "rvest" Package

R is not just for statistical analysis anymore. In a way that the creators of R could have never dreamed of, R has been developed to interact with the web. For example, the rvest package allows R users to crawl websites for interesting data.

Want a dataset of every cast member in a movie listed on the movie database website IMDB.com? Want to get all of the headlines from the New York Times on a given day? Get data on the best time to sell your used books? rvest gives R users the power to do this with easy to learn code.

Particularly convenient for R users is that when data from a website is pulled into R, it is in a format ready for statistical analysis.

The "sqldf" Package

There are some R users who want to do everything they can within the R environment, including using other languages. The sqldf package allows R users to use the popular database query language SQL (Structured Query Language) within R. Users can take a SQL statement, put it in quotes between parentheses, with "sqldf" before the parentheses, and R will execute it.

This ability to use other languages within R can be convenient because periodically it is faster or easier to complete a task in another language. There are similar packages that allow users to run the languages Python and C++ from R.

The "acs" Package

While most packages are meant to make coding simpler, there are also a number of packages intended to give R users easy access to useful datasets. For example, the acs package gives R users the ability to download and analyze data from the United States Census. Before acs was created, getting census data into R was a laborious process. Now, thanks to package creator Ezra Glenn, it's a snap.

There are numerous other packages that give R users access to datasets. For example, the USAboundaries package contains state and county boundaries for the United States from 1629 to 2000, and the fueleconomy package holds data on the fuel economy of all cars sold in the US from 1984 to 2014.

For many R users, the best aspect of working with R is that, like Isaac Newton, you are "standing upon the shoulders of giants." Whatever your statistical problem, it is likely that some R user before you has encountered that same issue, and made a package to ease the process for you.

It is just this feeling, that thousands of other people are helping you complete your project, which leads many R users to fall in love.