The Comprehensive Programming in R Course

How to design and develop efficient general-purpose R applications for diverse tasks and domains.
4.0 (87 ratings)
1,730 students enrolled
  • Lectures 120
  • Length 25 hours
  • Skill Level All Levels
  • Languages English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion

About This Course

Published 8/2015 English

Course Description

The Comprehensive Programming in R Course is actually a combination of two R programming courses that together comprise a gentle, yet thorough introduction to the practice of general-purpose application development in the R environment. The original first course (Sections 1-8) consists of approximately 12 hours of video content and provides extensive example-based instruction on details for programming R data structures. The original second course (Sections 9-14), an additional 12 hours of video content, provides a comprehensive overview on the most important conceptual topics for writing efficient programs to execute in the unique R environment. Participants in this comprehensive course may already be skilled programmers (in other languages) or they may be complete novices to R programming or to programming in general, but their common objective is to write R applications for diverse domains and purposes. No statistical knowledge is necessary. These two courses, combined into one course here on Udemy, together comprise a thorough introduction to using the R environment and language for general-purpose application development.

The Comprehensive Programming in R Course (Sections 1-8) presents a detailed, in-depth overview of the R programming environment and of the nature and programming implications of the basic R objects: vectors, matrices, dataframes and lists. The Comprehensive Programming in R Course (Sections 9-14) then applies this understanding of the basic R object structures to the following topics: programming structures; mathematical modeling and simulations; the specifics of object-oriented programming in R; input and output; string manipulation; and performance enhancement, both for computation speed and for efficient use of computer memory.

What are the requirements?

  • Students will need to install the no-cost R console and the no-cost RStudio application (instructions are provided).

What am I going to get from this course?

  • Acquire the skills needed to successfully develop general-purpose programming applications in the R environment
  • Possess an in-depth understanding of the R programming environment and of the requirements for, and programming implications of, writing code using basic R objects: vectors, matrices, dataframes and lists.
  • Understand the object-oriented characteristics of programming in R and know how to create S3 and S4 Class objects and functions that process these S3 and S4 objects.
  • Know how to program mathematical functions, models and simulations in R.
  • Know how to write R programs that effectively use and manipulate text and string variable objects.
  • Know how to use the scan(), readline(), cat(), print() and readLines() functions in R for efficient data input and output and for effective user-prompting.
  • Know how to 'tweak' R programs for maximum performance efficiency.

What is the target audience?

  • Anyone interested in writing computer applications that execute in the R environment.
  • The common objective of students is to write R applications for diverse domains and purposes.
  • Students may already be skilled programmers (in other languages) or they may be complete novices to R programming or to programming in general.
  • Undergraduate or graduate students looking to acquire marketable job skills prior to graduation.
  • Analytics professionals looking to acquire additional job skills.

Curriculum

Section 1: Introduction and Overview of R
Introduction to Comprehensive R Programming Course
Preview
01:52
Introduction and Getting Started
14:54
Getting Started and First R Session
14:53
First R Session (part 2)
Preview
14:51
First R Session (part 3)
15:08
Matrices, Lists and Dataframes
14:58
15:02

One of the great strengths of R is the user's ability to add functions. In fact, many of the functions in R are actually functions of functions. The structure of a function is given below.

myfunction <- function(arg1, arg2, ...) {
  statements
}

Objects in the function are local to the function.
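As a minimal sketch of the structure above, the following defines a small function with a default argument and shows that an object created inside it is not visible afterwards. The function name spread and the variable names are hypothetical, chosen only for illustration.

```r
# Hypothetical example: 'spread' and its argument names are illustrative only.
spread <- function(x, center = mean(x)) {
  deviations_local <- x - center       # exists only while the function runs
  sum(deviations_local^2) / length(x)  # value of the last expression is returned
}

result  <- spread(c(2, 4, 6))              # uses the default 'center' (the mean, 4)
shifted <- spread(c(2, 4, 6), center = 0)  # overrides the default

exists("deviations_local")  # FALSE: the local object is not visible out here
```

Calling the function twice with different values of center does not leave any trace in the calling environment; only the returned values are kept.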

Functions and Default Arguments
14:49
More Examples of Functions (part 1)
14:17
More Functions Examples (part 2)
12:00
More Functions Examples (part 3)
11:12
More Functions Examples (part 4)
12:25
More Functions Examples (part 5)
10:19
More Functions Examples (part 6)
07:31
Section 2: What are Vector Data Structures in R?
Homemade t-test Exercise Solution
15:50
Section 2 Exercise and Package Demonstrations
14:23
15:30

A vector is a sequence of data elements of the same basic type. Members in a vector are officially called components. Nevertheless, they are often called elements.
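The "same basic type" rule can be seen in a short sketch like the one below. The variable names are illustrative only; the point is that R silently coerces mixed input to a single type.

```r
nums  <- c(1.5, 2, 3)        # all numeric, so the vector is numeric
mixed <- c(1, "a", TRUE)     # mixed input: every component is coerced to character

class(nums)     # "numeric"
class(mixed)    # "character"
mixed           # "1" "a" "TRUE"

nums[2]         # the second component: 2
length(nums)    # 3
```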

More Examples of Vectors
14:36
Common Vector Operations and More
14:18
Findruns Example and Vectors Exercises
14:12
Section 3: More Discussion of Vector Data Structures
Vector-Based Programming Exercise Solution (part 1)
14:46
Vector Exercise Solution (part 2) and Begin General Vector Discussion
16:05
Continue General Vector Discussion
Preview
16:03
More General Vector Examples
12:56
More on Vectors and Vector Equality
16:40
Extended Vector Example and Exercise
13:08
Section 4: Finish Vectors and Begin Matrices
Finish Vector Discussion
16:33
Vector-Maker Exercise Solutions
17:08
14:57

Creating matrices

The function matrix creates matrices.
 matrix(data, nrow, ncol, byrow) 
The data argument is usually a vector of the elements that will fill the matrix. The nrow and ncol arguments specify the dimensions of the matrix. Often only one dimension argument is needed. If, for example, there are 20 elements in the data vector and ncol is specified to be 4, then R will automatically calculate that there should be 5 rows and 4 columns, since 4*5=20. The byrow argument specifies how the matrix is to be filled. The default value for byrow is FALSE, which means that by default the matrix will be filled column by column.

seq1 <- 1:6                               # the values 1 to 6
mat1 <- matrix(seq1, 2)                   # 2 rows, filled column by column
mat2 <- matrix(seq1, 2, byrow = TRUE)     # 2 rows, filled row by row
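The 20-element case described above can be checked directly. This is only a sketch with illustrative variable names; it shows the inferred row count and the difference between the two fill orders.

```r
vals <- 1:20

m_col <- matrix(vals, ncol = 4)                 # nrow is inferred as 5; filled by column
m_row <- matrix(vals, ncol = 4, byrow = TRUE)   # same shape; filled row by row

dim(m_col)     # 5 4
m_col[1, ]     # 1 6 11 16
m_row[1, ]     # 1 2 3 4
```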

Filtering Matrices and More Examples
15:55
Still More Matrices Examples
16:52
Section 5: Finish Matrices and Begin Lists Discussion
Min-Merge Vector Exercise Solutions
15:15
Game of Craps Exercise Solution
Preview
09:02
Naming Matrix Rows and Columns
15:47
11:48

A list is an R structure that may contain objects of any other type, including other lists. Many of the modeling functions (like t.test() for the t test or lm() for linear models) produce lists as their return values, but you can also construct one yourself:

 mylist <- list(a = 1:5, b = "Hi There", c = function(x) x * sin(x))
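A brief sketch of how such a list can be used, and of a modeling function returning a list, might look like this. The sample values passed to t.test() are made up purely for illustration.

```r
mylist <- list(a = 1:5, b = "Hi There", c = function(x) x * sin(x))

mylist$a                       # 1 2 3 4 5
mylist[["b"]]                  # "Hi There"
fn_value <- mylist$c(pi / 2)   # call the stored function: (pi/2) * sin(pi/2)

# The result of t.test() is itself a list with named components.
tt <- stats::t.test(c(5.1, 4.9, 5.6, 5.8, 6.0))   # made-up sample values
is.list(tt)                    # TRUE
names(tt)                      # component names such as "statistic" and "p.value"
tt$estimate                    # the sample mean
```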
Processing Text with Lists
Preview
14:47
Applying Functions to Lists
17:32
Vector and Matrix Exercise
04:52
Section 6: Continue Lists Discussion
Review Programming Exercises
16:08
Finish Programming Exercise Review and Begin Discussing Lists
15:16
List Data Structures General Discussion (part 2)
Preview
16:22
List Data Structures General Discussion (part 3)
15:46
Lists Data Structures General Discussion (part 4)
15:48
Section 7: Details About Dataframe Data Structures
13:52

Data Frames

A data frame is more general than a matrix, in that different columns can have different modes (numeric, character, factor, etc.). This is similar to SAS and SPSS datasets.

d <- c(1,2,3,4)
e <- c("red", "white", "red", NA)
f <- c(TRUE,TRUE,TRUE,FALSE)
mydata <- data.frame(d,e,f)
names(mydata) <- c("ID","Color","Passed") # variable names

There are a variety of ways to identify the elements of a data frame.

myframe[3:5] # columns 3,4,5 of data frame
myframe[c("ID","Age")] # columns ID and Age from data frame
myframe$X1 # variable X1 in the data frame
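The selection styles above can be tried on the small mydata frame. The sketch below rebuilds it so that it runs on its own; stringsAsFactors = FALSE is set explicitly as an assumption, so that the Color column stays character regardless of the R version.

```r
d <- c(1, 2, 3, 4)
e <- c("red", "white", "red", NA)
f <- c(TRUE, TRUE, TRUE, FALSE)
mydata <- data.frame(ID = d, Color = e, Passed = f, stringsAsFactors = FALSE)

cols_by_pos  <- mydata[2:3]                  # columns 2 and 3, still a data frame
cols_by_name <- mydata[c("ID", "Passed")]    # columns selected by name
color_vec    <- mydata$Color                 # a single column, returned as a vector

sapply(mydata, class)   # each column keeps its own mode
```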

15:08

A data frame is a table, or two-dimensional array-like structure, in which each column contains measurements on one variable, and each row contains one case or sample (observation) with the corresponding values for each variable for that observation.

Extracting Subdata Frames
Preview
16:20
A Salary Survey Extended Example
16:00
Merging Dataframes
14:29
End Dataframes Discussion; Matrix Exercise
14:10
Section 8: More Matrix and List Examples
Covariance Matrix Exercise Solution
12:21
14:05

Lists

An ordered collection of objects (components). A list allows you to gather a variety of (possibly unrelated) objects under one name.

# example of a list with 4 components -
# a string, a numeric vector, a matrix, and a scalar
w <- list(name="Fred", mynumbers=a, mymatrix=y, age=5.3)

# example of combining two lists into a single list
v <- c(list1,list2)

Identify elements of a list using the [[]] convention.

mylist[[2]] # 2nd component of the list
mylist[["mynumbers"]] # component named mynumbers in list
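Since a, y, list1 and list2 are not defined in the snippet above, here is a self-contained sketch with made-up values for them. It also contrasts [[ ]] with single brackets, and c() with list() for combining lists.

```r
a <- c(2, 4, 6)                 # made-up numeric vector
y <- matrix(1:4, nrow = 2)      # made-up matrix
w <- list(name = "Fred", mynumbers = a, mymatrix = y, age = 5.3)

w[[2]]              # 2 4 6: the component itself
w[["mynumbers"]]    # same component, selected by name
w["mynumbers"]      # single brackets return a list of length 1

list1 <- list(x = 1)
list2 <- list(y = "two")
v      <- c(list1, list2)       # one flat list with components x and y
nested <- list(list1, list2)    # a list whose two components are themselves lists
```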

List Example: Tree Growth (part 2)
10:45
14:32

Factors

Tell R that a variable is nominal by making it a factor. The factor stores the nominal values as a vector of integers in the range [ 1... k ] (where k is the number of unique values in the nominal variable), and an internal vector of character strings (the original values) mapped to these integers.

# variable gender with 20 "male" entries and
# 30 "female" entries
gender <- c(rep("male",20), rep("female", 30))
gender <- factor(gender)
# stores gender as 20 2s and 30 1s and associates
# 1=female, 2=male internally (alphabetically)
# R now treats gender as a nominal variable
summary(gender)

An ordered factor is used to represent an ordinal variable.

# variable rating coded as "large", "medium", "small"
rating <- ordered(rating)
# recodes rating to 1,2,3 and associates
# 1=large, 2=medium, 3=small internally
# R now treats rating as ordinal

R will treat factors as nominal variables and ordered factors as ordinal variables in statistical procedures and graphical analyses. You can use options in the factor( ) and ordered( ) functions to control the mapping of integers to strings (overriding the alphabetical ordering). You can also use factors to create value labels.
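The integer coding and the effect of overriding the alphabetical ordering can be inspected with a sketch like the following. The rating values are made up for illustration.

```r
gender <- factor(c(rep("male", 20), rep("female", 30)))
levels(gender)               # "female" "male" (alphabetical)
table(as.integer(gender))    # 30 ones (female) and 20 twos (male)

rating  <- c("small", "large", "medium", "small")   # made-up values
r_alpha <- ordered(rating)                          # default: large < medium < small
r_size  <- ordered(rating, levels = c("small", "medium", "large"))

levels(r_alpha)              # "large" "medium" "small"
r_size[1] < r_size[2]        # TRUE: with explicit levels, small < large
```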

Factors: tapply() and split() Functions
15:58
10:58

1. Creating factor variables

Factor variables are categorical variables that can be either numeric or string variables. There are a number of advantages to converting categorical variables to factor variables. Perhaps the most important advantage is that they can be used in statistical modeling where they will be implemented correctly, i.e., they will then be assigned the correct number of degrees of freedom. Factor variables are also very useful in many different types of graphics. Furthermore, storing string variables as factor variables can be a more efficient use of memory. To create a factor variable we use the factor function. The only required argument is a vector of values which can be either string or numeric. Optional arguments include the levels argument, which determines the categories of the factor variable; its default is the sorted list of all the distinct values of the data vector. The labels argument is another optional argument: a vector of values that will be the labels of the categories in the levels argument.
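A short sketch of the levels and labels arguments, using made-up numeric codes:

```r
codes <- c(2, 1, 3, 2, 2)    # made-up numeric category codes

# Explicit levels and labels: code 1 becomes "small", 2 "medium", 3 "large".
size <- factor(codes, levels = c(1, 2, 3),
               labels = c("small", "medium", "large"))
as.character(size)           # "medium" "small" "large" "medium" "medium"
nlevels(size)                # 3

# Default levels: the sorted distinct values of the data vector.
default <- factor(codes)
levels(default)              # "1" "2" "3"
```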

Pascal's Triangle Exercise
02:37
Section 9: Programming in R Environments
Pascal's Triangle Exercise Solution
10:27
Begin Programming Structures
15:32
14:16

R Programming Environment and Scope

In order to write functions in a proper way and avoid unusual errors, we need to know the concept of environment and scope in R.

R Programming Environment

An environment can be thought of as a collection of objects (functions, variables etc.). An environment is created when we first fire up the R interpreter. Any variable we define is now in this environment. The top level environment available to us at the R command prompt is the global environment, called R_GlobalEnv. The global environment can also be referred to as .GlobalEnv in R code. We can use the ls() function to show what variables and functions are defined in the current environment. Moreover, we can use the environment() function to get the current environment.
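The sketch below illustrates these ideas with ls(), environment() and globalenv(). The names counter and show_env are hypothetical; the example shows that a function call gets its own environment, separate from the one it was called from.

```r
counter <- 10                # defined in the current (usually global) environment

show_env <- function() {
  counter <- 99              # a separate object, local to this call
  local_names <- ls()        # names defined in the function's environment so far
  list(names = local_names, value = counter, env = environment())
}

res <- show_env()
res$names                          # "counter": the only local object when ls() ran
counter                            # still 10; the outer variable was not changed
identical(res$env, globalenv())    # FALSE: the call had its own environment
environmentName(globalenv())       # "R_GlobalEnv"
```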

Nesting Multiple Environments
16:06
Referencing Variables in Other Frames
14:53
Writing to Global Variables and Recursion
14:05
19:32

Anonymous Functions

As remarked at several points in this book, the purpose of the R function function() is to create functions. For instance, consider this code:

 inc <- function(x) return(x+1) 

It instructs R to create a function that adds 1 to its argument and then assigns that function to inc. However, that last step, the assignment, is not always taken. We can simply use the function object created by our call to function() without naming that object. The functions in that context are called anonymous, since they have no name. (That is somewhat misleading, since even nonanonymous functions only have a name in the sense that a variable is pointing to them.)
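For comparison, a minimal sketch that uses the named function inc and an equivalent anonymous function side by side:

```r
inc <- function(x) return(x + 1)

named <- sapply(1:3, inc)                     # pass the function by name
anon  <- sapply(1:3, function(x) x + 1)       # pass an unnamed function object

direct <- (function(x) x + 1)(10)             # create and call in one step: 11
```

Both sapply() calls produce the same result; the anonymous version simply skips the assignment step.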

Sorting Programs Exercise
07:08
Section 10: Performing Math and Simulations
Sorting Programs Exercise Solution (part 1)
12:30
Sorting Programs Exercise Solution (part 2)
13:57
Calculating a Probability
Preview
12:27
Linear Algebra Operations
17:08
15:22

Set Operations

Description

Performs set union, intersection, (asymmetric!) difference, equality and membership on two vectors.

Usage

 union(x, y)
 intersect(x, y)
 setdiff(x, y)
 setequal(x, y)
 is.element(el, set)

Arguments

x, y, el, set: vectors (of the same mode) containing a sequence of items (conceptually) with no duplicated values.

Details

Each of union, intersect, setdiff and setequal will discard any duplicated values in the arguments, and they apply as.vector to their arguments (and so in particular coerce factors to character vectors).

is.element(x, y) is identical to x %in% y.
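A quick sketch of these functions on two made-up vectors, including the asymmetry of setdiff() and the discarding of duplicates:

```r
x <- c(1, 2, 2, 3, 4)    # note the duplicated 2
y <- c(3, 4, 5)

u  <- union(x, y)        # 1 2 3 4 5 (duplicates discarded)
i  <- intersect(x, y)    # 3 4
d1 <- setdiff(x, y)      # 1 2
d2 <- setdiff(y, x)      # 5 (the difference is asymmetric)

setequal(c(1, 2, 2), c(2, 1))   # TRUE: duplicates and order are ignored
is.element(3, y)                # TRUE
3 %in% y                        # TRUE, the same test
```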

Combinatorial Simulations (part 1)
10:40
Combinatorial Simulations (part 2)
15:28
Winning at Roulette Exercise
07:39
Section 11: Object Oriented Programming (OOP) and S3 and S4 Classes
Winning at Roulette Exercise solution
13:17
11:16

Central to any object-oriented system are the concepts of class and method. A class defines the behavior of objects by describing their attributes and their relationship to other classes. The class is also used when selecting methods, functions that behave differently depending on the class of their input. Classes are usually organised in a hierarchy: if a method does not exist for a child, then the parent's method is used instead; the child inherits behaviour from the parent.
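The parent/child fallback described above can be sketched with R's S3 system. The classes "animal" and "dog" and the generics speak and describe are hypothetical, invented only for this illustration.

```r
# Generic functions: each dispatches on the class of its first argument.
speak    <- function(obj, ...) UseMethod("speak")
describe <- function(obj, ...) UseMethod("describe")

# Methods for the parent class.
speak.animal    <- function(obj, ...) paste(obj$name, "makes a sound")
describe.animal <- function(obj, ...) paste(obj$name, "is an animal")

# The child class overrides describe() but defines no speak() method.
describe.dog <- function(obj, ...) paste(obj$name, "is a dog")

# The class vector lists the child first, then the parent.
rex <- structure(list(name = "Rex"), class = c("dog", "animal"))

described <- describe(rex)   # "Rex is a dog": the child's own method is used
spoken    <- speak(rex)      # "Rex makes a sound": falls back to the parent's method
inherits(rex, "animal")      # TRUE
```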

OOP Example: lm() Function
10:32
09:34

R's OO systems differ in how classes and methods are defined:

  • S3 implements a style of OO programming called generic-function OO. This is different from most programming languages, like Java, C++, and C#, which implement message-passing OO. With message-passing, messages (methods) are sent to objects and the object determines which function to call. Typically, this object has a special appearance in the method call, usually appearing before the name of the method/message: e.g., canvas.drawRect("blue"). S3 is different. While computations are still carried out via methods, a special type of function called a generic function decides which method to call, e.g., drawRect(canvas, "blue"). S3 is a very casual system. It has no formal definition of classes.
  • S4 works similarly to S3, but is more formal. There are two major differences to S3. S4 has formal class definitions, which describe the representation and inheritance for each class, and has special helper functions for defining generics and methods. S4 also has multiple dispatch, which means that generic functions can pick methods based on the class of any number of arguments, not just one.
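To make the contrast concrete, here is a minimal sketch of both systems. The S3 part mirrors the drawRect(canvas, "blue") call mentioned above; the S4 class Rect and the generics area and describePair are hypothetical names used only for illustration.

```r
library(methods)   # S4 support; normally attached already, loaded here to be safe

# S3: no formal class definition, just a class attribute and a generic function.
drawRect <- function(shape, color) UseMethod("drawRect")
drawRect.canvas <- function(shape, color) paste("canvas rectangle in", color)
canvas <- structure(list(), class = "canvas")
s3_result <- drawRect(canvas, "blue")

# S4: a formal class definition with typed slots, plus helper functions
# for defining generics and methods.
setClass("Rect", representation(w = "numeric", h = "numeric"))
setGeneric("area", function(shape, ...) standardGeneric("area"))
setMethod("area", "Rect", function(shape, ...) shape@w * shape@h)
s4_result <- area(new("Rect", w = 2, h = 3))   # 6

# S4 multiple dispatch: the method is chosen from the classes of both arguments.
setGeneric("describePair", function(x, y) standardGeneric("describePair"))
setMethod("describePair", signature("numeric", "character"),
          function(x, y) "number then text")
setMethod("describePair", signature("character", "numeric"),
          function(x, y) "text then number")
pair1 <- describePair(1, "a")
pair2 <- describePair("a", 1)
```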
Using Inheritance
07:23
Compressing Matrices Example (part 1)
14:37
Compressing Matrices Example (part 2)
02:35
Writing S3 Classes Exercise
02:35
14:11
Implementing S4 Generic Functions
16:31
Writing S4 Classes Exercise
03:41
Live S3 and S4 Class Development
07:36
Continue S3 Class Development
13:19
Developing a Corresponding S4 Class
10:01
Section 12: Input and Output
Writing S3 Classes Exercise Solution
09:08
Writing S4 Classes Exercise Solution
08:02

Instructor Biography

Geoffrey Hubona, Ph.D., Professor of Information Systems

Dr. Geoffrey Hubona held full-time tenure-track, and tenured, assistant and associate professor faculty positions at 3 major state universities in the Eastern United States from 1993-2010. In these positions, he taught dozens of various statistics, business information systems, and computer science courses to undergraduate, master's and Ph.D. students. He earned a Ph.D. in Business Administration (Information Systems and Computer Science) from the University of South Florida (USF) in Tampa, FL (1993); an MA in Economics (1990), also from USF; an MBA in Finance (1979) from George Mason University in Fairfax, VA; and a BA in Psychology (1972) from the University of Virginia in Charlottesville, VA. He was a full-time assistant professor at the University of Maryland Baltimore County (1993-1996) in Catonsville, MD; a tenured associate professor in the department of Information Systems in the Business College at Virginia Commonwealth University (1996-2001) in Richmond, VA; and an associate professor in the CIS department of the Robinson College of Business at Georgia State University (2001-2010). He is the founder of the Georgia R School (2010-2014) and of R-Courseware (2014-Present), online educational organizations that teach research methods and quantitative analysis techniques. These research methods techniques include linear and non-linear modeling, multivariate methods, data mining, programming and simulation, and structural equation modeling and partial least squares (PLS) path modeling. Dr. Hubona is an expert of the analytical, open-source R software suite and of various PLS path modeling software packages, including SmartPLS. He has published dozens of research articles that explain and use these techniques for the analysis of data, and, with software co-development partner Dean Lim, has created a popular cloud-based PLS software application, PLS-GUI.
