Essential Fundamentals of R is an integrated program that draws from a variety of introductory topics and courses to give participants a solid base of knowledge for using R software for any intended purpose. No statistical knowledge, programming knowledge, or experience with R software is necessary. Essential Fundamentals of R (7 sessions) covers the introductory topics basic to using R functions and data objects for any purpose: installing R and RStudio; interactive versus batch use of R; reading data and datasets into R; essentials of scripting; getting help in R; primitive data types; important data structures; using functions in R; writing user-defined functions; the 'apply' family of functions in R; and data set manipulation, including subsetting and row and column selection. Most sessions present "hands-on" material that demonstrates the execution of R 'scripts' (sets of commands) and uses many extended examples of R functions, applications, and packages for a variety of common purposes. RStudio, a popular, open-source Integrated Development Environment (IDE) for developing and using R applications, is also used in the program, supplemented with direct R scripts (e.g. 'command-line prompts') when necessary.
Section 1: Introduction and Orientation  

Lecture 1  14:56  
R is a programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians and data miners for developing statistical software and data analysis. Polls, surveys of data miners, and studies of scholarly literature databases show that R's popularity has increased substantially in recent years. R is an implementation of the S programming language combined with lexical scoping semantics inspired by Scheme. S was created by John Chambers while at Bell Labs. There are some important differences, but much of the code written for S runs unaltered in R. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team, of which Chambers is a member. R is named partly after the first names of the first two R authors and partly as a play on the name of S. R is a GNU project. The source code for the R software environment is written primarily in C, Fortran, and R. R is freely available under the GNU General Public License, and precompiled binary versions are provided for various operating systems. R uses a command-line interface; there are also several graphical front ends for it.

Lecture 2  14:24  

Lecture 3  15:37  
The R Environment consists of all the files necessary for running the R program as well as data sets and other objects that you have created or loaded into your Workspace. These files can be broken down into three basic types:
1. The base packages that run all the standard analyses that we use in this course. These files are installed automatically when you first download and install the R program.
2. Additional packages you can install on your own and which allow for more advanced statistical analysis or additional commands.
3. The data sets that you download and other objects (data sets and variables) that you create.
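As a rough illustration of these three kinds of files, the sketch below uses only base R functions to look at attached packages, installed packages, and workspace objects. The object name scores is hypothetical and exists only for this example.

```r
# 1. Base packages attached at startup appear on the search path
search()

# 2. Packages installed on this machine (names only); the list will
#    differ from one installation to another
installed <- rownames(installed.packages())
head(installed)

# 3. Objects that you create live in the workspace
scores <- c(90, 85, 77)   # hypothetical object, for illustration only
ls()                      # lists workspace objects, including "scores"
```

Calling rm(scores) would remove the object from the workspace again.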

Lecture 4 
Workspace Management and R Manuals

12:58  
Lecture 5 
Hands-On Tutorial of R Basics (part 1)

14:35  
Lecture 6 
Hands-On Tutorial of R Basics (part 2)

14:53  
Lecture 7 
Tutorial with R Functions

13:54  
Lecture 8  19:25  
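R Functions for Probability Distributions. Every distribution that R handles has four functions. There is a root name; for example, the root name for the normal distribution is norm. This root is prefixed by one of the letters p (cumulative distribution function), q (quantile function, the inverse of the cumulative distribution function), d (density or probability function), or r (random number generation). For the normal distribution, these functions are pnorm, qnorm, dnorm, and rnorm. For the binomial distribution, these functions are pbinom, qbinom, dbinom, and rbinom. And so forth. For a continuous distribution (like the normal), the most useful functions for doing problems involving probability calculations are the "p" and "q" functions, because the density returned by the "d" function yields probabilities only through integration. For a discrete distribution (like the binomial), the "d" function calculates the probability function f(x) = P(X = x) and hence is useful in calculating probabilities.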
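The naming scheme can be tried directly at the console. The following is a minimal sketch using the normal and binomial functions named above; the variable names and the particular parameter values are chosen only for illustration.

```r
# Normal distribution: root name "norm"
p_below <- pnorm(1.96)     # P(Z <= 1.96) for a standard normal
z_975   <- qnorm(0.975)    # value with 97.5% of the probability below it

# Binomial distribution: root name "binom"
# P(X = 2) for 4 trials with success probability 0.5: 6 * 0.5^4 = 0.375
p_two <- dbinom(2, size = 4, prob = 0.5)

# P(X <= 2) equals the sum of the individual probabilities for 0, 1, 2
p_at_most_two <- pbinom(2, size = 4, prob = 0.5)

# Random draws use the "r" prefix; set a seed so the draws can be repeated
set.seed(1)
draws <- rnorm(5)
```

Note that pnorm and qnorm are inverses of each other, so pnorm(qnorm(0.975)) returns 0.975 up to floating-point error.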


Section 2: Input and Output, Data and Data Structures  
Lecture 9 
Data Input and Output

14:44  
Lecture 10 
Accessing Data Sets in R

14:40  
Lecture 11 
Basic Data Structures (part 1)

14:47  
Lecture 12 
Basic Data Structures (part 2)

14:42  
Lecture 13 
Basic Data Structures (part 3)

14:35  
Lecture 14  16:04  
A data frame is used for storing data tables. It is a list of vectors of equal length. For example, the following variable df is a data frame containing three vectors n, s, b.
> n = c(2, 3, 5)
We use built-in data frames in R for our tutorials. For example, here is a built-in data frame in R, called mtcars.
> mtcars
The top line of the table, called the header, contains the column names. Each horizontal line afterward denotes a data row, which begins with the name of the row, followed by the actual data. Each data member of a row is called a cell. To retrieve data in a cell, we enter its row and column coordinates in the single square bracket "[]" operator. The two coordinates are separated by a comma. In other words, the coordinates begin with the row position, followed by a comma, and end with the column position. The order is important. Here is the cell value from the first row, second column of mtcars.
> mtcars[1, 2]
Moreover, we can use the row and column names instead of the numeric coordinates.
> mtcars["Mazda RX4", "cyl"]
Lastly, the number of data rows in the data frame is given by the nrow function.
> nrow(mtcars) # number of data rows
And the number of columns of a data frame is given by the ncol function.
> ncol(mtcars) # number of columns
Further details of the mtcars data set are available in the R documentation.
> help(mtcars)
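The pieces described above can be put together in a short script. This is a sketch only: the text shows the definition of n but not of s or b, so the values given to s and b here are assumptions made for illustration.

```r
# n comes from the text; the contents of s and b are assumed for this sketch
n <- c(2, 3, 5)
s <- c("aa", "bb", "cc")
b <- c(TRUE, FALSE, TRUE)
df <- data.frame(n, s, b)   # three vectors of equal length become columns

# Cell lookup by position and by name on the built-in mtcars data
by_position <- mtcars[1, 2]
by_name     <- mtcars["Mazda RX4", "cyl"]
identical(by_position, by_name)   # both refer to the same cell

dim(mtcars)   # number of rows, then number of columns
```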

Lecture 15  10:48  
One of the most important aspects of computing with data is the ability to manipulate it, to enable subsequent analysis and visualization. R offers a wide range of tools for this purpose. 
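A few of the base R tools for this kind of manipulation are sketched below on the built-in mtcars data frame. The choice of columns and the variable names are illustrative assumptions, not the exercises used in the lecture.

```r
# Keep only four-cylinder cars and three columns of interest
four_cyl <- mtcars[mtcars$cyl == 4, c("mpg", "cyl", "wt")]

# Sort the subset from highest to lowest fuel economy
four_cyl <- four_cyl[order(four_cyl$mpg, decreasing = TRUE), ]

# Mean mpg for each cylinder count, ready for a table or a bar plot
mpg_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)
```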

Lecture 16 
Input Output Exercises

01:04  
Lecture 17 
Dataframe Manipulation Exercises

04:11  
Section 3: Manipulating Dataframes in Depth  
Lecture 18 
Input Output Exercises Solution

14:21  
Lecture 19 
Data Manipulation Exercise Solution

14:25  
Lecture 20 
Manipulating Dataframes (part 3)

07:48  
Lecture 21 
Manipulating Dataframes (part 4)

14:24  
Lecture 22 
Manipulating Dataframes (part 5)

18:37  
Lecture 23 
Manipulating Dataframes (part 6)

12:12  
Section 4: User-Defined Functions in R  
Lecture 24 
Remaining Data Manipulation Exercises Solutions

14:58  
Lecture 25 
User-Defined Function Exercise and Finish Manipulating Dataframes

15:43  
Lecture 26  14:06  
One of the great strengths of R is the user's ability to add functions. In fact, many of the functions in R are actually functions of functions. The structure of a function is given below. 
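The general shape of a user-defined function is a name, an argument list, a body of statements, and a returned value. The sketch below shows that shape; describe() and its arguments are hypothetical names invented for this example.

```r
# General shape: name <- function(arguments) { statements; return(value) }
# describe() is a hypothetical function used only for this sketch
describe <- function(x, center = mean) {
  # center is itself a function, so describe() is a "function of functions"
  middle <- center(x)
  spread <- sd(x)
  return(c(center = middle, spread = spread))
}

describe(c(2, 4, 6))                    # uses mean() by default
describe(c(2, 4, 6), center = median)   # swaps in median()
```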

Lecture 27  14:16  
Objects in the function are local to the function. The object returned can be any data type. 
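Both points can be checked with a small experiment. In this sketch the function names add_one() and summarize() are hypothetical.

```r
x <- 10

# add_one() is a hypothetical function; the x inside it is a local copy
add_one <- function(x) {
  x <- x + 1   # changes only the local x
  x
}

result <- add_one(x)   # 11
x                      # still 10: the global x is untouched

# The returned object can be of any type, for example a list
summarize <- function(v) {
  list(n = length(v), values = v, all_positive = all(v > 0))
}
info <- summarize(c(1, 2, 3))
```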

Lecture 28 
Formal, Local and Free Parameters

14:42  
Lecture 29 
Flexible Arguments to Functions

12:06  
Section 5: Writing Functions in R  
Lecture 30 
User-Defined Functions Exercise Solution

13:22  
Lecture 31 
More on User-Defined Functions

14:31  
Lecture 32  15:13  
The classic, Fortran-like loop is available in R. The syntax is a little different, but the idea is identical: you request that an index, i, takes on a sequence of values, and that one or more lines of commands are executed as many times as there are different values of i. Here is a loop executed five times with the values of i from 1 to 5, in which we print the square of each value:
for (i in 1:5) print(i^2)
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25

Lecture 33 
Control Statements

16:05  
Lecture 34 
Returning Values from a Function

15:29  
Lecture 35  12:33  
The purpose of the R function function() is to create functions. For instance, consider this code:
inc <- function(x) return(x+1)
It instructs R to create a function that adds 1 to its argument and then assigns that function to inc.
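A quick way to see that inc is an ordinary object holding a function is to call it, inspect it, and pass it to another function. This is a minimal sketch; the input values are arbitrary.

```r
inc <- function(x) return(x + 1)

inc(3)             # 4
is.function(inc)   # TRUE: inc is an object like any other
sapply(1:3, inc)   # functions can be handed to other functions
```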

Section 6: The Apply Family of Functions  
Lecture 36 
Some Short Programs in R (part 1)

15:13  
Lecture 37 
Some Short Programs in R (part 2)

16:02  
Lecture 38  16:11  
"Apply" functions keep you from having to write loops to perform some operation on every row or every column of a matrix or data frame, or on every element in a list. For example, the built-in data set state.x77 contains eight columns of data describing the 50 U.S. states in 1977. If you wanted the average of each of the eight columns, you could do this:
> avgs <- numeric(8)
> for (i in 1:8)
+ avgs[i] <- mean(state.x77[,i]) # The "+" is R's continuation character; don't type it
> avgs
[1] 4246.4200 4435.8000 1.1700 70.8786 7.3780 53.1080 104.4600 70735.8800
This is comparatively slow, and much more so on large data sets, because explicit loops in R carry considerable overhead. A more compact way to do this is to use the apply() function. In this example, which computes medians rather than means, apply extracts each column as a vector, one at a time, and passes it to the median() function.
> apply(state.x77, 2, median)
Population Income Illiteracy Life Exp Murder HS Grad Frost Area
2838.5 4519 0.95 70.675 6.85 53.25 114.5 54277
The 2 means "go by column"; a 1 would have meant "go by row." Of course, if we had used a 1, we would have computed 50 medians, one for each row. If we had had a three-dimensional array we could have used a 3 there. The third argument specifies the function to be applied to each column. We can use any function that makes sense there. We can use our own function or even pass in a function that we write on the spot. If your function returns a vector of constant length, apply will stick the vectors together into a matrix. However, if your function returns vectors of different lengths, apply will have to create a list.

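The behaviour of apply() described for Lecture 38 can be sketched as follows. The small matrix m and the anonymous functions are assumptions introduced only to show the matrix-versus-list outcome.

```r
# Column means without an explicit loop
col_means <- apply(state.x77, 2, mean)

# A function written on the spot; every call returns a vector of length 2,
# so the results are bound together into a 2 x 8 matrix
col_ranges <- apply(state.x77, 2, function(col) c(min(col), max(col)))
dim(col_ranges)

# When the returned vectors differ in length, the result is a list
m <- matrix(1:6, nrow = 2)   # hypothetical matrix with columns (1,2), (3,4), (5,6)
ragged <- apply(m, 2, function(col) seq_len(col[1]))   # lengths 1, 3 and 5
```

For the common case of column means, colMeans(state.x77) gives the same numbers as the first call.
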
Lecture 39 
The Apply Family of Functions (part 2)

17:35  
Lecture 40 
The Apply Family of Functions (part 3)

11:47  
Lecture 41 
Apply Functions Exercises

08:22  
Section 7: Reshaping and Recoding Data  
Lecture 42 
Apply Functions Exercises Solutions

14:24  
Lecture 43  12:04  
The reshape package in R l 

Lecture 44  16:57  
Recoding data allows you to change a data type, for example, or to calculate new data columns from existing ones. 
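Both kinds of recoding can be sketched with base R alone. The data frame survey, its columns, and the category labels below are hypothetical and serve only to illustrate the idea.

```r
# survey is a hypothetical data frame used only for this sketch
survey <- data.frame(age = c(23, 41, 67), height_cm = c(170, 182, 165))

# Change a data type: numeric codes become a labelled factor
survey$group <- factor(c(1, 2, 1), levels = c(1, 2),
                       labels = c("control", "treated"))

# Calculate a new column from an existing one
survey$height_m <- survey$height_cm / 100

# Recode a numeric column into ordered bands
survey$age_band <- cut(survey$age, breaks = c(0, 30, 60, Inf),
                       labels = c("young", "middle", "senior"))
```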

Lecture 45  15:50  

Lecture 46 
More VectorMaker Exercises

06:58 
Dr. Geoffrey Hubona held full-time tenure-track, and tenured, assistant and associate professor faculty positions at 3 major state universities in the Eastern United States from 1993 to 2010. In these positions, he taught dozens of statistics, business information systems, and computer science courses to undergraduate, master's and Ph.D. students. He earned a Ph.D. in Business Administration (Information Systems and Computer Science) from the University of South Florida (USF) in Tampa, FL (1993); an MA in Economics (1990), also from USF; an MBA in Finance (1979) from George Mason University in Fairfax, VA; and a BA in Psychology (1972) from the University of Virginia in Charlottesville, VA. He was a full-time assistant professor at the University of Maryland Baltimore County (1993-1996) in Catonsville, MD; a tenured associate professor in the department of Information Systems in the Business College at Virginia Commonwealth University (1996-2001) in Richmond, VA; and an associate professor in the CIS department of the Robinson College of Business at Georgia State University (2001-2010). He is the founder of the Georgia R School (2010-2014) and of R-Courseware (2014-Present), online educational organizations that teach research methods and quantitative analysis techniques. These techniques include linear and non-linear modeling, multivariate methods, data mining, programming and simulation, and structural equation modeling and partial least squares (PLS) path modeling. Dr. Hubona is an expert in the analytical, open-source R software suite and in various PLS path modeling software packages, including SmartPLS. He has published dozens of research articles that explain and use these techniques for the analysis of data, and, with software co-development partner Dean Lim, has created a popular cloud-based PLS software application, PLS-GUI.