Programming Statistical Applications in R is an introductory course teaching the basics of programming mathematical and statistical applications using the R language. The course makes extensive use of the Introduction to Scientific Programming and Simulation using R (spuRs) package from the Comprehensive R Archive Network (CRAN). It is a scientific-programming foundations course and a useful complement and precursor to the more simulation-application-oriented R Programming for Simulation and Monte Carlo Methods Udemy course. The two courses were originally developed as a two-course sequence (although they share some exercises). Together, the two courses teach you how to create your own mathematical and statistical functions and applications using R software.
Programming Statistical Applications in R is a "hands-on" course that comprehensively teaches fundamental R programming skills, concepts and techniques useful for developing statistical applications with R software. The course also uses dozens of "real-world" scientific function examples. A student does not need to be familiar with R, or knowledgeable about programming in general, to successfully complete this course. The course is self-contained and includes all materials, slides, exercises and solutions; in fact, everything seen in the course video lessons is included in zipped, downloadable materials files. The course is a great instructional resource for anyone interested in refining their skills and knowledge of statistical programming using the R language. It would be useful for practicing quantitative analysis professionals, and for undergraduate and graduate students seeking new job-related skills and/or skills applicable to the analysis of research data.
The course begins with basic instruction on installing and using the R console and the RStudio application, and provides the instruction needed to create and execute R scripts and R functions. Basic R data structures are explained, followed by instruction on data input and output and on basic R programming techniques and control structures. Detailed examples of creating new statistical R functions, and of using existing statistical R functions, are presented. Bootstrap and jackknife resampling methods are explained in detail, as are techniques for statistical inference, for constructing confidence intervals, and for performing N-fold cross-validation assessments of competing statistical models. Finally, detailed instructions and examples for debugging R programs and for making them run more efficiently are demonstrated.
Section 1: Introduction to Course Materials, Installing Packages, and Executing Scripts  

Lecture 1 
Course Introduction

01:58  
Lecture 2 
Introduction to Course Materials

03:21  
Lecture 3  00:45  
RStudio is an Integrated Development Environment (IDE) software tool developed especially to run R software. 

Lecture 4  07:34  
R is a programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians and data miners for developing statistical software and data analysis. 

Lecture 5 
A Look at the R Console and RStudio

04:43  
Lecture 6 
Executing Script and Installing Packages in RStudio (part 1)

07:25  
Lecture 7 
Executing Script and Installing Packages in RStudio (part 2)

07:08  
Lecture 8 
R Script Demonstrations using RStudio

06:40  
Lecture 9  07:46  
To make the best of the R language, you'll need a strong understanding of the basic data types and data structures and how to operate on them. This is important because these are the objects you will manipulate on a day-to-day basis in R. 

Lecture 10 
Scripting Basic Data Structures (part 2)

08:28  
Lecture 11  07:11  
Functions have named arguments which potentially have default values. The formal arguments are the arguments included in the function definition. The formals function returns a list of all the formal arguments of a function. Not every function call in R makes use of all the formal arguments. Function arguments can be missing or might have default values. 
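
The ideas above can be illustrated with a minimal sketch. The function `power_sum` and its arguments are hypothetical names invented for this example; `formals()` and `missing()` are base R functions.

```r
# Hypothetical function used only for illustration.
power_sum <- function(x, p = 2, na.rm = FALSE) {
  # missing() reports whether the caller supplied an argument.
  if (missing(x)) stop("argument 'x' is required")
  sum(x^p, na.rm = na.rm)
}

# formals() returns the formal arguments as a named list.
arg_names <- names(formals(power_sum))   # "x" "p" "na.rm"
default_p <- formals(power_sum)$p        # 2

# A call need not supply every formal argument; defaults fill the gaps.
r1 <- power_sum(1:3)          # uses p = 2: 1 + 4 + 9 = 14
r2 <- power_sum(1:3, p = 1)   # overrides the default: 1 + 2 + 3 = 6
```

Here `x` has no default, so the function checks for it explicitly, while `p` and `na.rm` fall back to their defaults when omitted.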

Lecture 12 
R Functions (part 2)

06:59  
Lecture 13 
R Functions (part 3)

07:11  
Lecture 14  06:15  
Creating matrices: The function matrix creates matrices, with the form matrix(data, nrow, ncol, byrow). The data argument is usually a vector of the elements that will fill the matrix. The nrow and ncol arguments specify the dimensions of the matrix. Often only one dimension argument is needed: if, for example, there are 20 elements in the data vector and ncol is specified to be 4, then R will automatically calculate that there should be 5 rows and 4 columns, since 4*5=20. The byrow argument specifies how the matrix is to be filled. The default value for byrow is FALSE, which means that by default the matrix is filled column by column. 
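
As a small sketch of the 20-element case described above (the variable names `m_col` and `m_row` are chosen only for this example):

```r
# 20 elements with ncol = 4: R infers nrow = 5.
m_col <- matrix(1:20, ncol = 4)                 # default byrow = FALSE
m_row <- matrix(1:20, ncol = 4, byrow = TRUE)   # fill row by row instead

dim(m_col)    # 5 4
m_col[1, ]    # 1 6 11 16  (columns were filled first)
m_row[1, ]    # 1 2 3 4    (rows were filled first)
```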

Lecture 15 
Manipulating Matrices (part 2)

06:22  
Lecture 16 
Manipulating Matrices (part 3)

05:39  
Section 2: Basic R Programming Concepts and Techniques  
Lecture 17 
Basic R Programming Concepts and Examples (part 1)

07:15  
Lecture 18 
Basic R Programming Concepts and Examples (part 2)

08:37  
Lecture 19  07:39  
R has the standard control structures you would expect. Within them, the expression (expr) can be a compound of multiple statements enclosed in braces { }. It is usually more efficient to use built-in functions, such as ifelse, rather than explicit control structures whenever possible. 
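
A brief sketch of that contrast, using made-up data: the same labelling task is written once with an explicit loop and braces, and once with the vectorized built-in `ifelse()`.

```r
x <- c(-2, 0, 3, 7)

# Explicit loop; each branch is a compound expression in braces.
labels_loop <- character(length(x))
for (i in seq_along(x)) {
  if (x[i] > 0) {
    labels_loop[i] <- "positive"
  } else {
    labels_loop[i] <- "non-positive"
  }
}

# Vectorized equivalent: ifelse() tests every element at once.
labels_vec <- ifelse(x > 0, "positive", "non-positive")

same_result <- identical(labels_loop, labels_vec)   # TRUE
```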

Lecture 20 
Looping Control Structure Examples (part 2)

08:48  
Lecture 21 
Looping and Control Structure Exercises

00:50  
Lecture 22 
Data Input and Output (part 1)

07:06  
Lecture 23 
Data Input and Output (part 2)

05:56  
Lecture 24 
Formatting Output (part 1)

10:13  
Lecture 25 
Formatting Output (part 2)

07:46  
Lecture 26 
Interactive Input and Output

07:54  
Lecture 27 
Looping and Control Structure Exercises (part 1)

09:15  
Lecture 28 
Looping and Control Structure Exercises (part 2)

07:35  
Lecture 29 
Looping and Control Structure Exercises (part 3)

07:50  
Lecture 30 
Writing Output to a File (part 1)

06:55  
Lecture 31 
Writing Output to a File (part 2)

06:37  
Lecture 32 
Plotting as Output (part 1)

06:12  
Lecture 33 
Plotting as Output (part 2)

07:33  
Lecture 34 
Exercise: Writing Statistical and Scientific Expressions

1 page  
Lecture 35 
Exercise Solution: Writing Statistical and Scientific Functions

8 pages  
Section 3: Writing User-Defined Functions in R  
Lecture 36 
Writing Functions as Programs (part 1)

10:03  
Lecture 37 
Writing Functions as Programs (part 2)

08:02  
Lecture 38 
Winsorized Means Example

09:00  
Lecture 39  08:15  
User-written Functions: One of the great strengths of R is the user's ability to add functions. In fact, many of the functions in R are actually functions of functions. A function consists of a name, a list of arguments, and a body. Objects created in the function are local to the function. The object returned can be any data type. 
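
A minimal sketch of a user-written function follows. The name `center_scale` and its arguments are hypothetical, chosen only to show local objects and a non-trivial return value.

```r
# General form: name <- function(arguments) { statements; value }
center_scale <- function(x, center = TRUE) {
  m <- if (center) mean(x) else 0   # 'm' and 's' exist only inside the call
  s <- sd(x)
  # The returned object can be any data type; here it is a list.
  list(values = (x - m) / s, mean = m, sd = s)
}

out <- center_scale(c(2, 4, 6))
out$mean     # 4
out$sd       # 2
out$values   # -1 0 1
```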

Lecture 40 
Writing Functions in R (part 2)

08:04  
Lecture 41 
Writing Functions in R (part 3)

09:35  
Lecture 42 
Writing Functions in R (part 4)

07:31  
Lecture 43 
Apply Family of Functions (part 1)

08:28  
Lecture 44 
Apply Family of Functions (part 2)

08:53  
Lecture 45 
Apply Family of Functions (part 3)

07:40  
Lecture 46 
Apply Family of Functions (part 4)

10:43  
Lecture 47 
Apply Family of Functions (part 5)

06:07  
Lecture 48 
Making Programs Run Efficiently

10:27  
Lecture 49 
Exercise: Writing Functions and Programs

2 pages  
Lecture 50 
Exercise Solutions: Writing Functions and Programs (part 1)

07:58  
Lecture 51 
Exercise Solutions: Writing Functions and Programs (part 2)

04:40  
Lecture 52 
Exercise: Vector Maker Functions

04:38  
Section 4: Data Types and Structures: Factors, Dataframes and Lists  
Lecture 53 
Exercise Solutions: Vector Maker Functions (part 1)

09:02  
Lecture 54 
Exercise Solutions: Vector Maker Functions (part 2)

07:41  
Lecture 55  08:33  
Factors: Tell R that a variable is nominal by making it a factor. The factor stores the nominal values as a vector of integers in the range [1, ..., k] (where k is the number of unique values in the nominal variable), and an internal vector of character strings (the original values) mapped to these integers. 
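
The integer-plus-labels storage can be inspected directly. This sketch uses invented data; by default `factor()` sorts the levels, so the integer codes follow that sorted order.

```r
grp <- c("low", "high", "medium", "low", "high")
f <- factor(grp)

levels(f)       # "high" "low" "medium"  (the character labels)
as.integer(f)   # 2 1 3 2 1              (the underlying integer codes)
nlevels(f)      # 3, i.e. k unique values
```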

Lecture 56 
Data Types: Factors (part 2)

10:20  
Lecture 57  07:33  
Data Frames: A data frame is more general than a matrix, in that different columns can have different modes (numeric, character, factor, etc.). This is similar to SAS and SPSS datasets. 
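
A short sketch with made-up columns shows one data frame holding three different column types:

```r
df <- data.frame(
  id    = 1:3,
  name  = c("a", "b", "c"),
  group = factor(c("x", "y", "x")),
  stringsAsFactors = FALSE   # keep 'name' as character in all R versions
)

# Each column keeps its own class.
col_classes <- sapply(df, class)
col_classes   # id: "integer", name: "character", group: "factor"
```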

Lecture 58 
Data Structures: Dataframes (part 2)

08:14  
Lecture 59 
Data Structures: Dataframes (part 3)

08:22  
Lecture 60 
Data Structures: Dataframes (part 4)

06:47  
Lecture 61  09:29  
Lists: A list is an ordered collection of objects (components). A list allows you to gather a variety of (possibly unrelated) objects under one name. 
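
For instance, a single list can hold a string, a numeric vector, a matrix and a function. The names below are illustrative only.

```r
bundle <- list(
  title  = "demo",
  values = c(1.5, 2.5),
  m      = diag(2),   # 2 x 2 identity matrix
  f      = mean       # functions are objects too
)

n_components <- length(bundle)            # 4
second_value <- bundle[["values"]][2]     # 2.5
avg          <- bundle$f(bundle$values)   # 2
```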

Lecture 62 
Data Structures: Lists (part 2)

12:03  
Section 5: Bootstrap and Jackknife Resampling Methods  
Lecture 63  07:42  
In statistics, resampling is any of a variety of methods that reuse the observed sample, for example to estimate the precision of sample statistics or to validate models.
Common resampling techniques include bootstrapping, jackknifing and permutation tests. 
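
As a rough sketch of bootstrapping, the code below estimates the standard error and bias of a sample mean by resampling with replacement. The simulated data, the seed and the number of replicates `B` are assumptions made for this illustration and are not taken from the course materials.

```r
set.seed(1)                          # for reproducibility of the sketch
x <- rnorm(50, mean = 10, sd = 2)    # simulated sample (assumption)
B <- 1000                            # number of bootstrap replicates

# Draw B resamples of the same size, with replacement, and
# recompute the statistic on each one.
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

se_boot   <- sd(boot_means)               # bootstrap standard error
bias_boot <- mean(boot_means) - mean(x)   # bootstrap estimate of bias

# For the mean, the usual formula gives a comparison value.
se_formula <- sd(x) / sqrt(length(x))
```

For the sample mean the two standard errors should be close, which makes it a convenient check before applying the same recipe to statistics that have no simple formula.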

Lecture 64 
Bootstrap Estimate of Standard Error and Bias (part 2)

07:39  
Lecture 65 
Bootstrapping a Ratio Statistic

10:13  
Lecture 66 
Jackknife Estimate of Bias and Standard Error

11:30  
Lecture 67 
Bootstrapping Confidence Intervals (part 1)

08:41  
Lecture 68 
Bootstrapping Confidence Intervals (part 2)

09:13  
Lecture 69 
Bootstrapping Confidence Intervals (part 3)

10:27  
Lecture 70  07:33  
In k-fold (also called n-fold) cross-validation, the original sample is randomly partitioned into k equal-sized subsamples. Of the k subsamples, a single subsample is retained as the validation data for testing the model, and the remaining k − 1 subsamples are used as training data. The cross-validation process is then repeated k times (the folds), with each of the k subsamples used exactly once as the validation data. The k results from the folds can then be averaged (or otherwise combined) to produce a single estimate. The advantage of this method over repeated random subsampling is that all observations are used for both training and validation, and each observation is used for validation exactly once. 10-fold cross-validation is commonly used, but in general k remains an unfixed parameter. When k = n (the number of observations), k-fold cross-validation is exactly leave-one-out cross-validation. In stratified k-fold cross-validation, the folds are selected so that the mean response value is approximately equal in all the folds. In the case of a dichotomous classification, this means that each fold contains roughly the same proportions of the two types of class labels. 
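
The procedure can be sketched in a few lines. The choice of the built-in `mtcars` data, the model `mpg ~ wt`, `k = 5` and mean squared error as the score are all assumptions made for this example. Because 32 observations do not divide evenly by 5, the folds are only approximately equal in size.

```r
set.seed(42)
k <- 5
n <- nrow(mtcars)                            # 32 observations

# Randomly assign each observation to one of k folds.
folds <- sample(rep(1:k, length.out = n))

mse <- numeric(k)
for (i in 1:k) {
  train <- mtcars[folds != i, ]              # k - 1 folds for training
  test  <- mtcars[folds == i, ]              # held-out fold for validation
  fit   <- lm(mpg ~ wt, data = train)
  pred  <- predict(fit, newdata = test)
  mse[i] <- mean((test$mpg - pred)^2)
}

cv_mse <- mean(mse)   # average the k fold results into one estimate
```

To compare competing models, the same `folds` vector would be reused for each model so that every candidate is scored on identical splits.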

Lecture 71 
N-Fold Cross-Validation of Models (part 2)

04:42  
Lecture 72 
N-Fold Cross-Validation of Models (part 3)

10:42  
Lecture 73 
Bootstrap-Jackknife Resampling Exercise

01:04  
Section 6: Debugging and Program Efficiency  
Lecture 74 
Bootstrap-Jackknife Resampling Exercise Solution

03:28  
Lecture 75 
Debugging R Programs

15:13  
Lecture 76 
Findruns Program Debugging Example (part 1)

12:13  
Lecture 77 
Findruns Program Debugging Example (part 2)

07:29  
Lecture 78  10:54  
Another approach makes use of the local environment within a function to access the variables. When we define methods with this approach later (the Local Environment Approach), the results will look more like the object-oriented approaches seen in other languages. The approach relies on the local scope created when a function is called. A new environment is created that can be identified using the environment command. The environment can be saved in the list created for the class, and the variables within this scope can then be accessed through that saved environment. 
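
One possible sketch of this local environment approach is shown below. The `Counter` class and its `increment` and `get_count` functions are hypothetical names invented for the illustration; only `environment()`, `get()` and `assign()` are base R.

```r
# Hypothetical constructor illustrating the local environment approach.
Counter <- function(start = 0) {
  count <- start
  this_env <- environment()   # the environment created by this call

  increment <- function() {
    assign("count", get("count", envir = this_env) + 1, envir = this_env)
    invisible(NULL)
  }
  get_count <- function() get("count", envir = this_env)

  # Save the environment in the list that represents the object.
  me <- list(env = this_env, increment = increment, get_count = get_count)
  class(me) <- "Counter"
  me
}

c1 <- Counter(5)
c1$increment()
c1$increment()
after_two <- c1$get_count()   # 7

c2 <- c1                      # copies the list, but not the environment
c2$increment()
shared <- c1$get_count()      # 8: both names refer to the same state
```

The last three lines show the pointer-like behaviour: copying the list does not copy the environment, so `c1` and `c2` share one counter.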

Lecture 79 
Program Efficiencies and Scoping Rules

11:45  
Lecture 80  04:20  
An environment, in R, can be thought of as a list of variables and their values. I'm not sure if this is how it is achieved in practice, but it helps me to think of it as a lookup table. Which environment R uses depends on context: if you are typing into the R command line, the environment used is called the global environment. When a function is called, a new environment especially for this function is created automatically, and is destroyed on leaving the function. This is the default environment for any variables created during the execution of the function. Finally, it is worth noting that a particular variable name can appear in more than one environment, so if R tries to find the value of a variable whose name appears in more than one environment, the rules governing which environment R searches determine the value that will be found. 
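
A small sketch of these lookup rules, with invented function names: the same variable name `x` exists in more than one environment, and which value is found depends on where R looks first.

```r
x <- "outer"

show_x <- function() {
  x <- "local"   # created in the environment of this function call
  x              # the local value is found first
}

read_outer <- function() {
  x              # no local 'x', so R searches the enclosing environment
}

inner_value <- show_x()       # "local"
outer_value <- read_outer()   # "outer"
# The outer 'x' is unchanged after show_x() returns.

# Environments can also be created and used explicitly as lookup tables.
e <- new.env()
assign("x", "in e", envir = e)
e_value <- get("x", envir = e)   # "in e"
```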

Lecture 81  07:22  
First, everything in R is treated as an object. We have seen this with functions. Many of the objects that are created within an R session have attributes associated with them. One common attribute associated with an object is its class. You can set the class attribute using the class command. One thing to notice is that the class is a vector, which allows an object to inherit from multiple classes and lets you specify the order of inheritance for complex classes. You can also use the class command to determine the classes associated with an object. 
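
A quick sketch of reading and setting the class attribute follows. The class names "dog" and "animal" are made up for the example.

```r
v <- 1:3
v_class <- class(v)                 # "integer"

obj <- list(name = "Rex")
class(obj) <- c("dog", "animal")    # a vector: order gives inheritance order

obj_class <- class(obj)                   # "dog" "animal"
is_animal <- inherits(obj, "animal")      # TRUE
is_cat    <- inherits(obj, "cat")         # FALSE
```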

Lecture 82  06:15  
Here we look at two different ways to construct an S3 class. The first approach is more commonly used and is more straightforward. It makes use of basic list properties. The second approach makes use of the local environment within a function to define the variables tracked by the class. The advantage of the second approach is that it looks more like the object-oriented approach that many are familiar with. The disadvantage is that the code is more difficult to read, and it is more like working with pointers, which is different from the way other objects work in R. 
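
A minimal sketch of the first, list-based approach is given here. The `account` class, its constructor `new_account` and the `deposit` generic are hypothetical names chosen for the illustration.

```r
# Constructor: a plain list with a class attribute attached.
new_account <- function(owner, balance = 0) {
  obj <- list(owner = owner, balance = balance)
  class(obj) <- "account"
  obj
}

# Generic function plus a method for the "account" class.
deposit <- function(x, amount, ...) UseMethod("deposit")
deposit.account <- function(x, amount, ...) {
  x$balance <- x$balance + amount
  x   # return the modified copy
}

a <- new_account("Ann", 100)
b <- deposit(a, 50)

a$balance   # 100: the original object is unchanged
b$balance   # 150
```

Because the object is an ordinary list, the method works on a copy and the caller must keep the returned value. That is the usual R behaviour the environment-based approach departs from.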

Lecture 83  06:26  
The S4 approach differs from the S3 approach to creating a class in that it is a more rigid definition. The idea is that a class is defined using the setClass command, and objects are then created from that definition. The command takes a number of options. Many of the options are not required, but we make use of several of the optional arguments because they represent good practices with respect to object-oriented programming. 
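
A hedged sketch of an S4 definition follows. The `Person` class and its slots are invented for illustration; `prototype` and `validity` are two of the optional `setClass` arguments, though the course may use a different selection.

```r
library(methods)

setClass(
  "Person",
  slots = c(name = "character", age = "numeric"),
  # Optional: default slot values.
  prototype = list(name = NA_character_, age = NA_real_),
  # Optional: a check run when an object is created with slot values.
  validity = function(object) {
    if (length(object@age) != 1) return("age must have length 1")
    if (!is.na(object@age) && object@age < 0) return("age must be non-negative")
    TRUE
  }
)

p <- new("Person", name = "Ann", age = 30)
is_s4 <- isS4(p)    # TRUE
age   <- p@age      # slots are accessed with @

# An invalid object is rejected at creation time.
rejected <- tryCatch(
  { new("Person", name = "Bob", age = -1); FALSE },
  error = function(e) TRUE
)
```

The rigidity shows up in two places: slot types are fixed by the class definition, and the validity function stops an inconsistent object from being created.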

Lecture 84 
Numerical Accuracy and Program Efficiency (part 1)

07:42  
Lecture 85 
Numerical Accuracy and Program Efficiency (part 2)

10:51  
Lecture 86 
More on Program Efficiency (part 1)

06:20  
Lecture 87 
More on Program Efficiency (part 2)

06:42  
Lecture 88 
Selection Sort Exercise

03:58 
Dr. Geoffrey Hubona held full-time tenure-track, and tenured, assistant and associate professor faculty positions at 3 major state universities in the Eastern United States from 1993 to 2010. In these positions, he taught dozens of statistics, business information systems, and computer science courses to undergraduate, master's and Ph.D. students. He earned a Ph.D. in Business Administration (Information Systems and Computer Science) from the University of South Florida (USF) in Tampa, FL (1993); an MA in Economics (1990), also from USF; an MBA in Finance (1979) from George Mason University in Fairfax, VA; and a BA in Psychology (1972) from the University of Virginia in Charlottesville, VA. He was a full-time assistant professor at the University of Maryland Baltimore County (1993-1996) in Catonsville, MD; a tenured associate professor in the department of Information Systems in the Business College at Virginia Commonwealth University (1996-2001) in Richmond, VA; and an associate professor in the CIS department of the Robinson College of Business at Georgia State University (2001-2010). He is the founder of the Georgia R School (2010-2014) and of RCourseware (2014-Present), online educational organizations that teach research methods and quantitative analysis techniques. These techniques include linear and nonlinear modeling, multivariate methods, data mining, programming and simulation, and structural equation modeling and partial least squares (PLS) path modeling. Dr. Hubona is an expert in the analytical, open-source R software suite and in various PLS path modeling software packages, including SmartPLS. He has published dozens of research articles that explain and use these techniques for the analysis of data and, with software co-development partner Dean Lim, has created a popular cloud-based PLS software application, PLSGUI.