Comprehensive Linear Modeling with R provides a wide overview of numerous contemporary linear and nonlinear modeling approaches for the analysis of research data. These include basic, conditional, and simultaneous inference techniques; analysis of variance (ANOVA); linear regression; survival analysis; generalized linear models (GLMs); parametric and nonparametric smoothers and generalized additive models (GAMs); and longitudinal, mixed-effects, split-plot, and other nested model designs. The course showcases the use of R Commander in performing these tasks. R Commander is a popular GUI-based "front end" to the broad range of statistical functionality embedded in R. It is an SPSS-like GUI that enables a large variety of statistical and graphical techniques to be run from both menus and scripts. Please note that the R Commander GUI described here is written with RGtk2, an R binding to the GTK+ toolkit, which is known to have problems running on Mac computers.
The course progresses through dozens of statistical techniques, first explaining the concepts and then demonstrating each with concrete examples based on actual studies and research data. Beginning with a quick overview of graphical plotting techniques, the course reviews basic approaches to inference and conditional inference, followed by analysis of variance (ANOVA). It then progresses through linear regression and a section on validating linear models. Generalized linear modeling (GLM) is explained and demonstrated with numerous examples, as are linear and nonlinear models for survival analysis; smoothers and generalized additive models (GAMs); longitudinal models with and without generalized estimating equations (GEE); and mixed-effects, split-plot, and nested designs. Also included are detailed examples of validating linear models using various graphical displays and of comparing alternative models to choose the 'best' model. The course concludes with a section on the special considerations and techniques for establishing simultaneous inference in the linear modeling domain.
This rather long course aims for complete coverage of linear (and some nonlinear) modeling approaches using R and is suitable for beginning, intermediate, and advanced R users who seek to refine these skills. These candidates include graduate students and quantitative or data-analytic professionals who perform linear (and nonlinear) modeling as part of their professional duties.
Section 1: Data Analysis with R Commander Graphical Displays  

Lecture 1 
Introduction to Course
Preview

01:45  
Lecture 2 
Notes About: (1) R and (2) R Commander and (3) Materials
Preview

10:24  
Lecture 3 
Don't Overlook Sectional Exercises!
Preview

02:12  
Lecture 4 
Materials and Agenda Topics
Preview

11:01  
Lecture 5 
Graphical Displays using R Commander (part 1)
Preview

09:27  
Lecture 6 
Graphical Displays using Rcmdr (part 2)

07:30  
Lecture 7 
Graphical Displays using Rcmdr (part 3)

08:59  
Lecture 8 
Graphical Displays using Rcmdr (part 4)

08:07  
Lecture 9 
Graphical Displays using Rcmdr (part 5)

10:54  
Lecture 10 
Graphical Displays using Rcmdr (part 6)

07:16  
Lecture 11 
Graphical Displays using Rcmdr (part 7)

06:48  
Lecture 12 
Graphical Displays using Rcmdr (part 8)

08:49  
Section 2: Simple and Conditional Inference  
Lecture 13  07:55  
Statistical inference is the process of deducing properties of an underlying distribution by analysis of data. Inferential statistical analysis infers properties about a population: this includes testing hypotheses and deriving estimates. The population is assumed to be larger than the observed data set; in other words, the observed data is assumed to be sampled from a larger population. 
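The course carries out such tests through the R Commander menus. As a rough, language-neutral illustration of the computation behind one classic inferential procedure, here is a minimal sketch of Welch's two-sample t-test in plain Python (the function name and sample data are hypothetical, not taken from the course materials):

```python
import math
import statistics

def welch_t_test(a, b):
    """Welch's two-sample t statistic and approximate degrees of freedom.

    Infers whether two samples plausibly come from populations with the
    same mean -- the simplest kind of inferential hypothesis test.
    """
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    se2 = va / na + vb / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical samples: a large |t| suggests the population means differ.
t, df = welch_t_test([20.1, 19.8, 20.5, 20.3], [18.9, 19.2, 18.7, 19.0])
```

The t statistic would ordinarily be referred to a t distribution on df degrees of freedom to obtain a p-value; Rcmdr performs that step automatically.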

Lecture 14 
Inference about Roomwidth using Rcmdr
Preview

11:44  
Lecture 15 
Roomwidth Inference Continued

09:37  
Lecture 16 
Simple Inference: Waves Data

09:46  
Lecture 17 
Simple Inference: Waves Non-Parametric

10:14  
Lecture 18 
Simple Inference: Piston Rings

12:30  
Lecture 19 
Conditional Inference: Roomwidths Revisited

08:30  
Lecture 20 
Conditional Inference: Roomwidths Continued

08:32  
Lecture 21 
Conditional Inference: Gastrointestinal Damage

07:40  
Lecture 22 
Conditional Inference: Birth Defects

05:40  
Lecture 23 
Inference Exercises

01:33  
Lecture 24 
Inference Exercise Answers (part 1)

08:51  
Lecture 25 
Inference Exercise Answers (part 2)

06:37  
Section 3: Analysis of Variance (ANOVA)  
Lecture 26 
Partial Exercise Solution (part 1)

07:29  
Lecture 27 
Partial Exercise Solution (part 2)

08:55  
Lecture 28  08:57  
Analysis of variance (ANOVA) is a collection of statistical models used to analyze the differences among group means and their associated procedures (such as "variation" among and between groups). In the ANOVA setting, the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are equal, and therefore generalizes the t-test to more than two groups. 
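The course performs ANOVA through the Rcmdr menus; purely as a sketch of the arithmetic behind the ANOVA table, here is the one-way F statistic computed by hand in Python (function name and data are hypothetical):

```python
import statistics

def one_way_anova_F(groups):
    """F statistic for a one-way ANOVA: the between-group mean square
    divided by the within-group mean square. A large F suggests the
    group means are not all equal."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2
                     for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g)
                    for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical weight gains under three diets
F = one_way_anova_F([[10, 12, 11], [14, 15, 13], [9, 8, 10]])
```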

Lecture 29 
Weight Gain in Rats (Rcmdr)
Preview

08:47  
Lecture 30 
Finish Weight Gain then Foster Feeding in Rats

11:25  
Lecture 31 
Water Hardness Revisited

07:15  
Lecture 32 
Male Egyptian Skulls (part 1)

06:49  
Lecture 33 
Male Egyptian Skulls (part 2)

08:02  
Lecture 34 
More Exercises

00:28  
Section 4: Linear Modeling  
Lecture 35  07:39  
In statistics, linear regression is an approach for modeling the relationship between a scalar dependent variable y and one or more explanatory variables (or independent variables) denoted X. The case of one explanatory variable is called simple linear regression. For more than one explanatory variable, the process is called multiple linear regression. 
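For the simple (one-predictor) case, the least-squares estimates have a closed form; a minimal Python sketch, with hypothetical data (the course itself fits these models via Rcmdr's linear model dialogs):

```python
def ols_fit(x, y):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx          # slope: covariance over variance of x
    a = ybar - b * xbar    # intercept: line passes through the means
    return a, b

# Hypothetical data lying near the line y = 2 + 3x
a, b = ols_fit([0, 1, 2, 3, 4], [2.1, 5.0, 7.9, 11.2, 13.8])
```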

Lecture 36 
Estimating the Age of the Universe (slides and script, part 1)
Preview

06:15  
Lecture 37 
Estimating the Age of the Universe (script, part 2)
Preview

07:54  
Lecture 38 
Age of the Universe (script, part 3)

06:48  
Lecture 39 
Cloud Seeding (slides and script, part 1)

11:43  
Lecture 40 
Cloud Seeding (script, part 2)

09:07  
Lecture 41 
Cloud Seeding (script, part 3)

07:44  
Lecture 42 
Cloud Seeding Diagnostic Plots (part 4)

05:50  
Section 5: Validating Linear Models (aka 'Model Checking')  
Lecture 43 
Model Checking (part 1)

06:37  
Lecture 44 
Model Checking (part 2)
Preview

07:46  
Lecture 45 
Model Checking (part 3)

07:34  
Lecture 46 
Model Checking (part 4)

07:23  
Lecture 47 
Model Checking (part 5)

08:01  
Lecture 48 
Model Checking (part 6)

06:49  
Section 6: Generalized Linear Models (GLMs)  
Lecture 49  09:20  
In statistics, the generalized linear model (GLM) is a flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to be a function of its predicted value. 
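The best-known GLM is logistic regression: a binomial error distribution with a logit link. As a rough sketch of how such a model can be fit by maximizing the log-likelihood (the course uses R's glm() machinery via Rcmdr; this Python version, with hypothetical data, uses plain gradient ascent instead of R's iteratively reweighted least squares):

```python
import math

def logistic_glm(x, y, steps=5000, lr=0.1):
    """Fit a one-predictor binomial GLM with a logit link (logistic
    regression) by gradient ascent on the log-likelihood."""
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        ga = gb = 0.0
        for xi, yi in zip(x, y):
            # Inverse link: probability from the linear predictor
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            ga += (yi - p)        # d logLik / d intercept
            gb += (yi - p) * xi   # d logLik / d slope
        a += lr * ga / n
        b += lr * gb / n
    return a, b

# Hypothetical binary outcomes that become more likely as x grows
a, b = logistic_glm([0, 1, 2, 3, 4, 5], [0, 0, 1, 0, 1, 1])
```

A positive fitted slope b means the modeled probability of the outcome rises with x.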

Lecture 50 
ESR and Plasma Proteins (part 1)
Preview

11:55  
Lecture 51 
ESR and Plasma Proteins (part 2)

11:50  
Lecture 52 
ESR and Plasma Proteins (part 3)

12:14  
Lecture 53 
Women's Role in Society (part 1)

08:44  
Lecture 54 
Women's Role in Society (part 2)

07:34  
Lecture 55 
Women's Role in Society (part 3)

06:36  
Lecture 56 
Colonic Polyps

06:57  
Lecture 57 
Driving and Back Pain

08:34  
Section 7: Survival Analysis  
Lecture 58  12:06  
Survival analysis is a branch of statistics for analyzing the expected duration of time until one or more events happen, such as death in biological organisms and failure in mechanical systems. This topic is called reliability theory or reliability analysis in engineering, duration analysis or duration modeling in economics, and event history analysis in sociology. Survival analysis attempts to answer questions such as: What proportion of a population will survive past a certain time? Of those that survive, at what rate will they die or fail? Can multiple causes of death or failure be taken into account? How do particular circumstances or characteristics increase or decrease the probability of survival? 
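The workhorse nonparametric estimate of "what proportion survives past time t" is the Kaplan-Meier curve. A minimal Python sketch with hypothetical follow-up data (the course computes such curves in R):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve. At each observed event time, the
    running survival probability is multiplied by (1 - deaths/at-risk).
    A censored observation (event=0) does not change the curve but
    removes that subject from the at-risk count afterwards."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = at_t = 0
        while i < len(data) and data[i][0] == t:  # group ties at time t
            at_t += 1
            deaths += data[i][1]
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= at_t
    return curve

# Hypothetical follow-up times; event=1 means death, event=0 censored
curve = kaplan_meier([2, 3, 3, 5, 7], [1, 1, 0, 1, 0])
```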

Lecture 59 
Glioma Radioimmunotherapy
Preview

08:53  
Lecture 60 
Breast Cancer Survival

11:08  
Section 8: Smoothers and Generalized Additive Modeling (GAMs)  
Lecture 61  08:50  
A smoother is a statistical technique for estimating a real-valued function from its noisy observations when no parametric model for the function is known. The estimated function is smooth, and the level of smoothness is set by a single parameter. 
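The simplest possible smoother is a running mean; a short Python sketch with hypothetical data (the course works with R's more sophisticated smoothers, but the idea is the same):

```python
def running_mean_smoother(y, window=3):
    """Replace each noisy observation by the mean of its neighbors
    inside a window. The window width is the single smoothness
    parameter: wider windows give smoother estimates."""
    half = window // 2
    out = []
    for i in range(len(y)):
        # Clip the window at the boundaries of the series
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out

smooth = running_mean_smoother([1.0, 3.0, 2.0, 4.0, 3.0], window=3)
```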

Lecture 62  05:03  
In statistics, a generalized additive model (GAM) is a generalized linear model in which the linear predictor depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions. 
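GAMs are classically fit by "backfitting": each smooth function is repeatedly re-estimated by smoothing the partial residuals against its own predictor. A compact Python sketch under strong simplifying assumptions (a crude nearest-neighbor smoother, two predictors, hypothetical data; R's mgcv package, used later in this section, instead fits penalized regression splines):

```python
def knn_smooth(x, r, k=3):
    # Crude smoother: average r over the k points whose x values
    # are closest to each target point.
    out = []
    for xi in x:
        idx = sorted(range(len(x)), key=lambda j: abs(x[j] - xi))[:k]
        out.append(sum(r[j] for j in idx) / k)
    return out

def backfit_gam(x1, x2, y, iters=25):
    """Backfitting sketch for y ~ mean(y) + f1(x1) + f2(x2): smooth the
    partial residuals against each predictor in turn, centering each
    smooth term so the intercept stays identifiable."""
    n = len(y)
    mu = sum(y) / n
    f1, f2 = [0.0] * n, [0.0] * n
    for _ in range(iters):
        f1 = knn_smooth(x1, [y[i] - mu - f2[i] for i in range(n)])
        m1 = sum(f1) / n
        f1 = [v - m1 for v in f1]   # center f1 at zero
        f2 = knn_smooth(x2, [y[i] - mu - f1[i] for i in range(n)])
        m2 = sum(f2) / n
        f2 = [v - m2 for v in f2]   # center f2 at zero
    return mu, f1, f2

# Hypothetical additive data: linear in x1 plus a curved effect of x2
x1 = [0, 1, 2, 3, 4, 5, 6, 7]
x2 = [3, 1, 4, 0, 2, 6, 5, 7]
y = [x1[i] + 0.5 * (x2[i] - 3) ** 2 for i in range(8)]
mu, f1, f2 = backfit_gam(x1, x2, y)
fit = [mu + f1[i] + f2[i] for i in range(8)]
```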

Lecture 63 
Air Pollution in U.S. Cities
Preview

11:46  
Lecture 64  06:16  
Kyphosis (from Greek κυφός kyphos, a hump) refers to the abnormally excessive convex kyphotic curvature of the spine as it occurs in the thoracic and sacral regions. (Inward concave curving of the cervical and lumbar regions of the spine is called lordosis.) Kyphosis can be called roundback or Kelso's hunchback. It can result from degenerative diseases such as arthritis; developmental problems, most commonlyScheuermann's disease; osteoporosis with compression fractures of the vertebra; Multiple myeloma or trauma. 

Lecture 65 
Kyphosis (part 2)

09:16  
Lecture 66  07:05  
LOESS and LOWESS (locally weighted scatterplot smoothing) are two strongly related nonparametric regression methods that combine multiple regression models in a k-nearest-neighbor-based meta-model. "LOESS" is a later generalization of LOWESS; although it is not a true initialism, it may be understood as standing for "LOcal regrESSion". LOESS and LOWESS thus build on "classical" methods, such as linear and nonlinear least squares regression. 
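The core LOWESS step fits a weighted least-squares line at each target point, with tricube weights that fade out over the nearest span*n neighbors. A Python sketch of that single step at one point (function name and data hypothetical; no robustness iterations, which full LOWESS adds):

```python
def lowess_at(x0, x, y, span=0.5):
    """Locally weighted linear fit evaluated at x0, using tricube
    weights on the nearest span*n points (one LOWESS step)."""
    n = len(x)
    k = max(2, int(span * n))
    dists = sorted(abs(xi - x0) for xi in x)
    h = dists[k - 1] or 1e-12   # bandwidth = k-th nearest distance
    # Tricube weights: points beyond the bandwidth get weight zero
    w = [(1 - min(1.0, abs(xi - x0) / h) ** 3) ** 3 for xi in x]
    sw = sum(w)
    xb = sum(wi * xi for wi, xi in zip(w, x)) / sw
    yb = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xb) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xb) * (yi - yb)
              for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx if sxx else 0.0
    return yb + b * (x0 - xb)   # local line evaluated at x0

# On exactly linear data y = 2x + 1, the local fit recovers the line
est = lowess_at(4.5, list(range(10)), [2 * i + 1 for i in range(10)])
```

Sweeping x0 over a grid of points traces out the full smooth curve.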

Lecture 67 
Lowess Smoothers (part 2)

08:34  
Lecture 68 
Lowess Smoothers (part 3)

07:37  
Lecture 69 
GAM with Binary Isolation Data

09:52  
Lecture 70 
GAM Examples using mgcv Package (part 1)

07:07  
Lecture 71 
GAM Examples using mgcv Package (part 2)

09:13  
Lecture 72 
GAM Examples using mgcv Package (part 3)

06:56  
Lecture 73 
Strongly Humped Data (part 1)

07:06  
Lecture 74 
Strongly Humped Data (part 2)

08:47  
Section 9: Linear Mixed-Effects Models  
Lecture 75  08:12  
A mixed model is a statistical model containing both fixed effects and random effects. These models are useful in a wide variety of disciplines in the physical, biological and social sciences. They are particularly useful in settings where repeated measurements are made on the same statistical units (longitudinal study), or where measurements are made on clusters of related statistical units. 
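Concretely, the linear mixed-effects model is often written in the following textbook form (this notation is a standard summary, not taken from the course slides):

```latex
y = X\beta + Zu + \varepsilon, \qquad u \sim N(0, G), \qquad \varepsilon \sim N(0, R)
```

where y is the response vector, X and beta are the fixed-effects design matrix and coefficients, Z and u are the random-effects design matrix and coefficients with covariance G, and epsilon is the residual error with covariance R.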

Lecture 76 
Linear Mixed-Effects Models (slides, part 2)

07:55  
Lecture 77 
Beat the Blues Slides and Data
Preview

09:01  
Lecture 78 
Beat the Blues Study (part 2)

07:08  
Lecture 79  07:10  
In descriptive statistics, a box plot or boxplot is a convenient way of graphically depicting groups of numerical data through their quartiles. Box plots may also have lines extending vertically from the boxes (whiskers) indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-and-whisker diagram. Outliers may be plotted as individual points. Box plots are nonparametric: they display variation in samples of a statistical population without making any assumptions about the underlying statistical distribution. 
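The numbers behind a box plot are easy to compute directly; a Python sketch using the common 1.5*IQR whisker rule (hypothetical data; R's boxplot() applies the same convention by default):

```python
import statistics

def boxplot_stats(data):
    """Five-number summary behind a box plot: the quartiles form the
    box, the whiskers extend to the most extreme points within
    1.5*IQR of the box, and anything beyond is flagged an outlier."""
    q1, med, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in data if lo_fence <= x <= hi_fence]
    outliers = [x for x in data if x < lo_fence or x > hi_fence]
    return {"q1": q1, "median": med, "q3": q3,
            "whisker_lo": min(inside), "whisker_hi": max(inside),
            "outliers": outliers}

summary = boxplot_stats([1, 2, 2, 3, 3, 3, 4, 4, 5, 12])
```

Note that statistics.quantiles uses the "exclusive" quartile method by default, one of several conventions; R's boxplot may place the box edges slightly differently.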

Lecture 80 
Run Beat the Blues Models (part 1)

05:33  
Lecture 81 
Run Beat the Blues Models (part 2)

07:24  
Section 10: Generalized Estimating Equations (GEE)  
Lecture 82  10:02  
In statistics, a generalized estimating equation (GEE) is used to estimate the parameters of a generalized linear model with a possible unknown correlation between outcomes. Parameter estimates from the GEE are consistent even when the covariance structure is misspecified, under mild regularity conditions. The focus of the GEE is on estimating the average response over the population ("population-averaged" effects) rather than the regression parameters that would enable prediction of the effect of changing one or more covariates on a given individual. GEEs are usually used in conjunction with Huber-White standard error estimates, also known as "robust standard error" or "sandwich variance" estimates. In the case of a linear model with a working independence variance structure, these are known as "heteroscedasticity-consistent standard error" estimators. Indeed, the GEE unified several independent formulations of these standard error estimators in a general framework. 
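The simplest member of that sandwich-estimator family is the HC0 robust standard error for an ordinary regression slope; a Python sketch for the one-predictor case with hypothetical data (full GEE additionally sums over within-cluster correlation, which R's gee/geepack machinery handles):

```python
import math

def slope_with_robust_se(x, y):
    """OLS slope plus its HC0 'sandwich' (robust) standard error,
    which remains consistent when the error variance differs across
    observations -- the same idea GEE applies to clustered data."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Sandwich: squared residuals weighted by leverage, over sxx^2
    var_b = sum(((xi - xbar) ** 2) * (ei ** 2)
                for xi, ei in zip(x, resid)) / sxx ** 2
    return b, math.sqrt(var_b)

# Hypothetical data lying near the line y = 1 + 2x
b, se = slope_with_robust_se([0, 1, 2, 3, 4], [1.0, 2.9, 5.2, 6.8, 9.1])
```

Unlike the model-based standard error, this estimate does not assume constant error variance, at the cost of some efficiency when that assumption actually holds.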

Lecture 83 
Generalized Estimating Equations (GEE) (slides, part 2)

06:54  
Lecture 84 
GEE with Beat the Blues as Binomial GLM (part 1)
Preview

07:08  
Lecture 85 
GEE with Beat the Blues as Binomial GLM (part 2)

08:05  
Lecture 86 
Respiratory Illness with Binary Response Variable (part 1)

06:10  
Lecture 87 
Respiratory Illness with Binary Response Variable (part 2)

08:34  
Lecture 88 
Respiratory Illness with Binary Response Variable (part 3)

08:38  
Lecture 89 
Respiratory Illness with Binary Response Variable (part 4)

10:06  
Section 11: Split-Plot and Nested Designs 
Dr. Geoffrey Hubona held full-time tenure-track, and tenured, assistant and associate professor faculty positions at three major state universities in the eastern United States from 1993 to 2010. In these positions, he taught dozens of statistics, business information systems, and computer science courses to undergraduate, master's, and Ph.D. students. He earned a Ph.D. in Business Administration (Information Systems and Computer Science) from the University of South Florida (USF) in Tampa, FL (1993); an MA in Economics (1990), also from USF; an MBA in Finance (1979) from George Mason University in Fairfax, VA; and a BA in Psychology (1972) from the University of Virginia in Charlottesville, VA.

He was a full-time assistant professor at the University of Maryland Baltimore County (1993-1996) in Catonsville, MD; a tenured associate professor in the department of Information Systems in the Business College at Virginia Commonwealth University (1996-2001) in Richmond, VA; and an associate professor in the CIS department of the Robinson College of Business at Georgia State University (2001-2010). He is the founder of the Georgia R School (2010-2014) and of R-Courseware (2014-present), online educational organizations that teach research methods and quantitative analysis techniques, including linear and nonlinear modeling, multivariate methods, data mining, programming and simulation, and structural equation modeling and partial least squares (PLS) path modeling.

Dr. Hubona is an expert in the analytical, open-source R software suite and in various PLS path modeling software packages, including SmartPLS. He has published dozens of research articles that explain and use these techniques for the analysis of data and, with software co-development partner Dean Lim, has created a popular cloud-based PLS software application, PLS-GUI.