**Comprehensive Linear Modeling with R**

Learn to model with R: ANOVA, regression, GLMs, survival analysis, GAMs, mixed-effects, split-plot and nested designs

953 students enrolled

- 14.5 hours on-demand video
- 20 Supplemental Resources
- Full lifetime access
- Access on mobile and TV

- Certificate of Completion

What Will I Learn?

- Understand, apply, estimate, interpret and validate ANOVA; regression; survival analysis; GLMs; smoothers and GAMs; and longitudinal, mixed-effects, split-plot and nested model designs, using your own data and R software.
- Achieve proficiency using the popular no-cost and versatile R Commander GUI as an interface to the broad statistical and graphical capabilities in R.
- Know and use tests for simple, conditional, and simultaneous inference.
- Apply various graphs and plots to validate linear models.
- Be able to compare and choose the 'best' among multiple competing models.

Requirements

- Students will need to install R and R Commander; ample video and written instructions are provided for doing so.

Description

**Comprehensive Linear Modeling with R** provides a wide overview of numerous contemporary linear and non-linear modeling approaches for the analysis of research data. These include basic, conditional and simultaneous inference techniques; analysis of variance (ANOVA); linear regression; survival analysis; generalized linear models (GLMs); parametric and non-parametric smoothers and generalized additive models (GAMs); longitudinal and mixed-effects, split-plot and other nested model designs. The course showcases the use of R Commander in performing these tasks. R Commander is a popular, 'SPSS-like' GUI "front-end" to the broad range of statistical functionality in R, and it supports a large variety of statistical and graphical techniques through both menus and scripts. Please note that the R Commander GUI is built on Tcl/Tk (through R's tcltk package), which is known to have problems running on a Mac computer.

The course progresses through dozens of statistical techniques by first explaining the concepts and then demonstrating each with concrete examples based on actual studies and research data. Beginning with a quick overview of graphical plotting techniques, the course reviews basic approaches to simple and conditional inference, followed by a review of analysis of variance (ANOVA). It then progresses through linear regression and a section on validating linear models using various graphical displays, including how to compare alternative models and choose the 'best' one. Generalized linear modeling (GLM) is then explained and demonstrated with numerous examples. Further sections explain and demonstrate linear and non-linear models for survival analysis, smoothers and generalized additive models (GAMs), longitudinal models with and without generalized estimating equations (GEE), and mixed-effects, split-plot, and nested designs. The course concludes with a section on the special considerations and techniques for establishing simultaneous inference in the linear modeling domain.

This rather long course aims for complete coverage of linear (and some non-linear) modeling approaches using R and is suitable for beginning, intermediate and advanced R users who seek to refine these skills. Candidates include graduate students and quantitative or data-analytic professionals who perform linear (and non-linear) modeling as part of their professional duties.

Who is the target audience?

- This course is aimed at graduate students and working quantitative and data-analytic professionals who seek to acquire a wide range of linear (and non-linear) modeling skills using R.
- People who only have a Mac computer available should know that the R Commander interface is built on Tcl/Tk (through R's tcltk package), which is known to be problematic on a Mac computer.

Curriculum For This Course

104 Lectures
14:16:50

Data Analysis with R Commander Graphical Displays
12 Lectures
01:33:12

Graphical Displays using Rcmdr (part 2)

07:30

Graphical Displays using Rcmdr (part 3)

08:59

Graphical Displays using Rcmdr (part 4)

08:07

Graphical Displays using Rcmdr (part 5)

10:54

Graphical Displays using Rcmdr (part 6)

07:16

Graphical Displays using Rcmdr (part 7)

06:48

Graphical Displays using Rcmdr (part 8)

08:49

Simple and Conditional Inference
13 Lectures
01:49:09

**Statistical inference** is the process of deducing properties of an underlying distribution by analysis of data. Inferential statistical analysis infers properties about a population: this includes testing hypotheses and deriving estimates. The population is assumed to be larger than the observed data set; in other words, the observed data is assumed to be sampled from a larger population.

Preview
07:55

Roomwidth Inference Continued

09:37

Simple Inference: Waves Data

09:46

Simple Inference: Waves Non-Parametric

10:14

Simple Inference: Piston Rings

12:30

Conditional Inference: Roomwidths Revisited

08:30

Conditional Inference: Roomwidths Continued

08:32

Conditional Inference: Gastrointestinal Damage

07:40

Conditional Inference: Birth Defects

05:40

Inference Exercises

01:33

Inference Exercise Answers (part 1)

08:51

Inference Exercise Answers (part 2)

06:37

Analysis of Variance (ANOVA)
9 Lectures
01:08:07

Partial Exercise Solution (part 1)

07:29

Partial Exercise Solution (part 2)

08:55

**Analysis of variance** (**ANOVA**) is a collection of statistical models, and their associated procedures, used to analyze the differences among group means (such as the "variation" among and between groups). In the ANOVA setting, the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are equal, and therefore generalizes the *t*-test to more than two groups.
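
The partitioning of variance described above can be sketched in a few lines. The course itself works in R; this sketch uses only Python's standard library for illustration, and the function name `one_way_anova_f` and the sample data are hypothetical, not taken from the course.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per group.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up data: three groups of three observations each.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)
```

A large F statistic relative to the F distribution with `df_between` and `df_within` degrees of freedom is evidence against equal group means. Computing the p-value needs the F distribution, which is outside the standard library and is left out here.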

Preview
08:57

Finish Weight Gain then Foster Feeding in Rats

11:25

Water Hardness Revisited

07:15

Male Egyptian Skulls (part 1)

06:49

Male Egyptian Skulls (part 2)

08:02

More Exercises

00:28

Linear Modeling
8 Lectures
01:03:00

In statistics, **linear regression** is an approach for modeling the relationship between a scalar dependent variable *y* and one or more explanatory variables (or independent variables) denoted *X*. The case of one explanatory variable is called *simple linear regression*. For more than one explanatory variable, the process is called *multiple linear regression*.
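
As a minimal sketch of the simple (one-predictor) case, the least-squares intercept and slope can be computed directly from sums of squares. The course uses R for this; the Python below is only an illustration, and `fit_simple_linear_regression` and the data are hypothetical.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sum of squares of x and cross-product of x and y about their means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data lying exactly on the line y = 1 + 2x.
intercept, slope = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(intercept, slope)
```

Multiple linear regression replaces the two sums with matrix operations, which is why a statistical package is normally used in practice.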

What is Linear Modeling? (slides)

07:39

Age of the Universe (script, part 3)

06:48

Cloud Seeding (slides and script, part 1)

11:43

Cloud Seeding (script, part 2)

09:07

Cloud Seeding (script, part 3)

07:44

Cloud Seeding Diagnostic Plots (part 4)

05:50

Validating Linear Models (aka 'Model Checking')
6 Lectures
44:10

Model Checking (part 1)

06:37

Model Checking (part 3)

07:34

Model Checking (part 4)

07:23

Model Checking (part 5)

08:01

Model Checking (part 6)

06:49

Generalized Linear Modeling (GLMs)
9 Lectures
01:23:44

In statistics, the **generalized linear model** (**GLM**) is a flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a *link function* and by allowing the magnitude of the variance of each measurement to be a function of its predicted value.
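
To make the two ingredients concrete, the sketch below shows a logit link and the binomial variance function, as used in logistic regression. It is written in Python purely for illustration (the course uses R). The coefficients are made-up values, not fitted estimates, and all function names are hypothetical.

```python
import math


def logit(mu):
    """Link function: maps a mean in (0, 1) to the linear-predictor scale."""
    return math.log(mu / (1 - mu))


def inv_logit(eta):
    """Inverse link: maps a linear predictor back to a mean in (0, 1)."""
    if eta >= 0:
        return 1 / (1 + math.exp(-eta))
    z = math.exp(eta)  # avoids overflow for very negative eta
    return z / (1 + z)


def binomial_variance(mu):
    """Variance function: the variance depends on the predicted mean."""
    return mu * (1 - mu)


def predicted_probability(intercept, slope, x):
    """Mean response for one predictor: inverse link of the linear model."""
    return inv_logit(intercept + slope * x)


# Hypothetical coefficients, chosen only to show the mechanics.
p = predicted_probability(-1.0, 0.5, 2.0)
print(p, binomial_variance(p))
```

Here the linear predictor is `-1.0 + 0.5 * 2.0 = 0`, so the predicted probability is 0.5, where the binomial variance is at its largest.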

Preview
09:20

ESR and Plasma Proteins (part 2)

11:50

ESR and Plasma Proteins (part 3)

12:14

Women's Role in Society (part 1)

08:44

Women's Role in Society (part 2)

07:34

Women's Role in Society (part 3)

06:36

Colonic Polyps

06:57

Driving and Back Pain

08:34

Survival Analysis
3 Lectures
32:07

**Survival analysis** is a branch of statistics for analyzing the expected duration of time until one or more events happen, such as death in biological organisms and failure in mechanical systems. This topic is called **reliability theory** or **reliability analysis** in engineering, **duration analysis** or **duration modelling** in economics, and **event history analysis** in sociology. Survival analysis attempts to answer questions such as: what is the proportion of a population which will survive past a certain time? Of those that survive, at what rate will they die or fail? Can multiple causes of death or failure be taken into account? How do particular circumstances or characteristics increase or decrease the probability of survival?
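
The first question, the proportion surviving past a given time, is commonly answered with the Kaplan-Meier estimator. The listing above does not say which estimator the lectures use, so treat this as an assumption. The sketch is in Python for illustration only (the course uses R), with a hypothetical function name and made-up data.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times:  observed durations
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group every subject that shares this observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set.
        at_risk -= removed
    return curve


# Made-up data: five subjects, two of them censored.
for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
    print(t, round(s, 3))
```

Censored subjects never lower the curve directly. They only shrink the risk set for later event times.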

What is Survival Analysis? (slides)

12:06

Breast Cancer Survival

11:08

Smoothers and Generalized Additive Modeling (GAMs)
14 Lectures
01:53:28

A **smoother** is a statistical technique for estimating a real-valued function by using its noisy observations, when no parametric model for this function is known. The estimated function is smooth, or non-linear, and the level of smoothness is set by a single parameter.

Preview
08:50

In statistics, a **generalized additive model (GAM)** is a generalized linear model in which the linear predictor depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions.

Smoothers and GAMs (slides, part 2)

05:03

**Kyphosis** (from Greek κυφός *kyphos*, a hump) refers to the abnormally excessive convex *kyphotic* curvature of the spine as it occurs in the thoracic and sacral regions. (Inward concave curving of the cervical and lumbar regions of the spine is called lordosis.) Kyphosis can be called **roundback** or **Kelso's hunchback**. It can result from degenerative diseases such as arthritis; developmental problems, most commonly Scheuermann's disease; osteoporosis with compression fractures of the vertebra; multiple myeloma; or trauma.

Kyphosis (part 1)

06:16

Kyphosis (part 2)

09:16

**LOESS** and **LOWESS** (**locally weighted scatterplot smoothing**) are two strongly related non-parametric regression methods that combine multiple regression models in a *k*-nearest-neighbor-based meta-model. "LOESS" is a later generalization of LOWESS; although it is not a true initialism, it may be understood as standing for "LOcal regrESSion".

LOESS and LOWESS thus build on "classical" methods, such as linear and nonlinear least squares regression.
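
The core idea, a weighted least-squares line fitted to the nearest neighbors of each target point, can be sketched as follows. This is a simplified illustration in Python (the course uses R), not the full LOWESS algorithm. It omits the robustness iterations, and the tricube kernel, the `frac` span parameter and the function names are assumptions of this sketch.

```python
def tricube(u):
    """Tricube kernel: weight falls from 1 at u = 0 to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def local_linear_fit(x, y, x0, frac=0.5):
    """Smoothed value at x0 from a locally weighted straight-line fit."""
    n = len(x)
    k = max(2, min(n, round(frac * n)))  # number of neighbors in the window
    dist = [abs(xi - x0) for xi in x]
    h = sorted(dist)[k - 1]  # window half-width: k-th nearest distance
    near_y = [yi for d, yi in zip(dist, y) if d <= h]

    w = [tricube(d / h) if h > 0 else 0.0 for d in dist]
    sw = sum(w)
    if sw == 0:
        # Degenerate window (all neighbors on its edge): plain average.
        return sum(near_y) / len(near_y)

    # Weighted least-squares line through the neighborhood.
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        return my
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    return my + (sxy / sxx) * (x0 - mx)


# Made-up data on the line y = 1 + 2x; a local line recovers it.
xs = list(range(10))
ys = [2 * xi + 1 for xi in xs]
print(local_linear_fit(xs, ys, 4.5))
```

Evaluating `local_linear_fit` over a grid of `x0` values traces out the smooth curve. A larger `frac` uses more neighbors per fit and gives a smoother result.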

Non-Parametric Smoothers (part 1)

07:05

Lowess Smoothers (part 2)

08:34

Lowess Smoothers (part 3)

07:37

GAM with Binary Isolation Data

09:52

GAM Examples using mgcv Package (part 1)

07:07

GAM Examples using mgcv Package (part 2)

09:13

GAM Examples using mgcv Package (part 3)

06:56

Strongly Humped Data (part 1)

07:06

Strongly Humped Data (part 2)

08:47

Linear Mixed-Effects Models
7 Lectures
52:23

A **mixed model** is a statistical model containing both fixed effects and random effects. These models are useful in a wide variety of disciplines in the physical, biological and social sciences. They are particularly useful in settings where repeated measurements are made on the same statistical units (longitudinal study), or where measurements are made on clusters of related statistical units.

Linear Mixed-Effects Models (slides, part 1)

08:12

Linear Mixed-Effects Models (slides, part 2)

07:55

Beat the Blues Study (part 2)

07:08

In descriptive statistics, a **box plot** or **boxplot** is a convenient way of graphically depicting groups of numerical data through their quartiles. Box plots may also have lines extending vertically from the boxes (*whiskers*) indicating variability outside the upper and lower quartiles, hence the terms **box-and-whisker plot** and **box-and-whisker diagram**. Outliers may be plotted as individual points. Box plots are non-parametric: they display variation in samples of a statistical population without making any assumptions of the underlying statistical distribution.
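
The numbers behind a box plot can be computed as below. The sketch is in Python for illustration (the course draws box plots in R). The 1.5 × IQR rule for whiskers and outliers is a common convention assumed here, not something stated above, and `boxplot_stats` is a hypothetical name.

```python
from statistics import quantiles


def boxplot_stats(data):
    """Quartiles, whisker ends and outliers for a box plot.

    Needs at least two observations.
    """
    values = sorted(data)
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1

    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
    }


# Made-up data with one unusually large value.
print(boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 30]))
```

Different software uses slightly different quartile definitions, so box edges may differ a little between packages for small samples.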

Beat the Blues Study Boxplots and Data Transformation (part 3)

07:10

Run Beat the Blues Models (part 1)

05:33

Run Beat the Blues Models (part 2)

07:24

Generalized Estimating Equations (GEE)
8 Lectures
01:05:37

In statistics, a **generalized estimating equation (GEE)** is used to estimate the parameters of a generalized linear model with a possible unknown correlation between outcomes.

Parameter estimates from the GEE are consistent even when the covariance structure is misspecified, under mild regularity conditions. The focus of the GEE is on estimating the average response over the population ("population-averaged" effects) rather than the regression parameters that would enable prediction of the effect of changing one or more covariates on a given individual. GEEs are usually used in conjunction with Huber–White standard error estimates, also known as "robust standard error" or "sandwich variance" estimates. In the case of a linear model with a working independence variance structure, these are known as "heteroscedasticity consistent standard error" estimators. Indeed, the GEE unified several independent formulations of these standard error estimators in a general framework.

Preview
10:02

Generalized Estimating Equations (GEE) (slides, part 2)

06:54

GEE with Beat the Blues as Binomial GLM (part 2)

08:05

Respiratory Illness with Binary Response Variable (part 1)

06:10

Respiratory Illness with Binary Response Variable (part 2)

08:34

Respiratory Illness with Binary Response Variable (part 3)

08:38

Respiratory Illness with Binary Response Variable (part 4)

10:06

2 More Sections

About the Instructor

Professor of Information Systems

- Copyright © 2017 Udemy, Inc.