Linear Regression, GLMs and GAMs with R

How to extend linear regression to specify and estimate generalized linear models and additive models.
  • Lectures 69
  • Length 8 hours
  • Skill Level All Levels
  • Languages English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion

About This Course

Published 1/2016 English

Course Description

Linear Regression, GLMs and GAMs with R demonstrates how to use R to relax the basic assumptions and constraints of linear regression in order to specify, estimate, and interpret generalized linear models (GLMs) and generalized additive models (GAMs). The course demonstrates the estimation of GLMs and GAMs by working through a series of practical examples from the book Generalized Additive Models: An Introduction with R by Simon N. Wood (Chapman & Hall/CRC Texts in Statistical Science, 2006).

Linear statistical models have a univariate response modeled as a linear function of predictor variables plus a zero-mean random error term. The assumption of linearity is a critical (and limiting) characteristic. GLMs relax this assumption: they permit the expected value of the response variable to be a smooth, monotonic (and possibly non-linear) function of the linear predictor. GLMs also relax the assumption that the response variable is normally distributed by allowing for other distributions (e.g. normal, Poisson, binomial).

GAMs are extensions of GLMs in which the linear predictor can include smooth functions of the predictor variables, estimated with non-parametric smoothers, rather than only fixed regression coefficients. Non-parametric smoothers such as lowess (locally weighted scatterplot smoothing) fit a smooth curve to data using localized subsets of the data.

GLMs, and especially GAMs, have evolved into standard statistical methodologies of considerable flexibility. The course addresses recent approaches to modeling, estimating and interpreting GAMs, and its focus is on modeling and interpreting GLMs and especially GAMs with R. Use of the freely available R software illustrates the practicalities of linear, generalized linear, and generalized additive models.

What are the requirements?

  • Students will need to install R and R Commander; ample instruction for doing so is provided.

What am I going to get from this course?

  • Understand the assumptions of ordinary least squares (OLS) linear regression.
  • Specify, estimate and interpret linear (regression) models using R.
  • Understand how the assumptions of OLS regression are modified (relaxed) in order to specify, estimate and interpret generalized linear models (GLMs).
  • Specify, estimate and interpret GLMs using R.
  • Understand the mechanics and limitations of specifying, estimating and interpreting generalized additive models (GAMs).

What is the target audience?

  • This course would be useful for anyone involved in estimating linear models, including graduate students and working professionals in quantitative modeling and data analysis.
  • The focus, and the majority of the content, of this course is generalized additive modeling. Anyone who wishes to learn how to specify, estimate and interpret GAMs would especially benefit from this course.

Curriculum

Section 1: Introduction to Course and to Linear Modeling
Introduction to Course
01:51
Preliminaries: Installing R, RStudio, R Commander, Course Materials and Exercise
05:16
Beginning Agenda (slides)
08:18
05:11

The term "linear" refers to the fact that the model is linear in its parameters; in the simplest case, with one predictor, we are fitting a straight line. The term "model" refers to the equation that summarizes the line that we fit. The term "linear model" is often taken as synonymous with "linear regression model".

06:08

Assumptions of Linear Models (regression):

  1. The residuals are independent
  2. The residuals are normally distributed
  3. The residuals have a mean of 0 at all values of X
  4. The residuals have constant variance
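
As a rough illustration of the assumptions above, here is a minimal sketch in plain Python (not the course's R code; in R the same fit is obtained with `lm(y ~ x)`). The data values and the helper name `fit_line` are made up for illustration. When the model includes an intercept, the fitted residuals average to zero overall by construction, so assumption 3 has to be checked across the range of X, for example by comparing the mean residual in the lower and upper halves of the data.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x, using the closed form."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

# Made-up illustrative data, already sorted by x (not from the course).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Overall mean is zero by construction; the two halves are the informative check.
overall_mean = mean(residuals)
low_half_mean = mean(residuals[:3])
high_half_mean = mean(residuals[3:])
print(b0, b1, overall_mean, low_half_mean, high_half_mean)
```

A clear trend in the half-wise means (or in a plot of residuals against X) would suggest that a straight line is the wrong functional form.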
Desirable Properties of Beta-hat (slides, part 3)
07:19
Example: Estimate Age of Universe (slides)
04:39
Example: Estimate Age of Universe Live in R (part 1)
07:44
Example: Estimate Age of Universe Live in R (part 2)
09:22
Example: Estimating Age of the Universe (part 3)
08:50
Finish Example and More Notes on Linear Modeling
08:31
Linear Modeling Exercises
01:48
Section 2: Generalized Linear Models (GLMs) Part 1
06:58

In statistics, the generalized linear model (GLM) is a flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to be a function of its predicted value.
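
The description above can be written compactly. The notation here is a common textbook convention and is an assumption of this sketch, not taken from the course slides: g is the link function, V is the variance function, and φ is a dispersion parameter.

```latex
\begin{align}
  \eta_i &= \mathbf{x}_i^{\top}\boldsymbol{\beta}
    && \text{(linear predictor)} \\
  g(\mu_i) &= \eta_i, \quad \mu_i = \mathbb{E}[Y_i]
    && \text{(link function)} \\
  \operatorname{Var}(Y_i) &= \phi\, V(\mu_i)
    && \text{(variance as a function of the mean)}
\end{align}
```

Ordinary linear regression is the special case with the identity link g(μ) = μ, constant variance function V(μ) = 1, and φ equal to the error variance.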

Introduction to GLMs (slides, part 2)
07:29
Introduction to GLMs (slides, part 3)
07:50
Introduction to GLMs (slides, part 4)
06:44
07:50

Proportion data have values that fall between zero and one. Naturally, it would be nice to have the predicted values also fall between zero and one. One way to accomplish this is to use a generalized linear model (GLM) with a logit link and the binomial family.
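
A small sketch in plain Python (not the course's R code; in R such a model is fitted with `glm(..., family = binomial)`) shows why the logit link keeps predictions between zero and one. The coefficients `b0` and `b1` and the predictor values are hypothetical and chosen only for illustration.

```python
import math

def logit(p):
    """Link function: maps a proportion in (0, 1) to the whole real line."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse link: maps any linear predictor value back into (0, 1)."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)          # avoids overflow for large negative eta
    return z / (1.0 + z)

# Hypothetical coefficients for a single predictor (illustration only).
b0, b1 = -3.0, 0.05

# However extreme the predictor, the fitted proportion stays inside (0, 1).
predictor_values = [-200, 0, 60, 500]
fitted = [inv_logit(b0 + b1 * x) for x in predictor_values]
for x, p in zip(predictor_values, fitted):
    print(x, p)
```

A straight-line fit to the raw proportions has no such guarantee and can predict values below zero or above one.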

Example: Binomial (Proportion) Model with Heart Disease (part 2)
07:26
Example: Binomial (Proportion) Model with Heart Disease (part 3)
08:16
Example: Binomial (Proportion) Model with Heart Disease (part 4)
06:22
GLM Exercises
01:05
Section 3: Generalized Linear Models Part 2
Current Agenda
01:46
Linear Regression Exercise Solutions (part 1)
07:31
Linear Regression Exercise Solutions (part 2)
07:29
GLM Exercise Solutions (part 3)
09:30
08:15

In statistics, Poisson regression is a form of regression analysis used to model count data and contingency tables. Poisson regression assumes the response variable Y has a Poisson distribution, and assumes the logarithm of its expected value can be modeled by a linear combination of unknown parameters. A Poisson regression model is sometimes known as a log-linear model, especially when used to model contingency tables.

Poisson regression models are generalized linear models with the logarithm as the (canonical) link function, and the Poisson distribution function as the assumed probability distribution of the response.
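
To show what fitting such a model involves, here is a minimal sketch in plain Python (not the course's R code; in R the equivalent call is `glm(y ~ x, family = poisson)`). It fits log(mu) = b0 + b1 * x by Newton's method, which for the canonical log link coincides with Fisher scoring. The function name `fit_poisson`, the fixed iteration count, and the count data are assumptions made for illustration.

```python
import math

def fit_poisson(x, y, iterations=25):
    """Poisson regression with a log link and one predictor, via Newton steps."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0   # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: derivative of the log-likelihood w.r.t. (b0, b1).
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix [[i00, i01], [i01, i11]], with weights mu.
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton update: solve the 2x2 system information * step = score.
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1

# Made-up count data (illustration only).
x = [0, 1, 2, 3, 4, 5]
y = [1, 2, 4, 6, 12, 19]

b0, b1 = fit_poisson(x, y)
mu_hat = [math.exp(b0 + b1 * xi) for xi in x]
print(b0, b1)
```

Because the link is the logarithm, exp(b1) is read as the multiplicative change in the expected count for a one-unit increase in x.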

Example: Poisson Model with Count Data (part 2)
09:29
Example: Binary Response Variable (part 1)
04:43
Example: Binary Response Variable (part 2)
06:12
Exercise: GLM to GAM
01:40
05:55

Log-linear analysis is a technique used in statistics to examine the relationship between more than two categorical variables.

More on Deviance and Overdispersion (slides)
03:11
Section 4: Generalized Additive Models Explained
07:41

In statistics, a generalized additive model (GAM) is a generalized linear model in which the linear predictor depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions. GAMs were originally developed by Trevor Hastie and Robert Tibshirani to blend properties of generalized linear models with additive models.
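
In symbols, and using notation assumed here rather than taken from the course slides (g is the link function, and each f_j is an unknown smooth function to be estimated), the model for p predictors can be sketched as:

```latex
\[
  g\bigl(\mathbb{E}[Y_i]\bigr)
    = \beta_0 + f_1(x_{i1}) + f_2(x_{i2}) + \cdots + f_p(x_{ip})
\]
```

Replacing each smooth term f_j(x_{ij}) with a linear term β_j x_{ij} recovers the GLM, which is why a GAM can mix parametric and smooth terms in the same predictor.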

What are GAMs? (Crawley, slides, part 2)
06:02
Demonstrate GAM Ozone Data (part 1)
09:40
Demonstrate GAM Ozone Data (part 2)
09:42
General Approaches for Fitting GAMs (slides)
02:44
What are GAMs? (Wood, slides, part 1)
11:34
Univariate Polynomial GAMs (Wood, slides, part 2)
07:27
Univariate Polynomial GAMs (Wood, slides, part 3)
05:52
GAMs as 4th Order Polynomials (slides, part 1)
06:21
GAMs as 4th Order Polynomials (slides, part 2)
04:29
GAMs as Regression Splines (slides)
03:38
Cubic Splines (slides, part 1)
08:45
Cubic Splines (slides, part 2)
04:21
Function to Establish Basis for Spline (slides)
07:33
Build-a-GAM (slides, part 1)
07:46
Build-a-GAM (slides, part 2)
10:16
Build-a-GAM (slides, part 3)
06:17
Build-a-GAM Demonstration in R Script
11:34
Build-a-GAM Cross Validation
08:13
Bivariate GAMs with 2 Explanatory Independent Variables (slides, part 1)
09:17
Bivariate GAMs with 2 Explanatory Independent Variables (slides, part 2)
07:31
Exercises
01:33
Section 5: Detailed GAM Examples
Current Agenda (slides)
05:23
Cherry Trees and Finer Control (slides, part 1)
08:10
Finer Control of GAM (slides, part 2)
10:52
Using Smoothers with More than One Predictor (slides)
07:04
More on Alternative Smoothing Bases (slides)
08:06
Parametric Model Terms (slides)
08:29
Example: Brain Imaging (part 1)
07:51
Example: Brain Imaging (part 2)
08:09
Example: Brain Imaging (part 3)
07:38
Example: Brain Imaging (part 4)
07:03
Example: Brain Imaging (part 5)
07:41
Example: Air Pollution in Chicago (part 1)
09:33
Example: Air Pollution in Chicago (part 2)
09:17
Air Pollution in Chicago (part 3)
04:40
More Exercises
05:41

Instructor Biography

Geoffrey Hubona, Ph.D., Professor of Information Systems

Dr. Geoffrey Hubona held full-time tenure-track and tenured faculty positions, as assistant and associate professor, at three major state universities in the Eastern United States from 1993 to 2010. In these positions, he taught dozens of statistics, business information systems, and computer science courses to undergraduate, master's and Ph.D. students.

He earned a Ph.D. in Business Administration (Information Systems and Computer Science) from the University of South Florida (USF) in Tampa, FL (1993); an MA in Economics (1990), also from USF; an MBA in Finance (1979) from George Mason University in Fairfax, VA; and a BA in Psychology (1972) from the University of Virginia in Charlottesville, VA. He was a full-time assistant professor at the University of Maryland Baltimore County (1993-1996) in Catonsville, MD; a tenured associate professor in the department of Information Systems in the Business College at Virginia Commonwealth University (1996-2001) in Richmond, VA; and an associate professor in the CIS department of the Robinson College of Business at Georgia State University (2001-2010).

He is the founder of the Georgia R School (2010-2014) and of R-Courseware (2014-Present), online educational organizations that teach research methods and quantitative analysis techniques. These techniques include linear and non-linear modeling, multivariate methods, data mining, programming and simulation, structural equation modeling, and partial least squares (PLS) path modeling. Dr. Hubona is an expert in the open-source R software suite and in various PLS path modeling software packages, including SmartPLS. He has published dozens of research articles that explain and use these techniques for the analysis of data and, with software co-development partner Dean Lim, has created a popular cloud-based PLS software application, PLS-GUI.
