Udemy
Applied Multivariate Analysis with R
Rating: 4.0 out of 5 (413 ratings)
4,652 students

Learn to use R software to conduct PCAs, MDSs, cluster analyses, EFAs and to estimate SEM models.
Last updated 7/2020
English

What you'll learn

  • Conceptualize and apply multivariate skills and "hands-on" techniques using R software in analyzing real data.
  • Create novel and stunning 2D and 3D multivariate data visualizations with R.
  • Set up and estimate a Principal Components Analysis (PCA).
  • Formulate and estimate a Multidimensional Scaling (MDS) problem.
  • Group similar (or dissimilar) data with Cluster Analysis techniques.
  • Estimate and interpret an Exploratory Factor Analysis (EFA).
  • Specify and estimate a Structural Equation Model (SEM) using RAM notation in R.
  • Be knowledgeable about SEM simulation capabilities from the R SIMSEM package.

Course content

7 sections, 75 lectures, 12h 13m total length
  • Introduction to Multivariate Analysis (MVA) Course (11:40)

    This video presents an overview of the Applied Multivariate Analysis (MVA) course.

  • Materials for Section 1 Introduction to MV Data and Analysis (2:25)

    The materials used in the video lectures for Section 1 Introduction to Multivariate Data and Analysis are briefly explained and then provided as a .zip file download after the short video is presented.

  • What is "Multivariate Analysis"? (14:15)

    Multivariate analysis (MVA) rests on multivariate statistics, which involves the observation and analysis of more than one statistical outcome variable at a time. In design and analysis, the technique is used to perform trade studies across multiple dimensions while taking into account the effects of all variables on the responses of interest. Some of the applications include:

    • To reduce a large number of variables to a smaller number of factors for data modeling.

    • To validate a scale or index by demonstrating that its constituent items load on the same factor, and to drop proposed scale items which cross-load on more than one factor.

    • To select a subset of variables from a larger set, based on which original variables have the highest correlations with some other factors.

    • To create a set of factors to be treated as uncorrelated variables, as one approach to handling multicollinearity in procedures such as multiple regression.

    In this "hands-on" course on applied multivariate analysis, we focus on how to actually use and conduct MVA analyses, using dozens of real data sets and R software. We examine the techniques and examples of principal components analysis, multidimensional scaling, cluster analysis, exploratory factor analysis, and an introduction to structural equation modeling.

  • Missing Values and the Measure Dataset (8:20)

    Missing data is a major problem in analyzing data sets because many statistical and mathematical functions fail, or return a missing result, when any observation has even one missing value. We explain and demonstrate why this is a problem using a 'body measures' dataset that we construct in R, and we show some "quick fixes" for getting around missing data in multivariate analysis.
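The course demonstrates this in R. As a language-neutral illustration, the following is a minimal Python sketch with made-up numbers (not the course's 'body measures' dataset). It shows how one missing value poisons a column mean, and how listwise deletion, one possible "quick fix", gets around it at the cost of discarding rows. The function names are hypothetical.

```python
import math

# Hypothetical body-measure rows: (chest, waist, hips).
# math.nan marks a missing value; the numbers are invented for illustration.
measures = [
    (34.0, 30.0, 32.0),
    (37.0, math.nan, 37.0),
    (38.0, 30.0, 36.0),
    (36.0, 33.0, math.nan),
    (38.0, 29.0, 33.0),
]

def column_mean(rows, col):
    """Plain arithmetic mean of one column; a single NaN makes the result NaN."""
    values = [row[col] for row in rows]
    return sum(values) / len(values)

def complete_cases(rows):
    """Listwise deletion: keep only the rows that have no missing value."""
    return [row for row in rows if not any(math.isnan(v) for v in row)]

waist_mean_raw = column_mean(measures, 1)       # NaN, because one waist value is missing
complete = complete_cases(measures)             # only fully observed rows remain
waist_mean_complete = column_mean(complete, 1)  # mean over the complete rows only
```

Listwise deletion is simple but wasteful: here two of five rows are dropped because each lacks a single value.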

  • Other Multivariate Datasets (10:11)

    We create several multivariate data sets using R software. We use these data sets and others in the rest of the course.

  • Covariance, Correlation and Distance (part 1) (11:36)

    In probability theory and statistics, a covariance matrix (also known as dispersion matrix or variance–covariance matrix) is a matrix whose element in the i, j position is the covariance between the i th and j th elements of a random vector (that is, of a vector of random variables). Each element of the vector is a scalar random variable, either with a finite number of observed empirical values or with a finite or infinite number of potential values specified by a theoretical joint probability distribution of all the random variables.

    The correlation matrix of n random variables X1, ..., Xn is the n × n matrix whose i,j entry is corr(Xi, Xj). If the measures of correlation used are product-moment coefficients, the correlation matrix is the same as the covariance matrix of the standardized random variables Xi / σ (Xi) for i = 1, ..., n. This applies to both the matrix of population correlations (in which case "σ" is the population standard deviation), and to the matrix of sample correlations (in which case "σ" denotes the sample standard deviation). Consequently, each is necessarily a positive-semidefinite matrix.

    The correlation matrix is symmetric because the correlation between Xi and Xj is the same as the correlation between Xj and Xi.
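To make the definitions above concrete, here is a small Python sketch (the course itself works in R). It builds a sample covariance matrix entry by entry and then obtains the correlation matrix as the covariance matrix of the standardized variables Xi / σ(Xi), as described above. The data and function names are hypothetical.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def covariance_matrix(columns):
    """Entry (i, j) is the covariance between variable i and variable j."""
    return [[covariance(ci, cj) for cj in columns] for ci in columns]

def correlation_matrix(columns):
    """Covariance matrix of the standardized variables X_i / sd(X_i)."""
    standardized = []
    for col in columns:
        sd = math.sqrt(covariance(col, col))
        standardized.append([x / sd for x in col])
    return covariance_matrix(standardized)

# Hypothetical data: three variables, each observed on five units.
columns = [
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [1.0, 3.0, 2.0, 5.0, 4.0],
    [9.0, 7.0, 8.0, 4.0, 2.0],
]

S = covariance_matrix(columns)   # variances on the diagonal
R = correlation_matrix(columns)  # ones on the diagonal, symmetric
```

Dividing by the standard deviation without centering is enough here, because covariance is unaffected by shifting a variable by a constant.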

  • Covariance, Correlation and Distance (part 2) (10:21)

    We continue our discussion of creating, estimating and using both covariance and correlation matrices in multivariate analysis using R software. We also introduce the concept of "distance" for finding similarities / differences among sets of variables.
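The listing does not say which distance measure the lecture uses. As one common choice, the sketch below computes a Euclidean distance matrix in Python (the course itself uses R). The points and function names are hypothetical.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two observations of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def distance_matrix(rows):
    """Entry (i, j) is the distance between observation i and observation j."""
    return [[euclidean(p, q) for q in rows] for p in rows]

# Hypothetical observations measured on two variables.
rows = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
D = distance_matrix(rows)  # symmetric, with zeros on the diagonal
```

Small entries in the matrix indicate similar observations and large entries indicate dissimilar ones. Distances of this kind are the input to multidimensional scaling and cluster analysis.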

  • Covariance, Correlation and Distance (part 3) (10:12)

    We continue our discussion of creating, estimating and using both covariance and correlation matrices in multivariate analysis using R software. We also introduce the concept of "distance" for finding similarities / differences among sets of variables.

  • The Multivariate Normal Density Function (11:28)

    We describe, create (with simulation), demonstrate and visualize a multivariate normal (MVN) density function using R. In probability theory and statistics, the multivariate normal distribution or multivariate Gaussian distribution, is a generalization of the one-dimensional (univariate) normal distribution to higher dimensions. One possible definition is that a random vector is said to be k-variate normally distributed if every linear combination of its k components has a univariate normal distribution. Its importance derives mainly from the multivariate central limit theorem. The multivariate normal distribution is often used to describe, at least approximately, any set of (possibly) correlated real-valued random variables each of which clusters around a mean value.
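The linear-combination definition can be checked by simulation. The following Python sketch (the course uses R) draws from a bivariate normal distribution using a Cholesky factor of the covariance matrix, then compares the sample mean and variance of one linear combination with the theoretical values a'μ and a'Σa. All parameter values and function names are assumptions chosen for illustration.

```python
import math
import random

def cholesky_2x2(sigma):
    """Lower-triangular factor L with L L^T = sigma, for a 2 x 2 covariance matrix."""
    l11 = math.sqrt(sigma[0][0])
    l21 = sigma[1][0] / l11
    l22 = math.sqrt(sigma[1][1] - l21 ** 2)
    return l11, l21, l22

def simulate_bivariate_normal(mu, sigma, n, rng):
    """Draw n points as mu + L z, where z holds two independent standard normals."""
    l11, l21, l22 = cholesky_2x2(sigma)
    draws = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2))
    return draws

def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Illustrative parameters (assumptions, not taken from the course).
mu = (1.0, -2.0)
sigma = [[4.0, 1.2], [1.2, 1.0]]
a = (0.5, 2.0)  # weights of the linear combination a1*X1 + a2*X2

rng = random.Random(42)  # fixed seed so the run is repeatable
draws = simulate_bivariate_normal(mu, sigma, 20000, rng)
combo = [a[0] * x1 + a[1] * x2 for x1, x2 in draws]

# Theory: a'X is univariate normal with mean a'mu and variance a' Sigma a.
theory_mean = a[0] * mu[0] + a[1] * mu[1]
theory_var = (a[0] ** 2 * sigma[0][0]
              + 2 * a[0] * a[1] * sigma[0][1]
              + a[1] ** 2 * sigma[1][1])

sample_mean = sum(combo) / len(combo)
sample_var = sample_variance(combo)
```

With a sample this large, `sample_mean` and `sample_var` should land close to `theory_mean` and `theory_var`. A histogram or normal quantile plot of `combo` would be the natural next check of its shape.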

  • Setting Up Normality Plots (10:03)

    We demonstrate several graphical approaches in R for assessing univariate and multivariate normality.

  • Drawing Normality Plots (13:41)

    We continue our illustrative cases and examples of creating normality plots in R software.

  • Covariance, Correlation and Normality Exercises (7:09)

    This video lecture explains the three covariance, correlation and normality exercises for the first section of the applied MVA course.

Requirements

  • No specific knowledge or skills are required.
  • Students will need to install the popular no-cost R Console and RStudio software (instructions provided).
  • Some interest and aptitude in quantitative or statistical analysis is helpful.

Description

Applied Multivariate Analysis (MVA) with R is a practical, conceptual and applied "hands-on" course that teaches students how to perform specific MVA tasks using real data sets and R software. It is an excellent and practical background course for anyone whose educational or professional work involves data mining or predictive analytics, statistical or quantitative modeling (including linear, GLM and/or non-linear modeling), covariance-based Structural Equation Modeling (SEM) specification and estimation, and/or variance-based PLS Path Model specification and estimation.

Students learn about the nature of multivariate data and multivariate analysis. They specifically learn how to create and estimate: covariance and correlation matrices; Principal Components Analyses (PCA); Multidimensional Scaling (MDS); Cluster Analysis; Exploratory Factor Analyses (EFA); and SEM models. The course also teaches how to create dozens of different dazzling 2D and 3D multivariate data visualizations using R software. All software, R scripts, datasets and slides used in the lectures are provided in the course materials.

The course is structured as a series of seven sections, each addressing a specific MVA topic and each culminating in one or more "hands-on" exercises that students complete before proceeding, to reinforce the MVA concepts and skills presented. The course is an excellent vehicle for acquiring "real-world" predictive analytics skills that are in high demand in the workplace today. It is also a fertile source of relevant skills and knowledge for graduate students and faculty who are required to analyze and interpret research data.

Who this course is for:

  • Anyone interested in using multivariate analysis techniques as a basis for data mining, statistical modeling, and structural equation modeling (SEM) estimation.
  • Practicing quantitative analysis professionals including college and university faculty seeking to learn new multivariate data analysis skills.
  • Undergraduate students looking for jobs in predictive or business analytics fields.
  • Graduate students wishing to learn more applied data analysis techniques and approaches.