It is often both useful and revealing to create visualizations, plots, and graphs of the multivariate data that is the subject of one's research project. Both pre-analysis and post-analysis visualizations can help one understand “what is going on in the data” in a way that numerical summaries of fitted model estimates cannot. The lattice package in R is designed specifically to depict relationships in multivariate data sets graphically.
This course describes and demonstrates this approach to constructing and drawing grid-based multivariate plots and figures using R. Lattice graphics are multivariable plots (of 3, 4, 5 or more variables) that use conditioning and paneling. Consequently, lattice is a popular approach to, and a good fit for, visually presenting the results of multivariable statistical model fitting. The appearance of most plots, graphs and figures is determined by panel functions rather than by the high-level graphics function calls themselves. Further, the user of lattice graphics has extensive control over many more details and features of the plots than is afforded by the base graphics approach in R. The method is based on trellis graphics, which were popularized in the S language developed at Bell Labs.
Section 1: Introduction to Lattice and to "Trellis" Graphics  

Lecture 1 
Introduction to Course

01:16  
Lecture 2  16:24  
The lattice package, written by Deepayan Sarkar, attempts to improve on base R graphics by providing better defaults and the ability to easily display multivariate relationships. In particular, the package supports the creation of trellis graphs: graphs that display a variable or the relationship between variables, conditioned on one or more other variables. The typical format is graph_type(formula, data = ), where graph_type is the name of a high-level lattice plotting function and formula specifies the variable(s) to display and any conditioning variables. For example, ~x | A means display numeric variable x for each level of factor A; y ~ x | A*B means display the relationship between numeric variables y and x separately for every combination of the levels of factors A and B; and ~x means display numeric variable x alone.
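The three formula forms described in this lecture can be sketched as follows. This is only an illustration, not material taken from the course: the built-in mtcars data set and the variable names cars, p1, p2 and p3 are assumptions chosen for the example.

```r
library(lattice)

# mtcars ships with R; cyl and am are turned into factors so they can
# serve as conditioning variables (this data set is an assumed example).
cars <- transform(mtcars,
                  cyl = factor(cyl),
                  am  = factor(am, labels = c("automatic", "manual")))

# ~x : display one numeric variable alone
p1 <- densityplot(~ mpg, data = cars)

# ~x | A : display x separately for each level of factor A
p2 <- densityplot(~ mpg | cyl, data = cars)

# y ~ x | A*B : display y against x for every combination of A and B levels
p3 <- xyplot(mpg ~ wt | cyl * am, data = cars)

# In an interactive session, typing p3 (or calling print(p3)) draws the plot.
```

Each call returns a trellis object rather than drawing immediately, which is why the results can be stored and printed later.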

Lecture 3  13:18  
A trellis object, as returned by high level lattice functions like 

Lecture 4 
Dimension and Physical Layout

14:00  
Lecture 5 
Scales and Axes

08:43  
Lecture 6  17:04  


Lecture 7 
Visualizing Univariate Distributions (part 2)

14:56  
Lecture 8  14:14  


Lecture 9  08:03  
Box-and-whisker plots summarize the data using a few quantiles, and possibly some outliers. This summarizing can be important when the number of observations is large. When the number of observations per sample is small, it is often sufficient to simply plot the sample values side by side on a common scale. Such plots are known as strip plots, also referred to as univariate scatter plots. They are in fact very similar to bivariate scatter plots.
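As a rough sketch of the contrast drawn here, the same formula can be passed to bwplot() for a quantile summary and to stripplot() for the raw values. The mtcars data set and the object names are assumptions made for illustration only.

```r
library(lattice)

# Assumed example data: miles per gallon grouped by number of cylinders.
cars <- transform(mtcars, cyl = factor(cyl))

# Box-and-whisker plot: summarizes each group by a few quantiles.
p_box <- bwplot(cyl ~ mpg, data = cars)

# Strip plot: plots every observation on the same scale;
# jitter.data = TRUE spreads out points that would otherwise overlap.
p_strip <- stripplot(cyl ~ mpg, data = cars, jitter.data = TRUE)
```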

Section 2: Multiway Tables and Scatter Plots  
Lecture 10  12:16  
An important subset of statistical data comes in the form of tables. Tables usually record the frequency or proportion of observations that fall into a particular category or combination of categories. They could also encode some other summary measure such as a rate (of binary events) or mean (of a continuous variable). In R, tables are usually represented by arrays of one (vectors), two (matrices), or more dimensions. To distinguish them from other vectors and arrays, they often have class “table”. The R functions table() and xtabs() can be used to create tables from raw data. 
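A minimal sketch of building such tables and handing them to lattice is shown below. The choice of mtcars and of the cyl and gear variables is an assumption for the example; barchart() and dotplot() both accept table objects directly.

```r
library(lattice)

# Two equivalent ways to cross-tabulate raw data (assumed example: mtcars).
tab1 <- table(cyl = mtcars$cyl, gear = mtcars$gear)
tab2 <- xtabs(~ cyl + gear, data = mtcars)

# Both results carry class "table", which lattice uses for method dispatch.
class(tab2)

# Stacked bar chart of the counts, with a legend for the gear levels.
p_bar <- barchart(tab2, auto.key = TRUE)

# Dot plot of the same table, one panel per level of the second dimension.
p_dot <- dotplot(tab2, groups = FALSE)
```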

Lecture 11 
Multipanel Dot Plots

11:09  
Lecture 12  13:37  
A scatter plot graphs two variables directly against each other in a Cartesian coordinate system. It is a simple graphic in the sense that the data are directly encoded without being summarized in any way; often the aspects that the user needs to worry about most are graphical ones such as whether to join the points by a line, what colors to use, and so on. Depending on the purpose, scatter plots can also be enhanced in several ways. In this chapter, we go over some of the variants supported by panel.xyplot(), which is the default panel function for both xyplot() and splom() (under the alias panel.splom()). 
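One of the variants handled by panel.xyplot() is selected through its type argument, which can combine several renderings in a single panel. The sketch below assumes the built-in mtcars data; the particular variables and settings are illustrative only.

```r
library(lattice)

# type is passed on to panel.xyplot():
#   "g" draws a reference grid, "p" draws points, "r" adds a regression line.
p_line <- xyplot(mpg ~ wt, data = mtcars,
                 type = c("g", "p", "r"),
                 pch = 16, col = "black")

# With a grouping variable, points and lines are drawn once per group.
p_grp <- xyplot(mpg ~ wt, data = mtcars, groups = cyl,
                type = c("p", "r"), auto.key = TRUE)
```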

Lecture 13 
Shingles and Advanced Indexing

06:34  
Lecture 14 
More Scatter Plots (part 1)

16:08  
Lecture 15 
More Scatter Plots (part 2)

10:22  
Lecture 16  13:39  
Scatterplot matrices, produced by splom(), are exactly what the name suggests; they are a matrix of pairwise scatter plots given two or more variables. Conditioning is possible, but it is more common to call splom() with a data frame as its first argument. 
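Both calling styles mentioned here can be sketched with the built-in iris data, which is an assumed example rather than one taken from the course.

```r
library(lattice)

# Data frame as the first argument: all pairwise scatter plots
# of the four numeric columns.
p_all <- splom(iris[1:4])

# Formula interface with a grouping variable shown by color.
p_grp <- splom(~ iris[1:4], groups = Species, data = iris,
               auto.key = list(columns = 3))

# Formula interface with conditioning: one scatterplot matrix per species.
p_cond <- splom(~ iris[1:4] | Species, data = iris)
```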

Lecture 17  07:29  
Like scatterplot matrices, parallel coordinates plots are hypervariate in nature, that is, they show relationships between an arbitrary number of variables. Their design is related to univariate scatter plots; in fact, they are basically univariate scatter plots of all variables of interest stacked parallel to each other (vertically in the implementation in lattice), with values that correspond to the same observation linked by line segments. 
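A parallel coordinates plot takes the same kind of formula as splom(). The following is a small sketch using the built-in iris data, chosen here only as an assumed example.

```r
library(lattice)

# One panel per species; each observation becomes a line that links its
# values on the four measurement axes.
p_par <- parallelplot(~ iris[1:4] | Species, data = iris)

# horizontal.axis = FALSE turns the display so the variable axes are vertical.
p_par_v <- parallelplot(~ iris[1:4] | Species, data = iris,
                        horizontal.axis = FALSE)
```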

Section 3: Trivariate, 3D, and Other Complex Displays  
Lecture 18  09:01  
Trivariate displays encode three primary variables in a panel. There are four high-level functions in lattice that produce trivariate displays: cloud() creates three-dimensional scatter plots of unstructured trivariate data, whereas levelplot(), contourplot(), and wireframe() render surfaces or two-dimensional tables evaluated on a systematic rectangular grid. Of these, cloud() and wireframe() are similar in that they both create two-dimensional projections of three-dimensional constructs, and they share several common arguments that control the details of the projection.
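For data already evaluated on a rectangular grid, the grid-based functions can be called on a matrix directly. The sketch below uses the volcano elevation matrix that ships with R, which is an assumed example data set.

```r
library(lattice)

# volcano is a numeric matrix of elevations on a regular grid.
dim(volcano)

# False-color image of the surface heights.
p_level <- levelplot(volcano)

# Contour lines of the same surface; cuts controls how many levels are drawn.
p_contour <- contourplot(volcano, cuts = 10)
```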

Lecture 19  09:56  
We begin with cloud(), which produces three-dimensional scatter plots. Most of the discussion in this section about projection and how to control it in cloud() applies to wireframe() as well.
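The shared projection control can be sketched with the screen argument, which both functions accept as a list of rotations. The data sets (iris and volcano) and the rotation angles below are assumptions picked for illustration.

```r
library(lattice)

# Rotation applied to the 3D scene before projection (assumed values).
view <- list(z = 20, x = -70)

# Three-dimensional scatter plot of unstructured trivariate data.
p_cloud <- cloud(Sepal.Length ~ Petal.Length * Petal.Width, data = iris,
                 groups = Species, screen = view, auto.key = TRUE)

# Surface plot of gridded data, projected with the same viewing angles.
p_wire <- wireframe(volcano, shade = TRUE, screen = view)
```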

Lecture 20 
3D Scatter Plots (part 2)

08:38  
Lecture 21 
3D Panel Functions

17:08  
Lecture 22 
Visualizing 3D Surfaces

13:52  
Lecture 23 
More 3D Visualizations

16:50  
Lecture 24  13:39  
The methods we used to plot regression surfaces using wireframe() can be easily adapted to mathematical surfaces. 
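As one possible sketch of a mathematical surface, the function sin(r)/r can be evaluated on a grid and passed to wireframe(). The function, the grid size and the names g and r are assumptions made for this example.

```r
library(lattice)

# Evaluate the surface on a regular 41 x 41 grid (assumed resolution).
g <- expand.grid(x = seq(-8, 8, length.out = 41),
                 y = seq(-8, 8, length.out = 41))
r <- sqrt(g$x^2 + g$y^2)

# sin(r)/r is undefined at r = 0, so use its limiting value of 1 there.
g$z <- ifelse(r == 0, 1, sin(r) / r)

# The formula z ~ x * y maps the grid coordinates to the surface height.
p_surf <- wireframe(z ~ x * y, data = g, shade = TRUE)
```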

Section 4: Finer Control Graphical Parameters and Other Settings  
Lecture 25  15:05  
Graphical parameters are often critical in determining the effectiveness of a plot. Such parameters include obvious ones such as colors, symbols, line types, and fonts for the various elements of a graph, as well as more subtle ones such as the length of tick marks or the amount of space separating different components of the graph. The parameters used in lattice displays are highly customizable. Many of them can be controlled directly by specifying suitable arguments in a high-level function call. Most derive their default values from a system of common global settings that can also be modified by the user. The latter approach has two primary benefits: it allows good global defaults to be specified, and it provides a consistent “look and feel” to lattice graphics while letting the user retain ultimate control.
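A small sketch of overriding settings for a single plot is given below. It uses the par.settings argument, which applies changes to one trellis object only; trellis.par.set() is the usual route for changing the global settings instead. The data set and the specific values are assumptions for illustration.

```r
library(lattice)

# Override the default plotting symbol for this plot only, and use the
# scales argument to draw tick marks on the bottom/left axes but not
# on the top/right ones.
p_custom <- xyplot(mpg ~ wt, data = mtcars,
                   par.settings = list(plot.symbol = list(pch = 16,
                                                          col = "black")),
                   scales = list(tck = c(1, 0)))

# The overrides are stored with the object and applied when it is printed.
p_custom$par.settings$plot.symbol
```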

Lecture 26 
Graphical Parameters Continued

14:15  
Lecture 27 
Plot Coordinates and Axis Annotation

13:19  
Lecture 28 
Labels and Legends

14:49  
Lecture 29 
Data Manipulation (part 1)

13:56  
Lecture 30 
Data Manipulation (part 2)

15:27  
Lecture 31 
Shingles and Related Utilities

14:57  
Lecture 32 
Ordering Categorical Variables

14:59 
Dr. Geoffrey Hubona held full-time tenure-track, and tenured, assistant and associate professor faculty positions at 3 major state universities in the Eastern United States from 1993-2010. In these positions, he taught dozens of statistics, business information systems, and computer science courses to undergraduate, master's and Ph.D. students. He earned a Ph.D. in Business Administration (Information Systems and Computer Science) from the University of South Florida (USF) in Tampa, FL (1993); an MA in Economics (1990), also from USF; an MBA in Finance (1979) from George Mason University in Fairfax, VA; and a BA in Psychology (1972) from the University of Virginia in Charlottesville, VA. He was a full-time assistant professor at the University of Maryland Baltimore County (1993-1996) in Catonsville, MD; a tenured associate professor in the department of Information Systems in the Business College at Virginia Commonwealth University (1996-2001) in Richmond, VA; and an associate professor in the CIS department of the Robinson College of Business at Georgia State University (2001-2010). He is the founder of the Georgia R School (2010-2014) and of RCourseware (2014-Present), online educational organizations that teach research methods and quantitative analysis techniques. These techniques include linear and nonlinear modeling, multivariate methods, data mining, programming and simulation, and structural equation modeling and partial least squares (PLS) path modeling. Dr. Hubona is an expert in the analytical, open-source R software suite and in various PLS path modeling software packages, including SmartPLS. He has published dozens of research articles that explain and use these techniques for the analysis of data, and, with software co-development partner Dean Lim, has created a popular cloud-based PLS software application, PLSGUI.