Multivariate Data Visualization with R
What you'll learn
- Graphically depict 2D, 3D, 4D (and higher-dimensional) relationships that exist in multivariate data sets.
- Understand how "trellis" graphic objects are different from other graphic objects in R.
- Understand how to apply the techniques of conditioning and paneling to present multivariate data relationships.
- Understand the nature of lattice panel functions and know how to create and modify them for brilliant multivariate graphics displays.
- Have a powerful toolset for visually presenting the results of multi-variable statistical model fitting.
- Students will need to install R and RStudio (instructions are provided in the course materials).
It is often both useful and revealing to create visualizations, plots and graphs of the multivariate data that is the subject of one's research project. Both pre-analysis and post-analysis visualizations can help one understand “what is going on in the data” in a way that looking at numerical summaries of fitted model estimates cannot. The lattice package in R is designed specifically to depict relationships in multivariate data sets graphically.
This course describes and demonstrates this creative approach to constructing and drawing grid-based multivariate plots and figures using R. Lattice graphics are multi-variable plots (3, 4, 5 or more variables) that use conditioning and paneling. Consequently, they are a popular approach for, and a good fit for, visually presenting the results of multi-variable statistical model fitting. The appearance of most of the plots, graphs and figures is determined by panel functions rather than by the high-level graphics function calls themselves. Further, the user of lattice graphics has extensive control over many more details and features of the plots, far greater control than is afforded by the base graphics approach in R. The method is based on trellis graphics, which were popularized in the S language developed at Bell Labs.
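As a rough illustration of conditioning, paneling, and panel functions, here is a minimal sketch. It is not taken from the course scripts: the built-in `mtcars` data set, the `cars` copy, and the `panel_with_fit` helper are assumptions chosen for the example. It conditions a scatterplot of fuel economy against weight on cylinder count, so each cylinder group gets its own panel, and it uses a custom panel function to control what is drawn inside every panel.

```r
library(lattice)

# mtcars ships with R; convert cylinder count to a factor so it can
# serve as the conditioning variable (one panel per level).
cars <- transform(mtcars,
                  cyl = factor(cyl, labels = c("4 cyl", "6 cyl", "8 cyl")))

# Hypothetical custom panel function: draws the points, then adds a
# dashed least-squares line fitted to the data in that panel only.
panel_with_fit <- function(x, y, ...) {
  panel.xyplot(x, y, ...)
  panel.lmline(x, y, lty = 2)
}

# The formula reads "mpg against wt, conditioned on cyl".
p <- xyplot(mpg ~ wt | cyl, data = cars,
            panel = panel_with_fit,
            layout = c(3, 1),
            xlab = "Weight (1000 lbs)",
            ylab = "Miles per gallon")

# High-level lattice calls return a "trellis" object instead of
# drawing immediately; printing the object renders the plot.
class(p)
print(p)
```

Because `xyplot()` returns an object, the plot can be stored, modified with `update()`, and printed later. This is one practical way in which trellis objects differ from base graphics calls, which draw as a side effect.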
Who this course is for:
- Anyone who uses R, or who wants to use R, for any sort of multivariate data analysis would benefit from taking this course.
- The course is appropriate for students, scientists, or other quantitative-analysis professionals who want to display numerical information in plots and graphs.
- To take advantage of the course, students will need a basic (introductory) level of ability with R. However, all of the graphics R scripts are provided with the course materials.
Dr. Geoffrey Hubona has held full-time tenure-track and tenured faculty positions, as assistant and associate professor, at 4 major state universities in the United States since 1993. Currently, he is an associate professor of MIS at Texas A&M International University, where he teaches for-credit courses on Business Data Visualization (undergrad), Advanced Programming using R (graduate), and Data Mining and Business Analytics (graduate). In previous faculty positions, he taught dozens of statistics, business information systems, and computer science courses to undergraduate, master's and Ph.D. students. He earned a Ph.D. in Business Administration (Information Systems and Computer Science) from the University of South Florida (USF) in Tampa, FL; an MA in Economics, also from USF; an MBA in Finance from George Mason University in Fairfax, VA; and a BA in Psychology from the University of Virginia in Charlottesville, VA. He is the founder of the Georgia R School (2010-2014) and of R-Courseware (2014-Present), online educational organizations that teach research methods and quantitative analysis techniques. These techniques include linear and non-linear modeling, multivariate methods, data mining, programming and simulation, structural equation modeling, and partial least squares (PLS) path modeling.