# The Comprehensive Statistics and Data Science with R Course

## What you'll learn

- Students will understand what R is, and how to read data files into, and write them out of, their R sessions.
- Students will know how to manipulate numbers and vectors, and will understand objects and classes.
- Students will understand how to create data structures in R: vectors; arrays and matrices; lists and data frames.
- Students will know how to use R as a statistical environment by working through many examples.
- Students will understand how to create, estimate and interpret ANOVA, regression, GLM and GAM statistical models with many examples of each.
- Students will learn how to create statistical and other visualizations using both the base and ggplot graphics capabilities in R.

## Course content

- Agenda and What is R? (slides, part 1) (08:35)
- What is R? (slides, part 2) (06:59)
- What is R? (slides, part 3) (06:54)
- What is R? (slides, part 4) (06:13)
- What is R? (slides, part 5) (06:45)
- Reading in Data (part 3) (09:03)

## Requirements

- Students must install R and RStudio (both free software); ample installation instructions are provided.

## Description

This course, **The Comprehensive Statistics and Data Science with R Course**, is mostly based on the authoritative documentation in the online "An Introduction to R" manual, which is maintained by the R Core Team and distributed with each new R release through the Comprehensive R Archive Network (CRAN). These are the people who actually write, test, produce and release the R code to the general public by way of the CRAN mirrors. It is a rich and detailed 10-session course which covers much of the content in the contemporary 105-page manual. The ten sessions follow the outline in the **An Introduction to R** online manual and specifically address the following user topics:

1. Introduction to R; Inputting data into R

2. Simple manipulation of numbers and vectors

3. Objects, their modes and attributes

4. Arrays and matrices

5. Lists and data frames

6. Writing user-defined functions

7. Working with R as a statistical environment

8. Statistical models and formulae; ANOVA and regression

9. GLMs and GAMs

10. Creating statistical and other visualizations with R

It is a comprehensive and decidedly "hands-on" course. You are taught how to actually use R and R scripts to create everything that you see on-screen in the course videos. Everything is included with the downloadable course materials: all software; slides; R scripts; data sets; exercises and solutions; in fact, everything that you see used in any of the 200+ course videos.

The course is structured for both the novice R user and the more experienced R user who seeks a refresher course in the benefits, tools and capabilities that exist in R as a software suite appropriate for statistical analysis and data manipulation. The first half of the course is suited for novice R users and guides one through "hands-on" practice to master the input and output of data, as well as all of the major objects and data structures that are used within the R environment. The second half of the course is a detailed "hands-on" guide to using R for statistical analysis, including detailed data-driven examples of ANOVA, regression, and generalized linear and additive models. Finally, the course concludes with a multitude of "hands-on" instructional videos on how to create elegant and elaborate statistical (and other) visualizations using both the base and ggplot graphics capabilities in R.

The course is very useful for any quantitative analysis professional who wishes to "come up to speed" on the use of R quickly. It would also be useful for any graduate student or college or university faculty member who seeks to master these data analysis skills using the popular R environment.

## Who this course is for:

- This course will benefit anyone wishing to learn R and especially those who seek an in-depth "hands-on" tutorial on performing statistical analyses with R.
- The course is useful for graduate students, college and university faculty, and working quantitative analysis professionals.

## Instructor

Dr. Geoffrey Hubona has held full-time tenure-track and tenured faculty positions, as assistant and associate professor, at 4 major state universities in the United States since 1993. Currently, he is an associate professor of MIS at Texas A&M International University where he teaches for-credit courses on Business Data Visualization (undergrad), Advanced Programming using R (graduate), and Data Mining and Business Analytics (graduate). In previous academic faculty positions, he taught dozens of statistics, business information systems, and computer science courses to undergraduate, master's and Ph.D. students. He earned a Ph.D. in Business Administration (Information Systems and Computer Science) from the University of South Florida (USF) in Tampa, FL; an MA in Economics, also from USF; an MBA in Finance from George Mason University in Fairfax, VA; and a BA in Psychology from the University of Virginia in Charlottesville, VA. He is the founder of the Georgia R School (2010-2014) and of R-Courseware (2014-Present), online educational organizations that teach research methods and quantitative analysis techniques. These techniques include linear and non-linear modeling, multivariate methods, data mining, programming and simulation, and structural equation modeling and partial least squares (PLS) path modeling.