Data Mining with R: Go from Beginner to Advanced!

Learn to use R software for data analysis and visualization, and to perform dozens of popular data mining techniques.
3.6 (128 ratings)
2,337 students enrolled
Price: $19 (regular price $60, 68% off)
  • Lectures: 80
  • Length: 12 hours
  • Skill Level: All Levels
  • Languages: English
  • Includes: lifetime access, 30-day money-back guarantee, availability on iOS and Android, and a certificate of completion

About This Course

Published 7/2015 English

Course Description

This is a hands-on business analytics (data analytics) course that teaches how to use the popular, no-cost R software to perform dozens of data mining tasks using real data and data mining cases. It teaches critical data analysis, data mining, and predictive analytics skills, including data exploration and data visualization, using one of the most popular business analytics software suites in industry and government today.

The course is structured as a series of dozens of demonstrations of classification and predictive data mining tasks: building and training classification and decision trees, using random forests, linear modeling (regression), generalized linear modeling, logistic regression, and many different cluster analysis techniques.

The course also teaches best practices for using R: how to install R and RStudio, the characteristics of the basic data types and structures in R, and how to input data into an R session from the keyboard, from user prompts, or by importing files stored on a computer's hard drive.

All software, slides, data, and R scripts used in the case-based demonstration video lessons are included in the course materials, so students can "take them home" and apply them to their own data analysis and data mining cases. Each course section also includes hands-on exercises to reinforce the learning process.

The target audience includes undergraduate and graduate students seeking to acquire employable data analytics skills, as well as practicing predictive analytics professionals seeking to expand their repertoire of data analysis and data mining knowledge and capabilities.
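
As a rough illustration of the linear modeling (regression) technique named in the description, the sketch below fits a one-predictor least-squares line. It is not taken from the course materials: the course works in R, where the built-in `lm()` function covers this task, and Python is used here only as a neutral illustration language. The function name `fit_simple_linear_regression` and the `hours`/`score` data are made up for the example.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data that lies exactly on the line y = 1 + 2x.
hours = [1.0, 2.0, 3.0, 4.0]
score = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_simple_linear_regression(hours, score)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

Because the sample data is noise-free, the fitted line recovers the generating intercept and slope; with real data the same formulas give the line that minimizes the sum of squared residuals.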

What are the requirements?

  • Download and install no-cost R software (complete, easy-to-follow instructions are provided).
  • Download and install no-cost RStudio IDE software (complete, easy-to-follow instructions are provided).

What am I going to get from this course?

  • Use R software for data import and export, data exploration and visualization, and for data analysis tasks, including performing a comprehensive set of data mining operations.
  • Effectively use a number of popular, contemporary data mining methods and techniques in demand by industry including: (1) Decision, classification and regression trees (CART); (2) Random forests; (3) Linear and logistic regression; and (4) Various cluster analysis techniques.
  • Apply the dozens of included "hands-on" cases and examples using real data and R scripts to new and unique data analysis and data mining problems.
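
To give a feel for one of the cluster analysis techniques listed above, here is a minimal sketch of k-means (Lloyd's algorithm). It is an illustration only, not the course's own code: the course uses R, whose base `kmeans()` function provides this method, and the Python function below, its caller-supplied starting centers, and the sample points are all assumptions made for the example.

```python
import math


def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate assignment and center updates until stable."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # leave an empty cluster's center in place
        if new_centers == centers:
            break  # centers stopped moving, so assignments are stable
        centers = new_centers
    return labels, centers


# Hypothetical 2-D data with two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels)
```

Starting centers are passed in explicitly so the result is reproducible; practical implementations usually pick them at random and repeat the run several times, keeping the best solution.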

What is the target audience?

  • Anyone who wants to learn more about performing data analysis using a variety of popular, contemporary data mining techniques.
  • Data mining beginners and professionals who wish to enhance their data mining knowledge and skill levels.
  • Individuals seeking to gain more proficiency using the popular R and RStudio software suites.
  • Undergraduate students seeking to acquire in-demand analytics skills to enhance employment opportunities.
  • Graduate students seeking to acquire a wider repertoire of analytics skills for research data analysis tasks.

Curriculum

Section 1: Data Types and Structures in R
  • Who should take this course and what will you get from it? (preview, 08:49)
  • Installing R and RStudio (04:03)
  • Orientation to Data Types and Structures Section (03:33)
  • Materials for Data Types and Structures (01:09)
  • Vectors: The Basic Default Data Structure in R (11:45)
  • Matrices, Lists and Dataframes: Other Important R Data Structures (10:25)
  • Manipulating Vectors in R (preview, 07:29)
  • Naming Vectors in R (06:35)
  • Creating Matrices in R (preview, 05:12)
  • Creating Lists in R (09:43)
  • Creating Lists in R (continued) (11:25)
  • Creating Dataframes in R (02:45)
Section 2: Data and File Input and Output
  • Orientation to Data and File Input and Output (preview, 01:34)
  • Materials for Data and File Input and Output (01:14)
  • Reading in Data using scan() Function (preview, 09:23)
  • Reading in Data with scan() Function (continued) (15:57)
  • Using readline() Function to Prompt User for Input (01:45)
  • Reading in Files with read.table() and read.csv() Functions (14:35)
  • Writing R Session Files to Disk (Outputting Data) (07:52)
  • Data Input and Output Exercise (02:23)
Section 3: Visualizing (Getting to Know) your Data
  • Solution to Data Input and Output Exercise from Section 2 (1 of 2) (10:33)
  • Solution to Data Input and Output Exercise from Section 2 (2 of 2) (12:11)
  • Materials for Visualizing your Data Section 3 (01:00)
  • Preprocessing and Visualizing Birth Data (preview, 09:23)
  • Preprocessing and Visualizing Birth Data (part 2) (preview, 14:51)
  • Preprocessing and Visualizing Birth Data (part 3) (14:54)
  • Visualizing Alumni Donations (12:38)
  • Visualizing Alumni Donations (part 2) (08:22)
  • Visualizing Alumni Donations (part 3) (12:09)
  • Visualizing Alumni Donations (part 4) (07:03)
  • Visualizing (Getting to Know) your Data Section Exercise (01:59)
Section 4: Decision Trees and Random Forests
  • Solution to Visualizing Virginia Deaths Exercise (15:53)
  • Introduction to Decision Trees and Random Forests (preview, 07:18)
  • Training Decision Trees with party Package (09:10)
  • Training Decision Trees with party Package (part 2) (12:32)
  • Bodyfat Decision Tree example with Package rpart (13:01)
  • Bodyfat Decision Tree example with Package rpart (part 2) (08:25)
  • Bagging and Random Forests with Section Exercise (14:36)
Section 5: Linear Modeling (Regression) and Generalized Linear Modeling (GLMs)
  • Begin Decision Tree and Random Forests Exercise Solution (09:15)
  • Random Forests Exercise Bagging Segment Solution (09:19)
  • Random Forests Exercise Solution (part 3) (12:36)
  • Materials for Regression and GLMs Section (01:05)
  • Begin Regression Example (preview, 10:38)
  • Continue Regression Example (10:48)
  • Finish Regression Example (06:47)
  • Begin Regression and GLM Slides (08:19)
  • Finish Generalized Linear Modeling Slides (preview, 08:13)
  • Heart Data Binomial GLM Example (preview, 12:59)
  • Epidemic Data Poisson GLM Example (04:38)
  • Regression and GLMs Exercises (01:08)
Section 6: K-Means, K-Medoids, and Hierarchical Cluster Analysis Approaches
  • Materials and End-of-Section-6 Exercise (01:28)
  • Regression and GLM Exercises Solutions (part 1) (10:49)
  • Regression and GLM Exercises Solutions (part 2) (11:27)
  • Regression and GLM Exercises Solutions (part 3) (10:04)
  • K-Means Iris Flower Example (preview, 12:44)
  • K-Means Exoplanets Example (20:04)
  • K-Medoids Iris Flower Re-Analysis Example (08:02)
  • Hierarchical Clustering Iris Flower Example (07:18)
  • Hierarchical Clustering Pottery Example (11:16)
Section 7: Density-Based and Agglomerative Hierarchical Clustering
  • Materials for Density-Based and Hierarchical Agglomerative Clustering Section (01:35)
  • Density-Based and Agglomerative Clustering Introduction and Previous Exercise (preview, 11:48)
  • Density-Based Clustering Example (13:04)
  • Body Measurements and Agglomerative Hierarchical Clustering Example (13:23)
  • Continue Body Measurements Agglomerative Clustering Example (16:41)
  • Clustering Jet Fighters Example (16:52)
Section 8: More Cluster Analysis Examples, Graphics, and Detecting Outliers
  • Materials and End-of-Section-8 Exercise (01:08)
  • K-Means Clustering Explained in Detail (preview, 06:25)
  • Clustering Crime Rates Example (10:09)
  • Clustering Crime Rates Example (part 2) (13:26)
  • Gastroenterologist Questionnaire Model-Based Clustering Example (14:23)
  • Graphical Approaches to Cluster Analysis Examples (preview, 09:17)
  • Detecting Outliers (preview, 09:09)
  • Detecting Outliers (part 2) (11:34)
Section 9: K-Means TAM Residuals Cluster Analysis Software Case Example
  • Crime Data Exercise Solution (08:00)
  • Crime Data Exercise Solution (part 2) (11:03)
  • Materials for Final Data Mining Course Section (00:59)
  • K-Means Clustering PLS-POS Capability Implementation (07:32)
  • K-Means Clustering PLS-POS Capability Implementation Concepts (09:53)
  • Implementing K-Means Clustering for TAM Residuals Continued (04:22)
  • Implementing K-Means Clustering for TAM Residuals in R Software (10:54)

Instructor Biography

Geoffrey Hubona, Ph.D., Professor of Information Systems

Dr. Geoffrey Hubona held full-time tenure-track and tenured faculty positions as an assistant and associate professor at three major state universities in the Eastern United States from 1993 to 2010. In these positions, he taught dozens of statistics, business information systems, and computer science courses to undergraduate, master's, and Ph.D. students.

He earned a Ph.D. in Business Administration (Information Systems and Computer Science) from the University of South Florida (USF) in Tampa, FL (1993); an MA in Economics (1990), also from USF; an MBA in Finance (1979) from George Mason University in Fairfax, VA; and a BA in Psychology (1972) from the University of Virginia in Charlottesville, VA. He was a full-time assistant professor at the University of Maryland Baltimore County (1993-1996) in Catonsville, MD; a tenured associate professor in the department of Information Systems in the Business College at Virginia Commonwealth University (1996-2001) in Richmond, VA; and an associate professor in the CIS department of the Robinson College of Business at Georgia State University (2001-2010).

He is the founder of the Georgia R School (2010-2014) and of R-Courseware (2014-Present), online educational organizations that teach research methods and quantitative analysis techniques. These techniques include linear and non-linear modeling, multivariate methods, data mining, programming and simulation, and structural equation modeling and partial least squares (PLS) path modeling. Dr. Hubona is an expert in the analytical, open-source R software suite and in various PLS path modeling software packages, including SmartPLS. He has published dozens of research articles that explain and use these techniques for the analysis of data and, with software co-development partner Dean Lim, has created a popular cloud-based PLS software application, PLS-GUI.
