Data Wrangling and Exploratory Data Analysis with R
What you'll learn
- How to import data.
- How to clean and tidy data.
- How to manipulate data.
- How to visualize data.
Requirements
- A basic understanding of R concepts such as variables and functions.
Description
The tidyverse is a collection of R packages designed for Data Science.
The scope of this course is restricted to:
- ggplot2 package
- dplyr package
- tidyr package
- readr package
- readxl package
- tibble package
This course teaches you how to use these packages for data analysis across 7 sections, as follows.
Section 1: Introduction
In this section, you will learn what the course is about and get a glimpse of some of the tools we will be using throughout the course.
Section 2: Data visualization with ggplot2
In this section, you will learn how to use the ggplot2 package for data visualization, using the diamonds dataset as a case study.
This section will cover major data visualizations such as:
- Bar plots
- Box plots
- Scatter plots
- Line plots
- Histograms
At the end of this section, you should be able to plot various visualizations and give meaningful interpretations of them.
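To give a feel for the plot types listed above, here is a minimal sketch using the diamonds dataset that ships with ggplot2. The variable choices (cut, price, carat), the bin width and the object names are assumptions made for illustration, not necessarily the ones used in the course.

```r
library(ggplot2)  # the diamonds dataset ships with ggplot2

# Bar plot: number of diamonds in each cut category
p_bar <- ggplot(diamonds, aes(x = cut)) +
  geom_bar()

# Box plot: price distribution within each cut
p_box <- ggplot(diamonds, aes(x = cut, y = price)) +
  geom_boxplot()

# Scatter plot: price against carat, semi-transparent to reduce overplotting
p_scatter <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.2)

# Histogram: distribution of price in bins of width 500
p_hist <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(binwidth = 500)

# Line plot: mean price per cut (summary computed with base R)
avg_price <- aggregate(price ~ cut, data = diamonds, FUN = mean)
p_line <- ggplot(avg_price, aes(x = cut, y = price, group = 1)) +
  geom_line()
```

Each object holds a plot specification; calling print(p_bar), or typing p_bar at the console, draws it.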
Section 3: Data manipulation with dplyr
In this section, you will learn about the dplyr package and how to manipulate data with its functions, using the 2013 New York flights dataset.
At the end of this section, you should be able to perform tasks on the dataset such as:
- Filtering
- Arranging
- Renaming
- Variable creation
- Selection
- Table/dataset joining
The Practical Quiz at the end of this section will test your understanding of the concepts covered in the section.
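The tasks listed for this section map onto a handful of dplyr verbs. The sketch below chains them on two small made-up tables; the data, the object names (flights_toy, airlines_toy) and the column names are hypothetical stand-ins loosely modelled on a flights table, not the course's actual dataset.

```r
library(dplyr)

# Made-up toy data, for illustration only
flights_toy <- data.frame(
  carrier   = c("AA", "UA", "AA", "DL"),
  month     = c(1, 1, 2, 1),
  dep_delay = c(5, 30, -2, 12),    # minutes
  distance  = c(1000, 2400, 700, 760),  # miles
  air_time  = c(150, 330, 110, 120),    # minutes
  stringsAsFactors = FALSE
)
airlines_toy <- data.frame(
  carrier = c("AA", "UA", "DL"),
  name    = c("American", "United", "Delta"),
  stringsAsFactors = FALSE
)

result <- flights_toy %>%
  filter(month == 1) %>%                                # filtering rows
  select(carrier, dep_delay, distance, air_time) %>%    # selecting columns
  rename(departure_delay = dep_delay) %>%               # renaming a column
  mutate(speed_mph = distance / air_time * 60) %>%      # creating a variable
  arrange(desc(departure_delay)) %>%                    # arranging rows
  left_join(airlines_toy, by = "carrier")               # joining tables

print(result)
```

After the pipeline, result keeps only the January rows, sorted from the longest to the shortest departure delay, with the airline name attached from the second table.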
Section 4: Data tidying with tidyr
This section shows you how to tidy a messy dataset when you come across one.
You are going to learn how to make datasets longer or wider.
You will also learn how to separate and unite columns.
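As a rough illustration of these operations, the sketch below reshapes a small made-up table with tidyr. The data and all object and column names are hypothetical. It uses separate(), which recent tidyr versions mark as superseded by separate_wider_delim() but which still works.

```r
library(tidyr)

# Made-up wide table: one column per subject/year combination
scores_wide <- data.frame(
  student   = c("Ada", "Ben"),
  math_2022 = c(90, 75),
  math_2023 = c(85, 80),
  stringsAsFactors = FALSE
)

# Longer: one row per student and subject/year combination
scores_long <- pivot_longer(
  scores_wide,
  cols      = c(math_2022, math_2023),
  names_to  = "subject_year",
  values_to = "score"
)

# Separate: split "math_2022" into subject = "math" and year = "2022"
scores_sep <- separate(scores_long, subject_year,
                       into = c("subject", "year"), sep = "_")

# Unite: glue the two columns back into one
scores_united <- unite(scores_sep, "subject_year", subject, year, sep = "_")

# Wider: back to one column per subject/year combination
scores_back <- pivot_wider(scores_united,
                           names_from = subject_year, values_from = score)
```

The final table, scores_back, has the same columns and values as scores_wide, which shows that the longer/wider and separate/unite steps undo each other.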
Section 5: Importing data
In this section, you will learn about tibbles, a modernized version of the data frame.
This section will also show you how to import structured data formats such as CSV and XLSX files into R.
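A minimal sketch of these ideas, assuming the tibble, readr and readxl packages are installed. The table contents and object names are made up; the CSV is written to a temporary file so the example is self-contained, and the XLSX file is the example workbook bundled with readxl rather than course material.

```r
library(tibble)
library(readr)
library(readxl)

# A tibble is created much like a data frame
people <- tibble(name = c("Ada", "Ben"), age = c(36, 41))

# CSV: write the tibble to a temporary file, then read it back with readr
csv_path <- tempfile(fileext = ".csv")
write_csv(people, csv_path)
people_csv <- read_csv(csv_path, show_col_types = FALSE)

# XLSX: read the first sheet of an example workbook that ships with readxl
xlsx_path   <- readxl_example("datasets.xlsx")
first_sheet <- read_excel(xlsx_path)
```

Both read_csv() and read_excel() return tibbles, so imported data can go straight into the dplyr and tidyr functions from the earlier sections.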
Section 6: Case Study: Adventure Works Database
In this section, you will combine the concepts you have learnt in this course and apply them to the Adventure Works database, and you will see what a data analyst's workflow looks like.
Section 7: EXAM
This section consists of 20 multiple-choice questions, which you are expected to answer to get your final course certificate. It covers all the material in this course.
Who this course is for:
- Beginner and Experienced Data Analysts who want to use R Programming for data analysis
- Data Scientists and Machine Learning Engineers who want to learn how to use R for exploratory data analysis
Instructor
In an era of vast amounts of data, there is a need for professionals who can manage and make sense of it, whether by describing, modelling or identifying patterns in data.
Having worked as a freelance Data Scientist for over 5 years, I aim to train you to become one of the best data professionals out there, whether as a Data Analyst, Data Scientist or Machine Learning Engineer.
I graduated with a BSc in Statistics and currently work as a Biostatistician at a public hospital, assisting medical researchers in analysing health data. As a technical writer, I have written various articles that have thousands of reads on Medium.
I am proficient in Mathematics, Statistics and Programming (R and Python) and have a strong domain knowledge of the health sector.
As a data scientist, my day-to-day activities range from performing exploratory data analysis to building machine learning or deep learning models.
My greatest happiness comes when students benefit from my courses, and I hope that as you read this, you have either enrolled or are about to enroll in one of my courses.
I wish you all the best in your path to becoming a world-class data professional!