If you are looking for that one course that includes everything about data visualization with R, this is it. Let’s get on this data visualization journey together.
This course is a blend of text, videos, code examples, and assessments, which together make your learning journey all the more exciting and truly rewarding. It includes sections that form a sequential flow of concepts covering a focused learning path presented in a modular manner. This helps you learn a range of topics at your own speed and also move towards your goal of learning data visualization with R.
The R language is a powerful open source functional programming language. R is becoming the go-to tool for data scientists and analysts. Its growing popularity is due to its open source nature and extensive development community. R is increasingly being used by experienced data science professionals instead of Python and it will remain the top choice for data scientists in 2017. Large companies continue to use R for their data science needs and this course will make you ready for when these opportunities come your way.
This course has been prepared using extensive research and curation skills. Each section adds to the skills learned and helps us to achieve mastery of data visualization. Every section is modular and can be used as a standalone resource. This course covers different visualization techniques in R and assorted R graphs, plots, maps, and reports. It is a practical and interactive way to learn about R graphics, all of which are discussed in an easy-to-grasp manner. This course has been designed to include topics on every possible data visualization requirement from a data scientist and it does so in a step-by-step and practical manner.
We will start by focusing on “ggplot2” and show you how to create advanced figures for data exploration. Then, we will move on to customizing the plots and then cover interactive plots. We will then cover time series plots, heat maps, and dendrograms. Following that, we will look at maps and how to make them interactive. We will then turn our attention to building an interactive report using the “ggvis” package and publishing reports and plots using Shiny. Finally, we will cover data in higher dimensions, which will complete our extensive tour of the data visualization capabilities possible using R.
This course has been authored by some of the best in their fields:
Dr. Fabio Veronesi
In his career, Dr. Veronesi has worked on several topics related to environmental research: digital soil mapping, cartography and shaded relief, renewable energy, and transmission line siting. During this time, Dr. Veronesi has specialized in the application of spatial statistical techniques to environmental data.
Atmajitsinh Gohil
Atmajitsinh Gohil works as a senior consultant at a consultancy firm in New York City. He writes about data manipulation, data exploration, visualization, and basic R plotting functions on his blog. He has a master's degree in financial economics from the State University of New York (SUNY), Buffalo.
Yu-Wei, Chiu (David Chiu)
Yu-Wei, Chiu (David Chiu) is the founder of LargitData, a start-up company that mainly focuses on providing Big Data and machine learning products. In addition to being a start-up entrepreneur and data scientist, he specializes in using Spark and Hadoop to process big data and apply data mining techniques for data analysis. Yu-Wei is also a professional lecturer and has delivered lectures on big data and machine learning in R and Python, and given tech talks at a variety of conferences.
Creating professional-looking plots, both static and interactive, may seem hard; however, with R, we can create fully customizable plots with a few lines of code. In this lecture, we will look at the potential applications of R for visualizing data in static and interactive plots.
It is not always easy to import data in R using the default settings. To do it successfully, several parameters need to be set. In this lecture, we will learn how to set the working directory. We will understand the important settings of the read.table function. Finally, we will import the data and check its structure.
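As a rough sketch of the import step, the snippet below writes a tiny tab-delimited file to a temporary path and reads it back with read.table. The file contents and column names are invented for illustration, and the commented setwd() path is a placeholder rather than a real location.

```r
# Hypothetical sample data written to a temp file so the example is self-contained
tmp <- tempfile(fileext = ".txt")
writeLines(c("station\ttemp\tdate",
             "A\t12.5\t2016-01-01",
             "B\tNA\t2016-01-02"), tmp)

# setwd("path/to/your/data")  # placeholder: would set the working directory

# Settings that most often need attention: header row, field separator,
# missing-value marker, and whether strings become factors
dat <- read.table(tmp, header = TRUE, sep = "\t",
                  na.strings = "NA", stringsAsFactors = FALSE)

str(dat)  # check the structure of the imported data
```

Setting stringsAsFactors explicitly keeps the behavior the same across R versions, since the default changed in R 4.0.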
Importing Excel tables in R is sometimes tricky. However, with the right knowledge, the proper package can be installed and everything should work out fine. In this lecture, we will see how to install the xlsx package. We will understand the format of the code to import Excel files and finally, we'll import and check the data.
Exporting data in R may seem difficult, since we have many options to choose from. However, R has powerful exporting functions that, with a few options, can do the job successfully. In this lecture, we will learn to subset our data so that we have something to export. Then, we will learn how to export data in R. Finally, we will see how to export data into multiple Excel sheets.
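A minimal sketch of the subset-then-export workflow, using the built-in mtcars dataset as a stand-in for the course data and plain CSV rather than Excel (the multi-sheet Excel export is not shown here):

```r
# Subset: keep four-cylinder cars and three columns of interest
sub <- subset(mtcars, cyl == 4, select = c(mpg, hp, wt))

# Export to a temporary CSV file; row names (car models) are written as the first column
out <- tempfile(fileext = ".csv")
write.csv(sub, out, row.names = TRUE)

# Read the file back to confirm the export round-trips
back <- read.csv(out, row.names = 1)
head(back)
```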
In this lecture, we will demonstrate how to use The Grammar of Graphics to construct our very first ggplot2 chart with the superstore sales dataset.
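To give a feel for the Grammar of Graphics, here is a small sketch of a first ggplot2 chart. The superstore sales dataset is not reproduced here, so the built-in mtcars data is used as an assumed substitute.

```r
library(ggplot2)

# A plot is built from data, an aesthetic mapping, and at least one geometric object
p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point()

print(p)  # render the chart
```

Each further element (scales, facets, themes) is added to the same object with `+`.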
Aesthetic mapping describes how data variables are mapped to the visual properties of a plot. In this lecture, we will discuss how to modify aesthetic mappings on geometric objects.
Geometric objects are elements that we mark on the plot. One can use the geometric objects in ggplot2 to create either a line, bar, or box chart. Moreover, one can integrate these simple geometric objects and aesthetic mapping to create a more professional plot. In this lecture, we will introduce how to use geometric objects to create various charts.
Besides mapping particular variables to either the x or y axis, one can first perform statistical transformations on variables, and then remap the transformed variable to a specific position. In this lecture, we will introduce how to perform variable transformations with
Besides setting the aesthetic mapping for each plot or geometric object, one can use scale to control how variables are mapped to the visual property. In this lecture, we will introduce how to adjust the scale of aesthetics in
When performing data exploration, it is essential to compare data across different groups. Faceting is a technique that enables the user to create graphs for subsets of data. In this lecture, we will demonstrate how to use the facet function to create a chart for multiple subsets of data.
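A brief sketch of faceting, assuming ggplot2's facet_wrap() is the facet function meant here and using mtcars as example data:

```r
library(ggplot2)

# One scatterplot panel per number of cylinders
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ cyl)

print(p)
```

facet_grid() is an alternative when the subsets are defined by two variables, one for rows and one for columns.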
To create an overview of a dataset, we may need to combine separate individual plots into one. In this lecture, we will introduce how to combine individual subplots into one plot.
Producing elegant plots in ggplot2 may seem difficult but it is actually quite easy to do. In fact, ggplot2 takes care, by default, of much of the graphical design of the plot, meaning that we can produce beautiful histograms with just a few lines of code. In this lecture, we'll load ggplot2 and then import the dataset. We'll plot a simple histogram, using the default settings. Finally, we'll plot multiple distributions with faceting.
Histograms are useful for certain tasks, but for comparing several variables at once, they are not the best. Box plots can be used instead, since they allow the comparison of the distribution of multiple variables side by side. In this lecture, we'll see what a box plot is and what it represents. We'll create multiple box plots with just two lines of code and then we'll order the plot to achieve better results.
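The following sketch shows side-by-side box plots ordered by group median. It uses the built-in iris data as an assumed example, and reorder() is one possible way to do the ordering, not necessarily the one used in the lecture.

```r
library(ggplot2)

# reorder() sorts the Species levels by their median Sepal.Width,
# so the boxes appear from lowest to highest median
p <- ggplot(iris, aes(x = reorder(Species, Sepal.Width, FUN = median),
                      y = Sepal.Width)) +
  geom_boxplot() +
  labs(x = "Species")

print(p)
```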
Categorical variables are invariably difficult to visualize in meaningful ways. Bar charts are important for plotting categorical variables and defining their characteristics. In this lecture, we'll talk about bar charts and create simple bar charts in ggplot2. We'll also learn how to automatically order a data.frame and plot ordered bar charts.
In many cases, we are interested in comparing multiple variables at once and checking their correlation. Scatterplots allow us to do just that and are an important tool in a data analyst's toolbox. In this lecture, we'll understand the importance of scatterplots and create simple scatterplots in ggplot2. We'll then create more complex visualizations by tweaking some basic options.
In many cases, the importance of time as a variable is underestimated. However, time series are extremely useful to determine the temporal pattern of a variable. In this lecture, we'll understand the structure of time-series plots. We'll then plot a simple time-series plot in ggplot2 and customize the plots with color and size.
Many datasets are affected by uncertainty, and people do not always know how to show this in plots. This lecture will present ways to take uncertainty into account. We'll see how to handle uncertainty and present simple ways to include it in bar charts. Finally, we'll look at scatterplots with double error bars.
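One simple way to show uncertainty on a bar chart is to add error bars. The sketch below assumes that the uncertainty is expressed as one standard deviation around the group mean; this is an illustrative choice, and mtcars stands in for the course data.

```r
library(ggplot2)

# Summarize mpg per cylinder group: mean and standard deviation
means <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
sds   <- aggregate(mpg ~ cyl, data = mtcars, FUN = sd)
summ  <- data.frame(cyl = factor(means$cyl), mean = means$mpg, sd = sds$mpg)

# Bars show the mean; error bars span mean - sd to mean + sd
p <- ggplot(summ, aes(x = cyl, y = mean)) +
  geom_col(fill = "grey60") +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd), width = 0.2)

print(p)
```

Standard errors or confidence intervals can be substituted by changing how the ymin and ymax values are computed.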
ggplot2 creates plots with a grayish background and white gridlines, but without axis lines. This is not the standard look you normally find in scientific manuscripts. In this lecture, we'll understand the graphical elements of the standard theme. We'll then look at changing the default theme and explore the differences between the default theme and other themes.
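As a quick sketch of theme switching, the same plot can be drawn with the default theme and with two of the alternative themes that ship with ggplot2. The choice of theme_bw() and theme_classic() is ours, for illustration only.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

p_default <- base                    # gray background, white gridlines
p_bw      <- base + theme_bw()       # white background, gray gridlines, panel border
p_classic <- base + theme_classic()  # white background, axis lines, no gridlines

print(p_classic)
```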
The default color scale is not always appropriate to spot all the differences in the data we are trying to plot. In many cases, we have to change it so that our plots can become more informative. In this lecture, we'll see how to change the default two colors for plotting continuous variables. Additionally, we'll explore ways to include more colors in the color scale and present a discrete color scale for categorical variables.
ggplot2 uses the names of the columns as labels, meaning that if these are not self-explanatory, the plot will not provide a good framework to understand its meaning. By adding some lines of code, we can customize the plot in order to change the labels and make the plot more informative. In this lecture, we'll see how to add a title for the plot, change the title of the legend, and change the axes labels.
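A small sketch of label customization with labs(), again using mtcars as assumed example data. The label text is invented for illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(title  = "Fuel efficiency versus weight",  # plot title
       x      = "Weight (1000 lbs)",              # x-axis label
       y      = "Miles per gallon",               # y-axis label
       colour = "Cylinders")                      # legend title

print(p)
```

The legend title is set through the name of the aesthetic that produced the legend, here colour.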
The default plots created by ggplot2 lack several elements that are, in many cases, useful to provide additional information to your audience. However, there are simple functions that can be used to add supplementary elements to the plot. In this lecture, we'll see how to add trend lines to scatterplots. We'll also learn how to add vertical and horizontal lines to plots and how to customize the lines.
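The sketch below adds a linear trend line and two reference lines to a scatterplot. The linear model and the use of the variable means as reference positions are assumptions made for the example.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Linear trend line with its confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  # Horizontal and vertical reference lines at the variable means
  geom_hline(yintercept = mean(mtcars$mpg), linetype = "dashed", colour = "red") +
  geom_vline(xintercept = mean(mtcars$wt), linetype = "dotted", colour = "blue")

print(p)
```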
In many cases, it is crucial to be able to include textual labels on plots to provide your audience with additional information. This can be done in ggplot2 in both static and dynamic ways. In this lecture, we'll see how to add fixed text labels and dynamic text labels. We'll finish the lecture by learning how to add text outside the plot and change the axis labels.
With the facet_wrap function, it is only possible to create a grid of plots of the same type. However, in some cases, it is necessary to create side-by-side graphs with diverse plots. This can be done using the gridExtra package. In this lecture, we'll review the facet_wrap function. We'll then see how to install the gridExtra package and create multi-plots.
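A minimal sketch of a multi-plot that places two different plot types side by side, assuming the gridExtra package is already installed (install.packages("gridExtra") otherwise). The two plots are arbitrary examples.

```r
library(ggplot2)
library(gridExtra)

# Two plots of different types, which facet_wrap() alone cannot combine
p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(x = mpg)) + geom_histogram(bins = 10)

# arrangeGrob() builds the layout; grid.draw() renders it
combined <- arrangeGrob(p1, p2, ncol = 2)
grid::grid.newpage()
grid::grid.draw(combined)
```

grid.arrange(p1, p2, ncol = 2) does the same in one call when the layout does not need to be stored.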
Static plots are the standard for publishing in traditional media, such as journal papers. However, the world is moving towards an Internet-based presentation of results and even scientific journals are quickly adapting to it. Many now offer the possibility of including interactive plots. In R, we can create plots for the Web with the rCharts package, which is a bit more difficult to install than ggplot2. In this lecture, we'll discuss the rCharts package. We'll install devtools first and then install rCharts from GitHub.
rCharts features a syntax more similar to standard plotting in R than what we saw with ggplot2. However, it is easy to pick it up by understanding simple examples and then including additional details. In this lecture, we'll explain the syntax used in
Even if we know nothing about HTML and CSS, we can still obtain beautiful bar charts using templates created by other users. In this lecture, we'll see how to plot basic interactive bar charts and then add axis labels to them. We'll then use a template for an elegant finish.
If too many data points are present in our dataset, scatterplot visualization may become very confusing in static plots. However, in interactive plots, this limitation no longer applies, since we can select to visualize only a part of the dataset. In this lecture, we'll create basic interactive scatterplots. We'll understand the interactivity and then add elements and controls.
Time-series plots are a great way to visualize the temporal pattern of a variable. However, sometimes, we cannot fully understand the exact date of each point based only on the values on the x axis. Interactive visualization can solve this problem by adding tooltips in which we can take a look at the raw data. In this lecture, we'll see how to set the data in the correct format. We'll then plot a basic time-series plot and then add interactive elements to it.
Being able to plot spatial data on web maps is a crucial skill to have, but it can be difficult since it requires knowledge of different technologies. R makes this process easy with dedicated functions that allow us to plot on web GIS services. In this lecture, we will understand what web mapping is. We will cover the available mapping platforms and the required packages.
Plotting data with the plotGoogleMaps function is not as easy as using the plot function. However, with a simple step-by-step guide, we can achieve good command of the function, enabling us to plot whatever data we choose. In this lecture, we'll first install plotGoogleMaps and then create our first map. We'll then look at customizing the plotting window.
An interactive map with just one layer is hardly useful for our purpose. Many times, we are faced with the challenge of plotting several datasets at once. This requires some additional work and understanding, but it is definitely not hard in R. In this lecture, we'll understand the layer system. Then, we'll add layers with the right options and then check the results.
Plotting raster data on Google Maps can be tricky. The plotGoogleMaps function does not handle rasters very well and if not done correctly, the visualization will fail. This lecture will show users how to plot rasters successfully. In this lecture, we'll first download the seismic risk map. We'll then understand the limitations of plotting rasters on Google Maps and then see how to plot rasters successfully.
Plotting on Google Maps is easy, but Google Maps is a commercial product. If we want to use it on a commercial website, we would need to pay for it. OpenStreetMap is free to use, so knowing how to use it is certainly an advantage. In this lecture, we'll first install LeafletR. We'll then see how LeafletR works with GeoJSON data. Finally, we'll plot and customize our map.
Using open data for our analysis requires a deep knowledge of the data provider and the actual data we are using. Without this knowledge, we may end up with erroneous results. In this lecture, we'll try to understand the World Bank data. We'll see what data is available from them and then look at the R package to download this data.
Downloading data from the World Bank can be difficult since it requires users to know the acronyms used to refer to the different datasets. However, with some help, this process becomes very easy. In this lecture, we'll understand the import process. We'll search for the correct indicator and then download the data.
To create a spatial map of the World Bank data, we just have to download the data and then transform it into spatial data. However, in the dataset, there are no coordinates or any other information that will help us do that. The solution is to use the geocoding information from another dataset for this purpose. In this lecture, we'll use the Natural Earth data to transform our data into spatial objects. We'll understand the transformation process and then plot the results on a map.
Using the World Bank data just to plot a static spatial map is very limiting. There is much more that you can do with this data, and this lecture provides some guidance on these additional avenues of research. In this lecture, we'll see how to download more than one dataset. We'll then see how to perform correlation analysis and end this lecture with how to make our map interactive.
RStudio has an R Markdown workflow built in; we can use its GUI to create markdown reports in HTML, PDF, slides, or the Microsoft Word format. In this lecture, we will introduce how to build an R Markdown report with RStudio.
In an R Markdown report, one can embed R code chunks into the report with the knitr syntax. In this lecture, we will introduce how to create and control the output with different code chunk configurations.
You need to have created and opened a new R Markdown (.rmd) file in RStudio to proceed with the remaining lectures.
In order to interact with the reports' figures, one can create interactive graphics with ggvis. In this lecture, we will demonstrate how to build our first interactive plot from the real estate dataset.
The ggvis package uses similar grammar and syntax to ggplot2, and we can use this basic syntax to create figures. In this lecture, we will cover how to use the ggvis syntax and grammar to build advanced plots.
Besides making different plots with various layer types, one can control the axes and legends of a ggvis plot. We can also rescale the mapping of the data and determine how it should be displayed on the plot with the scale function. In this lecture, we will demonstrate how to set the appearance properties of both axes and legends and how to scale data in
One of the most attractive features of ggvis is that it can be used to create an interactive web form. This allows the user to subset the data, or even change the visual properties of the plot, through interacting with the web form. In this lecture, we will introduce how to add interactivity to a
Packt has been committed to developer learning since 2004. A lot has changed in software since then, but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live, and how to put them to work.
With an extensive library of content (more than 4000 books and video courses), Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.
From skills that will help you to develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.
Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.