
Explore the course structure from basics to machine learning, covering data types, import/export, data manipulation, visualization, and deep learning across supervised, unsupervised, and reinforcement learning.
Install and configure R and RStudio for a local data science setup. R is open source and cross-platform, offers graphics, and has over ten thousand packages; set up a dedicated library and enable a dark theme.
Visit the home page and open the material section to access course files. Download the static zip or clone the GitHub repository, then extract or work with the local files.
Set up an RStudio project, load and organize files with clear naming, and use the four windows—coding, console, environment, and history—to run code and manage packages.
Explore file formats for R workflows, from scripts and notebooks to Markdown, HTML, and PDF, with interactive graphs, LaTeX documents, and speedups using C++.
Learn to build reproducible reports with R Markdown by combining code chunks and text in chapters with a table of contents; render interactive HTML documents with plots and interactive tables.
Explore the six basic data types in R: numeric, integer, complex, logical, character, and raw, and how coercion governs flexible representations like vectors, matrices, data frames, and arrays.
Explore basic data types in R, build and inspect vectors of numeric, integer, boolean, and character values, and practice type coercion and casting with as.numeric and as.integer.
Explore matrices and arrays in R by creating vectors, converting to matrices with specified rows and columns, and building multi-dimensional arrays with adjustable dimensions, including byrow options.
Learn to create and explore lists in R by combining vectors of different types, naming elements, and accessing items with brackets and the $ operator.
Create and manipulate factors in R by defining fixed levels and enforcing valid values, and learn to handle data import issues when comma decimals affect factor and numeric conversions.
Learn to create and manipulate data frames in R, comparing base data.frame and tibble options, access columns, and delete columns to manage tabular data.
Learn string handling in R using a 1984 text template to clean, tokenize, remove numbers and hyphens, and analyze word lengths and character frequencies, including Winston and O'Brien.
Explore time and date handling in R by converting strings to date and time formats, creating and parsing timestamps with lubridate, and visualizing incoming versus outgoing flow over 15 weeks.
Explore arithmetic, logical, and special operators in R, using vectors A and B to perform element-wise calculations, comparisons, integer division, modulus, and the %in% operator.
Explore loops in R for data science, focusing on for loops for known iterations, while loops for unknown iterations, and do-while loops; learn vectorization to avoid explicit loops.
Explore the three loop types in R, including the for and repeat loops, by defining sequences, building nested loops, and printing letters from a to t with break-based exits.
Explore how functions encapsulate reusable code with inputs, logic, and outputs. Learn to bundle functions into packages and use default values, named parameters, and parameter order.
Learn to code functions in R by constructing the Fibonacci sequence, where each term equals the sum of the two preceding ones, and return the first n elements on request.
Build and test a Fibonacci function in R by creating and expanding a Fibonacci vector, using length and indexing, and handling edge cases like 1, 2, 0, and negative inputs.
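A minimal sketch of what such a function might look like, not the course's own solution. The name `fibonacci`, the choice to start the sequence at 1, 1, and the choice to return an empty vector for zero or negative input are all assumptions made for illustration.

```r
# Hypothetical helper: return the first n Fibonacci numbers (starting 1, 1).
fibonacci <- function(n) {
  # Edge cases: zero, negative, or non-scalar input yields an empty vector
  if (!is.numeric(n) || length(n) != 1 || n < 1) {
    return(numeric(0))
  }
  fib <- c(1, 1)
  # Expand the vector until it holds at least n elements
  while (length(fib) < n) {
    k <- length(fib)
    fib <- c(fib, fib[k] + fib[k - 1])
  }
  # Indexing also covers n = 1, where only the first element is wanted
  fib[seq_len(n)]
}

fibonacci(7)   # 1 1 2 3 5 8 13
fibonacci(1)   # 1
fibonacci(-3)  # numeric(0)
```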
Import data into R from CSV, Excel, JSON, and SPSS formats using dedicated packages, handling separators and options. Explore exporting workflows to save results.
Learn how to export data in R using the write functions and save, specifying the object and file, creating CSV exports without headers, and exporting SPSS formats and multiple objects.
Learn web scraping in R by extracting data from web resources like Wikipedia and converting tables into a data frame, avoiding manual copy-paste.
Learn to perform web scraping in R by downloading a Wikipedia wind power table, extracting it with XPath, cleaning numeric values, and displaying the first five columns.
Piping 101 shows how piping passes the left-hand input to the next function, turning vector, log, differences, exponential, and rounding to one decimal place into a readable workflow.
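The chain described here can be sketched as follows. This is only an illustration: the input values are made up, and it uses the native `|>` pipe (R 4.1 or later) so that no package is needed; the course may well use the magrittr `%>%` pipe instead.

```r
x <- c(1, 2, 4, 8, 16)  # made-up example values

# Nested calls: read from the inside out
nested <- round(exp(diff(log(x))), 1)

# Piped version: each step receives the result of the step to its left
piped <- x |>
  log() |>
  diff() |>
  exp() |>
  round(1)

identical(nested, piped)  # TRUE
```

Both versions compute the same result; the piped form simply lists the steps in the order they are applied.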
Filter data in R by indexing vectors and data frames. Learn one-based indexing, slicing, selecting elements with c(), and using $ or brackets to access columns.
Master filtering in R with the diamonds dataset using the dplyr package, applying filter, sample, slice, top, and select operations, including whitelist and blacklist options to shape columns and rows.
Explore data aggregation using group by and summarize to compute group-level statistics, such as means, across multiple groupings in R.
Explore data aggregation in R using a population dataset from the World Health Organization. Group by country, calculate min, max, absolute and relative increases from 1995 to 2015.
Learn how to reshape data into tidy format, where each variable has its own column, each observation its own row, and each value its own cell, enabling easier analysis.
Explore data reshaping in R by converting between wide and tidy formats, then regroup and plot results using tidy data principles, with practical examples of group by and summarize.
Explore set operations in R using intersect, union, and setdiff to compare two sets, identify overlaps, include all observations, and obtain left-hand side results based on argument order.
Explore set operations in R with two vectors a and b, using intersect to find the overlap, setdiff to identify unique values, and union to combine all values.
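A short base R illustration of these three functions. The values in `a` and `b` are made up for the example and are not taken from the course.

```r
a <- c(1, 2, 3, 4, 5)  # made-up example values
b <- c(4, 5, 6, 7)

intersect(a, b)  # overlap of both sets: 4 5
setdiff(a, b)    # values only in a: 1 2 3
setdiff(b, a)    # argument order matters: 6 7
union(a, b)      # all values, without duplicates: 1 2 3 4 5 6 7
```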
Explore how to join datasets in R using left join, inner join, and other variants by aligning on indices, keeping left data intact, and handling missing values.
Learn to prepare and join data frames A and B with the dplyr package, performing left, right, and full joins on a common index and renaming value columns.
Explore data visualization in R, from base plotting to plotly with hover and zoom, leaflet for geospatial plots, and dygraphs for time series. Also cover Sankey diagrams and treemaps.
Discover ggplot 101 by mapping data with aesthetics, building scatter, histograms, and box plots from the diamonds dataset, and using facets, scales, and jitter to reveal patterns.
Explore and visualize the diamonds data set using ggplot to create one- and two-variable plots, including discrete and continuous variables, density, violin, dot plots, and faceted color and size mappings.
Get an introduction to the Plotly lab and create interactive visualizations with World Happiness Report data, linking GDP and life expectancy to happiness scores, featuring hover details, zoom, area highlight, and easy sharing.
Build interactive scatter plots in R using plotly, combining 2015–2016 happiness data with GDP, life expectancy, and freedom to reveal patterns through subplots and hover text.
Explore leaflet for interactive geospatial visualizations by analyzing the 1854 London cholera data, mapping deaths with red circles and pumps with green circles to reveal waterborne transmission.
Load the dataset, compute the median longitude and latitude, and create an interactive geospatial visualization with Leaflet, mapping deaths and pumps as red and green circles.
Explore dynamic time series with the dygraphs package. Interactively view metal prices, hover for current values, and adjust periods to study time windows.
Load metal price data from a web API into xts time series, then create dynamic dygraphs plots with rebasing to 100, a range selector, and linked plots.
Explore univariate and multivariate outlier detection in R using z-score, extreme value analyses, and DBSCAN. Learn to handle outliers with imputation, trimming, or top, bottom, or zero coding.
Explore outlier detection using box plots of the iris data set, identify outliers with z-score analysis, and compare with the DBSCAN technique to reveal observations that stand out.
Explore outlier detection in iris data using box plots and z-score thresholds. Implement per-species limits with Q1, Q3, IQR, and compare with a clustering approach and PCA visualization.
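One way the per-species limits might be computed in base R is sketched below. The 1.5 × IQR multiplier (the usual box plot convention) and the choice of the `Sepal.Width` column are assumptions for illustration, not details confirmed by the course.

```r
# Per-species lower and upper limits from Q1, Q3, and IQR.
# Assumption: limits are Q1 - 1.5 * IQR and Q3 + 1.5 * IQR.
limits <- do.call(rbind, lapply(
  split(iris$Sepal.Width, iris$Species),
  function(x) {
    q <- quantile(x, c(0.25, 0.75))
    iqr <- q[[2]] - q[[1]]
    c(lower = q[[1]] - 1.5 * iqr, upper = q[[2]] + 1.5 * iqr)
  }
))

# Look up each observation's limits by its species and flag outliers
species <- as.character(iris$Species)
is_outlier <- iris$Sepal.Width < limits[species, "lower"] |
  iris$Sepal.Width > limits[species, "upper"]

limits
iris[is_outlier, c("Species", "Sepal.Width")]
```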
Master missing data handling in R via data imputation, using visual patterns to guide simple deletion, univariate imputation, and models like mice, missForest, and missRanger.
Explore missing data handling in a credit approval dataset by analyzing patterns and applying techniques such as univariate and multivariate imputation, and removing observations as needed.
Advance your missing data handling skills by analyzing patterns in credit approval data, then apply univariate and multivariate imputations with missRanger and missForest to compare results.
Master string handling with stringr and regular expressions by detecting patterns, locating their positions, replacing matches, and extracting values using anchors.
Learn to use the stringr package in R to detect, locate, replace, split, and test patterns with regular expressions, including anchors, quantifiers, and lookaround.
Explore artificial intelligence and its relation to machine learning and deep learning, with examples like Google Maps and Google Search, then study deep learning, neural networks, image classification, and ape species distinction.
Explore how machine learning differs from classical programming, and learn supervised, unsupervised, and reinforcement learning, including classification and regression, with accuracy and R-squared as evaluation metrics.
Explore building and evaluating a machine learning model with supervised learning, using training data, validation data, and hold-out testing data to predict the target variable from independent predictors.
Explore regression as a supervised learning technique for continuous targets, including univariate and multivariate regression. See how linear, quadratic, and higher-order relationships shape models, from univariate regression to multivariate surfaces.
Explore univariate linear regression with one independent variable to predict a dependent variable using a linear model, and see how size and price relate via slope, intercept, and squared error.
Explore univariate linear regression through an interactive dashboard, adjust the estimated slope and offset, and observe how noise affects mean squared error and model fit.
Explore univariate linear regression by building a model to predict mass from height, handling an outlier, and evaluating predictions with R-squared in a Star Wars data set.
Explore univariate regression with a linear model linking speed to distance using Hubble telescope data, calculate adjusted R-squared, assess correlation, make predictions, and estimate the Hubble constant.
Load univariate regression data, fit a linear model of velocity versus distance (adjusted R-squared 0.978), generate predictions across speeds, and compute the Hubble constant as velocity over distance (about 72.1).
Explore polynomial regression to capture non-linear patterns using higher-order terms within a linear regression framework. Learn how to balance model complexity and data fit to avoid overfitting and misleading R-squared.
Create and analyze synthetic data to explore polynomial regression in R, plotting observations and residuals. Compare linear and higher-order models with poly() and I() (as-is), using adjusted R-squared to avoid overfitting.
Explore multivariate linear regression, using multiple predictors to predict the target y, and verify linearity, patternless residuals, low multicollinearity, and normal residuals via scatter and correlation plots.
Explore multivariate regression to predict wine quality using 11 chemical properties, build a linear model, visualize correlations with a correlation matrix, and assess residuals and R-squared for evaluation.
Practice multivariate regression on an airfoil noise data set to predict airfoil sound pressure level from frequency and angle of attack, then compare all-variable and three-independent-variable models.
Analyze multivariate regression on airfoil noise data, identify key predictors via correlation, compare full and three-variable models, and evaluate using R-squared and postResample metrics.
Explore underfitting and overfitting in regression and classification, and learn how bias and variance balance training and validation data to minimize overall error.
Learn how to split data into training, validation, and test sets—the gold standard for model evaluation—train models, validate performance, and obtain an unbiased final evaluation with matching distributions.
Explore training, validation, and testing splits using random sampling to avoid bias, compare linear versus random selection, and learn how data size influences the ideal split.
Build an R function to split a data frame into training, validation, and test sets using 0.6, 0.2, and 0.2 ratios, with floor-based sampling and index-based assignment.
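A rough sketch of such a split function, under stated assumptions: the name `split_data`, the `seed` argument, and the decision to give any rows left over after flooring to the test set are illustrative choices, not the course's implementation.

```r
# Hypothetical helper: split a data frame into train / validation / test sets.
split_data <- function(df, ratios = c(0.6, 0.2, 0.2), seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # optional, for reproducible splits
  n <- nrow(df)
  idx <- sample(n)                    # random permutation of the row indices
  n_train <- floor(ratios[1] * n)     # floor-based set sizes
  n_val <- floor(ratios[2] * n)
  position <- seq_len(n)
  list(
    train = df[idx[position <= n_train], , drop = FALSE],
    val   = df[idx[position > n_train & position <= n_train + n_val], , drop = FALSE],
    # Assumption: rows left over after flooring go to the test set
    test  = df[idx[position > n_train + n_val], , drop = FALSE]
  )
}

parts <- split_data(iris, seed = 123)
sapply(parts, nrow)
```

Shuffling the indices first means each set is a random sample, so no set inherits the ordering of the original data frame.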
Explore resampling techniques, including five-fold, tenfold, and leave-one-out cross-validation, to balance training and validation data, compare algorithms, and manage computational cost for stable model performance.
Join the resampling techniques lab to apply 10-fold cross-validation and leave-one-out cross-validation to wine quality data, comparing training and test performance of a linear model using postResample metrics.
Do you want to be able to perform your own data analyses with R? Do you want to learn how to get business-critical insights out of your data? Or do you want to get a job in this amazing field? In all of these cases, you have found the right course!
We will start with the very basics of R: data types and data structures, programming loops and functions, and data import and export.
Then we will dive deeper into data analysis: we will learn how to manipulate data by filtering, aggregating results, reshaping data, applying set operations, and joining datasets. We will discover different visualisation techniques for presenting complex data. We will also find out how to present interactive time series data and interactive geospatial data.
Advanced data manipulation techniques are covered, e.g. outlier detection, missing data handling, and regular expressions.
We will cover all fields of Machine Learning: Regression and Classification techniques, Clustering, Association Rules, Reinforcement Learning, and, possibly most importantly, Deep Learning for Regression, Classification, Convolutional Neural Networks, Autoencoders, Recurrent Neural Networks, ...
You will also learn to develop web applications and how to deploy them with R/Shiny.
For each field, different algorithms are shown in detail: their core concepts are presented in 101 sessions, where you will understand how the algorithm works. Then we implement it together in lab sessions. We develop the code together, then I encourage you to work on the exercises on your own before you watch my solution examples. With this knowledge you can clearly identify a problem at hand and develop a plan of attack to solve it.
You will understand the advantages and disadvantages of different models and when to use which one. Furthermore, you will know how to take your knowledge into the real world.
You will get access to an interactive learning platform that will help you to understand the concepts much better.
In this course code will never come out of thin air via copy/paste. We will develop every important line of code together and I will tell you why and how we implement it.
Take a look at some sample lectures. Or visit some of my interactive learning boards. Furthermore, there is a 30-day money-back guarantee, so there is no risk for you in taking the course right now. Don’t wait. See you in the course.