2023 CORE: Data Science and Machine Learning

Name: 2023 CORE: Data Science and Machine Learning
Rating: 4.6 (352 reviews)

A complete survey of all core skills required on the job

Created by Dr. Isaac Faber

Last updated 8/2023

English

What you'll learn

Learn all necessary core skills for Data Analysis, Data Science, and Machine Learning
Understand the first principles of data science and why it is so popular and important
Learn how to use, from scratch, Python, R, SQL, Tableau, and MS Excel for data science
Learn about a broad range of data science and machine learning libraries and resources
Build and host a personal resume and portfolio of data science projects using GitHub Pages
Learn about key supporting skills like Git/version-control, Kaggle, Databases, Command Line tools, and much more!
Learn how to set up development environments from scratch in R and Python
Learn about important related technologies like cloud, Docker, and web development
Learn to deploy a machine learning model using Docker

Course content

18 sections • 267 lectures • 28h 33m total length

Introduction • 1:30
Course Overview • 0:33
Course Structure • 3:58
Course Philosophy • 6:57
First Principles - Who? • 5:31
Discover first principles in data science and machine learning, and learn the ideal student background, including math, coding experience, and industry-focused practicality.
First Principles - Why? 1/3 • 5:32
First Principles - Why? 2/3 • 4:24
First Principles - Why? 3/3 • 9:48
Data science explains decisions under uncertainty by turning observable data into informed estimates of unseen factors. It uses exploration to drive hypotheses and improve outcomes, such as pricing a house.
Reading Assignment • 0:07
First Principles - What? • 8:46
First Principles - What? Data Analyst Example Product • 2:56
First Principles - What? Data Scientist Example Product • 4:45
First Principles - What? Machine Learning Engineer Example Product • 3:19
First Principles - What? Data & Sources • 4:32
First Principles - What? Kaggle Introduction • 2:39
First Principles - How? • 6:08
Data Science Battle Station • 2:09
Section Wrap Up • 4:36
Assignments • 0:12

Data Analyst Overview • 6:49
Explore the data analyst role as a gateway to data science, delivering data products and real-time dashboards to decision makers using spreadsheets, SQL, and Tableau.
Spreadsheets Overview • 2:57
Explore what spreadsheets are: electronic grids of data used for calculations. The course centers on Microsoft Excel as the widely used, feature-rich business calculator, with context on Google Sheets and Numbers.
Introduction to MS Excel • 2:26
Setting up MS Excel
Overview of MS Excel • 8:49
Excel Templates • 1:46
Workbook Overview • 6:10
Protecting Workbooks • 1:25
Sharing Workbooks • 1:42
Operators • 4:04
Explore how Excel uses operators to perform arithmetic, concatenation, and logical comparisons, and learn the order of operations and parentheses to build powerful formulas.
Built-in Functions • 5:59

Math - Summary Statistics • 15:33
Calculating Summary Statistics from Scratch • 6:16
Learn to compute summary statistics in Excel from scratch and with built-in functions, including mean, variance, standard deviation, covariance, and correlation, while distinguishing population versus sample calculations.
Import a Text File • 7:48
Data Tables • 7:17
Summary Statistics on Tables • 9:07
Explore summary statistics on a data table in Excel, calculating mean, population vs. sample standard deviation, and correlations between income, bedrooms, and housing value, with 3D maps for visualization.
Summary Statistics Dashboard
Assignment Review • 4:59
Analyze a California home price data set from 1990, computing summary statistics and correlations among numeric variables, with visualizations and guidance on publishing as web pages or to Power BI.
Importing Data - Intermediate • 2:25
Lookups and Matches • 7:04
Calculating Churn and Customer Lifetime Value • 4:09
Financial Forecasting (Time Series) • 5:21
Data Visualization Introduction • 4:04
Data Visualization Excel • 6:53
Explore how to visualize Microsoft stock data in Excel using line and candlestick charts, create chart sheets, adjust formats, and compare chart types, noting data layout for meaningful visuals.
Dashboards Best Practices • 5:48
Discover how dashboards provide a single, focused view of analysis that delivers quick insights, using Excel to model data, design simple templates, and separate data analysis from visuals.
Build a Dashboard
Assignment Solution • 2:29
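
The population versus sample distinction mentioned in the summary statistics lectures can be sketched outside Excel as well. The following is a minimal from-scratch illustration in Python, not material from the course; the function names and the `bedrooms` and `value` numbers are made up for the example.

```python
import math

def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

def variance(xs, sample=True):
    """Sample variance divides by n - 1; population variance divides by n."""
    m = mean(xs)
    squared_deviations = sum((x - m) ** 2 for x in xs)
    divisor = len(xs) - 1 if sample else len(xs)
    return squared_deviations / divisor

def std_dev(xs, sample=True):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(variance(xs, sample))

def covariance(xs, ys, sample=True):
    """Average product of paired deviations, with the same n vs. n - 1 choice."""
    mx, my = mean(xs), mean(ys)
    total = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    divisor = len(xs) - 1 if sample else len(xs)
    return total / divisor

def correlation(xs, ys):
    """Pearson correlation; the divisor choice cancels out in the ratio."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))

# Hypothetical values for illustration only (not the course data set).
bedrooms = [2, 3, 3, 4, 5]
value = [150, 200, 210, 280, 330]  # assumed to be in thousands

sample_var = variance(bedrooms, sample=True)
population_var = variance(bedrooms, sample=False)
r = correlation(bedrooms, value)
```

In Excel, the same split appears in function pairs such as `VAR.S` and `VAR.P` or `STDEV.S` and `STDEV.P`, with `CORREL` for correlation.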

Importing Data - Power Query • 8:56
Pivot Tables • 7:26
Explore how to create and analyze pivot tables in Excel to summarize data, compare Titanic survival by sex and class, and visualize findings with charts.
Mathematical Modeling - Linear Programming • 8:58
Solver - Linear Programming in Excel • 15:18
Analysis ToolPak • 4:27
Visual Basic for Applications (VBA) - Introduction • 6:21
Spreadsheet Conclusion • 3:44
Complete LinkedIn Excel Assessment

SSI - databases • 20:32
SQL Text Editor - Sublime • 3:34
SQL Syntax • 6:45
Explore SQL syntax for data science by learning comments, commands, keywords, identifiers, and literals, and practice writing SELECT statements, terminated with a semicolon, that fetch all columns from a table.
Introduction to SQLite Databases • 3:18
Explore SQLite, a free, open-source, local SQL database engine that is fast, self-contained, and cross-platform, with practical setup on Windows, Mac, and Linux.
SQLite Install • 3:02
SQLite Database Creation • 6:47
Basic SQL - SELECT, FROM, WHERE statements • 11:28
Master core SQL basics to query a single table using SELECT, FROM, LIMIT, and WHERE. Understand selecting all columns and filtering results with operators.
Basic SQL - BETWEEN, LIKE statements • 2:27
Basic SQL - AND, OR, NOT, EXISTS, NULL statements • 8:20
Basic SQL - ORDER BY, DISTINCT statements • 4:21
Learn to use ORDER BY and DISTINCT in basic SQL queries to sort employees by first name and salary, with ascending or descending order and multi-column sorting.
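
The basic statements listed in this section can be tried without installing anything beyond Python, whose standard library bundles SQLite. The sketch below is an illustration only: the `employees` table, its columns, and its rows are hypothetical and are not the course's database.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (first_name TEXT, department TEXT, salary REAL);"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?);",
    [
        ("Ana", "Sales", 52000),
        ("Ben", "IT", 70000),
        ("Cara", "IT", 64000),
        ("Dev", "Sales", 48000),
    ],
)

# SELECT * FROM: fetch every column of every row.
all_rows = conn.execute("SELECT * FROM employees;").fetchall()

# WHERE with BETWEEN, AND, and LIKE: filter rows on several conditions.
filtered = conn.execute(
    "SELECT first_name FROM employees "
    "WHERE salary BETWEEN 50000 AND 65000 AND first_name LIKE 'A%';"
).fetchall()

# ORDER BY on two columns plus LIMIT: the two highest salaries.
top_two = conn.execute(
    "SELECT first_name, salary FROM employees "
    "ORDER BY salary DESC, first_name ASC LIMIT 2;"
).fetchall()

# DISTINCT: each department listed once.
departments = conn.execute(
    "SELECT DISTINCT department FROM employees ORDER BY department;"
).fetchall()
```

Each query returns a list of tuples, one per row, which makes it easy to inspect the effect of each clause.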

Intermediate SQL - Aggregate Functions • 7:48
Explore intermediate SQL aggregate functions such as COUNT, SUM, AVG, MIN, and MAX, combined with GROUP BY to compute department-level salaries; learn aliases and rounding for clean results.
Intermediate SQL - Joins • 7:09
Intermediate SQL - WITH and subqueries • 6:51
Advanced SQL - Inserting, Updating, and Deleting data • 9:11
Explore how to insert, update, and delete data in a database, using INSERT INTO ... VALUES, UPDATE ... SET ... WHERE, and DELETE FROM statements on the dependence table.
Advanced SQL - Views • 5:39
Connecting SQLite to Excel • 4:09
Kaggle SQL Course
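
The intermediate and advanced topics above (aggregates with GROUP BY, aliases, rounding, WITH, joins, and data modification) can be sketched the same way. As before, this is a hedged illustration using Python's bundled SQLite; the `employees` table and its contents are hypothetical, not the tables used in the lectures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        first_name TEXT,
        department TEXT,
        salary REAL
    );
    INSERT INTO employees (first_name, department, salary) VALUES
        ('Ana', 'Sales', 52000),
        ('Ben', 'IT', 70000),
        ('Cara', 'IT', 64000),
        ('Dev', 'Sales', 48000);
    """
)

# Aggregates with GROUP BY, column aliases (AS), and ROUND.
by_department = conn.execute(
    "SELECT department, COUNT(*) AS headcount, "
    "ROUND(AVG(salary), 2) AS avg_salary "
    "FROM employees GROUP BY department ORDER BY department;"
).fetchall()

# WITH (a common table expression) joined back to the base table:
# employees earning more than their department's average.
above_average = conn.execute(
    "WITH dept_avg AS ("
    "  SELECT department, AVG(salary) AS avg_salary "
    "  FROM employees GROUP BY department"
    ") "
    "SELECT e.first_name FROM employees AS e "
    "JOIN dept_avg AS d ON e.department = d.department "
    "WHERE e.salary > d.avg_salary ORDER BY e.first_name;"
).fetchall()

# Data modification: UPDATE ... SET ... WHERE, DELETE FROM, INSERT INTO.
conn.execute("UPDATE employees SET salary = 50000 WHERE first_name = 'Dev';")
conn.execute("DELETE FROM employees WHERE first_name = 'Ben';")
conn.execute(
    "INSERT INTO employees (first_name, department, salary) "
    "VALUES ('Eli', 'IT', 61000);"
)
remaining = conn.execute(
    "SELECT first_name, salary FROM employees ORDER BY first_name;"
).fetchall()
```

Running the aggregate queries before the modifications keeps the two ideas separate: the first half only reads the table, the second half changes it.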

Introduction to Business Intelligence (BI) • 10:41
Why Tableau? • 7:52
Installing Tableau Public • 2:08
Tableau Overview • 9:58
Explore how to import CO2 emissions data from Kaggle into Tableau, create geospatial maps and time-series visuals, and build interactive dashboards with filters.
Tableau Data Types • 2:30
Tableau Basic Viz • 9:18
Tableau Filters • 3:45
Explore Tableau filters and meta filters that update entire data sources, including continuous vs discrete filters, handling nulls, and customizing dropdown, slider, or checkbox lists for country selections.
Connecting Tableau to OData Sources • 7:03
Joining Data in Tableau • 8:41

Tableau Intermediate Bar Charts • 8:34
Tableau Dates • 3:06
Tableau Visualizing Comparisons • 4:56
Tableau Visualizing Distributions • 7:05
Explore how to visualize distributions in Tableau using circle charts, average lines, and box plots to analyze Titanic survival by family size, class, and age.
Tableau Multiple Axis • 4:18
Master Tableau dual-axis visualizations by comparing high and low prices by date and overlaying stock price with volume, using date parts to reveal their relationship.
Tableau Formatting • 6:06
Tableau Calculations and Parameters • 11:46
Tableau Dashboards and Stories • 15:47
Learn to build Tableau dashboards and stories by joining GDP and CO2 emissions data, crafting maps and charts, configuring filters and actions, and previewing dashboards across devices.
Tableau Advanced Analysis • 7:42
Sharing with Tableau Public • 3:41
Learn to save and publish to Tableau Public, sign in, share dashboards, and build a Tableau story for your project portfolio.
Tableau Desktop Pro Overview • 4:57
Assignment: Portfolio and Resume Updates • 0:06

Introduction to the Data Scientist (Generalist) • 7:18
Overview of R • 7:46
Intro to CRAN and installing base R • 4:18
Installing RStudio • 5:30
Overview of RStudio • 6:06
Explore the RStudio environment through a hands-on tour of the console, plots pane, and script editor. Learn to run code, manage objects, install packages like ggplot2, and use keyboard shortcuts.
Calculations in Base R • 6:02
Use R as a calculator for basic operations (addition, subtraction, multiplication, division, powers), with immediate interpreted output, and learn to write comments with # and to manage simple chained calculations.
Objects in Base R • 9:20
Functions in Base R • 14:02
The Basics of R Scripts • 4:45
Base R Datasets • 2:48
Base R Help and Plots • 5:47
Installing R Packages - More on Plots and Objects • 9:30
Install R packages from CRAN or local mirrors, manage dependencies with install.packages, explore the tidyverse, and use c(), library, qplot, ggplot, rnorm, and runif to create and plot data.
Atomic Vectors • 10:38
Object Attributes • 3:58
Matrix and Array Objects • 3:27
Classes • 4:03
Factors • 3:07
Explore how factors transform categorical variables into numeric representations by wrapping a vector in a factor, revealing levels and integers that support regression, classification, and visualization.
Coercion • 3:25
Lists • 5:30
Data Frames • 6:57
Data frames are two-dimensional, tabular structures, like Excel tables, for storing heterogeneous data. Access columns with the dollar sign, flatten lists into frames, and use table counts in exploratory data analysis.
Loading and Saving Data Part 1 • 10:43
Loading and Saving Data Part 2 • 2:50
Selecting Values from Data Frames • 11:56
Changing Values in Data Frames • 8:31
Sub Setting Data Frames • 8:17
Missing Values • 5:15
More on Selecting Values • 1:28
Programming Flow Controls • 6:57

An Introduction to EDA • 7:56
EDA Example on Kaggle • 2:42
Explore exploratory data analysis on a Kaggle house price dataset using ggplot visualizations, summary statistics, and regression modeling to guide feature engineering and prediction.
Expanding Summary Statistics - Location • 7:55
Location Examples in R • 2:25
Expanding Summary Statistics - Spread • 4:24
Spread Examples in R • 1:44
Important EDA Tools • 4:03
Explore essential exploratory data analysis tools, from box plots and Tukey's quartiles to frequency tables, histograms, density plots, and scatter and correlation visuals, using the tidyverse.
Introduction to the Tidyverse and ggplot2 • 6:38
Tidyverse website • 1:03
Explore the tidyverse website for resources, documentation, and package installation guidance, including ggplot2, and bookmark cheat sheets and the data science roadmap repository.
ggplot - Mapping Aesthetics • 5:36
ggplot - Facets • 4:09
Learn to use ggplot facets, including wrap and grid, to create subplots by survived and passenger class, facet by two variables, and coerce numeric variables to character or factor.
ggplot - Multiple Geom • 4:22
ggplot - Stat Transforms • 2:57
ggplot - Position Adjustments • 3:31
Explore how to use ggplot position adjustments (identity, dodge, and fill) to tell stories by coercing variables into factors, coloring and stacking survival by family size on the Titanic dataset.
ggplot - Coord Systems • 2:59
ggplot - Summary • 1:28
ggplot - Gallery Book • 1:22
R Object Names • 3:22
dplyr - Overview • 4:39
dplyr - Filter • 6:48
Use dplyr filter to create explicit subsets, compose criteria with and/or and in operators, and handle missing values to examine pclass, age, and cabin.
dplyr - Arrange and Select • 5:12
dplyr - Mutate • 3:23
Employ mutate to add new columns by computing values from existing data or external vectors, such as the absolute distance from the column mean, enhancing data frame flexibility.
dplyr - Pipes, group_by, and summarise • 9:19
Master the pipe syntax to chain data operations in the tidyverse, using group_by and summarize to compute counts and averages across subgroups like sex, passenger class, and family size.
stringr - Basics • 11:53
stringr - Matching • 9:09
lubridate - Basics • 9:15
Learn to handle dates and times in R using the lubridate package, converting strings to date objects, formatting dates, and visualizing time-based data with ggplot for exploratory data analysis.
Intro to Markdown • 10:25
Discover how markdown helps data scientists share Kaggle-style exploratory data analysis on the web, using headers, lists, links, images, and code blocks in RStudio and R Markdown.
Intro to RMarkdown • 8:16
Quick Overview of Notebooks • 3:17
Explore notebooks in Kaggle as interactive computing environments that mix code and markdown, show outputs, and enable iterative analysis and easy sharing.
EDA Assignment
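
The filter, group_by, and summarise pattern from the dplyr lectures is taught in R. For readers who want to see the same idea before installing R, here is a rough Python standard-library analogue. It is not dplyr and not course material; the `passengers` records are invented, Titanic-style rows used only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; None marks a missing age.
passengers = [
    {"sex": "female", "pclass": 1, "age": 29.0, "survived": 1},
    {"sex": "male", "pclass": 3, "age": 22.0, "survived": 0},
    {"sex": "female", "pclass": 3, "age": None, "survived": 1},
    {"sex": "male", "pclass": 1, "age": 40.0, "survived": 0},
    {"sex": "male", "pclass": 3, "age": 30.0, "survived": 1},
]

# Filter step: keep only rows with a known age
# (comparable in spirit to filter(!is.na(age)) in dplyr).
known_age = [p for p in passengers if p["age"] is not None]

# Group step: bucket the remaining rows by sex.
groups = defaultdict(list)
for p in known_age:
    groups[p["sex"]].append(p)

# Summarise step: one row of counts and averages per group.
summary = {
    sex: {
        "n": len(rows),
        "mean_age": mean(r["age"] for r in rows),
        "survival_rate": mean(r["survived"] for r in rows),
    }
    for sex, rows in groups.items()
}
```

The three steps run in a fixed order, each consuming the previous step's output, which is the same idea the pipe syntax expresses in the tidyverse.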

Requirements

Comfort with high school (secondary school) math.
Some familiarity with any kind of computer programming will make the course much easier, but it is not required.

Description

This is an ambitious course. The goal here is simple: teach only what you need to know for day 1 of your first data science job. No fluff, nothing out of context, no topics that are not relevant to real-world applications. We will cover EVERY core topic and tool required for those new to data science, in depth: Python, R, SQL, Useful Math/Stats/Algorithms, Tableau, and Excel. The course will cover skills that align with three different job types:

- Data Analyst

- General Data Scientist

- Machine Learning Engineer

You can expect to learn from first principles the foundational topics and tools used in practice today. We will avoid topics that are not useful or are simply too advanced when starting out. Your journey will be guided by the Data Science Road Map, a collection of the best resources gathered through years of experience by the instructor.

In addition, we will survey every important technology required on the job, including GitHub, Kaggle, and the basics of cloud, web development, and Docker. With over 200 videos, readings, and assignments, you can be sure you will be well prepared to join the data community.

If you are just getting started or want to fill in some of your knowledge gaps, this course is for you!

Who this course is for:

Those who feel like they don't know where to start with data science and machine learning
Those tired of courses that don't show the entire picture of data science and leave them asking 'now what?'
Those interested in starting a journey into the data science and machine learning career field
Those wanting to super-charge an existing skill set with the latest techniques and tools

Course content

Introduction - First Principles • 19 lectures • 1hr 18min

Data Analyst - Case Study - Intro & Basic Spreadsheets • 10 lectures • 42min

Data Analyst - Case Study - Intermediate Spreadsheets • 14 lectures • 1hr 29min

Data Analyst - Case Study - Advanced Spreadsheets • 7 lectures • 55min

Data Analyst - Case Study - SQL Basics • 10 lectures • 1hr 11min

Data Analyst - Case Study - SQL Intermediate and Advanced • 6 lectures • 41min

Data Analyst - Case Study - Business Intelligence and Tableau Introduction • 9 lectures • 1hr 2min

Data Analyst - Case Study - Tableau Intermediate and Advanced Topics • 12 lectures • 1hr 18min

Data Scientist - Case Study - Introduction to R • 28 lectures • 3hr

Data Scientist - Case Study - Exploratory Data Analysis and the R Tidyverse • 29 lectures • 2hr 30min
