Deep Learning Prerequisites: The Numpy Stack in Python (V2+)
What you'll learn
- Understand supervised machine learning (classification and regression) with real-world examples using Scikit-Learn
- Understand and code using the Numpy stack
- Make use of Numpy, Scipy, Matplotlib, and Pandas to implement numerical algorithms
- Understand the pros and cons of various machine learning models, including Deep Learning, Decision Trees, Random Forest, Linear Regression, Boosting, and More!
- Understand linear algebra and the Gaussian distribution
- Be comfortable with coding in Python
- You should already know "why" things like a dot product, matrix inversion, and Gaussian probability distributions are useful and what they can be used for
Welcome! This is Deep Learning, Machine Learning, and Data Science Prerequisites: The Numpy Stack in Python.
A concern I hear a lot is that people want to learn deep learning and data science, so they take these courses, but they get left behind because they don’t know enough about the Numpy stack to turn those concepts into code.
Even if I write the code in full, if you don’t know Numpy, then it’s still very hard to read.
This course is designed to remove that obstacle - to show you how to do things in the Numpy stack that are frequently needed in deep learning and data science.
So what are those things?
Numpy. This forms the basis for everything else. The central object in Numpy is the Numpy array, on which you can do various operations.
The key is that a Numpy array isn’t just a regular array you’d see in a language like Java or C++; it behaves like a mathematical object such as a vector or a matrix.
That means you can do vector and matrix operations like addition, subtraction, and multiplication.
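As a minimal sketch of what "behaves like a vector" means in practice, the snippet below contrasts Numpy arrays with plain Python lists. The variable names and values are made up for illustration.

```python
import numpy as np

# Two small vectors (illustrative values only).
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

added = u + v        # element-wise addition
subtracted = u - v   # element-wise subtraction
scaled = 2 * u       # scalar multiplication
elementwise = u * v  # element-wise product, NOT a dot product
dot = u @ v          # dot product: 1*4 + 2*5 + 3*6

# With plain Python lists, "+" concatenates instead of adding.
list_concat = [1, 2, 3] + [4, 5, 6]

print(added, subtracted, scaled, elementwise, dot)
print(list_concat)
```

Note that `*` on arrays multiplies element by element; the dot product and matrix product use `@` (or `np.dot`).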
The most important aspect of Numpy arrays is that they are optimized for speed. So we’re going to do a demo where I show you that a vectorized Numpy operation is faster than the equivalent loop over a Python list.
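A rough sketch of what such a speed comparison might look like is shown here. It is not the exact demo from the course; the vector size, the helper names `dot_loop` and `dot_numpy`, and the number of repetitions are all assumptions chosen for illustration, and the timings will vary by machine.

```python
import timeit
import numpy as np

n = 100_000  # vector length (arbitrary choice for this sketch)
a_list = [float(i) for i in range(n)]
b_list = [float(i) for i in range(n)]
a_arr = np.array(a_list)
b_arr = np.array(b_list)

def dot_loop(x, y):
    """Dot product using an explicit Python loop over two lists."""
    total = 0.0
    for xi, yi in zip(x, y):
        total += xi * yi
    return total

def dot_numpy(x, y):
    """Dot product as a single vectorized Numpy call."""
    return x.dot(y)

# Both approaches should agree on the answer.
loop_result = dot_loop(a_list, b_list)
numpy_result = dot_numpy(a_arr, b_arr)

# Time 10 repetitions of each approach.
t_loop = timeit.timeit(lambda: dot_loop(a_list, b_list), number=10)
t_numpy = timeit.timeit(lambda: dot_numpy(a_arr, b_arr), number=10)
print(f"loop: {t_loop:.4f}s  numpy: {t_numpy:.4f}s")
```

On typical hardware the Numpy version tends to be much faster, because the loop runs in compiled code rather than in the Python interpreter.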
Then we’ll look at some more complicated matrix operations, like products, inverses, determinants, and solving linear systems.
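The matrix operations listed above map onto Numpy calls roughly as follows. This is a small sketch with a made-up 2x2 matrix, not course material.

```python
import numpy as np

# A small invertible matrix and a right-hand side (illustrative values).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

AB = A @ A                   # matrix product
A_inv = np.linalg.inv(A)     # matrix inverse
det_A = np.linalg.det(A)     # determinant: 2*3 - 1*1
x = np.linalg.solve(A, b)    # solves A x = b without forming the inverse

print(AB)
print(A_inv)
print(det_A)
print(x)
```

For solving a linear system, `np.linalg.solve(A, b)` is generally preferred over `np.linalg.inv(A) @ b`, since it is usually faster and numerically more stable.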
Pandas. Pandas is great because it does a lot of things under the hood, which makes your life easier because you then don’t need to code those things manually.
Pandas makes working with datasets a lot like R, if you’re familiar with R.
The central object in R and Pandas is the DataFrame.
We’ll look at how much easier it is to load a dataset using Pandas vs. trying to do it manually.
Then we’ll look at some dataframe operations useful in machine learning, like filtering by column, filtering by row, and the apply function.
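To make those operations concrete, here is a small sketch. The CSV text, column names, and the pass threshold are invented for illustration; in practice you would pass a file path to `pd.read_csv`.

```python
import io
import pandas as pd

# A tiny made-up dataset, held in a string so the example is self-contained.
csv_text = """name,age,score
Ann,23,88.5
Bob,35,72.0
Cara,41,95.0
Dan,29,60.5
"""

# Loading: one call parses the header, the rows, and the column types.
df = pd.read_csv(io.StringIO(csv_text))

ages = df["age"]                  # select one column (a Series)
subset = df[["name", "score"]]    # select several columns (a DataFrame)
over_30 = df[df["age"] > 30]      # keep only rows matching a condition

# apply: run a function on every value of a column.
df["passed"] = df["score"].apply(lambda s: s >= 70)

print(over_30)
print(df)
```

The row filter works by building a boolean Series (`df["age"] > 30`) and using it as a mask, which is the same idea as a `WHERE` clause in SQL.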
Pandas dataframes will remind you of SQL tables, so if you have an SQL background and you like working with tables then Pandas will be a great next thing to learn about.
Matplotlib. Since Pandas teaches us how to load data, the next step will be looking at the data. For that we will use Matplotlib.
In this section we’ll go over some common plots, namely the line chart, scatter plot, and histogram.
We’ll also look at how to show images using Matplotlib.
99% of the time, you’ll be using some form of the above plots.
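As a quick sketch of the four plot types mentioned above, the snippet below draws each one in its own panel. The data is randomly generated for illustration, and the `Agg` backend is an assumption made so the script also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded so the example is repeatable
x = np.linspace(0, 2 * np.pi, 100)

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# Line chart: one sine wave.
axes[0, 0].plot(x, np.sin(x))
axes[0, 0].set_title("Line chart")

# Scatter plot: 200 random 2-D points.
axes[0, 1].scatter(rng.normal(size=200), rng.normal(size=200), s=8)
axes[0, 1].set_title("Scatter plot")

# Histogram: 1000 draws from a standard normal, in 30 bins.
axes[1, 0].hist(rng.normal(size=1000), bins=30)
axes[1, 0].set_title("Histogram")

# Image: a 28x28 array of random grayscale pixel values.
image = rng.random((28, 28))
axes[1, 1].imshow(image, cmap="gray")
axes[1, 1].set_title("Image")

fig.tight_layout()
```

In an interactive session you would typically finish with `plt.show()`, or call `fig.savefig(...)` to write the figure to a file.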
Scipy. I like to think of Scipy as an add-on library to Numpy.
Whereas Numpy provides basic building blocks, like vectors, matrices, and operations on them, Scipy uses those general building blocks to do specific things.
For example, Scipy can do many common statistics calculations, including getting the PDF value, the CDF value, sampling from a distribution, and statistical testing.
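Those statistics calculations could look something like this with `scipy.stats`. It is a minimal sketch: the choice of the normal distribution, the sample sizes, the means, and the seeds are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats

# PDF and CDF of the standard normal, evaluated at 0.
pdf_at_zero = stats.norm.pdf(0.0)   # equals 1 / sqrt(2 * pi)
cdf_at_zero = stats.norm.cdf(0.0)   # half the probability mass lies below 0

# Sampling: 500 draws each from two normals with different means.
samples_a = stats.norm.rvs(loc=0.0, scale=1.0, size=500, random_state=0)
samples_b = stats.norm.rvs(loc=0.5, scale=1.0, size=500, random_state=1)

# Statistical testing: two-sample t-test for a difference in means.
t_stat, p_value = stats.ttest_ind(samples_a, samples_b)

print(pdf_at_zero, cdf_at_zero)
print(t_stat, p_value)
```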
It has signal processing tools so it can do things like convolution and the Fourier transform.
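A small sketch of both operations follows, using `scipy.signal` and `scipy.fft`. The input values, the 5 Hz test tone, and the 100 Hz sampling rate are assumptions picked to keep the example easy to check by hand.

```python
import numpy as np
from scipy import fft, signal

# Convolution of a short signal with a short kernel ("full" mode by default).
x = np.array([1.0, 2.0, 3.0])
kernel = np.array([0.0, 1.0, 0.5])
convolved = signal.convolve(x, kernel)

# Fourier transform: one second of a 5 Hz sine sampled at 100 Hz.
fs = 100                          # samples per second
t = np.arange(fs) / fs            # time stamps for one second
wave = np.sin(2 * np.pi * 5 * t)

spectrum = fft.rfft(wave)                     # FFT of a real-valued signal
freqs = fft.rfftfreq(len(wave), d=1 / fs)     # frequency of each FFT bin, in Hz
peak_freq = freqs[np.argmax(np.abs(spectrum))]

print(convolved)
print(peak_freq)
```

The largest magnitude in the spectrum sits at the frequency of the sine wave, which is how the Fourier transform reveals what frequencies a signal contains.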
If you’ve taken a deep learning or machine learning course, and you understand the theory, and you can see the code, but you can’t work out how to turn those algorithms into actual running code, this course is for you.
"If you can't implement it, you don't understand it"
Or as the great physicist Richard Feynman said: "What I cannot create, I do not understand".
My courses are the ONLY courses where you will learn how to implement machine learning algorithms from scratch.
Other courses will teach you how to plug your data into a library, but do you really need help with 3 lines of code?
After doing the same thing with 10 datasets, you realize you didn't learn 10 things. You learned 1 thing, and just repeated the same 3 lines of code 10 times...
- Python coding: if/else, loops, lists, dicts, sets
- You should already know "why" things like a dot product, matrix inversion, and Gaussian probability distributions are useful and what they can be used for
WHAT ORDER SHOULD I TAKE YOUR COURSES IN?
Check out the lecture "Machine Learning and AI Prerequisite Roadmap" (available in the FAQ of any of my courses)
Who this course is for:
- Students and professionals with little Numpy experience who plan to learn deep learning and machine learning later
- Students and professionals who have tried machine learning and data science but are having trouble putting the ideas down in code
Today, I spend most of my time as an artificial intelligence and machine learning engineer with a focus on deep learning, although I have also been known as a data scientist, big data engineer, and full stack software engineer.
I received my first master's degree over a decade ago in computer engineering with a specialization in machine learning and pattern recognition. I received my second master's degree in statistics with applications to financial engineering.
My experience includes online advertising and digital media as both a data scientist (optimizing click and conversion rates) and big data engineer (building data processing pipelines). Some big data technologies I frequently use are Hadoop, Pig, Hive, MapReduce, and Spark.
I've created deep learning models to predict click-through rate and user behavior, as well as for image and signal processing and modeling text.
My work in recommendation systems has applied Reinforcement Learning and Collaborative Filtering, and we validated the results using A/B testing.
I have taught data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics to undergraduate and graduate students attending universities such as Columbia University, NYU, Hunter College, and The New School.