Do you want to explore the various arenas of machine learning and deep learning by creating insightful and interesting projects? If yes, then this Learning Path is ideal for you!
Packt’s Video Learning Paths are a series of individual video products put together in a logical and stepwise manner such that each video builds on the skills learned in the video before it.
Machine learning and deep learning give you immensely powerful insights into data. Both of these fields are increasingly pervasive in the modern data-driven world.
This Learning Path begins by covering Python concepts from basic to advanced level. Then, you’ll explore a range of real-life scenarios where machine learning can be used. Throughout the Learning Path, you will use Python to implement a wide range of machine learning algorithms that solve real-world problems. You’ll also learn a range of regression techniques, classification algorithms, predictive modeling, data visualization techniques, recommendation engines, and more with the help of real-world examples. Six independent projects will help you master machine learning in Python. Finally, you’ll learn to build intelligent systems using deep learning with Python. By the end of this Learning Path, you will be able to build your own machine learning and deep learning models.
Meet Your Experts:
We have combined the best works of the following esteemed authors to ensure that your learning journey is smooth:
Daniel Arbuckle got his Ph.D. in Computer Science from the University of Southern California. He has published numerous papers, along with several books and video courses, and is both a teacher of computer science and a professional programmer.
Prateek Joshi is an artificial intelligence researcher, published author of five books, and TEDx speaker. He is the founder of Pluto AI, a venture-funded Silicon Valley startup building an analytics platform for smart water management powered by deep learning. His tech blog has received more than 1.2 million page views from over 200 countries and has more than 6,600 followers.
Alexander T. Combs is an experienced data scientist, strategist, and developer with a background in financial data extraction, natural language processing and generation, and quantitative and statistical modeling.
Eder Santana is a PhD candidate in Electrical and Computer Engineering. His thesis topic is deep and recurrent neural networks. After working for three years with kernel machines (SVMs, information-theoretic learning, and so on), Eder moved to the field of deep learning two and a half years ago, when he started learning Theano, Caffe, and other machine learning frameworks.
The goal of this video is to provide a basic understanding of the Python language constructs.
The goal of this video is to ensure we have the basic ability to operate with Python's most common data structures.
People used to lower-level languages are often unfamiliar with first-class functions and classes; we'll take a look at them in this video.
Python comes with a lot of tools. We need to be familiar with them in order to make the best use of the language. That's the topic we will be covering in this video.
The goal of this video is to look at the overview of recent language changes in Python.
In this video, we are going to download and install the correct version of Python and make sure it's functional.
The fastest way to get rolling is to use a textual interface, so we need to make sure we know how to use it. That's the goal in this video.
Python has good support for installing and managing third-party code packages; let's take a look at them in this video.
There's a lot of good code available. How do we find it? Let's take a look in this video.
The goal of this video is to know how to create the structure of a Python program.
A package containing no code isn't of much use. How do we make it do something? Let's take a look at it in this video.
The goal in this video is to understand how we can make code in separate modules of the package interact.
A complete program usually involves static data files along with the code. How do we integrate non-Python files in a package? Let's find out!
The computer can read our code as long as the syntax is correct, but humans require more care. Let's see how we can write readable code in Python.
If more than one programmer is working on a project, version control is a necessity. Even for a single programmer, version control is useful. We will take a look at it in this video.
Writing a program can take a while, and it's best that external changes to the development system do not impact the process. In this video, we will see how to keep our development environment isolated.
Properly formatted docstrings can be automatically used by IDEs and transformed into HTML. Let's take a peek into it in this video.
The goal of this video is to understand why it's important that examples in the documentation stay consistent with the code.
Writing Python code packages is all well and good, but how do we make a program? We will take a look at it in this video.
It's useful to be able to read data from the program command line. The argparse package makes it easy. Let's see it in action in this video.
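As a taste of the idea, here's a minimal argparse sketch (illustrative only, not the course's exact code; the argument names are placeholders):

```python
# Minimal argparse sketch: a positional filename plus an optional flag.
import argparse

parser = argparse.ArgumentParser(description="Process a data file.")
parser.add_argument("filename", help="path to the input file")
parser.add_argument("-v", "--verbose", action="store_true",
                    help="print extra progress information")
args = parser.parse_args()

if args.verbose:
    print(f"Processing {args.filename}...")
```

Running `python process.py data.csv -v` would then print the progress message.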
What do we do when we need information to flow both ways with the user?
It's often useful to have a launcher file for a program, which can be double-clicked in a GUI or used as a shorthand from the command line. A simple script or batch file does the trick; let's see how.
Parallel processing has pitfalls, but Python's high-level parallel processing framework makes it easy to dodge them. Let's look at it in this video.
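Here's a minimal sketch of the framework in action, using the standard library's concurrent.futures (the worker function is illustrative):

```python
# Minimal concurrent.futures sketch: map a function over inputs
# using a pool of worker processes.
from concurrent.futures import ProcessPoolExecutor

def square(x):
    return x * x

if __name__ == "__main__":  # guard required for process pools on some platforms
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(square, range(10)))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```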
When our program doesn't fit the conceptual model of concurrent.futures, we can use multiprocessing to define our own model. Let's see how.
At first glance, Python's coroutine scheduling looks a lot like multiprocessing or multithreading. What's the difference? Let's get to know this better.
The goal of this video is to set up and run the asyncio coroutine scheduler.
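For flavor, here's a minimal sketch of the scheduler at work (Python 3.7+; the coroutines are illustrative):

```python
# Minimal asyncio sketch: run two coroutines concurrently.
import asyncio

async def greet(name, delay):
    await asyncio.sleep(delay)  # yields control back to the scheduler
    print(f"hello, {name}")

async def main():
    # schedule both coroutines and wait for them to finish
    await asyncio.gather(greet("first", 0.1), greet("second", 0.05))

asyncio.run(main())  # prints "second" before "first"
```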
In this video, we will see how to actually use asyncio's Future objects.
The goal of this video is to understand how to keep coroutines from jamming each other up and locking the program.
In this video, we will see how asyncio's actual I/O facilities work.
Functions are very useful, but it would be nice to be able to tweak them to suit our specific needs. Enter function decorators.
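A minimal sketch of the idea (the logging wrapper is illustrative):

```python
# Minimal decorator sketch: wrap a function to log each call.
import functools

def logged(func):
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__} with {args} {kwargs}")
        return func(*args, **kwargs)
    return wrapper

@logged
def add(a, b):
    return a + b

add(2, 3)  # logs the call, then returns 5
```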
If we're going to be using functions as data, it would be nice to have metadata for the functions too. Function annotations give us that.
Class objects are basically data structures, so it would be useful to be able to rewrite them as they're defined. Class decorators let us do that.
Class decorators take effect after Python has created a class object. If we want our code involved before the class object is created, we need to use a metaclass.
Pieces of code often come in matched pairs, with one piece needing to run before something happens and the matching piece after. Context managers make sure the pieces match up.
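A minimal sketch using the standard library's contextlib (the timing example is illustrative):

```python
# Minimal context-manager sketch: the setup runs before the with-block,
# and the teardown always runs after it, even on an exception.
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    start = time.perf_counter()  # the "before" piece
    try:
        yield
    finally:
        print(f"{label}: {time.perf_counter() - start:.3f}s")  # the "after" piece

with timed("sleep"):
    time.sleep(0.1)
```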
We can plug code into Python's attribute access mechanism to control what it means to read, write, or delete a member variable.
Testing is critical, but it can feel like a burden. Automated unit testing and test-driven development can solve this problem.
The unittest package provides a framework for writing automatic tests; let's check it out.
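For flavor, a minimal sketch (the function under test is illustrative):

```python
# Minimal unittest sketch: one test case with one assertion.
import unittest

def add(a, b):
    return a + b

class TestAdd(unittest.TestCase):
    def test_add_integers(self):
        self.assertEqual(add(2, 3), 5)

if __name__ == "__main__":
    unittest.main()  # finds and runs the TestCase above
```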
We need to separate the code we're testing from everything else. When the code interacts with other objects, we can replace them with mock objects. Let's see how.
As a test suite grows, running the tests individually becomes impractical. We need the system to find and run them in large swaths.
Sometimes we want a more flexible test runner than the one built in to unittest. Nose to the rescue.
For the newcomer, it can be difficult to figure out exactly what other people are talking about when they say "reactive programming".
Many of the discussions of reactive programming are highly theoretical. Building a reactive programming system for ourselves will help show how simple the basic ideas are.
Creating a complete and correct reactive programming framework takes a lot of time and effort. Since somebody's already done it, let's take a look at the result.
We need our web server to scale up and maintain high availability. Let's see how we can do that.
Sometimes, we'd rather have all of the protocol-level stuff handled automatically for our microservice.
Sometimes we need to access or create compiled code and sometimes we don't need it as much as we think. How do we know which is which?
There's a lot of useful code out there that wasn't written for Python, but most of it is accessible to C code.
Sometimes we just need to write code that runs close to the metal. Cython gives us that option.
Machine learning algorithms need processed data for operation. Let’s explore how to process raw data in this video.
Algorithms need data in numerical form to operate on it directly, but we often label data with words. So, let’s see how we can transform word labels into numerical form.
Linear regression uses a linear combination of input variables to estimate the underlying function that governs the mapping from input to output. Our aim would be to identify that relationship between input data and output data.
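As a taste, here's a minimal sketch of the idea, assuming scikit-learn (the toy data is illustrative, not the course's dataset):

```python
# Minimal linear-regression sketch: fit y ≈ a*x + b to toy data.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4]])    # input variable
y = np.array([2.1, 3.9, 6.2, 7.8])    # roughly y = 2x

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)  # estimated slope and intercept
print(model.predict([[5]]))           # prediction for an unseen input
```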
There is often a difference between the actual values and the values predicted by a regressor, so we need to keep a check on its accuracy. This video will enable us to do that.
Linear regressors can be inaccurate when outliers disrupt the model, so we need to regularize it. We will see that in this video.
A linear model fails to capture the natural curve of datapoints, which makes it quite inaccurate. So, let’s go through the polynomial regressor to see how we can improve on that.
Applying regression concepts to solve real-world problems can be quite tricky. We will explore how to do it successfully.
We don’t always know which features contribute to the output and which don’t. It becomes critical to know that in case we have to omit one. This video will help you compute their relative importance.
There might be some problems where the basic regression methods we’ve learned won’t help. One such problem is bicycle demand distribution. You will see how to solve that here.
Evaluating the accuracy of a classifier is an important step in the world of machine learning. We need to learn how to use the available data to get an idea as to how this model will perform in the real world. This is what we are going to learn in this section.
Despite the word regression being present in the name, logistic regression is actually used for classification purposes. Given a set of datapoints, our goal is to build a model that can draw linear boundaries between our classes. It extracts these boundaries by solving a set of equations derived from the training data.
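A minimal sketch of the idea, assuming scikit-learn (the toy points and labels are illustrative):

```python
# Minimal logistic-regression sketch: learn a linear class boundary.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1, 1], [2, 1], [6, 5], [7, 6]])
y = np.array([0, 0, 1, 1])  # two linearly separable classes

clf = LogisticRegression().fit(X, y)
print(clf.predict([[1.5, 1.2], [6.5, 5.5]]))  # -> [0 1]
```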
Bayes’ Theorem, which has been widely used in probability to determine the outcome of an event, enables us to classify the data in a smarter way. Let us use its concept to make our classifier more amazing.
While working with data, splitting data correctly and logically is an important task. Let’s see how we can achieve this in Python.
In order to make the splitting of the dataset more robust, we repeat the process with different subsets. If we just fine-tune the model for a particular subset, we may end up overfitting it, and it may then fail to perform well on unknown data. Cross-validation ensures accuracy in such situations.
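A minimal sketch of the idea, assuming scikit-learn (the iris dataset and the model are illustrative):

```python
# Minimal cross-validation sketch: score a classifier on 5 different
# train/test splits instead of a single one.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores)         # one accuracy score per fold
print(scores.mean())  # average performance across folds
```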
When we want to fine-tune our algorithms, we need to understand how the data gets misclassified before we make these changes. Some classes are worse than others, and the confusion matrix will help us understand this.
Let's see how we can apply classification techniques to a real-world problem. We will use a dataset that contains some details about cars, such as number of doors, boot space, maintenance costs, and so on, to analyze this problem.
Let’s see how the performance gets affected as we change the hyperparameters. This is where validation curves come into the picture. These curves help us understand how each hyperparameter influences the training score.
Learning curves help us understand how the size of our training dataset influences the machine learning model. This is very useful when you have to deal with computational constraints. Let's go ahead and plot the learning curves by varying the size of our training dataset.
Let’s see how we can build a classifier to estimate the income bracket of a person based on 14 attributes.
Building regressors and classifiers can be a bit tedious. Supervised learning models like SVM help us to a great extent. Let’s see how we can work with SVM.
There are various kernels used to build nonlinear classifiers. Let’s explore some of them and see how we can build a nonlinear classifier.
A classifier often gets biased when there are more datapoints in a certain class. This can turn out to be a big problem. We need a mechanism to deal with this. Let’s explore how we can do that.
Let’s explore how we can train an SVM to compute the output confidence level of a new datapoint when it is classified into a known category.
It’s critical to evaluate the performance of a classifier, and that performance depends on certain hyperparameters. Let’s explore how to find the right ones.
Now that we’ve learned the concepts of SVM thoroughly, let’s see if we can apply them to real-world problems.
We’ve already used SVM as a classifier to predict events. Let’s explore whether or not we can use it as a regressor for estimating traffic.
The k-means algorithm is one of the most popular clustering algorithms, which is used to divide the input data into k subgroups using various attributes of the data. Let’s see how we can implement it in Python for clustering data.
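A minimal sketch, assuming scikit-learn (the toy points and k=2 are illustrative):

```python
# Minimal k-means sketch: divide 2D points into k=2 subgroups.
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster index assigned to each datapoint
print(kmeans.cluster_centers_)  # the k learned centroids
```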
Vector quantization is popularly used in image compression, where we store each pixel using fewer bits than the original image to achieve compression.
Mean shift is a powerful unsupervised learning algorithm that's used to cluster datapoints. It treats the distribution of datapoints as a probability density function and tries to find the modes in the feature space. Let’s see how to use it in Python.
We often need to segregate data and group it for the purpose of analysis and more. We can achieve this in Python using agglomerative clustering. Let’s see how we can do it.
In supervised learning, we just compare the predicted values with the original labels to compute their accuracy. In unsupervised learning, we don't have any labels. Therefore, we need a way to measure the performance of our algorithms. Let’s see how we could evaluate their performance.
Wouldn't it be nice if there were a method that can just tell us the number of clusters in our data? This is where Density-Based Spatial Clustering of Applications with Noise (DBSCAN) comes into the picture. Let us see how we can work with it.
How do we operate under the assumption that we don't know how many clusters there are? When the number of clusters is unknown, we can use an algorithm called Affinity Propagation. Let's see how we can use unsupervised learning for stock market analysis with it.
What can we do when we don't have labeled data available but it's still important to segment the market so that people can target individual groups? Let’s learn to build a customer segmentation model for this situation.
One of the major parts of any machine learning system is the data processing pipeline. Instead of calling functions in a nested way, it's better to use the functional programming paradigm to build the combination. Let's take a look at how to combine functions to form a reusable function composition.
The scikit-learn library has provisions to build machine learning pipelines. We just need to specify the functions, and it will build a composed object that makes the data go through the whole pipeline. Let’s see how to build it in Python.
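A minimal sketch of such a pipeline (the steps and the synthetic data are illustrative):

```python
# Minimal scikit-learn Pipeline sketch: scaling chained with a classifier,
# so data flows through both steps with a single fit/predict call.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=100, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),    # step 1: normalize the features
    ("clf", LogisticRegression()),  # step 2: classify
])
pipe.fit(X, y)
print(pipe.score(X, y))
```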
While working with the training dataset, we need to make decisions based on the number of nearest neighbors in it. This can be achieved with the help of the NearestNeighbors class in Python. Let’s see how to do it.
When we want to find the class to which an unknown point belongs, we find the k-nearest neighbors and take a majority vote. Let's take a look at how to construct this.
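A minimal sketch of the majority vote in action, assuming scikit-learn (toy data only):

```python
# Minimal k-nearest-neighbors sketch: the 3 closest training points
# vote on the class of each unknown point.
from sklearn.neighbors import KNeighborsClassifier

X = [[0, 0], [0, 1], [5, 5], [5, 6]]
y = [0, 0, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[0.5, 0.5], [5.2, 5.4]]))  # -> [0 1]
```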
A good thing about the k-nearest neighbors algorithm is that it can also be used as a regressor. Let’s see how to do this!
In order to find users in the database who are similar to a given user, we need to define a similarity metric. The Euclidean distance score is one such metric that we can use to compute the distance between datapoints. Let’s look at this in more detail in this video.
The Euclidean distance score is a good metric, but it has some shortcomings. Hence, Pearson correlation score is frequently used in recommendation engines. Let's see how to compute it.
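A minimal sketch, assuming SciPy (the rating vectors are illustrative):

```python
# Minimal Pearson correlation sketch between two users' ratings.
import numpy as np
from scipy.stats import pearsonr

user_a = np.array([4.0, 3.5, 5.0, 2.0])
user_b = np.array([4.5, 3.0, 5.0, 2.5])

score, _ = pearsonr(user_a, user_b)  # coefficient in [-1, 1], plus a p-value
print(score)  # close to 1.0 for users with similar rating patterns
```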
One of the most important tasks in building a recommendation engine is finding users that are similar. Let's see how to do this in this video.
Now that we’ve built all the different parts of a recommendation engine, we are ready to generate movie recommendations. Let’s see how to do that in this video.
With tokenization we can define our own conditions to divide the input text into meaningful tokens. This gives us the solution for dividing a chunk of text into words or into sentences. Let's take a look at how to do this.
During text analysis, it's useful to extract the base form of the words to extract some statistics to analyze the overall text. This can be achieved with stemming, which uses a heuristic process to cut off the ends of words. Let's see how to do this in Python.
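A minimal sketch, assuming NLTK's Porter stemmer (the words are illustrative):

```python
# Minimal stemming sketch: a heuristic chop of word endings.
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["running", "flies", "easily"]:
    print(word, "->", stemmer.stem(word))
# running -> run, flies -> fli, easily -> easili
```

Note that some outputs ("fli", "easili") aren't real words; that's exactly the shortcoming the next video's lemmatization addresses.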
Sometimes the base words that we obtain using stemmers don't really make sense. Lemmatization solves this problem by using a vocabulary and morphological analysis of words to remove inflectional endings. Let's take a look at how to do this in this video.
When you deal with a really large text document, you need to divide it into chunks for further analysis. In this video, we will divide the input text into a number of pieces, where each piece has a fixed number of words.
When we deal with text documents that contain millions of words, we need to convert them into some kind of numeric representation so as to make them usable for machine learning algorithms. A bag-of-words model is what helps us achieve this task quite easily.
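A minimal sketch, assuming scikit-learn 1.0 or newer (the three tiny documents are illustrative):

```python
# Minimal bag-of-words sketch: each document becomes a word-count vector.
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat", "the dog sat", "the cat ran"]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)

print(vectorizer.get_feature_names_out())  # the learned vocabulary
print(X.toarray())                         # one count vector per document
```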
The goal of text classification is to categorize text documents into different classes. This is an extremely important analysis technique in NLP. Let us see how we can build a text classifier for this purpose.
Identifying the gender of a name is an interesting task in NLP. Gender recognition is also a part of many artificial intelligence technologies. Let us see how to identify gender in Python.
How could we discover the feelings or sentiments of different people about a particular topic? This video helps us to analyze that.
With topic modeling, we can uncover some hidden thematic structure in a collection of documents. This will help us in organizing our documents in a better way so that we can use them for analysis. Let’s see how we can do it!
Reading an audio file and visualizing the signal is a good starting point that gives us a good understanding of the basic structure of audio signals. So let us see in this video how we could do it!
Audio signals consist of a complex mixture of sine waves of different frequencies, amplitudes, and phases. A lot of information is hidden in the frequency content of an audio signal, so it’s necessary to transform the audio signal into the frequency domain. Let’s see how to do this.
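A minimal sketch of the transform, assuming NumPy (the synthetic 440 Hz tone is illustrative):

```python
# Minimal frequency-domain sketch: FFT of a pure 440 Hz sine wave.
import numpy as np

fs = 8000                             # sampling rate in Hz
t = np.arange(0, 1, 1 / fs)
signal = np.sin(2 * np.pi * 440 * t)  # one second of a 440 Hz tone

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), 1 / fs)
print(freqs[np.argmax(spectrum)])     # -> 440.0, the dominant frequency
```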
We can use NumPy to generate audio signals. As we know, audio signals are complex mixtures of sinusoids. Let’s see how we can generate audio signals with custom parameters.
Music has been explored for centuries, and technology has set new horizons for playing with it. We can also create music notes in Python. Let’s see how we can do this.
When we deal with signals and want to use them as input data for analysis, we need to convert them into the frequency domain. So, let’s get hands-on with it!
A Hidden Markov Model (HMM) represents probability distributions over sequences of observations. It allows you to find the hidden states so that you can model the signal. Let us explore how we can use it to perform speech recognition.
This video will walk you through building a speech recognizer by using the audio files in a database. We will use seven different words, where each word has 15 audio files. Let’s go ahead and do it!
Let’s understand how to convert a sequence of observations into time series data and visualize it. We will use a library called pandas to analyze time series data. At the end of this video, you will be able to transform data into the time series format.
Extracting information from various intervals in time series data and using dates to handle subsets of our data are important tasks in data mining. Let’s see how we can slice time series data using Python.
You can filter the data in many different ways. The pandas library allows you to operate on time series data in any way that you want. Let's see how to operate on time series data.
One of the main reasons that we want to analyze time series data is to extract interesting statistics from it. This provides a lot of information regarding the nature of the data. Let’s see how to extract these stats.
Hidden Markov Models are really powerful when it comes to sequential data analysis. They are used extensively in finance, speech analysis, weather forecasting, sequencing of words, and so on. We are often interested in uncovering hidden patterns that appear over time. Let’s see how we can use them.
Conditional Random Fields (CRFs) are probabilistic models used to analyze structured data and to label and segment sequential data. Let us see how we can use them on our input dataset!
This video will get you hands-on with analyzing stock market data and understanding the fluctuations in the stocks of different companies. So let’s see how to do this!
OpenCV is the world's most popular library for computer vision. It enables us to analyze images and do a lot with them. Let’s see how to use it!
When working with images, it is essential to detect the edges to process the image and perform different operations with it. Let’s see how to detect edges of the input image in Python.
The human eye likes contrast! This is the reason that almost all camera systems use histogram equalization to make images look nice. This video will walk you through the use of histogram equalization in Python.
One of the essential steps in image analysis is to identify and extract the salient features for the purpose of computer vision. This can be achieved with corner detection techniques and SIFT feature points in Python. This video will enable you to achieve this goal!
When we build object recognition systems, we may want to use a different feature detector before we extract features using SIFT; that will give us the flexibility to cascade different blocks to get the best possible performance. Let’s see how to do it with Star feature detector.
Have you ever wondered how you could build image signatures? If yes, this video will take you through creating features by using visual codebook, which will enable you to achieve this goal. So, let’s dive in and watch it!
We can construct a bunch of decision trees that are based on our image signatures, and then train the forest to make the right decision. Extremely Random Forests (ERFs) are used extensively for this purpose. Let’s dive in and see how to do it!
While dealing with images, we tend to tackle problems with the contents of unknown images. This video will enable you to build an object recognizer which allows you to recognize the content of unknown images. So, let’s see it!
Webcams are widely used for real-time communications and for biometric data analysis. This video will walk you through capturing and processing video from your webcam.
Haar cascade extracts a large number of simple features from the image at multiple scales. The simple features are basically edge, line, and rectangle features that are very easy to compute. It is then trained by creating a cascade of simple classifiers. Let’s see how we can detect a face with it!
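A minimal sketch of the detection step, assuming the opencv-python package and its bundled frontal-face cascade; "photo.jpg" is a placeholder path:

```python
# Minimal Haar-cascade face-detection sketch.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
img = cv2.imread("photo.jpg")          # placeholder input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
for (x, y, w, h) in faces:             # draw a box around each detected face
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("faces.jpg", img)
```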
The Haar cascades method can be extended to detect all types of objects. Let's see how to use it to detect the eyes and nose in the input video.
Principal Components Analysis (PCA) is a dimensionality reduction technique that's used very frequently in computer vision and machine learning. It’s used to reduce the dimensionality of the data before we can train a system. This video will take you through the use of PCA.
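A minimal sketch, assuming scikit-learn (the random data with one redundant feature is illustrative):

```python
# Minimal PCA sketch: project 3D data with a redundant feature down to 2D.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=100)  # third feature is redundant

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)                # (100, 2)
print(pca.explained_variance_ratio_)  # variance captured per component
```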
What if you need to reduce the number of dimensions in unorganized data? PCA, which we used in the last video, is inefficient in such situations. Let us see how we can tackle this situation.
When we work with data or signals, we generally receive them in raw form; that is, mixed with unwanted components. It is essential to separate these components so that we can work with the signals. This video will enable you to achieve this goal.
We are now finally ready to build a face recognizer! Let’s see how to do it!
Let’s start our neural network adventure with a perceptron, which is a single neuron that performs all the computations.
Now that we know how to create a perceptron, let's create a single-layer neural network which will consist of multiple neurons in a single layer.
Let’s build a deep neural network, which will have multiple layers. There will be some hidden layers between the input and output layers. So, let us explore it.
Let us see how we can use vector quantization in machine learning and computer vision.
This video will walk you through analyzing sequential and time series data and enable you to extend generic models for them.
Let us look at how to use neural networks to perform optical character recognition to identify handwritten characters in images.
Let's build a neural-network-based optical character recognition system.
3D visualization is very important in data representation. So we need to learn the simple yet effective method of plotting 3D plots.
You are going to learn to plot bubble plots in this video.
When there are distribution tables and various labels, pie charts are handy for expressing the data.
To keep track of data with respect to time, we need to plot date-formatted time series data.
When we need to compare data of two different entities, we need to plot histograms. This video is going to help you do that.
Heat maps are useful when data in two groups is associated point by point.
When we visualize real-time signals, it becomes imperative to animate the dynamic signals so that they are updated continuously. This video will help you do that.
We need the air pricing data from a website to work with. You will learn to do that in this section.
After determining the source of the data, we need to retrieve the data.
The DOM is the structure of elements that forms a web page. We need to get some details of the structure by parsing it.
To get real-time alerts when a particular event occurs, we need to use IFTTT.
To deploy our app, we'll move on to working in a text editor. You will put together the entire code to get the final result.
Before deciding strategies for the IPO market, we need to study the IPO market and derive inferences from it.
The consideration and inclusion of all the factors affecting the market is called feature engineering, and modeling it is as important as the data used in building the model.
Instead of predicting the value of the return, you can predict whether an IPO is a trade you should buy into or not. The model used is logistic regression.
It is important to know which features will make the offering successful. You can find that out in this section.
To create a model, we first have to have a training dataset. We will use the Pocket app for this.
You can't move forward with just the URLs of the stories. You would need the full article. So let's check out how to do that in this video.
Machine learning models work on numerical data. So we will need to transform our text into numerical data using NLP.
You will learn about the linear support vector machine in this video. The SVM algorithm separates data points linearly into classes.
We have provided a training dataset. But we also need a stream of articles as a testing dataset to run our model against.
It would make life easier if you get a personalized e-mail of your stories, right? So you will learn how to do that in this video.
Research is the most important thing before we start working on designing a strategy.
Once you have studied the various aspects of the market, it is time to develop a trading strategy. You will learn it in this video.
Now that we have our baseline, we will build our first regression model for prediction of stocks.
Another algorithm to work with is dynamic time warping. It provides a metric that tells us how similar two time series are.
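A minimal hand-rolled sketch of the classic dynamic-programming recurrence behind it (the two short series are illustrative):

```python
# Minimal dynamic time warping sketch: smaller distance = more similar.
import numpy as np

def dtw_distance(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # best of insertion, deletion, and match
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

print(dtw_distance([1, 2, 3, 4], [1, 1, 2, 3, 4]))  # 0.0: same shape, shifted
```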
It is very important to understand machine learning's concepts before working with it.
In order to work with images, we need to transform them into a matrix form, that is, numerical form.
We will use algorithms to find similar images in the database.
We will combine what we have studied so far to build an image similarity engine.
Design of chatbots consists of parameters like mode of communication, the content, and so on. You will look at that in this video.
Having looked at the working of a chatbot, we will now build a chatbot.
What is Deep Learning, and when is it the way to go?
How to avoid programming Deep Learning from scratch? Let’s take a look at it in this video.
How to get our first deep neural network trained?
How can we avoid making a differentiation of functions and make backpropagation easier?
How do Keras and other libraries that use Theano work behind the scenes?
How does Keras work? How does one write a basic, fully connected neural network layer in Keras?
Understand what convolutional neural networks are and how to use them. How can we write convolutional layers with Python?
How can we solve complex image datasets (for example, cats versus dogs) without training a full model from scratch?
How does Keras work?
We will solve complex image datasets with pretrained models: classifying cats versus dogs.
How can one define neural network layers with internal states?
Recurrent or convolutional: how can one know which layer to use?
How can we classify sentiments from text?
How can we automatically describe an image in English?
What is TensorFlow?
Packt has been committed to developer learning since 2004. A lot has changed in software since then - but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live, and how to put them to work.
With an extensive library of content - more than 4,000 books and video courses - Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.
From skills that will help you develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource for making you a better, smarter developer.
Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.