Do you want to build complex deep learning models in Keras? Do you want to use neural networks for classifying images, predicting prices, and classifying samples into several categories?
Keras is the most powerful library for building neural network models in Python. In this course we review the central techniques in Keras, with many real-life examples. We focus on the practical computational implementations, and we avoid using any math.
The student is required to be familiar with Python and machine learning. Some general knowledge of statistics and probability is recommended, but not strictly necessary.
Among the many examples presented here, we use neural networks to tag images of the River Thames or of the street; to classify edible and poisonous mushrooms; to predict the sales of several video games for multiple regions; to identify bolts and nuts in images; etc.
We run most of our examples on Windows, but we show how to set up an AWS machine and run our examples there. In terms of the course curriculum, we cover most of what Keras can actually do: the Sequential model, the Model API, convolutional neural nets, LSTM nets, etc. We also show how to bypass Keras entirely and build the models directly in Theano/Tensorflow syntax (although this is quite complex!).
After taking this course, you should feel comfortable building neural nets for time sequences, image classification, and pure classification and/or regression. All the lectures here can be downloaded and come with the corresponding material.
We explain how to install Keras and Theano and we explain the basics behind Keras. If you want to use Tensorflow instead of Theano, a very similar approach is used.
We show some basic symbolic code in Theano, which is useful for explaining what Keras will do when we build a model. In fact, Keras will use Theano/Tensorflow to do all the tensor operations necessary for the neural network that we build in Keras.
Running complex neural networks on our machines is sometimes not feasible due to either memory or speed requirements. AWS (Amazon Web Services) provides a cheap and scalable solution, especially because there are existing images that we can use (which contain all the necessary software: Python, Keras, CUDA), simplifying the installation process. We show how to create an instance on AWS, how to run code there, and how to upload and download files.
Keras provides two ways of constructing models: the Sequential approach and the Model API. We introduce the Sequential approach. It allows us to construct models by easily stacking layers together.
Every layer of a neural network works by multiplying the inputs by the layer's weights, and after each weighted sum is computed, an activation function is applied. In general, these activations are nonlinear, and we can choose among several in Keras: sigmoid, elu, relu, tanh, etc.
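As a rough illustration (not the course's actual code), the computation done by a single fully connected layer can be sketched in NumPy; the weight values here are arbitrary:

```python
import numpy as np

def sigmoid(z):
    # Squash each value into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# A layer with 3 inputs and 2 neurons: a weight matrix W (3x2) and biases b (2,)
W = np.array([[0.1, -0.2],
              [0.4,  0.3],
              [-0.5, 0.2]])
b = np.array([0.0, 0.1])

x = np.array([1.0, 2.0, 3.0])      # one input sample
activation = sigmoid(x @ W + b)    # weighted sum, then the nonlinearity
print(activation.shape)            # (2,) -- one value per neuron
```

Swapping `sigmoid` for `np.tanh` (or a relu such as `np.maximum(0, z)`) changes only the nonlinearity, which is exactly what choosing a different activation in Keras does.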
We explain the different layers that we have in Keras. The most fundamental one is the Dense() layer, which is a fully connected layer. But there are certainly other very important ones, and we review the most important of them.
We explain how to train a model in Keras.
Loss functions are used in Keras to compute the final loss for our models (how well is our model performing?). Keras minimizes these loss functions by using specialized optimization algorithms. Of course, the loss function depends on which specific problem we are trying to solve: we need certain loss functions for classification problems, and different ones for regression problems.
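As a sketch of what two common losses actually compute (illustrative only; in practice Keras computes these for us), with made-up predictions:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: a typical choice for regression problems
    return np.mean((y_true - y_pred) ** 2)

def binary_crossentropy(y_true, y_pred):
    # A typical choice for two-class classification problems
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1.0, 0.0, 1.0])   # true labels
y_pred = np.array([0.9, 0.1, 0.8])   # model's predicted probabilities

print(mse(y_true, y_pred))                  # 0.02
print(binary_crossentropy(y_true, y_pred))
```

Both losses are small when predictions are close to the truth and grow as they diverge; the optimizer simply pushes the weights in the direction that shrinks this number.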
Overfitting occurs when our model fits the training data too closely. The problem is that when this happens, the model will perform very badly in an out-of-sample scenario. Remember that we use our data to fit a model, and we then use that model to make predictions for real (out-of-sample) observations.
We use a real dataset containing information about several wines; in particular, we have different chemical measurements about them. We want to classify these wines into one of three categories, and we finally achieve an excellent accuracy using a neural network with several layers. This is a good example to introduce the categorical cross-entropy loss, which is designed to tackle multi-class classification problems.
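A minimal sketch of the categorical cross-entropy idea (the numbers are invented, and in the course Keras handles all of this internally): the network's raw outputs are turned into class probabilities with a softmax, and the loss penalizes a low probability on the true class:

```python
import numpy as np

def softmax(z):
    # Turn raw scores into probabilities that sum to 1
    e = np.exp(z - z.max())
    return e / e.sum()

def categorical_crossentropy(y_true, y_pred):
    # y_true is one-hot, so only the probability of the true class matters
    return -np.sum(y_true * np.log(y_pred))

logits = np.array([2.0, 1.0, 0.1])   # raw network outputs for 3 classes
probs = softmax(logits)
y_true = np.array([1.0, 0.0, 0.0])   # this sample belongs to class 0
loss = categorical_crossentropy(y_true, probs)
```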
We use a real dataset containing information about different mushrooms. We want to predict whether they are edible or not. We use a neural network with a binary cross entropy loss, because we have just two categories. We achieve an excellent in-sample accuracy.
In this case, we want to predict house prices for a particular county in the US. This is our first example of neural networks used for a regression problem (when the variable that we want to predict is numeric). In this case, we naturally need to use a different loss function: we can choose among MSE, MAE, and several others.
We explain how SGD works. We discuss how the learning rate affects the results, and how the minimization algorithm that Keras uses works.
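The effect of the learning rate can be sketched with a toy one-dimensional example (purely illustrative, not the course's code): gradient descent on f(w) = (w - 3)², whose gradient is 2(w - 3), minimized at w = 3:

```python
# Gradient descent on f(w) = (w - 3)^2; the gradient is 2 * (w - 3)
def sgd(lr, steps=100, w=0.0):
    for _ in range(steps):
        w -= lr * 2 * (w - 3)   # step against the gradient
    return w

good = sgd(lr=0.1)    # converges very close to the minimum w = 3
tiny = sgd(lr=0.001)  # same number of steps, but barely moves toward 3
```

With a learning rate that is too small, training "works" but takes far too many steps; with one that is too large, the iterates can overshoot and diverge.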
We explain how backpropagation works. It is the fundamental technique used for training neural networks. And we review how the inner math works (how the chain rule is used). This is the most technical lecture of this course.
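The chain-rule idea can be checked on a tiny two-weight network (a sketch with made-up values, not the course's code): the analytic gradient, obtained by multiplying the local derivatives, matches a numerical finite-difference estimate:

```python
import math

# Tiny network: y = sigmoid(w2 * sigmoid(w1 * x)); loss = (y - t)^2
def forward(w1, w2, x):
    s = lambda z: 1 / (1 + math.exp(-z))
    h = s(w1 * x)          # hidden activation
    return h, s(w2 * h)    # output activation

def loss(w1, w2, x, t):
    _, y = forward(w1, w2, x)
    return (y - t) ** 2

def grad_w1(w1, w2, x, t):
    # Gradient of the loss w.r.t. w1 via the chain rule
    h, y = forward(w1, w2, x)
    dL_dy = 2 * (y - t)
    dy_dh = y * (1 - y) * w2        # derivative through the output sigmoid
    dh_dw1 = h * (1 - h) * x        # derivative through the hidden sigmoid
    return dL_dy * dy_dh * dh_dw1   # chain rule: multiply the pieces

w1, w2, x, t = 0.5, -0.3, 1.2, 1.0
analytic = grad_w1(w1, w2, x, t)
eps = 1e-6
numeric = (loss(w1 + eps, w2, x, t) - loss(w1 - eps, w2, x, t)) / (2 * eps)
```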
There are several optimizers that can be used in Keras. All of them are variations of stochastic gradient descent. We explain the two general parameters that can be used with all of them.
We explain the basics behind the different optimizers in Keras. And we show how to tweak the different parameters that each optimizer has.
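One widely used variation of plain SGD is momentum, which most of these optimizers build on in some form. A toy sketch (illustrative only, reusing the f(w) = (w - 3)² example; the hyperparameter values are arbitrary):

```python
# SGD with momentum on f(w) = (w - 3)^2; velocity accumulates past gradients
def sgd_momentum(lr=0.05, momentum=0.9, steps=200):
    w, v = 0.0, 0.0
    for _ in range(steps):
        grad = 2 * (w - 3)
        v = momentum * v - lr * grad   # mix the old direction with the new gradient
        w += v
    return w

w_final = sgd_momentum()
```

The velocity term smooths the updates and can speed up progress along consistent gradient directions, which is the intuition behind the momentum parameter exposed by several Keras optimizers.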
We show how to pull the different weights from each layer. This is particularly useful when we want to understand what each layer has inside, which is very relevant when our model is not training properly.
Sometimes, due to the size of the data, it might not be possible to load everything into memory at once for a single Keras model. But what we can do is feed our Keras model with several batches of data. We use this for predicting car prices on eBay Germany.
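The batching idea can be sketched with a small generator (illustrative only; the array contents are made up): the data is served in chunks, so the full dataset never has to be in memory at once.

```python
import numpy as np

def batches(X, y, batch_size):
    # Yield the data in chunks of at most batch_size rows
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size], y[start:start + batch_size]

X = np.arange(10).reshape(10, 1).astype(float)
y = 2 * X.ravel()

sizes = [len(xb) for xb, yb in batches(X, y, batch_size=4)]
print(sizes)  # [4, 4, 2]
```

Each yielded chunk would be passed to the model for one incremental training update, instead of fitting on the whole dataset at once.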
Keras' wonderful Model API allows us to define very complex architectures. In this case, we use it to merge several layers.
We use a multi-output model to predict the sales for video games for North America, Europe and Japan. We do this using the model API.
We show how to wrap Keras inside Scikit-learn to compare different models using cross validation. This is particularly relevant for neural networks, because they tend to overfit. So comparing different models is not feasible using the very same dataset that we used for training. Cross validation provides an elegant solution to this.
We show how to wrap Keras inside Scikit-learn to identify the best parameters via cross-validation. This is a robust way of identifying the appropriate epoch value, batch size, etc. Cross validation (and in particular k-fold cross validation) uses every observation for both training and testing, so it is a good idea to use, especially when your sample is rather small. In particular, we use GridSearchCV, which constructs a grid containing different parameter values: we even use it to identify the optimal number of neurons in a hidden layer!
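The k-fold splitting underneath all of this can be sketched in a few lines (illustrative only; in the course, Scikit-learn produces these splits for us):

```python
import numpy as np

def kfold_indices(n, k):
    # Split range(n) into k folds; each fold serves as the test set exactly once
    idx = np.arange(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

splits = list(kfold_indices(10, 5))
```

Every observation ends up in a test set exactly once, which is why k-fold cross validation makes good use of small samples.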
Images are used frequently in machine learning, both for deep neural networks and for traditional algorithms (SVM, random forests, etc). We review the basics behind image loading, and we present a class that can be used to read an entire directory and build the proper matrices needed for doing machine learning. This class converts images into black-and-white (0, 1) matrices, so it should only be used when reading images that are already in black-and-white format.
We present a similar class, but now it is designed to accommodate 3-channel image data (RGB images), which we typically need to treat as a 4-dim tensor (samples, channels, height, width). This class will be useful for doing convolutional neural nets in the next section.
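The core transformation can be sketched with a thresholding step (illustrative only; the pixel values below are made up, and the course's class does the directory reading for us):

```python
import numpy as np

# A tiny 4x4 "grayscale" image with pixel values in [0, 255]
img = np.array([[  0, 255,  10, 240],
                [250,   5, 245,   0],
                [  0, 250,   0, 255],
                [255,   0, 255,  10]])

bw = (img > 127).astype(int)   # threshold into a binary (0, 1) matrix
```

Stacking one such matrix per image file yields the kind of input matrix that both neural networks and traditional algorithms expect.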
We introduce the multilayer perceptron neural network. It is a feedforward network using non-linear activation functions. In its simplest case, with no hidden layer and a sigmoid activation, it reduces to a logistic regression model.
We actually code a multilayer perceptron in pure Theano (doing the appropriate Tensor operations). In fact this is what Keras is doing for us, when we code an MLP network. In general, for other network configurations, Keras does a very similar thing: it builds the appropriate code in Theano/Tensorflow. This lecture is rather technical, so it's only necessary if you want to understand the inner workings of Keras.
We continue with our previous lecture, coding a pure MLP neural network in Theano by doing the tensor operations.
We code a multilayer perceptron network in Keras. It builds exactly the same structure that we used in the two previous lectures. We use this network for classifying shapes in drawings: squares and triangles. We achieve an excellent (100%) accuracy.
Introduction to Convolutional Neural Networks
How do convolutions and max-pooling work? What are the necessary input dimensions for a 2D convolution, and what are the dimensions of its output?
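The dimension question can be sketched directly (illustrative only, not the course's code): a "valid" convolution with a k×k kernel shrinks each spatial dimension by k - 1, and 2×2 max-pooling halves it. Note that, as in most deep learning libraries, what is computed below is technically cross-correlation:

```python
import numpy as np

def conv2d_valid(image, kernel):
    # "Valid" 2D convolution: slide the kernel over every position where it fits
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))   # output shrinks by kernel size - 1
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

def maxpool2x2(x):
    # Keep the maximum of each non-overlapping 2x2 block
    H, W = x.shape
    return x[:H - H % 2, :W - W % 2].reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.ones((3, 3)) / 9.0        # a simple averaging filter
feat = conv2d_valid(image, kernel)    # (6-3+1, 6-3+1) = (4, 4)
pooled = maxpool2x2(feat)             # (2, 2)
```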
Using convolutional neural networks to predict different hand gestures
We use a very similar framework to identify nuts and bolts using images containing these pieces on a wooden desk
We show how to use neural networks to classify images of the River Thames vs images taken from the streets. This is similar to how many automatic tagging technologies work (software that tells you if your image was taken at the beach, or in a park, or in the mountains). We achieve 100% accuracy in both in-sample and out-of-sample scenarios.
Brief introduction to recurrent neural networks
Backpropagation uses the chain rule from calculus to compute the partial derivatives of the loss function with respect to the weights. This has an undesired consequence: when we have multiple layers and we need to do many multiplications, it can well happen that the gradient fades to zero. The practical consequence is that training the initial layers' weights can take far too long, because the gradient is not properly propagated. This is particularly relevant for recurrent neural networks, as they reuse previous layers (from previous time periods).
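A quick numerical sketch of why the gradient fades (illustrative only): the sigmoid derivative is at most 0.25, and backpropagation multiplies one such factor per layer, so the product shrinks exponentially with depth.

```python
from math import exp

def sigmoid(z):
    return 1 / (1 + exp(-z))

def grad_product(n_layers, z=0.0, w=1.0):
    # Backprop multiplies one sigmoid'(z) * w factor per layer traversed
    s = sigmoid(z)
    factor = s * (1 - s) * w    # at z = 0 this is 0.25, the sigmoid's maximum slope
    return factor ** n_layers

shallow = grad_product(2)    # 0.0625
deep = grad_product(30)      # ~8.7e-19: effectively zero
```

A recurrent network unrolled over 30 time steps faces exactly this product, which is why plain RNNs struggle to learn long-range dependencies.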
We introduce the LSTM model, which solves the vanishing gradient problem by intelligently reformulating the neural network model. It uses gates for forgetting information, adding new information, and mixing new information with the information from previous periods. We use this model to predict house prices in London using AWS (Amazon Web Services).
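The gate mechanics can be sketched for a single time step (a simplified illustration with random weights and biases omitted; Keras' LSTM layer implements the full version for us):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, Wf, Wi, Wc, Wo):
    # One LSTM time step (biases omitted for brevity)
    z = np.concatenate([h_prev, x])   # previous hidden state + current input
    f = sigmoid(Wf @ z)               # forget gate: what to drop from memory
    i = sigmoid(Wi @ z)               # input gate: what new info to store
    c_tilde = np.tanh(Wc @ z)         # candidate memory content
    c = f * c_prev + i * c_tilde      # mix old memory with new information
    o = sigmoid(Wo @ z)               # output gate: what to expose
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
n_hidden, n_input = 4, 3
W = [0.1 * rng.normal(size=(n_hidden, n_hidden + n_input)) for _ in range(4)]
h, c = lstm_step(rng.normal(size=n_input),
                 np.zeros(n_hidden), np.zeros(n_hidden), *W)
```

Because the cell state `c` is updated additively (rather than being repeatedly squashed through an activation), gradients can flow across many time steps without vanishing as quickly.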
I have worked for 7+ years as a statistical programmer in the industry. I am an expert in programming, statistics, data science, and statistical algorithms, with wide experience in many programming languages. I am a regular contributor to the R community, with 3 published packages, and an expert SAS programmer. I have contributed to scientific statistical journals; my latest publication appeared in the Journal of Statistical Software.