Keras: Deep Learning in Python
4.0 (57 ratings)
Instead of using a simple lifetime average, Udemy calculates a course's star rating by considering a number of different factors such as the number of ratings, the age of ratings, and the likelihood of fraudulent ratings.
491 students enrolled
Build complex deep learning algorithms easily in Python
Created by Francisco Juretig
Last updated 7/2017
English [Auto-generated]
Current price: $10 Original price: $50 Discount: 80% off
30-Day Money-Back Guarantee
  • 10 hours on-demand video
  • 44 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Use Keras for classification and regression in typical data science problems
  • Use Keras for image classification
  • Define Convolutional neural networks
  • Train LSTM models for sequences
  • Process data into the specific shape that Keras expects for each problem
  • Code neural networks directly in Theano using tensor multiplications
  • Understand the different layers available in Keras
  • Design neural networks that mitigate the effect of overfitting using specific layers
  • Understand how backpropagation and stochastic gradient descent work
Requirements
  • Python
  • Some previous experience with data science/machine learning in Python is desirable
  • Basic data processing in Excel
  • Some knowledge of probability is advisable

Do you want to build complex deep learning models in Keras? Do you want to use neural networks for classifying images, predicting prices, and classifying samples in several categories?

Keras is one of the most powerful libraries for building neural network models in Python. In this course we review the central techniques in Keras, with many real-life examples. We focus on practical computational implementations and keep the math to a minimum.

Students should be familiar with Python and machine learning. Some general knowledge of statistics and probability is recommended, but not strictly necessary.

Among the many examples presented here, we use neural networks to tag images as showing the River Thames or the street, to classify edible and poisonous mushrooms, to predict the sales of several video games across multiple regions, and to identify bolts and nuts in images.

We run most of our examples on Windows, but we also show how to set up an AWS machine and run our examples there. In terms of the course curriculum, we cover most of what Keras can do, including the Sequential model, the Model API, convolutional neural nets, and LSTM nets. We also show how to bypass Keras and build the models directly in Theano/Tensorflow syntax (although this is quite complex!).

After taking this course, you should feel comfortable building neural nets for time sequences, image classification, and general classification and/or regression. All the lectures can be downloaded and come with the corresponding material.

Who is the target audience?
  • Students who are new to machine learning but already comfortable with Python
  • Business analytics professionals aiming to expand their toolkit of analytical techniques
Curriculum For This Course
39 Lectures
4 Lectures 58:33

Brief introduction to this course

Preview 17:23

We explain how to install Keras and Theano and cover the basics behind Keras. If you want to use Tensorflow instead of Theano, the approach is very similar.

Installing Keras

We show some basic symbolic code in Theano, which is useful for explaining what Keras does when we build a model. In fact, Keras uses Theano/Tensorflow to do all the tensor operations necessary for the neural network that we build.

Theano and Tensorflow

Running complex neural networks on our machines is sometimes not feasible due to either memory or speed requirements. AWS (Amazon Web Services) provides a cheap and scalable solution, especially because there are existing images that we can use (which contain all the necessary software: Python, Keras, Cuda), simplifying the installation process. We show how to create an instance on AWS, how to run code there, and how to upload and download files.

Running high performance code in AWS
Keras fundamentals
18 Lectures 04:39:30

Keras provides two ways of constructing models: the Sequential approach and the Model API. We introduce the Sequential approach, which lets us construct models by simply stacking layers together.

Introduction to the Sequential Model

Every layer of a neural network multiplies its inputs (the previous layer's neurons) by the layer's weights and sums the results; an activation function is then applied to each sum. In general, these activations are nonlinear, and Keras offers several to choose from: sigmoid, elu, relu, tanh, etc.

Preview 12:57
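
The layer computation described in this lecture can be sketched in plain Python. This is only an illustration of the idea, not Keras code: the `dense_forward` helper and the example weights are hypothetical names and values chosen for this sketch.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_forward(inputs, weights, biases, activation):
    """Forward pass of one fully connected layer.

    weights[j] holds the weights of output neuron j, one per input.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # Weighted sum of the inputs plus the bias term.
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        # The nonlinearity is applied after the sum is computed.
        outputs.append(activation(z))
    return outputs


# Hypothetical layer with 3 inputs and 2 neurons.
x = [1.0, -2.0, 0.5]
W = [[0.5, 0.25, -1.0], [-1.0, 0.5, 2.0]]
b = [0.0, 1.5]

relu_out = dense_forward(x, W, b, relu)
sigmoid_out = dense_forward(x, W, b, sigmoid)
```

The same weighted sums produce different outputs depending on the activation: relu zeroes the negative sum, while sigmoid squashes both sums into the (0, 1) range.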

We explain the different layers available in Keras. The most fundamental one is the Dense() layer, which is a fully connected layer, but there are other very important ones, and we review them as well.


We explain how to train a model in Keras


Loss functions are used in Keras to compute the final loss for our models (that is, how well the model is performing). Keras minimizes these loss functions using optimization algorithms. The right loss function depends on the specific problem we are trying to solve: classification problems need one kind of loss function, regression problems another, and so on.

Loss functions
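
To make the difference between regression and classification losses concrete, here is a minimal plain-Python sketch of four common ones. The function names mirror the loss names used in Keras, but these are hand-written illustrations (including the assumed `eps` clipping constant), not the library's implementation.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error, a typical regression loss."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error, a regression loss less sensitive to outliers."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Loss for two-category problems; y_pred holds probabilities of class 1."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Loss for several categories; y_true rows are one-hot encoded."""
    total = 0.0
    for t_row, p_row in zip(y_true, y_pred):
        total += -sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / len(y_true)


regression_mse = mse([3.0, 5.0], [2.0, 7.0])
regression_mae = mae([3.0, 5.0], [2.0, 7.0])
two_class_loss = binary_crossentropy([1], [0.5])
three_class_loss = categorical_crossentropy([[0, 1, 0]], [[0.2, 0.5, 0.3]])
```

In both cross entropy cases the model assigns probability 0.5 to the correct class, so the two losses coincide at log(2).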

Overfitting occurs when our model fits the training data too closely. When this happens, the model performs very badly in an out-of-sample scenario. Remember that we use our data to fit a model, and we then use that model to make predictions for real (out-of-sample) observations.

Overfitting: Gaussian Noise and Dropout layers
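
As a rough sketch of what these two regularization layers do, the following plain-Python functions mimic their behavior on a list of activations. The helper names and the `rng` argument are assumptions made for this illustration; the sketch uses the common "inverted dropout" formulation, in which the kept values are rescaled during training.

```python
import random


def dropout(values, rate, training, rng):
    """Zero each value with probability `rate` during training."""
    if not training:
        # At prediction time the layer passes values through unchanged.
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)  # rescale so the expected sum is preserved
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


def gaussian_noise(values, stddev, training, rng):
    """Add zero-mean Gaussian noise during training only."""
    if not training:
        return list(values)
    return [v + rng.gauss(0.0, stddev) for v in values]


rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]

train_out = dropout(activations, rate=0.5, training=True, rng=rng)
predict_out = dropout(activations, rate=0.5, training=False, rng=rng)
noisy_out = gaussian_noise(activations, stddev=0.1, training=True, rng=rng)
```

Both layers disturb the signal only while training, which makes it harder for the network to memorize individual training samples.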

We use a real dataset containing different chemical measurements for several wines, and we want to classify each wine into one of three categories. We achieve an excellent accuracy using a neural network with several layers. This is a good example to introduce the categorical cross entropy loss, which is designed for multi-class classification problems.

Wine classification

We use a real dataset containing information about different mushrooms. We want to predict whether they are edible or not. We use a neural network with a binary cross entropy loss, because we have just two categories. We achieve an excellent in-sample accuracy.

Mushroom classification

In this case, we want to predict the house prices for a particular county in the US. This is our first example of neural networks used for a regression problem (when the variable that we want to predict is numeric). In this case, we naturally need to use a different loss function: we can choose among the mse, mae, and several other ones.

Preview 13:31

We explain how SGD works. We discuss how the learning rate affects the results and how the minimization algorithm that Keras uses works.

Stochastic gradient descent
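
The effect of the learning rate can be seen in a very small example. The sketch below fits a single weight by stochastic gradient descent on made-up data; the `sgd_fit` function and the data are hypothetical and only meant to illustrate the update rule.

```python
import random


def sgd_fit(xs, ys, learning_rate, epochs, seed=0):
    """Fit y = w * x by stochastic gradient descent on the squared error."""
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in random order
        for i in indices:
            error = w * xs[i] - ys[i]
            gradient = 2.0 * error * xs[i]  # derivative of (w*x - y)^2 w.r.t. w
            w -= learning_rate * gradient   # step against the gradient
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated with a true weight of 2

w_good = sgd_fit(xs, ys, learning_rate=0.01, epochs=200)
w_slow = sgd_fit(xs, ys, learning_rate=0.0001, epochs=5)
```

With a reasonable learning rate the weight converges to the true value, while a very small learning rate and few epochs leave it close to its starting point.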

We explain how backpropagation works. It is the fundamental technique used for training neural networks. And we review how the inner math works (how the chain rule is used). This is the most technical lecture of this course. 

Preview 19:39
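
To illustrate how the chain rule is used, here is a minimal sketch of backpropagation for a tiny network with one sigmoid hidden neuron and one linear output. The network, the weights, and the function names are assumptions made for this example. The analytic gradients are checked against a numerical approximation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w1, w2, x, y):
    hidden = sigmoid(w1 * x)
    prediction = w2 * hidden
    return (prediction - y) ** 2


def backprop(w1, w2, x, y):
    # Forward pass, keeping the intermediate values.
    hidden = sigmoid(w1 * x)
    prediction = w2 * hidden
    # Backward pass: apply the chain rule from the loss back to each weight.
    d_prediction = 2.0 * (prediction - y)
    d_w2 = d_prediction * hidden
    d_hidden = d_prediction * w2
    d_z = d_hidden * hidden * (1.0 - hidden)  # sigmoid'(z) = s * (1 - s)
    d_w1 = d_z * x
    return d_w1, d_w2


def numerical_gradient(w1, w2, x, y, h=1e-6):
    # Central finite differences, used only to check the analytic result.
    d_w1 = (loss(w1 + h, w2, x, y) - loss(w1 - h, w2, x, y)) / (2 * h)
    d_w2 = (loss(w1, w2 + h, x, y) - loss(w1, w2 - h, x, y)) / (2 * h)
    return d_w1, d_w2


analytic = backprop(0.5, -1.5, 2.0, 1.0)
numeric = numerical_gradient(0.5, -1.5, 2.0, 1.0)
```

Each line of the backward pass multiplies the incoming gradient by one local derivative, which is exactly the chain rule applied layer by layer.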

There are several optimizers that can be used in Keras. All of them are variations of stochastic gradient descent. We explain the two general parameters that can be used with all of them.

Clipvalue and learning rate
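
A minimal sketch of how value clipping interacts with the learning rate in a single update step follows. The `clip_by_value` and `sgd_step` helpers are hypothetical and written for this illustration; they are not the Keras optimizer code.

```python
def clip_by_value(gradients, clipvalue):
    """Limit every gradient component to the range [-clipvalue, clipvalue]."""
    return [max(-clipvalue, min(clipvalue, g)) for g in gradients]


def sgd_step(weights, gradients, learning_rate, clipvalue=None):
    """One gradient descent update, with optional value clipping."""
    if clipvalue is not None:
        gradients = clip_by_value(gradients, clipvalue)
    return [w - learning_rate * g for w, g in zip(weights, gradients)]


weights = [1.0, 1.0, 1.0]
gradients = [0.2, -30.0, 4.0]  # the second component is unusually large

unclipped = sgd_step(weights, gradients, learning_rate=0.5)
clipped = sgd_step(weights, gradients, learning_rate=0.5, clipvalue=1.0)
```

Without clipping, the large gradient moves the second weight from 1.0 to 16.0 in one step; with clipping, no weight moves by more than the learning rate times the clip value.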

We explain the basics behind the different optimizers in Keras. And we show how to tweak the different parameters that each optimizer has. 


Locally connected layers and 1D Convolutions

We show how to pull the different weights from each layer. This is particularly useful when we want to understand what each layer has inside, which is very relevant when our model is not being trained properly. 

Pulling weights from Layers

Sometimes, due to the size of the data, it might not be possible to load everything into memory and pass it to a Keras model at once. What we can do instead is feed our Keras model with several batches of data. We use this for predicting car prices on eBay Germany.

Car Prices in Germany: Batch processing
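
The batching idea can be sketched with a plain Python generator. The rows, the column layout, and the `batch_generator` name are made up for this illustration and are not taken from the course dataset.

```python
def batch_generator(rows, batch_size):
    """Yield successive (features, targets) batches from a list of rows.

    The last column of each row is treated as the target.
    """
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        features = [row[:-1] for row in chunk]
        targets = [row[-1] for row in chunk]
        yield features, targets


# Hypothetical rows: [age_years, mileage_km, price_eur]
rows = [
    [3, 40000, 15000],
    [7, 120000, 6000],
    [1, 10000, 22000],
    [10, 180000, 2500],
    [5, 80000, 9000],
]

batches = list(batch_generator(rows, batch_size=2))
batch_sizes = [len(targets) for _, targets in batches]
```

In practice the rows would be read lazily from disk rather than held in a list, and each batch could be handed to a Keras model with a call such as `model.train_on_batch(features, targets)` after converting it to an array.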

Keras' wonderful Model API allows us to define very complex architectures. In this case, we use it to merge several layers.

The model API: Merging layers and more complex models

We use a multi-output model to predict the sales for video games for North America, Europe and Japan. We do this using the model API.

Videogames: Multi output predictions

Keras layers
4 questions
Scikit-learn and Keras
2 Lectures 36:32

We show how to wrap Keras inside Scikit-learn to compare different models using cross validation. This is particularly relevant for neural networks because they tend to overfit, so comparing models on the very same dataset that we used for training is not reliable. Cross validation provides an elegant solution to this.

Scikit-learn with Keras: Comparing deep learning models

We show how to wrap Keras inside Scikit-learn to identify the best parameters via cross-validation. This is a robust way of identifying the appropriate number of epochs, batch size, etc. Cross validation (and in particular k-fold cross validation) uses every observation for both training and testing, so it is a good idea to use it, especially when your sample is rather small. In particular, we use GridSearchCV, which constructs a grid containing different parameter values: we even use it to identify the optimal number of neurons in a hidden layer!

Determining best parameters in Neural Networks using GridSearchCV
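
The two ideas behind this lecture, k-fold splitting and a parameter grid, can be sketched in plain Python. This is a simplified illustration of the concepts, not the Scikit-learn implementation; the function names and the parameter values are assumptions.

```python
from itertools import product


def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds (train, test) index pairs."""
    # The first (n_samples % n_folds) folds get one extra sample.
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def parameter_grid(grid):
    """Enumerate every combination of the values in a parameter dictionary."""
    names = sorted(grid)
    return [dict(zip(names, values))
            for values in product(*(grid[name] for name in names))]


folds = k_fold_indices(n_samples=10, n_folds=3)
candidates = parameter_grid({
    "epochs": [10, 50],
    "batch_size": [16, 32],
    "neurons": [4, 8, 16],
})
```

Every observation appears in exactly one test fold and in the training set of the other folds. A grid search would train and score one model per candidate and per fold, here 12 candidates times 3 folds, and keep the candidate with the best average score.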
Classes for images
2 Lectures 22:37

Images are used frequently in machine learning, both for deep neural networks and for traditional algorithms (SVM, random forests, etc.). We review the basics behind image loading, and we present a class that can be used to read an entire directory and build the proper matrices needed for doing machine learning. This class transforms images stored in RGB channels (3 tensors) into black and white (0,1) matrices. It should only be used when reading images that are already in black and white.

A class that maps BW images to Python objects

We present a similar class, but this one is designed to accommodate 3-channel image data (RGB images), which we typically need to treat as a 4-dim tensor (one dimension for the samples, plus height, width, and channels). This class will be useful for building convolutional neural nets in the next section.

A class that maps RGB Images to Python objects
Multilayer Perceptron
4 Lectures 01:12:41

We introduce the multilayer perceptron neural network. It is a feedforward network using non-linear activation functions. In its simplest case, with no hidden layers and a sigmoid output, it reduces to a "logistic regression" model.


We actually code a multilayer perceptron in pure Theano (doing the appropriate Tensor operations). In fact this is what Keras is doing for us, when we code an MLP network. In general, for other network configurations, Keras does a very similar thing: it builds the appropriate code in Theano/Tensorflow. This lecture is rather technical, so it's only necessary if you want to understand the inner workings of Keras.

Coding a Multilayer Perceptron in pure Theano: Part1

We continue with our previous lecture, coding a pure MLP neural network in Theano doing the Tensor operations

Coding a Multilayer Perceptron in pure Theano: Part2

We code a Multilayer Perceptron Network in Keras. It builds exactly the same structure that we used in the two previous lectures. We use this network for classifying shapes in drawings: squares and triangles. We achieve an excellent (100%) accuracy.

Multilayer Perceptron in Keras
Convolutional Neural Nets
5 Lectures 01:16:21

Introduction to Convolutional Neural Networks


How do convolutions and max-pooling work? What input dimensions does a 2D convolution require, and what are the dimensions of its output?

Convolutions and Max-Pooling
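
The output dimensions can be worked out with the standard formula floor((n + 2p - k) / s) + 1 for input size n, kernel size k, padding p, and stride s. The helpers below are a small sketch of that arithmetic; their names are hypothetical, and the pooling helper assumes non-overlapping windows.

```python
def conv2d_output_shape(height, width, kernel, stride=1, padding=0, filters=1):
    """Output (height, width, channels) of a 2D convolution with square kernels."""
    out_h = (height + 2 * padding - kernel) // stride + 1
    out_w = (width + 2 * padding - kernel) // stride + 1
    return out_h, out_w, filters


def max_pool_output_shape(height, width, channels, pool=2):
    """Output shape of max-pooling with non-overlapping pool x pool windows."""
    # Leftover rows or columns that do not fill a window are dropped.
    return height // pool, width // pool, channels


# A 28x28 image passed through 32 filters of size 3x3, then 2x2 max-pooling.
conv_shape = conv2d_output_shape(28, 28, kernel=3, filters=32)   # (26, 26, 32)
pool_shape = max_pool_output_shape(*conv_shape, pool=2)          # (13, 13, 32)

# Padding of 1 keeps the spatial size unchanged for a 3x3 kernel.
same_shape = conv2d_output_shape(28, 28, kernel=3, padding=1, filters=32)
```

The convolution changes the number of channels to the number of filters, while pooling shrinks the height and width and leaves the channels untouched.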

Using convolutional neural networks to predict different hand gestures

Preview 19:52

We use a very similar framework to identify nuts and bolts using images containing these pieces on a wooden desk

Classifying bolts and nuts

We show how to use neural networks to classify images from the River Thames vs images taken from the streets. This is similar to how many automatic tagging technologies work (software that tells you if your image was taken at the beach, in a park, or in the mountains). We achieve 100% accuracy in both in-sample and out-of-sample scenarios.

Classifying Pictures in park vs home

Convolutional Neural Nets
3 questions
Recurrent neural networks
4 Lectures 58:19

Brief introduction to recurrent neural networks

Recurrent Neural Networks

Backpropagation uses the chain rule from calculus to compute the partial derivatives of the loss function with respect to the weights. This has an undesired consequence: when we have multiple layers and need to multiply many terms together, the gradient can fade to zero. The practical consequence is that training the weights of the initial layers can take far too long, because the gradient is not properly propagated. This is particularly relevant for recurrent neural networks, as they reuse the same layers across time periods.

The vanishing gradient
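
A small numeric sketch shows how quickly the gradient can fade. It assumes a chain of sigmoid layers with one neuron each, where every layer contributes a factor of weight times sigmoid derivative to the product; the function names and default values are chosen for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # never larger than 0.25


def gradient_through_layers(n_layers, weight=1.0, pre_activation=0.0):
    """Gradient magnitude after flowing back through n_layers sigmoid layers."""
    gradient = 1.0
    for _ in range(n_layers):
        # Chain rule: one multiplication per layer.
        gradient *= weight * sigmoid_derivative(pre_activation)
    return gradient


magnitudes = {n: gradient_through_layers(n) for n in (1, 5, 10, 20)}
```

Even in the best case for the sigmoid, where its derivative equals 0.25, the gradient shrinks by a factor of four per layer, so after ten layers it is already below one millionth of its original size.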

We introduce the LSTM model, which addresses the vanishing gradient problem by reformulating the neural network model. It uses gates for forgetting information, adding new information, and mixing new information with the information from previous periods. We use this model to predict house prices in London using AWS (Amazon Web Services).

LSTM: Predicting House Prices in London
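
The gates can be illustrated with one time step of a single-unit LSTM written in plain Python. This is a sketch of the standard LSTM equations with scalar input and state; the `lstm_step` function and the hand-picked parameters are assumptions for this example and are not taken from the course material.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM with scalar input and state."""
    def gate(name, squash):
        w, u, b = params[name]
        return squash(w * x + u * h_prev + b)

    forget = gate("forget", sigmoid)           # how much old cell state to keep
    write = gate("input", sigmoid)             # how much new information to add
    candidate = gate("candidate", math.tanh)   # the new information itself
    output = gate("output", sigmoid)           # how much of the cell to expose

    c = forget * c_prev + write * candidate    # mix old and new information
    h = output * math.tanh(c)
    return h, c


# Hypothetical, hand-picked (w, u, b) parameters for each gate.
params = {
    "forget": (0.0, 0.0, 10.0),    # large bias: the gate stays close to 1
    "input": (0.0, 0.0, -10.0),    # large negative bias: the gate stays close to 0
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 10.0),
}

h, c = 0.0, 1.0
for x in [0.3, -0.8, 0.5, 0.1]:
    h, c = lstm_step(x, h, c, params)
```

With the forget gate open and the input gate closed, the cell state is carried across the time steps almost unchanged, which is the mechanism that lets information (and gradients) survive over long sequences.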

We use an LSTM model to model global temperatures using real data.

Preview 16:13
About the Instructor
Francisco Juretig
3.9 Average rating
154 Reviews
1,348 Students
8 Courses

I have worked for 7+ years as a statistical programmer in the industry. Expert in programming, statistics, data science, and statistical algorithms. I have wide experience in many programming languages. Regular contributor to the R community, with 3 published packages. I am also an expert SAS programmer. Contributor to scientific statistical journals. Latest publication in the Journal of Statistical Software.