Introduction to Variational Autoencoders

A free video tutorial from Lazy Programmer Inc.
Artificial intelligence and machine learning engineer
4.6 instructor rating • 27 courses • 383,019 students

Learn more from the full course

Deep Learning: GANs and Variational Autoencoders

Generative Adversarial Networks and Variational Autoencoders in Python, Theano, and Tensorflow

07:42:45 of on-demand video • Updated June 2020

  • Learn the basic principles of generative models
  • Build a variational autoencoder in Theano and Tensorflow
  • Build a GAN (Generative Adversarial Network) in Theano and Tensorflow
Hey everyone, and welcome back to this class, Unsupervised Learning Part 2. In this lecture, we're going to discuss some of the background behind the variational autoencoder. The variational autoencoder is a neural network that can learn to reproduce its input, but it can also map the training data to a latent space and then draw samples from the data distribution by sampling from that latent space. If that sounds complicated, don't worry, as we will be going through that process in much more detail throughout the next few lectures.

First, let's discuss the name a little bit: variational autoencoder. If you've been following my courses, then you probably recognize one of these terms but not the other. So let's discuss the easy one first. We know what an autoencoder is. In fact, we practiced building one as a warm-up to this course. An autoencoder is just a neural network that learns to reproduce its input. One important feature of autoencoders is that typically we choose the number of hidden units to be smaller than the input dimensionality. This creates a bottleneck, and it forces the neural network to learn a more compact representation of the data.

So what do we mean by that? Well, suppose we can teach this autoencoder to faithfully reproduce its input. Let's assume that the input size to the autoencoder is 784 dimensions and the hidden layer size is 200 dimensions. That means we have learned to represent whatever image we've passed in, which consists of 784 numbers, as a much smaller code of only 200 numbers. Well, what does this mean? It means that out of the original 784 numbers, many of them were redundant. The true amount of information contained in the original data must then be much less than 784. We typically call these 200 numbers in the hidden layer the latent variable representation, and we typically use the letter z to represent it. This shouldn't come as a surprise, since we've consistently used the letter z to represent the hidden layers of a neural network and latent variables in the past. We'll touch on this concept again later; there is also a short code sketch at the end of this lecture that makes the bottleneck idea concrete.

The other term in variational autoencoder, which you probably haven't heard of, is "variational". This refers to variational inference, or variational Bayes. These techniques fall into the realm of Bayesian machine learning, which you'd have to have taken a difficult graduate course in machine learning to learn about. One way to think of variational inference is that it's an extension of the expectation-maximization, or EM, algorithm. Just a warning: a lot of this information is not directly related to how variational autoencoders work. It's just sort of historical information to give you a better sense of the big picture. So if you don't quite get it, don't worry, it's not required to understand the underlying mechanics of the variational autoencoder.

As you may recall, the expectation-maximization algorithm is used when we have a latent variable model and we can't maximize the likelihood directly. An example of this is the Gaussian mixture model, which we've seen already. The expectation-maximization algorithm gives us a point estimate of the parameters; in other words, it can be seen as a frequentist statistical method. What variational inference does is extend this idea to the Bayesian realm, where instead of learning point estimates of the parameters, we learn their distributions instead. So does this mean you have to understand variational inference in order to understand this course? Luckily, no.
You must be happy to hear that there is finally one topic that is not a prerequisite to this course. So why don't we need to understand variational inference in order to understand variational autoencoders? Well, it would certainly help to have that deeper understanding when studying variational autoencoders, but learning how one works mechanically and implementing one in code does not require it. And as I mentioned earlier, if you do decide one day that you want to learn about variational inference and similar Bayesian methods, realize that it will be extremely challenging, very difficult both conceptually and mathematically, so it wouldn't be practical to make it a prerequisite to this course.

So that's just some background on what elements combine to give us variational autoencoders. It takes one thing we know about, autoencoders, and one thing we probably don't know anything about, variational inference, and combines them together. In the next few lectures, we'll look at the mechanics of training and sampling from a variational autoencoder, and an implementation in code.
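
To make the bottleneck idea from this lecture concrete, here is a minimal sketch of a plain (non-variational) autoencoder with a 784-dimensional input and a 200-dimensional hidden code z. This is an illustration only: it assumes TensorFlow 2's Keras API and random stand-in data, whereas the course itself builds its models in Theano and TensorFlow.

# Minimal bottleneck autoencoder sketch (assumes TensorFlow 2 / Keras).
import numpy as np
import tensorflow as tf

INPUT_DIM = 784   # e.g. a flattened 28x28 MNIST image
LATENT_DIM = 200  # the smaller hidden code z from the lecture

# Encoder: compress the 784 input numbers down to a 200-dimensional code z.
encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(INPUT_DIM,)),
    tf.keras.layers.Dense(LATENT_DIM, activation="relu"),
])

# Decoder: reconstruct the 784 input numbers from the 200-dimensional code.
decoder = tf.keras.Sequential([
    tf.keras.Input(shape=(LATENT_DIM,)),
    tf.keras.layers.Dense(INPUT_DIM, activation="sigmoid"),
])

# Autoencoder: the whole network is trained to reproduce its own input.
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# Random stand-in data with values in [0, 1]; replace with real images.
X = np.random.rand(1000, INPUT_DIM).astype("float32")
autoencoder.fit(X, X, epochs=5, batch_size=64, verbose=0)

# The latent variable representation z for a batch of inputs.
Z = encoder.predict(X[:10])
print(Z.shape)  # (10, 200)

Note that a plain autoencoder like this only learns to compress and reconstruct. The "variational" part covered in the upcoming lectures is what makes it possible to generate new samples by drawing random codes z and feeding them through the decoder.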