What is unsupervised learning used for?

Lazy Programmer Team
A free video tutorial from Lazy Programmer Team
Artificial Intelligence and Machine Learning Engineer
4.6 instructor rating • 14 courses • 151,366 students

Lecture description

This lecture describes what unsupervised machine learning (not just clustering) is used for in general.

There are 2 major categories:


1) density estimation

If we can figure out the probability distribution of the data, not only is this a model of the data, but we can then sample from the distribution to generate new data.

For example, we can train a model to read lots of Shakespeare and then generate writing in the style of Shakespeare.
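
The Shakespeare example can be sketched at toy scale with a character-level Markov chain: estimate how likely each character is to follow another, then sample from those estimates to produce new text. This is only an illustrative sketch, not the course's model. The one-line corpus and the names `followers` and `generate` are assumptions made for the example.

```python
import random
from collections import defaultdict

# Tiny stand-in corpus; a real model would read far more text.
corpus = "to be or not to be that is the question"

# Density estimation step: record which character follows which.
# followers[c] holds every character seen right after c, with repeats,
# so a uniform draw from it follows the estimated P(next | c).
followers = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current].append(nxt)

def generate(length, seed=0):
    """Sample a new string of at most `length` characters from the fitted model."""
    rng = random.Random(seed)
    current = rng.choice(corpus)
    out = [current]
    while len(out) < length:
        options = followers.get(current)
        if not options:  # character never seen with a successor
            break
        current = rng.choice(options)
        out.append(current)
    return "".join(out)

sample = generate(40)
print(sample)
```

Every pair of adjacent characters in the output also occurs somewhere in the corpus, which is why the generated text loosely resembles the source.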


2) latent variables

Latent-variable models let us find the underlying cause of the data we've observed by reducing it to a small set of factors.

For example, if we measure the heights of all the people in our class and plot them on a histogram, we may notice 2 "bumps".

These "bumps" correspond to male heights and female heights.

Thus, being male or female is the hidden cause of higher / lower height values.

Clustering does exactly this - it tells us how the data can be split up into distinct groups / segments / categories.
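
As a rough sketch of the height example, the snippet below runs a two-cluster k-means on one-dimensional data. The heights are synthetic: the group means and spreads are made-up numbers chosen only so that two "bumps" appear. The helper name `kmeans_1d` is hypothetical and is not the course's implementation.

```python
import random

rng = random.Random(0)
# Synthetic heights in cm; the means (160, 178) and spread (6) are
# invented for illustration, not real measurements.
heights = ([rng.gauss(160, 6) for _ in range(200)]
           + [rng.gauss(178, 6) for _ in range(200)])

def kmeans_1d(data, n_iter=100):
    """Two-cluster k-means on scalar data; returns (centers, labels)."""
    centers = [min(data), max(data)]  # simple deterministic initialization
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
                  for x in data]
        # Update step: each center moves to the mean of its points.
        new_centers = []
        for k in (0, 1):
            members = [x for x, lab in zip(data, labels) if lab == k]
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, labels

centers, labels = kmeans_1d(heights)
print(centers)
```

The algorithm never sees which group a height came from. The two centers it finds act as the recovered hidden category.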


Unsupervised machine learning can also be used for:

  • dimensionality reduction - modern datasets can have millions of features, but many of them may be correlated

  • visualization - you can't see a million-dimensional dataset, but if you reduce the dimensionality to 2, then it can be visualized
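
To illustrate how correlated features can be collapsed, here is a minimal sketch that reduces two correlated features to one by projecting onto the direction of greatest variance (the idea behind principal component analysis). It goes from 2 dimensions to 1 rather than from millions to 2, purely to keep the arithmetic visible. The height and weight data are synthetic, and the function name `first_principal_component` is an assumption.

```python
import math
import random

rng = random.Random(0)
# Synthetic (height cm, weight kg) pairs; the linear relation and the
# noise level are invented for illustration only.
heights = [rng.gauss(170, 10) for _ in range(300)]
data = [(h, 0.9 * h - 85 + rng.gauss(0, 4)) for h in heights]

def first_principal_component(points):
    """Project 2-D points onto their direction of greatest variance."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # For a 2x2 covariance matrix, the leading eigenvector's angle
    # has a closed form.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    scores = [(x - mx) * ux + (y - my) * uy for x, y in points]
    # Fraction of the total variance kept by the single new feature.
    explained = (sum(s * s for s in scores) / n) / (sxx + syy)
    return scores, explained

scores, explained = first_principal_component(data)
print(f"variance kept by 1 of 2 dimensions: {explained:.2f}")
```

Because the two features are strongly correlated, a single number per person retains most of the variance in the data.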

Learn more from the full course

Cluster Analysis and Unsupervised Machine Learning in Python

Data science techniques for pattern recognition, data mining, k-means clustering, hierarchical clustering, and KDE.

07:54:19 of on-demand video • Updated January 2021

  • Understand the regular K-Means algorithm
  • Understand and enumerate the disadvantages of K-Means Clustering
  • Understand the soft or fuzzy K-Means Clustering algorithm
  • Implement Soft K-Means Clustering in Code
  • Understand Hierarchical Clustering
  • Explain algorithmically how Hierarchical Agglomerative Clustering works
  • Apply Scipy's Hierarchical Clustering library to data
  • Understand how to read a dendrogram
  • Understand the different distance metrics used in clustering
  • Understand the difference between single linkage, complete linkage, Ward linkage, and UPGMA
  • Understand the Gaussian mixture model and how to use it for density estimation
  • Write a GMM in Python code
  • Explain when GMM is equivalent to K-Means Clustering
  • Explain the expectation-maximization algorithm
  • Understand how GMM overcomes some disadvantages of K-Means
  • Understand the Singular Covariance problem and how to fix it
English [Auto]

In this lecture, we are going to answer the question: what is unsupervised learning used for? This lecture is not only about clustering, but rather about unsupervised learning as a whole. It will give you the big picture of where clustering fits into the overall family of algorithms that we call machine learning.

If I were to put it into a single sentence, I would say that unsupervised learning is all about taking a dataset and finding the hidden structure in that dataset. There are many different synonyms you can use for the phrase "hidden structure". Another thing I could say is I'm looking for patterns in the dataset. Another thing I could say is I'm looking for hidden causes for the data that I observed. Another thing I could say is I'm looking for latent variables. "Latent" means the same thing as "hidden", so if you hear me say either, remember that they mean the same thing.

So what exactly do we mean by hidden structure, or hidden causes, or latent variables? The basic idea is that there is some underlying and fundamental pattern in our observations. As an example, consider the height and weight of a human being. Generally speaking, the taller you are, the more you are going to weigh. You're not usually going to have a five-foot-tall human being who weighs 300 pounds. Although there are definitely outliers, overall there is a pattern that's followed, an underlying cause. That underlying cause might be thought of as the person's size. Maybe that's due to some size gene. There is some size variable that not only means you are taller but also means you weigh more: you have bigger bones, bigger organs, and so forth.

Another way to think of this is that what we observe is just a noisy measurement of what really is. One example of that is missile tracking.
The Kalman filter is an algorithm that takes as input a noisy measurement of the missile's location and predicts the true location, which would be the noisy location minus the noise. Aside from the obvious immediate application, it could be said that all measurements we make are noisy, and as such, finding the hidden structure or the latent variables in our data is equivalent to noise removal, or finding out what the true data looks like if we take away that noise.

Finally, I want to get a little more technical and talk about the different kinds of unsupervised learning. At a high level, you can organize unsupervised learning into two categories. One is where the latent variables are continuous, and the other is where the latent variables are categorical. A missile tracking algorithm would be an example of where the latent variables are continuous, since position is a continuous, real-valued variable. On the other hand, suppose we're taking as input a set of newspaper articles and we split them up into distinct groups, as we might do in this course. This would be an example of categorical latent variables. Thus, it can be said that in this course, which is all about clustering, we are interested in finding the hidden structure of data where that hidden structure is assumed to be categorical in nature.

Finally, let's discuss some applications of unsupervised learning that are a little more abstract. It's easy to talk about missile tracking, which is a military application. But what about actually working with data? Suppose you are a data analyst or a software engineer. How can you use unsupervised learning? Well, here are some examples.

One is density estimation. That is, given a dataset, how can we find the probability distribution that the dataset came from? That is density estimation: we want to find the true probability density from which our data was sampled. As a side note, we'll be looking at density estimation in this course.
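
For continuous data, density estimation can be sketched in its simplest form by fitting a single Gaussian and then drawing new points from it. This is a minimal illustration under the assumption that one bell curve describes the data well, which real datasets often violate. The `observed` values are synthetic, and the names `mu`, `sigma`, and `density` are chosen for this example.

```python
import math
import random
import statistics

rng = random.Random(0)
# Hypothetical observations; in practice this would be the real dataset.
observed = [rng.gauss(5.0, 2.0) for _ in range(1000)]

# Fit: maximum-likelihood estimates of the Gaussian's mean and
# standard deviation.
mu = statistics.fmean(observed)
sigma = statistics.pstdev(observed, mu)

def density(x):
    """Estimated probability density at x under the fitted Gaussian."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Generate: draw new points from the estimated distribution.
new_points = [rng.gauss(mu, sigma) for _ in range(5)]
print(mu, sigma, new_points)
```
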

Another example is dimensionality reduction. Clearly, it's not possible for us to see anything beyond two or three dimensions, since three is the dimensionality of the physical world. We haven't evolved to see anything beyond these three dimensions. On the other hand, real-world data often has hundreds, and sometimes millions, of dimensions. So one thing we can do with dimensionality reduction is take a dataset and say: actually, out of these million dimensions, only three of them are really hidden causes. So let's reduce the dimensionality of this dataset, and then we can look at it in a picture. That helps not only to visualize but also to process the data faster. As you can imagine, a machine learning algorithm that needs only three inputs is going to be a lot faster than an algorithm that needs 1 million inputs.

Of course, there are many other applications of unsupervised learning, some of which we'll discuss in this course. But in this lecture, the hope is that you now understand at a high level what we can do with unsupervised learning in general and why it can be useful.