# What is unsupervised learning used for?

A free video tutorial from Lazy Programmer Team
Artificial Intelligence and Machine Learning Engineer
4.6 instructor rating • 14 courses • 151,366 students

## Lecture description

This lecture describes what unsupervised machine learning (not just clustering) is used for in general.

There are 2 major categories:

1) density estimation

If we can figure out the probability distribution of the data, not only is this a model of the data, but we can then sample from the distribution to generate new data.

For example, we can train a model to read lots of Shakespeare and then generate writing in the style of Shakespeare.
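
As a much smaller illustration of the same idea, here is a minimal sketch that fits a single Gaussian to some 1-D data and then samples new points from it. The data is synthetic and every name in it (`observed`, `mu_hat`, `sigma_hat`, `density`) is made up for this example; a real text generator like the Shakespeare one would need a far richer model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: synthetic "observed" data standing in for real measurements.
observed = rng.normal(loc=5.0, scale=2.0, size=1000)

# Density estimation: fit a Gaussian by maximum likelihood.
# For a Gaussian, the estimates are just the sample mean and standard deviation.
mu_hat = observed.mean()
sigma_hat = observed.std()

def density(x):
    """Estimated probability density at x under the fitted Gaussian."""
    z = (x - mu_hat) / sigma_hat
    return np.exp(-0.5 * z ** 2) / (sigma_hat * np.sqrt(2 * np.pi))

# Generation: draw brand-new points from the fitted distribution.
generated = rng.normal(loc=mu_hat, scale=sigma_hat, size=5)
```

The fitted parameters are the model of the data, and `generated` holds new samples that resemble the original data without copying any of it.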

2) latent variables

Latent variable models let us find the underlying cause of the data we've observed by reducing it to a small set of hidden factors.

For example, if we measure the heights of all the people in our class and plot them on a histogram, we may notice 2 "bumps".

These "bumps" correspond to male heights and female heights.

Thus, being male or female is the hidden (latent) variable that explains the higher / lower height values, even though we never recorded it.

Clustering does exactly this - it tells us how the data can be split up into distinct groups / segments / categories.
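
To make the height example concrete, here is a rough sketch using a bare-bones 1-D k-means with 2 clusters. The heights are synthetic (the group means of 165 cm and 180 cm are invented for illustration), and k-means is just one simple choice of clustering algorithm, not the only way to recover the two groups.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: made-up heights in cm for two hidden groups.
group_a = rng.normal(165.0, 5.0, size=200)
group_b = rng.normal(180.0, 5.0, size=200)

# The algorithm only sees the heights, never the group membership.
heights = np.concatenate([group_a, group_b])

# Start one center at each extreme so neither cluster begins empty.
centers = np.array([heights.min(), heights.max()])

for _ in range(100):
    # Assignment step: each height goes to its nearest center.
    labels = np.abs(heights[:, None] - centers[None, :]).argmin(axis=1)
    # Update step: each center moves to the mean of its assigned heights.
    new_centers = np.array([heights[labels == k].mean() for k in range(2)])
    if np.allclose(new_centers, centers):
        break
    centers = new_centers
```

After the loop, `centers` holds one value near each "bump" in the histogram, and `labels` is the algorithm's guess at the hidden group for each person.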

Unsupervised machine learning can also be used for:

• dimensionality reduction - modern datasets can have millions of features, but many of them may be correlated, so the data can often be described with far fewer dimensions

• visualization - you can't see a million-dimensional dataset, but if you reduce the dimensionality to 2, then it can be visualized
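
Both of these uses can be sketched with PCA, one common dimensionality reduction technique, implemented here with NumPy's SVD. The dataset is synthetic: 50 correlated features are generated from just 2 hidden factors, and all the names (`latent`, `mixing`, `X_2d`) are invented for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 300, 50

# Assumption: synthetic data where 50 features are driven by 2 hidden factors.
latent = rng.normal(size=(n_samples, 2))
mixing = rng.normal(size=(2, n_features))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))

# PCA via SVD: center the data, then find the directions of greatest variance.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Project onto the top 2 principal directions to get 2-D coordinates.
X_2d = X_centered @ Vt[:2].T

# Fraction of the total variance kept by those 2 dimensions.
explained = (S[:2] ** 2).sum() / (S ** 2).sum()
```

Because the features are so strongly correlated, `explained` comes out close to 1, and `X_2d` can be passed straight to a scatter plot.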

Cluster Analysis and Unsupervised Machine Learning in Python

Data science techniques for pattern recognition, data mining, k-means clustering, hierarchical clustering, and KDE.

07:54:19 of on-demand video • Updated January 2021

• Understand the regular K-Means algorithm
• Understand and enumerate the disadvantages of K-Means Clustering
• Understand the soft or fuzzy K-Means Clustering algorithm
• Implement Soft K-Means Clustering in Code
• Understand Hierarchical Clustering
• Explain algorithmically how Hierarchical Agglomerative Clustering works
• Apply SciPy's Hierarchical Clustering library to data
• Understand how to read a dendrogram
• Understand the different distance metrics used in clustering