
Explore how to rank items with ratings using confidence intervals, compare average rating versus lower bound via Wilson's interval, extend to five-star ratings, and apply upvote/downvote semantics.
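To make the "average rating versus lower bound" idea concrete, here is a minimal sketch of the Wilson lower bound for upvote/downvote data. The function name, the toy items, and the choice of z = 1.96 (roughly a 95% interval) are assumptions for illustration, not the course's own code.

```python
import math

def wilson_lower_bound(upvotes, total, z=1.96):
    """Lower end of the Wilson score interval for an upvote proportion."""
    if total == 0:
        return 0.0  # no votes yet: rank at the bottom
    p_hat = upvotes / total
    denom = 1 + z**2 / total
    center = p_hat + z**2 / (2 * total)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / total + z**2 / (4 * total**2))
    return (center - margin) / denom

# Hypothetical items: (upvotes, total votes)
items = {"A": (1, 1), "B": (90, 100)}

# Ranking by plain average puts A (100%) ahead of B (90%).
by_average = sorted(items, key=lambda k: items[k][0] / items[k][1], reverse=True)

# Ranking by the lower bound penalizes A for having only one vote.
by_lower_bound = sorted(items, key=lambda k: wilson_lower_bound(*items[k]), reverse=True)
```

With a single vote the interval is very wide, so item A drops below item B even though its average is higher.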
The lecture presents a suggestion box to collect feedback and improve the course by inviting input on background, course difficulty, missing explanations, and topics like CNNs and transformers.
Explore collaborative filtering in Python by modeling the user-item rating matrix, distinguishing sparse data, and predicting user-specific scores with regression and mean squared error evaluation.
Prepare the rating data by making user IDs zero-based and re-indexing movie IDs. Shrink the data to active users and movies, then build user-to-movie, movie-to-user, and rating dictionaries.
Explore user-user collaborative filtering in Python by building a simple, data-driven predictor using averages, deviations, and Pearson-based weights, then evaluate with mean squared error on train and test sets.
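As a rough illustration of "averages, deviations, and Pearson-based weights", the sketch below predicts one missing rating on a tiny, made-up dataset. The user names, ratings, and helper names are hypothetical. As a simplification, each user's mean is taken over all of their ratings rather than only the movies in common.

```python
import math

# Hypothetical toy data: user -> {movie: rating}
ratings = {
    "alice": {"m1": 5.0, "m2": 3.0, "m3": 4.0},
    "bob":   {"m1": 4.0, "m2": 2.0, "m3": 3.0, "m4": 5.0},
    "carol": {"m1": 1.0, "m2": 5.0, "m4": 2.0},
}

def mean(user):
    r = ratings[user]
    return sum(r.values()) / len(r)

def pearson(a, b):
    """Correlation of two users' deviations over the movies both rated."""
    common = sorted(ratings[a].keys() & ratings[b].keys())
    if len(common) < 2:
        return 0.0
    da = [ratings[a][m] - mean(a) for m in common]
    db = [ratings[b][m] - mean(b) for m in common]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da)) * math.sqrt(sum(y * y for y in db))
    return num / den if den else 0.0

def predict(user, movie):
    """User's average plus the weighted average deviation of other users."""
    num = den = 0.0
    for other in ratings:
        if other == user or movie not in ratings[other]:
            continue
        w = pearson(user, other)
        num += w * (ratings[other][movie] - mean(other))
        den += abs(w)
    return mean(user) + (num / den if den else 0.0)

prediction = predict("alice", "m4")
```

Here bob rates like alice and liked m4, while carol rates the opposite way and disliked it, so both push the prediction above alice's average. Note that the raw prediction can fall outside the 1 to 5 range and may need clipping.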
Learn item-item collaborative filtering and its relation to user-user filtering, using Pearson correlation to find similar items and predict ratings with neighborhood-based scoring for faster, accurate recommendations.
Apply common sense to choose the right model by weighing accuracy, maintainability, and training time, through three practical quizzes; decide what makes sense for your business.
Explore matrix factorization by decomposing the user-item matrix into W and U to predict ratings via dot products, using sparse representations and latent features for dimensionality reduction with small k.
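A minimal sketch of the prediction step, assuming the factors W and U have already been learned. The toy values, the k = 2 latent features, and the dictionary-based sparse representation are assumptions for illustration, written in plain Python to stay dependency-free.

```python
# Hypothetical learned factors: 2 users, 3 movies, k = 2 latent features.
W = [[1.0, 0.5],   # one row of latent features per user
     [0.0, 2.0]]
U = [[2.0, 1.0],   # one row of latent features per movie
     [1.0, 0.0],
     [0.5, 1.5]]

def predict(i, j):
    """Predicted rating of user i for movie j: dot product of their vectors."""
    return sum(w * u for w, u in zip(W[i], U[j]))

# Sparse representation: only the observed (user, movie) pairs are stored.
observed = {(0, 0): 3.0, (1, 2): 3.0}

def mse(data):
    """Mean squared error over the observed ratings only."""
    return sum((r - predict(i, j)) ** 2 for (i, j), r in data.items()) / len(data)

error = mse(observed)
```

The full user-item matrix is never built. Any entry, rated or not, can be predicted on demand from the two small factor matrices.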
Practice matrix factorization for recommender systems in Python by loading data, initializing w, b, u, c, and mu; train until convergence, plot cost, and report train and test MSE.
Re-implement matrix factorization in Keras using user and movie embeddings, dot-product predictions, and optional biases, with gradient descent training and connections to word embeddings.
Implement AutoRec in code by preprocessing ratings data, building a two-layer autoencoder, and training with a custom masked loss to predict missing ratings.
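The idea behind a masked loss can be sketched without any deep learning library. The example assumes missing ratings are encoded as 0, which is a common convention but an assumption here. The function name is hypothetical and this is not the course's Keras implementation.

```python
def masked_mse(targets, predictions, missing=0.0):
    """Mean squared error over observed entries only.

    Entries equal to `missing` in `targets` are treated as unrated and
    contribute nothing to the loss.
    """
    squared_error = 0.0
    count = 0
    for t, p in zip(targets, predictions):
        if t != missing:
            squared_error += (t - p) ** 2
            count += 1
    return squared_error / count if count else 0.0

# One user's row: the middle movie is unrated, so its prediction is ignored.
loss = masked_mse([5.0, 0.0, 3.0], [4.0, 2.0, 3.0])
```

Without the mask, the autoencoder would be penalized for not reconstructing zeros, and would learn to predict "no rating" instead of a plausible score.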
Show how energy and probability in a restricted Boltzmann machine lead to neural network equations, deriving p(h|v) as the sigmoid of W·v plus bias via Bayes rule.
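The resulting formula is easy to state in code. This is a minimal sketch assuming binary hidden units, a weight matrix W with one row per hidden unit, and a hidden bias vector c. The shapes and names are assumptions, and the toy numbers are made up.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def p_h_given_v(W, c, v):
    """Probability that each hidden unit is on, given visible vector v.

    Computes sigmoid(W.v + c) one hidden unit at a time.
    """
    return [
        sigmoid(sum(w * vi for w, vi in zip(row, v)) + c_j)
        for row, c_j in zip(W, c)
    ]

# Hypothetical RBM with 2 visible and 2 hidden units.
W = [[0.0, 0.0],
     [1.0, -1.0]]
c = [0.0, 0.0]
probs = p_h_given_v(W, c, [1.0, 1.0])
```

This has the same form as one layer of a feedforward neural network with a sigmoid activation, which is the connection the lecture draws.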
Speed up RBM code by moving one-hot encoding, masking, and dot-product computations into TensorFlow, eliminating NumPy preprocessing and improving runtime.
Learn to set up Spark locally across macOS, Ubuntu, and Windows workflows by installing Java, Scala, and Spark (or PySpark via pip) and testing with the Spark shell.
Learn to run a Python Spark job with spark-submit, creating a SparkContext and coordinating a local master. Grasp the overhead and memory considerations when processing large gzipped datasets.
Set up a Spark cluster on AWS using spark-ec2: create a key pair and credentials, launch the master and slaves, run the job, then terminate the cluster to avoid costs.
Learn to use Spark for real-world predictions, serving content via an API while batch Spark jobs feed a database, with mean squared error evaluation and a fallback when jobs fail.
Believe it or not, almost all online businesses today make use of recommender systems in some way or another.
What do I mean by “recommender systems”, and why are they useful?
Let’s look at the top 3 websites on the Internet, according to Alexa: Google, YouTube, and Facebook.
Recommender systems form the very foundation of these technologies.
Google: Search results
They are why Google is the most successful technology company today.
YouTube: Video dashboard
I’m sure I’m not the only one who’s accidentally spent hours on YouTube when I had more important things to do! Just how do they convince you to do that?
That’s right. Recommender systems!
Facebook: So powerful that world governments are worried that the newsfeed has too much influence on people! (Or maybe they are worried about losing their own power... hmm...)
Amazing!
This course is a big bag of tricks that make recommender systems work across multiple platforms.
We’ll look at popular news feed algorithms, like those used by Reddit and Hacker News, as well as Google’s PageRank.
We’ll look at Bayesian recommendation techniques that are being used by a large number of media companies today.
But this course isn’t just about news feeds.
Companies like Amazon, Netflix, and Spotify have been using recommendations to suggest products, movies, and music to customers for many years now.
These algorithms have led to billions of dollars in added revenue.
So I assure you, what you’re about to learn in this course is very real, very applicable, and will have a huge impact on your business.
For those of you who like to dig deep into the theory to understand how things really work, you know this is my specialty and there will be no shortage of that in this course. We’ll be covering state-of-the-art algorithms like matrix factorization and deep learning (making use of both supervised and unsupervised learning - Autoencoders and Restricted Boltzmann Machines), and you’ll learn a bag full of tricks to improve upon baseline results.
As a bonus, we will also look at how to perform matrix factorization using big data in Spark. We will create a cluster using Amazon EC2 instances with Amazon Web Services (AWS). Most other courses and tutorials look at the MovieLens 100k dataset - that is puny! Our examples make use of MovieLens 20 million.
Whether you sell products in your e-commerce store, or you simply write a blog - you can use these techniques to show the right recommendations to your users at the right time.
If you’re an employee at a company, you can use these techniques to impress your manager and get a raise!
I’ll see you in class!
NOTE:
This course is not "officially" part of my deep learning series. It contains a strong deep learning component, but there are many concepts in the course that are totally unrelated to deep learning.
"If you can't implement it, you don't understand it"
Or as the great physicist Richard Feynman said: "What I cannot create, I do not understand".
My courses are the ONLY courses where you will learn how to implement machine learning algorithms from scratch.
Other courses will teach you how to plug your data into a library, but do you really need help with 3 lines of code?
After doing the same thing with 10 datasets, you realize you didn't learn 10 things. You learned 1 thing, and just repeated the same 3 lines of code 10 times...
Suggested Prerequisites:
For earlier sections, just know some basic arithmetic
For advanced sections, know calculus, linear algebra, and probability for a deeper understanding
Be proficient in Python and the NumPy stack (see my free course)
For the deep learning section, know the basics of using Keras
For the RBM section, know TensorFlow
WHAT ORDER SHOULD I TAKE YOUR COURSES IN?
Check out the lecture "Machine Learning and AI Prerequisite Roadmap" (available in the FAQ of any of my courses, including the free Numpy course)
UNIQUE FEATURES
Every line of code explained in detail - email me any time if you disagree
No wasted time "typing" on the keyboard like other courses - let's be honest, nobody can really write code worth learning about in just 20 minutes from scratch
Not afraid of university-level math - get important details about algorithms that other courses leave out