Artificial Intelligence: Reinforcement Learning in Python
4.6 (395 ratings)
5,538 students enrolled

Complete guide to artificial intelligence and machine learning, prep for deep reinforcement learning
Last updated 5/2017
English
Current price: $10 Original price: $180 Discount: 94% off
30-Day Money-Back Guarantee
Includes:
  • 5.5 hours on-demand video
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Apply gradient-based supervised machine learning methods to reinforcement learning
  • Understand reinforcement learning on a technical level
  • Understand the relationship between reinforcement learning and psychology
  • Implement 17 different reinforcement learning algorithms
Requirements
  • Calculus
  • Probability
  • Markov Models
  • The Numpy Stack
  • Have experience with at least a few supervised machine learning methods
  • Gradient descent
  • Good object-oriented programming skills
Description

When people talk about artificial intelligence, they usually don’t mean supervised and unsupervised machine learning.

These tasks are pretty trivial compared to what we imagine AI doing: playing chess and Go, driving cars, and beating video games at a superhuman level.

Reinforcement learning has recently become popular for doing all of that and more.

Much like deep learning, much of the theory was worked out in the 1970s and 80s, but only recently have we been able to observe firsthand the amazing results that are possible.

In 2016 we saw Google's AlphaGo beat the world champion in Go.

We saw AIs playing video games like Doom and Super Mario.

Self-driving cars have started driving on real roads with other drivers and even carrying passengers (Uber), all without human assistance.

If that sounds amazing, brace yourself for the future, because the law of accelerating returns dictates that this progress is only going to accelerate.

Learning about supervised and unsupervised machine learning is no small feat. To date I have over SIXTEEN (16!) courses just on those topics alone.

And yet reinforcement learning opens up a whole new world. As you’ll learn in this course, the reinforcement learning paradigm is more different from supervised and unsupervised learning than they are from each other.

It’s led to new and amazing insights both in behavioral psychology and neuroscience. As you’ll learn in this course, there are many analogous processes when it comes to teaching an agent and teaching an animal or even a human. It’s the closest thing we have so far to a true general artificial intelligence.

What’s covered in this course?

  • The multi-armed bandit problem and the explore-exploit dilemma
  • Ways to calculate means and moving averages and their relationship to stochastic gradient descent
  • Markov Decision Processes (MDPs)
  • Dynamic Programming
  • Monte Carlo
  • Temporal Difference (TD) Learning
  • Approximation Methods (i.e. how to plug a deep neural network or other differentiable model into your RL algorithm)
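To give a taste of the first two topics, here is a minimal sketch (not the course's actual code) of an epsilon-greedy agent for the multi-armed bandit problem. It uses the incremental sample-mean update, whose (reward - estimate) error term is the same quantity that shows up in stochastic gradient descent. The class and function names are illustrative, not from the course.

```python
import numpy as np

class Bandit:
    """One slot-machine arm with an unknown true mean reward."""
    def __init__(self, true_mean):
        self.true_mean = true_mean
        self.mean_estimate = 0.0
        self.n = 0

    def pull(self):
        # Reward is Gaussian noise around the arm's true mean.
        return np.random.randn() + self.true_mean

    def update(self, reward):
        # Incremental sample mean: new = old + (1/n) * (reward - old).
        self.n += 1
        self.mean_estimate += (reward - self.mean_estimate) / self.n

def run(true_means, eps=0.1, num_trials=10000):
    np.random.seed(0)
    bandits = [Bandit(m) for m in true_means]
    rewards = np.empty(num_trials)
    for t in range(num_trials):
        if np.random.random() < eps:
            j = np.random.randint(len(bandits))                # explore
        else:
            j = np.argmax([b.mean_estimate for b in bandits])  # exploit
        r = bandits[j].pull()
        bandits[j].update(r)
        rewards[t] = r
    return bandits, rewards

bandits, rewards = run([1.0, 2.0, 3.0])
```

With eps=0.1, the agent quickly identifies the best arm (true mean 3.0) and its average reward approaches 3, minus the small cost of continued exploration — the explore-exploit dilemma in miniature.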

If you’re ready to take on a brand new challenge, and learn about AI techniques that you’ve never seen before in traditional supervised machine learning, unsupervised machine learning, or even deep learning, then this course is for you.

See you in class!


NOTES:

All the code for this course can be downloaded from my github:

/lazyprogrammer/machine_learning_examples

In the directory: rl

Make sure you always "git pull" so you have the latest version!
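The clone-and-update workflow might look like this; the full URL is an assumption based on the path above (standard github.com host):

```shell
# Clone the course repository (first time only)
git clone https://github.com/lazyprogrammer/machine_learning_examples.git
cd machine_learning_examples/rl
# On later visits, update to the latest version
git pull
```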


HARD PREREQUISITES / KNOWLEDGE YOU ARE ASSUMED TO HAVE:

  • Calculus
  • Probability
  • Object-oriented programming
  • Python coding: if/else, loops, lists, dicts, sets
  • Numpy coding: matrix and vector operations
  • Linear regression
  • Gradient descent
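As a quick self-test for these prerequisites, here is a hypothetical snippet (not course code) that fits a linear regression by gradient descent using only Numpy. If every line makes sense to you, you are in good shape.

```python
import numpy as np

np.random.seed(42)

# Synthetic data: y = 2x + 1 plus a little noise.
X = np.random.randn(100)
Y = 2 * X + 1 + 0.1 * np.random.randn(100)

w, b = 0.0, 0.0   # parameters to learn
lr = 0.1          # learning rate
for _ in range(200):
    Yhat = w * X + b
    # Gradients of mean squared error with respect to w and b.
    grad_w = 2 * np.mean((Yhat - Y) * X)
    grad_b = 2 * np.mean(Yhat - Y)
    w -= lr * grad_w
    b -= lr * grad_b
```

After 200 iterations, w and b should be close to the true values 2 and 1.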


TIPS (for getting through the course):

  • Watch it at 2x.
  • Take handwritten notes. This will drastically increase your ability to retain the information.
  • Write down the equations. If you don't, I guarantee it will just look like gibberish.
  • Ask lots of questions on the discussion board. The more the better!
  • Realize that most exercises will take you days or weeks to complete.
  • Write code yourself, don't just sit there and look at my code.


USEFUL COURSE ORDERING:

  • (The Numpy Stack in Python)
  • Linear Regression in Python
  • Logistic Regression in Python
  • (Supervised Machine Learning in Python)
  • (Bayesian Machine Learning in Python: A/B Testing)
  • Deep Learning in Python
  • Practical Deep Learning in Theano and TensorFlow
  • (Supervised Machine Learning in Python 2: Ensemble Methods)
  • Convolutional Neural Networks in Python
  • (Easy NLP)
  • (Cluster Analysis and Unsupervised Machine Learning)
  • Unsupervised Deep Learning
  • (Hidden Markov Models)
  • Recurrent Neural Networks in Python
  • Artificial Intelligence: Reinforcement Learning in Python
  • Natural Language Processing with Deep Learning in Python


Who is the target audience?
  • Anyone who wants to learn about artificial intelligence, data science, machine learning, and deep learning
  • Both students and professionals
Curriculum For This Course
71 Lectures
05:43:05
Introduction and Outline
4 Lectures 28:45


Where to get the Code
02:41

Strategy for Passing the Course
05:56
Return of the Multi-Armed Bandit
9 Lectures 38:57
Problem Setup and The Explore-Exploit Dilemma
03:55

Epsilon-Greedy
01:48

Updating a Sample Mean
01:22

Comparing Different Epsilons
04:06

Optimistic Initial Values
02:56

UCB1
04:56

Bayesian / Thompson Sampling
09:52

Thompson Sampling vs. Epsilon-Greedy vs. Optimistic Initial Values vs. UCB1
05:11

Nonstationary Bandits
04:51
Build an Intelligent Tic-Tac-Toe Agent
11 Lectures 01:07:21
Naive Solution to Tic-Tac-Toe
03:50

Components of a Reinforcement Learning System
08:00

Notes on Assigning Rewards
02:41

The Value Function and Your First Reinforcement Learning Algorithm
16:33

Tic Tac Toe Code: Outline
03:16

Tic Tac Toe Code: Representing States
02:56

Tic Tac Toe Code: Enumerating States Recursively
06:14

Tic Tac Toe Code: The Environment
06:36

Tic Tac Toe Code: The Agent
05:48

Tic Tac Toe Code: Main Loop and Demo
06:02

Tic Tac Toe Summary
05:25
Markov Decision Processes
7 Lectures 24:37
Gridworld
02:13

The Markov Property
04:36

Defining and Formalizing the MDP
04:10

Future Rewards
03:16

Value Functions
04:38

Optimal Policy and Optimal Value Function
04:09

MDP Summary
01:35
Dynamic Programming
10 Lectures 40:17
Intro to Dynamic Programming and Iterative Policy Evaluation
03:06

Gridworld in Code
05:47

Iterative Policy Evaluation in Code
06:24

Policy Improvement
02:51

Policy Iteration
02:00

Policy Iteration in Code
03:46

Policy Iteration in Windy Gridworld
04:57

Value Iteration
03:58

Value Iteration in Code
02:14

Dynamic Programming Summary
05:14
Monte Carlo
9 Lectures 35:42
Monte Carlo Intro
03:10

Monte Carlo Policy Evaluation
05:45

Monte Carlo Policy Evaluation in Code
03:35

Policy Evaluation in Windy Gridworld
03:38

Monte Carlo Control
05:59

Monte Carlo Control in Code
04:04

Monte Carlo Control without Exploring Starts
02:58

Monte Carlo Control without Exploring Starts in Code
02:51

Monte Carlo Summary
03:42
Temporal Difference Learning
8 Lectures 24:40
Temporal Difference Intro
01:42

TD(0) Prediction
03:46

TD(0) Prediction in Code
02:27

SARSA
05:15

SARSA in Code
03:38

Q Learning
03:05

Q Learning in Code
02:13

TD Summary
02:34
Approximation Methods
9 Lectures 37:37
Approximation Intro
04:11

Linear Models for Reinforcement Learning
04:16

Features
04:02

Monte Carlo Prediction with Approximation
01:54

Monte Carlo Prediction with Approximation in Code
02:58

TD(0) Semi-Gradient Prediction
04:22

Semi-Gradient SARSA
03:08

Semi-Gradient SARSA in Code
04:08

Course Summary and Next Steps
08:38
Appendix
4 Lectures 45:09
How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow
17:32

How to Code by Yourself (part 1)
15:54

How to Code by Yourself (part 2)
09:23

Where to get discount coupons and FREE deep learning material
02:20
About the Instructor
Lazy Programmer Inc.
4.6 Average rating
11,144 Reviews
59,617 Students
18 Courses
Data scientist and big data engineer

I am a data scientist, big data engineer, and full stack software engineer.

For my master's thesis I worked on brain-computer interfaces using machine learning. These help non-verbal and non-mobile people communicate with their families and caregivers.

I have worked in online advertising and digital media as both a data scientist and big data engineer, and built various high-throughput web services around said data. I've created new big data pipelines using Hadoop/Pig/MapReduce. I've created machine learning models to predict click-through rate, news feed recommender systems using linear regression, Bayesian Bandits, and collaborative filtering and validated the results using A/B testing.

I have taught undergraduate and graduate students in data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics at universities such as Columbia University, NYU, Humber College, and The New School.

Multiple businesses have benefited from my web programming expertise. I do all the backend (server), frontend (HTML/JS/CSS), and operations/deployment work. Some of the technologies I've used are: Python, Ruby/Rails, PHP, Bootstrap, jQuery (JavaScript), Backbone, and Angular. For storage/databases I've used MySQL, Postgres, Redis, MongoDB, and more.