Artificial Intelligence: Reinforcement Learning in Python

Complete guide to Reinforcement Learning, with Stock Trading and Online Advertising Applications
Bestseller
4.5 (7,059 ratings)
36,385 students enrolled
Last updated 5/2020
English
English [Auto-generated], French [Auto-generated], German [Auto-generated], Italian [Auto-generated], Portuguese [Auto-generated], Spanish [Auto-generated]
Price: $29.99
30-Day Money-Back Guarantee
This course includes
  • 12 hours on-demand video
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Apply gradient-based supervised machine learning methods to reinforcement learning
  • Understand reinforcement learning on a technical level
  • Understand the relationship between reinforcement learning and psychology
  • Implement 17 different reinforcement learning algorithms
Course content
104 lectures, 11:46:47 total length
+ Welcome
5 lectures 34:34
Course Outline and Big Picture
07:55
Where to get the Code
04:36
How to Succeed in this Course
03:13
Warmup
15:36
+ Return of the Multi-Armed Bandit
24 lectures 02:51:41
Section Introduction: The Explore-Exploit Dilemma
10:17
Epsilon-Greedy Theory
07:04
Calculating a Sample Mean (pt 1)
05:56
Epsilon-Greedy Beginner's Exercise Prompt
05:05
Designing Your Bandit Program
04:09
Epsilon-Greedy in Code
07:12
Comparing Different Epsilons
06:02
Optimistic Initial Values Theory
05:40
Optimistic Initial Values Beginner's Exercise Prompt
02:26
Optimistic Initial Values Code
04:18
UCB1 Theory
14:32
UCB1 Beginner's Exercise Prompt
02:14
UCB1 Code
03:28
Bayesian Bandits / Thompson Sampling Theory (pt 1)
12:43
Bayesian Bandits / Thompson Sampling Theory (pt 2)
17:35
Thompson Sampling Beginner's Exercise Prompt
02:50
Thompson Sampling Code
05:03
Thompson Sampling With Gaussian Reward Theory
11:24
Thompson Sampling With Gaussian Reward Code
06:18
Why don't we just use a library?
05:40
Nonstationary Bandits
07:11
Bandit Summary, Real Data, and Online Learning
06:29
(Optional) Alternative Bandit Designs
10:05
+ High Level Overview of Reinforcement Learning
3 lectures 23:00
On Unusual or Unexpected Strategies of RL
06:10
From Bandits to Full Reinforcement Learning
08:42
+ Markov Decision Processes
14 lectures 01:58:51
MDP Section Introduction
06:19
Gridworld
12:35
Choosing Rewards
03:58
The Markov Property
06:12
Markov Decision Processes (MDPs)
14:42
Future Rewards
09:34
Value Functions
05:07
The Bellman Equation (pt 1)
08:46
The Bellman Equation (pt 2)
06:42
The Bellman Equation (pt 3)
06:09
Bellman Examples
22:24
Optimal Policy and Optimal Value Function (pt 1)
09:17
Optimal Policy and Optimal Value Function (pt 2)
04:08
MDP Summary
02:58
+ Dynamic Programming
11 lectures 45:17
Intro to Dynamic Programming and Iterative Policy Evaluation
03:06
Gridworld in Code
05:47
Designing Your RL Program
05:00
Iterative Policy Evaluation in Code
06:24
Policy Improvement
02:51
Policy Iteration
02:00
Policy Iteration in Code
03:46
Policy Iteration in Windy Gridworld
04:57
Value Iteration
03:58
Value Iteration in Code
02:14
Dynamic Programming Summary
05:14
+ Monte Carlo
9 lectures 35:42
Monte Carlo Intro
03:10
Monte Carlo Policy Evaluation
05:45
Monte Carlo Policy Evaluation in Code
03:35
Policy Evaluation in Windy Gridworld
03:38
Monte Carlo Control
05:59
Monte Carlo Control in Code
04:04
Monte Carlo Control without Exploring Starts
02:58
Monte Carlo Control without Exploring Starts in Code
02:51
Monte Carlo Summary
03:42
+ Temporal Difference Learning
8 lectures 24:40
Temporal Difference Intro
01:42
TD(0) Prediction
03:46
TD(0) Prediction in Code
02:27
SARSA
05:15
SARSA in Code
03:38
Q Learning
03:05
Q Learning in Code
02:13
TD Summary
02:34
+ Approximation Methods
9 lectures 37:37
Approximation Intro
04:11
Linear Models for Reinforcement Learning
04:16
Features
04:02
Monte Carlo Prediction with Approximation
01:54
Monte Carlo Prediction with Approximation in Code
02:58
TD(0) Semi-Gradient Prediction
04:22
Semi-Gradient SARSA
03:08
Semi-Gradient SARSA in Code
04:08
Course Summary and Next Steps
08:38
+ Stock Trading Project with Reinforcement Learning
9 lectures 01:06:57
Stock Trading Project Section Introduction
05:13
Data and Environment
12:22
How to Model Q for Q-Learning
09:37
Design of the Program
06:45
Code pt 1
07:59
Code pt 2
09:40
Code pt 3
04:28
Code pt 4
07:16
Stock Trading Project Discussion
03:37
+ Appendix / FAQ
12 lectures 02:28:28
What is the Appendix?
02:48
Windows-Focused Environment Setup 2018
20:20
How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow
17:32
How to Code by Yourself (part 1)
15:54
How to Code by Yourself (part 2)
09:23
How to Succeed in this Course (Long Version)
10:24
Is this for Beginners or Experts? Academic or Practical? Fast or slow-paced?
22:04
Proof that using Jupyter Notebook is the same as not using it
12:29
Python 2 vs Python 3
04:38
What order should I take your courses in? (part 1)
11:18
What order should I take your courses in? (part 2)
16:07
BONUS: Where to get discount coupons and FREE deep learning material
05:31
Requirements
  • Calculus (derivatives)
  • Probability / Markov Models
  • Numpy, Matplotlib
  • Beneficial to have experience with at least a few supervised machine learning methods
  • Gradient descent
  • Good object-oriented programming skills
Description

When people talk about artificial intelligence, they usually don’t mean supervised and unsupervised machine learning.

These tasks are pretty trivial compared to what we think of AIs doing - playing chess and Go, driving cars, and beating video games at a superhuman level.

Reinforcement learning has recently become popular for doing all of that and more.

Much like deep learning, much of the theory was developed in the 1970s and 80s, but only recently have we been able to observe firsthand the amazing results that are possible.

In 2016, we saw Google’s AlphaGo beat the world champion in Go.

We saw AIs playing video games like Doom and Super Mario.

Self-driving cars have started driving on real roads with other drivers and even carrying passengers (Uber), all without human assistance.

If that sounds amazing, brace yourself for the future because the law of accelerating returns dictates that this progress is only going to continue to increase exponentially.

Learning about supervised and unsupervised machine learning is no small feat. To date I have over SIXTEEN (16!) courses just on those topics alone.

And yet reinforcement learning opens up a whole new world. As you’ll learn in this course, the reinforcement learning paradigm is more different from supervised and unsupervised learning than they are from each other.

It’s led to new and amazing insights both in behavioral psychology and neuroscience. As you’ll learn in this course, there are many analogous processes when it comes to teaching an agent and teaching an animal or even a human. It’s the closest thing we have so far to a true general artificial intelligence.

What’s covered in this course?

  • The multi-armed bandit problem and the explore-exploit dilemma

  • Ways to calculate means and moving averages and their relationship to stochastic gradient descent (a short sketch of this follows the list below)

  • Markov Decision Processes (MDPs)

  • Dynamic Programming

  • Monte Carlo

  • Temporal Difference (TD) Learning (Q-Learning and SARSA) (see the Q-learning sketch after this list)

  • Approximation Methods (i.e. how to plug a deep neural network or other differentiable model into your RL algorithm)

  • Project: Apply Q-Learning to build a stock trading bot
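
For a concrete taste of the first two bullets, here is a minimal sketch of an epsilon-greedy bandit whose value estimates are maintained with the incremental sample-mean update. This is my own illustration rather than the course's code; the arm means and reward distribution are made up for the example. Replacing the 1/N step size with a constant turns the update into a stochastic gradient descent step on the squared error between the reward and the estimate, which is the relationship mentioned above.

    import numpy as np

    class Bandit:
        """One arm with an unknown true mean reward (Gaussian rewards assumed here)."""
        def __init__(self, true_mean):
            self.true_mean = true_mean
            self.estimate = 0.0  # Q: current estimate of this arm's mean reward
            self.n = 0           # number of times this arm has been pulled

        def pull(self):
            return np.random.randn() + self.true_mean

        def update(self, reward):
            # Incremental sample mean: Q <- Q + (1/N) * (reward - Q).
            # With a constant step size instead of 1/N, this is an SGD step on
            # the squared error (reward - Q)^2 and also tracks nonstationary rewards.
            self.n += 1
            self.estimate += (reward - self.estimate) / self.n

    def run(true_means=(1.0, 2.0, 3.0), eps=0.1, trials=10_000):
        bandits = [Bandit(m) for m in true_means]
        for _ in range(trials):
            # Explore a random arm with probability eps, otherwise exploit the best estimate.
            if np.random.random() < eps:
                j = np.random.randint(len(bandits))
            else:
                j = int(np.argmax([b.estimate for b in bandits]))
            bandits[j].update(bandits[j].pull())
        return [round(b.estimate, 2) for b in bandits]

    print(run())  # estimates should land near (1.0, 2.0, 3.0)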

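As a preview of the temporal difference material that the stock trading project builds on, here is a generic tabular Q-learning sketch. It is again only an illustration: the environment interface (reset(), step(), n_actions) is an assumed Gym-style convention, not something defined in this course description.

    import numpy as np
    from collections import defaultdict

    def q_learning(env, episodes=1_000, alpha=0.1, gamma=0.9, eps=0.1):
        # env is assumed to expose: reset() -> state,
        # step(action) -> (next_state, reward, done), and an n_actions attribute.
        Q = defaultdict(lambda: np.zeros(env.n_actions))
        for _ in range(episodes):
            s = env.reset()
            done = False
            while not done:
                # Epsilon-greedy behaviour policy.
                if np.random.random() < eps:
                    a = np.random.randint(env.n_actions)
                else:
                    a = int(np.argmax(Q[s]))
                s2, r, done = env.step(a)
                # Q-learning bootstraps off the best next action (off-policy);
                # SARSA would instead use the action actually taken in s2.
                target = r + (0.0 if done else gamma * np.max(Q[s2]))
                Q[s][a] += alpha * (target - Q[s][a])
                s = s2
        return Q
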
If you’re ready to take on a brand new challenge, and learn about AI techniques that you’ve never seen before in traditional supervised machine learning, unsupervised machine learning, or even deep learning, then this course is for you.

See you in class!


Suggested Prerequisites:

  • Calculus

  • Probability

  • Object-oriented programming

  • Python coding: if/else, loops, lists, dicts, sets

  • Numpy coding: matrix and vector operations

  • Linear regression

  • Gradient descent


TIPS (for getting through the course):

  • Watch it at 2x.

  • Take handwritten notes. This will drastically increase your ability to retain the information.

  • Write down the equations. If you don't, I guarantee it will just look like gibberish.

  • Ask lots of questions on the discussion board. The more the better!

  • Realize that most exercises will take you days or weeks to complete.

  • Write code yourself, don't just sit there and look at my code.


WHAT ORDER SHOULD I TAKE YOUR COURSES IN?:

  • Check out the lecture "What order should I take your courses in?" (available in the Appendix of any of my courses, including the free Numpy course)



Who this course is for:
  • Anyone who wants to learn about artificial intelligence, data science, machine learning, and deep learning
  • Both students and professionals