Udemy
30-Day Money-Back Guarantee

This course includes:

  • 8 hours on-demand video
  • 59 downloadable resources
  • Full lifetime access
  • Access on mobile and TV

Modern Reinforcement Learning: Actor-Critic Methods

How to Implement Cutting-Edge Artificial Intelligence Research Papers in the OpenAI Gym Using the PyTorch Framework
Rating: 4.4 out of 5 (104 ratings)
704 students
Created by Phil Tabor
Last updated 10/2020
English
English [Auto]

What you'll learn

  • How to code policy gradient methods in PyTorch
  • How to code Deep Deterministic Policy Gradients (DDPG) in PyTorch
  • How to code Twin Delayed Deep Deterministic Policy Gradients (TD3) in PyTorch
  • How to code actor-critic algorithms in PyTorch
  • How to implement cutting-edge artificial intelligence research papers in Python

Course content

6 sections • 58 lectures • 8h 10m total length

  • Preview
    03:41
  • Preview
    03:17
  • Preview
    03:51

  • Preview
    10:27
  • Calculating State Transition Probabilities
    1 question
  • Teaching an AI about Black Jack with Monte Carlo Prediction
    20:00
  • Teaching an AI How to Play Black Jack with Monte Carlo Control
    19:41
  • Review of Temporal Difference Learning Methods
    03:50
  • Teaching an AI about Balance with TD(0) Prediction
    09:42
  • Preview
    24:21

  • What's so Great About Policy Gradient Methods?
    07:38
  • Combining Neural Networks with Monte Carlo: REINFORCE Policy Gradient Algorithm
    05:02
  • Introducing the Lunar Lander Environment
    03:54
  • Coding the Agent's Brain: The Policy Gradient Network
    05:29
  • Coding the Policy Gradient Agent's Basic Functionality
    05:50
  • Coding the Agent's Learn Function
    06:04
  • Coding the Policy Gradient Main Loop and Watching our Agent Land on the Moon
    09:27
  • Actor Critic Learning: Combining Policy Gradients & Temporal Difference Learning
    04:12
  • Coding the Actor Critic Networks
    03:23
  • Coding the Actor Critic Agent
    08:20
  • Coding the Actor Critic Main Loop and Watching Our Agent Land on the Moon
    09:22

  • Getting up to Speed With Deep Q Learning
    04:44
  • How to Read and Understand Cutting Edge Research Papers
    06:11
  • Analyzing the DDPG Paper Abstract and Introduction
    07:00
  • Analyzing the Background Material
    05:55
  • What Algorithm Are We Going to Implement?
    08:03
  • What Results Should We Expect?
    09:37
  • What Other Solutions are Out There?
    04:31
  • What Model Architecture and Hyperparameters Do We Need?
    03:12
  • Handling the Explore-Exploit Dilemma: Coding the OU Action Noise Class
    03:37
  • Giving our Agent a Memory: Coding the Replay Memory Buffer Class
    07:04
  • Deep Q Learning for Actor Critic Methods: Coding the Critic Network Class
    15:49
  • Coding the Actor Network Class
    10:10
  • Giving our DDPG Agent Simple Autonomy: Coding the Basic Functions of Our Agent
    12:11
  • Giving our DDPG Agent a Brain: Coding the Agent's Learn Function
    09:43
  • Coding the Network Parameter Update Functionality
    08:16
  • Coding the Main Loop and Watching Our DDPG Agent Land on the Moon
    13:11

  • Some Tips on Reading this Paper
    01:39
  • Analyzing the TD3 Paper Abstract and Introduction
    09:32
  • What Other Solutions Have People Tried?
    03:36
  • Reviewing the Fundamental Concepts
    02:53
  • Is Overestimation Bias Even a Problem in Actor-Critic Methods?
    13:16
  • Why is Variance a Problem for Actor-Critic Methods?
    06:56
  • What Results Can We Expect?
    06:06
  • Coding the Brains of the TD3 Agent - The Actor and Critic Network Classes
    13:34
  • Giving our TD3 Agent Simple Autonomy - Coding the Basic Agent Functionality
    10:57
  • Giving our TD3 Agent a Brain - Coding the Learn Function
    10:31
  • Coding the Network Parameter Update Functionality
    11:32
  • Coding the Main Loop And Watching our Agent Learn to Walk
    09:44

  • A Quick Word on the Paper
    01:00
  • Getting Acquainted With a New Framework
    05:45
  • Checking Out What Has Been Done Before
    04:44
  • Inspecting the Foundation of this New Framework
    03:37
  • Digging Into the Mathematics of Soft Actor Critic
    11:00
  • Seeing How the New Algorithm Measures Up
    07:50
  • Coding the Neural Networks
    23:25
  • Coding the Soft Actor Critic Basic Functionality
    10:59
  • Coding the Soft Actor Critic Algorithm
    12:34
  • Coding the Main Loop and Evaluating Our Agent
    12:34

Requirements

  • Understanding of college level calculus
  • Prior courses in reinforcement learning
  • Able to code deep neural networks independently

Description

In this advanced course on deep reinforcement learning, you will learn how to implement policy gradient, actor-critic, deep deterministic policy gradient (DDPG), and twin delayed deep deterministic policy gradient (TD3) algorithms in a variety of challenging environments from the OpenAI Gym.

The course begins with a practical review of the fundamentals of reinforcement learning, including topics such as:

  • The Bellman Equation

  • Markov Decision Processes

  • Monte Carlo Prediction

  • Monte Carlo Control

  • Temporal Difference Prediction TD(0)

  • Temporal Difference Control with Q Learning

The course then moves straight into coding our first agent: a blackjack-playing artificial intelligence. From there we progress to teaching an agent to balance the cart pole using Q learning.
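
To make the last item in the list above concrete, here is a minimal sketch of a single tabular Q learning update in plain Python. It is an illustration, not the course's code: the function name `q_learning_update`, the dictionary-based Q table, and the toy numbers are all assumptions chosen for the example.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, done,
                      n_actions, alpha=0.1, gamma=0.99):
    """Apply one tabular Q learning update in place and return the TD error.

    q maps (state, action) pairs to estimated action values.
    """
    # Bootstrap from the greedy value of the next state, or 0 at episode end.
    if done:
        best_next = 0.0
    else:
        best_next = max(q[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    td_error = td_target - q[(state, action)]
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical toy values: two actions, next state already has estimates.
q = defaultdict(float)
q[(1, 0)] = 2.0
q[(1, 1)] = 4.0
td_error = q_learning_update(q, state=0, action=1, reward=1.0,
                             next_state=1, done=False,
                             n_actions=2, alpha=0.5, gamma=0.9)
print(td_error, q[(0, 1)])
```

The update uses the maximum over next-state action values regardless of which action the agent actually takes next, which is what makes Q learning an off-policy control method.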

After mastering the fundamentals, the pace quickens, and we move straight into an introduction to policy gradient methods. We cover the REINFORCE algorithm and use it to teach an artificial intelligence to land on the moon in the Lunar Lander environment from the OpenAI Gym. Next we progress to coding the one-step actor-critic algorithm, again to beat the Lunar Lander.
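
As a rough illustration of what REINFORCE computes, the sketch below builds the discounted Monte Carlo returns for one episode and the corresponding policy gradient loss. It is written in plain Python with floats standing in for the PyTorch tensors the course would use; the function names and the sample numbers are assumptions made for this example.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step of an episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each return reuses the one that follows it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs, returns):
    """Sum of -log pi(a_t | s_t) * G_t over the episode.

    Minimizing this loss performs gradient ascent on expected return.
    """
    return sum(-lp * g for lp, g in zip(log_probs, returns))


# Hypothetical three-step episode.
rewards = [1.0, 1.0, 1.0]
log_probs = [-0.5, -1.0, -2.0]  # log-probabilities of the actions taken
returns = discounted_returns(rewards, gamma=0.5)
loss = reinforce_loss(log_probs, returns)
print(returns, loss)
```

In a PyTorch implementation the log-probabilities would come from the policy network's output distribution, and calling backward on the loss would propagate the gradient into the network weights.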

With the fundamentals out of the way, we move on to our harder projects: implementing deep reinforcement learning research papers. We will start with Deep Deterministic Policy Gradients, which is an algorithm for teaching robots to excel at a variety of continuous control tasks.
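
One ingredient of DDPG as published is the soft update of the target networks, in which each target parameter slowly tracks its online counterpart. The sketch below shows the idea on flat lists of floats rather than PyTorch parameters; the name `soft_update` and the value of `tau` used here are illustrative assumptions, not the course's code.

```python
def soft_update(online_params, target_params, tau):
    """Return new target parameters: tau * online + (1 - tau) * target."""
    if len(online_params) != len(target_params):
        raise ValueError("parameter lists must have the same length")
    return [tau * w + (1.0 - tau) * w_target
            for w, w_target in zip(online_params, target_params)]


# Hypothetical parameters; a large tau is used only to make the effect visible.
online = [1.0, 2.0]
target = [0.0, 0.0]
target = soft_update(online, target, tau=0.5)
print(target)
```

A small `tau` keeps the targets used in the critic's bootstrapped update changing slowly, which is the stabilizing effect the soft update is meant to provide.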

Finally, we implement a state-of-the-art artificial intelligence algorithm: Twin Delayed Deep Deterministic Policy Gradients (TD3). This algorithm sets a new benchmark for performance in robotic control tasks, and we will demonstrate world-class performance in the Bipedal Walker environment from the OpenAI Gym.
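
Two of the ideas behind TD3 can be sketched in a few lines: target policy smoothing (adding clipped noise to the target action) and clipped double Q learning (bootstrapping from the smaller of two critic estimates). The code below uses scalar floats in place of network outputs; the function names and sample values are assumptions for illustration only.

```python
def clip(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def smoothed_target_action(target_action, noise, noise_clip, low, high):
    """Add clipped noise to the target action, then clip to the action bounds."""
    clipped_noise = clip(noise, -noise_clip, noise_clip)
    return clip(target_action + clipped_noise, low, high)


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Critic target using the minimum of the two target critics' estimates."""
    min_q = min(q1_next, q2_next)
    # No bootstrapping past a terminal state.
    return reward + (0.0 if done else gamma * min_q)


# Hypothetical scalar example.
action = smoothed_target_action(0.9, noise=0.8, noise_clip=0.5,
                                low=-1.0, high=1.0)
target = td3_target(reward=1.0, done=False, q1_next=3.0, q2_next=2.0,
                    gamma=0.5)
print(action, target)
```

Taking the minimum of the two estimates counteracts the overestimation bias that the course discusses, at the cost of a possible underestimate.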

By the end of the course, you will know the answers to the following fundamental questions in Actor-Critic methods:

  • Why should we bother with actor-critic methods when deep Q learning is so successful?

  • Can the advances in deep Q learning be used in other fields of reinforcement learning?

  • How can we solve the explore-exploit dilemma with a deterministic policy?

  • How do we get overestimation bias in actor-critic methods?

  • How do we deal with the inherent errors in deep neural networks?

This course is for the highly motivated and advanced student. To succeed, you must have prior coursework in all of the following topics:

  • College level calculus

  • Reinforcement learning

  • Deep learning

The pace of the course is brisk, but the payoff is that you will come out knowing how to read cutting-edge research papers and turn them into functional code as quickly as possible.

Who this course is for:

  • Advanced students of artificial intelligence who want to implement state-of-the-art academic research papers

Instructor

Phil Tabor
Machine Learning Engineer
  • 4.5 Instructor Rating
  • 513 Reviews
  • 2,121 Students
  • 2 Courses

In 2012 I received my PhD in experimental condensed matter physics from West Virginia University. Following that, I was a dry etch process engineer for Intel Corporation, where I leveraged big data to make essential process improvements for mission-critical products. Since leaving Intel in 2015, I have worked as a contract and freelance deep learning and artificial intelligence engineer.

© 2021 Udemy, Inc.