Deep Reinforcement Learning using python

Rating: 3.8 (141 reviews)

Complete guide to reinforcement learning | Stock Trading | Games

Created by Riad Almadani • 70,000+ Students

Last updated 2/2026

English

What you'll learn

Understand deep reinforcement learning and its applications
Build your own neural network
Implement 5 different reinforcement learning projects
Learn many ways to improve your robot's performance

Course content

7 sections • 63 lectures • 9h 8m total length

What is reinforcement learning? • 1:38
Quiz for lecture 1
Policy, Value function and Q function • 2:14
Learn how the policy acts as the agent's brain to choose actions for maximum rewards, and how the value function and Q function assign expected rewards to states and to state-action pairs.
Lecture 2 quiz
What are Neural Networks? • 2:13
Lecture 3 quiz
Optimal Q function • 2:24
Lecture 4 quiz

What is Grid World Game? • 13:37
Lecture 7 quiz
How to use Grid World environment? • 7:09
Lecture 8 quiz
How to build your network? • 3:44
Lecture 9 quiz
How to build your first Q network using PyTorch? • 8:42
Build a Q-network in PyTorch by defining a neural network class with three linear layers, ReLU activation, and a forward pass mapping a 3D grid state to four action values.
Lecture 10 quiz
How to make your neural network learn? • 3:43
Lecture 11 quiz
Exploration & Exploitation using epsilon greedy • 1:08
Explore how the agent balances exploration and exploitation with epsilon-greedy actions, starting with random actions to discover states, then gradually reducing exploration to rely on the neural network for decisions.
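The epsilon-greedy idea described above can be sketched in a few lines of plain Python. This is only an illustration, not the course's code: the function names, the multiplicative decay factor 0.995 and the floor of 0.05 are assumptions chosen for the example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decay_epsilon(epsilon, decay=0.995, minimum=0.05):
    """Shrink epsilon after each episode (hypothetical schedule)."""
    return max(minimum, epsilon * decay)
```

Starting with `epsilon = 1.0` makes every action random; calling `decay_epsilon` once per episode gradually shifts the agent toward the network's own estimates.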
Training your neural network using PyTorch part 1 • 20:06
Train a neural network over thousands of episodes, preprocessing states, adding noise, and using an epsilon-greedy policy in PyTorch, then compute targets with a Bellman equation and a target maker.
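For readers unfamiliar with the Bellman target mentioned above, here is a minimal sketch of the standard one-step Q-learning target. It assumes the usual form (reward plus discounted best next Q-value, with no bootstrap on terminal states); the function name and the default discount factor are illustrative, and the lecture's own "target maker" may differ.

```python
def q_learning_target(reward, next_q_values, done, gamma=0.9):
    """One-step Bellman target: r + gamma * max_a' Q(s', a').

    gamma=0.9 is an assumed discount factor, not a value from the course.
    """
    if done:
        # Terminal transition: there is no next state to bootstrap from.
        return reward
    return reward + gamma * max(next_q_values)
```

During training, the network's prediction for the action actually taken is regressed toward this target.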
Lecture 13 quiz
Training your neural network using PyTorch part 2 • 7:28
Batch training • 2:03
Train on batches Python code • 19:24
Train a neural network on batches with a replay buffer of experiences, memory 5000 and batch size 200, leveraging GPU acceleration for faster reinforcement learning.
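A replay buffer of this kind can be sketched with the standard library alone. The capacity of 5000 and batch size of 200 come from the description above; the class name, method names and tuple layout are assumptions made for the example, and the GPU transfer step is omitted.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size memory of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity=5000):
        # A deque with maxlen silently drops the oldest experience when full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size=200):
        # Uniform sampling without replacement, then regroup by field.
        batch = random.sample(list(self.memory), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.memory)
```

Training typically starts only once `len(buffer)` is at least the batch size, so that `sample` has enough experiences to draw from.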
Lecture 15 quiz
Reward metric • 3:41
Develop a reward metric that computes the mean of the last n rewards to track agent improvement during training and visualize mean rewards across about 10,000 episodes.
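The metric described above amounts to a moving average over recent episode rewards. A minimal sketch, with the function name and the default window of 100 chosen only for illustration:

```python
def mean_last_n(rewards, n=100):
    """Mean of the last n episode rewards (fewer if the history is shorter)."""
    window = rewards[-n:]
    if not window:
        # No episodes recorded yet.
        return 0.0
    return sum(window) / len(window)
```

Appending `mean_last_n(rewards)` to a list after every episode gives a smooth curve that can be plotted with Matplotlib to track improvement.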
Lecture 17 quiz
Target network • 2:10
Train your agent with target network Python code • 5:37
Lecture 19 quiz

Mountain Car in Python • 6:03
Lecture 20 quiz
Dynamics network • 3:20
Lecture 20 quiz
Epsilon Greedy strategy Mountain Car game in Python • 8:29
Dynamics Network with Python • 5:36
Multivariate Gaussian distribution • 4:59
Lecture 21 quiz
Multivariate Gaussian distribution with Python • 4:45
Model based exploration strategy with Mountain Car in Python • 14:12
What is ICM module? • 1:09
Explore how the ICM module reshapes rewards to promote visiting new or rarely seen states, and introduces filter, inverse, and forward neural networks, with each component covered in separate videos.
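One common way to reshape rewards in a curiosity module is to add an intrinsic bonus equal to the forward network's prediction error, so poorly predicted (new or rarely seen) states pay more. The sketch below assumes that formulation; the function names and the `scale` parameter are hypothetical, and the lectures may combine the terms differently.

```python
def intrinsic_reward(predicted_next, actual_next, scale=1.0):
    """Squared error between predicted and actual next-state representation."""
    error = sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))
    return scale * error


def reshaped_reward(extrinsic, predicted_next, actual_next, scale=1.0):
    """Environment reward plus the curiosity bonus (assumed additive form)."""
    return extrinsic + intrinsic_reward(predicted_next, actual_next, scale)
```

Here `predicted_next` stands for the forward network's output and `actual_next` for the filter network's encoding of the state actually reached.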
Lecture 27 quiz
Filter network • 1:24
Describe the filter neural network, the first component of the ICM module, which removes noise from any input state to create a configurable vector representation.
Lecture 28 quiz
Building Filter net Python code • 8:29
Build a filter net in PyTorch, with three linear layers and ReLU activations followed by tanh, mapping the state input to a 3-node output on CUDA.
Inverse network • 8:42
Lecture 30 quiz
Building Inverse net Python code • 9:02
Build an inverse neural network in PyTorch by defining an inverse net class with three linear layers, ReLU activations, and a softmax classifier for batch action probabilities.
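The softmax step that turns the inverse network's raw outputs into action probabilities can be illustrated without PyTorch. This is a plain-Python sketch with assumed function names; in the lecture the same operation would be applied to a tensor of logits.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def batch_softmax(batch_logits):
    """Apply softmax independently to every row of a batch."""
    return [softmax(row) for row in batch_logits]
```

Each row of the result gives, for one transition, the estimated probability of each action having caused the observed change of state.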
Lecture 31 quiz
Forward network • 2:10
Lecture 32 quiz
Building Forward network Python code • 7:42
Build a forward neural network to predict next state representation in deep reinforcement learning. Use three linear layers with 32 nodes, ReLU, and tanh after one-hot action encoding and concatenation.
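The input preparation described above (one-hot action encoding followed by concatenation) might look like the following in plain Python. The function names are hypothetical, and the default of three actions is an assumption based on the Mountain Car setting.

```python
def one_hot(action, n_actions):
    """Encode an action index as a vector with a single 1.0 entry."""
    vec = [0.0] * n_actions
    vec[action] = 1.0
    return vec


def forward_net_input(state_repr, action, n_actions=3):
    """Concatenate the state representation with the one-hot action."""
    return list(state_repr) + one_hot(action, n_actions)
```

With a 3-node state representation and three actions, the forward network would receive a 6-element input under this sketch.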
Lecture 33 quiz
Building Agent Q network & Target Q network Python code • 4:04
Build an agent Q network and a target Q network for the mountain car environment, using input shape two and output shape three, with three linear layers and ReLU activations.
Training Q network with ICM • 2:59
Training Agent Q network with ICM Python code • 19:47
Lecture 36 quiz
What is RND module? • 5:34
Lecture 37 quiz
Building P net & T net Python code • 4:21
Lecture 38 quiz
Training Agent Q network with RND module • 20:22

Flappy bird game • 2:03
Flappy bird game Python code • 11:15
quiz
Building convolution Q network • 19:13
quiz
Conv Q network with epsilon greedy approach Python code • 14:22
2-steps Q network • 9:53
2-steps Q network Python code • 14:01
Learn to implement a two-step convolutional Q-network that processes stacked states, uses epsilon-greedy actions and a target network, and trains with two-step targets from a replay buffer, outperforming a basic Q-network.
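A two-step target extends the one-step Bellman target by using two observed rewards before bootstrapping. The sketch below assumes the standard n-step form with n = 2; the function name, argument order and the default discount of 0.99 are illustrative rather than taken from the lecture.

```python
def two_step_target(r0, r1, q_after_two_steps, done0, done1, gamma=0.99):
    """Target r0 + gamma*r1 + gamma^2 * max_a Q_target(s_{t+2}, a)."""
    if done0:
        # Episode ended after the first step: nothing follows.
        return r0
    if done1:
        # Episode ended after the second step: no bootstrap term.
        return r0 + gamma * r1
    return r0 + gamma * r1 + gamma ** 2 * max(q_after_two_steps)
```

The Q-values in `q_after_two_steps` would come from the target network evaluated on the state reached two steps later.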
Prioritized Experience Replay buffer • 4:18
Explore prioritized experience replay buffer to bias sampling toward experiences with higher losses, adjust with alpha and beta to balance exploration and bias correction, accelerating policy optimization.
quiz
Prioritized Experience Replay buffer Python code • 16:47
Build a prioritized replay buffer with alpha 0.6 and beta starting at 0.4, updating priorities by losses and annealing beta to 1, improving mean rewards.
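The roles of alpha and beta can be sketched with a few helper functions. The values 0.6, 0.4 and the annealing target of 1 come from the description above; the function names, the annealing step size and the normalization by the maximum weight are assumptions following the usual formulation of prioritized replay.

```python
import random


def sampling_probabilities(priorities, alpha=0.6):
    """P(i) = p_i^alpha / sum_k p_k^alpha; priorities must be positive."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probabilities, beta=0.4):
    """w_i = (N * P(i))^(-beta), divided by the largest weight for stability."""
    n = len(probabilities)
    weights = [(n * p) ** (-beta) for p in probabilities]
    largest = max(weights)
    return [w / largest for w in weights]


def anneal_beta(beta, step=1e-4):
    """Move beta toward 1 (hypothetical linear schedule)."""
    return min(1.0, beta + step)


def sample_indices(probabilities, batch_size, rng=random):
    """Draw experience indices in proportion to their sampling probability."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=batch_size)
```

In a training loop, each sampled experience's loss is multiplied by its importance weight, and its priority is then set from the new loss (usually plus a small constant so it never reaches zero).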
Dueling Q network • 15:06
quiz
Dueling Q network Python code • 11:33
In Python, implement a dueling Q-network with a convolutional backbone and two branches for state value and advantages, combine to compute Q-values; show faster training than a basic Q-network.
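The step that combines the two branches is often written as Q(s, a) = V(s) + A(s, a) minus the mean advantage. The sketch below assumes that mean-subtraction variant, which the lecture may or may not use; the function name is illustrative.

```python
def dueling_q_values(state_value, advantages):
    """Combine a scalar state value with per-action advantages."""
    # Subtracting the mean advantage keeps V and A identifiable.
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]
```

In the network itself, `state_value` and `advantages` would be the outputs of the two branches that sit on top of the shared convolutional backbone.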
quiz

Basics of Trading • 7:43
Build an agent that learns when to buy and sell shares using candlestick charts, open-high-low-close data, and basic indicators to predict market direction across time frames.
Stock Data Preprocessing • 12:54
Process stock data with pandas by loading a CSV, renaming columns, and dropping unused fields. Filter candlesticks with epsilon checks and normalize high, low, and close relative to open prices.
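The normalization step can be illustrated on a single candlestick. This sketch uses plain Python instead of pandas to stay self-contained; the function names are hypothetical, and the epsilon check shown (dropping candles whose high and low are nearly equal) is only one guess at what the lecture's filter does.

```python
def normalize_candle(open_price, high, low, close):
    """Express high, low and close as relative changes from the open."""
    return {
        "high": (high - open_price) / open_price,
        "low": (low - open_price) / open_price,
        "close": (close - open_price) / open_price,
    }


def keep_candle(high, low, eps=1e-8):
    """Assumed filter: discard flat candles with almost no price range."""
    return (high - low) > eps
```

Relative values make candles from different price levels comparable, which is generally helpful before feeding them to a neural network.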
Building the trading environment • 15:44
Building dueling conv1d Q network • 8:13
Train your trading robot • 12:56

Requirements

NumPy, Matplotlib, pandas
Gradient descent
Object-oriented programming
General understanding of deep learning

Description

Welcome to Deep Reinforcement Learning using python!

Have you ever asked yourself how smart robots are created?

Reinforcement learning is a sub-field of machine learning concerned with creating intelligent robots. It has achieved impressive results in recent years: we can now build robots that beat humans at very hard games such as Go (as AlphaGo did) and chess.

Deep reinforcement learning combines reinforcement learning with deep learning, which is also a sub-field of machine learning and uses special models called neural networks.

In this course we will cover deep reinforcement learning through the following sections:

Section 1: An Introduction to Deep Reinforcement Learning
In this section we will study the fundamentals of deep reinforcement learning. These include the policy, value function, Q function and neural networks.
Section 2: Setting up the environment
In this section we will learn how to create our virtual environment and install all required packages.
Section 3: Grid World Game & Deep Q-Learning
In this section we will learn how to build our first smart robot to solve the Grid World game.
Here we will learn how to build and train our neural network and how to balance exploration and exploitation.
Section 4: Mountain Car game & Deep Q-Learning
In this section we will build a robot to solve the Mountain Car game.
Here we will learn how to build the ICM module and the RND module to address the sparse reward problem in the Mountain Car game.
Section 5: Flappy bird game & Deep Q-learning
In this section we will learn how to build a smart robot to solve the Flappy bird game.
Here we will learn how to build several variants of the Q network, such as the dueling Q network, prioritized Q network and 2-steps Q network.
Section 6: Ms Pacman game & Deep Q-Learning
In this section we will learn how to build a smart robot to solve the Ms Pacman game.
Here we will learn how to build other variants of the Q network, such as the noisy Q network, double Q network and n-steps Q network.
Section 7: Stock trading & Deep Q-Learning
In this section we will learn how to build a smart robot for stock trading.

Who this course is for:

Anyone who wants to learn about artificial intelligence and deep learning
Students & professionals

Course content by section

An Introduction to Deep Reinforcement Learning: 4 lectures • 8min
Setting up the environment: 3 lectures • 12min
Grid World Game & Deep Q-Learning: 13 lectures • 1hr 39min
Mountain Car game & Deep Q-Learning: 20 lectures • 2hr 23min
Flappy bird game & Deep Q-learning: 10 lectures • 1hr 59min
Ms Pacman game & Deep Q-Learning: 8 lectures • 1hr 50min
Stock trading & Deep Q-Learning: 5 lectures • 58min