
Learn how the policy acts as the agent's brain, choosing actions to maximize rewards, and how the value function and state function assign expected rewards to states.
Learn to create and activate a conda virtual environment, install essential packages such as numpy, pandas, matplotlib, gym, retro, opencv, pygame, and pytorch, and start a Jupyter notebook.
Build a Q-network in PyTorch by defining a neural network class with three linear layers, ReLU activations, and a forward pass that maps a 3D grid state to four action values.
Explore how the agent balances exploration and exploitation with epsilon-greedy actions, starting with random actions to discover states, then gradually reducing exploration to rely on the neural network for decisions.
Train a neural network over thousands of episodes in PyTorch: preprocess states, add noise, and act with an epsilon-greedy policy, then compute targets with the Bellman equation and a target maker.
Train a neural network on batches drawn from a replay buffer of experiences (memory size 5000, batch size 200), leveraging GPU acceleration for faster reinforcement learning.
Develop a reward metric that computes the mean of the last n rewards to track agent improvement during training and visualize mean rewards across about 10,000 episodes.
Explore how the ICM module reshapes rewards to promote visiting new or rarely seen states, and introduces filter, inverse, and forward neural networks, with each component covered in separate videos.
Describe the filter neural network, the first component of the ICM module, which removes noise from any input state to create a configurable vector representation.
Build a filter net in PyTorch with three linear layers, ReLU activations, and a final tanh, mapping the state input to a 3-node output on CUDA.
Build an inverse neural network in PyTorch by defining an inverse net class with three linear layers, ReLU activations, and a softmax classifier for batch action probabilities.
Build a forward neural network to predict next state representation in deep reinforcement learning. Use three linear layers with 32 nodes, ReLU, and tanh after one-hot action encoding and concatenation.
Build an agent Q network and a target Q network for the mountain car environment, using input shape two and output shape three, with three linear layers and ReLU activations.
Learn to implement a two-step convolutional Q-network that processes stacked states, uses epsilon-greedy actions and a target network, and trains with two-step targets from a replay buffer, outperforming a basic Q-network.
Explore the prioritized experience replay buffer, which biases sampling toward experiences with higher losses; tune alpha and beta to balance prioritization against bias correction and accelerate policy optimization.
Build a prioritized replay buffer with alpha 0.6 and beta starting at 0.4, updating priorities by losses and annealing beta to 1, improving mean rewards.
In Python, implement a dueling Q-network with a convolutional backbone and two branches, one for the state value and one for the advantages, then combine them to compute Q-values; show faster training than a basic Q-network.
Build an agent that learns when to buy and sell shares using candlestick charts, open-high-low-close data, and basic indicators to predict market direction across time frames.
Process stock data with pandas by loading a CSV, renaming columns, and dropping unused fields. Filter candlesticks with epsilon checks and normalize high, low, and close relative to open prices.
Welcome to Deep Reinforcement Learning using Python!
Have you ever asked yourself how smart robots are created?
Reinforcement learning is the sub-field of machine learning concerned with creating intelligent robots. It has achieved impressive results in recent years: we can now build robots that beat humans at very hard games such as Go (as AlphaGo did) and chess.
Deep reinforcement learning combines reinforcement learning with deep learning, another sub-field of machine learning that uses special algorithms called neural networks.
In this course we will cover deep reinforcement learning through the following sections:
Section 1: An Introduction to Deep Reinforcement Learning
In this section we will study the fundamentals of deep reinforcement learning. These include the policy, the value function, the Q function, and neural networks.
Section 2: Setting up the environment
In this section we will learn how to create our virtual environment and install all required packages.
Section 3: Grid World Game & Deep Q-Learning
In this section we will learn how to build our first smart robot to solve the Grid World game.
Here we will learn how to build and train our neural network and how to balance exploration and exploitation.
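The exploration/exploitation trade-off and the training target used in this section can be illustrated without any deep learning library. The following is a minimal sketch in plain Python, not the course's code: the function names, the decay rate, the epsilon floor, and the discount factor are all assumptions chosen for illustration. In the course, the Q-values would come from the PyTorch network rather than from a hand-written list.

```python
import random


def epsilon_greedy(q_values, epsilon):
    """Pick a random action with probability epsilon, else the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    # Index of the largest Q-value (first one wins on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decay_epsilon(epsilon, rate=0.999, floor=0.05):
    """Shrink epsilon multiplicatively, never below a floor.

    The rate and floor are assumed values, not taken from the course.
    """
    return max(floor, epsilon * rate)


def q_target(reward, next_q_values, done, gamma=0.99):
    """One-step Bellman target: r + gamma * max_a' Q(s', a').

    At the end of an episode there is no next state, so the target is r.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

With epsilon close to 1 the agent acts almost randomly and discovers states; as `decay_epsilon` is applied after each episode, the agent relies more and more on the Q-values.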
Section 4: Mountain Car game & Deep Q-Learning
In this section we will try to build a robot to solve the Mountain Car game.
Here we will learn how to build an ICM module and an RND module to solve the sparse reward problem in the Mountain Car game.
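To make the idea of a curiosity bonus concrete, here is a minimal sketch of how an ICM-style intrinsic reward is commonly computed: the forward network predicts the next state's feature vector, and the prediction error becomes an extra reward. This is an illustration under assumptions, not the course's implementation: the function names and the `scale` parameter are hypothetical, and plain Python lists stand in for the tensors produced by the filter and forward networks.

```python
def intrinsic_reward(predicted_next, actual_next, scale=1.0):
    """Curiosity bonus: scaled half squared error of the forward prediction.

    predicted_next: features the forward model predicted for the next state.
    actual_next:    features actually observed for the next state.
    scale:          assumed weighting factor for the bonus.
    """
    if len(predicted_next) != len(actual_next):
        raise ValueError("feature vectors must have the same length")
    error = sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))
    return scale * 0.5 * error


def total_reward(extrinsic, predicted_next, actual_next, scale=1.0):
    """Environment reward plus the curiosity bonus."""
    return extrinsic + intrinsic_reward(predicted_next, actual_next, scale)
```

States the forward model already predicts well yield a bonus near zero, while new or rarely seen states yield a larger one, which is what pushes the agent to explore when the environment reward alone is uninformative.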
Section 5: Flappy Bird game & Deep Q-Learning
In this section we will learn how to build a smart robot to solve the Flappy Bird game.
Here we will learn how to build several variants of the Q network, such as the dueling Q network, the prioritized Q network, and the 2-step Q network.
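Two of these variants reduce to short formulas, sketched below in plain Python. The dueling combination shown here is the common mean-subtracted form, and the prioritized-replay functions use the usual power-law probabilities with importance-sampling weights; the course may use different variants. The function names are hypothetical, and the defaults alpha 0.6 and beta 0.4 follow the values mentioned in the lecture summaries.

```python
def dueling_q_values(state_value, advantages):
    """Combine the two dueling branches: Q(s, a) = V(s) + A(s, a) - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def sampling_probabilities(priorities, alpha=0.6):
    """Turn positive priorities (e.g. losses) into sampling probabilities.

    alpha = 0 gives uniform sampling; larger alpha favors high priorities.
    """
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probabilities, beta=0.4):
    """Weights that correct the bias introduced by non-uniform sampling.

    beta = 1 fully corrects the bias; weights are normalized so the
    largest one equals 1.
    """
    n = len(probabilities)
    weights = [(n * p) ** (-beta) for p in probabilities]
    max_w = max(weights)
    return [w / max_w for w in weights]
```

Experiences with higher priority are sampled more often but receive a smaller weight in the loss; annealing beta toward 1 during training strengthens that correction.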
Section 6: Ms Pacman game & Deep Q-Learning
In this section we will learn how to build a smart robot to solve the Ms Pacman game.
Here we will learn how to build other variants of the Q network, such as the noisy Q network, the double Q network, and the n-step Q network.
Section 7: Stock trading & Deep Q-Learning
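The double Q and n-step variants differ from basic Q-learning mainly in how the training target is computed. The sketch below shows both targets in plain Python; the function names and the default discount factor are assumptions, and lists of numbers stand in for the outputs of the agent network and the target network.

```python
def double_q_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double Q target: the online network picks the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def n_step_target(rewards, bootstrap_value, done, gamma=0.99):
    """n-step target: discounted sum of n rewards plus a discounted
    bootstrap value, e.g. max_a Q(s_n, a) from the target network.

    If the episode ended within the n steps, the bootstrap is dropped.
    """
    target = 0.0 if done else bootstrap_value
    # Fold from the last reward backwards: r_k + gamma * (rest).
    for r in reversed(rewards):
        target = r + gamma * target
    return target
```

Passing two rewards to `n_step_target` gives the 2-step target r0 + gamma * r1 + gamma^2 * bootstrap, and passing one reward recovers the ordinary one-step target.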
In this section we will learn how to build a smart robot for stock trading.
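One preprocessing step for candlestick data is to express the high, low, and close prices relative to the open price, so that candles at different price levels become comparable. The following is a minimal sketch of one way to do this, using a plain dictionary per candle instead of a pandas DataFrame. The function name, the key names, and the exact formula (fractional offset from the open) are assumptions and may differ from the course's version.

```python
def normalize_candle(candle):
    """Express high, low, and close as fractional offsets from the open.

    candle: dict with "open", "high", "low", and "close" prices.
    Returns a dict with the normalized "high", "low", and "close".
    """
    open_price = candle["open"]
    if open_price == 0:
        raise ValueError("open price must be non-zero")
    return {
        key: (candle[key] - open_price) / open_price
        for key in ("high", "low", "close")
    }
```

For a candle that opens at 100 and closes at 105, the normalized close is 0.05, that is, 5% above the open.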