This course takes you through the main algorithms used for reward-based learning. It describes and compares the model-based and model-free learning algorithms that make up Reinforcement Learning (RL).
The course starts by describing the differences between model-free and model-based approaches to Reinforcement Learning. It discusses the characteristics, advantages and disadvantages, and typical examples of each approach.
We first look at model-based approaches to Reinforcement Learning. We discuss state-value and state-action value functions, model-based iterative policy evaluation and improvement, MDP examples in R of moving a pawn, how the discount factor, gamma, “works”, and an R example illustrating how the discount factor and relative rewards affect policy. Next, we learn the model-free approach to Reinforcement Learning. This includes the Monte Carlo approach, the Q-Learning approach, further Q-Learning explanation with R examples of varying the learning rate and the randomness of actions, and the SARSA approach. Finally, we round things up by taking a look at model-free Simulated Annealing and more Q-Learning algorithms.
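To make the effect of the discount factor concrete, here is a minimal sketch of model-based value iteration. It is not taken from the course, whose examples are in R with MDPtoolbox; it uses plain Python and a hypothetical four-state deterministic MDP invented for illustration. In state 0, action 0 takes a small immediate reward of 1 and ends the episode, while action 1 starts a two-step walk toward a reward of 10.

```python
# Hypothetical 4-state deterministic MDP, invented for illustration only.
# State 3 is terminal. Each state offers two actions, 0 and 1.
NEXT_STATE = [[3, 1], [0, 2], [1, 3], [3, 3]]   # NEXT_STATE[s][a]
REWARD = [[1.0, 0.0], [0.0, 0.0], [0.0, 10.0], [0.0, 0.0]]  # REWARD[s][a]


def action_value(values, gamma, s, a):
    """One-step lookahead: immediate reward plus discounted next-state value."""
    return REWARD[s][a] + gamma * values[NEXT_STATE[s][a]]


def value_iteration(gamma, tol=1e-9, max_iter=1000):
    """Return (state values, greedy policy) for the MDP above."""
    n_states = len(NEXT_STATE)
    values = [0.0] * n_states
    for _ in range(max_iter):
        new_values = [
            max(action_value(values, gamma, s, a) for a in (0, 1))
            for s in range(n_states)
        ]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged values (ties go to action 0).
    policy = [
        max((0, 1), key=lambda a: action_value(values, gamma, s, a))
        for s in range(n_states)
    ]
    return values, policy


far_sighted_values, far_sighted_policy = value_iteration(gamma=0.9)
short_sighted_values, short_sighted_policy = value_iteration(gamma=0.2)
```

With gamma = 0.9, the delayed reward is worth 0.9^2 * 10 = 8.1 from state 0, so the greedy policy heads toward it. With gamma = 0.2 it is worth only 0.4, so taking the immediate reward of 1 is preferred. The rewards and transitions are the same in both runs; only the discount factor changes the policy.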
The primary aim is to learn how to create efficient, goal-oriented business policies, and how to evaluate and optimize those policies, primarily using the MDPtoolbox package in R. Finally, the video shows how to build actions, rewards, and punishments with a simulated annealing approach.
About the Author:
Dr. Geoffrey Hubona held full-time tenure-track and tenured assistant and associate professor positions at three major state universities in the Eastern United States from 1993 to 2010. In these positions, he taught dozens of statistics, business information systems, and computer science courses to undergraduate, master's, and Ph.D. students. Dr. Hubona earned a Ph.D. in Business Administration (Information Systems and Computer Science) from the University of South Florida (USF) in Tampa, FL (1993); an MA in Economics (1990), also from USF; an MBA in Finance (1979) from George Mason University in Fairfax, VA; and a BA in Psychology (1972) from the University of Virginia in Charlottesville, VA.
How do you represent the environment when you have no explicit MDP model?
How do you determine the optimal policy to “Solve” your reinforcement learning problem?
In this video, we will continue with the optimal policy to “Solve” your reinforcement learning problem.
How does one validate the model, as well as validate (and possibly update) the previously-determined optimal policy?
What are the state-value and state-action value functions?
How do MDP problem parameters affect the optimal policy solution?
How does gamma affect policy improvement and optimal policy determination? We dive deeper into the nature of the discount factor, gamma.
What is the nature of the Monte Carlo Model-Free approach to solving Reinforcement Learning problems?
What is the nature of the Model-Free Q-Learning approach to solving Reinforcement Learning problems?
Explore the characteristics of the SARSA algorithm.
What is the nature of the Simulated Annealing algorithm alternative to Q-Learning?
How does one incorporate the discount factor into the previous Model-Free Q-Learning Reinforcement Learning algorithm?
How does one demonstrate the effects of Q-Learning algorithm control parameters using effective visualizations?
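As a rough illustration of the Q-Learning questions above, here is a minimal sketch of tabular Q-Learning with a learning rate, a discount factor, and epsilon-greedy action randomness. It is not the course's code, which is written in R; it uses plain Python and a hypothetical four-state deterministic environment. The parameter names (`alpha`, `gamma`, `epsilon`) and the per-episode step cap are assumptions made for this sketch.

```python
import random

# Hypothetical 4-state deterministic environment, invented for illustration.
# State 3 is terminal. Each state offers two actions, 0 and 1.
NEXT_STATE = [[3, 1], [0, 2], [1, 3], [3, 3]]   # NEXT_STATE[s][a]
REWARD = [[1.0, 0.0], [0.0, 0.0], [0.0, 10.0], [0.0, 0.0]]  # REWARD[s][a]
TERMINAL = 3


def q_learning(alpha=0.5, gamma=0.9, epsilon=0.3, episodes=5000, seed=0):
    """Learn a Q-table from sampled transitions only (no model lookups for planning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in NEXT_STATE]
    for _ in range(episodes):
        state = 0
        for _step in range(100):  # cap on steps per episode (assumption)
            # Epsilon-greedy: act randomly with probability epsilon.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] >= q[state][1] else 1
            next_state = NEXT_STATE[state][action]
            # Q-Learning target: reward plus discounted best next-state value.
            target = REWARD[state][action] + gamma * max(q[next_state])
            # Move the estimate toward the target by a fraction alpha.
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if state == TERMINAL:
                break
    return q


q_table = q_learning()
```

In this environment the true value of action 1 in state 0 under gamma = 0.9 is 0.9^2 * 10 = 8.1, so the learned estimate should approach that figure given enough episodes. Rerunning with different `alpha` or `epsilon` values is one simple way to see how the learning rate and the randomness of actions change how quickly the table settles.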
Packt has been committed to developer learning since 2004. A lot has changed in software since then, but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live, and at how to put them to work.
With an extensive library of content (more than 4000 books and video courses), Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.
From skills that will help you develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.
Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.