Discover Algorithms for Reward-Based Learning in R
0.0 (0 ratings)
0 students enrolled


Learn how to utilize algorithms for reward-based learning, as part of Reinforcement Learning with R
Created by Packt Publishing
Last updated 9/2017
English [Auto-generated]
Current price: $10 Original price: $125 Discount: 92% off
30-Day Money-Back Guarantee
  • 2.5 hours on-demand video
  • 1 Supplemental Resource
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Learn R examples of policy evaluation and iteration
  • Implement typical applications for model-based and model-free RL
  • Understand policy evaluation and iteration
  • Execute Environment and Q-Learning functions with R
  • Learn Episode and state-action functions in R
  • Master Q-Learning with Greedy Selection Examples in R
  • Master Simulated Annealing with a changed discount factor through examples in R
View Curriculum
  • In this course, you will start by seeing what model-free and model-based approaches can do for you, with the help of real-world examples. Finally, you will build actions, rewards, and punishments through these models in R for reinforcement learning.

Users will be taken on a journey that starts by showing them the various algorithms that can be used for reward-based learning. The course describes and compares the range of model-based and model-free learning algorithms that constitute RL.

The course starts by describing the differences between model-free and model-based approaches to Reinforcement Learning. It discusses the characteristics, advantages and disadvantages, and typical examples of each approach.

We then look at model-based approaches to Reinforcement Learning. We discuss state-value and state-action value functions; model-based iterative policy evaluation and improvement; MDP R examples of moving a pawn; how the discount factor, gamma, "works"; and an R example illustrating how the discount factor and relative rewards affect policy. Next, we learn the model-free approach to Reinforcement Learning. This includes the Monte Carlo approach, the Q-Learning approach, further Q-Learning explanation with R examples of varying the learning rate and the randomness of actions, and the SARSA approach. Finally, we round things up by taking a look at model-free Simulated Annealing and more Q-Learning algorithms.

The primary aim is to learn how to create efficient, goal-oriented business policies, and how to evaluate and optimize those policies, primarily using the MDPtoolbox package in R. Finally, the video shows how to build actions, rewards, and punishments with a simulated annealing approach.
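As a flavor of the MDPtoolbox workflow described above, here is a minimal sketch showing how the discount factor can change the optimal policy. It assumes the MDPtoolbox package is installed and uses its built-in forest-management example rather than the course's pawn MDP:

```r
# Sketch only: solve one MDP twice with different discount factors (gamma)
# to see how gamma affects the optimal policy. Requires MDPtoolbox
# (install.packages("MDPtoolbox")); not the course's own example.
library(MDPtoolbox)

mdp <- mdp_example_forest()   # built-in toy MDP: P transitions, R rewards

# Short-sighted vs. far-sighted agents solving the same problem
myopic  <- mdp_policy_iteration(mdp$P, mdp$R, discount = 0.1)
patient <- mdp_policy_iteration(mdp$P, mdp$R, discount = 0.95)

print(myopic$policy)    # the two policies generally differ as gamma changes
print(patient$policy)
```

A low gamma makes the agent value immediate rewards; a gamma near 1 makes it plan for long-run reward, which is exactly the effect the pawn examples in the course illustrate.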

About the Author:

Dr. Geoffrey Hubona held full-time tenure-track (and tenured) assistant and associate professor positions at three major state universities in the Eastern United States from 1993 to 2010. In these positions, he taught dozens of statistics, business information systems, and computer science courses to undergraduate, master's, and Ph.D. students. Dr. Hubona earned a Ph.D. in Business Administration (Information Systems and Computer Science) from the University of South Florida (USF) in Tampa, FL (1993); an MA in Economics (1990), also from USF; an MBA in Finance (1979) from George Mason University in Fairfax, VA; and a BA in Psychology (1972) from the University of Virginia in Charlottesville, VA.

Who is the target audience?
  • This course is intended for anyone who is interested in learning how RL model-based algorithms can generate goal-oriented policies, and how to evaluate and optimize those policies. You should know how to program in R, but no prior experience in Reinforcement Learning is required. You will be shown an R example of moving a pawn with changed parameters, the discount factor, and policy improvement. You will program a model-free environment using Monte Carlo and Q-Learning. In the end, you will be able to build actions, rewards, and punishments through a Simulated Annealing approach, using visual Q-Learning examples.
Curriculum For This Course
15 Lectures
What Model-Free and Model-Based Approaches Can Do for You
5 Lectures 47:39

This video provides an overview of the entire course.

Preview 05:51

How do you represent the environment when you have no explicit MDP model?

R Example – Building Model-Free Environment

How do you determine the optimal policy to “Solve” your reinforcement learning problem?

R Example – Finding Model-Free Policy

In this video, we will continue with the optimal policy to “Solve” your reinforcement learning problem.

R Example – Finding Model-Free Policy (Continued)

How does one validate the model, as well as validate (and possibly update) the previously-determined optimal policy?

R Example – Validating Model-Free Policy
Your First Model-Based Reinforcement Learning Program
3 Lectures 32:33

What are the state-value and state-action value functions?

  • Define the two value functions
  • Show how they impact policy evaluation and improvement
  • Illustrate with an R MDP example for moving a pawn
Preview 12:37

How do MDP problem parameters affect the optimal policy solution?

  • Introduction to the discount factor, “gamma”
  • Show how gamma affects policy moving a pawn
  • Show how other parameters affect policy moving a pawn
R Example – Moving a Pawn with Changed Parameters

How does gamma affect policy improvement and optimal policy determination? We dive deeper into the nature of the discount factor, gamma.

  • Explain how the discount factor determines the value function
  • Show how the value function determines policy
  • Present an R example of discount and rewards affecting policy
Discount Factor and Policy Improvement
Programming the Model-Free Environment Using Monte Carlo and Q-Learning
4 Lectures 39:30

What is the nature of the Monte Carlo Model-Free approach to solving Reinforcement Learning problems?

  • Describe the characteristics of the Monte Carlo approach
  • Describe random versus epsilon-greedy action selection
  • Illustrate with an R race-to-goal example
Preview 10:42
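The random versus epsilon-greedy action selection mentioned above can be sketched in a few lines of R; the function and variable names here are illustrative, not the course's own code:

```r
# Sketch of epsilon-greedy action selection: with probability epsilon
# pick a random action (explore), otherwise pick the best-known action
# (exploit). Names are illustrative, not from the course materials.
epsilon_greedy <- function(q_values, epsilon = 0.1) {
  if (runif(1) < epsilon) {
    sample(seq_along(q_values), 1)   # explore: uniformly random action
  } else {
    which.max(q_values)              # exploit: highest current Q-value
  }
}

epsilon_greedy(c(0.2, 0.8, 0.5), epsilon = 0)   # always 2: pure exploitation
```

Setting epsilon to 0 recovers a purely greedy agent; setting it to 1 recovers purely random exploration.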

What is the nature of the Model-Free Q-Learning approach to solving Reinforcement Learning problems?

  • Describe Q-Learning as an off-policy learning concept
  • Walk through the Q-Learning update rule
  • Illustrate Q-Learning with an R example
Environment and Q-Learning Functions with R
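The Q-Learning update rule walked through in this lecture can be sketched as follows; this is an illustrative implementation, not the course's code:

```r
# The tabular Q-Learning update rule:
#   Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
# Off-policy: the target bootstraps on the greedy action in s', regardless
# of which action the agent actually takes next. Names are illustrative.
q_learning_update <- function(Q, s, a, r, s_next, alpha = 0.1, gamma = 0.9) {
  target <- r + gamma * max(Q[s_next, ])   # greedy bootstrap target
  Q[s, a] <- Q[s, a] + alpha * (target - Q[s, a])
  Q
}

Q <- matrix(0, nrow = 2, ncol = 2)   # 2 states x 2 actions, all zeros
Q <- q_learning_update(Q, s = 1, a = 1, r = 1, s_next = 2)
Q[1, 1]   # 0.1: one step of size alpha toward the target of 1
```

Here alpha is the learning rate and gamma the discount factor, the two parameters the R examples in this section vary.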

Diving deeper into the nature of Q-Learning.

Learning Episode and State-Action Functions in R

Explore the characteristics of the SARSA algorithm.

State-Action-Reward-State-Action (SARSA)
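SARSA differs from Q-Learning only in its bootstrap target: being on-policy, it uses the action actually taken next rather than the greedy maximum. A minimal sketch, with illustrative names:

```r
# SARSA (on-policy) update rule:
#   Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a))
# where a' is the action the current policy actually chose in s'.
sarsa_update <- function(Q, s, a, r, s_next, a_next, alpha = 0.1, gamma = 0.9) {
  target <- r + gamma * Q[s_next, a_next]   # on-policy target, no max
  Q[s, a] <- Q[s, a] + alpha * (target - Q[s, a])
  Q
}

Q <- matrix(0, nrow = 2, ncol = 2)
Q <- sarsa_update(Q, s = 1, a = 1, r = 1, s_next = 2, a_next = 2)
Q[1, 1]   # 0.1 here, coinciding with Q-Learning because Q[s_next, ] is all zero
```

The two rules diverge once the Q-table is non-zero and the behavior policy explores, which is what makes SARSA's learned policy sensitive to its own exploration.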
Building Actions, Rewards, and Punishments Using Simulated Annealing Approach
3 Lectures 36:18

What is the nature of the Simulated Annealing algorithm alternative to Q-Learning?

  • Describe the characteristics of the Simulated Annealing approach
  • Describe probabilistic action selection derived from the Boltzmann distribution metaheuristic
  • Illustrate with an R simulated annealing 2x2 grid example
Preview 10:36
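The Boltzmann-distribution action selection described above can be sketched like this; the names are illustrative and the code is not the course's own:

```r
# Sketch of Boltzmann (softmax) action selection used in simulated annealing:
# action probabilities are proportional to exp(Q / T). A high temperature T
# means near-random exploration; as T "anneals" toward 0 the choice becomes
# effectively greedy. Names are illustrative, not from the course.
boltzmann_select <- function(q_values, temperature = 1.0) {
  z <- exp(q_values / temperature)
  probs <- z / sum(z)
  sample(seq_along(q_values), 1, prob = probs)
}

set.seed(42)
boltzmann_select(c(0.2, 0.8, 0.5), temperature = 0.1)  # low T: almost always action 2
```

Annealing the temperature over episodes gives the explore-then-exploit schedule that distinguishes this approach from a fixed epsilon-greedy rule.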

How does one incorporate the discount factor into the previous Model-Free Q-Learning Reinforcement Learning algorithm?

  • Modify the Q-Learning algorithm to include a discount factor
  • Include the aggregation of rewards by episode
  • Illustrate modified Q-Learning algorithm with R examples
Q-Learning with a Discount Factor

How does one demonstrate the effects of Q-Learning algorithm control parameters using effective visualizations?

  • Use the popular R ggplot2 package to create visualizations
  • Examine effects of epsilon, alpha, and gamma control parameters
  • Create color-based line plots of Q-values and rewards
Visual Q-Learning Examples
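In the spirit of this lecture, a color-based line plot of rewards can be built with ggplot2 as below; the data and column names are made up for illustration and assume the ggplot2 package is installed:

```r
# Illustrative sketch: compare cumulative reward per episode for two
# learning rates, colored by the alpha setting. Data here is simulated,
# not from the course's Q-Learning runs.
library(ggplot2)

set.seed(7)
rewards <- data.frame(
  episode = rep(1:50, times = 2),
  reward  = c(cumsum(rnorm(50, mean = 1.0)),
              cumsum(rnorm(50, mean = 0.5))),
  alpha   = rep(c("alpha = 0.1", "alpha = 0.5"), each = 50)
)

ggplot(rewards, aes(x = episode, y = reward, colour = alpha)) +
  geom_line() +
  labs(title = "Cumulative reward by learning rate",
       x = "episode", y = "cumulative reward")
```

The same pattern extends to plotting Q-values over time, with one colored line per epsilon, alpha, or gamma setting being examined.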
About the Instructor
Packt Publishing
3.9 Average rating
8,059 Reviews
58,184 Students
686 Courses
Tech Knowledge in Motion

Packt has been committed to developer learning since 2004. A lot has changed in software since then - but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live, and how to put them to work.

With an extensive library of content - more than 4000 books and video courses - Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.

From skills that will help you develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.

Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.