
What is the class about?
TOC: basics of self-driving cars, PID controller, imitation learning, reinforcement learning, Unity ML
Basic math and programming skills are required to follow the class.
Unity is the tool we use for simulations, so it helps to go through the free online tutorials if you are not familiar with it yet: https://unity3d.com/learn/tutorials
What is a self-driving car and how does it work?
Like any other robot, a self-driving car observes the environment through sensors (camera, lidar, radar, GPS…), processes that information and makes decisions on a central computer, and finally controls the actuators (motor, steering wheel) to reach its goals.
In this lecture we look at these three stages (sensing, decision-making, and actuation) in detail.
Download and explore the template project.
[last saved with Unity version: 2018.2.13]
Understand what the different kinds of brains are used for
Explore the relevant scripts used in the project
All updated details for setting up the environment can be found here: https://github.com/Unity-Technologies/ml-agents
Specifically, installation: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Installation.md
and basic guide: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Basic-Guide.md
TensorFlowSharp plugin: https://s3.amazonaws.com/unity-ml-agents/0.5/TFSharpPlugin.unitypackage
Let's start with the simplest control method: hard-coded rules
Introducing the generic PID controller
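As a rough illustration of the generic PID idea, here is a minimal Python sketch (the course project itself runs in Unity, so this is not the course's implementation). The class name `PID`, the gain names `kp`, `ki`, `kd`, and the time step `dt` are assumptions chosen for this example.

```python
class PID:
    """Textbook PID controller: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0      # running integral of the error
        self.prev_error = None   # no previous sample before the first update

    def update(self, error, dt):
        """Return the control output for the current error and time step dt."""
        self.integral += error * dt
        # On the first call there is no previous error, so the derivative term is zero.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

In a steering loop, `error` would typically be the cross-track error and the returned value would be clamped to the valid steering range before being applied.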
Fine-tuning the parameters
How to implement the PID controller in Unity ML
How to calculate the cross-track error of the go-kart
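One common way to define the cross-track error is the signed perpendicular distance from the kart to the line through two consecutive waypoints. The sketch below assumes a flat 2D track and hypothetical argument names (`position`, `wp_a`, `wp_b`); in Unity the ground plane would usually be the x-z plane, and the course project may compute the error differently.

```python
import math


def cross_track_error(position, wp_a, wp_b):
    """Signed distance from position to the line through wp_a and wp_b.

    Positive when the position lies to the left of the direction wp_a -> wp_b.
    """
    ax, ay = wp_a
    bx, by = wp_b
    px, py = position
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("waypoints must be distinct")
    # 2D cross product of the path direction with the vector from wp_a to the kart,
    # normalised by the path length to obtain a distance.
    return (dx * (py - ay) - dy * (px - ax)) / length
```

The sign tells the controller which way to steer, and the magnitude tells it how far the kart is from the path.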
Let the PID control the go-kart
Collection of tips to improve on the vanilla PID design
Why adding a prediction model helps
Limitations of traditional control
Why do we need machine learning in some cases?
Supervised vs. Reinforcement Learning
Basic introduction to neural networks
A bit deeper in the details of neural networks
How do you train a neural network?
Gradient descent techniques
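To make the basic idea concrete, here is a minimal sketch of plain gradient descent on a one-dimensional function. The function name `gradient_descent` and its parameters are assumptions for this example; real networks are trained with library optimizers over many parameters, but the update rule is the same.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# Example: minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

With this learning rate the distance to the minimum at x = 3 shrinks by a factor of 0.8 at every step; a learning rate that is too large would make the iterates diverge instead.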
Introducing convolutional layers
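The core operation of a convolutional layer can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: a single channel, no padding, stride 1, and a hypothetical function name `conv2d`. As in most deep learning libraries, the kernel is not flipped (strictly speaking a cross-correlation).

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A 1x2 kernel that responds to dark-to-bright transitions along a row.
image = [[0, 0, 1, 1]] * 3
edges = conv2d(image, [[-1, 1]])
```

Here `edges` is non-zero only where the pixel value changes, which is the kind of local feature a trained convolutional filter learns to detect.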
Shortcutting the training process.
Some pre-trained models can be found here for Keras: https://keras.io/applications/
Configure the teacher and student brains
Show the go-kart how to race!
See if the go-kart has learned well
A few tips for best performance
Why self-learning is important
Environment, agents, observations, actions, reward
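The interaction between these pieces can be sketched with a toy example. The `LineWorld` environment, its `reset`/`step` methods, and the `run_episode` helper are all hypothetical and are not the Unity ML-Agents API; they only show how observations, actions, and rewards flow in one episode.

```python
class LineWorld:
    """Toy environment: the agent walks along a line and is rewarded at the goal."""

    def __init__(self, goal=5):
        self.goal = goal
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # the observation is simply the current position

    def step(self, action):
        # action is -1 (step left) or +1 (step right); the position cannot go below 0
        self.position = max(0, self.position + action)
        done = self.position == self.goal
        reward = 1.0 if done else -0.01  # small time penalty until the goal is reached
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode: observe, act, collect the reward, repeat until done."""
    observation = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward
```

A reinforcement learning algorithm would adjust `policy` between episodes so that the total reward increases.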
Starting from scratch vs. pre-injecting knowledge.
NOTE: The current version of Unity ML-Agents only offers the option to start from empty brains, because the graphs saved after imitation learning are not compatible with those used in reinforcement learning. This should change in the future: stay updated through the official documentation!
Different ways a policy can be trained.
For an approach that combines perturbing the action space and the parameter space, see:
https://blog.openai.com/better-exploration-with-parameter-noise/
Proximal Policy Optimization:
https://blog.openai.com/openai-baselines-ppo/
https://arxiv.org/abs/1707.06347
More generally, on policy gradients: http://www.scholarpedia.org/article/Policy_gradient_methods
Detour on genetic algorithms, in particular evolution strategies:
https://blog.openai.com/evolution-strategies/
https://arxiv.org/abs/1703.03864
Crafting a reward function
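As one possible shape for such a function, the sketch below rewards forward progress, penalises drifting away from the track centre, and ends with a fixed penalty on a crash. The function name, its inputs, and all weights are assumptions made for illustration, not the values used in the course project.

```python
def step_reward(forward_speed, cross_track_error, crashed,
                speed_weight=0.01, offset_weight=0.005, crash_penalty=1.0):
    """Hypothetical per-step reward for a racing agent."""
    if crashed:
        # A crash outweighs any progress made during this step.
        return -crash_penalty
    # Only forward motion is rewarded; driving backwards earns nothing.
    progress = speed_weight * max(forward_speed, 0.0)
    # Distance from the track centre is penalised symmetrically on both sides.
    drift = offset_weight * abs(cross_track_error)
    return progress - drift
```

The relative size of the weights matters: if the drift penalty dominates, the agent may learn to stand still on the centre line rather than race.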
Let the go-kart drive on its own...
Using TensorBoard for detailed analysis of the training results
See what the go-kart was able to learn on its own!
A few ways to improve training further
Recap the class content
Where can we go from here?
TensorFlow on Raspberry Pi: https://www.raspberrypi.org/magpi/tensorflow-ai-raspberry-pi/
Fast object detection networks:
https://pjreddie.com/darknet/yolo/
https://github.com/weiliu89/caffe/tree/ssd
WARNING: Take this class as a gentle introduction to machine learning, with a particular focus on machine vision and reinforcement learning. The Unity project provided in this course is now obsolete, because the Unity ML-Agents library is still in beta and its interface keeps changing. Some of the implementation details in this course will look different if you are using the latest release, but the key concepts and the background theory are still valid. Please refer to the official migration documentation on the ml-agents GitHub repository for the latest updates.
Learn how to combine the beauty of Unity with the power of TensorFlow to solve physical problems in a simulated environment with state-of-the-art machine learning techniques.
We study the problem of a go-kart racing around a simple track and try three different approaches to control it: a simple PID controller; a neural network trained via imitation (supervised) learning; and a neural network trained via deep reinforcement learning.
Each technique has its strengths and weaknesses, which we first present at a simple conceptual level and then apply in practice. In all three cases the go-kart will be able to complete a lap without crashing.
We provide the Unity template and the files for all three solutions. See if you can build on them and improve performance further.
Buckle up and have fun!