Reinforcement Learning (RL) is a computational approach to learning from action: an agent interacts with its environment, performs actions, receives rewards as feedback for those actions, and learns from that experience.
Three types of Reinforcement Learning techniques
In value-based RL, the goal is to optimize the value function V(s) or an action value function Q(s,a).
The value function tells us the maximum expected future reward the agent can get from each state: the total amount of reward the agent can expect to accumulate over the future, starting from that state.
The agent uses this value function to select which state to move to at each step, always choosing the state with the highest value.
In the maze example, at each step we move to the state with the highest value: -7, then -6, then -5, and so on, until we reach the goal.
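The greedy, value-based selection described above can be sketched in a few lines. This is a minimal illustration, not the article's actual maze: it assumes a 1-D corridor of 8 states where each state's value is minus the number of steps remaining to the goal (state 7, value 0).

```python
# Hypothetical 1-D "maze": states 0..7, goal at state 7.
# V(s) = -(steps remaining to the goal), so V = {0: -7, 1: -6, ..., 7: 0}.
V = {s: s - 7 for s in range(8)}

def neighbors(state):
    """States reachable in one step (move left or right, staying in bounds)."""
    return [s for s in (state - 1, state + 1) if 0 <= s <= 7]

def select_next_state(state):
    """Greedy value-based selection: move to the neighbor with the highest value."""
    return max(neighbors(state), key=lambda s: V[s])

# Starting from state 0 (value -7), the agent climbs the values to the goal:
path = [0]
while path[-1] != 7:
    path.append(select_next_state(path[-1]))
print(path)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Following the highest neighboring value at every step traces the -7, -6, -5, ... sequence described above.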
In policy-based RL, we want to directly optimize the policy function π(s) without using a value function.
The policy is what defines the agent's behavior at a given time.
action = policy(state)
We learn a policy function that lets us map each state to the best corresponding action.
We have two types of policies:
Deterministic: for a given state, the policy always returns the same action.
Stochastic: the policy outputs a probability distribution over actions, i.e., the probability of taking each possible action in that state.
Either way, the policy directly indicates the action to take at each step.
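The two policy types can be sketched as follows. The states, actions, and probabilities here are illustrative assumptions, not part of the article's example; the point is only the contrast between `action = policy(state)` and a distribution over actions.

```python
import random

ACTIONS = ["left", "right"]

def deterministic_policy(state):
    """Deterministic: a given state always maps to the same action."""
    return "right" if state < 4 else "left"

def stochastic_policy(state):
    """Stochastic: returns a probability distribution pi(a|s) over actions."""
    p_right = 0.8 if state < 4 else 0.2
    return {"right": p_right, "left": 1.0 - p_right}

def sample_action(state):
    """Draw an action according to the stochastic policy's probabilities."""
    dist = stochastic_policy(state)
    return random.choices(list(dist), weights=list(dist.values()))[0]

action = deterministic_policy(2)  # always "right" for state 2
dist = stochastic_policy(2)       # {"right": 0.8, "left": 0.2}
```

Note that the stochastic policy still has the shape `action = policy(state)`; it just interposes a sampling step between the distribution and the chosen action.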
In model-based RL, we model the environment. That is, we create a model that describes the behavior of the environment.
The problem with this approach is that each environment needs its own model representation, so it does not lend itself to building a general agent.
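One simple way to "model the environment" is to learn its transition dynamics from experience. This is a minimal sketch under an assumed setup where interactions are logged as (state, action, next_state) tuples; real model-based methods typically learn reward dynamics too.

```python
from collections import Counter, defaultdict

# Count how often each (state, action) pair led to each next state.
transitions = defaultdict(Counter)

def record(state, action, next_state):
    """Update the model with one observed transition."""
    transitions[(state, action)][next_state] += 1

def predicted_next_state(state, action):
    """The model's prediction: the most frequently observed outcome."""
    return transitions[(state, action)].most_common(1)[0][0]

# Hypothetical logged experience, including an occasional "slip":
record(0, "right", 1)
record(0, "right", 1)
record(0, "right", 0)

predicted_next_state(0, "right")  # -> 1
```

With such a model, the agent can plan by simulating actions internally instead of trying them in the real environment, but the counts above are specific to this one environment, which is exactly the generality problem described in the paragraph above.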
Now that you understand the basics of Reinforcement Learning, you can dive deeper with this article.