
# Introduction to Reinforcement Learning

Reinforcement Learning (RL) is a computational approach to learning from action: an agent interacts with its environment, performs actions, receives rewards as feedback for those actions, and learns from that feedback.
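The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `CorridorEnv`, `run_episode`, and the reward of -1 per step are hypothetical choices made for this example.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: cells 0..length-1, goal at the right end."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clamped to the corridor.
        self.position = max(0, min(self.length - 1, self.position + action))
        done = self.position == self.length - 1
        reward = -1  # assumed: every step costs 1, so shorter paths earn more
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Let the policy act until the goal is reached or max_steps is used up."""
    state = env.reset()
    total_reward = 0
    for _ in range(max_steps):
        action = policy(state)                  # the agent picks an action
        state, reward, done = env.step(action)  # the environment answers
        total_reward += reward                  # the reward is the feedback signal
        if done:
            break
    return total_reward


def always_right(state):
    return 1


rng = random.Random(0)


def random_policy(state):
    return rng.choice([-1, 1])


print(run_episode(CorridorEnv(), always_right))   # -4: four steps to the goal
print(run_episode(CorridorEnv(), random_policy))  # at best -4, usually lower
```

A learning agent would replace the fixed policies with one that is updated from the rewards it observes.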

### Value-Based RL

In value-based RL, the goal is to optimize the value function V(s) or an action value function Q(s,a).

The value function tells us the maximum expected future reward that the agent will get at each state.

The value of a given state is the total amount of the reward that an agent can expect to accumulate over the future, starting at the state.
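Written as a formula, this definition reads roughly as follows. The symbols are assumptions introduced here, not taken from the text: S_t is the state at step t, and R_{t+1}, R_{t+2}, ... are the rewards received afterwards. Many treatments also weight later rewards by a discount factor; it is left out because the text does not mention one.

```latex
V(s) = \mathbb{E}\left[\, R_{t+1} + R_{t+2} + R_{t+3} + \cdots \;\middle|\; S_t = s \,\right]
```

For instance, if every step earned a reward of -1, a state seven steps away from the goal would have a value of -7.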

At each step, the agent uses this value function to decide where to go next: it moves to the reachable state with the highest value.

In the maze example, at each step, we will take the biggest value: -7, then -6, then -5 (and so on) to attain the goal.
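The greedy selection described above can be sketched as follows. The maze layout, the state names, and the values are hypothetical; each value is assumed to be minus the number of steps still needed to reach the goal.

```python
# Hypothetical state values: minus the number of steps left to reach the goal.
values = {"A": -3, "B": -2, "C": -4, "D": -1, "GOAL": 0}

# Hypothetical layout: which states can be reached in one move from each state.
neighbors = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "GOAL"],
    "GOAL": [],
}


def greedy_path(start, goal="GOAL", max_steps=20):
    """Follow the value function by always moving to the best neighbor."""
    path = [start]
    state = start
    for _ in range(max_steps):
        if state == goal:
            break
        # Pick the neighboring state with the highest value.
        state = max(neighbors[state], key=lambda s: values[s])
        path.append(state)
    return path


print(greedy_path("A"))  # ['A', 'B', 'D', 'GOAL']
```

Here the values are given by hand; a value-based algorithm such as Q-Learning would estimate them from experience.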

### Policy-Based RL

In policy-based RL, we want to directly optimize the policy function π(s) without using a value function.

The policy is what defines the agent behavior at a given time.

action = policy(state)

We learn a policy function that lets us map each state to the best corresponding action.

We have two types of policies:

• Deterministic: a policy at a given state will always return the same action.

• Stochastic: outputs a probability distribution over actions.

For a stochastic policy, π(a|s) is the probability of taking action a in state s.

Either way, the policy directly indicates which action to take at each step.
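The difference between the two policy types can be illustrated with a short sketch. The state names, action names, and probabilities are hypothetical and only serve the example.

```python
import random

# Deterministic policy: each state maps to exactly one action.
deterministic_policy = {"start": "right", "corridor": "right", "junction": "up"}

# Stochastic policy: each state maps to a probability distribution over actions.
stochastic_policy = {
    "start": {"right": 0.9, "left": 0.1},
    "corridor": {"right": 0.8, "left": 0.2},
    "junction": {"up": 0.7, "right": 0.2, "left": 0.1},
}


def act_deterministic(state):
    """Always returns the same action for a given state."""
    return deterministic_policy[state]


def act_stochastic(state, rng):
    """Samples an action according to the probabilities for this state."""
    distribution = stochastic_policy[state]
    actions = list(distribution)
    weights = list(distribution.values())
    return rng.choices(actions, weights=weights, k=1)[0]


rng = random.Random(42)
print(act_deterministic("junction"))                        # always 'up'
print([act_stochastic("junction", rng) for _ in range(5)])  # mostly 'up', sometimes not
```

Policy-based methods adjust these probabilities (usually the parameters of a model that produces them) so that actions leading to higher rewards become more likely.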

### Model-Based RL

In model-based RL, we model the environment. That is, we create a model that describes the behavior of the environment.

The drawback of this approach is that each environment needs its own model representation, which makes it hard to build a single general-purpose agent.
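As a rough sketch of what "modeling the environment" can look like, the example below stores a model as a lookup table and plans by simulating moves inside it. The corridor layout, the -1 reward per step, and the names `model`, `best_return`, and `plan` are all hypothetical.

```python
# Hypothetical model of a 4-cell corridor: (state, action) -> (next_state, reward).
model = {
    (0, "left"): (0, -1), (0, "right"): (1, -1),
    (1, "left"): (0, -1), (1, "right"): (2, -1),
    (2, "left"): (1, -1), (2, "right"): (3, -1),
}
GOAL = 3
ACTIONS = ["left", "right"]


def best_return(state, depth):
    """Best total reward within `depth` simulated steps, using only the model."""
    if state == GOAL or depth == 0:
        return 0
    outcomes = (model[(state, action)] for action in ACTIONS)
    return max(
        reward + best_return(next_state, depth - 1)
        for next_state, reward in outcomes
    )


def plan(state, depth=5):
    """Choose the action whose simulated outcome looks best."""
    def score(action):
        next_state, reward = model[(state, action)]
        return reward + best_return(next_state, depth - 1)
    return max(ACTIONS, key=score)


print([plan(state) for state in (0, 1, 2)])  # ['right', 'right', 'right']
```

The table above only describes this one corridor; a different environment would need a different table, which is the limitation the text points out.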

Now that you understand the basics of Reinforcement Learning, you can dive deeper with this article.

# RL algorithms 🤖

| Algorithm | Type | Difficulty Level | Explanation | Implementation |
| --- | --- | --- | --- | --- |
| Q-Learning | Value-based | 1 | Article | Implementation |
| Deep Q-Learning | Value-based | 2 | Article | Implementation |
| Double Dueling Deep Q-Learning | Value-based | 3 | Article | Implementation |
| Policy Gradients | Policy-based | 2 | Article | Implementation |
| Advantage Actor-Critic (A2C) | Actor-Critic | 4 | Article | Implementation |
| Asynchronous Advantage Actor-Critic (A3C) | Actor-Critic | 4 | Article | Implementation |
| Proximal Policy Optimization (PPO) | Actor-Critic | 5 | Article | Implementation |

# Advanced Topics 🌟

| Article | Topic | Difficulty Level |
| --- | --- | --- |
| Curiosity through next-state prediction | Curiosity | 5 |
| Curiosity through random network distillation | Curiosity | 5 |
| Episodic Curiosity through Reachability | Curiosity | 6 |
| An Introduction to Unity ML-Agents | ML-Agents | 1 |
| Diving Deeper into Unity ML-Agents | ML-Agents | 2 |
| Unity-ML Agents: The Mayan Adventure | ML-Agents | 3 |

# Learning Resources 📚

| Resource | Topic | Link |
| --- | --- | --- |
| OpenAI Spinning-Up RL | Introduction to RL | https://spinningup.openai.com/ |
| Deep Reinforcement Learning Course | Introduction to RL Hands-On | https://simoninithomas.github.io/Deep_reinforcement_learning_Course/ |
| Reinforcement Learning, Richard Sutton | Book | http://incompleteideas.net/book/the-book-2nd.html |
| WildML Reinforcement Learning | Hands-on | http://www.wildml.com/ |
| DeepMind Advanced Deep Learning and Reinforcement Learning | Advanced topics | https://github.com/enggen/DeepMind-Advanced-Deep-Learning-and-Reinforcement-Learning |
| Unity ML-Agents Course | ML-Agents | http://www.simoninithomas.com/unitymlagentscourse/ |

# Environments 🖼️

| Environment | Description | Link |
| --- | --- | --- |
| OpenAI Gym | Gym is a toolkit for developing and comparing reinforcement learning algorithms. It supports teaching agents everything from walking to playing games like Pong or Pinball. | https://github.com/openai/gym |
| OpenAI Retro | Gym Retro lets you turn classic video games (NES, SNES, Genesis…) into Gym environments for reinforcement learning. | https://github.com/openai/retro |
| ML-Agents | Open-source Unity plugin that enables games and simulations to serve as environments for training intelligent agents. | https://github.com/Unity-Technologies/ml-agents |
| Vizdoom | Doom-based AI Research Platform for Reinforcement Learning from Raw Visual Information. | https://github.com/mwydmuch/ViZDoom |
| MameRL | A Python toolkit used to train reinforcement learning algorithms against arcade games. | https://github.com/M-J-Murray/MAMEToolkit |
| TradingGym | Trading and backtesting environment for training a reinforcement learning agent or a simple rule-based algorithm. | https://github.com/Yvictor/TradingGym |

Version history: revision 18 of 18, last updated 05-05-2020 03:14 PM.