In this series of articles, we explained why sample inefficiency is a critical limit of Deep Reinforcement Learning and why a model-based approach can help overcome it. Then we presented a state-of-the-art algorithm called PlaNet and used it to test this hypothesis. Today we present the results we obtained.
To stay consistent with the original PlaNet paper, we ran all experiments on the DeepMind Control Suite. This suite provides continuous control tasks built for benchmarking reinforcement learning agents. We chose four of them: Cartpole, Cheetah, Walker, and Reacher.
The Deep Planning Network (PlaNet)
In the previous article Deep RL: a Model-Based approach (part 2), we saw how Deep Reinforcement Learning (DRL) works and how the model-based approach can improve sample efficiency. In this article, we present a specific model-based algorithm called PlaNet.
Model-based Deep Reinforcement Learning explained
In the previous article Deep RL: a Model-Based approach (part 1), we saw how Deep Reinforcement Learning (DRL) can be very effective yet very sample-inefficient. Now we examine how it works and why a model-based approach can drastically improve sample efficiency.
In reinforcement learning, an agent acts in an unknown environment to reach an unknown goal.
Time is discretized into time steps; at each step, the agent receives information about the environment and takes an action. It then receives a feedback signal called the reward. This reward is positive when the action brings the…
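This interaction loop can be sketched in a few lines of Python. The environment below is a hypothetical one-dimensional toy (not one of the Control Suite tasks), used only to show the step/reward cycle:

```python
class ToyEnv:
    """Hypothetical toy environment: the agent starts at 0 and must reach position 3."""
    def __init__(self):
        self.pos = 0

    def step(self, action):
        """Apply an action (-1 or +1); return the new state, the reward, and a done flag."""
        self.pos += action
        reached_goal = self.pos == 3
        reward = 1.0 if reached_goal else 0.0   # positive reward only at the goal
        return self.pos, reward, reached_goal

env = ToyEnv()
total_reward, done = 0.0, False
state = env.pos
while not done:                          # one iteration per time step
    action = +1                          # a trivial fixed policy, for illustration only
    state, reward, done = env.step(action)
    total_reward += reward

print(total_reward)  # the agent collects 1.0 when it reaches the goal
```

A real agent would replace the fixed `action = +1` with a learned policy that maps the observed state to an action.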
Deep Reinforcement Learning doesn’t really work… Yet
Using Deep Reinforcement Learning (DRL), we can train an agent to solve a task without explicitly programming it. This approach is so general that, in principle, we can apply it to any sequential decision-making problem. For example, in 2015, a research team developed a DRL algorithm called DQN to play Atari games. They used the same method across 57 different games, each with its own goals, enemies, and agent moves. Their agent learned to solve many of the games; in some cases, it even surpassed human-level performance.
Checking the reliability of SHAP methods
Our previous article, A Game of Prediction (Part 1), presented four methods based on the SHAP framework to explain neural network predictions. In this article, we compare and evaluate them using a sanity check.
Coalition Games for explaining DNN
Deep neural networks have received great attention because it can be proven that, under certain assumptions, they are universal approximators. However, how a network reaches a given prediction is not well understood; deep neural networks are therefore often referred to as black-box algorithms.
To understand how deep neural networks produce a given prediction and which input features were most responsible for it, algorithms known as explanation methods have been introduced. The majority of these methods are based on heuristics and backpropagation.
A novel class of such explanation methods is based on the Shapley values for coalitional…
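To make the Shapley idea concrete: a player's Shapley value is its average marginal contribution over all coalitions. A minimal sketch computing it exactly for a toy two-player game (player names and the characteristic function are illustrative, not from the articles):

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values for a characteristic function v defined on frozensets."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                S = frozenset(S)
                # Weight of coalition S in the Shapley average
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                # Marginal contribution of player i to coalition S
                total += weight * (v(S | {i}) - v(S))
        phi[i] = total
    return phi

# Toy game: a coalition is worth 1 only if it contains both players.
v = lambda S: 1.0 if {"a", "b"} <= S else 0.0
phi = shapley_values(["a", "b"], v)
print(phi)  # each player gets 0.5: the gain exists only when both cooperate
```

SHAP-style explanation methods treat input features as the players and the model's prediction as the coalition payoff; the exact enumeration above is exponential in the number of players, which is why practical methods approximate it.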
Assessing the scope and quality of results provided by eXplainable AI
In the previous article, we gave a brief introduction to the whys and hows of Explainable Artificial Intelligence. Here we are going to see the results of some sanity-check experiments.
The experiments conducted here aim to answer the following question: how can we be sure that the explanation a method provides reliably reflects what the network has learned in order to make its decision?
Specifically, we want to assess the sensitivity of explanation methods to model parameters: if a method really highlights the most important regions of…
How can you trust a model if you cannot understand how it reaches its conclusions?
In the last decade, the performance of Artificial Intelligence (AI) systems has been reaching, and sometimes surpassing, the human level on many tasks.
AI-based technologies are increasingly part of our daily lives: think of movie recommendations on Netflix, friend suggestions on Facebook, neural machine translation at Google, or speech recognition in Amazon Alexa. Moreover, AI-enabled applications are increasingly used in critical fields of our lives, such as health and finance.
Deep Neural Networks (DNNs) demonstrate great success in learning complex patterns that enable…
aka Training Self-Driving with Virtual Worlds
The previous two articles in this series presented some challenges in training self-driving systems and the first method to overcome them: Pixel-level Adaptation. Today we’ll see a second approach: Features-level Adaptation.
This method adds a loss based on the segmentation model’s ability to fool a discriminator trained on the labeled data. If the model can do that even on never-before-seen images, it has learned to produce very realistic results.
Moreover, this kind of training can also help reduce the domain shift. Instead of working in pixel space, we operate with the respective…
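As a rough sketch of the idea (not the authors' actual implementation): the discriminator learns to tell the two domains apart on feature space, while the segmentation model is penalized when its target-domain features are recognizable. Below, the discriminator outputs are random placeholders standing in for a real discriminator head, and the binary cross-entropy plays the usual GAN role:

```python
import numpy as np

rng = np.random.default_rng(0)

def bce(pred, target):
    """Binary cross-entropy, the standard GAN discriminator/generator loss."""
    eps = 1e-7
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

# Placeholder sigmoid outputs of a hypothetical discriminator applied to
# encoder features from the labeled (source) and unlabeled (target) domains.
d_on_source = rng.uniform(size=8)
d_on_target = rng.uniform(size=8)

# Discriminator objective: separate the domains (source -> 1, target -> 0).
d_loss = bce(d_on_source, np.ones(8)) + bce(d_on_target, np.zeros(8))

# Adversarial term for the segmentation model: make target features
# indistinguishable from source features (push them toward label 1).
adv_loss = bce(d_on_target, np.ones(8))
```

In a real setup, `adv_loss` would be added (with a weighting factor) to the segmentation loss, and the two objectives would be optimized in alternation.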
aka Training Self-Driving with Virtual Worlds
In the previous article, I presented some challenges in training self-driving systems and two methods to overcome them: Pixel-level Adaptation and Features-level Adaptation, both based on Generative Adversarial Networks. This article explains how they work and shows our experimental results.
The first method is based on a particular GAN architecture called CycleGAN. This model can capture the principal style features of one set of images and then apply them to an image collection from another domain. For example, you can provide the model with two groups of images. The first one contains…
Founding Partner and CTO @ Addfor S.p.A. We develop Artificial Intelligence Solutions.