2022 | Book

Deep Reinforcement Learning


About this book

Deep reinforcement learning has attracted considerable attention recently. Impressive results have been achieved in such diverse fields as autonomous driving, game playing, molecular recombination, and robotics. In all these fields, computer programs have taught themselves to solve problems that were previously considered very difficult. In the game of Go, the program AlphaGo has even learned to outmatch three of the world's leading players.

Deep reinforcement learning takes its inspiration from the fields of biology and psychology. Biology has inspired the creation of artificial neural networks and deep learning, while psychology studies how animals and humans learn, and how subjects' desired behavior can be reinforced with positive and negative stimuli. When we see how reinforcement learning teaches a simulated robot to walk, we are reminded of how children learn, through playful exploration. Techniques inspired by biology and psychology work amazingly well in computers: animal behavior and the structure of the brain serve as new blueprints for science and engineering. In fact, computers truly seem to possess aspects of human behavior; as such, this field goes to the heart of the dream of artificial intelligence.

These research advances have not gone unnoticed by educators. Many universities have begun offering courses on the subject of deep reinforcement learning. The aim of this book is to provide an overview of the field, at the proper level of detail for a graduate course in artificial intelligence. It covers the complete field, from the basic algorithms of deep Q-learning to advanced topics such as multi-agent reinforcement learning and meta-learning.

Table of Contents

Frontmatter
Chapter 1. Introduction
Abstract
Deep reinforcement learning studies how we learn to solve complex problems, problems that require us to find a solution to a sequence of decisions in high-dimensional state spaces. To make bread, we must use the right flour, add some salt, yeast and sugar, prepare the dough (not too dry and not too wet), pre-heat the oven to the right temperature, and bake the bread (but not too long); to win a ballroom dancing contest we must find the right partner, learn to dance, practice, and beat the competition; to win in chess we must study, practice, and make all the right moves.
Aske Plaat
Chapter 2. Tabular Value-Based Reinforcement Learning
Abstract
This chapter will introduce the classic, tabular, field of reinforcement learning, to build a foundation for the next chapters. First, we will introduce the concepts of agent and environment. Next come Markov decision processes, the formalism that is used to reason mathematically about reinforcement learning. We discuss at some length the elements of reinforcement learning: states, actions, values, and policies.
Aske Plaat
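
As a concrete taste of the tabular setting this chapter covers, consider the sketch below. It is illustrative, not code from the book: Q-learning maintains a table of state-action values and updates it from sampled transitions. The toy environment, its size, and all constants are assumptions made for the example.

```python
import numpy as np

# Toy MDP: five states in a row, two actions (0 = left, 1 = right).
# Reaching the rightmost state gives reward 1 and ends the episode.
# All sizes and constants here are illustrative, not taken from the book.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))     # the value "table" of tabular RL
alpha, gamma = 0.1, 0.9                 # learning rate and discount factor
rng = np.random.default_rng(0)

def step(s, a):
    """The environment performs the state transition and calculates the reward."""
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == n_states - 1 else 0.0), s2 == n_states - 1

for episode in range(200):
    s, done = 0, False
    while not done:
        # Q-learning is off-policy, so even a random behavior policy suffices here.
        a = int(rng.integers(n_actions))
        s2, r, done = step(s, a)
        # Update: move Q(s,a) toward the bootstrapped target r + gamma * max_a' Q(s',a').
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

# The greedy policy derived from Q prefers "right" in every non-terminal state.
print(Q.argmax(axis=1)[:-1])
```

Because the update bootstraps from the maximum over next-state values, the greedy policy improves even though the behavior is random: this is what off-policy learning means.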
Chapter 3. Deep Value-Based Reinforcement Learning
Abstract
The previous chapter introduced the field of classic reinforcement learning. We learned about agents and environments, and about states, actions, values, and policy functions. We also saw our first planning and learning algorithms: value iteration, SARSA, and Q-learning. The methods in the previous chapter were exact, tabular methods, which work for problems of moderate size that fit in memory.
Aske Plaat
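
The step from Chapter 2 to Chapter 3 is to replace the table with a function approximator. Below is a minimal sketch of that deep value-based idea, assuming PyTorch, with a synthetic batch of random transitions standing in for a real replay buffer; the stabilization machinery of DQN, such as a separate target network, is deliberately omitted, and all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Replace the Q-table with a network mapping a state to one value per action.
# Sizes, architecture, and the synthetic batch below are illustrative assumptions.
state_dim, n_actions, gamma = 4, 2, 0.99
q_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# A synthetic batch of transitions (s, a, r, s', done), standing in for a replay buffer.
batch = 32
s, s2 = torch.randn(batch, state_dim), torch.randn(batch, state_dim)
a = torch.randint(n_actions, (batch,))
r, done = torch.randn(batch), torch.zeros(batch)

# Temporal-difference target, computed without tracking gradients (bootstrapping).
with torch.no_grad():
    target = r + gamma * (1.0 - done) * q_net(s2).max(dim=1).values

# One gradient step: regress Q(s, a) toward the target.
q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
loss = nn.functional.mse_loss(q_sa, target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```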
Chapter 4. Policy-Based Reinforcement Learning
Abstract
Some of the most successful applications of deep reinforcement learning have a continuous action space, such as applications in robotics, self-driving cars, and real-time strategy games.
Aske Plaat
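
For continuous action spaces, taking an argmax over action values is no longer straightforward; policy-based methods instead parameterize the policy directly. The sketch below shows a REINFORCE-style update with a Gaussian policy, assuming PyTorch; the dimensions and the placeholder return are illustrative assumptions, not details from the book.

```python
import torch
import torch.nn as nn

# A Gaussian policy for a continuous action space: the network outputs the mean,
# and a learned log standard deviation sets the exploration noise.
# All dimensions and the placeholder return are illustrative assumptions.
state_dim, action_dim = 3, 1
mean_net = nn.Sequential(nn.Linear(state_dim, 32), nn.Tanh(), nn.Linear(32, action_dim))
log_std = torch.zeros(action_dim, requires_grad=True)
optimizer = torch.optim.Adam(list(mean_net.parameters()) + [log_std], lr=1e-3)

s = torch.randn(1, state_dim)                # one observed state
dist = torch.distributions.Normal(mean_net(s), log_std.exp())
a = dist.sample()                            # sample a continuous action
ret = torch.tensor(2.0)                      # placeholder return from the environment

# REINFORCE: increase the log-probability of actions in proportion to the return.
loss = -(dist.log_prob(a).sum() * ret)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```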
Chapter 5. Model-Based Reinforcement Learning
Abstract
The previous chapters discussed model-free methods, and we saw their success in video games and simulated robotics. In model-free methods, the agent updates a policy directly from the feedback that the environment provides on its actions. The environment performs the state transitions and calculates the reward. A disadvantage of deep model-free methods is that they can be slow to train; for stable convergence or low variance, often millions of environment samples are needed before the policy function converges to a high-quality optimum.
Aske Plaat
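
The model-based remedy for this sample inefficiency is to learn a transition model and generate cheap simulated samples from it. Below is a minimal Dyna-style illustration (again not the book's code): each real environment step is followed by several planning updates replayed from a memorized model. The toy environment and all constants are assumptions for the example.

```python
import numpy as np

# Dyna-style planning on top of tabular Q-learning: after every real environment
# step, replay several transitions from a learned (here simply memorized) model.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
model = {}                              # (s, a) -> (r, s'), filled from experience
alpha, gamma, k_planning = 0.1, 0.9, 10
rng = np.random.default_rng(0)

def step(s, a):
    """Real environment: move left/right on a line; reward 1 at the right end."""
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return (1.0 if s2 == n_states - 1 else 0.0), s2

for _ in range(300):
    s, a = int(rng.integers(n_states)), int(rng.integers(n_actions))
    r, s2 = step(s, a)                  # one (expensive) real sample
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    model[(s, a)] = (r, s2)             # update the learned model
    for _ in range(k_planning):         # many (cheap) simulated samples
        ps, pa = list(model)[rng.integers(len(model))]
        pr, ps2 = model[(ps, pa)]
        Q[ps, pa] += alpha * (pr + gamma * Q[ps2].max() - Q[ps, pa])
```

The planning loop reuses each real sample several times, which is precisely the sample-efficiency argument this chapter's abstract makes for model-based methods.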
Chapter 6. Two-Agent Self-Play
Abstract
Previous chapters were concerned with how a single agent can learn optimal behavior for its environment. This chapter is different. We turn to problems in which two agents operate, and the behavior of both is modeled (and, in the next chapter, more than two).
Aske Plaat
Chapter 7. Multi-Agent Reinforcement Learning
Abstract
On this planet, in our societies, millions of people live and work together. Each individual has their own set of goals and acts accordingly. Some of these goals are shared. When we want to achieve shared goals, we organize ourselves in teams, groups, companies, organizations, and societies.
Aske Plaat
Chapter 8. Hierarchical Reinforcement Learning
Abstract
The goal of artificial intelligence is to understand and create intelligent behavior; the goal of deep reinforcement learning is to find a behavior policy for ever larger sequential decision problems.
Aske Plaat
Chapter 9. Meta-Learning
Abstract
Although current deep reinforcement learning methods have obtained great successes, training times for most interesting problems are high; they are often measured in weeks or months, consuming time and resources—as you may have noticed while doing some of the exercises at the end of the chapters.
Aske Plaat
Chapter 10. Further Developments
Abstract
We have come to the end of this book. We will reflect on what we have learned. In this chapter we will review the main themes and essential lessons, and we will look to the future.
Aske Plaat
Backmatter
Metadata
Title
Deep Reinforcement Learning
Written by
Prof. Aske Plaat
Copyright Year
2022
Publisher
Springer Nature Singapore
Electronic ISBN
978-981-19-0638-1
Print ISBN
978-981-19-0637-4
DOI
https://doi.org/10.1007/978-981-19-0638-1