Google’s AI Plays Football…For Science!


Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér.
Reinforcement learning is an important subfield
within machine learning research where we
teach an agent to choose a set of actions
in an environment to maximize a score.
This enables these AIs to play Atari games
at a superhuman level, control drones, robot
arms, or even create self-driving cars.
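To make the agent, action, environment, and score idea a bit more concrete, here is a minimal sketch of that loop in Python. Everything in it is a hypothetical toy made up for illustration (the ToyPitchEnv class, the two policies, the reward of 1 for reaching the goal); it is not the football environment discussed in this episode, and it does no learning. It only shows the interaction loop that a learning algorithm would try to optimize.

```python
import random


class ToyPitchEnv:
    """Hypothetical toy environment: a player walks along a 1-D pitch.

    Reaching position `length` counts as scoring and ends the episode.
    """

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.pos = 0
        self.steps = 0

    def reset(self):
        self.pos = 0
        self.steps = 0
        return self.pos

    def step(self, action):
        # action 1 moves toward the goal, any other action moves away from it
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        self.steps += 1
        scored = self.pos >= self.length
        done = scored or self.steps >= self.max_steps
        reward = 1.0 if scored else 0.0
        return self.pos, reward, done


def run_episode(env, policy):
    """Run one episode and return the total score the policy collected."""
    obs = env.reset()
    total = 0.0
    done = False
    while not done:
        action = policy(obs)                  # the agent chooses an action
        obs, reward, done = env.step(action)  # the environment responds
        total += reward                       # the score we want to maximize
    return total


def always_forward(obs):
    return 1


def random_policy(obs):
    return random.choice([0, 1])
```

Calling run_episode(ToyPitchEnv(), always_forward) plays one full episode with the fixed policy. A reinforcement learning algorithm would replace these hand-written policies with one that is adjusted over many episodes to raise the total score.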
A few episodes ago, we talked about DeepMind’s
behavior suite that opened up the possibility
of measuring how these AIs perform with respect
to the 7 core capabilities of reinforcement
learning algorithms.
Among them were how well such an AI performs
when being shown a new problem, how well or
how much they memorize, how willing they are
to explore novel solutions, how well they
scale to larger problems, and more.
In the meantime, the Google Brain research
team has also been busy creating a physics-based
3D football, or for some of you, soccer simulation
where we can ask an AI to control one or
multiple players in this virtual environment.
This is a particularly difficult task because
it requires finding a delicate balance between
rudimentary short-term control tasks, like
passing, and long-term strategic planning.
In this environment, we can also test our
reinforcement learning agents against handcrafted,
rule-based teams.
For instance, here you can see that DeepMind’s
IMPALA algorithm is the only one that can
reliably beat the medium and hard handcrafted
teams, specifically, the one that was run
for 500 million training steps.
The easy case is tuned to be suitable for
single-machine research work, whereas the hard
case is meant to challenge sophisticated AIs
that were trained on a massive array of machines.
I like this idea a lot.
Another design decision I particularly like
here is that these agents can be trained from
pixels or internal game state.
Okay, so what does that really mean?
Training from pixels is easy to understand
but very hard to perform – this simply means
that the agents see the same content as what
we see on the screen.
DeepMind’s Deep Reinforcement Learning is
able to do this by training a neural network
to understand what events take place on the
screen, and it passes, no pun intended, all this
event information to a reinforcement learner
that is responsible for the strategic, gameplay-related
decisions.
Now, what about the other one?
The internal game state learning means that
the algorithm sees a bunch of numbers which
relate to quantities within the game, such
as the position of all the players and the
ball, the current score and so on.
This is typically easier to perform because
the AI is given high-quality and relevant
information and is not burdened with the task
of visually parsing the entire scene.
For instance, OpenAI’s amazing Dota 2 team
learned this way.
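The difference between the two kinds of input can be sketched in a few lines of Python. This is a toy under stated assumptions: the GameState fields, the grid size, and both function names are hypothetical and are not taken from the football environment's source code. The same situation is shown once as a short list of numbers and once as a small image-like grid that the agent would have to parse by itself.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    """Hypothetical internal state of a tiny one-player game."""
    player_xy: tuple  # (x, y) grid position of the player
    ball_xy: tuple    # (x, y) grid position of the ball
    score: tuple      # (own goals, opponent goals)


def state_observation(state):
    """Internal game state: a compact list of directly relevant numbers."""
    return [*state.player_xy, *state.ball_xy, *state.score]


def pixel_observation(state, width=8, height=4):
    """Pixel-style input: a grid where 0 is grass, 1 the player, 2 the ball.

    In this toy the score is not drawn into the frame at all.
    """
    frame = [[0] * width for _ in range(height)]
    px, py = state.player_xy
    bx, by = state.ball_xy
    frame[py][px] = 1
    frame[by][bx] = 2  # drawn last, so the ball wins if both share a cell
    return frame


state = GameState(player_xy=(1, 2), ball_xy=(5, 0), score=(0, 1))
numbers = state_observation(state)
frame = pixel_observation(state)
```

Here numbers holds six values, while frame holds 32 cells of which only two are informative. A real rendered frame has far more pixels than this toy grid, which is one way to see why learning from pixels is the harder setting.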
Of course, to maximize impact, the source
code for this project is also available.
This will not only help researchers to train
and test their own reinforcement learning
algorithms on a challenging scenario, but
also let them extend it and make up their own scenarios.
Now note that so far, I tried my hardest not
to comment on the names of the players and
the teams, but my will to resist just ran
out.
Go real Bayesians!
Thanks for watching and for your generous
support, and I’ll see you next time!
