This research is the subject of a paper submitted for publication at The 2nd International Conference on Interactive Collaborative Robotics (SPECOM/ICR 2017)
Empowerment as a Generic Utility Function for Agents in a Simple Team Sport Simulation
Marcus Clements and Daniel Polani
Adaptive Systems Research Group, School of Computer Science,
University of Hertfordshire, Hatfield, UK
Abstract
Players in team sports cooperate in a coordinated manner to achieve common goals. Automated players in academic and commercial team sports simulations have traditionally been driven by complex externally motivated value functions with heuristics based on knowledge of game tactics and strategy. Empowerment is an information-theoretic measure of an agent’s potential to influence its environment, which has been shown to provide a useful intrinsic value function, without the need for external goals and motivation, for agents in single agent models. In this paper we expand on the concept of empowerment to propose the concept of team empowerment as an intrinsic, generic utility function for cooperating agents. We show that agents motivated by team empowerment exhibit recognizable team behaviors in a simple team sports simulation based on Ultimate Frisbee.
This page includes diagrams and screen captures of the software model used to investigate Team Empowerment. It is intended as supporting material for readers of the above paper.
A few years ago I played Ultimate Frisbee for the first time. For a beginner without an armoury of learned behaviours, trying to contribute to team success whilst keeping track of the movements of the players and the flight of the disc is a challenging cognitive process. When I read in "All Else Being Equal Be Empowered" (Klyubin et al., 2005) how empowerment could be colloquially described as “keep your options open”, my early Frisbee strategy came to mind. I was attracted by the idea that intelligent behaviour could emerge in agents in a computer model, motivated only by the application of an elegant mathematical theory of communication.
Calculating empowerment by constructing a tree of possible future states
The empowerment calculation was performed by constructing a graph of possible future states, in which the nodes represent states and the edges represent actions. The recursive algorithm that constructs the graph keeps a running total of terminal nodes (end states) as it builds the graph, so that the total number of possible states is available when the recursion unwinds back to level 0.
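The recursion described above can be illustrated with a minimal sketch. It is not the code used for the paper: the class name `StateTreeSketch`, the 5 x 5 pitch, and the single player with five move options (stay, or step one square north, south, east or west) are all assumptions made for illustration. The sketch only counts terminal nodes and does not merge duplicate end states or convert the total into an empowerment value.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: a hypothetical single-player grid world, not the paper's model. */
class StateTreeSketch {

    // Assumed pitch size for the example.
    static final int WIDTH = 5;
    static final int HEIGHT = 5;

    // Assumed action set: stay, east, west, north, south.
    static final int[][] MOVES = {{0, 0}, {1, 0}, {-1, 0}, {0, 1}, {0, -1}};

    /** Returns the states reachable in one step from (x, y); moves off the pitch are not allowed. */
    static List<int[]> successors(int x, int y) {
        List<int[]> next = new ArrayList<>();
        for (int[] m : MOVES) {
            int nx = x + m[0];
            int ny = y + m[1];
            if (nx >= 0 && nx < WIDTH && ny >= 0 && ny < HEIGHT) {
                next.add(new int[] {nx, ny});
            }
        }
        return next;
    }

    /**
     * Recursively expands the tree of future states and returns the number of
     * terminal nodes (end states) found below the given state.
     */
    static long countEndStates(int x, int y, int lookahead) {
        if (lookahead == 0) {
            return 1; // a terminal node counts as one end state
        }
        long total = 0;
        for (int[] s : successors(x, y)) {
            // each child's total is added as the recursion unwinds
            total += countEndStates(s[0], s[1], lookahead - 1);
        }
        return total;
    }
}
```

Under these assumptions, `StateTreeSketch.countEndStates(0, 0, 2)` returns 11 for a player in the corner and `StateTreeSketch.countEndStates(2, 2, 2)` returns 25 for a player in the centre: the player with more room to move has more end states available.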
The main findings were encapsulated in repeatable JUnit tests. As well as demonstrating the findings, these tests were run during development, together with model-checking tests, to ensure that changes to the code did not change the behaviour of the model in undesirable ways.
Videos of the simulation in Online Mode
The simulation prunes the state tree according to the configurable
settings visible in the user interface. Without the pruning
the simulation is too slow to watch in real-time due to the explosion of
NB. The results of the experiment as outlined in the paper are demonstrated by the offline tests, where none of the pruning options are used and the full state tree is assembled. The optimisations are only available to expedite rapid investigation and discovery in online mode.
The simulation's user interface provides the following controls:
- Scenario: A number of pre-configured scenarios are provided and may be selected by the user in the dropdown box under "Scenario".
- Disc Velocity: Number of squares the disc moves per time step.
- Include team moves: [Tree pruning optimisation] If the checkbox is not ticked, the move options of the player's teammates are not considered (they are pruned from the tree).
- Throw range: [Tree pruning optimisation] The dynamics of the model and the nature of empowerment ensure that, with sufficient lookahead, players will always throw towards a teammate. This optimisation restricts the possible throws to targets within the configured number of squares of a teammate's position.
- Empowerment lookahead: Number of steps ahead included in the empowerment calculation.
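As an illustration of the throw range option above, the following sketch filters candidate throw targets so that only squares close to a teammate remain in the tree. It is a guess at the general idea rather than the simulation's actual code: the class and method names are hypothetical, positions are represented as `{x, y}` arrays, and the distance measure (the larger of the horizontal and vertical offsets) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: hypothetical names, not the simulation's own classes. */
class ThrowRangePruningSketch {

    /**
     * Keeps only the target squares that lie within `range` squares of at
     * least one teammate. Distance is measured as max(|dx|, |dy|), which is
     * an assumption made for this example.
     */
    static List<int[]> pruneThrows(List<int[]> targets, List<int[]> teammates, int range) {
        List<int[]> kept = new ArrayList<>();
        for (int[] t : targets) {
            for (int[] m : teammates) {
                int distance = Math.max(Math.abs(t[0] - m[0]), Math.abs(t[1] - m[1]));
                if (distance <= range) {
                    kept.add(t); // close enough to a teammate, so keep this throw
                    break;       // no need to check the remaining teammates
                }
            }
        }
        return kept;
    }
}
```

A smaller range removes more branches from the tree, which is what makes the online mode fast enough to watch, at the cost of ignoring throws that the full calculation would still consider.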