An interesting paper, presented as a talk at EvoGAMES 2017 in Amsterdam (The Netherlands) by Ivan Bravi.
Co-authored by: Ahmed Khalifa, Christoffer Holmgard, Julian Togelius
ABSTRACT:
At the core of the most popular version of the Monte Carlo Tree Search (MCTS) algorithm is the UCB1 (Upper Confidence Bound) equation. This equation decides which node to explore next, and therefore shapes the behavior of the search process. If the UCB1 equation is replaced with another equation, the behavior of the MCTS algorithm changes, which might increase its performance on certain problems (and decrease it on others). In this paper, we use genetic programming to evolve replacements to the UCB1 equation targeted at playing individual games in the General Video Game AI (GVGAI) Framework. Each equation is evolved to maximize playing strength in a single game, but is then also tested on all other games in our test set. For every game included in the experiments, we found a UCB replacement that performs significantly better than standard UCB1. Additionally, evolved UCB replacements also tend to improve performance in some GVGAI games for which they are not evolved, showing that improvements generalize across games to clusters of games with similar game mechanics or algorithmic performance. Such an evolved portfolio of UCB variations could be useful for a hyper-heuristic game-playing agent, allowing it to select the most appropriate heuristics for particular games or problems in general.
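To make the idea in the abstract concrete, here is a minimal sketch of an MCTS selection step in which the scoring equation is a swappable function. It is not the paper's code: the function names, the exploration constant `c = sqrt(2)`, and the `greedy_mean` replacement are assumptions chosen for illustration, and the evolved equations from the paper are not reproduced here.

```python
import math
from typing import Callable, Sequence, Tuple

# (total_reward, visit_count) for each child of the node being selected from
ChildStats = Tuple[float, int]
# score(total_reward, child_visits, parent_visits) -> value used to rank children
ScoreFn = Callable[[float, int, int], float]


def ucb1(total_reward: float, child_visits: int, parent_visits: int,
         c: float = math.sqrt(2)) -> float:
    """Standard UCB1: mean reward plus an exploration bonus.

    The constant c is an assumption; implementations tune it per domain.
    """
    if child_visits == 0:
        return math.inf  # unvisited children are tried first
    mean = total_reward / child_visits
    return mean + c * math.sqrt(math.log(parent_visits) / child_visits)


def greedy_mean(total_reward: float, child_visits: int,
                parent_visits: int) -> float:
    """Hypothetical replacement (not from the paper): exploitation only."""
    if child_visits == 0:
        return math.inf
    return total_reward / child_visits


def select_child(children: Sequence[ChildStats],
                 score: ScoreFn = ucb1) -> int:
    """Return the index of the child with the highest score."""
    parent_visits = sum(visits for _, visits in children)
    scores = [score(reward, visits, parent_visits)
              for reward, visits in children]
    return max(range(len(children)), key=scores.__getitem__)


# One well-explored child with a high mean, one barely explored child.
stats = [(9.0, 10), (0.5, 1)]
explore_choice = select_child(stats, ucb1)         # favours the rarely visited child
exploit_choice = select_child(stats, greedy_mean)  # favours the higher mean
```

With these statistics the two equations pick different children, which is the kind of behavioral change the paper exploits: a genetic-programming search would, under this framing, be looking for expressions to pass in place of `ucb1`.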
PRESENTATION:
Ivan Bravi – EvoGAMES – Evolving UCT alternatives
LINK TO THE PAPER:
https://link.springer.com/chapter/10.1007/978-3-319-55849-3_26