Peng, Gao and Soriano Marcolino, Leandro (2022) Partially Observable Monte Carlo Planning in Public Game Tree for complex imperfect information game King UP. Masters thesis, Lancaster University.
2022GaoMRes.pdf - Published Version
Available under License Creative Commons Attribution-ShareAlike.
Download (21MB)
Abstract
In the past, Game AI has been a challenging area where humans devote much effort to it. The game could be classified by whether players could fully observe the state of the game into two different game types: perfect information game and imperfect information game. While perfect information game has been studied well, like Go and chess, the research on imperfect-information games is limited to poker and dice. The board game King Up is a sort of imperfect information game with more complicated rules and attributes, such as partial observation, multiple players, partly shared goals, mixed strategy (cooperate and compete temporarily for goals), and a mixture of dynamic game and static game (vote and move). Until now, King UP has not been solved by existing methods. Extended from Partially Observable Monte Carlo Planning, we propose a novel algorithm called POMCP in a Public Game Tree to solve the problem, in which we also combine the techniques from CFR with POMCP. The public game tree is a data structure used to store belief over partial observable information and could be applied in most imperfect information games with multiple players; inspired by CFR, we propose local search and decision estimation processes to compute the value function over correlated state and action; we also implement two decision estimation methods, deterministic policy by Upper Confidence Bound and stochastic policy by Regret Matching. In the experiment, we evaluate the convergency situation of the algorithm. Also, we evaluate the algorithm’s performance by further experiments like tournaments between different decision estimations and verify the generalization by Rock-PaperScissors. At last, we rethink the motivation, insights, and contributions. Our algorithm could work with less prior knowledge and perform well in complex, imperfectinformation games with multiple players like King UP and mixture game types (dynamic and static game).