These are the presentation slides for the following paper:
Shubu Yoshida, Makoto Ishihara, Taichi Miyazaki, Yuto Nakagawa, Tomohiro Harada, and Ruck Thawonmas, "Application of Monte-Carlo Tree Search in a Fighting Game AI," accepted for presentation at the 5th IEEE Global Conference on Consumer Electronics (GCCE 2016), Kyoto, Japan, Oct. 11-14, 2016.
Application of Monte Carlo Tree Search in a Fighting Game AI (GCCE 2016)
1. Application of Monte-Carlo Tree Search in a Fighting Game AI
Shubu Yoshida, Makoto Ishihara, Taichi Miyazaki,
Yuto Nakagawa, Tomohiro Harada, and Ruck Thawonmas
Intelligent Computer Entertainment Laboratory
Ritsumeikan University
2. Outline
1. Background of this research
2. Monte-Carlo Tree Search
3. Monte-Carlo Tree Search for a Fighting Game
4. Experimental Environment
5. Experimental Method
6. Result
7. Competition result in 2016
8. Conclusion
3. Background (1/2)
A Fighting Game AI Competition is held every year [1]
High-ranking AIs = rule-based (until 2015)
Rule-based : the same action in the same situation
Human players can easily predict the AI’s action patterns and
outsmart it
[1] http://www.ice.ci.ritsumei.ac.jp/~ftgaic/
4. Background (2/2)
Apply the Monte-Carlo Tree Search (MCTS)
to a fighting game AI
Decides the AI’s next action by stochastic simulations
Already successful in many games [2][3]
We evaluate the effectiveness of MCTS on a fighting game
[2] S. Gelly et al., “The Grand Challenge of Computer Go: Monte Carlo Tree Search and Extensions,” Communications of the ACM, vol. 55, no. 3, pp. 106-113, 2012.
[3] N. Ikehata and T. Ito, “Monte-Carlo Tree Search in Ms. Pac-Man,” in Proc. IEEE Conference on Computational Intelligence and Games (CIG), pp. 39-46, 2011.
5. Monte-Carlo Tree Search (1/5)
[Figure: overview of MCTS, consisting of four steps (selection, expansion, simulation, backpropagation) repeated until the set time has elapsed]
6. Monte-Carlo Tree Search (2/5)
7. Formula of UCB1
UCB1_i = X̄_i + C √( 2 ln N_i^p / N_i )
・X̄_i : the average reward (the evaluation value); this is the exploitation term
・C : the balance parameter
・N_i^p : the total number of times the parent node of node i has been visited
・N_i : the total number of times node i has been visited
The second term is the exploration term: it preferentially selects a child node that has been visited less.
8. Monte-Carlo Tree Search (3/5)
9. Monte-Carlo Tree Search (4/5)
10. Monte-Carlo Tree Search (5/5)
12. MCTS for a Fighting Game (2/2)
[Figure: the expansion and simulation steps as modified for a fighting game, compared with normal MCTS]
13. Experimental Environment
FightingICE
Used as the platform of international fighting game AI competition
1 game : 3 rounds
- 1 round : 60 seconds
myScore = oppHP / (myHP + oppHP) × 1000
Response time : 16.67 ms
14. Experimental Method
MCTSAI (the AI applying MCTS) vs. the top five AIs of the 2015 tournament
These 5 AIs : rule-based
100 games (50 games on each side)
TABLE I: THE PARAMETERS USED IN THE EXPERIMENTS
Notation   Meaning                              Value
C          Balance parameter                    3
N_max      Threshold of the number of visits    10
D_max      Threshold of the depth of the tree   2
T_sim      The number of simulations            60 frames
20. Competition result in 2016
[Figure: bar chart of competition results; orange = 1st place, blue = 2nd, green = 3rd in each game]
Total rank:
AI                    Rank
BANZAI                11
DragonSurvivor        12
iaTest                 7
IchibanChan            9
JayBot2016             5
KeepYourDistanceBot   10
MctsAi                 3
MrAsh                  4
Poring                 8
Ranezi                 2
Snorkel               13
Thunder01              1
Tomatensimulator       6
Triump                14
21. Conclusion
Applied MCTS to fighting game AI
Showed that MCTS in fighting game AI is effective
Future work
In fighting games, random simulation of the opponent’s behavior is
not effective
Predict the opponent’s behavior and use this information in
simulation
Hello everyone. My name is Shubu Yoshida, from the Intelligent Computer Entertainment Laboratory, Ritsumeikan University.
I’d like to talk about “Application of Monte-Carlo Tree Search in a Fighting Game AI” .
This is the outline of my presentation.
I’d like to talk about these contents.
A Fighting Game AI Competition is held every year.
In this competition, the high-ranking AIs are mainly well-tuned rule-based AIs, which always conduct the same action in the same situation.
Rule-based AIs take predetermined actions.
Human players can easily predict such an AI’s action patterns and outsmart it.
Moreover, if the parameters of its actions are changed, a rule-based AI’s strength changes as well.
In order to solve this problem, we apply MCTS to a Fighting Game AI.
MCTS decides the AI’s next action by stochastic simulations.
MCTS-based approaches have produced significantly promising results not only in board games like Go [2],
but also in real-time games like Ms. Pac-Man [3].
It is therefore expected to perform well in a fighting game, since this kind of game is similar to Ms. Pac-Man in being real-time.
In this paper, we evaluate the effectiveness of MCTS on a fighting game.
We modified traditional MCTS for a fighting game.
This figure is an overview of traditional MCTS.
I will explain this first.
After that, I will explain MCTS for a fighting game.
MCTS combines game tree search and the Monte Carlo method.
Each node represents a state of the game, and each edge represents an action.
First, MCTS selects the child node with the highest UCB1 value until it reaches a leaf node.
Each child node has a UCB1 value.
UCB1 value is calculated by this formula.
In this formula, the first term is the evaluation value.
The second term aims that MCTS preferentially selects a child node that has been visited less.
So, this formula aims that MCTS selects a child node which not only has high evaluation value but also has been visited less to prevent local search.
In short, the first term is exploitation and the second term is exploration.
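As a minimal sketch of this formula (the function and variable names are illustrative, not from the paper), UCB1 for a child node can be computed as:

```python
import math

def ucb1(avg_reward, c, parent_visits, visits):
    """UCB1 value of a child node: exploitation term (average reward)
    plus an exploration bonus that grows for rarely visited nodes."""
    if visits == 0:
        return float("inf")  # unvisited children are always tried first
    return avg_reward + c * math.sqrt(2 * math.log(parent_visits) / visits)
```

Selection then simply descends to the child with the highest `ucb1` value; the balance parameter `c` trades off exploitation against exploration.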
Second, after arriving at a leaf node, if its number of visits exceeds a pre-defined threshold and the depth of the tree has not reached the upper limit,
MCTS will create child nodes from it.
Third, it performs a random simulation from the root node through the leaf node.
It simulates until the end of the game.
In this part, the opponent’s actions are selected randomly, while the player’s actions are those along the selected path.
After performing these actions, we obtain a reward and a resulting state.
Finally, it propagates the result of the simulation from the leaf node to its parent, recalculates UCB1 values, and repeats this propagation up to the root node.
The above four steps are repeated within the allowed time budget. Then the child of the root node with the highest number of visits is chosen.
In fighting games, UCB1 is defined by this formula.
The evaluation value of node i is the average of the opponent character’s hit-point loss minus that of the player character.
This value is higher when our AI deals a lot of damage to the opponent while taking little damage itself.
Each parameter shows an AI’s HP before and after the j-th simulation.
The first term is the player’s score difference before and after a simulation.
The second term is the opponent’s.
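Reconstructed from the description above (the symbol names are my own, not necessarily those of the paper), the evaluation value used in place of the plain average reward is roughly:

```latex
\bar{X}_i = \frac{1}{N_i}\sum_{j=1}^{N_i}
\left[\bigl(\mathit{afterHP}^{j}_{my}-\mathit{beforeHP}^{j}_{my}\bigr)
    - \bigl(\mathit{afterHP}^{j}_{opp}-\mathit{beforeHP}^{j}_{opp}\bigr)\right]
```

Each summand is positive when, in the j-th simulation, the opponent lost more hit points than the player did.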
In the expansion step, traditional MCTS expands only one node at a time.
In this paper, we instead expand nodes for all actions the AI can take.
Fighting games have many actions, and real-time games have a search-time limit.
We want to explore every node at least once.
So we expand nodes for all available actions at once.
In the simulation step, board-game MCTS simulates until the end of the game.
But real-time games have limited thinking time.
So we put a restriction on the tree depth.
These are the main changes in MCTS for fighting games.
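Putting the steps together, the following is a minimal sketch of depth-limited MCTS with the two modifications just described (expand all actions at once, depth-limited simulation). The `Node` class, parameter names, and the caller-supplied `simulate` stand-in are illustrative, not taken from the paper or FightingICE:

```python
import math

class Node:
    def __init__(self, action=None, parent=None, depth=0):
        self.action = action
        self.parent = parent
        self.depth = depth
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

    def ucb1(self, c):
        if self.visits == 0:
            return float("inf")
        return (self.total_reward / self.visits
                + c * math.sqrt(2 * math.log(self.parent.visits) / self.visits))

def mcts(actions, simulate, iterations, c=3.0, n_max=10, d_max=2):
    """Depth-limited MCTS with expand-all-actions.
    simulate(path) plays the given own-action path (the opponent acting
    randomly inside it) and returns a reward."""
    root = Node()
    root.children = [Node(a, root, 1) for a in actions]  # expand all actions at once
    for _ in range(iterations):
        # selection: descend by highest UCB1 until a leaf
        node = root
        while node.children:
            node = max(node.children, key=lambda ch: ch.ucb1(c))
        # expansion: only if visited often enough and below the depth limit
        if node.visits >= n_max and node.depth < d_max:
            node.children = [Node(a, node, node.depth + 1) for a in actions]
            node = node.children[0]
        # simulation along the path of own actions from root to this leaf
        path = []
        n = node
        while n.parent is not None:
            path.append(n.action)
            n = n.parent
        reward = simulate(list(reversed(path)))
        # backpropagation: update visits and rewards up to the root
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    # final choice: the most-visited child of the root
    return max(root.children, key=lambda ch: ch.visits).action
```

In a real FightingICE agent, `iterations` would instead be bounded by the 16.67 ms response time, and `simulate` would advance the game state for `T_sim` frames.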
In the experiments, we used FightingICE as the fighting game platform. FightingICE is a 2D fighting game developed by our laboratory for game AI research.
It is used as the platform of the international fighting game AI competition recognized by IEEE CIG.
The player AI’s score (myScore) is calculated by this formula.
If it is more than 500, our AI’s performance is superior to the opponent AI’s.
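The score formula can be transcribed directly from the slide (the function name is mine; whether the HP values here denote remaining HP or damage taken follows the FightingICE convention and is not specified on the slide):

```python
def my_score(my_hp, opp_hp):
    """Split 1000 points between the two players by HP ratio, as on the slide:
    myScore = oppHP / (myHP + oppHP) * 1000."""
    return opp_hp / (my_hp + opp_hp) * 1000
```

A score above 500 means our AI took the larger share of the 1000 points.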
Next , experimental method.
We let MCTSAI fight 100 games against the top five AIs of the 2015 tournament, with 50 games on each side.
The behaviors of these five AIs are all rule-based.
And we used these parameters.
The average score against each AI is shown in Fig. 1.
In this figure, the horizontal axis lists the names of the high-ranking AIs.
From left to right, they are the 1st- through 5th-ranked AIs.
The vertical axis represents the average score of MCTSAI against each high-ranking AI.
From this result, the proposed AI outperformed all opponent AIs except the 1st-ranked AI, Machete.
This video is a fighting game scene where P1 is MCTSAI and P2 is RatioBot.
RatioBot is the 4th-ranked AI in the 2015 tournament.
As we can see from this video,
MCTSAI is able to dodge RatioBot’s attacks.
It can be said that the simulation of Monte-Carlo tree search works well.
So MCTS is an effective method in this fighting game.
But the proposed AI did not show a good performance against Machete.
This video is a fighting game scene where P1 is MCTSAI and P2 is Machete.
Machete is a well-tuned rule-based AI that repeatedly conducts short actions requiring fewer frames, which are not well simulated by MCTS’s random simulation.
This is the competition result in 2016.
The horizontal axis lists the names of the AIs.
These numbers represent each AI’s ranking.
In this competition, our MctsAi came 3rd.
So it can be said that MCTS showed good results in an actual tournament as well.
In conclusion we applied MCTS to a fighting game AI.
Results showed that MCTS in fighting game AI is effective.
In this paper, we found that random simulation of the opponent’s behavior is not effective in fighting games.
So, in the future, we plan to add a mechanism for predicting the opponent’s behavior and to use it in simulation.
Use of this kind of mechanism should better simulate the opponent.