As promised, here is a brief summary of the changes in paradigm I had to work out to make the min max algorithm work correctly, for my Tic Tac Toe AI model, but also more generally for any game (and eventually MTG).
The basis of any minimax strategy is to explore all possibilities of moves by every players (within a range of course — exploring the whole game is computationally impossible). Let’s say we want to explore the possible futures of a given point in the game. That “point in the game” is defined by two things:
- The state of the elements of the game (the location of the cards, the values of some properties on those cards, the life of the players, etc…)
- The state of the game flow (the point in the game logic where the game is at, waiting for some user input)
Note that when evaluating possible moves for a given point in the game, I assume the game flow is always “waiting” on a user input because this is what triggers the evaluation in the first place (the AI player has to decide what to do). So, to evaluate a possible move is to continue the game flow from the point where the evaluation starts given that move. Then, to evaluate another move from the same point, one has to rollback both types of states.
Rollbacking the state of the elements was pretty straightforward for Mox because I already was careful to establish a transactional object model from the beginning, knowing that this kind of issue would rise (it also has lots of other advantages). Another naive way to rollback the state of the elements would be to save the state before the operation and then restore it later on. That approach doesn’t scale very well when the state is complex and costly to save/restore.
Rollbacking the other type of state (the game flow state) can be somewhat more tricky. The naive approach to game flow (which I was taking up to this point) was simply to compose the gameflow from functions (i.e. the way you compose a program). Unfortunately, there is really no easy way to “rollback” a running program to an earlier function (while keeping some other state intact, like minimax data!). So you have to promote those “game flow functions” to first class citizen objects, which I will call Steps in this post.
Steps are just objects like cards or integers. The game flow is then composed of a sequence of those steps. I have a Sequencer object that coordinates the execution of those steps. Now, a crucial thing that I require of those steps is immutability. That means that, once created, a step cannot and will not ever change. This property has the nice effect that executing a step twice with the same input always yields the same results. Basically it makes steps behave like normal functions (you would not expect a function to change in the middle of your program).
Rollbacking the game flow and trying another “path” is then simply equivalent to executing the same step over and over again with different inputs (the different possibilities we want to evaluate). Although they are semantically equivalent to normal functions, I couldn’t figure out how to do this practically with normal functions. That means I’ll have to transpose the current game flow that I have in Mox to this new paradigm.
Wow, that was longer than I thought! I hope I have not bored everyone by now!
On a side note, I got alpha-beta pruning working with my Tic Tac Toe model. Good night!