Sequencing a game with AI

As promised, here is a brief summary of the changes in paradigm I had to work out to make the min max algorithm work correctly, for my Tic Tac Toe AI model, but also more generally for any game (and eventually MTG).

The basis of any minimax strategy is to explore all possibilities of moves by every players (within a range of course β€” exploring the whole game is computationally impossible). Let’s say we want to explore the possible futures of a given point in the game. That “point in the game” is defined by two things:

  • The state of the elements of the game (the location of the cards, the values of some properties on those cards, the life of the players, etc…)
  • The state of the game flow (the point in the game logic where the game is at, waiting for some user input)

Note that when evaluating possible moves for a given point in the game, I assume the game flow is always “waiting” on a user input because this is what triggers the evaluation in the first place (the AI player has to decide what to do). So, to evaluate a possible move is to continue the game flow from the point where the evaluation starts given that move. Then, to evaluate another move from the same point, one has to rollback both types of states.

Rollbacking the state of the elements was pretty straightforward for Mox because I already was careful to establish a transactional object model from the beginning, knowing that this kind of issue would rise (it also has lots of other advantages). Another naive way to rollback the state of the elements would be to save the state before the operation and then restore it later on. That approach doesn’t scale very well when the state is complex and costly to save/restore.

Rollbacking the other type of state (the game flow state) can be somewhat more tricky. The naive approach to game flow (which I was taking up to this point) was simply to compose the gameflow from functions (i.e. the way you compose a program). Unfortunately, there is really no easy way to “rollback” a running program to an earlier function (while keeping some other state intact, like minimax data!). So you have to promote those “game flow functions” to first class citizen objects, which I will call Steps in this post.

Steps are just objects like cards or integers. The game flow is then composed of a sequence of those steps. I have a Sequencer object that coordinates the execution of those steps. Now, a crucial thing that I require of those steps is immutability. That means that, once created, a step cannot and will not ever change. This property has the nice effect that executing a step twice with the same input always yields the same results. Basically it makes steps behave like normal functions (you would not expect a function to change in the middle of your program).

Rollbacking the game flow and trying another “path” is then simply equivalent to executing the same step over and over again with different inputs (the different possibilities we want to evaluate). Although they are semantically equivalent to normal functions, I couldn’t figure out how to do this practically with normal functions. That means I’ll have to transpose the current game flow that I have in Mox to this new paradigm.

Wow, that was longer than I thought! I hope I have not bored everyone by now!

On a side note, I got alpha-beta pruning working with my Tic Tac Toe model. Good night!

Advertisements

About fparadis2

Lead Game programmer
This entry was posted in Mox. Bookmark the permalink.

4 Responses to Sequencing a game with AI

  1. Incantus says:

    Hi,

    It’s very interesting to read about your progress in building an AI. Can you talk a little more about how you designed Mox to make rollback easy? Initially I never planned on doing an AI so I didn’t build an undo capability from the start, so now I’m not sure how to go about adding it to my program (especially when you consider things like triggered abilities – how do you rollback the event firing a trigger?). Another approach would be to copy subtrees and just discard them when you are done evaluating them, but there is so much state in a game of magic that this is probably unfeasible.

  2. fparadis2 says:

    Hi, thanks for your interest πŸ™‚

    The key to my object model is the command pattern (http://en.wikipedia.org/wiki/Command_pattern). Every change on the game state must be done through a command, which is basically just an object that encapsulates the change and how to roll it back (undo). It’s basically the same pattern you would use for implementing undo/redo in any typical application (I don’t know how many of those I’ve coded in my life! πŸ˜› )

    Note that those commands are invisible to the “client” of the object model. It’s all going on under the hood. For example, setting the life of a player is still done with
    thePlayer.Life = 20;
    but under the hood, there’s a really a command there being constructed, pushed on the current transaction, and executed.

    Transactions allow code like this:

    using (ITransaction transaction = game.BeginTransaction())
    {
    // Do stuff with the game
    transaction.Rollback();
    }

    After the rollback, the game is brought back exactly to where it was by undoing all the commands that were executed during the lifetime of the transaction.

    Because the commands are light (they only carry the information about the change), it’s not too costly to rollback. If you would do many many changes in a single transaction, I guess there is a point where saving/restoring the whole game state as a big chunk would be faster. Don’t think it will be an issue for me.

    You don’t have to worry about events.. something that happened because an event was fired will also be rollbacked with the rest (everything goes through the current transaction). Because the game state is exactly as it was before, you can fire events again, etc.. without any problems.

    About commands and transactions… they also are what enables easy networking for me. When a command is pushed, it is replicated through the network and also executed on the client side (this is light because the command only contains information about the atomic change that happened). Because the clients run the same commands in the same order, they see the same game.

    Hope it helps πŸ™‚

  3. telengard says:

    I think you’ll find that doing and undo’ing all those changes chew up a lot of CPU when the AI is running at high depths. That is a good portion of the time in my app.
    I did it the same way you mentioned but didn’t even realize it was a std software pattern. πŸ™‚

    Also, something I ran into that you may, is having your depth in some places go further into a turn skewing the AI’s idea of a good move. I got around this by having the AI stop when either the maximum phase depth was passed or the tree depth, whichever happened first.

    ~telengard

    • fparadis2 says:

      Thanks for the input, will keep that in mind πŸ™‚ I still have a few cards up my sleeves if it becomes a performance bottleneck!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s