Coding Hell

Programming and stuff.

Tic-tac-toe AI Part 3: Game Tree for the Win

Writing a bot for a game as simple as tic-tac-toe is not that hard. As there are only 255,168 possible games, you can easily simulate all of them to search for good moves.

So the first thing the bot does is generating all possible game endings based on the current situation. The resulting situations can be stored in a so called Game Tree:

Image source: Wikipedia/Gdr

To generate the game tree, we will use a RecursiveAction to speed things up:

SimulatingBot$SimulationTreeNode
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
class SimulationTreeNode extends RecursiveAction {
  private static int FORK_THRESHOLD = 6;
  SimulationTreeNode[] possibleMoves;
  private GameGrid gameGrid;

  /**
   * Last position set that let to the situation now
   */
  private byte chosenPosition;

  /**
   * Player who should act now
   */
  private Field activePlayer;

  protected boolean isComputersTurn() {
    return activePlayer == playersField;
  }

  @Override
  protected void compute() {
    if (gameGrid.getWinner() != null || gameGrid.countFreeFields() == 0) {
      return;
    }
    int freeFields = gameGrid.countFreeFields();
    possibleMoves = new SimulationTreeNode[freeFields];
    try {
      int counter = 0;
      for (byte i = 0; i < 9; ++i) {
        if (gameGrid.getField(i) == Field.EMPTY) {
          SimulationTreeNode child = new SimulationTreeNode(
              gameGrid, i, otherPlayer(activePlayer));
          possibleMoves[counter++] = child;
          if (freeFields < FORK_THRESHOLD) {
            child.compute();
          }
        }
      }
      if (freeFields >= FORK_THRESHOLD) {
        invokeAll(Arrays.asList(possibleMoves));
      }
    } catch (EngineException e) {
      e.printStackTrace();
    }
  }
}

Each SimulationTreeNode starts a new thread to generate all possible moves. If less then 6 free fields are left, no new threads are started (wouldn’t improve performance anymore). The generated sub-situations are stored in an array.

Now when it is the computers turn, it can build the game tree or search for the new situation in the existing game tree (after players move):

SimulatingBot
1
2
3
4
5
6
7
8
9
10
11
@Override
public int getNextMove() {
  if (simulationTree == null) {
    generateSimulationTree();
  } else {
    simulationTree = simulationTree.getChild(gameGrid.hashCode());
  }
  SimulationTreeNode move = simulationTree.getBestMove();
  simulationTree = move;
  return move.getPosition();
}

What should I do?

You may noticed the getBestMove call to the simulation tree. This method searches the best possible move. Often there are multiple moves with the same “score”. A random one is returned in that situation:

SimulatingBot$SimulationTreeNode
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
public SimulationTreeNode getBestMove() {
  double bestScore = -1;
  List<SimulationTreeNode> bestMoves = new ArrayList<>();
  for (SimulationTreeNode child : possibleMoves) {
    if (child.getScore() > bestScore) {
      bestScore = child.getScore();
      bestMoves.clear();
    }
    if (child.getScore() == bestScore) {
      bestMoves.add(child);
    }
  }
  int index = randomGenerator.nextInt(bestMoves.size());
  SimulationTreeNode bestMove = bestMoves.get(randomGenerator.nextInt(index);
  return bestMove;
}

What does this “score” mean? The higher, the better: A defeat would be bad for us, so it has score -1. A draw would be OK, so it has score 0. Winning would be good, so it has score 1:

SimulatingBot$SimulationTreeNode
1
2
3
4
5
6
7
8
9
10
11
12
13
private byte calculateScore() {
  if (gameGrid.getWinner() == null) {
    if (gameGrid.countFreeFields() == 0) {
      return 0; // draw
    } else {
      // Magic happens here!
    }
  } else if (gameGrid.getWinner() == playersField) {
    return 1; // win
  } else {
    return -1;
  }
}

To calculate the score of situations which are not yet completed (no winner and still empty fields), we use the Minimax algorithm.

When it is the other players turn, we assume that he will try to do the best he can. So we must assume the lowest score of all possible moves for the situation. Because when we end up with that situation, the player could do this move that would hurt us.

When it is the computers turn, it will do the best it can. That’s why we can assume the highes score of all possible moves. Because when we end up with that situation, we may have a chance to do the move with that good score.

We end up with something like this:

SimulatingBot$SimulationTreeNode
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
private byte calculateScore() {
  if (gameGrid.getWinner() == null) {
    if (gameGrid.countFreeFields() == 0) {
      return 0; // draw
    } else {
      byte score = (byte) (isComputersTurn() ? -1 : 1);
      for (SimulationTreeNode child : possibleMoves) {
        if (isComputersTurn()) {
          // computer will move, so search for good choice
          if (child.getScore() > score) {
            score = child.getScore();
          }
        } else {
          // other player will move and tries to hurt us!
          if (child.getScore() < score) {
            score = child.getScore();
          }
        }
      }
      return score;
    }
  } else if (gameGrid.getWinner() == playersField) {
    return 1; // win
  } else {
    return -1; // defeated
  }
}

Comments