AI beats humans in Stratego

An artificial intelligence system from the British company DeepMind has learned to trick and defeat human opponents in the board game Stratego, a game with an unimaginably high number of possible scenarios.

An artificial intelligence (AI) system has defeated expert human players at the board game Stratego. This war-themed board game has more possible game scenarios than chess, go or poker.

The system, developed by the British company DeepMind, became one of the highest-ranked online players of the Napoleon version of Stratego. It did so by, among other things, bluffing with weaker pieces and sacrificing important pieces where necessary.

“For us, the most surprising behavior of the AI was its ability to sacrifice valuable pieces to gain information about the opponent’s lineup and strategy,” says DeepMind researcher Julien Perolat.

10^535 game situations

In Stratego, two players try to capture the opponent’s flag, which is hidden somewhere among the forty game pieces. Most of the pieces are soldiers, ranked from one to ten. When two soldiers meet on the board, the higher-ranked soldier beats the lower-ranked one (except that the spy can beat the marshal).
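The combat rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the encoding of the spy as rank 1 and the marshal as rank 10, the function name `resolve_attack`, and the handling of equal ranks are not taken from the article, and bombs, miners and the flag are left out.

```python
SPY = 1       # assumed encoding: the spy is the lowest rank
MARSHAL = 10  # assumed encoding: the marshal is the highest rank


def resolve_attack(attacker: int, defender: int) -> str:
    """Return which piece survives: 'attacker', 'defender' or 'both' removed."""
    # Exception from the text: the spy can beat the marshal.
    # In the standard rules this applies only when the spy attacks.
    if attacker == SPY and defender == MARSHAL:
        return "attacker"
    # Equal ranks remove both pieces (standard rule, not stated in the article).
    if attacker == defender:
        return "both"
    # Otherwise the higher-ranked piece wins.
    return "attacker" if attacker > defender else "defender"
```

For example, `resolve_attack(SPY, MARSHAL)` returns `"attacker"`, while `resolve_attack(5, 7)` returns `"defender"`.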

As long as pieces have not yet encountered each other, the players cannot see the identity of the opponent’s pieces. This makes Stratego different from games like chess and go, where both players can see all available information from the start.

What makes Stratego even more complicated is that there are as many as 10^535 possible game situations. For comparison, go has 10^360 possible game states; chess and poker have even fewer.

Optimal strategy

Perolat and his colleagues at DeepMind called their AI DeepNash. They taught the system Stratego by letting it play against itself 5.5 billion times. The simulated training time roughly corresponded to a few centuries of playing Stratego. The AI had no knowledge whatsoever of existing human strategies, and it was not trained to play against specific opponents.

It would take far too much computing time to go through all possible game scenarios during training. Instead, DeepNash uses an algorithm that continuously steers its behavior toward an optimal strategy based on game theory, says DeepMind researcher Karl Tuyls. That optimal strategy guarantees a win rate of at least 50 percent against a flawlessly playing opponent, even if that opponent knows exactly what the AI is up to.
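The guarantee described above can be illustrated with a much smaller game. The sketch below uses rock-paper-scissors, not Stratego, and it is not DeepNash's training algorithm; the payoff table and the function names are assumptions chosen for illustration. It checks that the equilibrium strategy (each move with probability one third) breaks even against every possible reply, whereas a biased strategy can be exploited by an opponent who knows it.

```python
# Payoff to the row player: +1 for a win, 0 for a draw, -1 for a loss.
# Move order: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],   # rock against rock, paper, scissors
    [1, 0, -1],   # paper against rock, paper, scissors
    [-1, 1, 0],   # scissors against rock, paper, scissors
]


def expected_payoff(row_strategy, col_strategy):
    """Average payoff to the row player when both players mix their moves."""
    return sum(
        p * q * PAYOFF[i][j]
        for i, p in enumerate(row_strategy)
        for j, q in enumerate(col_strategy)
    )


def worst_case(row_strategy):
    """Payoff against the best reply of an opponent who knows the strategy.

    A best reply can always be found among the pure moves, so it is
    enough to check each of the three moves in turn.
    """
    pure_replies = [[1.0 if j == k else 0.0 for j in range(3)] for k in range(3)]
    return min(expected_payoff(row_strategy, reply) for reply in pure_replies)


equilibrium = [1 / 3, 1 / 3, 1 / 3]
biased = [0.5, 0.25, 0.25]  # plays rock too often

print(worst_case(equilibrium))  # breaks even in the worst case
print(worst_case(biased))       # negative: always playing paper exploits it
```

An average payoff of zero in this toy game plays the role of the 50 percent win rate in the text: the equilibrium player cannot be pushed below it, no matter what the opponent knows.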

The result is an AI that can make winning decisions despite the hidden information about the opponent’s position, a huge number of possible game states and many possible options on each turn. “This is something we couldn’t do before,” says AI researcher Julian Togelius from New York University.

Stratego world championship for computers

DeepNash has defeated both human and computer-controlled opponents. In fifty games on an online gaming platform against expert human players, the system achieved a win rate of 84 percent. That made it one of the top three players. The human opponents didn’t know they were playing against an AI.

In addition, the AI achieved a win rate of 97 percent against a number of computer players. Among them were several that had previously won the World Stratego Championship for computers.

“Good players can remember their opponent’s pieces and predict what patterns they will move in,” says Georgios Yannakakis, a computer game researcher at the University of Malta. “DeepNash does both well, thanks in part to a competitive advantage in memory. It plays in interesting and unpredictable ways, with elements of bluff.”
