This week’s red-hot “Wow, Science!” news is the pronouncement, as many articles are happy to present it, that Poker Is Solved.
Loosely speaking, what this means is that if you invite a properly-programmed computer to join your poker evenings, you’ll lose money week after week.
Actually, the idea of a game being “solved” means a bit more than that.
A top-ranked chess-playing computer will beat you every time, but chess is not “solved” to the point that every possible position that a game can reach has been enumerated, evaluated and pigeon-holed.
To win at chess, a computer works forward from the current position down as many possible paths as it can, and picks the move that seems to give the best outcome it can find, regardless of what the other player does.
For much of the game, however, there are far too many possible paths to try them all.
So a computer, like a human player, looks ahead for only a few moves and hopes it has gone far enough to keep an edge over its opponent.
There usually isn’t time (competitive chess is played against the clock) to make sure that an improvement over the next four moves, say, will inevitably lead to a further improvement in the four moves after that.
Indeed, there are said to be more than 10123 (about 2408) different possible chess games, compared to just 1020 grains of sand on Bondi beach.
But the story about poker says that it really is “solved.”
That’s because the paper in January’s Science Magazine is unambiguously entitled: Heads-up limit hold’em poker is solved.
Let’s be quite clear: that’s a truly cool title for a research paper.
But it isn’t quite the same as “Poker is solved,” because:
- As the abstract itself admits, the paper stops just short of a true solution, with the authors stating, “we announce that heads-up limit Texas hold’em is now essentially weakly solved.”
- Only one particular form of one particular flavour of poker was considered.
We’d love to give you more details from the paper itself, but it’s subscription only.
However, we can tell you a bit more about the sort of poker game that the authors analysed.
Texas Hold’em is what you most often see on TV, where the players each receive two secret cards (the hole), followed by three face-up “community cards” called the the flop, shared by everyone.
Players bet after the hole cards are dealt, and then when the flop has been, well, flopped.
Two more rounds of betting follow the dealing of a fourth community card (the turn) and then a fifth (the river).
Your hand is the best 5-card poker hand that can be made from your two-card hole plus the community cards.
The variant called Heads-up means there are only two players; Limit means that every bet is a fixed amount.
(“All-in,” where a player bets everything he has left at the table in a bold and dramatic TV moment, is only possible in a No Limit poker game.)
→ Did you know (this has no relevance to the paper, by the way) that there are fewer 5-card poker hands you can make from 7 cards than from 5 cards? That’s because the weakest possible 5-card hands, seven-high and eight-high, aren’t possible when you have 7 cards to choose from.
The comparative algorithmic simplicity of Heads-up Limit Hold’em still leaves a lot of different ways for a game to play out, and a lot of different decision points player might reach.
Apparently, the important numbers are 3 x 1017 possible games, with 3 x`1014 possible decision points, where you need to choose whether to bet, raise or fold.
Although that’s a lot smaller than the numbers for chess, there is a complexity in poker that board games such as chess and draughts (chequers) don’t have.
Chess, for example, is known as a perfect information game, because the details of every move are known to both players.
Poker, however, is an imperfect information game: if your opponent folds, for example, you win without knowing what hand he had.
In fact, his hand might have been stronger than yours, but he wasn’t willing to risk it, and decided to capitulate without giving you any hints about his risk-assessment strategy.
So a computer has to build up its technique with a huge number of experimental hands in which, amongst other things, it keeps track of the options it didn’t take whenever it reaches a dead end.
That way, in the future it can favour the moves it wished it had made last time, including bluffing instead of playing straight, or vice versa, and see where they lead instead.
Amusingly, this technique is called CFR, for counterfactual regret minimisation, because that’s exactly what it tries to do: avoid the wrong choices, and reduce your regret.
The relevance to security
We have to admit that this story doesn’t have any direct relevance to computer security: it was just so funky that we couldn’t go past it.
But it does act as a reminder that computer-based attacks only ever get stronger.
Apparently, the team behind this research completed their “solution” by using 200 computers, each with 24 processors, running in parallel for more than two months.
Their data storage requirements were nearly 300TB, which they didn’t have, but a space-time tradeoff let them get away with just over 10TB, using data compression that increased runtime only slightly.
This is exactly the sort of approach that cryptographic crackers take, allowing them to attack ever more complex problems and succeed.
And that’s why we keep reminding you to Pick a Proper Password!
→ Can’t view the video on this page? Watch directly from YouTube. Can’t hear the audio? Click on the Captions icon for closed captions.