AI Software Beats Human Champions at Texas Hold’em Poker

An artificial intelligence (AI) program created by a team at Carnegie Mellon University, in partnership with Facebook AI, has beaten top professionals at six-player no-limit Texas Hold’em, the most popular form of poker in the world. The result is a notable victory for machines in the long-running contest between humans and computers.

Enter Pluribus

The program, named Pluribus, beat professional poker player Darren Elias, who holds the record for most World Poker Tour titles, and Chris “Jesus” Ferguson, who has won six World Series of Poker events. Each of them separately played 5,000 hands of Texas Hold’em against five copies of Pluribus.

A second experiment involved 13 pros, each of whom has won more than $1 million playing poker. Pluribus played five of them at a time over 10,000 hands in total and again emerged the winner.

Tuomas Sandholm, a professor of computer science at Carnegie Mellon, developed Pluribus with Noam Brown, a Ph.D. student in Carnegie Mellon’s computer science department who works for Facebook AI as a research scientist. Sandholm described the achievement as superhuman performance at multi-player poker, which he called a recognized benchmark for measuring artificial intelligence capabilities.

Until now, superhuman AI achievements in strategic reasoning have been limited to two-party interactions. The program’s ability to defeat five opponents in a game as complex as Texas Hold’em could open the way for AI to tackle a wide variety of real-world problems.

Betting Strategies

To play six-player poker rather than one-on-one, the AI program required fundamental changes in its playing strategy. Many in both the AI and gambling communities believe the strategies it developed may change the way professionals play poker.

The program’s algorithm came up with some surprising strategies. One that most human players avoid is “donk betting,” in which a player ends one betting round with a call but opens the next round with a bet. It is typically viewed as a sign of weakness with little strategic value. Even so, Pluribus defeated the professionals while placing far more donk bets than they did.

The program’s biggest advantage is its ability to mix a variety of strategies. Humans try to do the same thing but often fall short in execution, particularly in randomizing their play and applying it consistently. Most people cannot do this as reliably as an AI can.
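The randomization mentioned above can be illustrated with a minimal Python sketch. It is not code from Pluribus: the actions, the probabilities, and the function names are hypothetical and chosen only to show what it means to play a mixed strategy consistently.

```python
import random
from collections import Counter

# Hypothetical mixed strategy for a single decision point. The actions and
# probabilities are illustrative assumptions, not values used by Pluribus.
MIXED_STRATEGY = {"fold": 0.2, "call": 0.5, "raise": 0.3}


def sample_action(strategy, rng):
    """Draw one action at random according to the strategy's probabilities."""
    actions = list(strategy)
    weights = [strategy[action] for action in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def empirical_frequencies(strategy, n_hands, seed=0):
    """Face the same spot n_hands times and report how often each action was chosen."""
    rng = random.Random(seed)
    counts = Counter(sample_action(strategy, rng) for _ in range(n_hands))
    return {action: counts[action] / n_hands for action in strategy}


if __name__ == "__main__":
    observed = empirical_frequencies(MIXED_STRATEGY, 100_000)
    for action, freq in observed.items():
        print(f"{action}: target {MIXED_STRATEGY[action]:.2f}, observed {freq:.3f}")
```

Over many hands the observed frequencies settle close to the target probabilities. A software player does this trivially, whereas a person trying to "raise 30 percent of the time" tends to drift into patterns that opponents can spot.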

Pluribus scored a statistically significant win, which is especially noteworthy given the experience of its competitors. The program was not just playing decent poker; it was playing some of the best poker ever played.

Other Findings

Michael ‘Gags’ Gagliano, who has won $2 million in his career as a professional poker player, also went up against the program. Gagliano said afterward that it was “incredibly fascinating” to see some of the strategies the AI selected. Many of its plays were ones that humans rarely, if ever, make, particularly in how it sized its bets. Gagliano said he believed artificial intelligence would play a key role in the evolution of poker and described getting a first-hand look at the future of the game as “amazing.”

Professor Sandholm has led a research team studying computer poker for more than 16 years. He and Brown previously developed Libratus, a predecessor of the current software. Libratus beat four professional poker players over a combined 120,000 hands of heads-up no-limit Texas Hold’em, the two-player version of the game that Pluribus won.

Games such as chess and Go have served as milestones in the development of artificial intelligence. In those games, every player can see the state of the board and all of the pieces. Poker poses a harder challenge because it is a game of incomplete information: competitors can never be sure which cards are in play, and their rivals will often bluff. That same quality makes the technology more broadly relevant, because real-world problems usually involve many parties and unknown variables.

Nash Equilibrium

Every AI program that has demonstrated superhuman capabilities at two-player games has done so by approximating a Nash equilibrium, named after the mathematician and Carnegie Mellon alumnus John Nash. A Nash equilibrium is a set of strategies, one for each player, in which no player can gain by changing strategy while the other players’ strategies stay fixed. In a two-player zero-sum game, such a strategy guarantees a result no worse than a tie, and the AI comes out ahead when an opponent miscalculates and departs from the equilibrium.
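The definition can be checked mechanically on a toy game. The sketch below uses rock-paper-scissors rather than poker, and its function names are illustrative assumptions, not part of any poker software. It measures how much either player could gain by switching strategy on their own; a pair of strategies is a Nash equilibrium when that gain is zero.

```python
# Row player's payoffs in rock-paper-scissors (+1 win, 0 tie, -1 loss).
# The game is zero-sum, so the column player's payoff is the negative.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]


def expected_payoff(row_strategy, col_strategy):
    """Expected payoff to the row player when both sides play mixed strategies."""
    return sum(
        p * q * PAYOFF[i][j]
        for i, p in enumerate(row_strategy)
        for j, q in enumerate(col_strategy)
    )


def best_deviation_gain(row_strategy, col_strategy):
    """Largest gain either player could get by unilaterally changing strategy.

    Checking pure strategies is enough, because the payoff of any mixed
    deviation is a weighted average of the payoffs of pure deviations.
    """
    n = len(PAYOFF)
    pure = [[1.0 if k == i else 0.0 for k in range(n)] for i in range(n)]
    current = expected_payoff(row_strategy, col_strategy)
    row_gain = max(expected_payoff(s, col_strategy) for s in pure) - current
    # The column player receives the negative of the row player's payoff.
    col_gain = max(-expected_payoff(row_strategy, s) for s in pure) + current
    return max(row_gain, col_gain)


def is_nash_equilibrium(row_strategy, col_strategy, tol=1e-9):
    """True when neither player can improve by deviating alone."""
    return best_deviation_gain(row_strategy, col_strategy) <= tol


if __name__ == "__main__":
    uniform = [1 / 3, 1 / 3, 1 / 3]
    rock_heavy = [0.5, 0.25, 0.25]
    print(is_nash_equilibrium(uniform, uniform))
    print(is_nash_equilibrium(rock_heavy, uniform))
    print(best_deviation_gain(rock_heavy, uniform))
```

When both players choose uniformly at random, neither can improve alone, so the pair is an equilibrium. When one player favors rock, the opponent can gain by switching to paper, so that pair is not.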

In games with more than two players, playing a Nash equilibrium can be a losing strategy. Pluribus therefore sets aside theoretical guarantees of success and develops its own strategies that still allow it to win, and it does just that.

Final Thoughts

Although poker is an extremely complex game with many moving parts, Pluribus was able to develop strategies that resulted in victory. AI programs that have recently made history in competitive gaming tend to rely on large numbers of servers and farms of graphics processing units. Libratus took nearly 15 million core hours to come up with its strategies for poker. Pluribus developed its strategies using just 12,400 core hours, roughly 1,200 times less computation, which shows how much more efficient this technology has become.