Can an LLM Beat You at Chess?
Author(s): Arthur Lagacherie
Originally published on Towards AI.
We can use Outlines to answer this question.
Recently, I discovered a Python package called Outlines, which provides a versatile way to leverage Large Language Models (LLMs) for tasks like:
- Classification
- Named Entity Extraction
- Synthetic data generation
- Document summarization
- …
And… playing chess (along with five other uses).
GitHub – dottxt-ai/outlines: Structured Text Generation
Structured Text Generation. Contribute to dottxt-ai/outlines development by creating an account on GitHub.
github.com
In this article, I will explore several chess configurations: LLM-versus-LLM matches, where two models play against each other, and human-versus-LLM matches, where a human (me) plays against a model.
How it works
To accomplish this task easily, Outlines uses a sampling technique different from the usual one.
First, what is sampling in an LLM? When generating the next token, an LLM returns a probability for each token in its vocabulary, ranging from 0% to 100%. There are various ways to select from these predicted tokens, and this selection process is known as sampling.
Instead of applying sampling to all tokens, Outlines selects only the tokens compatible with the text format you want to generate, and then applies sampling to that subset.
To decide which tokens are compatible, Outlines uses a regex that is rebuilt after each move so that it only matches legal moves.
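Conceptually, the constraint is applied by masking the model's logits before sampling: any token that cannot extend the generated text toward a string matching the regex gets zero probability. Below is a toy sketch of that idea (my own simplification, not Outlines' actual implementation, which compiles the regex into a finite-state machine over the tokenizer's vocabulary):

import torch

def mask_logits(logits, vocab, prefix, legal_moves):
    # Toy constrained decoding: keep only tokens that can still extend
    # `prefix` into one of the legal moves; everything else gets -inf.
    masked = torch.full_like(logits, float("-inf"))
    for token_id, token_str in enumerate(vocab):
        candidate = prefix + token_str
        if any(move.startswith(candidate) for move in legal_moves):
            masked[token_id] = logits[token_id]
    return masked  # sampling then proceeds on softmax(masked) as usual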
Efficient Guided Generation for Large Language Models
In this article we show how the problem of neural text generation can be constructively reformulated in terms ofβ¦
arxiv.org
LLM vs LLM
The first thing I want to do is LLM vs. LLM… but with just one LLM to begin. To do this, we need some Python libraries.
!pip install outlines -q
!pip install chess -q
!pip install transformers accelerate einops -q
import chess, chess.svg, re
from outlines import generate, models
from IPython.display import Image, display, clear_output
- chess: a library to handle the board.
- IPython, chess.svg: libraries to display the board.
After that, the first thing we need is a function that builds the regex telling Outlines which text format to generate.
def legal_moves_regex(board):
    """Build a regex that only matches valid moves."""
    legal_moves = list(board.legal_moves)
    legal_moves_str = [board.san(move) for move in legal_moves]
    # Drop check (+) and mate (#) suffixes to keep the patterns simple.
    legal_moves_str = [re.sub(r"[+#]", "", move) for move in legal_moves_str]
    regex_pattern = "|".join(re.escape(move) for move in legal_moves_str)
    return regex_pattern
This function returns a string like this:
'Nh3|Nf3|Nc3|Na3|h3|g3|f3|e3|d3|c3|b3|a3|h4|g4|f4|e4|d4|c4|b4|a4'
It's all the legal moves for the current board state.
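Because the regex is rebuilt from board.legal_moves, it changes after every move. A quick sanity check (test_board here is just an illustrative name):

test_board = chess.Board()
print(legal_moves_regex(test_board))  # the 20 opening moves shown above
test_board.push_san("e4")
print(legal_moves_regex(test_board))  # now Black's 20 legal replies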
Now that we have the libraries and the regex generator, we can download the model by executing the following line of code.
model = models.transformers("google/gemma-2-2b-it", device="auto")
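Before wiring up the full game loop, you can check that constrained generation works with a single call (a quick test of my own, not part of the original workflow):

test_board = chess.Board()
pattern = legal_moves_regex(test_board)
first_move = generate.regex(model, pattern)("Let's play Chess. Moves: ")
print(first_move)  # e.g. "e4": the output is forced to match the regex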
And here is the final cell of code, which runs the main loop.
board = chess.Board("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
prompt = "Let's play Chess. Moves: "
board_state = " "
turn_number = 0

while not board.is_game_over():
    # Constrain generation to the legal moves in the current position.
    regex_pattern = legal_moves_regex(board)
    structured = generate.regex(model, regex_pattern)(prompt + board_state)
    move = board.parse_san(structured)

    if turn_number % 2 == 0:  # It's White's turn
        board_state += board.san(move) + " "
    else:
        board_state += board.san(move) + " " + str(turn_number) + "."

    turn_number += 1
    board.push(move)

    clear_output(wait=True)
    display(chess.svg.board(board, size=250, lastmove=move))
First, we define the chessboard, the prompt, the board state, and the turn number. Then we create a while loop for the game: on each turn, we generate the regex and the move, update the board state, and finally display the chessboard.
Let's run it.
Gemma 2 2B vs. SmolLM2 1.7B
Now it's time to do the same but with two LLMs. Let's load them.
model1 = models.transformers("Arthur-LAGACHERIE/Gemma-2-2b-4bit", device="cuda")
model2 = models.transformers("HuggingFaceTB/SmolLM2-1.7B-Instruct", device="cuda")
Note: here I use a quantized version of Gemma 2 2B, so I first install bitsandbytes with "pip install -q bitsandbytes".
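If you prefer to quantize on the fly instead of using a pre-quantized checkpoint, something like the following should work, assuming your Outlines version forwards model_kwargs to transformers (check your version's documentation):

from transformers import BitsAndBytesConfig

quant_config = BitsAndBytesConfig(load_in_4bit=True)
model1 = models.transformers(
    "google/gemma-2-2b-it",
    model_kwargs={"quantization_config": quant_config, "device_map": "auto"},
)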
And we also need to change the game function a little.
board = chess.Board("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
prompt = "Let's play Chess. Moves: "
board_state = " "
turn_number = 0

while not board.is_game_over():
    if turn_number % 2 == 0:  # It's White's turn (Gemma)
        regex_pattern = legal_moves_regex(board)
        structured = generate.regex(model1, regex_pattern)(prompt + board_state)
        move = board.parse_san(structured)
        board_state += board.san(move) + " "
    else:  # It's Black's turn (SmolLM2)
        regex_pattern = legal_moves_regex(board)
        structured = generate.regex(model2, regex_pattern)(prompt + board_state)
        move = board.parse_san(structured)
        board_state += board.san(move) + " " + str(turn_number) + "."

    turn_number += 1
    board.push(move)

    clear_output(wait=True)
    display(chess.svg.board(board, size=250, lastmove=move))
print("0" if turn_number % 2 != 0 else "1")
(I also added the last line to print the winner.)
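python-chess can also report the outcome directly, which is less cryptic than printing 0 or 1:

outcome = board.outcome()
print(outcome.result())      # "1-0", "0-1", or "1/2-1/2"
print(outcome.termination)   # e.g. Termination.CHECKMATE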
Let's run it.
After a long and difficult war (with dozens and dozens of dumb moves) between Gemma 2 2B and SmolLM2 1.7B, the winner is: SmolLM2 🥳
But if you look at the game more closely, you will see some… dumb moves. The two LLMs play like a three-year-old human.
LLM vs. Human
Now that weβve seen LLMs pitted against each other, letβs see how a language model fares against a human player (me).
First, let's download the model. I will take SmolLM2 1.7B because it beat Gemma 2 2B.
model = models.transformers("HuggingFaceTB/SmolLM2-1.7B-Instruct", device="auto")
Then, we need to update the main loop a little.
board = chess.Board("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
display(chess.svg.board(board, size=250))

prompt = "Let's play Chess. Moves: "
board_state = " "
turn_number = 0

while not board.is_game_over():
    if turn_number % 2 == 0:  # It's White's turn (the human)
        inp = input("Your move: ")
        move = board.parse_san(inp)
        board_state += board.san(move) + " "
    else:  # It's Black's turn (the model)
        regex_pattern = legal_moves_regex(board)
        structured = generate.regex(model, regex_pattern)(prompt + board_state)
        move = board.parse_san(structured)
        board_state += board.san(move) + " " + str(turn_number) + "."

    turn_number += 1
    board.push(move)

    clear_output(wait=True)
    display(chess.svg.board(board, size=250, lastmove=move))

print("0" if turn_number % 2 != 0 else "1")
And run it.
I won in 3 minutes; the modelβs chess skills are quite limited.
Conclusion
The models aren't very good at chess, likely due to their small number of parameters.
With the guidance from this article, you can now experiment with LLMs in a chess setting, though you may not see grandmaster-level gameplay.
I hope you enjoyed this article; if you did, you can clap for it (you can also follow me =).
Published via Towards AI