

Can an LLM Beat You at Chess?

Author(s): Arthur Lagacherie

Originally published on Towards AI.

We can use Outlines to answer this question.

Recently, I discovered a Python package called Outlines, which provides a versatile way to leverage Large Language Models (LLMs) for tasks like:

  • Classification
  • Named entity extraction
  • Synthetic data generation
  • Document summarization
  • …

And… playing chess (there are also five other uses).

GitHub – dottxt-ai/outlines: Structured Text Generation (github.com)

In this article, I will explore various configurations for chess games, including human-versus-LLM matches, where a human competes against an AI model, as well as LLM-versus-LLM setups, where two AI models play against each other.

How it works

To accomplish this task easily, Outlines uses a sampling technique different from the usual one.

First, what is sampling in an LLM? When generating the next token, an LLM returns a probability for each token in its vocabulary, ranging from 0% to 100%. There are various ways to select from these predicted tokens, and this selection process is known as sampling.
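To make this concrete, here is a toy sketch of ordinary sampling. The logits values are made up for illustration; a real LLM produces one score per token of its vocabulary.

import torch

# Toy scores for a 4-token vocabulary (a real model has tens of thousands).
logits = torch.tensor([2.0, 1.0, 0.5, -1.0])

probs = torch.softmax(logits, dim=-1)     # turn scores into probabilities
next_token = torch.multinomial(probs, 1)  # draw one token id at random
print(probs, next_token.item())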

Instead of applying sampling to all tokens, Outlines selects only the tokens compatible with the text format you want to generate and then applies sampling to this subset.

To choose the tokens that match the desired format, Outlines uses a regex, updated after each move, that only matches the legal moves of the current position.
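A minimal sketch of this constrained step, where a made-up allowed set stands in for the token ids the regex accepts: forbidden tokens get a score of -inf, so all the probability mass lands on legal moves.

import torch

logits = torch.tensor([2.0, 1.0, 0.5, -1.0])
allowed = torch.tensor([1, 3])  # hypothetical ids of tokens matching a legal move

mask = torch.full_like(logits, float("-inf"))
mask[allowed] = 0.0                           # keep only the allowed tokens
probs = torch.softmax(logits + mask, dim=-1)  # probabilities are zero elsewhere
next_token = torch.multinomial(probs, 1)      # sample among legal tokens only
print(probs, next_token.item())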

Efficient Guided Generation for Large Language Models (arxiv.org)

In this article we show how the problem of neural text generation can be constructively reformulated in terms of…

LLM vs. LLM

The first thing I want to do is LLM vs. LLM… but with just one LLM to begin. To do this, we need some Python libraries.

!pip install outlines -q
!pip install chess -q
!pip install transformers accelerate einops -q

import chess, chess.svg, re
from outlines import generate, models
from IPython.display import Image, display, clear_output

chess: a library to handle the board and the rules.
chess.svg, IPython.display: libraries to display the board.
re: used to build the regex.

After that, the first thing we need is a function that builds the regex telling Outlines which text format to generate.

def legal_moves_regex(board):
    """Build a regex that only matches moves that are legal on this board."""
    legal_moves = list(board.legal_moves)
    # Convert each move to Standard Algebraic Notation (e.g., "Nf3").
    legal_moves_str = [board.san(move) for move in legal_moves]
    # Strip check (+) and mate (#) markers to keep the pattern simple.
    legal_moves_str = [re.sub(r"[+#]", "", move) for move in legal_moves_str]
    # Escape every move and join them into one big alternation.
    regex_pattern = "|".join(re.escape(move) for move in legal_moves_str)
    return regex_pattern

This function returns a string like this:

'Nh3|Nf3|Nc3|Na3|h3|g3|f3|e3|d3|c3|b3|a3|h4|g4|f4|e4|d4|c4|b4|a4'

It contains all the legal moves for the current board state.

Now that we have the libraries and the regex generator, we can download the model by executing the following line of code.

model = models.transformers("google/gemma-2-2b-it", device="auto")

And a final cell of code runs the main loop.

board = chess.Board("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
prompt = "Let's play Chess. Moves: "
board_state = " "
turn_number = 0
while not board.is_game_over():
    # Rebuild the generator each turn: the set of legal moves has changed.
    regex_pattern = legal_moves_regex(board)
    structured = generate.regex(model, regex_pattern)(prompt + board_state)
    move = board.parse_san(structured)

    if turn_number % 2 == 0:  # It's White's turn
        board_state += board.san(move) + " "
    else:
        board_state += board.san(move) + " " + str(turn_number) + "."

    turn_number += 1

    board.push(move)

    clear_output(wait=True)
    display(chess.svg.board(board, size=250, lastmove=move))

First, we define the chessboard, the prompt, the board state, and the turn number. Then we open a while loop for the game: each turn, we generate the regex and the move, update the board state, and finally display the chessboard.

Let’s run it.

video by author

Gemma 2b vs. Smollm2 1.7b

Now it’s time to do the same but with two LLMs. Let’s import them.

model1 = models.transformers("Arthur-LAGACHERIE/Gemma-2-2b-4bit", device="cuda")
model2 = models.transformers("HuggingFaceTB/SmolLM2-1.7B-Instruct", device="cuda")

Note: here I use a quantized version of Gemma 2b, so I first install bitsandbytes with ‘pip install -q bitsandbytes’.
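If you would rather quantize on the fly than download a pre-quantized checkpoint, a possible sketch, assuming your Outlines version forwards model_kwargs to transformers, looks like this:

from transformers import BitsAndBytesConfig
from outlines import models

# Assumption: models.transformers passes model_kwargs through to
# transformers' from_pretrained, which accepts a quantization_config.
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
model1 = models.transformers(
    "google/gemma-2-2b-it",
    device="cuda",
    model_kwargs={"quantization_config": bnb_config},
)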

And we also need to change the game loop a little.

board = chess.Board("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
prompt = "Let's play Chess. Moves: "
board_state = " "
turn_number = 0
while not board.is_game_over():

    if turn_number % 2 == 0:  # It's White's turn: model1 plays
        regex_pattern = legal_moves_regex(board)
        structured = generate.regex(model1, regex_pattern)(prompt + board_state)
        move = board.parse_san(structured)
        board_state += board.san(move) + " "
    else:  # It's Black's turn: model2 plays
        regex_pattern = legal_moves_regex(board)
        structured = generate.regex(model2, regex_pattern)(prompt + board_state)
        move = board.parse_san(structured)
        board_state += board.san(move) + " " + str(turn_number) + "."

    turn_number += 1

    board.push(move)

    clear_output(wait=True)
    display(chess.svg.board(board, size=250, lastmove=move))

print("0" if turn_number % 2 != 0 else "1")

(I also added the last line to print the winner.)

Let’s run it.

gemma vs. smollm2 (gif by the author)

After a long and difficult war (with dozens and dozens of dumb moves) between Gemma 2b and Smollm2 1.7b, the winner is: Smollm2 🥳

But if you look at the game more closely, you will see some… dumb moves. The two LLMs play like a 3-year-old human.

LLM vs. Human

Now that we’ve seen LLMs pitted against each other, let’s see how a language model fares against a human player (me).

First, let’s download the model. I will take Smollm2 1.7b because it won against Gemma 2b.

model = models.transformers("HuggingFaceTB/SmolLM2-1.7B-Instruct", device="auto")

Then, we need to update the main while loop a little.

board = chess.Board("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
display(chess.svg.board(board, size=250))
prompt = "Let's play Chess. Moves: "
board_state = " "
turn_number = 0
while not board.is_game_over():

    if turn_number % 2 == 0:  # It's White's turn: the human plays
        inp = input("Your move: ")
        move = board.parse_san(inp)  # raises an error if the move is illegal
        board_state += board.san(move) + " "
    else:  # It's Black's turn: the model plays
        regex_pattern = legal_moves_regex(board)
        structured = generate.regex(model, regex_pattern)(prompt + board_state)
        move = board.parse_san(structured)
        board_state += board.san(move) + " " + str(turn_number) + "."

    turn_number += 1

    board.push(move)

    clear_output(wait=True)
    display(chess.svg.board(board, size=250, lastmove=move))

print("0" if turn_number % 2 != 0 else "1")

And run it.

me vs. Smollm2, video by author

I won in 3 minutes; the model’s chess skills are quite limited.

Conclusion

The models aren’t very intelligent at chess, likely due to their small number of parameters.

With the guidance from this article, you can now experiment with LLMs in a chess setting, though you may not see grandmaster-level gameplay.

I hope you enjoyed this article; if so, you can clap for it (you can also follow me =).


Published via Towards AI
