
Colab & SQLite with LangChain & Qwen 2.5 Coder 4 SQL Assistant
Last Updated on April 14, 2025 by Editorial Team
Author(s): Claudio Giorgio Giancaterino
Originally published on Towards AI.

SQL stands for Structured Query Language; it manages and interacts with data. Whether youβre handling a small database or gigabytes of data, SQL allows efficient querying, updating, and management of relational databases.
SQL can generate reports and insights for business decisions. In Machine Learning, SQL can be used for data preparation before ingesting data into models. SQL is integrated into cloud-based databases like Amazon RDS and Microsoft Azure SQL. This programming language can be used in many use cases.
SQL was born at IBM in the early 1970s, was developed by Donald D. Chamberlin and Raymond F. Boycein and is still popular.


Large Language Models (LLMs) showed power skills as Python coding assistants, and this small job aims to use an LLM as an SQL coding assistant.
For this purpose, I used Google Colaboratory in combination with SQLite. Itβs a C-language library that implements a small and fast SQL database engine. SQLite is built into all mobile phones and most computers and comes bundled inside countless other applications that people use every day.
I want to simulate a marketing campaign analysis, so I start creating 3 tables: customers, campaigns and sales. You can follow the code.
First of all I connect with SQLite database by a function βconnβ and I create a cursor object from the connection, which is used to execute SQL commands and interact with the database.
# connection to SQLite database
conn = sqlite3.connect('marketing.db')
cursor = conn.cursor()
Then I set up a foundational database structure for a marketing-related application by ensuring clean table creation and preparing the database to store and manage customer, campaign, and sales data effectively.
cursor.execute('''
DROP TABLE IF EXISTS customers;
''')
# Build customers table
cursor.execute('''
CREATE TABLE customers (
customerID INTEGER PRIMARY KEY,
city TEXT,
occupation TEXT,
age INTEGER
);
''')
cursor.execute('''
DROP TABLE IF EXISTS campaigns;
''')
# Build campaigns table
cursor.execute('''
CREATE TABLE campaigns (
campaignID INTEGER PRIMARY KEY,
product TEXT,
start_date DATE,
end_date DATE,
budget NUMERIC,
channel TEXT
);
''')
cursor.execute('''
DROP TABLE IF EXISTS sales;
''')
# Build sales table
cursor.execute('''
CREATE TABLE sales (
saleID INTEGER PRIMARY KEY,
customerID INTEGER,
campaignID INTEGER,
product TEXT,
quantity INTEGER,
sale_date DATE,
saleamount NUMERIC
);
''')
conn.commit()
I fill in the tables with the help of Python, and Iβm ready to execute some SQL operations.



I used LangChain to interact with the Large Language Model: Qwen2.5-Coder-7B-Instruct provided by Hugging Face.
Qwen2.5-Coder-7B-Instruct is a Large Language Model developed by Alibaba Cloud, and tailored for coding tasks. It belongs to the Qwen2.5 series, known for its advanced reasoning, multimodal capabilities, and efficiency. This model has 7,61 billion parameters and supports long-context inputs of up to 128k tokens, making it suitable for handling extensive codebases and complex programming tasks.
The first question I asked was about creating a join between all the tables.
# Define initial messages with prompt engineering
system_message = SystemMessage(content="You are a senior data analyst. Give an help in coding with SQL")
user_message = HumanMessage(content="""
Please write an SQL query to connect the three tables according to the rules provided.
Tables:
1. customers (customerID INTEGER PRIMARY KEY, city TEXT, occupation TEXT, age INTEGER)
2. campaigns (campaignID INTEGER PRIMARY KEY, product TEXT, start_date DATE, end_date DATE, budget NUMERIC, channel TEXT)
3. sales (saleID INTEGER PRIMARY KEY, customerID INTEGER, campaignID INTEGER, product TEXT, quantity INTEGER, sale_date DATE, saleamount NUMERIC)
Rules:
Retrive all fields from the tables.
SQL Query:
""")
# Generate response
sql_query = generate_response(system_message, user_message)
print(sql_query.content)
The second problem I asked was to perform a βgroup byβ function showing the total amount of sales grouped per city and channel.
# Define initial messages with prompt engineering
system_message = SystemMessage(content="You are a senior data analyst. Give an help in coding with SQL")
user_message = HumanMessage(content="""
Please write an SQL query to connect the three tables according to the rules provided.
Tables:
1. customers (customerID INTEGER PRIMARY KEY, city TEXT, occupation TEXT, age INTEGER)
2. campaigns (campaignID INTEGER PRIMARY KEY, product TEXT, start_date DATE, end_date DATE, budget NUMERIC, channel TEXT)
3. sales (saleID INTEGER PRIMARY KEY, customerID INTEGER, campaignID INTEGER, product TEXT, quantity INTEGER, sale_date DATE, saleamount NUMERIC)
Rules:
Retrive all fields from the tables.
SQL Query:
""")
# Generate response
sql_query = generate_response(system_message, user_message)
print(sql_query.content)
In the last exercise, I asked for help to group by total sales amount per city and filtered per βsocial networkβ channel.
# Define initial messages with prompt engineering
system_message = SystemMessage(content="You are a senior data analyst. Give an help in coding with SQL")
user_message = HumanMessage(content="""
Please write an SQL query according to the rules provided.
Tables:
1. customers (customerID INTEGER PRIMARY KEY, city TEXT, occupation TEXT, age INTEGER)
2. campaigns (campaignID INTEGER PRIMARY KEY, product TEXT, start_date DATE, end_date DATE, budget NUMERIC, channel TEXT)
3. sales (saleID INTEGER PRIMARY KEY, customerID INTEGER, campaignID INTEGER, product TEXT, quantity INTEGER, sale_date DATE, saleamount NUMERIC)
Rules:
Join all the tables. Group by total of saleamount per city and channel. The fields to show are channel, city and total saleamount. Filter by "SOCIAL NETWORK" channel
SQL Query:
""")
# Generate response
sql_query = generate_response(system_message, user_message)
print(sql_query.content)
In all coding questions, I received correct answers from the model.
Here are the results from the last one.


This is the first time I have used Qwen 2.5 Coder, and it seems amazing. I think itβs a valid alternative to the proprietary LLMs. Enjoy!!!
References:
Qwen/Qwen2.5-Coder-7B-Instruct Β· Hugging Face
ChatPromptTemplate β 🦜🔗 LangChain documentation
HumanMessage β 🦜🔗 LangChain documentation
AIMessage β 🦜🔗 LangChain documentation
SystemMessage β 🦜🔗 LangChain documentation
Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming aΒ sponsor.
Published via Towards AI