Name: Towards AI Legal Name: Towards AI, Inc. Description: Towards AI is the world's leading artificial intelligence (AI) and technology publication. Read by thought-leaders and decision-makers around the world. Phone Number: +1-650-246-9381 Email: [email protected]
228 Park Avenue South New York, NY 10003 United States
Website: Publisher: https://towardsai.net/#publisher Diversity Policy: https://towardsai.net/about Ethics Policy: https://towardsai.net/about Masthead: https://towardsai.net/about
Name: Towards AI Legal Name: Towards AI, Inc. Description: Towards AI is the world's leading artificial intelligence (AI) and technology publication. Founders: Roberto Iriondo, , Job Title: Co-founder and Advisor Works for: Towards AI, Inc. Follow Roberto: X, LinkedIn, GitHub, Google Scholar, Towards AI Profile, Medium, ML@CMU, FreeCodeCamp, Crunchbase, Bloomberg, Roberto Iriondo, Generative AI Lab, Generative AI Lab VeloxTrend Ultrarix Capital Partners Denis Piffaretti, Job Title: Co-founder Works for: Towards AI, Inc. Louie Peters, Job Title: Co-founder Works for: Towards AI, Inc. Louis-FranΓ§ois Bouchard, Job Title: Co-founder Works for: Towards AI, Inc. Cover:
Towards AI Cover
Logo:
Towards AI Logo
Areas Served: Worldwide Alternate Name: Towards AI, Inc. Alternate Name: Towards AI Co. Alternate Name: towards ai Alternate Name: towardsai Alternate Name: towards.ai Alternate Name: tai Alternate Name: toward ai Alternate Name: toward.ai Alternate Name: Towards AI, Inc. Alternate Name: towardsai.net Alternate Name: pub.towardsai.net
5 stars – based on 497 reviews

Frequently Used, Contextual References

TODO: Remember to copy unique IDs whenever it needs used. i.e., URL: 304b2e42315e

Resources

Take our 85+ lesson From Beginner to Advanced LLM Developer Certification: From choosing a project to deploying a working product this is the most comprehensive and practical LLM course out there!

Publication

Colab & SQLite with LangChain & Qwen 2.5 Coder 4 SQL Assistant
Latest   Machine Learning

Colab & SQLite with LangChain & Qwen 2.5 Coder 4 SQL Assistant

Last Updated on April 14, 2025 by Editorial Team

Author(s): Claudio Giorgio Giancaterino

Originally published on Towards AI.

Colab & SQLite with LangChain & Qwen 2.5 Coder 4 SQL Assistant
Pixabay: SQL

SQL stands for Structured Query Language; it manages and interacts with data. Whether you’re handling a small database or gigabytes of data, SQL allows efficient querying, updating, and management of relational databases.

SQL can generate reports and insights for business decisions. In Machine Learning, SQL can be used for data preparation before ingesting data into models. SQL is integrated into cloud-based databases like Amazon RDS and Microsoft Azure SQL. This programming language can be used in many use cases.

SQL was born at IBM in the early 1970s, was developed by Donald D. Chamberlin and Raymond F. Boycein and is still popular.

SQL β€” Explore β€” Google Trends

Large Language Models (LLMs) showed power skills as Python coding assistants, and this small job aims to use an LLM as an SQL coding assistant.

For this purpose, I used Google Colaboratory in combination with SQLite. It’s a C-language library that implements a small and fast SQL database engine. SQLite is built into all mobile phones and most computers and comes bundled inside countless other applications that people use every day.

I want to simulate a marketing campaign analysis, so I start creating 3 tables: customers, campaigns and sales. You can follow the code.

First of all I connect with SQLite database by a function β€œconn” and I create a cursor object from the connection, which is used to execute SQL commands and interact with the database.

# connection to SQLite database
conn = sqlite3.connect('marketing.db')
cursor = conn.cursor()

Then I set up a foundational database structure for a marketing-related application by ensuring clean table creation and preparing the database to store and manage customer, campaign, and sales data effectively.

cursor.execute('''
DROP TABLE IF EXISTS customers;
'''
)
# Build customers table
cursor.execute('''
CREATE TABLE customers (
customerID INTEGER PRIMARY KEY,
city TEXT,
occupation TEXT,
age INTEGER
);
'''
)

cursor.execute('''
DROP TABLE IF EXISTS campaigns;
'''
)
# Build campaigns table
cursor.execute('''
CREATE TABLE campaigns (
campaignID INTEGER PRIMARY KEY,
product TEXT,
start_date DATE,
end_date DATE,
budget NUMERIC,
channel TEXT
);
'''
)

cursor.execute('''
DROP TABLE IF EXISTS sales;
'''
)
# Build sales table
cursor.execute('''
CREATE TABLE sales (
saleID INTEGER PRIMARY KEY,
customerID INTEGER,
campaignID INTEGER,
product TEXT,
quantity INTEGER,
sale_date DATE,
saleamount NUMERIC
);
'''
)

conn.commit()

I fill in the tables with the help of Python, and I’m ready to execute some SQL operations.

customers table
campaigns table
sales table

I used LangChain to interact with the Large Language Model: Qwen2.5-Coder-7B-Instruct provided by Hugging Face.

Qwen2.5-Coder-7B-Instruct is a Large Language Model developed by Alibaba Cloud, and tailored for coding tasks. It belongs to the Qwen2.5 series, known for its advanced reasoning, multimodal capabilities, and efficiency. This model has 7,61 billion parameters and supports long-context inputs of up to 128k tokens, making it suitable for handling extensive codebases and complex programming tasks.

The first question I asked was about creating a join between all the tables.

# Define initial messages with prompt engineering
system_message = SystemMessage(content="You are a senior data analyst. Give an help in coding with SQL")
user_message = HumanMessage(content="""

Please write an SQL query to connect the three tables according to the rules provided.

Tables:

1. customers (customerID INTEGER PRIMARY KEY, city TEXT, occupation TEXT, age INTEGER)
2. campaigns (campaignID INTEGER PRIMARY KEY, product TEXT, start_date DATE, end_date DATE, budget NUMERIC, channel TEXT)
3. sales (saleID INTEGER PRIMARY KEY, customerID INTEGER, campaignID INTEGER, product TEXT, quantity INTEGER, sale_date DATE, saleamount NUMERIC)

Rules:
Retrive all fields from the tables.

SQL Query:
"""
)

# Generate response
sql_query = generate_response(system_message, user_message)
print(sql_query.content)

The second problem I asked was to perform a β€œgroup by” function showing the total amount of sales grouped per city and channel.

# Define initial messages with prompt engineering
system_message = SystemMessage(content="You are a senior data analyst. Give an help in coding with SQL")
user_message = HumanMessage(content="""

Please write an SQL query to connect the three tables according to the rules provided.

Tables:

1. customers (customerID INTEGER PRIMARY KEY, city TEXT, occupation TEXT, age INTEGER)
2. campaigns (campaignID INTEGER PRIMARY KEY, product TEXT, start_date DATE, end_date DATE, budget NUMERIC, channel TEXT)
3. sales (saleID INTEGER PRIMARY KEY, customerID INTEGER, campaignID INTEGER, product TEXT, quantity INTEGER, sale_date DATE, saleamount NUMERIC)

Rules:
Retrive all fields from the tables.

SQL Query:
"""
)

# Generate response
sql_query = generate_response(system_message, user_message)
print(sql_query.content)

In the last exercise, I asked for help to group by total sales amount per city and filtered per β€œsocial network” channel.

# Define initial messages with prompt engineering
system_message = SystemMessage(content="You are a senior data analyst. Give an help in coding with SQL")
user_message = HumanMessage(content="""

Please write an SQL query according to the rules provided.

Tables:

1. customers (customerID INTEGER PRIMARY KEY, city TEXT, occupation TEXT, age INTEGER)
2. campaigns (campaignID INTEGER PRIMARY KEY, product TEXT, start_date DATE, end_date DATE, budget NUMERIC, channel TEXT)
3. sales (saleID INTEGER PRIMARY KEY, customerID INTEGER, campaignID INTEGER, product TEXT, quantity INTEGER, sale_date DATE, saleamount NUMERIC)

Rules:
Join all the tables. Group by total of saleamount per city and channel. The fields to show are channel, city and total saleamount. Filter by "SOCIAL NETWORK" channel

SQL Query:
"""
)

# Generate response
sql_query = generate_response(system_message, user_message)
print(sql_query.content)

In all coding questions, I received correct answers from the model.

Here are the results from the last one.

This is the first time I have used Qwen 2.5 Coder, and it seems amazing. I think it’s a valid alternative to the proprietary LLMs. Enjoy!!!

References:

Repository

Embrace SQL in Google Colab: A Step-by-Step Tutorial for Data Scientists | by Jyoti Dabass, Ph.D. | Python in Plain English

Qwen/Qwen2.5-Coder-7B-Instruct Β· Hugging Face

ChatPromptTemplate β€” 🦜🔗 LangChain documentation

HumanMessage β€” 🦜🔗 LangChain documentation

AIMessage β€” 🦜🔗 LangChain documentation

SystemMessage β€” 🦜🔗 LangChain documentation

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming aΒ sponsor.

Published via Towards AI

Feedback ↓