
Claude Code as a Data Analyst: From Zero to First Report
Author(s): Freddie Robinson
Originally published on Towards AI.
As data analysts, we’ve all been there: the dreaded request for the monthly/yearly [insert topic] report, an essential task that’s also a massive time sink.
My thoughts for the last week? “Can’t AI just… do this?” Surely, it can whip up a simple data analysis report. Right?
So I set out on a quest to see if I could truly automate a detailed customer contacts analysis report end to end using Claude Code. Here’s the story of how it went from a chaotic mess to a surprisingly competent first draft.

The prerequisites
Before we dive in, a word on the setup. Running Claude Code in your local environment is a must, but to give it true data analyst powers you’ll also need a tool that lets it securely read your database and run queries.
There are plenty of great open-source examples of this online: just search for “MCP SQL server” and find one that works for your setup.
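If you’d rather roll your own, here’s a minimal sketch of what such a tool can look like, built with the official MCP Python SDK. The database path, the table it serves, and the naive read-only check are all illustrative; a real version would point at your actual warehouse and enforce proper read-only credentials.

```python
# Sketch of a read-only SQL tool server for Claude Code, using the official
# MCP Python SDK (pip install "mcp"). analytics.db is a hypothetical local
# SQLite file standing in for your warehouse.
import sqlite3

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("sql-readonly")

DB_PATH = "analytics.db"  # hypothetical; swap in your own connection


@mcp.tool()
def run_query(sql: str) -> str:
    """Execute a read-only SELECT statement and return rows as text."""
    # Naive guard for the sketch; use real read-only credentials in practice.
    if not sql.lstrip().lower().startswith("select"):
        return "Refused: only SELECT statements are allowed."
    conn = sqlite3.connect(DB_PATH)
    try:
        cursor = conn.execute(sql)
        header = ", ".join(col[0] for col in cursor.description)
        rows = [", ".join(map(str, row)) for row in cursor.fetchall()]
        return "\n".join([header] + rows)
    finally:
        conn.close()


if __name__ == "__main__":
    mcp.run()  # serves over stdio, which Claude Code can connect to
```

Register it with something like `claude mcp add sql-readonly -- python server.py` and Claude Code gains a run_query tool it can call during the session. Now let’s get to the fun part.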
Attempt 1: The no-context data analyst
My first attempt was an optimistic one: I gave Claude Code full decision-making power on its own. I threw a simple prompt into the chat:
## Role
You are an expert data analyst and your job is to provide a monthly analysis of why users are contacting us.
## Task
You need to perform an analysis and write a report on contact trends up to the end of July 2025. You should use the last 12 months of contact data to analyse trends, but index more heavily on the change in contact volume/contact rate between June and July 2025. Focus your analysis on which types of contacts and what contact rates are increasing or decreasing the most, and why those trends are happening.
Unsurprisingly, Claude Code struggled:
- The SQL queries kept erroring. Claude Code had no context on which tables to reference, what the columns meant, or what the values represented. It guessed everywhere and was forced to constantly fix its own errors.
- It wrote inaccurate queries. Even when a query ran, it was often wrong. For example, it guessed at the best way to calculate ‘active users’ and ended up counting on the wrong column.
Clearly, we needed to refine both the context we provided and the instructions we asked it to follow.
Improving the context
The key is to think about the context you’d give a brand-new data analyst who just joined your team and was asked to write this report. What would you share with them? Information about the tables, a few SQL queries to get going, and an example structure of the report.
Here’s the info we’ll pass it:
- Table documentation. I gave Claude a list of all the relevant tables and a brief, human-readable description for every single column. If you don’t tell it what contact_reason_id means, it’s just going to guess.
- A list of the main queries to run. If you already know which queries you want, listing them in plain English massively reduces the chance of the agent missing context and keeps it focused on what matters.
- A few example SQL queries. Most queries you run on a single table are very similar, often just swapping out a condition or column. A few example queries boosted the query quality significantly (one is sketched after the table docs below).
- A chart style guide. I gave it instructions for the charts: a colour palette to use, which chart types to pick, and how to handle the axes.
- An analysis blueprint. I provided an example of a previous report to use as a structure to follow.
Table docs
contact_id — The unique id of the contact. You can count contacts with this column.
contact_channel — The channel the customer contacted us through. This can be either phone, email or chat.
created_at_ts — The timestamp the contact was created at. You can split contacts into time series with this column.
…
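For illustration, here’s the shape of one of the example queries, written the way Claude Code would execute it in the notebook. The contacts table name and the SQLite date syntax are assumptions for the sketch; adapt both to your warehouse’s dialect.

```python
# Hypothetical example query: monthly contact volume by channel, using only
# the documented columns above. 'contacts' is an assumed table name.
import sqlite3

import pandas as pd

EXAMPLE_QUERY = """
SELECT
    strftime('%Y-%m', created_at_ts) AS contact_month,
    contact_channel,
    COUNT(contact_id)                AS contacts
FROM contacts
WHERE created_at_ts >= '2024-08-01'  -- last 12 months up to July 2025
GROUP BY contact_month, contact_channel
ORDER BY contact_month, contact_channel;
"""

with sqlite3.connect("analytics.db") as conn:
    monthly_contacts = pd.read_sql(EXAMPLE_QUERY, conn)
```

Swapping the channel split for a contact-reason split, or the 12-month window for a June-vs-July comparison, is exactly the kind of near-duplicate query Claude handles well once it has one working template.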
This was a game changer for its output: we got correct queries, clear visualisations and a consistent document structure.
But we still had a problem.
Even with all this context and the list of queries written out, Claude Code still went a little rogue. It would run a query, then a second, and then start writing the analysis before it had gathered all the necessary data.
- It tried to execute multiple tasks at once and became overwhelmed. It decided to create and execute all the SQL queries in one go inside a notebook, which caused all sorts of pain in terms of query quality, observability and timeouts.
- It regularly drifted away from the tasks it should follow. It decided to write the report after generating only a couple of SQL queries, which was unlikely to be enough for a good analysis.
Improving the instructions to follow
We needed a way to control Claude Code’s attention so that it focused on the right task, step by step. It was time to enforce Claude’s todo list. For longer workflows it’s also likely that you’ll have to break the work across multiple Claude Code sessions, so we want a todo list that can pick up where the last session left off.
The prompt I created was therefore a step-by-step workflow that baked in an iterative loop over the ‘pending tasks’: once a pending task was done, Claude moved it to ‘completed tasks’ in the todo list.
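Concretely, after the first couple of tasks the todo.txt looks something like this (a hypothetical mid-run snapshot; the full initial list is in the prompt below):

```
## PENDING TASKS
3. **QUERY 1:** Create an SQL query for 'monthly contact rate per active user'. ...
4. **QUERY 2:** ...

## COMPLETED TASKS
1. Read `[docs_name]` to understand the data schema.
2. Create an empty Jupyter notebook named `july_2025_analysis/contacts_analysis_july_2025.ipynb`.
```

Because every session starts by reading this file, a fresh session picks up exactly where the previous one stopped.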

A further improvement was to split major tasks into separate prompts. Creating queries and writing reports are very distinct tasks that use separate skill sets, so we split the workflow into two prompts within one Claude Code conversation.
By forcing it to follow this logical flow, I finally got it to stop improvising and stick to the plan.
Attempt 2: A much more detailed prompt
So, after all that trial and error, what did the final prompt look like?
First message — query writer
<master_prompt>
<role_and_task>
You are an expert data analyst, operating autonomously. Your goal is to create SQL queries, execute them and create charts based off the results. You should continue without stopping.
</role_and_task>
<resources>
<resource name="[table_name]">Primary table for contacts data.</resource>
<resource name="[docs_name]">Documentation explaining the data tables. You MUST read this first.</resource>
</resources>
<style_guides>
The charts should use colours based off this palette: ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']
</style_guides>
<workflow>
<step_1_setup>
First, check if the file `july_2025_analysis/todo.txt` exists. If it does not, create the `july_2025_analysis` folder and then create `july_2025_analysis/todo.txt` with the content from the <initial_todo_list> below.
</step_1_setup>
<step_2_loop>
Once the `todo.txt` file is ready, you will begin an execution loop. You MUST continue this process until the `PENDING TASKS` list is empty. DO NOT STOP between tasks. For each loop cycle:
a. Read the `todo.txt` file to identify the next task.
b. If the `PENDING TASKS` list is empty, proceed to the final step and output a message confirming the entire process is complete.
c. Execute the FIRST task from the `PENDING TASKS` list.
d. After the task is successfully completed, immediately update the `july_2025_analysis/todo.txt` file by moving the task description from the `PENDING TASKS` section to the `COMPLETED TASKS` section.
e. Immediately begin the next loop cycle without stopping.
</step_2_loop>
</workflow>
<initial_todo_list>
## PENDING TASKS
1. Read `[docs_name]` to understand the data schema.
2. Create an empty Jupyter notebook named `july_2025_analysis/contacts_analysis_july_2025.ipynb`.
3. **QUERY 1:** Create an SQL query for ‘monthly contact rate per active user’. Execute the query in the notebook. Create a dual-axis chart (bar for users, line for rate). Save as `july_2025_analysis/01_contact_rate_per_user.png`.
4. **QUERY 2:** [repeat the steps for query 1 for the remaining queries]
## COMPLETED TASKS
</initial_todo_list>
</master_prompt>
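For QUERY 1, the notebook cell Claude Code generated looked roughly like this. It’s a reconstructed sketch: the dataframe values are placeholders purely so the snippet runs, standing in for the real query results.

```python
# Sketch of the QUERY 1 chart: bars for active users, line for contact rate.
# Placeholder data stands in for the real query results.
import matplotlib.pyplot as plt
import pandas as pd

PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
           "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf"]

df = pd.DataFrame({
    "month": ["2025-05", "2025-06", "2025-07"],  # illustrative values only
    "active_users": [12000, 12400, 12900],
    "contact_rate": [0.042, 0.044, 0.051],
})

fig, ax_users = plt.subplots(figsize=(10, 5))

# Left axis: monthly active users as bars
ax_users.bar(df["month"], df["active_users"], color=PALETTE[0], label="Active users")
ax_users.set_ylabel("Active users")

# Right axis: contacts per active user as a line
ax_rate = ax_users.twinx()
ax_rate.plot(df["month"], df["contact_rate"], color=PALETTE[1], marker="o",
             label="Contact rate")
ax_rate.set_ylabel("Contacts per active user")

ax_users.set_title("Monthly contact rate per active user")
fig.tight_layout()
# Assumes the july_2025_analysis folder from step 1 already exists
fig.savefig("july_2025_analysis/01_contact_rate_per_user.png", dpi=150)
```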
Second round — writing the analysis
<master_prompt>
<role_and_task>
You are an expert data analyst, operating autonomously. Your goal is to provide clear analysis about trends in data. You write data reports based off charts. Do not guess the reasons for data trends, just report on the data trends. You will work through the entire task list without stopping. The final deliverable will be a .docx file containing your written analysis and all supporting charts.
</role_and_task>
<resources>
<resource file="[example_analysis]">An example report to use as a template for style and structure.</resource>
</resources>
<initial_todo_list>
1. **WRITE ANALYSIS:** Write the full analysis of contact trends in a new file: `july_2025_analysis/contacts_analysis_july_2025.md`. The analysis must be based entirely off the .png charts in the `july_2025_analysis` folder and follow the structure of `[example_analysis]`.
2. **GENERATE DOCX:** Generate the final report by combining the text from `contacts_analysis_july_2025.md` and all saved .png charts into a single .docx file named `july_2025_report.docx`.
</initial_todo_list>
</master_prompt>
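For the GENERATE DOCX step, Claude Code wrote its own short conversion script. Here’s a minimal sketch of what that step can look like, assuming the python-docx package (pip install python-docx):

```python
# Sketch of the markdown + charts -> .docx assembly step using python-docx.
from pathlib import Path

from docx import Document
from docx.shared import Inches

doc = Document()

# Written analysis first, one paragraph per line of the markdown file
analysis = Path("july_2025_analysis/contacts_analysis_july_2025.md")
for line in analysis.read_text().splitlines():
    doc.add_paragraph(line)

# Then every saved chart, in filename order (01_..., 02_..., ...)
for chart in sorted(Path("july_2025_analysis").glob("*.png")):
    doc.add_picture(str(chart), width=Inches(6))

doc.save("july_2025_report.docx")
```

A real version would map markdown headings onto docx heading styles, but the plain-paragraph approach is enough for a first draft.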
How did it do?
It did well! Ultimately, Claude Code produced all the correct queries and charts it needed and wrote an analysis of the data trends.
Here are the key principles we learned:
- Give it its onboarding. Setting up the documentation for it to follow takes a bit of time, but if you’re writing this report regularly the work pays dividends quickly.
- Control Claude Code’s todo list. Claude works through the list; use that to your advantage to control its workflow rather than letting it go off in its own direction.
In our next exploration, we’ll get a bit bolder. We’ll look at giving Claude Code more freedom to decide what queries to deep-dive into. Can we upgrade it from a junior intern to a self-sufficient mid-level analyst? Stay tuned!