Beyond the Prompt: Engineering the “Thought-Action-Observation” Loop
Last Updated on February 19, 2026 by Editorial Team
Author(s): Shreyash Shukla
Originally published on Towards AI.

The “One-Shot” Fallacy
In the early days of Generative AI, the industry was obsessed with “Zero-Shot” performance — the ability of a model to answer a question in a single turn. The standard RAG (Retrieval-Augmented Generation) pattern reflects this: User asks question → System retrieves documents → Model generates answer.
For a simple query like “What is the capital of France?”, this works perfectly. But for an enterprise query like “Why did our gross margin in the Northeast region drop last quarter compared to budget?”, the “One-Shot” approach collapses. This question requires a sequence of distinct cognitive steps: locating the sales table, identifying the budget table, checking the currency conversion rates, and validating the region codes. Trying to stuff all this context into a single prompt inevitably leads to hallucinations.
Leading AI researchers have identified this as the primary bottleneck for complex tasks. Andrew Ng (DeepLearning.AI) argues that we have likely reached the point of diminishing returns for “bigger models” and that the next leap in performance will come from “Agentic Workflows” — systems that iteratively reason, plan, and execute tools. His research demonstrates that a smaller model (like GPT-3.5) wrapped in an agentic loop often outperforms a larger model (like GPT-4) using a zero-shot prompt [The Future of AI is Agentic].
We must therefore move from a “Chat” architecture to a “Loop” architecture, where the agent is not asked to answer, but to explore.
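Concretely, a "Loop" architecture is a Thought-Action-Observation cycle: the model thinks, picks an action (a tool call), and receives the result as an observation before thinking again. The sketch below is a minimal illustration of that cycle; `call_llm`, the decision format, and the tool signatures are hypothetical stand-ins for your model client and warehouse connectors.

```python
# Minimal sketch of a Thought-Action-Observation loop.
# `call_llm` and the tools are hypothetical stand-ins for a real
# model client and real warehouse connectors.

def run_agent(question, call_llm, tools, max_steps=10):
    """Loop until the model emits a final answer or the step budget runs out."""
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        decision = call_llm(history)           # Thought: pick a tool or answer
        if decision["type"] == "final_answer":
            return decision["content"]
        tool = tools[decision["tool"]]         # Action: invoke the chosen tool
        observation = tool(**decision["args"])
        history.append({"role": "tool",        # Observation: feed result back
                        "name": decision["tool"],
                        "content": str(observation)})
    return "Step budget exhausted without an answer."
```

The step budget matters: without it, a confused agent can loop forever, burning tokens on the same failing tool call.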

The Core Function Manifest
To enable this “Loop” architecture, we must fundamentally change how the LLM interacts with our infrastructure. We do not paste data into the chat window. Instead, we provide the model with a Function Manifest — a strict catalog of capabilities it is allowed to invoke.
In this paradigm, the LLM stops being a “Writer” and becomes a “Router.” Its job is not to generate the final answer immediately, but to select the correct tool for the current step of the problem.
Our architecture currently exposes the four foundational tools required for structured data analysis. While this list is designed to be extensible — allowing us to plug in Vector Search or Python interpreters in the future — the core “Analyst Loop” relies on these primitives:
- Semantic Graph Tool (The Search Engine):
  - Purpose: Finds the relevant tables based on business concepts (e.g., mapping “Revenue” to `my_company_data.revenue_daily`).
  - Input: `search_query` (natural language).
- Table Schema Tool (The Blueprint):
  - Purpose: Retrieves the precise DDL (CREATE TABLE statement) for a specific table to understand columns and types.
  - Input: `table_name` (e.g., `my_company_data.revenue_daily`).
- Shape Detector (The Eyes):
  - Purpose: Retrieves the statistical profile (cardinality, nulls, most frequent values) to prevent logic errors and bad groupings.
  - Input: `table_name`, `column_name`.
- Execute Query Tool (The Hands):
  - Purpose: Runs the final SQL query against the warehouse and returns the result set.
  - Input: `sql_query` (valid SQL).
This separation of concerns is critical. By isolating “Schema” (Structure) from “Shape” (Statistics), we keep the context window lean and ensure the agent only consumes the tokens it actually needs to solve the specific ambiguity at hand.
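One common way to publish such a manifest is the JSON-Schema style that most tool-calling APIs accept. The sketch below mirrors the four primitives above; the exact wire format and field names depend on your model provider, so treat this as an illustrative shape rather than a specific vendor's schema.

```python
# Sketch of a Function Manifest in the JSON-Schema style used by most
# tool-calling APIs. Tool names match the System Instructions shown later;
# the exact wire format depends on your model provider.

FUNCTION_MANIFEST = [
    {
        "name": "semantic_graph_search",
        "description": "Find relevant tables for a business concept.",
        "parameters": {
            "type": "object",
            "properties": {"search_query": {"type": "string"}},
            "required": ["search_query"],
        },
    },
    {
        "name": "get_table_schema",
        "description": "Return the DDL for a specific table.",
        "parameters": {
            "type": "object",
            "properties": {"table_name": {"type": "string"}},
            "required": ["table_name"],
        },
    },
    {
        "name": "get_column_stats",
        "description": "Return cardinality, null rate, and most frequent values.",
        "parameters": {
            "type": "object",
            "properties": {
                "table_name": {"type": "string"},
                "column_name": {"type": "string"},
            },
            "required": ["table_name", "column_name"],
        },
    },
    {
        "name": "execute_query",
        "description": "Run a read-only SQL query and return the result set.",
        "parameters": {
            "type": "object",
            "properties": {"sql_query": {"type": "string"}},
            "required": ["sql_query"],
        },
    },
]
```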

The Brain (System Instructions)
Tools are useless without a manual. If we hand an LLM an “Execute Query” tool without strict guidelines, it will revert to its training behavior — guessing schemas and hallucinating tables. To prevent this, we must rigorously define the System Instructions (or “System Prompt”).
This is the “Operating System” of the agent. It is not a request; it is a set of immutable laws that define the agent’s persona, constraints, and error-handling procedures. We treat this prompt as code — version-controlled, tested, and optimized.
Below is a sanitized example of the System Instructions that govern the loop. Notice how it explicitly forces the agent to use the tools in a specific order (Graph → Schema → Shape → Action) rather than jumping to a conclusion.
ROLE: Senior Data Analyst Agent
GOAL: Answer user questions by executing valid SQL against the warehouse.
CORE DIRECTIVES:
1. NO GUESSING: You do not know the database schema. You must discover it using tools.
2. NO WRITES: You are strictly forbidden from using DML/DDL statements such as INSERT, UPDATE, DELETE, DROP, or ALTER. Read-only.
THE PROTOCOL (The Loop):
1. DISCOVER: When a user asks a question, first use `semantic_graph_search` to identify relevant tables.
2. INSPECT: Once a table is identified, use `get_table_schema` to see the actual columns.
3. VALIDATE: Before writing a GROUP BY or FILTER, use `get_column_stats` (Shape Detector).
- CRITICAL: If distinct_count > 1000, do NOT group by this column without a filter.
4. EXECUTE: Only after steps 1-3 are complete, use `execute_query` to run the SQL.
ERROR HANDLING:
- If `execute_query` returns an error, do NOT apologize. Analyze the error message, adjust your SQL, and retry.
- If the result is empty, check `get_column_stats` to ensure your filter values (e.g., 'USA' vs 'US') exist in the data.
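The cardinality rule in step 3 of the protocol is cheap to enforce outside the prompt as well, as a deterministic check that runs before any SQL is generated. A minimal sketch, assuming (hypothetically) that the Shape Detector returns a dict with a `distinct_count` field:

```python
# Sketch of the cardinality guard from step 3 of the protocol.
# Assumes the Shape Detector returns a dict with a `distinct_count`
# field; the 1000-value threshold matches the prompt's CRITICAL rule.

HIGH_CARDINALITY_THRESHOLD = 1000

def validate_group_by(column_stats: dict, has_filter: bool) -> tuple:
    """Return (allowed, reason) for grouping on a profiled column."""
    distinct = column_stats.get("distinct_count", 0)
    if distinct > HIGH_CARDINALITY_THRESHOLD and not has_filter:
        return False, (
            f"Column has {distinct} distinct values; "
            "add a filter before grouping."
        )
    return True, "ok"
```

Duplicating the rule in code means a single prompt regression cannot silently reintroduce million-row groupings.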

The “Execute Query” Tool (The Specialist)
The final tool in the manifest is execute_query. On the surface, this seems simple: take a SQL string and run it. However, in an enterprise environment, this is the most dangerous tool in the stack. A hallucinated DROP TABLE command or a runaway CROSS JOIN could cause catastrophic data loss or outage.
Therefore, we do not simply pass the LLM’s output to the database driver. We wrap this tool in a robust Safety & Correction Layer:
- Read-Only Enforcement: The tool parses the AST (Abstract Syntax Tree) of the incoming SQL. If it detects any DML/DDL keywords (INSERT, UPDATE, DELETE, DROP, ALTER), it immediately rejects the request with a “Permission Denied” error, without ever touching the database.
- The “Self-Healing” Loop: If the query fails (e.g., “Column `rev_amt` not found”), the tool captures the specific database error message and feeds it back to the LLM as a new observation. This allows the agent to trigger its own “Debugging Routine,” realize its mistake, check the schema again, and issue a corrected query.
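A minimal sketch of that safety wrapper is below. It uses a keyword scan as a simplified stand-in for the full AST check (a production system would use a real SQL parser such as sqlglot), and `db_execute` is a hypothetical warehouse driver callable.

```python
import re

# Sketch of the Safety & Correction Layer around execute_query.
# A keyword scan stands in for the AST check described above;
# `db_execute` is a hypothetical warehouse driver callable.

FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER)\b", re.IGNORECASE
)

def execute_query(sql_query: str, db_execute):
    """Reject writes outright; surface DB errors as observations, not crashes."""
    if FORBIDDEN.search(sql_query):
        # Reject before the database is ever touched.
        return {"status": "error", "message": "Permission Denied: read-only tool."}
    try:
        rows = db_execute(sql_query)
        return {"status": "ok", "rows": rows}
    except Exception as exc:
        # Feed the real error text back to the agent for self-healing.
        return {"status": "error", "message": str(exc)}
```

Note that failures return a structured observation instead of raising: the error message is itself the input to the agent's next "Thought" step.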
This architecture aligns with Microsoft Research’s findings on “Grounding,” which emphasize that for agents to be viable in the real world, they must operate within a “verifiable execution environment” that constrains their action space to safe operations [Grounding LLMs in Interactive Environments].

The Agent as an Operating System
By moving to this Tool-Driven Architecture, we have effectively upgraded our system from a Chatbot to a Runtime Environment.
We are no longer relying on the LLM to remember the database; we are empowering it to navigate the database. The Function Manifest provides the hands, and the System Instructions provide the discipline. This combination allows the agent to handle the complexity of the real world — iterating, checking facts, and self-correcting — just like a human analyst would.
But having “hands” is not enough. The agent still needs to know what to look for. In the next article, we will discuss Surgical Context Injection — how we handle the ambiguity of human language and ensure the agent asks the right clarifying questions before it even touches the tools.
Build the Complete System
This article is part of the Cognitive Agent Architecture series. We are walking through the engineering required to move from a basic chatbot to a secure, deterministic Enterprise Consultant.
To see the full roadmap — including Semantic Graphs (The Brain), Gap Analysis (The Conscience), and Sub-Agent Ecosystems (The Organization) — check out the Master Index below:
The Cognitive Agent Architecture: From Chatbot to Enterprise Consultant
Note: Article content contains the views of the contributing authors and not Towards AI.