
Tool Descriptions Are Critical: Making Better LLM Tools + Research Capability
Author(s): JV Roig
Originally published on Towards AI.
This is Part 5 in the series "Journey to Agentic AI, Starting From First Principles".
If you're just joining us, here are links to the previous articles in this series:
- Part 1: From Next-Token to Tool-Use: How to Give LLMs the Ability to Use Tools
- Part 2: Giving Control of Our Computer to an LLM
- Part 3: External Services and Git Powers
- Part 4: MCP in Action, Some Mythbusting and Security Nightmares
If you're curious about how we can make LLMs use tools, those articles are a fantastic resource, and most have video demos and actual code you can play with through the companion GitHub repo of this series.
Recap
In Parts 1-3, we learned the basics of how to evolve LLMs from their natural "next-token prediction" behavior to almost-agentic "tool-use" behavior, and saw many different applications of this through a variety of tools, from reading files to working with GitHub repos.
In Part 4, we learned about MCP, its advantages and security caveats, and why we're not going to suddenly convert all of our tools to MCP even though it's hip (there's really no pedagogical benefit to it yet!).
During the MCP discussion, we also saw how it allowed the Claude LLM to suddenly have some research capability through the Claude Desktop app + Brave Web Search MCP server.
Here in Part 5, we're going to build a similar capability, but we'll also try to make it better and, along the way, learn a powerful secret that is absolutely critical to building reliable tool-calling LLMs.
Implementing web research powers
If you want to follow along, we're going to start building our new tools on top of our existing code from Part 3 (GitHub link). And as before, whether you decide to follow along or not, all of the code we discuss today will also be available in the GitHub repo, which I will link at the end.
Let's get started!
First, let's plan the tools we need.
- Web search: We'll still use the Brave Web Search API, since Brave offers a very generous free tier. Just like in the MCP scenario, we'll need an API key, which you can get for free from https://brave.com/search/api/
- Fetching webpages: The Brave Web Search API doesn't actually return the full HTML content of webpages, just a summary. This makes sense: otherwise Brave would be storing the entire internet in its database, instead of just indexing it. But for real research capability, we have to retrieve the full content of chosen webpages.
[Note: As before, the sample code uses Alibaba Cloud Model Studio for the endpoint and uses Qwen Max, one of the most powerful models out there. However, since the code uses the standard OpenAI API, you can target any other endpoint you want, and use any model you want: from other online services like OpenAI itself, or your own local models running through vLLM, TGI, or llama.cpp]
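For instance, here's a minimal sketch of that endpoint flexibility using the standard OpenAI client (the `base_url` and model name are illustrative assumptions; substitute your own provider's values):

```python
# A minimal sketch of pointing the standard OpenAI client at an
# OpenAI-compatible endpoint. The base_url and model name are
# illustrative; swap in your provider's (or local server's) values.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # or e.g. your vLLM/llama.cpp server
)

response = client.chat.completions.create(
    model="qwen-max",  # any model your endpoint serves
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```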
Brave web search
Let's begin by adding a new file `web.py` inside our `qwen_tools_lib` folder, and then implementing the web search tool:
```python
import os
import requests
import json


def brave_web_search(query, count=10):
    """
    Search the web using Brave Search API.

    Args:
        query (str): The search query.
        count (int, optional): The number of results to return. Defaults to 10.

    Returns:
        dict: A dictionary containing the search results or an error message.
    """
    try:
        # Get API key from environment variables
        api_key = os.environ.get('BRAVE_API_KEY')
        if not api_key:
            return {"error": "BRAVE_API_KEY environment variable not found"}

        # Prepare the API request
        url = "https://api.search.brave.com/res/v1/web/search"
        headers = {
            "Accept": "application/json",
            "Accept-Encoding": "gzip",
            "X-Subscription-Token": api_key
        }
        params = {
            "q": query,
            "count": count
        }

        # Make the API request
        response = requests.get(url, headers=headers, params=params)
        response.raise_for_status()  # Raise an exception for HTTP errors

        # Return the JSON response
        return response.json()
    except requests.exceptions.RequestException as e:
        return {"error": f"API request failed: {str(e)}"}
    except json.JSONDecodeError:
        return {"error": "Failed to decode JSON response"}
    except Exception as e:
        return {"error": f"An unexpected error occurred: {str(e)}"}
```
The Brave Web Search tool is pretty straightforward, since the Brave service handles the heavy lifting. All we do is call their remote API.
Of course, we need to have a valid `BRAVE_API_KEY`, and our tool implementation assumes it will be accessible as an environment variable.
Our sample application already uses `python-dotenv`, so we have a `.env` file. Previously, that was just used for our inference endpoint API key. We can just add an entry there for our Brave API key:
```
DASHSCOPE_API_KEY=sk-secreeeetttt
BRAVE_API_KEY=BSA_I_still_seeeeecret
```
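As a quick sanity check, a sketch like this should work (the import path assumes the package layout from the earlier parts, and the `web.results` response shape follows Brave's documented schema, so treat both as assumptions):

```python
# A quick sanity check for the new tool. Assumes .env sits in the working
# directory and contains a valid BRAVE_API_KEY.
from dotenv import load_dotenv
from qwen_tools_lib.web import brave_web_search

load_dotenv()  # populate os.environ from .env

results = brave_web_search("agentic AI first principles", count=3)
if "error" in results:
    print("Search failed:", results["error"])
else:
    # Brave returns web results under results["web"]["results"]
    for item in results.get("web", {}).get("results", []):
        print(item.get("title"), "->", item.get("url"))
```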
Fetching webpages
Fetching a webpage is going to be trickier than expected.
We could simply use Python's `requests` library to fetch a webpage and get its full content, sure, but webpages have too much boilerplate and unnecessary code that isn't really content. Try it yourself: if you implement a fetch-webpage tool that does just that, it will regularly fail because it will load too many tokens into the LLM, even for webpages that, to the human eye, don't seem to have much content.
So, to strip away unnecessary bloat and focus on the actual main content, we'll use the `BeautifulSoup` library:
```python
def fetch_web_page(url, headers=None, timeout=30, clean=True):
    """
    Fetch content from a specified URL and extract the main content.

    Args:
        url (str): The URL to fetch content from.
        headers (dict, optional): Custom headers to include in the request. Defaults to None.
        timeout (int, optional): Request timeout in seconds. Defaults to 30.
        clean (bool, optional): Whether to clean and extract main content. Defaults to True.

    Returns:
        str or dict: The cleaned web page content as text, or a dictionary with an error message if the request fails.
    """
    try:
        # Set default headers if none provided
        if headers is None:
            headers = {
                "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
            }

        # Make the request
        response = requests.get(url, headers=headers, timeout=timeout)
        response.raise_for_status()  # Raise an exception for HTTP errors

        if not clean:
            return response.text

        # Clean and extract main content using BeautifulSoup
        try:
            from bs4 import BeautifulSoup
            import re

            soup = BeautifulSoup(response.text, 'html.parser')

            # Remove script, style, and other non-content elements
            for element in soup(["script", "style", "header", "footer", "nav", "aside", "form", "iframe", "noscript"]):
                element.decompose()

            # Remove elements likely to be ads, banners, etc.
            for element in soup.find_all(class_=re.compile('(ad|banner|menu|sidebar|footer|header|nav|comment|popup|cookie)', re.IGNORECASE)):
                element.decompose()

            clean_text = soup.get_text(separator=' ', strip=True)

            # Clean up extra whitespace
            clean_text = re.sub(r'\s+', ' ', clean_text).strip()
            return clean_text
        except ImportError:
            # If BeautifulSoup is not available, return an error explaining how to install it
            return {"error": "BeautifulSoup is required for content cleaning but not installed. Install with: pip install beautifulsoup4"}
    except requests.exceptions.RequestException as e:
        return {"error": f"Request failed: {str(e)}"}
    except Exception as e:
        return {"error": f"An unexpected error occurred: {str(e)}"}
```
All we're doing here is using the `requests` module to fetch a webpage, and then using `BeautifulSoup` to remove the bloat from it. (You can tinker a bit with the elements being removed if you find the cleaning a little too aggressive in stripping away content.)
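As a quick illustration (the URL and the 8,000-character cap are arbitrary choices for this sketch, but some cap is wise, since even cleaned pages can be long):

```python
# A quick illustration of fetch_web_page. The URL and the character cap
# are arbitrary choices for this sketch.
from qwen_tools_lib.web import fetch_web_page

content = fetch_web_page("https://example.com")
if isinstance(content, dict) and "error" in content:
    print("Fetch failed:", content["error"])
else:
    # Even cleaned pages can be long, so consider capping what goes
    # into the LLM's context.
    print(content[:8000])
```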
Of course, so that our LLM will know about these new tools and how to use them, we need to add new entries to our `qwen_tool.py` file too:
```python
def list_tools():
    tools_available = """
[...snip...]
- brave_web_search: Search the web using Brave Search API.
    Parameters:
        - query (required, string): the search query to submit to Brave
        - count (optional, integer): the number of results to return, defaults to 10
    Returns: Object - a JSON object containing search results or error information from the Brave Search API

- fetch_web_page: Fetch content from a specified URL.
    Parameters:
        - url (required, string): the URL to fetch content from
        - headers (optional, dictionary): custom headers to include in the request, defaults to a standard User-Agent
        - timeout (optional, integer): request timeout in seconds, defaults to 30
        - clean (optional, boolean): whether to extract only the main content, defaults to True
    Returns: String - the cleaned web page content as text, or an error object if the request fails
"""
    return tools_available
```
That's it: our LLM can now use the Brave Web Search API to search the web, and it can also pull down full webpages. Just like that, we now have very LLM-usable tools!
But usable is far from optimal. For example, when asked to research a topic, the LLM tends to settle for the search-result summaries instead of fetching the full pages.
Using tool descriptions and messages to control behavior
And now we're going to discuss a very important lesson when it comes to enabling LLMs with tools and trying to achieve agentic AI:
Everything about the tools is a prompt!
Remember that:
- The descriptions we created for the tools are added to the system prompt.
- The instructions on how to call for tools are added to the system prompt.
- The tool responses themselves, whether success or failure, are also returned to the LLM, and so become part of its context.
This means we should be paying attention to the things we say in all of these scenarios, because they are all essentially prompt engineering.
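To make this concrete, here's a minimal sketch of how these pieces typically land in the prompt (the import paths and message wiring are assumptions for illustration; the companion repo's actual loop may differ):

```python
# A sketch of where each piece lands. The import paths and message wiring
# are assumptions for illustration; the companion repo's exact loop may differ.
from qwen_tools_lib.qwen_tool import list_tools
from qwen_tools_lib.web import brave_web_search

system_prompt = (
    "You are a helpful assistant with access to tools.\n"
    "To call a tool, reply with the tool name and its arguments.\n"
    + list_tools()  # the tool descriptions become part of the system prompt
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Do some research on agentic AI."},
]

# After executing a tool the model asked for, its result (success or error)
# goes right back into the conversation, i.e., into the model's context:
tool_result = brave_web_search("agentic AI")
messages.append({"role": "user", "content": str(tool_result)})
```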
To solidify this lesson, let's tackle the web research problem. Our LLM seems content just using the web search summaries instead of really doing research by using `fetch_web_page` in concert with `brave_web_search`, and we'll fix that by treating the tool descriptions and tool messages as prompt engineering.
First, let's modify the `brave_web_search` and `fetch_web_page` descriptions:
"""
-brave_web_search: Search the web using Brave Search API. The responses here only contain summaries. Use fetch_web_page to get the full contents of interesting search results, which should be your default behavior whenever you are asked to do research on any topic.
Parameters:
- query (required, string): the search query to submit to Brave
- count (optional, integer): the number of results to return, defaults to 10
Returns: Object - a JSON object containing search results or error information from the Brave Search API. Use fetch_web_page on relevant URLs to get the full, deeper information, especially for research tasks.
- fetch_web_page: Fetch content from a specified URL. This is a good tool to use after doing a brave_web_search, in order to get more details from interesting search results.
Parameters:
- url (required, string): the URL to fetch content from
- headers (optional, dictionary): custom headers to include in the request, defaults to a standard User-Agent
- timeout (optional, integer): request timeout in seconds, defaults to 30
- clean (optional, boolean): whether to extract only the main content, defaults to True
Returns: String - the cleaned web page content as text, or an error object if the request fails
"""
We added quite a few things here to nudge our LLM towards the right behavior:
- The `brave_web_search` description now talks about the importance of using `fetch_web_page` and why.
- The description of what it returns is also updated to nudge towards `fetch_web_page`.
- The `fetch_web_page` description now also talks about it being a good tool to use after `brave_web_search` as a combo, especially for research tasks.
But it's not just the literal tool descriptions and argument descriptions that are avenues for prompt engineering; even the literal tool results are. So let's also use the `brave_web_search` result message as an opportunity for prompt engineering:
```python
# In web.py, def brave_web_search()
# We update its last line from:
return response.json()

# ...to:
message = ("Web search results: \n" + json.dumps(response.json()) +
           "\nThese results are just summaries. Use the fetch_web_page tool to "
           "retrieve the real content for in-depth information, especially for "
           "research purposes.")
return message
```
So even as we return results to an LLM (which all become part of its context, and thus can serve as useful instructions for various purposes), we prompt-engineer a way to nudge it towards fetching full results instead of being satisfied with just the web search summaries.
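The same trick applies to error messages. Here's a hypothetical helper (not from the companion repo) showing how an error return can coach the model on what to do next:

```python
# A hypothetical helper (not from the companion repo): error messages can
# coach the model on what to try next, instead of just reporting failure.
def instructive_error(exc: Exception, url: str) -> dict:
    return {
        "error": (
            f"Fetching {url} failed: {exc}. The page may be down or blocking "
            "automated requests. Try a different URL from the search results "
            "instead of retrying this one."
        )
    }
```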
Let's try our task again:
Much better!
Here's a nice video to see this in action:
Wrap up
The key lesson here is that everything about the tools, from how you describe each tool in general, to its syntax, parameters, and parameter descriptions, and even the success and error messages it returns, is an opportunity for prompt engineering.
This key insight is critical not just for helping the LLM chain one tool after another, as in our example here. It's critical in general for making agentic AI more reliable, such as helping LLMs recover when they make mistakes, or helping them understand how to do better when they encounter failure.
Don't worry, we'll dive more into those things as we continue this series.
If you're interested in the code and in trying this out yourself, here's the companion GitHub repo: GitHub link.