
Test MCP Servers Across Leading LLMs — and Even Try “gpt-oss” + MCPs for Free
Last Updated on September 4, 2025 by Editorial Team
Author(s): hideya
Originally published on Towards AI.

Introduction
Large Language Models (LLMs) are evolving at an incredible pace — getting smarter, faster, and cheaper almost every month. Recently, even “gpt-oss,” an open-weight model family from the maker of ChatGPT, has appeared.
At the same time, MCP (Model Context Protocol) servers have matured into powerful, practical tools. In fact, for many routine tasks, even small, inexpensive LLMs can handle MCP server calls effectively.
This led me to a question:
👉 Which MCP tasks can be powered by cheap models? And how do performance, cost, and speed compare across different providers?
To facilitate experimentation, I created a command-line tool that enables you to quickly test various MCP servers with different LLMs.
Yes — including gpt-oss with free access tiers!
👉 And if you’ve ever run into the frustrating GoogleGenerativeAIFetchError: 400 Bad Request when trying Gemini with MCP servers from TypeScript, don’t worry: the last section covers how this tool works around that issue.
Why This Tool?
When experimenting, I often wanted to know something like:
- Can a given MCP task run fine on OpenAI’s gpt-5-mini ($0.25 / $2.00 per million tokens)?
- Or even on the ultra-cheap gpt-5-nano ($0.05 / $0.40)?
- How about Gemini 2.5 Flash-Lite, famous for its performance, low cost, and free quota?
- Or the lightning-fast Cerebras + gpt-oss-120B, with throughput up to 3,000 tokens per second?
This tool lets you compare all of these scenarios with just a simple configuration file.
How It Works
By writing a config file, you can easily run the same set of queries against different MCP servers (e.g., Notion, GitHub) and compare responses across LLMs.
Here’s a simplified example in JSON5 (comments supported):
{
"llm": {
"provider": "openai", "model": "gpt-5-mini"
// "provider": "anthropic", "model": "claude-3-5-haiku-latest"
// "provider": "google_genai", "model": "gemini-2.5-flash"
// "provider": "xai", "model": "grok-3-mini"
// "provider": "cerebras", "model": "gpt-oss-120b"
// "provider": "groq", "model": "openai/gpt-oss-20b"
},
"mcp_servers": {
"notion": { // use "mcp-remote" to access the remote MCP server
"command": "npx",
"args": ["-y", "mcp-remote", "https://mcp.notion.com/mcp"]
},
"github": { // can be accessed directly if no OAuth reqquired
"type": "http",
"url": "https://api.githubcopilot.com/mcp",
"headers": { "Authorization": "Bearer ${GITHUB_PERSONAL_ACCESS_TOKEN}" }
}
},
"example_queries": [
"Tell me about my Notion account",
"Tell me about my GitHub profile"
]
}
Key features:
- JSON5 format → supports comments & trailing commas (unlike plain JSON)
- Environment variables → avoids hardcoding API keys (${ENVIRONMENT_VARIABLE_NAME} will be replaced with its value)
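For example, if the environment (or a .env file) defines the GitHub token below, the Authorization header from the config above is expanded at run time:
GITHUB_PERSONAL_ACCESS_TOKEN=<your personal access token>
// in llm_mcp_config.json5, the ${...} reference is replaced with that value
"headers": { "Authorization": "Bearer ${GITHUB_PERSONAL_ACCESS_TOKEN}" }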
Supported LLM Providers
- OpenAI
- Anthropic
- Google Gemini (not Vertex AI)
- xAI
- Cerebras (for its speed and esp. gpt-oss-120B support)
- Groq (for its speed and esp. gpt-oss-20B/120B support)
👉 NOTE: Only text output is handled (other response types are ignored).
Installation
Two versions are available:
- npm (TypeScript) version — Requires Node.js 18+
npm install -g @h1deya/mcp-client-cli
- pip (Python) version — Requires Python 3.11+
pip install mcp-chat
Run with:
mcp-client-cli   (npm version)
mcp-chat         (pip version)
Running the Tool
Before jumping into real-world MCP servers like Notion or GitHub, I’d recommend starting with a minimal sandbox setup. This way, you can confirm the tool works in your environment without worrying about API tokens or OAuth.
Simple Setup
Here’s a basic config that connects to two local MCP servers:
- Filesystem MCP → lets the LLM read/write local files under the specified directory
- Fetch MCP → lets the LLM fetch web pages
{
"llm": {
"provider": "openai", "model": "gpt-5-mini"
// "provider": "anthropic", "model": "claude-3-5-haiku-latest"
// "provider": "google_genai", "model": "gemini-2.5-flash"
// "provider": "xai", "model": "grok-3-mini"
// "provider": "cerebras", "model": "gpt-oss-120b"
// "provider": "groq", "model": "openai/gpt-oss-20b"
},
"example_queries": [
"Explain how an LLM works in a few sentences",
"Read file 'llm_mcp_config.json5' and summarize its contents",
"Summarize the top headline on bbc.com"
],
"mcp_servers": {
"filesystem": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-filesystem",
"." // Can only manipulate files under the specified directory
]
},
"fetch": {
"command": "uvx",
"args": ["mcp-server-fetch"]
}
}
}
👉 This config is the best place to start. It requires no external API tokens, and it gives you a feel for how LLMs interact with MCP servers.
- Save the above configuration as llm_mcp_config.json5.
- Add API keys to a .env file as needed:
ANTHROPIC_API_KEY=sk-ant-…
OPENAI_API_KEY=sk-proj-…
GOOGLE_API_KEY=AI…
XAI_API_KEY=xai-…
CEREBRAS_API_KEY=csk-…
GROQ_API_KEY=gsk_…
- Run the tool from the directory containing those two files:
mcp-client-cli   (npm version)
mcp-chat         (pip version)
Below is an example of the console output when starting up the tool:
% mcp-client-cli
Initializing model... { provider: 'cerebras', model: 'gpt-oss-120b' }
Initializing 2 MCP server(s)...
Writing MCP server log file: mcp-server-filesystem.log
Writing MCP server log file: mcp-server-fetch.log
[info] MCP server "filesystem": initializing with: {"command":"npx","args":["-y","@modelcontextprotocol/server-filesystem","."],"stderr":14}
[info] MCP server "fetch": initializing with: {"command":"uvx","args":["mcp-server-fetch"],"stderr":16}
[info] MCP server "fetch": connected
[info] MCP server "fetch": 1 tool(s) available:
[info] - fetch
[MCP Server Log: "filesystem"] Secure MCP Filesystem Server running on stdio
[info] MCP server "filesystem": connected
[MCP Server Log: "filesystem"] Client does not support MCP Roots, using allowed directories set from server args: [ '/Users/hideya/.../mcp-chat-test' ]
[info] MCP server "filesystem": 14 tool(s) available:
[info] - read_file
︙
︙
[info] - list_allowed_directories
[info] MCP servers initialized: 15 tool(s) available in total
Conversation started. Type 'quit' or 'q' to end the conversation.
Example Queries (just type Enter to supply them one by one):
- Explain how an LLM works in a few sentences
- Read file 'llm_mcp_config.json5' and summarize its contents
- Summarize the top headline on bbc.com
Query: █
You’ll see initialization logs, MCP connections, and then a prompt where you can enter queries, or simply press Enter to replay the example queries one by one and test different combinations of MCP servers and LLMs (the first example query serves as a sanity check of LLM behavior without involving MCP). Note that the local MCP server log files are saved in the current directory; the fetch server does not write logs, so its log file remains empty.
Use the --help option to see how to specify a different configuration file and change the directory in which log files are saved.
Free Access to gpt-oss
Both Cerebras and Groq provide free tiers for gpt-oss models, and both are supported by this tool.
Performance benchmarks (as of August 2025):
- GPT-5: ~200 tokens/sec
- Cerebras + gpt-oss-120B: 3,000 tokens/sec 🤯
- Groq + gpt-oss-20B: ~1,000 tokens/sec
- Groq + gpt-oss-120B: ~500 tokens/sec
Setting up your account and API key is simple and doesn’t require a credit card.
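For example, switching the earlier config to Cerebras’s gpt-oss-120B (or Groq’s gpt-oss-20B) only requires changing the llm block; the provider and model names below are the same ones shown commented out in the configs above, and the corresponding API key goes into .env as CEREBRAS_API_KEY or GROQ_API_KEY:
"llm": {
"provider": "cerebras", "model": "gpt-oss-120b"
// "provider": "groq", "model": "openai/gpt-oss-20b"
}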
Advanced Notes
- Implementation: The tool is an MCP client built with LangChain.js. It converts MCP tools into LangChain tools using a custom lightweight adapter (npmjs / PyPI), and runs them on LangGraph’s ReAct agent (see the sketch after this list).
- Gemini issue: Gemini combined with the official LangChain.js MCP adapters sometimes breaks on strict JSON schema rules, throwing 400 Bad Request. The npm version (mcp-client-cli) sidesteps this with schema transformations in the custom MCP adapter. The Python SDKs don’t face this issue; see the next section for details.
- npm vs. pip version differences:
– npm (mcp-client-cli) → local MCP server logs are also printed to the console.
– pip (mcp-chat) → logs go only to file, and Python error messages can be tricky to read.
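To give a feel for the architecture, here is a minimal sketch of the general pattern: MCP servers wrapped as LangChain tools and handed to LangGraph’s prebuilt ReAct agent. For brevity it uses the official @langchain/mcp-adapters package rather than the custom adapter this tool actually uses, and the exact class and method names should be verified against the current LangChain.js docs:
// Minimal sketch (not this tool's actual source): MCP servers exposed as
// LangChain tools and run on LangGraph's prebuilt ReAct agent.
import { MultiServerMCPClient } from "@langchain/mcp-adapters";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage } from "@langchain/core/messages";

async function main() {
  // Same local MCP servers as in the sandbox config shown earlier
  const client = new MultiServerMCPClient({
    mcpServers: {
      filesystem: {
        command: "npx",
        args: ["-y", "@modelcontextprotocol/server-filesystem", "."],
      },
      fetch: { command: "uvx", args: ["mcp-server-fetch"] },
    },
  });

  const tools = await client.getTools();          // MCP tools -> LangChain tools
  const llm = new ChatOpenAI({ model: "gpt-5-mini" });
  const agent = createReactAgent({ llm, tools }); // ReAct loop decides when to call tools

  const result = await agent.invoke({
    messages: [new HumanMessage("Summarize the top headline on bbc.com")],
  });
  console.log(result.messages.at(-1)?.content);

  await client.close();                           // shut down the MCP server processes
}

main().catch(console.error);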
⚠️ Advanced: Fixing Gemini + LangChain.js + MCP Compatibility — Avoiding 400 Bad Request Crashes
Please feel free to skip this quite long and technical section.
If you’ve ever tried using Google Gemini together with LangChain.js and MCP servers with complex schemas, you may have run into this error:
[GoogleGenerativeAI Error]: Error fetching from
https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent:
[400 Bad Request] Invalid JSON payload received.
Unknown name "anyOf" at ...
This message often repeats dozens of times, then the request fails entirely.
If you searched for GoogleGenerativeAIFetchError: [GoogleGenerativeAI Error] 400 Bad Request, this section explains the cause and how to work around it when using LangChain (you can avoid the issue entirely if you use Google Vertex AI).
Why This Happens
- Gemini’s schema requirements are very strict. MCP servers define their tools using flexible JSON schemas, and most LLMs accept these just fine.
- But Gemini rejects valid MCP tool schemas if they contain fields it doesn’t expect (e.g., anyOf).
- The result is a 400 Bad Request, even though the same MCP server works fine with OpenAI, Anthropic, or xAI.
- Google provides a fix in its new Gemini SDK (@google/genai), but LangChain.js cannot leverage it because of architectural mismatches.
👉 For many developers, this can make Gemini difficult to use with LangChain.js and some MCP servers. Even if only one complex MCP server is included in the MCP server definitions passed to MultiServerMCPClient, all subsequent MCP usage starts failing with the error above.
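As a concrete illustration, an MCP server might advertise a tool whose input schema looks like the hypothetical fragment below (the tool name and fields are made up); the anyOf construct is valid JSON Schema, but it is exactly the kind of field the error message above complains about:
{
  "name": "create_page", // hypothetical tool
  "inputSchema": {
    "type": "object",
    "properties": {
      "parent": {
        // valid JSON Schema, but Gemini's API rejects "anyOf" here
        "anyOf": [
          { "type": "string" },
          { "type": "object", "properties": { "page_id": { "type": "string" } } }
        ]
      }
    }
  }
}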
How This Tool Fixes It
This command-line interface (CLI) automatically converts MCP tool schemas into a Gemini-compatible format before sending them, using the custom lightweight MCP tool converter that the app relies on internally.
If you’d prefer to see the raw error behavior, you can disable this fix by setting:
{
"schema_transformations": false,
"llm": {
"provider": "google_genai", "model": "gemini-2.5-flash"
}
...
}
If you’re a Gemini user frustrated by schema errors, this feature lets you test the MCP servers that would otherwise fail.
If you’d like to apply the same workaround in your own LangChain.js project, consider using the same MCP tool converter until the issue is fixed in the official SDKs or in the MCP servers. A deliberately simplified sketch of the idea follows.
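The snippet below is a naive sketch of that kind of transformation, assuming the only problematic construct is anyOf; it is not the converter’s actual logic, which handles many more cases:
// Naive sketch of a Gemini-compatibility transform for MCP tool JSON schemas.
// It only collapses "anyOf" into its first variant (lossy, but avoids the 400 error);
// the real converter used by this tool handles many more constructs.
function toGeminiCompatible(schema: any): any {
  if (Array.isArray(schema)) return schema.map(toGeminiCompatible);
  if (schema === null || typeof schema !== "object") return schema;

  if (Array.isArray(schema.anyOf) && schema.anyOf.length > 0) {
    return toGeminiCompatible(schema.anyOf[0]);
  }

  const out: Record<string, any> = {};
  for (const [key, value] of Object.entries(schema)) {
    out[key] = toGeminiCompatible(value);
  }
  return out;
}

// Apply toGeminiCompatible() to each MCP tool's input schema
// before the tools are bound to the Gemini model.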
Conclusion
This command-line tool makes it easy to:
- Experiment with different LLMs and MCP servers
- Benchmark performance, cost, and speed
- Try open-weight models like gpt-oss, even for free
If you’re exploring the rapidly evolving LLM + MCP ecosystem, I hope this helps you tinker, compare, and discover the setups that work best for your projects. 🙏✨