Model Context Protocol and the “too many tools” problem.

A few weeks ago I wrote this post about the auto-discovery of MCP servers, a feature that many clients (like GitHub Copilot, Cursor and others) offer, and why it’s better to stay in control of it.

A few days later I posted this on LinkedIn:

Today I want to take some time to explain why I posted that image and that message. The screenshot was taken from Claude Code, but that’s not really relevant: the concept is the same for every LLM.

The context window of an LLM is its memory capacity for a single conversation: it includes your messages, the LLM’s responses, file contents and tool definitions.

In the example I posted, you can see that claude-sonnet-4 has a standard context window of 200K tokens. In my Claude Code setup I have installed the Playwright MCP server, which exposes a long list of tools (as you can see from the first image).

As you can see from the image below, all these tools (just by being exposed to the LLM) consume around 22.2% of its context window, which on a 200K-token window is roughly 44K tokens gone before the conversation even starts:
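To understand where those tokens go, remember that every MCP tool is advertised to the model as a name, a description and a JSON Schema for its input, and all of that text is serialized into the prompt on every request. As a rough illustration (the tool below is a simplified, hypothetical Playwright-style tool, not the actual definition), a single tool can easily cost a few hundred tokens:

```typescript
// Hypothetical, simplified tool definition, roughly the shape returned by an
// MCP server's tools/list call. Every tool exposed to the model adds a block
// like this to the prompt, on every single request.
const browserClickTool = {
  name: "browser_click",
  description:
    "Click an element on the current page. The element must be visible and " +
    "interactable. Prefer using the element reference from a previous snapshot.",
  inputSchema: {
    type: "object",
    properties: {
      element: {
        type: "string",
        description: "Human-readable description of the element to click",
      },
      ref: {
        type: "string",
        description: "Exact element reference taken from the page snapshot",
      },
    },
    required: ["element", "ref"],
  },
};
// Multiply this by 20+ tools and the definitions alone can eat tens of
// thousands of tokens of context before you type your first message.
```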

Having too many tools exposed by an MCP server can degrade the performance and the output of your LLM! With a lot of tools exposed, LLM latency increases (so responses are slower) and the agent can get distracted, but the main problem is another one: if tool definitions are crowding out the context window, there’s less room for context about the current project or goal.

When adding MCP servers to your client, it’s quite common to give the LLM access to every single tool you can find. But this is a mistake! Overwhelming the LLM with options often leads to suboptimal tool choices, wasted tokens or outright hallucinations.

Some clients are aware of this problem. For example, Cursor has a hard limit of 40 MCP tools in total that can be exposed to the LLM, regardless of how many MCP servers are installed. If you exceed this limit, Cursor only sends the first 40 tools to the agent, making the remaining tools inaccessible. To work around this, you can, for example, manually disable unneeded tools in your mcp.json file. Another approach is to install a third-party proxy service, such as MCP Proxy, which acts as an intermediary and allows Cursor to access thousands of tools.

GitHub Copilot has a hard limit too: a chat request can have a maximum of 128 MCP tools enabled at a time. If you have more than 128 tools selected, you need to reduce the number by deselecting some tools in the tools picker, or ensure that virtual tools are enabled (via the github.copilot.chat.virtualTools.threshold setting):

The reason for these hard limits is always the same: to prevent “flooding” the agent’s context window with too many tools, which can degrade the LLM’s performance.

The simple solution to this “too many exposed MCP tools” problem is to deactivate the tools you don’t need on a per-conversation basis. Many popular MCP clients support this, but I agree that it’s quite frustrating.

Another possible approach is to explicitly instruct the LLM about which MCP server it can use for the task, but again this is noisy. Generally speaking, manually selecting tools becomes tedious, especially when you’re unsure which tools you’ll need for a specific task.

MCP server design best practice

I think this problem should be handled in the MCP design phase! I’m a big fan of building focused MCP servers from the beginning. My personal rules:

  • Don’t create a single MCP server for everything; instead, create different MCP servers for different goals.
  • Build your MCP server around a user’s workflow rather than around the underlying framework or API. Scoped MCP servers simply work better.

Just to give a silly example: personally, instead of doing something like this (a single MCP server with tons of tools exposed):

I prefer to do something more scoped, like the following:
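The original post shows this as a diagram; as a concrete sketch of what a scoped server can look like in code, here is a minimal example using the official MCP TypeScript SDK. The server name, tool names and parameters are hypothetical, invented purely for illustration:

```typescript
// A minimal, hypothetical "scoped" MCP server built with the official
// TypeScript SDK (@modelcontextprotocol/sdk). It exposes only the few tools
// needed for one workflow (managing sales orders), instead of wrapping an
// entire API surface.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "sales-orders", version: "1.0.0" });

// One workflow, a handful of tools: the model sees a short, focused tool list.
server.tool(
  "create_sales_order",
  "Create a new sales order for a customer",
  { customerId: z.string(), items: z.array(z.string()) },
  async ({ customerId, items }) => ({
    content: [
      { type: "text", text: `Created order for ${customerId} with ${items.length} items` },
    ],
  })
);

server.tool(
  "get_order_status",
  "Get the current status of a sales order",
  { orderId: z.string() },
  async ({ orderId }) => ({
    content: [{ type: "text", text: `Order ${orderId}: status lookup goes here` }],
  })
);

// Expose the server over stdio so a client (Claude Code, Cursor, Copilot, ...)
// can launch it as a local MCP server.
const transport = new StdioServerTransport();
await server.connect(transport);
```

A second, separate server (say, one for customer management) would expose its own handful of tools, and you enable each server only when the task at hand actually needs it.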

Another possible approach (that I’ve personally not yet used, but I think is interesting) is using a proxy, like in the following schema (image credit to solo.io):
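The idea behind the proxy pattern is that the client talks to a single MCP server, which in turn connects to the real (downstream) MCP servers and decides which of their tools to surface. As a very rough conceptual sketch (not the solo.io implementation, and with invented names and wiring), a proxy could expose just two generic tools: one to search the downstream catalog and one to invoke a tool on demand:

```typescript
// Conceptual sketch of an MCP proxy: the agent sees only two tools
// ("find_tools" and "call_tool"), while the proxy fans out to the real
// downstream MCP servers behind the scenes. Names and wiring are illustrative.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";
import { z } from "zod";

// Downstream servers the proxy manages (hypothetical list).
const downstreams: Record<string, { command: string; args: string[] }> = {
  playwright: { command: "npx", args: ["@playwright/mcp@latest"] },
};

async function connectDownstream(name: string): Promise<Client> {
  const client = new Client({ name: `proxy-${name}`, version: "1.0.0" });
  await client.connect(new StdioClientTransport(downstreams[name]));
  return client;
}

const proxy = new McpServer({ name: "mcp-proxy", version: "1.0.0" });

// Tool 1: let the agent search the downstream tool catalog by keyword,
// instead of receiving every tool definition up front.
proxy.tool(
  "find_tools",
  "Search available downstream tools by keyword",
  { query: z.string() },
  async ({ query }) => {
    const results: string[] = [];
    for (const name of Object.keys(downstreams)) {
      const client = await connectDownstream(name);
      const { tools } = await client.listTools();
      for (const t of tools) {
        if (`${t.name} ${t.description ?? ""}`.toLowerCase().includes(query.toLowerCase())) {
          results.push(`${name}/${t.name}: ${t.description ?? ""}`);
        }
      }
      await client.close();
    }
    return { content: [{ type: "text", text: results.join("\n") || "No matching tools" }] };
  }
);

// Tool 2: invoke a downstream tool on demand.
proxy.tool(
  "call_tool",
  "Call a downstream tool by server and tool name",
  { server: z.string(), tool: z.string(), args: z.record(z.any()) },
  async ({ server, tool, args }) => {
    const client = await connectDownstream(server);
    const result = await client.callTool({ name: tool, arguments: args });
    await client.close();
    return { content: [{ type: "text", text: JSON.stringify(result.content) }] };
  }
);

await proxy.connect(new StdioServerTransport());
```

This is, in spirit, how gateway/proxy approaches keep the agent’s context small: only a couple of tool definitions are always present, and everything else is discovered on demand.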

Generally speaking, a well-designed MCP server gives agents exactly what they need to accomplish their goals without overwhelming them with choices they don’t need. Domain-driven design is often a key aspect here.

Original Post https://demiliani.com/2025/09/04/model-context-protocol-and-the-too-many-tools-problem/
