Description
Is your feature request related to a problem? Please describe.
Before calling the LLM, the llm_agent sends 2 to 3 HTTP requests to the MCP server. Because a ListToolsRequest is triggered on every LLM call, this adds latency, especially in more complex systems.
In public MCP environments this step is a natural part of the process. In internal MCP setups, however, where the agent repeatedly sends a ListToolsRequest to the same version of the MCP server, it can noticeably slow down the agent's responses.
Caching the tool-list response would significantly reduce this latency.
Describe the solution you'd like
How about adding caching logic to the MCPToolset so that the tool list is stored in memory for a certain period of time? A rough sketch of the intended behavior follows the parameter list below.
Add the following two parameters to the MCPToolset:
- cache: A flag to enable or disable caching.
- cache_ttl_seconds: The time-to-live for the cache; while the cache is valid, no additional requests are made to the server.
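
A minimal, self-contained sketch of the idea is below. CachedToolset, fetch_tools, and get_tools are illustrative names only, not the actual MCPToolset API; a real implementation would also need to handle async calls and session reconnects.

```python
import time
from typing import Any, Callable, List, Optional


class CachedToolset:
    """Wraps a tool-list fetcher and caches its result for a fixed TTL.

    Hypothetical example; not the real MCPToolset implementation.
    """

    def __init__(
        self,
        fetch_tools: Callable[[], List[Any]],  # stand-in for the ListToolsRequest call
        cache: bool = True,
        cache_ttl_seconds: float = 300.0,
    ):
        self._fetch_tools = fetch_tools
        self._cache_enabled = cache
        self._cache_ttl_seconds = cache_ttl_seconds
        self._cached_tools: Optional[List[Any]] = None
        self._cached_at: float = 0.0

    def get_tools(self) -> List[Any]:
        now = time.monotonic()
        cache_is_fresh = (
            self._cache_enabled
            and self._cached_tools is not None
            and now - self._cached_at < self._cache_ttl_seconds
        )
        if cache_is_fresh:
            # Serve from memory; no request is sent to the MCP server.
            return self._cached_tools
        # Cache miss or expired TTL: hit the server and refresh the cache.
        tools = self._fetch_tools()
        if self._cache_enabled:
            self._cached_tools = tools
            self._cached_at = now
        return tools


# Example usage (the lambda stands in for the real MCP call):
# toolset = CachedToolset(lambda: ["tool_a", "tool_b"], cache=True, cache_ttl_seconds=60)
# toolset.get_tools()  # first call hits the server
# toolset.get_tools()  # served from memory until the TTL expires
```

Using time.monotonic() avoids surprises from system clock adjustments; an explicit invalidation method could also be exposed for deployments that update the MCP server in place.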
I would greatly appreciate your feedback on this proposal. Thank you!