
Context compression does not prevent context overflow errors #4545

@dobesv

Description

When context compression is enabled with a token limit lower than your model's context window, requests can still get "stuck" with an error like:

```
litellm.ContextWindowExceededError: litellm.BadRequestError: AnthropicError - b'{"type":"error","error":{"type":"invalid_request_error","message":"prompt is too long: 217388 tokens > 200000 maximum"},"request_id":"req_011CYGQ9pNMT8pdQbGzZcPUs"}'
```

This happens because the context compression logic only runs after the request has already been sent to the LLM. Compression should either run before the LLM call, or catch an error like the one above, compress the history, and retry.
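
A minimal sketch of what both fixes could look like, combined. `compress_history` is a hypothetical stand-in for whatever condensation step already exists; the model name and `TOKEN_LIMIT` threshold are placeholder values, not the project's actual configuration:

```python
import litellm

MODEL = "claude-3-5-sonnet-20241022"
TOKEN_LIMIT = 150_000  # compression threshold, below the model's 200k window

def compress_history(messages):
    """Hypothetical stand-in for the existing context-compression logic,
    e.g. summarizing older turns into a single condensed message."""
    ...

def complete_with_compression(messages):
    # Fix 1: pre-flight check — compress *before* sending if the prompt
    # is already over the configured threshold.
    if litellm.token_counter(model=MODEL, messages=messages) > TOKEN_LIMIT:
        messages = compress_history(messages)
    try:
        return litellm.completion(model=MODEL, messages=messages)
    except litellm.ContextWindowExceededError:
        # Fix 2: the provider still rejected the prompt, so compress
        # and retry once instead of surfacing the error to the user.
        messages = compress_history(messages)
        return litellm.completion(model=MODEL, messages=messages)
```

Either half on its own would avoid the stuck state; doing both guards against token-count estimates that disagree with the provider's own counter.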
