
Conversation

roomote bot (Contributor) commented on Jan 6, 2026

This PR attempts to address Issue #9169. Feedback and guidance are welcome.

Problem

When two configuration profiles are created with the LiteLLM API provider but point at different instances (different base URLs), the list of models is always pulled from one of them, causing issues when the instances do not serve the same models.

Root Cause

The model cache was using only the provider name as the cache key, not distinguishing between different LiteLLM instances with different base URLs.

Solution

  • Added a getCacheKey() function that generates unique cache keys including a hash of the base URL for multi-instance providers (LiteLLM, Ollama, LM Studio, Requesty, DeepInfra, Roo); see the sketch after this list
  • Updated writeModels()/readModels() to use unique filenames per instance
  • Updated getModels()/refreshModels() to use unique cache keys
  • Updated flushModels() to clear both memory and file cache for the specific instance
  • Added deleteModelsCacheFile() for cleaning up disk cache
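
For illustration, a minimal sketch of the getCacheKey() idea is shown below. The option shape, provider identifiers, and hash length are assumptions, not the PR's exact implementation:

```typescript
import { createHash } from "crypto"

// Stand-in option type; only the fields needed for the sketch are modeled.
interface GetModelsOptions {
	provider: string
	baseUrl?: string
}

// Providers that can point at different user-configured instances.
const MULTI_INSTANCE_PROVIDERS = new Set(["litellm", "ollama", "lmstudio", "requesty", "deepinfra", "roo"])

function getCacheKey(options: GetModelsOptions): string {
	// Single-instance providers keep the old behavior: the provider name alone.
	if (!MULTI_INSTANCE_PROVIDERS.has(options.provider) || !options.baseUrl) {
		return options.provider
	}
	// Hash the base URL so two instances of the same provider get distinct cache entries.
	const hash = createHash("sha256").update(options.baseUrl).digest("hex").slice(0, 8)
	return `${options.provider}-${hash}`
}
```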

Testing

  • All existing tests pass
  • Added new tests specifically for multi-instance caching behavior (an example is sketched after this list):
    • getCacheKey returns correct keys for different provider configurations
    • Different LiteLLM instances use separate cache entries
    • In-flight request tracking works per-instance
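
A minimal example of what such a test could look like, assuming a vitest-style runner and the getCacheKey() sketch shown earlier; provider names and URLs are illustrative:

```typescript
import { describe, expect, it } from "vitest"

describe("getCacheKey", () => {
	it("returns distinct keys for LiteLLM instances with different base URLs", () => {
		const a = getCacheKey({ provider: "litellm", baseUrl: "http://localhost:4000" })
		const b = getCacheKey({ provider: "litellm", baseUrl: "http://other-host:4000" })
		expect(a).not.toBe(b)
	})

	it("falls back to the provider name for single-instance providers", () => {
		expect(getCacheKey({ provider: "openrouter" })).toBe("openrouter")
	})
})
```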

Changes from Previous PR #9170

This PR rescopes the fix as requested, with the following improvements:

  • Cleaner implementation with separate getCacheKey() and getCacheFilename() helpers
  • File cache is also deleted during flushModels() (not just the memory cache); see the sketch after this list
  • More comprehensive test coverage
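
Building on the getCacheKey() sketch above, here is one possible shape for the filename and flush helpers described in this list. The plain Map memory cache and the exact filename pattern are assumptions for illustration only:

```typescript
import * as fs from "fs/promises"
import * as path from "path"

function getCacheFilename(options: GetModelsOptions): string {
	// One cache file per provider instance, derived from the per-instance key.
	return `${getCacheKey(options)}_models.json`
}

async function deleteModelsCacheFile(cacheDir: string, options: GetModelsOptions): Promise<void> {
	// force: true avoids throwing when this instance has no cached file yet.
	await fs.rm(path.join(cacheDir, getCacheFilename(options)), { force: true })
}

async function flushModels(
	memoryCache: Map<string, unknown>,
	cacheDir: string,
	options: GetModelsOptions,
): Promise<void> {
	// Clear both layers for this specific instance, not the whole provider.
	memoryCache.delete(getCacheKey(options))
	await deleteModelsCacheFile(cacheDir, options)
}
```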

Fixes #9169

This commit fixes an issue where different LiteLLM instances with different
base URLs would share the same model cache, causing model mixing when
switching between configuration profiles.

Changes:
- Added getCacheKey() function that generates unique cache keys including
  a hash of the base URL for multi-instance providers (LiteLLM, Ollama,
  LM Studio, Requesty, DeepInfra, Roo)
- Updated writeModels/readModels to use unique filenames per instance
- Updated getModels/refreshModels to use unique cache keys
- Updated flushModels to clear both memory and file cache for the instance
- Added deleteModelsCacheFile() for cleaning up disk cache
- Added comprehensive tests for multi-instance caching behavior

Fixes #9169
roomote bot (Contributor, Author) commented on Jan 6, 2026


Reviewed the fix for model mixing between LiteLLM configuration profiles. The implementation correctly addresses the root cause by using unique cache keys that include a hash of the base URL for multi-instance providers. Found one minor code quality issue:


Comment on lines +107 to 113
async function readModels(options: GetModelsOptions): Promise<ModelRecord | undefined> {
	const filename = getCacheFilename(options)
	const cacheDir = await getCacheDirectoryPath(ContextProxy.instance.globalStorageUri.fsPath)
	const filePath = path.join(cacheDir, filename)
	const exists = await fileExistsAtPath(filePath)
	return exists ? JSON.parse(await fs.readFile(filePath, "utf8")) : undefined
}

This readModels() function is defined but never called anywhere. The disk cache reading is done synchronously in getModelsFromCacheWithKey() using fsSync.readFileSync() (lines 406-408). Consider removing this dead code, or if it's intended for future async disk cache reading, add a comment explaining the planned use case.
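
For context, a minimal sketch of the synchronous disk read the comment refers to; the function name, signature, and ModelRecord stand-in below are illustrative, not the PR's actual getModelsFromCacheWithKey():

```typescript
import * as fsSync from "fs"
import * as path from "path"

type ModelRecord = Record<string, unknown> // stand-in for the real model map type

function readModelsFromDiskSync(cacheDir: string, filename: string): ModelRecord | undefined {
	const filePath = path.join(cacheDir, filename)
	if (!fsSync.existsSync(filePath)) {
		return undefined
	}
	// Synchronous read, mirroring what the reviewer says getModelsFromCacheWithKey() does.
	return JSON.parse(fsSync.readFileSync(filePath, "utf8"))
}
```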


@hannesrudolph added the Issue/PR - Triage label on Jan 6, 2026

Labels

Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels.

Projects

Status: Triage

Development

Successfully merging this pull request may close these issues.

[BUG] Mixing models between configuration profiles

3 participants