@martysai martysai commented Feb 3, 2026

Summary

Extends PR #8 to support multiple Qwen Coder models with model-specific configurations.

Changes

  • Model Configuration Registry (src/training/models.py)

    • SamplingConfig and ModelConfig dataclasses
    • Pre-configured support for Qwen 2.5 Coder (7B, 14B, 32B) and Qwen3 Coder 30B-A3B (MoE)
    • Model-specific sampling parameters (Qwen3 uses temp=1.0, top_p=0.95)
  • API Enhancements (src/training/serve.py)

    • Per-request model selection via model parameter
    • New /models endpoint to list available configurations
    • Health endpoint now reports active model info
    • Model-specific keep_alive settings (300s for MoE models)
  • CLI Improvements (scripts/run_ollama.py)

    • --model-key for registry-based model selection
    • --list-models to display available models
    • Model-specific sampling parameters in requests
  • Cross-Platform Support

    • .gitattributes for consistent line endings
    • Windows PowerShell/CMD examples in documentation
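
As a rough illustration of the registry and the model-specific request parameters listed above, the following sketch shows one way the pieces could fit together. Only the names SamplingConfig and ModelConfig, the Qwen3 sampling values (temp=1.0, top_p=0.95), and the 300s keep_alive for MoE models come from this PR. The field names, the Ollama tags, the Qwen 2.5 sampling values, and the `build_generate_payload` helper are assumptions made for the example and may not match src/training/models.py.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class SamplingConfig:
    temperature: float
    top_p: float


@dataclass(frozen=True)
class ModelConfig:
    key: str
    ollama_tag: str                      # assumed field: tag passed to Ollama
    architecture: str                    # "Dense" or "MoE"
    sampling: SamplingConfig
    keep_alive_s: Optional[int] = None   # None means "use the server default"


# Hypothetical registry with two of the four models from the PR.
MODELS: Dict[str, ModelConfig] = {
    "qwen2.5-coder-7b": ModelConfig(
        key="qwen2.5-coder-7b",
        ollama_tag="qwen2.5-coder:7b",   # assumed tag
        architecture="Dense",
        # Placeholder values; the PR does not state the Qwen 2.5 settings.
        sampling=SamplingConfig(temperature=0.7, top_p=0.8),
    ),
    "qwen3-coder-30b": ModelConfig(
        key="qwen3-coder-30b",
        ollama_tag="qwen3-coder:30b",    # assumed tag
        architecture="MoE",
        sampling=SamplingConfig(temperature=1.0, top_p=0.95),  # from the PR
        keep_alive_s=300,                                      # from the PR
    ),
}


def build_generate_payload(model_key: str, prompt: str) -> Dict[str, Any]:
    """Build a request body for Ollama's /api/generate using the registry entry."""
    cfg = MODELS[model_key]  # raises KeyError for unknown keys
    payload: Dict[str, Any] = {
        "model": cfg.ollama_tag,
        "prompt": prompt,
        "stream": False,
        "options": {
            "temperature": cfg.sampling.temperature,
            "top_p": cfg.sampling.top_p,
        },
    }
    if cfg.keep_alive_s is not None:
        payload["keep_alive"] = cfg.keep_alive_s  # seconds
    return payload
```

Keeping sampling values in the registry rather than in the caller means that a request selecting a different model picks up that model's parameters without further changes.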

Available Models

| Model Key | Architecture | Memory (Q4) | Context |
|---|---|---|---|
| qwen2.5-coder-32b | Dense | ~18GB | 32K |
| qwen2.5-coder-14b | Dense | ~8GB | 32K |
| qwen2.5-coder-7b | Dense | ~4GB | 32K |
| qwen3-coder-30b | MoE (3.3B active) | ~18GB | 256K |
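
For a sense of what a listing such as --list-models might print, the sketch below renders the table above as aligned plain-text columns. The row data is copied from the table. The `format_model_table` function and its layout are hypothetical and are not taken from scripts/run_ollama.py.

```python
from typing import Sequence, Tuple

Row = Tuple[str, str, str, str]

HEADERS: Row = ("Model Key", "Architecture", "Memory (Q4)", "Context")

# Values copied from the Available Models table.
ROWS: Sequence[Row] = (
    ("qwen2.5-coder-32b", "Dense", "~18GB", "32K"),
    ("qwen2.5-coder-14b", "Dense", "~8GB", "32K"),
    ("qwen2.5-coder-7b", "Dense", "~4GB", "32K"),
    ("qwen3-coder-30b", "MoE (3.3B active)", "~18GB", "256K"),
)


def format_model_table(rows: Sequence[Row] = ROWS) -> str:
    """Return the header and rows as left-aligned, space-padded columns."""
    table = [HEADERS, *rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(HEADERS))]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_model_table())
```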

Test plan

  • 20 new tests for models module
  • 10 new tests for model selection in serve
  • All 48 tests passing
  • Manual testing with Ollama
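
The model-selection tests are not shown in this description. As a hedged sketch of what such tests could look like, the example below defines a stand-in `resolve_model_key` function and three pytest-style tests for it. The function name, the `MODEL_KEY` environment variable, the default key, and the precedence order (request, then environment, then default) are all assumptions; only the four model keys come from this PR.

```python
from typing import Mapping, Optional

KNOWN_MODEL_KEYS = frozenset({
    "qwen2.5-coder-32b",
    "qwen2.5-coder-14b",
    "qwen2.5-coder-7b",
    "qwen3-coder-30b",
})
DEFAULT_MODEL_KEY = "qwen2.5-coder-7b"  # assumed default
ENV_VAR = "MODEL_KEY"                   # hypothetical variable name


def resolve_model_key(requested: Optional[str], env: Mapping[str, str]) -> str:
    """Pick a model key: request value first, then environment, then default."""
    key = requested or env.get(ENV_VAR) or DEFAULT_MODEL_KEY
    if key not in KNOWN_MODEL_KEYS:
        raise ValueError(f"unknown model key: {key!r}")
    return key


def test_request_overrides_environment() -> None:
    env = {ENV_VAR: "qwen2.5-coder-14b"}
    assert resolve_model_key("qwen3-coder-30b", env) == "qwen3-coder-30b"


def test_environment_used_when_request_omits_model() -> None:
    env = {ENV_VAR: "qwen2.5-coder-14b"}
    assert resolve_model_key(None, env) == "qwen2.5-coder-14b"


def test_unknown_key_is_rejected() -> None:
    try:
        resolve_model_key("not-a-model", {})
    except ValueError:
        return
    raise AssertionError("expected ValueError for an unknown model key")
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the tests independent of the process environment.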

🤖 Generated with Claude Code

- Add model configuration registry (src/training/models.py) with
  SamplingConfig and ModelConfig dataclasses
- Support Qwen 2.5 Coder (7B, 14B, 32B) and Qwen3 Coder 30B-A3B (MoE)
- Add per-request model selection via API and environment variable
- Apply model-specific sampling parameters (Qwen3 uses temp=1.0, top_p=0.95)
- Add /models endpoint to list available configurations
- Update health endpoint to report active model info
- Add --model-key and --list-models CLI options
- Add .gitattributes for cross-platform line endings
- Add Windows PowerShell/CMD examples in documentation
- Add 30 new tests (20 for models, 10 for serve model selection); all 48 tests pass

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@martysai martysai merged commit bd38d1f into qwen2.5-coder Feb 3, 2026
@martysai martysai deleted the ecstatic-mclaren branch February 3, 2026 17:26