@RexBearIU (Collaborator) commented Jan 16, 2026

Description

This pull request significantly updates and modernizes the knowledge distillation tutorial for MaxText, aligning it with current best practices and tooling. The guide now uses Qwen3-32B as the teacher model (via vLLM) and Llama-3.1-8B as the student, streamlines the setup with Hyperdisk storage, and provides new scripts and commands for dataset generation and fine-tuning. The instructions have been clarified, unnecessary conversion steps removed for the teacher, and the fine-tuning process updated for the latest MaxText and vLLM workflows.
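
For context, here is a minimal sketch of the dataset-generation step the updated tutorial describes: prompting the Qwen3-32B teacher through vLLM's OpenAI-compatible server and writing prompt/completion pairs to JSONL for student fine-tuning. The endpoint URL, prompt list, and output schema below are illustrative assumptions, not the tutorial's actual script.

```python
# Hypothetical sketch of teacher-side dataset generation for distillation.
# Assumes a vLLM server is already running, e.g.:  vllm serve Qwen/Qwen3-32B
import json
from openai import OpenAI

# vLLM exposes an OpenAI-compatible API; the URL and key are assumptions.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Placeholder prompts; the real tutorial would draw these from a dataset.
prompts = [
    "Explain knowledge distillation in one paragraph.",
    "Summarize the difference between a teacher and a student model.",
]

with open("distillation_dataset.jsonl", "w") as f:
    for prompt in prompts:
        resp = client.chat.completions.create(
            model="Qwen/Qwen3-32B",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=512,
            temperature=0.7,
        )
        # Store prompt/completion pairs in a JSONL format that a
        # downstream fine-tuning step can consume.
        record = {
            "prompt": prompt,
            "completion": resp.choices[0].message.content,
        }
        f.write(json.dumps(record) + "\n")
```

The resulting JSONL file would then feed the Llama-3.1-8B fine-tuning step in MaxText, per the updated tutorial.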

Tests

Manually ran the distillation pipeline and monitored each step of the execution. Confirmed that the training loop completed and resources were released.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov bot commented Jan 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@RexBearIU force-pushed the jackyf/docs/distillation branch 2 times, most recently from 84aa2ed to eb215d2 on January 21, 2026 at 09:28
@RexBearIU force-pushed the jackyf/docs/distillation branch 4 times, most recently from d3bd2e7 to f813340 on January 23, 2026 at 09:25
@RexBearIU force-pushed the jackyf/docs/distillation branch from f813340 to 4b59129 on January 26, 2026 at 07:49