generated from MIT-Emerging-Talent/ET6-CDSP-starter
Labels: documentation (Improvements or additions to documentation)
Description
Summary
Standardize our model testing on a single Apollo 11 source text, with 15 prompts covering summarization, reasoning, and RAG tasks.
Proposal
Source Text: Excerpts from the Wikipedia article on Apollo 11 (“Lunar Landing” and “Lunar Surface Operations” sections), cited by permanent link (~1,400 words, CC BY-SA 3.0)
15 Test Prompts:
- 5 Summarization (easy → hard)
- 5 Reasoning (causal, analytical, hypothetical)
- 5 RAG (fact retrieval with ground truth answers)
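The 5/5/5 split above can be checked mechanically once the prompts are written. The following is a minimal sketch, not the team's agreed format: the `PromptCase` fields, the category names, and the placeholder prompt text are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

# Assumed category labels; the proposal names the three task types
# but does not fix identifiers for them.
CATEGORIES = ("summarization", "reasoning", "rag")
PROMPTS_PER_CATEGORY = 5


@dataclass(frozen=True)
class PromptCase:
    """One test prompt (hypothetical record layout)."""
    prompt_id: str
    category: str
    text: str
    ground_truth: Optional[str] = None  # only RAG prompts carry one


def validate_suite(prompts: List[PromptCase]) -> List[str]:
    """Return a list of problems; an empty list means the suite matches the proposal."""
    problems = []
    counts = Counter(p.category for p in prompts)
    for category in CATEGORIES:
        if counts[category] != PROMPTS_PER_CATEGORY:
            problems.append(
                f"{category}: expected {PROMPTS_PER_CATEGORY}, got {counts[category]}"
            )
    for p in prompts:
        if p.category not in CATEGORIES:
            problems.append(f"{p.prompt_id}: unknown category {p.category!r}")
        if p.category == "rag" and not p.ground_truth:
            problems.append(f"{p.prompt_id}: RAG prompt is missing a ground truth answer")
    return problems


def placeholder_suite() -> List[PromptCase]:
    """Build 15 placeholder prompts; the real prompt text is not part of this sketch."""
    prompts = []
    for category in CATEGORIES:
        for n in range(1, PROMPTS_PER_CATEGORY + 1):
            prompts.append(
                PromptCase(
                    prompt_id=f"{category}-{n}",
                    category=category,
                    text=f"<{category} prompt {n} goes here>",
                    ground_truth="<expected answer>" if category == "rag" else None,
                )
            )
    return prompts


if __name__ == "__main__":
    print(validate_suite(placeholder_suite()) or "suite OK")
```

Returning a list of problems rather than raising on the first one makes it easier to fix a draft suite in a single pass.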
Why Apollo 11?
- Works for all model types (DistilBERT, SLMs, commercial)
- Fact-dense for RAG testing
- Properly licensed and reproducible
- Hardware-friendly length (~1,400 words fits on modest hardware)
Open Questions for Team
Prompt format: JSON for automation or plain text for simplicity?
Text length: Is 1,400 words optimal, or should we go shorter or longer (for example, full sections)?
Multiple sources: Start with one text or prepare multiple examples?
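To help with the first two questions, here is a rough sketch showing that a JSON record can be rendered as plain text on demand, so the two formats need not be mutually exclusive, along with a simple word-count check for the source excerpt. The field names, the `to_plain_text` layout, and the 10% length tolerance are illustrative assumptions, not decisions the team has made.

```python
import json

# Hypothetical prompt record; field names and placeholder values are assumptions.
record = {
    "id": "rag-1",
    "category": "rag",
    "prompt": "<question about the source text>",
    "ground_truth": "<expected answer>",
}


def to_plain_text(rec: dict) -> str:
    """Render one JSON-style record as a human-readable plain-text block."""
    lines = [f"[{rec['id']}] ({rec['category']})", rec["prompt"]]
    if rec.get("ground_truth"):
        lines.append(f"Expected: {rec['ground_truth']}")
    return "\n".join(lines)


def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def within_length_budget(text: str, target: int = 1400, tolerance: float = 0.1) -> bool:
    """Check that the text is within +/- tolerance of the target word count."""
    low = target * (1 - tolerance)
    high = target * (1 + tolerance)
    return low <= word_count(text) <= high


if __name__ == "__main__":
    print(json.dumps(record, indent=2))  # machine-readable form for automation
    print(to_plain_text(record))         # same record, plain-text form for reading
```

If JSON is chosen as the stored format, the plain-text view can be generated whenever someone wants to read or review the prompts by hand.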
Next Steps
- Please review detailed documentation here
- Discuss and give feedback on the open questions
- Implement after approval