@Mattsface
Collaborator

Why

Migrate from Python dataclasses to Pydantic v2 for:

  • Built-in data validation (no more silent failures)
  • Automatic nested model instantiation (remove manual __post_init__)
  • Pythonic snake_case field names (while maintaining API compatibility via aliases)
  • Better IDE support and type safety

What

Complete migration of 150+ model classes across all modules:

  • Added MLBBaseModel base class with extra="ignore", populate_by_name=True
  • Renamed all fields from camelCase to snake_case (e.g., gamepk -> game_pk)
  • Added Field(alias="...") for API compatibility
  • Replaced __post_init__ with Pydantic's automatic model instantiation
  • Added field_validators to handle empty dicts {} from API as None
  • Fixed type mismatches discovered during migration (e.g., stolenbasepercentage is str not int)
  • Class renames: Gamepace -> GamePace, Homerunderby -> HomeRunDerby
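The base-class pattern listed above can be sketched roughly as follows, assuming Pydantic v2. `MLBBaseModel` and its two config options come from this PR. `ExampleGame`, the `"gamePk"` alias, and the sample values are illustrative assumptions, not the library's actual definitions.

```python
from pydantic import BaseModel, ConfigDict, Field


class MLBBaseModel(BaseModel):
    # Drop unknown keys from the API and accept either the alias or the field name.
    model_config = ConfigDict(extra="ignore", populate_by_name=True)


class ExampleGame(MLBBaseModel):
    # Hypothetical model; "gamePk" stands in for the key the API sends.
    game_pk: int = Field(alias="gamePk")


# API-style payload: populated through the alias, unknown key is ignored.
from_api = ExampleGame.model_validate({"gamePk": 12345, "someNewKey": "ignored"})

# Python-style construction: populated through the snake_case field name.
by_name = ExampleGame(game_pk=12345)
```

Both objects expose the value as `game_pk`; the camelCase key is only used on input.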

103 files changed, 5,196 insertions, 5,245 deletions

Tests

  • ✅ All 186 tests pass (2 skipped)
  • ✅ Updated tests to use new snake_case field names
  • ✅ Added test_base_model.py for MLBBaseModel behavior

Risk and impact

Risk Level: Normal

  • Breaking change: all public field names change from camelCase to snake_case
  • Users will get AttributeError if using old field names after upgrade
  • Requires major version bump and migration guide

Add pydantic to project dependencies and refresh lockfile to support upcoming model migration to Pydantic v2.

- Introduce MLBBaseModel (Pydantic v2) with extra="ignore" and populate_by_name
- Export MLBBaseModel from models package
- Add unit tests to verify ignoring unknown fields and alias population

BREAKING CHANGES:
- All model field names are now snake_case (e.g., spring_league instead of springleague)
- Class names standardized to PascalCase (e.g., TeamRecords instead of Teamrecords)
- Models now raise ValidationError instead of TypeError for missing required fields
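For callers updating their error handling, the change in exception type might look like the sketch below. `ExampleSport` is a hypothetical model, not one from this PR. In Pydantic v2, `ValidationError` derives from `ValueError`, so an existing `except TypeError` block will not catch it.

```python
from pydantic import BaseModel, ValidationError


class ExampleSport(BaseModel):
    # Hypothetical model with a single required field.
    id: int


caught = None
missing = []

try:
    ExampleSport()  # required field "id" is not supplied
except ValidationError as exc:
    caught = "ValidationError"
    # Each error entry records the location of the offending field.
    missing = [error["loc"] for error in exc.errors()]
```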

Models converted:
- Sport, Season
- Venue, Location, TimeZone, FieldInfo, VenueDefaultCoordinates
- League, LeagueRecord
- Division
- Team, TeamRecord, Record, OverallLeagueRecord, TypeRecords, DivisionRecords, LeagueRecords, Records
- Standings, TeamRecords, Streak
- Attendance, AttendanceRecords, AttendanceTotals, AttendanceHighLowGame, AttendanceGameType

Key improvements:
- All models inherit from MLBBaseModel with extra="ignore" (handles new API fields gracefully)
- Field aliases maintain API compatibility (e.g., Field(alias="springleague"))
- populate_by_name=True allows both old and new names in constructors
- Fixed several type mismatches (season: int, active: bool, elevation: int, etc.)
- Updated all affected tests to use new field names

Models converted:
- people/attributes: BatSide, PitchHand, Position, Status, Home, School
- people/people: Person, Player, Coach, Batter, Pitcher, DraftPick
- data/data: PitchBreak, PitchCoordinates, PitchData, HitCoordinates,
  HitData, CodeDesc, Violation, Count, PlayDetails

Key changes:
- All fields use snake_case with aliases for API compatibility
- Player uses model_validator to handle position -> primary_position mapping
- PlayDetails uses field_validator to convert empty dicts to None
- Fixed type mismatches: current_team is Team (not str), launch_angle is float
- Updated person tests to use new field names
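One way the position -> primary_position mapping could be expressed with a `model_validator` is sketched below. This is an assumption about the approach, not the PR's actual code: `ExamplePlayer`, the `"primaryPosition"` alias, and the `dict` field type are placeholders.

```python
from typing import Any, Optional

from pydantic import BaseModel, ConfigDict, Field, model_validator


class ExamplePlayer(BaseModel):
    model_config = ConfigDict(extra="ignore", populate_by_name=True)

    # Hypothetical field; the real model presumably uses a Position model here.
    primary_position: Optional[dict] = Field(default=None, alias="primaryPosition")

    @model_validator(mode="before")
    @classmethod
    def _map_position(cls, data: Any) -> Any:
        # If the payload has "position" but no primary position, copy it across.
        if isinstance(data, dict) and "position" in data:
            if "primaryPosition" not in data and "primary_position" not in data:
                data = {**data, "primaryPosition": data["position"]}
        return data


player = ExamplePlayer.model_validate({"position": {"code": "1"}})
```

Running the validator in `mode="before"` lets it reshape the raw payload before field validation, and building a new dict avoids mutating the caller's input.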

Models converted:

game/ (41 classes):
- Game, MetaData, GameData, GameDataGame, GameDatetime, GameStatus
- GameTeams, GameWeather, GameInfo, ReviewInfo, GameReview, GameFlags
- GameProbablePitchers, MoundVisits, LiveData, GameDecisions, GameLeaders
- Plays, Play, PlayAbout, PlayResult, PlayReviewDetails, PlayMatchup
- PlayMatchupSplits, PlayEvent, PlayRunner, RunnerMovement, RunnerDetails
- RunnerCredits, PlayByInning, PlayByInningHits, HitsByTeam, HitCoordinates
- BoxScore, TopPerformer, BoxScoreTeam, BoxScoreTeams, BoxScoreOfficial
- BoxScoreGameStatus, PlayersDictPerson, BoxScoreVL, BoxScoreTeamInfo
- Linescore, LinescoreTeamScoring, LinescoreInning, LinescoreTeams
- LinescoreOffense, LinescoreDefense

gamepace/ (3 classes):
- GamePace, GamePaceData, PrPortalCalculatedFields

homerunderby/ (11 classes):
- HomeRunDerby, Info, EventType, Status, Round, Matchup, Seed
- Hits, HitData, Coordinates, TrajectoryData

Key changes:
- All fields use snake_case with aliases for API compatibility
- Class names standardized to PascalCase (Gamepace -> GamePace, etc.)
- Field validators handle empty dicts from API as None
- Fixed type mismatches: TrajectoryData uses float, Coordinates optional
- Updated mlb_api.py imports and return types
- Updated all related tests to use new class/field names
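The empty-dict handling mentioned above might be written along these lines. `ExampleCoordinates` and `ExampleHit` are hypothetical stand-ins for the real models, and the validator name is an assumption.

```python
from typing import Any, Optional

from pydantic import BaseModel, field_validator


class ExampleCoordinates(BaseModel):
    x: float
    y: float


class ExampleHit(BaseModel):
    coordinates: Optional[ExampleCoordinates] = None

    @field_validator("coordinates", mode="before")
    @classmethod
    def _empty_dict_to_none(cls, value: Any) -> Any:
        # Treat an empty {} from the API as "no data" rather than an invalid model.
        if isinstance(value, dict) and not value:
            return None
        return value


empty = ExampleHit.model_validate({"coordinates": {}})
filled = ExampleHit.model_validate({"coordinates": {"x": 1.5, "y": 2.0}})
```

Without the validator, `{}` would fail validation because `x` and `y` are required. With it, `empty.coordinates` is `None`, while `filled.coordinates` is built as a nested `ExampleCoordinates` instance with no `__post_init__` needed.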

Models converted:

schedules/ (5 classes):
- Schedule, ScheduleDates, ScheduleGames, ScheduleHomeAndAway, ScheduleGameTeam

drafts/ (4 classes):
- Round, DraftPick, DraftHome, DraftSchool

awards/ (2 classes):
- Awards, Award

Key changes:
- All fields use snake_case with aliases for API compatibility
- Class names: Home -> DraftHome, School -> DraftSchool (avoid conflicts)
- Fixed optional fields: DraftSchool.city/state/school_class, DraftHome.state
- Field validators handle empty dicts from API as None
- Updated all related tests to use new Pythonic field names
- Changed TypeError expectations to ValidationError in tests

Models converted (8 files, 80+ classes):

stats.py (base classes):
- Stat, Split, PitchArsenalSplit, ExpectedStatistics, Sabermetrics
- ZoneCodes, Zones, HotColdZones, Chart, SprayCharts
- PitchArsenal, OutsAboveAverage, PlayerGameLogStat

hitting.py:
- SimpleHittingSplit, AdvancedHittingSplit, HittingPlay
- HittingWinLoss, HittingHomeAndAway, HittingCareer, HittingSeason
- HittingGameLog, HittingPlayLog, HittingPitchLog, HittingByMonth
- HittingVsTeam, HittingVsPlayer, HittingExpectedStatistics, etc.

pitching.py:
- SimplePitchingSplit, AdvancedPitchingSplit, PitchingPlay
- PitchingSeason, PitchingCareer, PitchingGameLog, PitchingLog
- PitchingByMonth, PitchingHomeAndAway, PitchingWinLoss
- PitchingVsTeam, PitchingVsPlayer, PitchingRankings, etc.

fielding.py:
- SimpleFieldingSplit, FieldingSeason, FieldingCareer
- FieldingHomeAndAway, FieldingGameLog, FieldingByMonth, etc.

catching.py:
- SimpleCatchingSplit, CatchingSeason, CatchingCareer
- CatchingGameLog, CatchingHomeAndAway, CatchingWinLoss, etc.

running.py:
- RunningOpponentsFaced

game.py:
- SimpleGameStats, SeasonGame, CareerGame
- CareerRegularSeasonGame, CareerPlayoffsGame

Key changes:
- All fields use snake_case with aliases for API compatibility
- Fixed type mismatches: stolenbasepercentage, groundoutstoairouts,
  atbatsperhomerun, whiffpercentage are strings not ints
- Game model fields made optional for stats endpoint compatibility
- Field validators handle empty dicts from API as None
- Updated all stats tests to use new Pythonic field names
  (totalsplits -> total_splits)
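To illustrate why these fields are typed as strings, the sketch below validates a sample value against both annotations. The value ".800" is an assumed example of how the API formats such a percentage, and both model classes are hypothetical.

```python
from pydantic import BaseModel, ValidationError


class WrongType(BaseModel):
    # The annotation the old dataclass effectively claimed.
    stolen_base_percentage: int


class RightType(BaseModel):
    # The annotation after the fix.
    stolen_base_percentage: str


payload = {"stolen_base_percentage": ".800"}  # assumed sample value

try:
    WrongType.model_validate(payload)
    int_accepted = True
except ValidationError:
    # ".800" cannot be parsed as an integer, so validation fails loudly.
    int_accepted = False

parsed = RightType.model_validate(payload)
```

A plain dataclass stores the string silently under an `int` annotation, which is how mismatches like this went unnoticed before the migration.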

- Add "Working with Pydantic Models" section with model_dump() and
  model_dump_json() examples
- Update all examples to use snake_case field names
- Simplify examples using f-strings and direct field access
- Fix typos and clean up language
- Add note about Pydantic in Getting Started section
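A short sketch of the serialization calls the new documentation section covers, assuming Pydantic v2. `ExampleTeam`, its alias, and the sample value are hypothetical.

```python
from pydantic import BaseModel, ConfigDict, Field


class ExampleTeam(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    # Hypothetical field with a camelCase alias for the API key.
    team_name: str = Field(alias="teamName")


team = ExampleTeam.model_validate({"teamName": "Mariners"})

as_dict = team.model_dump()                   # keys are snake_case field names
as_api_dict = team.model_dump(by_alias=True)  # keys are the original aliases
as_json = team.model_dump_json()              # JSON string with field names
```

Passing `by_alias=True` is useful when the output needs to match the shape of the original API payload.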
@Mattsface Mattsface force-pushed the refactor/pydantic-model-migration branch from 6af2185 to 0734694 Compare January 13, 2026 06:54
@Mattsface Mattsface merged commit 7004de7 into main Jan 13, 2026
3 checks passed