Runner retry idempotency for persisted user events #4527
davidahmann started this conversation in General
Replies: 0 comments
Problem observed
Runner retry behavior can append duplicate user events for the same logical invocation when retries happen after partial persistence. This creates divergent session histories where one user turn appears multiple times, even though only one logical request occurred. For replay and audit use cases, duplicate append behavior is a direct contract risk.
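To make the failure mode concrete, here is a toy model of it. This is only a sketch: `Event`, `Session`, and `run_once` are hypothetical stand-ins written for illustration, not ADK's actual runner or session service, and the injected failure simply simulates a crash after the user event has already been stored.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    invocation_id: str  # identifies one logical request (hypothetical field layout)
    author: str
    text: str


@dataclass
class Session:
    events: list = field(default_factory=list)


def run_once(session: Session, invocation_id: str, text: str,
             fail_after_persist: bool) -> None:
    """Toy runner step: persist the user event, then do the remaining work."""
    session.events.append(Event(invocation_id, "user", text))
    if fail_after_persist:
        # Simulates a transient failure after partial persistence.
        raise RuntimeError("transient failure after user event was stored")
    session.events.append(Event(invocation_id, "agent", "ok"))


session = Session()
try:
    run_once(session, "inv-1", "hello", fail_after_persist=True)
except RuntimeError:
    # Naive retry of the same logical invocation appends the user event again.
    run_once(session, "inv-1", "hello", fail_after_persist=False)

user_turns = [e for e in session.events if e.author == "user"]
print(len(user_turns))  # 2: one logical request, two persisted user events
```

The history now contains two user events for a single invocation id, which is the divergence described above.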
Why it matters operationally
ADK users rely on persisted event history for replay, regression analysis, and operational receipts. Duplicate user events inflate histories, confuse downstream evaluators, and make automated checks treat retried invocations as separate user turns. This undermines deterministic runner/session behavior and increases manual debugging during incident triage.
Minimal repro
Fix approach
The patch adds a duplicate guard in runner message handling so user-event append for retry/resume paths becomes idempotent for the same invocation context. It also extends unit coverage to include the state_delta path, ensuring the guard applies consistently across resumed flows and does not regress regular first-attempt persistence behavior.
Validation evidence
uv run pytest tests/unittests/test_runners.py -k 'duplicate or retry or state_delta' passed.
./autoformat.sh was invoked; local environment reported missing pyink tool.
Open follow-up question for maintainers
Do you want this idempotency behavior documented as a strict runner contract guarantee in public API docs, or kept as an internal persistence invariant for now?
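If it helps frame that question, the invariant could be stated roughly as the check below. This is a minimal sketch under stated assumptions, not the code in the patch: `Event`, `Session`, and `append_user_event_once` are hypothetical names, and keying the guard on invocation id plus author plus message text is an assumption about what "same invocation context" means.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    invocation_id: str  # hypothetical field layout, for illustration only
    author: str
    text: str


@dataclass
class Session:
    events: list = field(default_factory=list)


def append_user_event_once(session: Session, event: Event) -> bool:
    """Append a user event unless this invocation already stored the same one.

    Returns True if the event was appended, False if it was skipped.
    """
    for existing in session.events:
        if (existing.author == "user"
                and existing.invocation_id == event.invocation_id
                and existing.text == event.text):
            return False  # retry/resume of the same invocation: no-op
    session.events.append(event)
    return True


session = Session()
first = append_user_event_once(session, Event("inv-1", "user", "hello"))
retry = append_user_event_once(session, Event("inv-1", "user", "hello"))
next_turn = append_user_event_once(session, Event("inv-2", "user", "hello"))

# The retry is skipped; a new invocation with the same text is still recorded.
assert (first, retry, next_turn) == (True, False, True)
assert len(session.events) == 2
```

Documenting this as a public guarantee would mean callers can retry an invocation without first checking the session history; keeping it internal leaves room to change the dedupe key later.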
This contribution was informed by patterns from Gait: https://github.com/Clyra-AI/gait