Skip to content

Comments

Python: Fixing issue #1366 - Thread corruption when max_iterations is reached.#4234

Merged
alliscode merged 7 commits intomicrosoft:mainfrom
alliscode:investigation/issue-1366-1771969306
Feb 25, 2026
Merged

Python: Fixing issue #1366 - Thread corruption when max_iterations is reached.#4234
alliscode merged 7 commits intomicrosoft:mainfrom
alliscode:investigation/issue-1366-1771969306

Conversation

@alliscode
Copy link
Member

This pull request fixes a critical bug in the agent framework where exhausting max_iterations during tool invocation could corrupt the thread state by skipping a final model call and losing conversation history. The fix ensures that, when the iteration limit is reached, a final model call is made with tool_choice="none" and all accumulated messages are preserved, restoring thread integrity. The changes are minimal, well-tested, and follow existing code patterns.

Bug fix in agent tool invocation loop:

  • Updated python/packages/core/agent_framework/_tools.py to remove the premature return response when max_iterations is reached in both non-streaming and streaming paths. Instead, the code now logs the event and falls through to the existing failsafe logic, making a final model call with tool_choice="none" and prepending all accumulated function call/result messages (fcc_messages). This ensures a clean text response and complete conversation history. [1] [2]

  • Refactored related conditions and log messages for consistency and clarity, matching the style of the max_function_calls limit handling. [1] [2]

Test coverage improvements:

  • Removed unnecessary @pytest.mark.skip decorators from test_max_iterations_limit and test_streaming_max_iterations_limit in test_function_invocation_logic.py, re-enabling these tests to validate the fixed behavior. [1] [2]

  • Added a new targeted test file (test_issue_1366_thread_corruption.py) to cover orphaned function call detection, failsafe invocation, message preservation, and full agent run thread integrity.

Documentation and quality review:

  • Included comprehensive issue fix, reproduction, and quality review reports documenting the root cause, fix rationale, test evidence, and code quality assessment. [1] [2] [3]

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

alliscode and others added 5 commits February 24, 2026 14:05
When the function invocation loop exhausts max_iterations while the model
keeps requesting tools, the failsafe code path (calling the model with
tool_choice='none' and prepending fcc_messages) was unreachable because
'if response is not None: return response' short-circuited before it.

The fix removes the premature return so the failsafe always runs after
loop exhaustion, making a final model call with tool_choice='none' to
produce a clean text answer and prepending accumulated fcc_messages from
prior iterations. This matches the existing pattern used by the error
threshold and max_function_calls paths.

Also unskips test_max_iterations_limit and test_streaming_max_iterations_limit
which were previously skipped with 'needs investigation in unified API'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…n.py

Apply ruff format to fix multi-line string concatenation and function call
formatting issues flagged by the linter.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 24, 2026 23:24
@github-actions github-actions bot changed the title Investigation/issue 1366 1771969306 Python: Investigation/issue 1366 1771969306 Feb 24, 2026
@markwallace-microsoft
Copy link
Member

markwallace-microsoft commented Feb 24, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/core/agent_framework
   _tools.py8769189%166–167, 322, 324, 342–344, 351, 369, 383, 390, 397, 413, 415, 422, 459, 484, 488, 505–507, 554–556, 578, 632, 654, 717–723, 759, 770–781, 803–805, 810, 814, 828–830, 869, 938, 948, 958, 1014, 1045, 1064, 1342, 1399, 1419, 1490–1494, 1616, 1620, 1644, 1670, 1672, 1688, 1690, 1775, 1805, 1825, 1827, 1880, 1943, 2134–2135, 2183, 2251–2252, 2310, 2315, 2322
TOTAL21740275787% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
4534 244 💤 0 ❌ 0 🔥 1m 16s ⏱️

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes a critical bug in the Python agent framework where exhausting max_iterations during tool invocation would return a response with orphaned function calls (FunctionCallContent without matching FunctionResultContent), corrupting thread state and causing subsequent API calls to fail with 400 errors.

Changes:

  • Fixed the function invocation loop to make a final model call with tool_choice="none" when max_iterations is exhausted, preventing orphaned function calls and ensuring complete conversation history
  • Re-enabled previously skipped tests for max_iterations behavior in both non-streaming and streaming modes
  • Added comprehensive regression tests specifically targeting thread corruption scenarios

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
python/packages/core/agent_framework/_tools.py Removed premature return when max_iterations exhausted; both non-streaming (lines 2195-2203) and streaming (lines 2334-2342) paths now fall through to failsafe logic that calls model with tool_choice="none" and preserves all messages
python/packages/core/tests/core/test_function_invocation_logic.py Removed duplicate @pytest.mark.skip decorators from max_iterations tests, re-enabling validation of the fixed behavior
python/packages/core/tests/core/test_issue_1366_thread_corruption.py Added new comprehensive test file with 4 tests covering orphaned function call detection, failsafe invocation, message preservation, and full agent thread integrity

…ock behavior in test

- Add explicit function_invocation_configuration['enabled'] check to the
  'Maximum iterations reached' log condition in both non-streaming and
  streaming paths, making intent clearer when function invocation is disabled.
- Add comment in test_thread_safe_after_max_iterations_with_agent explaining
  that the failsafe response (tool_choice='none') is provided automatically
  by the mock client, not from run_responses.
@alliscode alliscode changed the title Python: Investigation/issue 1366 1771969306 Python: Fixing issue #1366 - Thread corruption when max_iterations is reached. Feb 25, 2026
@alliscode alliscode enabled auto-merge February 25, 2026 00:37
- Remove issue microsoft#1366 references from _tools.py comments
- Move regression tests from standalone test_issue_1366_thread_corruption.py
  into test_function_invocation_logic.py alongside existing max_iterations tests
- Clean up test docstrings to describe behavior generically
- Delete the standalone issue-specific test file
@alliscode
Copy link
Member Author

alliscode commented Feb 25, 2026

Closes #1366

@alliscode alliscode added this pull request to the merge queue Feb 25, 2026
Merged via the queue into microsoft:main with commit 23fe2c1 Feb 25, 2026
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants