[BUG]: optimize_data_file=True does not persist inline_data replacement - CSV sent on every LLM call #4012

@osushinekotan

Description

Describe the bug
When optimize_data_file=True is set on a custom CodeExecutor (extending BaseCodeExecutor), the inline_data (CSV file) is not permanently replaced with the text placeholder "Available file: 'xxx.csv'". Instead, the original inline_data is restored on every subsequent LLM call, causing the full CSV to be sent to the LLM every turn.

This means the token-saving feature of optimize_data_file does not work as intended.

To Reproduce

  1. Create a custom CodeExecutor extending BaseCodeExecutor with optimize_data_file=True
  2. Upload a CSV file and send a message to the agent
  3. Observe debug logs showing inline_data being replaced with "Available file: ..." in _extract_and_replace_inline_files
  4. On the next turn, observe that inline_data has returned (the full CSV is sent again)

Debug logs show:

=== AFTER _extract_and_replace_inline_files ===
[0][1] replaced with: Available file: `data_1_2.csv`

=== LLM Request Contents (in before_model_callback) ===
[0][1] inline_data: mime=text/csv, size=61194   ← inline_data is back!

Additionally, comparing object IDs reveals that llm_request.contents[0] is a different object between _extract_and_replace_inline_files and before_model_callback:

=== _extract_and_replace_inline_files ===
contents[0] id: 5998228768   ← replaced this object

=== before_model_callback ===
contents[0] id: 4631206624   ← different object!
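The differing object ids are consistent with `copy.deepcopy` semantics. A minimal self-contained sketch (plain dicts standing in for the ADK content types, which are not reproduced here) shows why a mutation made on the copy never reaches the original:

```python
import copy

# Plain-dict stand-in for an event's content; the real ADK types differ.
original = {"parts": [{"inline_data": {"mime": "text/csv", "data": b"a,b\n1,2\n"}}]}

# Rebuilding a request deep-copies the content, yielding a new object graph...
rebuilt = copy.deepcopy(original)
print(id(original) != id(rebuilt))  # True: different ids, as in the logs above

# ...so replacing inline_data on the copy leaves the original untouched.
rebuilt["parts"][0] = {"text": "Available file: `data_1_2.csv`"}
print("inline_data" in original["parts"][0])  # True: the CSV is still in the original
```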

Expected behavior
Once _extract_and_replace_inline_files replaces inline_data with "Available file: 'xxx.csv'", this replacement should persist for all subsequent LLM calls in the session. The full CSV should not be sent to the LLM on every turn.
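One way to visualize the expected persistence (a sketch with hypothetical plain-dict stand-ins, not a proposed patch): if the placeholder were applied to the session-level content rather than only to a per-turn copy, every rebuilt request would keep it:

```python
import copy

# Hypothetical stand-in for session.events; the real ADK types differ.
session_events = [{"content": {"parts": [{"inline_data": {"mime": "text/csv"}}]}}]

# Expected: the replacement reaches the source of truth, not just a copy.
session_events[0]["content"]["parts"][0] = {"text": "Available file: `data_1_2.csv`"}

# Any later rebuild of the request contents now preserves the placeholder.
contents = [copy.deepcopy(e["content"]) for e in session_events]
print(contents[0]["parts"][0])  # {'text': 'Available file: `data_1_2.csv`'}
```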

Root Cause (Hypothesis)
Based on code analysis, the suspected cause is:

  1. _code_execution.request_processor yields Events (e.g. for "Processing input file" and code_execution_result)
  2. These Events do not satisfy is_final_response(), so the while True loop in run_async() (base_llm_flow.py:367) continues
  3. _run_one_step_async() creates a fresh LlmRequest() on each iteration (base_llm_flow.py:383)
  4. contents.request_processor rebuilds llm_request.contents by calling copy.deepcopy(event.content) on each event in session.events (contents.py:444)
  5. Because the original session.events[0].content was never modified (only the deep copy in llm_request.contents was), the inline_data is restored on every rebuild
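The steps above can be simulated with plain data structures (hypothetical names, not the actual ADK classes): each "turn" rebuilds the request contents by deep-copying the unmodified session events, so the replacement made on the previous turn's copy is lost:

```python
import copy

# Hypothetical stand-ins for session.events and llm_request.contents.
session_events = [{"content": {"parts": [{"inline_data": {"mime": "text/csv", "size": 61194}}]}}]

def build_llm_request_contents(events):
    # Mirrors step 4: deep-copy each event's content into a fresh request.
    return [copy.deepcopy(e["content"]) for e in events]

# Turn 1: the replacement lands on the per-request copy only.
contents = build_llm_request_contents(session_events)
contents[0]["parts"][0] = {"text": "Available file: `data_1_2.csv`"}

# Turn 2: contents are rebuilt from the never-modified session events,
# so the full inline_data reappears (step 5).
contents = build_llm_request_contents(session_events)
print("inline_data" in contents[0]["parts"][0])  # True: the CSV is back
```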

Desktop (please complete the following information):

  • OS: macOS
  • Python version: 3.12
  • ADK version: 1.21.0

Model Information:

  • Are you using LiteLLM: Yes
  • Which model is being used: Claude (via Bedrock)

Labels

core [Component]: This issue is related to the core interface and implementation
