Open
Labels
- asset: tool (DEE Asset tagging - Tool.)
- priority: p2 (Moderately-important priority. Fix may not be included in next release.)
- type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
Description
Client Library Issue - The library fails to validate function_result input types, allowing invalid payloads to be sent to the model which causes silent failures/hallucinations.
Environment details
- Programming language: Python
- OS: Linux / macOS
- Language runtime version: Python 3.12.3
- Package version: google-genai 1.56.0
Steps to reproduce
- Initialize the google.genai client with the gemini-3-flash-preview model.
- Define a tool that returns a raw list of strings (e.g., ["a", "b"]).
- Start an interaction and call the tool.
- Send the tool output back to the model via client.interactions.create, passing the raw list directly into the result field of the function_result part (instead of wrapping it in a dictionary/struct).
- Observe that the SDK sends the request without error, but the model subsequently hallucinates values or the SDK crashes on the response.
import random

from google import genai

# Setup
API_KEY = "YOUR_API_KEY"
client = genai.Client(api_key=API_KEY)


# Tool Definition
def get_random_numbers():
    return [str(random.randint(1, 50)) for _ in range(10)]


random_numbers_tool = {
    "type": "function",
    "name": "get_random_numbers",
    "description": "Returns a list of 10 random numbers between 1 and 50 as strings.",
    "parameters": {"type": "object", "properties": {}, "required": []}
}

print("--- Testing Unwrapped List Result ---")
try:
    # 1. Init interaction
    interaction = client.interactions.create(
        model="gemini-3-flash-preview",
        input="Call get_random_numbers. Tell me the highest number.",
        tools=[random_numbers_tool]
    )

    # 2. Handle Tool Call
    for output in interaction.outputs:
        if output.type == "function_call":
            result = get_random_numbers()
            print(f"Tool Result Sent: {result}")

            # 3. Send back RAW LIST (The Issue)
            # The SDK allows this but the API likely expects a Struct/Dict
            interaction = client.interactions.create(
                model="gemini-3-flash-preview",
                previous_interaction_id=interaction.id,
                input=[{
                    "type": "function_result",
                    "name": output.name,
                    "call_id": output.id,
                    "result": result
                }]
            )
            if interaction.outputs:
                print(f"Response: {interaction.outputs[-1].text}")
except Exception as e:
    print(f"SDK Error: {e}")
Observed Results
When passing an unwrapped list, the model fails to parse the data correctly in ~70-100% of cases.
- Hallucinations: The model invents numbers not present in the list (e.g., claiming "99" is the max when the list max is 50).
- SDK Crashes: AttributeError: 'FunctionCallContent' object has no attribute 'text'.
Tool Result: ['39', '19', '15', '48', '35', '50', '4', '6', '10', '25']
Response: The list contains 25 numbers, and the highest number in the list is 99.
# FAILS: Max is 50. List size is 10. Model hallucinated "99" and "25 items".
Tool Result: ['21', '41', '48', '11', '49', '11', '20', '10', '19', '13']
Response: The tool returned 10 numbers: 18, 54, 92, 7, 31, 85, 46, 63, 12, and 77. The highest number in the list is 92.
# FAILS: Complete hallucination. Not a single number matches the tool output.
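To put a number on the failure rate across repeated runs, a small check like the one below can flag responses that mention numbers the tool never returned. This is only a rough heuristic sketch: `find_unsupported_numbers` is a hypothetical helper written for this report, not part of google-genai, and it assumes every integer in the response should be either a list item or the list length.

```python
import re


def find_unsupported_numbers(tool_result, response_text):
    """Return numbers mentioned in the response that the tool did not return."""
    allowed = {int(item) for item in tool_result}
    # The item count is a legitimate number for the model to mention.
    allowed.add(len(tool_result))
    mentioned = {int(match) for match in re.findall(r"\d+", response_text)}
    return sorted(mentioned - allowed)


tool_result = ['39', '19', '15', '48', '35', '50', '4', '6', '10', '25']
response = "The list contains 25 numbers, and the highest number in the list is 99."
print(find_unsupported_numbers(tool_result, response))  # prints [99]
```

Note that the wrong count ("25 numbers") is not flagged in the first sample because 25 happens to be a list item, so this check undercounts some failures.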
Expected Behavior
The SDK should validate client-side that result in a function_result is a dict (Struct). If a list is passed, it should raise a TypeError or ValueError immediately to prevent sending invalid payloads that cause silent model failures.
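A minimal sketch of the kind of check being requested, assuming input parts are plain dicts as in the reproduction script. `check_function_result` is a hypothetical name, not an existing google-genai function, and the dict-only rule reflects this report's assumption about what the API expects.

```python
def check_function_result(part):
    """Raise TypeError if a function_result part carries a non-dict result."""
    if part.get("type") != "function_result":
        return part  # Other part types are left alone.
    result = part.get("result")
    if not isinstance(result, dict):
        raise TypeError(
            "function_result 'result' must be a dict, "
            f"got {type(result).__name__}"
        )
    return part


# A raw list is rejected before any request is sent.
try:
    check_function_result(
        {"type": "function_result", "name": "get_random_numbers",
         "call_id": "call-1", "result": ["a", "b"]}
    )
except TypeError as exc:
    print(f"Rejected: {exc}")
```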
Workaround: Wrapping the result fixes the issue entirely: "result": {"items": result}.
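Until the SDK validates this itself, the wrapping can be applied uniformly with a small helper. This is a sketch only: `as_struct` is a hypothetical helper, and the `"items"` key simply mirrors the workaround above rather than any name the API requires.

```python
def as_struct(value, key="items"):
    """Return dicts unchanged; wrap any other tool output under `key`."""
    if isinstance(value, dict):
        return value
    return {key: value}


print(as_struct(["a", "b"]))    # prints {'items': ['a', 'b']}
print(as_struct({"max": 50}))   # prints {'max': 50}
```

In the reproduction script this would replace `"result": result` with `"result": as_struct(result)`.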