-
Notifications
You must be signed in to change notification settings - Fork 723
Description
Summary
When using the Batch API with inlinedRequests, the resulting inlinedResponses are returned in a the wrong order. This directly contradicts the official documentation, which states:
"The responses will be in the same order as the input requests."
This issue, combined with #1886 , renders the inlinedRequests feature unusable for batch processing, as there is currently no way to correlate an output response to its specific input request.
Environment details
- Programming language: Python
- OS: Windows 11
- Language runtime version: 3.13.3
- Package version: google-genai 1.56.0
Steps to reproduce
- Initialize the
google.genaiclient. - Create a list of inline requests (
NUM_REQUESTS = 200) where each prompt requests a specific, predictable output (e.g., "Respond with YES" or "Respond with NO") based on the index. - Submit the batch using
client.batches.create. - Wait for the batch to succeed.
- Iterate through
batch_job.dest.inlined_responses. - Compare the content of response
iwith the expected output for requesti.
Expected Behavior:
The response at index i in inlined_responses corresponds to the request at index i in inlined_requests.
Actual Behavior:
The responses are not in the correct order. In my test of 200 random boolean questions, 96 responses do not match their index position. With another test I've found there is some pattern in this shuffling, but I've not been able to find the exact way it's been shuffled. So far I've ran this script 5 times, in 4/5 cases 96 responses do not match, which seems to imply the shuffling is always in the same way. In the other case no shuffling occurred.
Reproduction Script
"""
Minimal reproduction script for Gemini Batch API response ordering bug.
This script sends batch requests with predictable expected responses
and checks if they come back in the correct order.
Expected behavior: Response at index i should match request at index i
Actual behavior: Responses may be shuffled/out of order
"""
import time
from google import genai
# Configuration
MODEL = "gemini-3-flash-preview"
NUM_REQUESTS = 200 # Number of requests to test ordering
SEED = 0 # Fixed RNG seed for reproducibility
client = genai.Client() # Uses GOOGLE_API_KEY env var
import random
# Fixed RNG for reproducible runs
random.seed(SEED)
print(f"Using random seed: {SEED}")
# Create random pattern of YES/NO
inline_requests = []
random_pattern = []
for i in range(NUM_REQUESTS):
expected = random.choice(["YES", "NO"])
random_pattern.append(expected)
prompt = f"Request #{i}: Respond with exactly one word: {expected}"
inline_requests.append({
'contents': [{
'parts': [{'text': prompt}],
'role': 'user'
}],
'config': {
'temperature': 0,
'max_output_tokens': 5,
'thinking_config': {'thinking_level': 'minimal'}
}
})
batch_job = client.batches.create(
model=MODEL,
src=inline_requests,
config={'display_name': 'order-bug-reproduction'}
)
print(f"Batch of {NUM_REQUESTS} requests created: {batch_job.name}")
# Poll for completion
completed_states = {'JOB_STATE_SUCCEEDED', 'JOB_STATE_FAILED', 'JOB_STATE_CANCELLED', 'JOB_STATE_EXPIRED'}
while True:
batch_job = client.batches.get(name=batch_job.name)
state = batch_job.state.name
print(f"Status: {state}")
if state in completed_states:
break
time.sleep(3)
if batch_job.state.name != 'JOB_STATE_SUCCEEDED':
print(f"Batch failed with state: {batch_job.state.name}")
else:
num_mismatches = 0
responses = batch_job.dest.inlined_responses
for i, resp in enumerate(responses):
expected = random_pattern[i]
if resp.response:
actual = resp.response.text.strip().upper()
match = "YES" in actual if expected == "YES" else "NO" in actual
if not match:
num_mismatches += 1
print(f"\nTotal mismatches: {num_mismatches} out of {NUM_REQUESTS} requests")
if num_mismatches:
print("\nBUG CONFIRMED: Responses are NOT in the same order as requests!")
else:
print("\n✓ All responses matched expected order (bug not reproduced in this run)")Impact & Priority
This should have high priority.
This bug effectively breaks the inlinedRequests functionality of the Batch API.
Developers rely on one of two methods to map inputs to outputs in batch processing:
- Strict Ordering: (Guaranteed by documentation, but broken by this bug).
- Metadata/IDs: Attaching a
custom_idormetadatafield to the request and reading it in the response.
However, strictly ordering is currently broken (this issue), and metadata passthrough is also broken (see related issue ** #1886 **).
Because both ordering is unreliable and metadata is inaccessible, it is impossible to determine which output belongs to which input when using inlinedRequests. This forces developers to abandon inlinedRequests entirely and rely on GCS Bucket (File) I/O, which significantly increases implementation complexity for smaller batch jobs.