
feat: add ask_user tool with host-side LLM interception #1

Open
yash-scaleai wants to merge 3 commits into scaleapi:scale-customizations from yash-scaleai:yash/ask-user-host-interception

Conversation

yash-scaleai commented Jan 28, 2026

Summary

Adds an ask_user tool that allows agents to request clarification on underspecified tasks. The tool simulates user responses using an LLM based on the complete task definition.

Key changes:

  • Add ask_user tool in tools/ask_user/
  • Add TaskDefinitionInjectionHook to inject task definitions into containers
  • Auto-detect task_definitions.json in run_batch.py
  • Critical fix: Intercept ask_user commands on HOST side in agents.py

Why host-side interception?

The original implementation (Bryan's PR #17144) runs the LLM call inside the Modal container. However, Modal containers cannot reach internal API endpoints like litellm.ml-serving-internal.scale.com.

This PR intercepts ask_user commands in agents.py before they are sent to the container, makes the LLM call on the host (where the agent's own LLM calls are made), and returns the response directly.

Flow (a minimal code sketch follows the list):

1. Agent outputs: ask_user "what's the cert validation behavior?"
2. Host intercepts command in handle_action()
3. Host makes LLM call (can reach internal API)
4. Response returned directly to agent
5. Container's ask_user binary never runs
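
A minimal sketch of this short-circuit, in the spirit of the description above. `_parse_ask_user_command` and `_load_task_definitions` are named in the review; `_simulate_user_response` and `_forward_to_container` are hypothetical stand-ins for the surrounding logic:

```python
# Sketch only: hypothetical shape of the host-side short-circuit in
# DefaultAgent.handle_action(); the real code sets step.observation.
def handle_action(self, action: str) -> str:
    parsed = self._parse_ask_user_command(action)  # (question, context) or None
    if parsed is not None:
        question, context = parsed
        # Answer on the host: the container's ask_user binary can't reach
        # the internal LLM endpoint, so it never runs.
        task_defs = self._load_task_definitions(self._traj_path)
        return self._simulate_user_response(question, context, task_defs)
    # Normal path: forward the command to the container.
    return self._forward_to_container(action)
```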

Based on Bryan's PR #17144 with architectural modification for internal API compatibility.

Test plan

  • Single task test with scripts/test_ask_user_single.py
  • Verified ask_user interception in logs
  • Verified response in trajectory file

🤖 Generated with Claude Code

Greptile Summary

Adds an ask_user tool that lets agents request clarification on underspecified tasks by simulating user responses via an LLM. The key architectural decision is to intercept ask_user commands on the host side (in agents.py) rather than inside the container, since Modal containers cannot reach internal API endpoints.

  • Adds host-side interception in DefaultAgent.handle_action() that parses ask_user commands, loads task definitions from a JSON file, and calls litellm.completion to simulate a user response (a sketch of this call follows the list)
  • Adds TaskDefinitionInjectionHook to inject task definitions into containers (for the in-container fallback binary)
  • Auto-detects task_definitions.json in run_batch.py and registers the hook
  • Includes container-side ask_user binary and llm.py helper (effectively dead code since host intercepts first)
  • New tool_use_with_ask_user.yaml config variant
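
Expanding on the first bullet, a sketch of what the simulator call might look like. The `primary_task`/`underspecified_task` field names come from the sequence diagram below; the function name, prompt wording, and the `gpt-4o` fallback are assumptions:

```python
import os

import litellm

def _simulate_user_response(question: str, context: str | None, task_def: dict) -> str:
    # Build the simulated-user prompt from the full task definition
    # (field names per the sequence diagram; prompt text is illustrative).
    system = (
        "You are the user who wrote this task. Answer the agent's clarifying "
        "question using only the task definition below.\n\n"
        f"Primary task: {task_def['primary_task']}\n"
        f"Underspecified task: {task_def['underspecified_task']}"
    )
    messages = [
        {"role": "system", "content": system},
        {"role": "user", "content": question + (f"\n\nContext: {context}" if context else "")},
    ]
    # USER_SIMULATOR_MODEL is read from the environment elsewhere in the PR;
    # the default here is a placeholder.
    model = os.environ.get("USER_SIMULATOR_MODEL", "gpt-4o")
    response = litellm.completion(model=model, messages=messages)
    return response.choices[0].message.content
```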

Issues found:

  • litellm.drop_params = True in agents.py:189 is a global mutation that permanently affects all litellm.completion calls in the process, including the main agent model calls. This is especially problematic in multi-worker mode.
  • Silent exception swallowing in _load_task_definitions (except Exception: pass) makes it impossible to debug task definition loading failures.
  • Unnecessary credential injection into the container environment (OPENAI_API_KEY, LLM_API_KEY) despite host-side interception making the in-container binary redundant — this is unnecessary credential exposure.

Confidence Score: 2/5

  • The litellm.drop_params global mutation can silently alter behavior of the main agent LLM calls across all workers.
  • Score of 2 reflects the global side-effect bug with litellm.drop_params which can affect correctness of the main agent model calls, combined with silent error swallowing that will make production debugging difficult. The core interception logic is sound but these issues need fixing before merge.
  • Pay close attention to sweagent/agent/agents.py (global state mutation, silent error handling) and sweagent/run/hooks/task_definition_injection.py (unnecessary credential exposure).

Important Files Changed

| Filename | Overview |
| --- | --- |
| sweagent/agent/agents.py | Adds ~180 lines for host-side ask_user interception. Contains a global side-effect bug (litellm.drop_params = True) that affects all LLM calls, and silently swallows task definition load errors. |
| sweagent/run/hooks/task_definition_injection.py | New RunHook that injects task definitions into containers. Unnecessarily injects API credentials into the container despite host-side interception making the in-container binary redundant. |
| sweagent/run/run_batch.py | Auto-detects task_definitions.json in output_dir and adds the TaskDefinitionInjectionHook. Clean integration following existing hook patterns. |
| tools/ask_user/bin/ask_user | Container-side ask_user binary (now effectively dead code since the host intercepts). Contains debug print statements that leak the API base URL to stderr. |
| tools/ask_user/bin/llm.py | Simple OpenAI client wrapper for the container-side tool. Module-level client singleton without thread safety, but acceptable since this only runs in a single-threaded container context. |
| tools/ask_user/config.yaml | Tool definition for ask_user with question (required) and context (optional) arguments. Correctly follows the existing tool config schema. |
| config/tool_use_with_ask_user.yaml | New agent config variant adding the ask_user tool bundle alongside existing tool bundles. Clean config following existing patterns. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Agent as Agent LLM
    participant Host as Host (agents.py)
    participant Cache as Task Definitions Cache
    participant LiteLLM as LiteLLM (User Simulator)
    participant Container as Container (never reached)

    Agent->>Host: ask_user "question" "context"
    Host->>Host: _parse_ask_user_command()
    Host->>Cache: _load_task_definitions(traj_path)
    Cache-->>Host: task_defs dict
    Host->>Host: Build system prompt with primary_task + underspecified_task
    Host->>LiteLLM: litellm.completion(model, messages)
    LiteLLM-->>Host: Simulated user response
    Host-->>Agent: step.observation = response
    Note over Container: Container binary never executes<br/>(host intercepts first)
```

Last reviewed commit: 36d9a40


Adds an ask_user tool that allows agents to request clarification on
underspecified tasks. The tool simulates user responses using an LLM
based on the complete task definition.

Key changes:
- Add ask_user tool in tools/ask_user/
- Add TaskDefinitionInjectionHook to inject task definitions into containers
- Auto-detect task_definitions.json in run_batch.py
- **Critical fix**: Intercept ask_user commands on HOST side in agents.py

The host-side interception is necessary because Modal containers cannot
reach internal API endpoints (e.g., litellm.ml-serving-internal.scale.com).
By handling the LLM call on the host (where the agent's own LLM calls are
made), we avoid timeout issues while maintaining the same functionality.

Based on Bryan's PR #17144 with architectural modification for internal
API compatibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
socket-security bot commented Jan 28, 2026

Review the following changes in direct dependencies.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | openai@2.21.0 | 96 | 100 | 100 | 100 | 100 |
| Updated | swe-rex@1.2.0 ⏵ 1.4.0 | 98 | 100 | 100 | 100 | 100 |


yash-scaleai and others added 2 commits February 20, 2026 00:50
Includes post-experiment working tree fixes (traj_path refactor,
_find_task_definitions_file search) that ran during Phase B.

Safety fixes (no behavior change; sketched after this commit message):
- Add threading.Lock around task definitions cache
- Tighten _parse_ask_user_command to exact word boundary
- Guard result against None before slicing for log
- Remove no-op @Retry(stop_after_attempt(1)) in container llm.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
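
A sketch of what the first two safety fixes above might look like; the regex and `shlex` handling are assumptions, not the commit's exact code:

```python
import re
import shlex
import threading

# Guards the shared task-definitions cache across batch workers.
_TASK_DEFS_LOCK = threading.Lock()

# \b enforces an exact word boundary, so e.g. "ask_user_foo" no longer matches.
_ASK_USER_RE = re.compile(r"^\s*ask_user\b")

def _parse_ask_user_command(action: str):
    """Return (question, context) if the action is an ask_user call, else None."""
    if not _ASK_USER_RE.match(action):
        return None
    parts = shlex.split(action)  # respects the quoted question/context arguments
    question = parts[1] if len(parts) > 1 else ""
    context = parts[2] if len(parts) > 2 else None
    return question, context
```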
Infrastructure fixes applied during experiment runs:
- pyproject.toml: use jeff-da/SWE-ReX sweap-support branch for Modal
- models.py: handle top_p=None in model ID string (was crashing; sketched below)
- justfile: mount data/ volume for container access

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
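
A purely illustrative guess at the shape of that models.py fix; the real function and format string may differ:

```python
# Hypothetical illustration: f-string float formatting raises TypeError on None.
def _model_id_suffix(temperature: float, top_p: float | None) -> str:
    tp = "None" if top_p is None else f"{top_p:.2f}"  # was: f"{top_p:.2f}", crashing on None
    return f"t-{temperature:.2f}__p-{tp}"
```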
greptile-apps bot left a comment

9 files reviewed, 3 comments


In sweagent/agent/agents.py (around line 189):

```python
    try:
        logger.info(f"ask_user intercepted on host: question='{question[:100]}...'")
        # Drop unsupported params for models like GPT-5 that don't support temperature
        litellm.drop_params = True
```

Global side-effect on all litellm calls

litellm.drop_params is a module-level global. Setting it to True here permanently affects all subsequent litellm.completion calls in the process, including the main agent model calls in models.py:693. This means unsupported parameters will be silently dropped for the agent's own LLM calls too, which could mask configuration errors or change behavior unexpectedly.

This is especially problematic in multi-worker mode (ThreadPoolExecutor in run_batch.py:288) where a single ask_user call from one worker mutates global state for all workers.

Consider passing drop_params as a per-call parameter instead:

Suggested change:

```diff
-        litellm.drop_params = True
+        litellm.drop_params = True  # TODO: this mutates global state; move to per-call param if litellm supports it
```

Or scope the change more carefully by saving and restoring the original value.
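
A sketch of that save-and-restore scoping; the helper name is hypothetical:

```python
import contextlib
import threading

import litellm

_DROP_PARAMS_LOCK = threading.Lock()  # serializes ask_user simulator calls

@contextlib.contextmanager
def scoped_drop_params():
    """Temporarily enable litellm.drop_params, restoring the previous value.

    Caveat: the flag is still a process-wide global, so concurrent agent
    completions in other threads can observe the temporary value; this
    narrows the window rather than eliminating it.
    """
    with _DROP_PARAMS_LOCK:
        previous = litellm.drop_params
        litellm.drop_params = True
        try:
            yield
        finally:
            litellm.drop_params = previous

# Usage in the interception path:
# with scoped_drop_params():
#     response = litellm.completion(model=model, messages=messages)
```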


In sweagent/agent/agents.py (lines 111-112):

```python
        except Exception:
            pass
```

Silent exception swallowing hides load failures

The bare except Exception: pass here silently swallows all errors when loading task definitions (malformed JSON, permission errors, encoding issues, etc.). When ask_user later fails with "No task definition found for instance X", there will be no indication that the file existed but couldn't be parsed.

At minimum, log the exception:

Suggested change:

```diff
-        except Exception:
-            pass
+        except Exception as e:
+            logging.getLogger(__name__).warning("Failed to load task definitions from %s: %s", task_def_file, e)
```
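
Going one step further, a sketch of a loader that both logs failures and caches under a lock (tying in with the threading.Lock added in a later commit); the names and cache shape are assumptions:

```python
import json
import logging
import threading

logger = logging.getLogger(__name__)

_TASK_DEFS_CACHE: dict[str, dict] = {}
_TASK_DEFS_CACHE_LOCK = threading.Lock()

def _load_task_definitions(task_def_file: str) -> dict:
    """Load (and cache) task definitions, logging failures instead of hiding them."""
    with _TASK_DEFS_CACHE_LOCK:
        if task_def_file not in _TASK_DEFS_CACHE:
            try:
                with open(task_def_file, encoding="utf-8") as f:
                    _TASK_DEFS_CACHE[task_def_file] = json.load(f)
            except Exception:
                logger.warning("Failed to load task definitions from %s", task_def_file, exc_info=True)
                _TASK_DEFS_CACHE[task_def_file] = {}
        return _TASK_DEFS_CACHE[task_def_file]
```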

In sweagent/run/hooks/task_definition_injection.py (lines 104-116):

```python
# Pass through API credentials for the user simulator LLM
api_key = os.environ.get("OPENAI_API_KEY") or os.environ.get("LLM_API_KEY")
base_url = os.environ.get("OPENAI_BASE_URL") or os.environ.get("LLM_BASE_URL")
simulator_model = os.environ.get("USER_SIMULATOR_MODEL")

if api_key:
    env_vars["OPENAI_API_KEY"] = api_key
    env_vars["LLM_API_KEY"] = api_key
if base_url:
    env_vars["OPENAI_BASE_URL"] = base_url
    env_vars["LLM_BASE_URL"] = base_url
if simulator_model:
    env_vars["USER_SIMULATOR_MODEL"] = simulator_model
```

Unnecessary credential injection into container

Since ask_user commands are now intercepted on the host side (in agents.py) before reaching the container, the container's ask_user binary never actually runs. Injecting API keys (OPENAI_API_KEY, LLM_API_KEY) into the container environment is unnecessary credential exposure. If the container is compromised or the agent runs arbitrary code, these credentials could be exfiltrated.

Consider removing the API credential injection since the host-side interception makes it redundant, or add a comment explaining why it's still needed (e.g., as a fallback).
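
One way to act on this: gate the credential pass-through behind an explicit opt-in, so keys only enter the container when the fallback is actually wanted. `ASK_USER_CONTAINER_FALLBACK` is a hypothetical flag, not part of the PR:

```python
import os

env_vars: dict[str, str] = {}  # stands in for the hook's container env mapping

# Only inject simulator credentials when the in-container fallback is
# explicitly requested; ASK_USER_CONTAINER_FALLBACK is a hypothetical flag.
if os.environ.get("ASK_USER_CONTAINER_FALLBACK") == "1":
    api_key = os.environ.get("OPENAI_API_KEY") or os.environ.get("LLM_API_KEY")
    if api_key:
        env_vars["OPENAI_API_KEY"] = api_key
        env_vars["LLM_API_KEY"] = api_key
# Task definitions themselves are not secrets and can be injected unconditionally.
```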

