@yordis

@yordis yordis commented Jan 12, 2026

Signed-off-by: Yordis Prieto yordis.prieto@gmail.com

Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
@yordis changed the title from "docs(rfd): Add Elicitation specification for structured user input" to "RFD: Add Elicitation specification for structured user input" on Jan 12, 2026
Apply fixes from code review and GitHub research:
- Fix client capabilities to use ClientCapabilities pattern (like fs, terminal)
- Add complete turn response example showing elicitation + content integration
- Define single-elicitation-per-turn design decision for v1
- Clarify URL-mode OAuth is ACP-specific, not fully MCP-aligned
- Expand validation behavior FAQ with client/server responsibility split

These changes align with existing protocol patterns and clarify architectural
decisions identified in code review.
Add comprehensive elicitation system allowing agents to request structured
user input during conversation turns. Includes:

- ElicitationRequest: Request types (text, number, select, multiselect, boolean, password, URL)
- ElicitationSchema: JSON Schema constraints for validation
- ElicitationOption: Choices for select/multiselect types
- ElicitationResponse: User responses with convenient builder methods
- ElicitationCapability: Client capability negotiation
- StopReason.ElicitationRequested: Stop reason for elicitation requests
- Integration with PromptResponse and PromptRequest

All types support serialization, JSON Schema generation, and include
comprehensive tests. Feature-gated under unstable_elicitation flag.
Add 14 new tests covering:
- Schema constraints (min/max values, enum values)
- URL mode with OAuth return formats
- Metadata handling for options and responses
- All ElicitationType variants serialization
- Multiselect array responses
- Optional field serialization behavior
- Custom capability configurations

Total test count: 37 (13 new elicitation tests)
@ignatov
Contributor

ignatov commented Jan 12, 2026

Hey! Thanks for contributing. Is it true that we can handle things like https://github.com/orgs/agentclientprotocol/discussions/371 with this feature?

Add StreamMessage, StreamMessageDirection, and StreamMessageContent types
for monitoring and debugging RPC message flow. These types enable
implementations to observe incoming/outgoing requests, responses, and
notifications.

Includes:
- StreamMessageDirection enum (Incoming/Outgoing)
- StreamMessageContent enum (Request/Response/Notification variants)
- StreamMessage struct wrapping content and direction
- StreamSender/StreamReceiver type aliases using async-broadcast
- Helper constructors (::incoming(), ::outgoing())
- 5 comprehensive tests for serialization and variants

Also adds async-broadcast v0.7 dependency for async multi-consumer
broadcast channel support.
@phil65
Contributor

phil65 commented Jan 13, 2026

Just for some context, OpenCode's question tool schema:
https://github.com/anomalyco/opencode/pull/7268/changes#diff-2036e0c76252554ccff3dfc803b49917c90430275cadc8f9962ba528780c4e79
Claude Code's question tool schema:
https://platform.claude.com/docs/en/agent-sdk/python#ask-user-question
I guess these question tools, as well as MCP elicitation, are the most common use cases for this feature (plus perhaps some onboarding flows).
I think this is currently one of the biggest gaps in ACP, so +1 from me for this RFD :)

@yordis
Author

yordis commented Jan 13, 2026

Hey @ignatov, this type of spec should allow that kind of "question" form. In fact, my existing ACP needs are all about such a feature.

I am trying to write the spec AND figure out whether I can get the Zed GUI to implement it.

Overall, the intent is to allow structured input from the user, such as a questionnaire or buttons for specific actions (at least that is my immediate need).

Member

@benbrandt benbrandt left a comment

We definitely need something like this, and adopting the same pattern as MCP also allows us to forward MCP elicitation requests, which is nice.

…like permissions

Changed elicitation from embedded in session/prompt flow to a separate request/response
method pattern matching permissions design. This provides clearer protocol semantics and
consistent handling of structured user input requests.

PROTOCOL CHANGES:
- Moved from PromptRequest.elicitation_response to RequestElicitationRequest
- Moved from PromptResponse.elicitation to RequestElicitationResponse
- New method: session/elicitation (separate like session/request_permission)

API CHANGES:
- Removed elicitation_response field from PromptRequest
- Removed elicitation field from PromptResponse
- Added RequestElicitationRequest wrapper struct
- Added RequestElicitationResponse wrapper struct
- Added SESSION_ELICITATION_METHOD_NAME constant

This aligns with @benbrandt feedback on consistency with permission request/response pattern.
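
For readers following along, here is a rough sketch of what a standalone `session/elicitation` request could look like on the wire. Only the method name comes from the commit above. The `params` layout (`sessionId`, `message`, `requestedSchema`) and the helper name `build_elicitation_request` are assumptions for illustration, loosely modeled on MCP, and are not taken from the RFD.

```python
import json
from itertools import count

# Method name as stated in the commit message above.
SESSION_ELICITATION_METHOD_NAME = "session/elicitation"

# Simple monotonically increasing JSON-RPC request ids.
_request_ids = count(1)


def build_elicitation_request(session_id, message, schema):
    """Build a JSON-RPC 2.0 request. The params field names are hypothetical."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": SESSION_ELICITATION_METHOD_NAME,
        "params": {
            "sessionId": session_id,      # assumed field name
            "message": message,           # assumed field name
            "requestedSchema": schema,    # assumed field name
        },
    }


request = build_elicitation_request(
    "sess_1",
    "Which environment should I deploy to?",
    {
        "type": "object",
        "properties": {"env": {"type": "string", "enum": ["staging", "production"]}},
        "required": ["env"],
    },
)
print(json.dumps(request, indent=2))
```

Because it is an ordinary request with its own `id`, the reply can be correlated the same way a `session/request_permission` reply would be.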
@yordis
Author

yordis commented Jan 15, 2026

@benbrandt is it OK to ask you to ignore the Rust changes for now? I know it could be annoying, but I am trying to get Zed working to see the full picture, and I just realized that I committed those changes as well while trying to fix the request/response situation.

Since I am in such active development and want to see Zed working, the burden would be on you to ignore those files and focus only on the markdown until we are ready to merge.

Otherwise, totally cool, I will create another branch for myself. It just makes things a bit more difficult, since it requires switching between branches and synchronizing a bit more.

Updated RFD to document the refactored architecture where elicitation uses
a separate session/elicitation request/response method (matching permissions
pattern) instead of being embedded in session/prompt flow.

KEY CHANGES:
- Clarified that elicitation is triggered by stopReason: "elicitation_requested"
- Updated flow to show separate session/elicitation method call
- Aligned with permission request/response pattern for consistency
- Added complete JSON-RPC examples with method names and full message structure

This addresses @benbrandt's feedback about consistency between permission and
elicitation request/response mechanisms.
@ignatov
Contributor

ignatov commented Feb 1, 2026

@benbrandt I think that we have to work on that in the next wave

@yordis
Author

yordis commented Feb 1, 2026

@ignatov I am back home tomorrow, so I can continue the work. I got distracted trying to actually build a Zed GUI that allows us to do an OpenCode-style questionnaire as a PoC.

Let me know whatever you would like to see happen; I will be available in 24 hours.

Member

@benbrandt benbrandt left a comment

OK, left some notes.
I would also appreciate to have the RFD merged separately from the Rust changes, as some of the things you have added would fit better within the SDK (this crate should be mostly limited to schema types).

So before we can merge this, I would want these separated, both to focus the review and to save you a bunch of extra work.

- **Selections**: select (single), multiselect (multiple) with enum-based options
- **Sensitive inputs**: password, URL-mode for out-of-band OAuth flows (addressing PR #330 authentication pain points)

3. **Work in turn context**: Elicitation requests are triggered when a turn ends with `stopReason: "elicitation_requested"`, allowing agents to ask questions naturally within the conversation flow. Agents send elicitation requests via a separate `session/elicitation` method (following the same request/response pattern as `session/request_permission`). Unlike Session Config Options (which are persistent), elicitation requests are transient and turn-specific.
Member

Does the agent need to return a stop_reason? For example, when permissions are requested, there isn't a stop reason; the agent may just await a response before continuing.
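
To illustrate the point, here is a minimal sketch of an agent that awaits an elicitation response in the middle of a turn instead of ending the turn first. `FakeClient`, `run_turn`, and the shape of the reply are hypothetical stand-ins. `end_turn` is used as an ordinary stop reason on the assumption that the turn finishes normally afterwards.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for a client connection that answers immediately."""

    async def request(self, method, params):
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        return {"action": "accept", "content": {"env": "staging"}}


async def run_turn(client):
    steps = ["planning"]
    # The agent pauses here, mid-turn, the same way it would for a permission
    # request. No special stop reason is emitted before asking.
    answer = await client.request(
        "session/elicitation", {"message": "Which environment should I deploy to?"}
    )
    steps.append(f"deploying to {answer['content']['env']}")
    # The turn ends only once the work is done, with a regular stop reason.
    return {"stopReason": "end_turn", "steps": steps}


result = asyncio.run(run_turn(FakeClient()))
print(result)
```

The agent's own control flow decides whether to block on the answer or to keep working and pick up the answer later.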


3. **Work in turn context**: Elicitation requests are triggered when a turn ends with `stopReason: "elicitation_requested"`, allowing agents to ask questions naturally within the conversation flow. Agents send elicitation requests via a separate `session/elicitation` method (following the same request/response pattern as `session/request_permission`). Unlike Session Config Options (which are persistent), elicitation requests are transient and turn-specific.

4. **Support client capability negotiation**: Clients declare what elicitation types they support (similar to the client capabilities pattern emerging in the protocol). Agents handle gracefully when clients don't support elicitation.
Member

I definitely think we should follow the MCP capability model here:

{
  "capabilities": {
    "elicitation": {
      "form": {},
      "url": {}
    }
  }
}

That way we distinguish between the two modes, so the client's support can be mapped and passed along to agents (which can pass it on to their MCP clients), and we also support clients that can only offer one or the other for various reasons.
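
As a rough sketch of how an agent might consume that capability object: the helper name `supported_elicitation_modes` is hypothetical, and it assumes the agent receives the inner `capabilities` dictionary from the JSON above and that a mode counts as supported when its key is present, even with an empty object as value.

```python
def supported_elicitation_modes(client_capabilities):
    """Return the elicitation modes a client declared, as a set of strings."""
    elicitation = client_capabilities.get("elicitation")
    if elicitation is None:
        # The client did not declare elicitation support at all.
        return set()
    # A mode is declared by the presence of its key; "form": {} still counts.
    return {mode for mode in ("form", "url") if mode in elicitation}


form_only = {"elicitation": {"form": {}}}
print(supported_elicitation_modes(form_only))
```

An agent forwarding an MCP URL-mode elicitation could check for `"url"` in the returned set and fall back gracefully when it is missing.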


### Elicitation Request Structure

When a turn ends with `stopReason: "elicitation_requested"`, the agent sends a separate elicitation request (following the same pattern as permission requests). Example 1 (User Selection - from PR #340):
Member

Again, I don't think the agent needs to end its turn; the client would just respond to a request, same as auth.

- `select` - Single-choice selection from a list
- `multiselect` - Multiple-choice selection
- `boolean` - Yes/no choice
- `password` - Masked text input (for sensitive credentials)
Member

Some of these don't seem to match MCP. For example, this one isn't in MCP (in fact, MCP explicitly says not to ask for passwords in these forms).

I also think something like this would be a `format` field on a field of `type: string` (this is JSON Schema, after all).

Member

I think the client needs to know that it must support the restricted JSON Schema, and it can decide how to represent that. In my opinion, we don't necessarily need to specify specific input types in the protocol definition (again, I am just looking to the MCP specification here).
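
A small sketch of what "the client decides how to represent it" could mean in practice: the protocol carries only a flat JSON Schema property, and the client maps `type` plus an optional `format` to whatever widget it has. The widget names, the `FORMAT_WIDGETS` table, and `pick_widget` are all hypothetical. The set of formats shown is an assumption, not a list from the RFD.

```python
# Hypothetical mapping from JSON Schema "format" values to client widgets.
FORMAT_WIDGETS = {
    "email": "email-input",
    "uri": "url-input",
    "date": "date-picker",
}


def pick_widget(field_schema):
    """Choose a widget name for one flat (non-nested) schema property."""
    field_type = field_schema.get("type")
    if field_type == "string":
        if "enum" in field_schema:
            # A closed set of choices renders as a single-select control.
            return "dropdown"
        # Unknown or missing formats fall back to a plain text box.
        return FORMAT_WIDGETS.get(field_schema.get("format"), "text-input")
    if field_type in ("number", "integer"):
        return "number-input"
    if field_type == "boolean":
        return "checkbox"
    raise ValueError(f"unsupported field type: {field_type!r}")


print(pick_widget({"type": "string", "format": "email"}))
```

With this split, a terminal client and a GUI client can render the same schema differently without the protocol naming input types such as `select` or `password`.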


**Not supported** (to keep initial implementation simple):
- Complex nested objects/arrays
- `allOf`, `anyOf`, `oneOf`

|--------|------------------------|-------------|
| **Lifecycle** | Persistent, pre-declared at session init | Transient, appears during turns |
| **Scope** | Session-wide configuration | Single turn/decision point |
| **Defaults** | Required (agents must have defaults) | Required (agents should always provide) |
Member

Are defaults required on elicitation? I don't think so. I think the point is that the JSON Schema can mark fields as required?
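
To make the distinction concrete, here is a sketch where `required` drives validation and `default` is purely optional. The function name `apply_schema` and the example schema are hypothetical, and only flat object schemas are handled.

```python
def apply_schema(schema, submitted):
    """Check required fields, then fill optional fields that declare a default."""
    missing = [name for name in schema.get("required", []) if name not in submitted]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")

    result = {}
    for name, field in schema.get("properties", {}).items():
        if name in submitted:
            result[name] = submitted[name]
        elif "default" in field:
            # Defaults are optional; a field without one is simply left out.
            result[name] = field["default"]
    return result


schema = {
    "type": "object",
    "properties": {
        "env": {"type": "string"},                     # required, no default
        "replicas": {"type": "integer", "default": 1},  # optional, has a default
        "note": {"type": "string"},                     # optional, no default
    },
    "required": ["env"],
}
print(apply_schema(schema, {"env": "staging"}))
```

Here `env` has no default at all, and the schema is still well formed because `required` already says the user must supply it.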


### Can agents use elicitation for information required before responding?

Yes. An agent can include an elicitation request in a turn response with a default value and continue, then incorporate the user's response into the next turn. This is how agents can guide users through multi-step workflows.
Member

Again, I think this is where tying it to turns is an anti-pattern.
By modeling it as a request/response, the agent can decide in its own control flow whether or not to wait for a response before doing something else.


### What if a user doesn't respond to an elicitation request?

The agent's default value is used (which agents must always provide). If an agent truly requires user input and wants to block, it should fail the turn and let the client handle retry logic.
Member

I think it is fair to say that an elicitation request requires a response, even if that response is "cancelled". We should allow for request cancellation, and we can tie it together with the request cancellation changes in that RFD.
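
A sketch of the "every request gets a response" idea: the client always replies, and the reply says what the user did. The `action` values `accept`, `decline`, and `cancel` mirror my understanding of MCP's elicitation result. Whether ACP adopts the same names is an assumption, and `build_elicitation_response` is a hypothetical helper.

```python
VALID_ACTIONS = ("accept", "decline", "cancel")  # assumed, mirroring MCP


def build_elicitation_response(request_id, action, content=None):
    """Build a JSON-RPC 2.0 response for an elicitation request."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    result = {"action": action}
    if action == "accept":
        # Only an accepted elicitation carries the submitted form data.
        result["content"] = content if content is not None else {}
    return {"jsonrpc": "2.0", "id": request_id, "result": result}


print(build_elicitation_response(7, "accept", {"env": "staging"}))
print(build_elicitation_response(8, "cancel"))
```

With this shape the agent never has to guess from silence: a dismissed dialog still resolves the pending request, and the agent decides what a `cancel` means for its own flow.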


### Should elicitation support complex nested data structures?

For the initial version: no. We're focusing on simple types (strings, numbers, booleans, arrays of those). Complex nested structures can be added in future versions if use cases emerge. This keeps the initial scope manageable and lets us learn from real-world usage.
Member

I think the key point is that we support whatever MCP supports here.


### Can we extend this to replace the existing Permission-Request mechanism?

Potentially, but that's out of scope for this RFD. PR #210 discussed that elicitation "could potentially even replace the Permission-Request mechanism" (Phil65), but that requires separate analysis of the permission request use cases and whether elicitation's constraints (no complex nesting, simpler lifecycle) are sufficient.
Member

A point in favor of keeping these separate: since permission requests are more of a security concern, they should be handled separately so that the Client can offer a consistent experience.

My deciding to allow a tool call should be distinct from the model asking for clarification. The same reason was brought up for keeping the auth flow distinct. Maybe we reuse some types, but I don't think we should necessarily conflate the features.

@yordis
Author

yordis commented Feb 4, 2026

I would also appreciate to have the RFD merged separately from the Rust changes.

That will happen for sure! Please ignore for now, 🙏🏻 I will revert the code tomorrow when I wake up and move it to another branch for my own sake.

4 participants