From 63a84a68deb55ebe1a9a6c9d361718ad186d6cc8 Mon Sep 17 00:00:00 2001 From: Claude Date: Mon, 24 Nov 2025 03:50:34 +0000 Subject: [PATCH 01/58] docs: add comprehensive analysis of OpenCode architecture Add detailed documentation analyzing three key aspects of OpenCode: 1. MCP server connection support - covers entry points, configuration, connection lifecycle, protocol handling, tool registration, and error handling 2. Session multiple connections and message ordering - explains how multiple clients can connect to the same session and how message ordering is guaranteed through callback queues and locks 3. Multi-server statefulness - analyzes why OpenCode servers are stateful and require session affinity, documenting in-memory state that prevents horizontal scaling without distributed coordination --- .../docs/analysis/mcp-server-connection.md | 443 +++++++++++++++ .../analysis/multi-server-statefulness.md | 531 ++++++++++++++++++ .../analysis/session-connection-ordering.md | 522 +++++++++++++++++ 3 files changed, 1496 insertions(+) create mode 100644 packages/opencode/docs/analysis/mcp-server-connection.md create mode 100644 packages/opencode/docs/analysis/multi-server-statefulness.md create mode 100644 packages/opencode/docs/analysis/session-connection-ordering.md diff --git a/packages/opencode/docs/analysis/mcp-server-connection.md b/packages/opencode/docs/analysis/mcp-server-connection.md new file mode 100644 index 00000000000..98d929b720b --- /dev/null +++ b/packages/opencode/docs/analysis/mcp-server-connection.md @@ -0,0 +1,443 @@ +# MCP Server Connection Analysis + +This document provides a comprehensive analysis of how MCP (Model Context Protocol) server connections are supported in the OpenCode codebase, starting from `packages/opencode`. + +## Table of Contents + +1. [Entry Points](#1-entry-points) +2. [Configuration](#2-configuration) +3. [Connection Lifecycle](#3-connection-lifecycle) +4. [Protocol Handling](#4-protocol-handling) +5. 
[Tool Registration](#5-tool-registration) +6. [Error Handling](#6-error-handling) +7. [Key Files Summary](#7-key-files-summary) +8. [Dependencies](#8-dependencies) +9. [Data Flow Diagram](#9-data-flow-diagram) +10. [Security Considerations](#10-security-considerations) + +--- + +## 1. Entry Points + +### CLI Command Handler + +**File**: `packages/opencode/src/cli/cmd/mcp.ts` (lines 1-81) + +The MCP command is registered as a CLI subcommand in the main application at `packages/opencode/src/index.ts` (line 17). + +**Key Handler**: `McpAddCommand` allows users to interactively add MCP servers: +- Prompts for server name +- Selects between "local" (run local command) and "remote" (connect to URL) +- For remote: validates URL and attempts connection test +- For local: captures command string + +--- + +## 2. Configuration + +### Schema Definition + +**File**: `packages/opencode/src/config/config.ts` (lines 294-337) + +OpenCode supports two configuration types using a Zod discriminated union: + +### A. Local MCP Server (`McpLocal`) + +```typescript +{ + type: "local", + command: string[], // Required - Command and arguments to execute + environment: Record<string, string>, // Optional - Environment variables + enabled: boolean, // Optional - Enable/disable on startup + timeout: number // Optional, default: 5000ms - Tool fetching timeout +} +``` + +### B. Remote MCP Server (`McpRemote`) + +```typescript +{ + type: "remote", + url: string, // Required - URL endpoint of MCP server + headers: Record<string, string>, // Optional - HTTP headers (for auth, etc.) 
+ enabled: boolean, // Optional - Enable/disable on startup + timeout: number // Optional, default: 5000ms - Tool fetching timeout +} +``` + +### Configuration Storage Locations + +- **Global config**: `~/.opencode/opencode.json` or `opencode.jsonc` +- **Project config**: `opencode.json`/`opencode.jsonc` or `.opencode/opencode.json` +- **Config field**: `mcp: Record<string, Mcp>` (line 550 in config.ts) + +### Example Configuration + +```jsonc +{ + "mcp": { + "filesystem": { + "type": "local", + "command": ["opencode", "x", "@modelcontextprotocol/server-filesystem"], + "timeout": 5000 + }, + "remote-api": { + "type": "remote", + "url": "https://example.com/mcp", + "headers": { "Authorization": "Bearer token" }, + "timeout": 10000 + } + } +} +``` + +--- + +## 3. Connection Lifecycle + +### Lifecycle Management + +**File**: `packages/opencode/src/mcp/index.ts` (lines 56-91) + +The MCP module uses `Instance.state()` (a key-value state management system) to manage the lifecycle: + +### Initialization Phase (lines 56-79) + +```text +1. On first call: Load config via Config.get() +2. Extract mcp config object (cfg.mcp ?? {}) +3. For each MCP server config: + - Call create(key, mcp) function + - If successful: store client in clients{} and status in status{} +4. 
Return state object: { status, clients } +``` + +### Maintenance Phase + +- Clients remain in memory and are reused across requests +- Status tracked per server (connected/disabled/failed) +- Tools cached within client instances + +### Cleanup/Disposal Phase (lines 80-90) + +- Registered cleanup function called on `Instance.dispose()` +- Closes all active clients via `client.close()` +- Errors logged but don't block disposal +- Prevents hanging subprocess connections (especially important for Docker containers) + +### Connection State Schema + +**File**: `packages/opencode/src/mcp/index.ts` (lines 25-53) + +```typescript +Status = discriminatedUnion("status", [ + { status: "connected" }, + { status: "disabled" }, + { status: "failed", error: string } +]) +``` + +--- + +## 4. Protocol Handling + +### Transport Layer Implementations + +**File**: `packages/opencode/src/mcp/index.ts` (lines 129-210) + +### For Remote Servers (lines 129-175) + +Two transport implementations tried in sequence: + +1. **StreamableHTTPClientTransport** (lines 131-137) + - URL-based connection + - Headers passed via `requestInit` + - Attempts bidirectional streaming over HTTP + +2. **SSEClientTransport** (lines 139-145) + - Server-Sent Events fallback + - Same header support + - Used if StreamableHTTP fails + +**Error Handling**: If both transports fail, the last error is captured and returned as failed status. + +### For Local Servers (lines 178-210) + +**StdioClientTransport** (lines 182-191): +- Spawns subprocess with specified command +- stderr: "ignore" - suppresses subprocess errors +- Environment variables merged with `process.env` +- Special handling: sets `BUN_BE_BUN=1` for "opencode" command +- Custom environment variables from config applied + +### MCP Client Wrapper + +**File**: `packages/opencode/src/mcp/index.ts` (line 2) + +Uses `experimental_createMCPClient` from `@ai-sdk/mcp` library to wrap transport layers. 
This provides: +- Protocol message marshalling/unmarshalling +- Tool discovery and invocation +- Resource management + +--- + +## 5. Tool Registration + +### Tool Discovery and Registration + +**File**: `packages/opencode/src/mcp/index.ts` (lines 264-288) + +**MCP.tools()** function: +1. Gets all MCP clients from state +2. For each client, calls `client.tools()` to fetch available tools +3. **Tool Naming Convention**: Sanitizes client and tool names by replacing every character other than letters, digits, `_`, and `-` with `_`: + - Format: `{sanitized_client_name}_{sanitized_tool_name}` + - Example: "filesystem_read_file", "remote-api_get_user" (the hyphen in `remote-api` is kept) +4. Returns `Record<string, Tool>` for AI SDK consumption + +### Tool Name Sanitization + +**Lines 282-284** in `mcp/index.ts`: + +```typescript +const sanitizedClientName = clientName.replace(/[^a-zA-Z0-9_-]/g, "_") +const sanitizedToolName = toolName.replace(/[^a-zA-Z0-9_-]/g, "_") +result[sanitizedClientName + "_" + sanitizedToolName] = tool +``` + +### Integration into Session Processing + +**File**: `packages/opencode/src/session/prompt.ts` (lines 727-789) + +In the `resolveTools()` function: + +1. **Retrieval** (line 727): `for (const [key, item] of Object.entries(await MCP.tools()))` +2. **Filtering** (line 728): Applied against enabledTools using Wildcard matching +3. **Tool Wrapping** (lines 731-787): + - Wraps original execute function + - Triggers plugin hooks: `tool.execute.before` and `tool.execute.after` + - Handles result content processing: + - Text content extracted to output string + - Image content converted to FilePart attachments with base64 encoding + - Sets up tool output formatter (lines 782-787) +4. 
**Registration** (line 788): Added to tools dictionary with original AI SDK Tool interface + +### Server API Endpoints + +**File**: `packages/opencode/src/server/server.ts` (lines 1577-1625) + +#### GET /mcp (Status Endpoint, lines 1577-1595) +- Returns status of all configured MCP servers +- Response: `Record` + +#### POST /mcp (Add Endpoint, lines 1597-1625) +- Dynamically add new MCP server at runtime +- Request body: + ```json + { + "name": "server-name", + "config": { "type": "local" | "remote", ... } + } + ``` +- Response: Updated status record + +--- + +## 6. Error Handling + +### Error Types and Handling + +#### A. Failed Status Tracking (lines 120-240) + +- Each server gets separate status tracking +- On failure: `{ status: "failed", error: "error message" }` +- Errors captured for: + - Connection failures (both transport types) + - Tool fetching timeouts + - Subprocess startup failures + - Unknown errors + +#### B. Timeout Handling (line 226) + +**File**: `packages/opencode/src/mcp/index.ts` + +Uses `withTimeout()` utility (from `packages/opencode/src/util/timeout.ts`): + +```typescript +const result = await withTimeout(mcpClient.tools(), mcp.timeout ?? 5000) +``` + +- Default: 5000ms timeout +- If exceeded: Operation timed out error +- Caught and status set to failed with error message + +#### C. Client Closure on Tool Fetch Failure (lines 231-246) + +- If tool fetching fails after successful connection +- Client is immediately closed +- Status marked as failed: "Failed to get tools" +- Prevents hanging connections + +#### D. Plugin Hook Exception Handling (lines 732-752) + +- Before/after hooks wrapped in plugin trigger +- Any plugin hook errors don't break tool execution +- Errors logged per server + +#### E. Error Formatting for CLI + +**File**: `packages/opencode/src/cli/error.ts` (lines 8-9) + +`MCP.Failed` error detected and formatted as: +> "MCP server "{name}" failed. Note, opencode does not support MCP authentication yet." + +#### F. 
Disposal Error Handling (lines 82-87) + +- Individual `client.close()` errors logged but don't prevent other clients from closing +- Graceful degradation + +--- + +## 7. Key Files Summary + +| File Path | Lines | Role | +|-----------|-------|------| +| `src/mcp/index.ts` | Full | Core MCP module - client creation, connection lifecycle, tool fetching | +| `src/config/config.ts` | 294-550 | MCP schema definitions (McpLocal, McpRemote, Mcp union type) and config loading | +| `src/cli/cmd/mcp.ts` | Full | CLI interface for adding MCP servers interactively | +| `src/session/prompt.ts` | 727-789 | Tool registration in AI SDK, wrapping MCP tools with plugin hooks | +| `src/server/server.ts` | 1577-1625 | HTTP API endpoints for MCP status and dynamic registration | +| `src/acp/agent.ts` | 480-518 | ACP integration: configures MCP servers during session init | +| `src/project/instance.ts` | Full | Instance state management - handles MCP client lifecycle per project | +| `src/project/state.ts` | Full | Underlying state storage and disposal mechanism | +| `src/util/timeout.ts` | Full | Timeout wrapper for tool fetching operations | +| `src/cli/error.ts` | 8-9 | Error formatting for MCP failures | + +--- + +## 8. Dependencies + +**MCP-Related NPM Packages** (from package.json): +- `@ai-sdk/mcp@0.0.8` - AI SDK provider for MCP integration +- `@modelcontextprotocol/sdk@1.15.1` - Official MCP SDK with transport implementations + +--- + +## 9. 
Data Flow Diagram + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ opencode.json/opencode.jsonc │ +│ (mcp config section) │ +└────────────────┬────────────────────────────────────────────────┘ + │ + ▼ + ┌────────────────┐ + │ Config.get() │ + └────────┬───────┘ + │ + ▼ + ┌────────────────────────────────┐ + │ MCP.state() initialization │ + │ (Instance.state) │ + └───────┬────────────┬───────────┘ + │ │ + ┌───────▼──┐ ┌──────▼──────┐ + │ Remote │ │ Local │ + │ Servers │ │ Servers │ + └─┬────┬───┘ └──┬────┬─────┘ + │ │ │ │ + ┌──▼┐ ┌─▼──┐ ┌───▼┐ ┌──▼───┐ + │HTP│ │SSE │ │Cmd │ │Env │ + │ │ │ │ │Std │ │Vars │ + └──┬┘ └─┬──┘ └───┬┘ └──┬───┘ + │ │ │ │ + └────┴─┬──┬────┴────┘ + │ │ + ▼ ▼ + ┌─────────────────────────────────┐ + │ experimental_createMCPClient │ + │ (@ai-sdk/mcp) │ + └────────┬──────────────┬─────────┘ + │ │ + ▼ ▼ + ┌────────────┐ ┌─────────────┐ + │ Connected │ │ Failed │ + │ Clients │ │ Status │ + └────┬───────┘ └─────────────┘ + │ + ▼ + ┌─────────────────────────────────┐ + │ MCP.tools() │ + │ - Fetch tools per client │ + │ - Sanitize names │ + │ - Timeout enforcement │ + └────────┬──────────────┬─────────┘ + │ │ + ▼ ▼ + ┌────────────┐ ┌────────────┐ + │ AI SDK │ │ Wrapped │ + │ Tool{} │ │ Execute │ + └────┬───────┘ └─────────────┘ + │ + ▼ + ┌─────────────────────────────────┐ + │ resolveTools() in prompt.ts │ + │ - Plugin hook wrapping │ + │ - Tool filtering │ + │ - Result processing │ + └────────┬──────────────┬─────────┘ + │ │ + ▼ ▼ + ┌───────────────┐ ┌──────────────┐ + │ Available │ │ Agent Model │ + │ Tools{} │ │ Tool Calls │ + └───────────────┘ └──────────────┘ +``` + +--- + +## 10. Security Considerations + +1. **No Built-in MCP Authentication**: Error message explicitly states this (`cli/error.ts` line 9) + +2. **Custom Header Support**: Remote servers can pass headers for custom auth, but handled at transport layer + +3. 
**Permission System**: Agent-level permissions (edit, bash, webfetch) respected; MCP tools inherit these + +4. **Subprocess Isolation**: Local servers run in subprocess with configurable environment variables + +5. **Timeout Protection**: Default 5-second timeout prevents hanging connections from blocking the system + +--- + +## Complete Configuration Flow + +``` +1. User/Config → opencode.json mcp field + ↓ +2. Config Loading → Config.get() merges all config sources + ↓ +3. Instance Initialization → Instance.state() called + ↓ +4. MCP Client Creation → For each config entry: + a) Validate config schema + b) Select transport based on type + c) Create MCP client via @ai-sdk/mcp + d) Fetch tools with timeout + e) Store status and client reference + ↓ +5. Tool Resolution → SessionPrompt.resolveTools(): + a) Call MCP.tools() + b) Wrap each tool with plugin hooks + c) Add to session's available tools + ↓ +6. Tool Execution → AI model calls tool + → Original execute wrapped function called + → Result formatted and returned + ↓ +7. Cleanup → Instance.dispose(): + a) Close all MCP clients + b) Terminate subprocesses + c) Log any errors +``` diff --git a/packages/opencode/docs/analysis/multi-server-statefulness.md b/packages/opencode/docs/analysis/multi-server-statefulness.md new file mode 100644 index 00000000000..12bab47c913 --- /dev/null +++ b/packages/opencode/docs/analysis/multi-server-statefulness.md @@ -0,0 +1,531 @@ +# Multi-Server Statefulness Analysis + +This document provides a comprehensive analysis of whether OpenCode servers are stateless and whether sessions can be handled by multiple server instances. + +## Table of Contents + +1. [Executive Summary](#1-executive-summary) +2. [State Storage Locations](#2-state-storage-locations) +3. [In-Memory State Analysis](#3-in-memory-state-analysis) +4. [File-Based Storage Analysis](#4-file-based-storage-analysis) +5. [Locking Across Processes](#5-locking-across-processes) +6. 
[Session Affinity Requirements](#6-session-affinity-requirements) +7. [Event Broadcasting Limitations](#7-event-broadcasting-limitations) +8. [What Breaks with Multiple Servers](#8-what-breaks-with-multiple-servers) +9. [Race Condition Scenarios](#9-race-condition-scenarios) +10. [Production Deployment Requirements](#10-production-deployment-requirements) + +--- + +## 1. Executive Summary + +**OpenCode servers are HIGHLY STATEFUL and are NOT designed for multi-server load balancing.** + +| Question | Answer | +|----------|--------| +| **Are servers stateless?** | **No** - significant in-memory state | +| **Can any server handle any session?** | **No** - must use session affinity | +| **What if different server handles next message?** | Data corruption, lost callbacks, broken cancellation | + +**Recommendation**: Use a single server per working directory, or implement sticky sessions if load balancing is required. + +--- + +## 2. State Storage Locations + +### File-Based Storage (Stateless/Shareable) + +**Location**: XDG base directories (`~/.local/share/opencode/storage/`) + +**Structure**: Hierarchical JSON files: +- `storage/session/{projectID}/{sessionID}.json` - Session metadata +- `storage/message/{sessionID}/{messageID}.json` - Individual messages +- `storage/part/{messageID}/{partID}.json` - Message parts +- `storage/session_diff/{sessionID}.json` - Diff data + +**Shared by multiple servers**: YES (files are shared via filesystem) + +### In-Memory State (Stateful/Process-Bound) + +**Location**: Process memory only + +**Shared between server instances**: **NO** + +--- + +## 3. In-Memory State Analysis + +### A. 
Session Prompt State Lock (CRITICAL) + +**File**: `packages/opencode/src/session/prompt.ts` (lines 59-78) + +```typescript +const state = Instance.state( + () => { + const data: Record< + string, + { + abort: AbortController + callbacks: { + resolve(input: MessageV2.WithParts): void + reject(): void + }[] + } + > = {} + return data + }, + // cleanup on dispose +) +``` + +**Problem**: This state is entirely **process-local**. It contains: +- `AbortController` instances for each active session +- Callback queues for multiple concurrent requests to the same session +- Session busy/idle state tracking + +**Multi-Server Impact**: If two servers try to handle the same session: +- Server A's abort signal won't affect Server B's processing +- Server B cannot see Server A's abort state +- Both servers will try to process the same session independently + +### B. Locking Mechanism (PROCESS-LOCAL) + +**File**: `packages/opencode/src/util/lock.ts` + +```typescript +const locks = new Map< + string, + { + readers: number + writer: boolean + waitingReaders: (() => void)[] + waitingWriters: (() => void)[] + } +>() +``` + +**Facts**: +- This is an **in-memory reader/writer lock** +- Keys are file paths +- Stored in a `Map` in process memory +- Used in `Storage.read()` and `Storage.write()` operations +- **NO file-level OS locks** (no `flock`, `fcntl`, or similar) + +**Multi-Server Impact**: +- Server 1 acquires write lock on `session/{id}.json` +- Server 2 can acquire its own write lock on the same file (different lock object!) +- Both servers will write to the same file simultaneously +- **File corruption or lost writes possible** + +### C. 
Session Busy State (PROCESS-LOCAL) + +**File**: `packages/opencode/src/session/prompt.ts` (lines 80-83) + +```typescript +export function assertNotBusy(sessionID: string) { + const match = state()[sessionID] + if (match) throw new Session.BusyError(sessionID) +} +``` + +**File**: `packages/opencode/src/session/index.ts` (lines 443-446) + +```typescript +export class BusyError extends Error { + constructor(public readonly sessionID: string) { + super(`Session ${sessionID} is busy`) + } +} +``` + +**How it's used**: +- `SessionRevert.revert()` calls `assertNotBusy()` before reverting +- `SessionRevert.unrevert()` calls `assertNotBusy()` before unreverting +- Only checked at operation start, not continuously + +**Multi-Server Impact**: +- Only Server A knows session is busy (in its local state) +- Server B has no knowledge and will attempt the operation +- No prevention of concurrent modifications + +### D. Session Status State (PROCESS-LOCAL) + +**File**: `packages/opencode/src/session/status.ts` (lines 43-46) + +```typescript +const state = Instance.state(() => { + const data: Record = {} + return data +}) +``` + +States: `"idle"`, `"retry"`, `"busy"` + +**Multi-Server Impact**: +- Server A has session status "busy" +- Server B sees status "idle" (different state maps) +- No coordination between servers + +### E. Bus Subscriptions (PROCESS-LOCAL) + +**File**: `packages/opencode/src/bus/index.ts` (lines 11-17) + +```typescript +const state = Instance.state(() => { + const subscriptions = new Map() + return { + subscriptions, + } +}) +``` + +Events are published to: +1. Local subscriptions (in-process only) +2. `GlobalBus` (EventEmitter in process memory) + +**Multi-Server Impact**: +- Server A publishes event: `Bus.publish(Event.Created, { info: result })` +- Server B won't receive it (different subscription maps) +- Only clients connected to Server B can see events from Server B +- Clients connected to different servers see different event streams + +--- + +## 4. 
File-Based Storage Analysis + +**Storage Implementation**: `packages/opencode/src/storage/storage.ts` + +```typescript +export async function read<T>(key: string[]) { + const dir = await state().then((x) => x.dir) + const target = path.join(dir, ...key) + ".json" + return withErrorHandling(async () => { + using _ = await Lock.read(target) // In-memory lock! + const result = await Bun.file(target).json() + return result as T + }) +} + +export async function update<T>(key: string[], fn: (draft: T) => void) { + const dir = await state().then((x) => x.dir) + const target = path.join(dir, ...key) + ".json" + return withErrorHandling(async () => { + using _ = await Lock.write(target) // In-memory lock! + const content = await Bun.file(target).json() + fn(content) + await Bun.write(target, JSON.stringify(content, null, 2)) + return content as T + }) +} +``` + +**Critical Issue**: The `Lock` mechanism protects access within a single process but provides **zero protection** against concurrent writes from different server processes. + +--- + +## 5. Locking Across Processes + +### Distributed Locking Capability: NONE + +The lock mechanism: +- Uses an in-memory `Map` as the lock store +- Each process has its own `Map` instance +- No OS-level file locks +- No persistent lock files +- No distributed lock service integration + +### What Happens with Concurrent Multi-Server Writes + +``` +Server A Server B +────────────────────────────────────────────── +Lock.write("session/123.json") Lock.write("session/123.json") + ↓ ↓ +locks.set(path, {writer: true}) locks.set(path, {writer: true}) + ↓ ↓ +Read file Read file + ↓ ↓ +Modify content A Modify content B + ↓ ↓ +Write file Write file (overwrites A!) + ↓ ↓ +locks.delete(path) locks.delete(path) + +Result: Server B's write overwrites Server A's changes! +``` + +--- + +## 6. Session Affinity Requirements + +### Can a Different Server Pick Up a Session Mid-Conversation? + +**NO - It would break in multiple ways** + +### 1. 
Abort Signals Don't Work + +The `AbortController` from Server A won't be available on Server B: + +```typescript +// Server A started processing +const abort = start(sessionID) // Creates abort in Server A's memory + +// User requests cancel from Server B +SessionPrompt.cancel(sessionID) // Cancels Server B's memory state, not A's! +``` + +### 2. Callback Queues Are Lost + +If multiple requests are queued waiting for a response: + +```typescript +// Server A is processing +state()[sessionID].callbacks = [{ resolve: resolve1, reject: reject1 }] + +// Server B tries to continue processing +state()[sessionID] // undefined in Server B! +// callbacks array lost +``` + +### 3. Busy State Doesn't Transfer + +```typescript +// Server A has session as "busy" +SessionStatus.set(sessionID, { type: "busy" }) + +// Server B doesn't see this +SessionStatus.get(sessionID) // Returns { type: "idle" } +``` + +### 4. File Corruption Risk + +Both servers could modify the same session state file concurrently without coordination. + +--- + +## 7. Event Broadcasting Limitations + +### Global Event Mechanism + +**File**: `packages/opencode/src/bus/global.ts` + +```typescript +export const GlobalBus = new EventEmitter<{ + event: [{ directory: string; payload: any }] +}>() +``` + +**Used in** `packages/opencode/src/bus/index.ts` (line 73): + +```typescript +GlobalBus.emit("event", { + directory: Instance.directory, + payload, +}) +``` + +### Capability + +- Events can be broadcast to connected clients via the `/global/event` endpoint +- `GlobalBus` is an in-process `EventEmitter`, so an event published on Server A reaches only clients connected to Server A; Server B's clients do not receive it +- **HOWEVER**: This is a one-way broadcast, not coordination +- Does NOT prevent concurrent modifications +- Does NOT provide distributed locking + +--- + +## 8. 
What Breaks with Multiple Servers + +| Component | Current Behavior | Multi-Server Impact | +|-----------|------------------|---------------------| +| **Lock Mechanism** | In-memory Map | Both servers acquire independent locks → race condition | +| **Session Prompt State** | Process-local with abort controller | Server B can't see/cancel Server A's processing | +| **Session Status** | Process-local state | Servers have inconsistent session status views | +| **Callback Queues** | In-memory queue per session | Queued callbacks lost when switching servers | +| **Bus Subscriptions** | Per-instance subscriptions | Different servers receive different events | +| **Abort Signals** | Process-local | Cancellation doesn't propagate across servers | +| **File Writes** | Protected by in-memory locks only | Concurrent writes cause data corruption | +| **Session Busy Check** | Local state check | No inter-server synchronization | + +--- + +## 9. Race Condition Scenarios + +### Scenario 1: Concurrent Message Processing + +``` +Time Server A Server B +──── ────────────────────────────── ─────────────────────────────── +T0 POST /session/123/message + → SessionPrompt.prompt() + → state()[123] = {abort, []} + +T1 POST /session/123/message + → SessionPrompt.prompt() + → start(123) returns controller + → state()[123] = {abort, []} + (Different state map!) + +T2 Reading session messages + Lock.read("message/123/*.json") + Acquires in-memory lock on Server A + +T3 Reading session messages + Lock.read("message/123/*.json") + Acquires DIFFERENT in-memory lock + on Server B (same file!) + +T4 Session.updateMessage(msg1) + Lock.write(["message", 123, id]) + Writes file with lock on Server A + +T5 Session.updateMessage(msg2) + Lock.write(["message", 123, id]) + Writes SAME FILE with lock on B + msg2 overwrites msg1! 
+``` + +### Scenario 2: Cancellation Failure + +``` +Time Server A Server B +──── ────────────────────────────── ─────────────────────────────── +T0 Processing message for session 123 + state()[123].abort = controller_A + +T1 User cancels session 123 + cancel(123) + state()[123] = undefined + (No effect on Server A!) + +T2 Server A continues processing + (Doesn't know about cancellation) + +T3 Server A completes and writes + (User expected it to be cancelled) +``` + +### Scenario 3: Lost Queued Requests + +``` +Time Server A Server B +──── ────────────────────────────── ─────────────────────────────── +T0 Processing session 123 + state()[123].callbacks = [] + +T1 New request arrives at Server A + Queued: callbacks = [resolve1] + +T2 Session 123 processing completes + Server B handles completion + state()[123] = undefined + (No callbacks to resolve!) + +T3 resolve1 never called + Client hangs forever +``` + +--- + +## 10. Production Deployment Requirements + +To support multiple servers handling the same sessions, OpenCode would need: + +### 1. Distributed File Locking + +Replace in-memory `Lock` with external coordination: +- Redis-based distributed locks (Redlock algorithm) +- ZooKeeper/etcd for distributed coordination +- File-level OS locks (flock/fcntl) for single-host deployments + +### 2. Shared State Store + +Move process-local state to shared storage: +- Redis for session state, callbacks, abort signals +- Database for persistent state +- Distributed cache for performance + +### 3. Global Event Pub/Sub + +Replace in-memory Bus with distributed messaging: +- Redis Pub/Sub +- NATS +- Apache Kafka +- RabbitMQ + +### 4. 
Session Affinity Routing + +If not implementing the above, use load balancer sticky sessions: +- Cookie-based affinity +- IP-based affinity +- Session ID hashing + +### Example Architecture for Multi-Server + +``` +┌─────────────────────────────────────────────────┐ +│ Load Balancer │ +│ (with session affinity) │ +└──────────┬──────────────┬──────────────┬────────┘ + │ │ │ + ┌─────▼────┐ ┌─────▼────┐ ┌─────▼────┐ + │ Server 1 │ │ Server 2 │ │ Server 3 │ + └─────┬────┘ └─────┬────┘ └─────┬────┘ + │ │ │ + └──────────┬───┴──────────────┘ + │ + ┌───────▼───────┐ + │ Redis │ + │ - Locks │ + │ - Session │ + │ - Events │ + └───────┬───────┘ + │ + ┌───────▼───────┐ + │ Shared FS │ + │ - Files │ + │ - Storage │ + └───────────────┘ +``` + +--- + +## Summary + +### Stateless Components (Can Be Shared) + +- Session data files (on disk) +- Session metadata (on disk) +- Configuration (on disk) +- Message files (on disk) + +### Stateful Components (Blocking Multi-Server) + +- Session prompt state with abort controllers +- In-memory locking mechanism +- Session busy state tracking +- Bus event subscriptions +- Session status tracking +- Callback queue management + +### Deployment Options + +| Option | Complexity | Guarantee | +|--------|------------|-----------| +| **Single server** | Low | Full consistency | +| **Session affinity** | Medium | Consistency per session | +| **Full distribution** | High | Full horizontal scaling | + +--- + +## Conclusion + +OpenCode servers are designed for **single-server deployments** or **session-affinity-based load balancing**. The extensive use of in-memory state for session management, locking, and event broadcasting means that: + +1. **Sessions must be handled by the same server** that started processing them +2. **No coordination exists** between multiple server instances +3. **File writes can be corrupted** if multiple servers access the same session +4. **Events are not distributed** across server instances +5. 
**Cancellation and abort signals** don't propagate between servers + +For production deployments requiring multiple servers, implement sticky sessions at the load balancer level, or undertake significant architectural changes to move state management to distributed systems. diff --git a/packages/opencode/docs/analysis/session-connection-ordering.md b/packages/opencode/docs/analysis/session-connection-ordering.md new file mode 100644 index 00000000000..96fce5a9293 --- /dev/null +++ b/packages/opencode/docs/analysis/session-connection-ordering.md @@ -0,0 +1,522 @@ +# Session Multiple Connections and Message Ordering + +This document provides a comprehensive analysis of how the OpenCode server handles multiple client connections for the same session and guarantees message ordering. + +## Table of Contents + +1. [Multiple Client Support](#1-multiple-client-support) +2. [Message Ordering Guarantee](#2-message-ordering-guarantee) +3. [Concurrent Request Handling](#3-concurrent-request-handling) +4. [File-Level Concurrency Control](#4-file-level-concurrency-control) +5. [Event Broadcasting](#5-event-broadcasting) +6. [Race Condition Prevention](#6-race-condition-prevention) +7. [Key Files Summary](#7-key-files-summary) + +--- + +## 1. Multiple Client Support + +### Does OpenCode Allow Multiple Connections for the Same Session? + +**YES** - The OpenCode server uses a Bus-based pub/sub event system that allows multiple clients to connect to the same session and receive updates. 
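The fan-out behaviour described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not OpenCode's `Bus` implementation: `MiniBus`, `BusEvent`, and `Handler` are hypothetical names, and only the `subscribeAll` method name is borrowed from the server code quoted in this document.

```typescript
// Hypothetical event shape, loosely modelled on the "server.connected" payload.
type BusEvent = { type: string; properties: Record<string, unknown> }
type Handler = (event: BusEvent) => void

// Hypothetical stand-in for an in-process pub/sub bus (assumption, not OpenCode code).
class MiniBus {
  private handlers: Handler[] = []

  // Register a handler for every event and return an unsubscribe function,
  // which an SSE connection would call when the client disconnects.
  subscribeAll(handler: Handler): () => void {
    this.handlers.push(handler)
    return () => {
      this.handlers = this.handlers.filter((h) => h !== handler)
    }
  }

  // Deliver the event to every current subscriber, in subscription order.
  publish(event: BusEvent): void {
    for (const handler of this.handlers.slice()) handler(event)
  }
}

// Two "clients" subscribe to the same bus; each receives the published event.
const demoBus = new MiniBus()
const stopClientA = demoBus.subscribeAll((e) => console.log("client A got", e.type))
demoBus.subscribeAll((e) => console.log("client B got", e.type))
demoBus.publish({ type: "message.updated", properties: {} })
stopClientA() // client A disconnects; later events reach only client B
```

Because every subscriber lives in the same process as the publisher, this pattern supports many clients per server but does nothing for clients attached to a different server process.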
+ +### Event Stream Endpoints + +**File**: `packages/opencode/src/server/server.ts` (lines 1957-1995) + +```typescript +GET /event - Session-scoped Server-Sent Events (SSE) stream +GET /global/event - Global event stream +``` + +### Connection Implementation + +**Code Location**: `packages/opencode/src/server/server.ts:1973-1995` + +```typescript +async (c) => { + log.info("event connected") + return streamSSE(c, async (stream) => { + stream.writeSSE({ + data: JSON.stringify({ + type: "server.connected", + properties: {}, + }), + }) + const unsub = Bus.subscribeAll(async (event) => { + await stream.writeSSE({ + data: JSON.stringify(event), + }) + }) + await new Promise((resolve) => { + stream.onAbort(() => { + unsub() + resolve() + log.info("event disconnected") + }) + }) + }) +} +``` + +Each client maintains a separate HTTP connection with an SSE stream. Multiple connections are supported because events are published to all subscribers. + +### Connection Tracking + +**File**: `packages/opencode/src/bus/index.ts` (lines 1-119) + +- Each connection uses the Bus subscription system to track active listeners +- Subscriptions are stored in a `Map()` structure (line 12) +- When a client connects via SSE, it registers a callback handler +- When a client disconnects, the `stream.onAbort()` callback unsubscribes the handler + +### Global Event Broadcasting + +**File**: `packages/opencode/src/bus/global.ts` (lines 1-10) + +```typescript +export const GlobalBus = new EventEmitter<{ + event: [{ + directory: string + payload: any + }] +}>() +``` + +- Uses Node.js EventEmitter for cross-directory event propagation +- Each event published to Bus is also emitted to GlobalBus for multi-client notification + +--- + +## 2. Message Ordering Guarantee + +### How Are Messages Processed? + +Messages are processed **sequentially per session**, using a sophisticated queueing mechanism that ensures only one message is processed at a time for any given session. 
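As a rough sketch of the idea, the gate below lets one caller per session run while later callers wait for that run's result. It assumes that queued callers are resolved with the result of the in-flight run. `SessionGate`, `Waiter`, `run`, and `queued` are hypothetical names for illustration and are not part of the OpenCode API.

```typescript
type Waiter<T> = { resolve: (value: T) => void; reject: (reason: unknown) => void }

// Hypothetical per-session gate (assumption): at most one run per session ID.
class SessionGate<T> {
  // Presence of a key means the session is busy; the array holds queued callers.
  private active = new Map<string, Waiter<T>[]>()

  // Number of callers currently queued behind the in-flight run.
  queued(sessionID: string): number {
    return this.active.get(sessionID)?.length ?? 0
  }

  async run(sessionID: string, work: () => Promise<T>): Promise<T> {
    const waiters = this.active.get(sessionID)
    if (waiters) {
      // Session is busy: queue this caller instead of starting a second run.
      return new Promise<T>((resolve, reject) => {
        waiters.push({ resolve, reject })
      })
    }
    const queue: Waiter<T>[] = []
    this.active.set(sessionID, queue)
    try {
      const result = await work()
      // Hand the same result to every queued caller, in arrival order.
      for (const w of queue) w.resolve(result)
      return result
    } catch (err) {
      for (const w of queue) w.reject(err)
      throw err
    } finally {
      // Release the session so the next call starts a fresh run.
      this.active.delete(sessionID)
    }
  }
}
```

The check and the `Map` update happen synchronously before the first `await`, so two calls arriving in the same tick cannot both start a run. That guarantee holds only inside one JavaScript process.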
+ +### Session Prompt State Lock + +**File**: `packages/opencode/src/session/prompt.ts` (lines 55-238) + +```typescript +const state = Instance.state( + () => { + const data: Record = {} + return data + } +) + +function start(sessionID: string) { + const s = state() + if (s[sessionID]) return // Session already busy - return undefined + const controller = new AbortController() + s[sessionID] = { + abort: controller, + callbacks: [], + } + return controller.signal +} + +export const loop = fn(Identifier.schema("session"), async (sessionID) => { + const abort = start(sessionID) + if (!abort) { + // Session is busy - queue this request + return new Promise((resolve, reject) => { + const callbacks = state()[sessionID].callbacks + callbacks.push({ resolve, reject }) + }) + } + // Process message... +}) +``` + +### How the Queue Works + +1. **First client** to call `prompt()` acquires the lock (sets `state()[sessionID]`) +2. **Subsequent clients** push their resolve/reject callbacks into the queue (line 237) +3. When the first message completes, **all queued callbacks are resolved** in order (line 607) + +### Callback Resolution + +**File**: `packages/opencode/src/session/prompt.ts` (lines 603-610) + +```typescript +SessionCompaction.prune({ sessionID }) +for await (const item of MessageV2.stream(sessionID)) { + if (item.info.role === "user") continue + const queued = state()[sessionID]?.callbacks ?? [] + for (const q of queued) { + q.resolve(item) // Resolve all queued callbacks with the result + } + return item +} +``` + +--- + +## 3. Concurrent Request Handling + +### What Happens When Two Clients Send Messages? + +#### Scenario A: Client B sends while Client A's message is processing + +1. Client A's message starts processing (acquires lock) +2. Client B's request arrives and calls `start(sessionID)` +3. `start()` returns `undefined` because session is busy +4. Client B's promise is queued in `callbacks[]` +5. 
When Client A's message completes, Client B receives the result +6. Client B's message is then processed next + +#### Scenario B: Concurrent sends arrive at exact same time + +- Only **one** client acquires the lock (first to call `start()`) +- Others are queued and resolved in order +- No race condition due to JavaScript's single-threaded event loop + +### BusyError Prevention + +**File**: `packages/opencode/src/session/prompt.ts` (lines 80-83) + +```typescript +export function assertNotBusy(sessionID: string) { + const match = state()[sessionID] + if (match) throw new Session.BusyError(sessionID) +} +``` + +**File**: `packages/opencode/src/session/index.ts` (lines 443-446) + +```typescript +export class BusyError extends Error { + constructor(public readonly sessionID: string) { + super(`Session ${sessionID} is busy`) + } +} +``` + +This check is used by operations like `SessionRevert.revert()` and `SessionRevert.unrevert()` to prevent concurrent modifications during processing. + +### Session Status Tracking + +**File**: `packages/opencode/src/session/status.ts` (lines 43-46) + +```typescript +const state = Instance.state(() => { + const data: Record = {} + return data +}) +``` + +States: `"idle"`, `"retry"`, `"busy"` + +--- + +## 4. File-Level Concurrency Control + +### Reader-Writer Lock Pattern + +**File**: `packages/opencode/src/util/lock.ts` (lines 1-98) + +The server implements a classic read-write lock with writer starvation prevention: + +```typescript +export namespace Lock { + const locks = new Map< + string, + { + readers: number + writer: boolean + waitingReaders: (() => void)[] + waitingWriters: (() => void)[] + } + >() + + function process(key: string) { + const lock = locks.get(key) + if (!lock || lock.writer || lock.readers > 0) return + + // Prioritize writers to prevent starvation + if (lock.waitingWriters.length > 0) { + const nextWriter = lock.waitingWriters.shift()! 
+ nextWriter() + return + } + + // Wake up all waiting readers + while (lock.waitingReaders.length > 0) { + const nextReader = lock.waitingReaders.shift()! + nextReader() + } + } + + export async function read(key: string): Promise { + // Multiple concurrent readers allowed + // ... + } + + export async function write(key: string): Promise { + // Exclusive write access + // ... + } +} +``` + +### Lock Characteristics + +- **Lines 28-32**: Writers have priority over readers (prevents writer starvation) +- **Lines 51-52**: Multiple readers can hold the lock simultaneously +- **Lines 77-78**: Only one writer can hold the lock, exclusively +- Uses Promise-based async/await locking with disposal pattern (`Symbol.dispose`) + +### Storage Lock Usage + +**File**: `packages/opencode/src/storage/storage.ts` (lines 168-196) + +```typescript +export async function read<T>(key: string[]) { + const dir = await state().then((x) => x.dir) + const target = path.join(dir, ...key) + ".json" + return withErrorHandling(async () => { + using _ = await Lock.read(target) // Read lock + const result = await Bun.file(target).json() + return result as T + }) +} + +export async function update<T>(key: string[], fn: (draft: T) => void) { + const dir = await state().then((x) => x.dir) + const target = path.join(dir, ...key) + ".json" + return withErrorHandling(async () => { + using _ = await Lock.write(target) // Write lock + const content = await Bun.file(target).json() + fn(content) + await Bun.write(target, JSON.stringify(content, null, 2)) + return content as T + }) +} + +export async function write<T>(key: string[], content: T) { + const dir = await state().then((x) => x.dir) + const target = path.join(dir, ...key) + ".json" + return withErrorHandling(async () => { + using _ = await Lock.write(target) // Write lock + await Bun.write(target, JSON.stringify(content, null, 2)) + }) +} +``` + +### Concurrency Control Strategy + +- **Reads**: Multiple concurrent reads on the same file (shared read lock, so readers do not block each other) +- **Updates**: 
Exclusive write lock (read-modify-write transaction) +- **Writes**: Exclusive write lock (atomic writes) + +--- + +## 5. Event Broadcasting + +### Multi-Client Event Distribution + +**File**: `packages/opencode/src/bus/index.ts` (lines 55-78) + +```typescript +export async function publish( + def: Definition, + properties: z.output +) { + const payload = { + type: def.type, + properties, + } + log.info("publishing", { type: def.type }) + + const pending = [] + for (const key of [def.type, "*"]) { + const match = state().subscriptions.get(key) + for (const sub of match ?? []) { + pending.push(sub(payload)) // Call all subscribers + } + } + + GlobalBus.emit("event", { // Broadcast globally + directory: Instance.directory, + payload, + }) + + return Promise.all(pending) // Wait for all handlers +} +``` + +### Event Types Published + +**File**: `packages/opencode/src/session/index.ts` (lines 87-120) + +```typescript +export const Event = { + Created: Bus.event("session.created", z.object({ info: Info })), + Updated: Bus.event("session.updated", z.object({ info: Info })), + Deleted: Bus.event("session.deleted", z.object({ info: Info })), + Diff: Bus.event("session.diff", z.object({ sessionID, diff })), + Error: Bus.event("session.error", z.object({ sessionID, error })), +} +``` + +### Message Update Events + +**File**: `packages/opencode/src/session/index.ts` (lines 344-388) + +```typescript +export const updateMessage = fn(MessageV2.Info, async (msg) => { + await Storage.write(["message", msg.sessionID, msg.id], msg) + Bus.publish(MessageV2.Event.Updated, { // Broadcast message update + info: msg, + }) + return msg +}) + +export const updatePart = fn(UpdatePartInput, async (input) => { + const part = "delta" in input ? input.part : input + const delta = "delta" in input ? 
input.delta : undefined + await Storage.write(["part", part.messageID, part.id], part) + Bus.publish(MessageV2.Event.PartUpdated, { // Broadcast part update + part, + delta, + }) + return part +}) +``` + +### Dual Response Mechanism + +1. **Direct Response**: Message response streamed back on the same HTTP connection +2. **Event Broadcasting**: Message updates also published to all SSE subscribers on `/event` endpoint + +### Message Streaming Endpoint + +**File**: `packages/opencode/src/server/server.ts` (lines 942-980) + +```typescript +.post("/session/:id/message", async (c) => { + c.status(200) + c.header("Content-Type", "application/json") + return stream(c, async (stream) => { + const sessionID = c.req.valid("param").id + const body = c.req.valid("json") + const msg = await SessionPrompt.prompt({ ...body, sessionID }) + stream.write(JSON.stringify(msg)) + }) +}) +``` + +--- + +## 6. Race Condition Prevention + +### Key Mechanisms Summary + +| Concern | Mechanism | File | Lines | +|---------|-----------|------|-------| +| **Session-level message conflicts** | `start()` function returns falsy if session busy; queues requests | prompt.ts | 207-238 | +| **File-level concurrent access** | Reader-Writer Lock with writer priority | lock.ts | 24-45 | +| **State disposal races** | Timeout-protected disposal with Promise.all | state.ts | 31-64 | +| **Event ordering** | Bus.publish waits for all subscribers (Promise.all) | index.ts | 77 | + +### Cleanup on Session Completion + +**File**: `packages/opencode/src/session/prompt.ts` (lines 218-230) + +```typescript +export function cancel(sessionID: string) { + log.info("cancel", { sessionID }) + const s = state() + const match = s[sessionID] + if (!match) return + match.abort.abort() // Abort ongoing processing + for (const item of match.callbacks) { + item.reject() // Reject queued requests + } + delete s[sessionID] // Remove session from state + SessionStatus.set(sessionID, { type: "idle" }) + return +} +``` + +### Async 
Queue Utility + +**File**: `packages/opencode/src/util/queue.ts` (lines 1-19) + +```typescript +export class AsyncQueue implements AsyncIterable { + private queue: T[] = [] + private resolvers: ((value: T) => void)[] = [] + + push(item: T) { + const resolve = this.resolvers.shift() + if (resolve) resolve(item) + else this.queue.push(item) + } + + async next(): Promise { + if (this.queue.length > 0) return this.queue.shift()! + return new Promise((resolve) => this.resolvers.push(resolve)) + } + + async *[Symbol.asyncIterator]() { + while (true) yield await this.next() + } +} +``` + +This enables async iteration patterns where consumers can wait for items that haven't been pushed yet. + +--- + +## 7. Key Files Summary + +| File | Role | +|------|------| +| `src/session/prompt.ts:207-238` | Session lock and callback queue for message ordering | +| `src/util/lock.ts:1-98` | Reader-writer lock implementation for file access | +| `src/bus/index.ts:55-78` | Event broadcasting to multiple clients | +| `src/server/server.ts:1973-1995` | SSE event streaming endpoints | +| `src/storage/storage.ts:168-196` | Locked file operations | +| `src/session/status.ts:43-46` | Session status tracking | +| `src/util/queue.ts:1-19` | Async queue for event processing | + +--- + +## Summary Table: Concurrency Control + +| Layer | Mechanism | Scope | Guarantees | +|-------|-----------|-------|-----------| +| **Session Message** | Single-threaded loop with callback queue | Per session | Sequential processing, queued requests | +| **File I/O** | Reader-Writer Lock | Per file | Concurrent reads, exclusive writes | +| **Event Publishing** | Bus pub/sub + GlobalBus EventEmitter | Global | All subscribers notified atomically | +| **State Storage** | Directory-scoped Instance.state | Per project | Singleton per init function | +| **HTTP Connections** | SSE streams with individual subscriptions | Per connection | Independent event delivery | + +--- + +## Conclusion + +The OpenCode server supports 
**multiple concurrent client connections** to the same session through: + +- **Isolated SSE streams** for each client connection +- **Bus-based pub/sub** for event broadcasting +- **Sequential message processing** per session using callback queues +- **File-level locking** with reader-writer semantics +- **Atomic storage operations** with automatic timestamps and event publishing + +This design ensures **message ordering is preserved per session** while allowing **concurrent message processing across different sessions** and **concurrent client connections** to receive real-time updates. + +When two clients send messages to the same session: +1. One gets processed immediately (acquires the lock) +2. The other waits in the queue +3. Both receive the result when processing completes +4. The queued message then processes next From d4a0791d5012326d449a0aab8533904b351e5e45 Mon Sep 17 00:00:00 2001 From: Claude Date: Mon, 24 Nov 2025 04:05:36 +0000 Subject: [PATCH 02/58] docs: add analysis of event replay, LSP utilization, and Go prompts Add four additional documentation files analyzing OpenCode architecture: 1. event-historical-replay.md - Explains that OpenCode does NOT replay historical events on client connect; uses pull-based model for history and push-based for real-time updates 2. lsp-utilization.md - Comprehensive analysis of how OpenCode uses LSP including 19 built-in language servers, diagnostics injection, and symbol search capabilities 3. golang-project-prompts.md - Documents what prompts are sent for Go projects; importantly, there are NO Go-specific instructions - the model infers conventions from project structure and its training 4. 
lsp-selection-mechanism.md - Details how OpenCode selects which LSP server to use based on file extensions, root detection, and configuration hierarchy --- .../docs/analysis/event-historical-replay.md | 291 +++++++++++ .../docs/analysis/golang-project-prompts.md | 388 ++++++++++++++ .../docs/analysis/lsp-selection-mechanism.md | 475 ++++++++++++++++++ .../opencode/docs/analysis/lsp-utilization.md | 389 ++++++++++++++ 4 files changed, 1543 insertions(+) create mode 100644 packages/opencode/docs/analysis/event-historical-replay.md create mode 100644 packages/opencode/docs/analysis/golang-project-prompts.md create mode 100644 packages/opencode/docs/analysis/lsp-selection-mechanism.md create mode 100644 packages/opencode/docs/analysis/lsp-utilization.md diff --git a/packages/opencode/docs/analysis/event-historical-replay.md b/packages/opencode/docs/analysis/event-historical-replay.md new file mode 100644 index 00000000000..b181f239cd9 --- /dev/null +++ b/packages/opencode/docs/analysis/event-historical-replay.md @@ -0,0 +1,291 @@ +# Event Historical Replay Analysis + +This document analyzes whether OpenCode performs historical replay of events when a new client connects to a session. + +## Table of Contents + +1. [Executive Summary](#1-executive-summary) +2. [SSE Event Endpoint Behavior](#2-sse-event-endpoint-behavior) +3. [Bus Subscription Mechanism](#3-bus-subscription-mechanism) +4. [How Clients Get Historical Data](#4-how-clients-get-historical-data) +5. [TUI Client Implementation](#5-tui-client-implementation) +6. [Architecture Pattern](#6-architecture-pattern) + +--- + +## 1. 
Executive Summary + +**OpenCode does NOT perform historical replay of events when a new client connects.** + +Instead, it uses a **pull-based model** for historical data and **push-based model** for real-time updates: + +| Data Type | Retrieval Method | +|-----------|------------------| +| Historical messages | `GET /session/:id/message` (pull) | +| Session state | `GET /session/:id` (pull) | +| Past diffs | `GET /session/:id/diff` (pull) | +| Future updates | SSE `/event` stream (push) | + +--- + +## 2. SSE Event Endpoint Behavior + +### What Happens on Connect + +**File**: `packages/opencode/src/server/server.ts` (lines 1957-1996) + +When a client connects to the `/event` endpoint, only a connection acknowledgment is sent: + +```typescript +.get("/event", /* ... */, async (c) => { + log.info("event connected") + return streamSSE(c, async (stream) => { + // Send only a connection acknowledgment - NO historical events + stream.writeSSE({ + data: JSON.stringify({ + type: "server.connected", + properties: {}, + }), + }) + // Subscribe ONLY to future events + const unsub = Bus.subscribeAll(async (event) => { + await stream.writeSSE({ + data: JSON.stringify(event), + }) + }) + await new Promise((resolve) => { + stream.onAbort(() => { + unsub() + resolve() + log.info("event disconnected") + }) + }) + }) +}) +``` + +**Key Finding**: Only a `server.connected` event is sent. No message history or session state is replayed. + +### Global Event Endpoint + +**File**: `packages/opencode/src/server/server.ts` (lines 127-170) + +The `/global/event` endpoint uses `GlobalBus` with identical behavior - no historical replay: + +```typescript +.get("/global/event", /* ... */, async (c) => { + log.info("global event connected") + return streamSSE(c, async (stream) => { + GlobalBus.on("event", handler) + // No historical events sent + await new Promise((resolve) => { + stream.onAbort(() => { + GlobalBus.off("event", handler) + resolve() + }) + }) + }) +}) +``` + +--- + +## 3. 
Bus Subscription Mechanism + +### No Event History Storage + +**File**: `packages/opencode/src/bus/index.ts` + +The Bus implementation stores subscriptions in memory with NO event history: + +```typescript +const state = Instance.state(() => { + const subscriptions = new Map() + return { subscriptions } // No event history! +}) + +export function subscribeAll(callback: (event: any) => void) { + return raw("*", callback) +} + +function raw(type: string, callback: (event: any) => void) { + const subscriptions = state().subscriptions + let match = subscriptions.get(type) ?? [] + match.push(callback) + subscriptions.set(type, match) + // ... returns unsubscribe function +} +``` + +**Key Characteristics**: +- No event history maintained +- `Bus.subscribeAll()` only calls subscribers with NEW events going forward +- Events are NOT stored or cached +- Pure pub/sub pattern with zero replay capability + +--- + +## 4. How Clients Get Historical Data + +Clients retrieve historical data through explicit REST API calls, not through event streams. + +### Message History API + +**File**: `packages/opencode/src/server/server.ts` (lines 838-875) + +```typescript +.get("/session/:id/message", /* ... 
*/, async (c) => { + const query = c.req.valid("query") + const messages = await Session.messages({ + sessionID: c.req.valid("param").id, + limit: query.limit, // Supports pagination + }) + return c.json(messages) +}) +``` + +### Message Retrieval Implementation + +**File**: `packages/opencode/src/session/index.ts` (line 287) + +```typescript +export const messages = fn( + z.object({ + sessionID: Identifier.schema("session"), + limit: z.number().optional(), + }), + async (input) => { + const result = [] as MessageV2.WithParts[] + for await (const msg of MessageV2.stream(input.sessionID)) { + if (input.limit && result.length >= input.limit) break + result.push(msg) + } + result.reverse() + return result + }, +) +``` + +**File**: `packages/opencode/src/session/message-v2.ts` (line 670) + +```typescript +export const stream = fn(Identifier.schema("session"), async function* (sessionID) { + const list = await Array.fromAsync(await Storage.list(["message", sessionID])) + for (let i = list.length - 1; i >= 0; i--) { + yield await get({ + sessionID, + messageID: list[i][2], + }) + } +}) +``` + +Messages are fetched from persistent storage (file system), not from event streams. + +--- + +## 5. 
TUI Client Implementation + +### Explicit State Synchronization + +**File**: `packages/opencode/src/cli/cmd/tui/context/sync.tsx` + +The TUI client explicitly syncs session state on demand: + +```typescript +session: { + async sync(sessionID: string) { + if (store.message[sessionID]) return // Cache check + + // Fetch session data via explicit API calls + const [session, messages, todo, diff] = await Promise.all([ + sdk.client.session.get({ path: { id: sessionID } }), + sdk.client.session.messages({ path: { id: sessionID }, query: { limit: 100 } }), + sdk.client.session.todo({ path: { id: sessionID } }), + sdk.client.session.diff({ path: { id: sessionID } }), + ]) + + // Store in local state + setStore(produce((draft) => { + draft.message[sessionID] = messages.data!.map((x) => x.info) + for (const message of messages.data!) { + draft.part[message.info.id] = message.parts + } + // ... + })) + }, +} + +// Only future events are listened to via event stream +sdk.event.listen((e) => { + const event = e.details + // Handle message.updated, session.updated, etc. (NEW events only) +}) +``` + +**Key Points**: +- Session history is NOT replayed via SSE +- Client explicitly calls `/session/:id/message` API +- Messages are fetched with optional limit (pagination support) +- Event stream is used ONLY for incremental updates + +--- + +## 6. Architecture Pattern + +### Event Types Published + +**File**: `packages/opencode/src/session/index.ts` (lines 87-120) + +Session publishes these events for future subscribers only: + +```typescript +export const Event = { + Created: Bus.event("session.created", z.object({ info: Info })), + Updated: Bus.event("session.updated", z.object({ info: Info })), + Deleted: Bus.event("session.deleted", z.object({ info: Info })), + Diff: Bus.event("session.diff", z.object({ ... })), + Error: Bus.event("session.error", z.object({ ... 
})), +} +``` + +**File**: `packages/opencode/src/session/message-v2.ts` (lines 373-399) + +Message events: + +```typescript +export const Event = { + Updated: Bus.event("message.updated", z.object({ info: Info })), + Removed: Bus.event("message.removed", z.object({ ... })), + PartUpdated: Bus.event("message.part.updated", z.object({ ... })), + PartRemoved: Bus.event("message.part.removed", z.object({ ... })), +} +``` + +Events are only for NEW state changes, not for delivering historical state. + +### Design Rationale + +This architecture provides: + +1. **Scalability**: No need to store event history in memory +2. **Simplicity**: Clear separation between historical data and real-time updates +3. **Flexibility**: Clients can fetch exactly what they need via REST +4. **Efficiency**: SSE connections remain lightweight + +--- + +## Summary + +| Aspect | Finding | +|--------|---------| +| **Event History Storage** | None - events are not stored | +| **Historical Event Replay** | Not implemented | +| **SSE On Connect** | Sends only `server.connected` acknowledgment | +| **Future Events** | Streamed via `/event` or `/global/event` endpoints | +| **Session History Retrieval** | Explicit REST API calls (`GET /session/:id/message`) | +| **Message Pagination** | Supported via `limit` query parameter | +| **Client State Sync** | Lazy-loaded on demand via `session.sync()` | +| **Storage Backend** | File system based (via Storage abstraction) | + +**Conclusion**: OpenCode uses a clean separation of concerns where the event bus handles real-time notifications while REST endpoints handle data retrieval. New clients must explicitly fetch past data through API calls. 
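A client that follows this split might pull a snapshot once and then fold pushed events into it. The sketch below shows only that merge step, as a pure function. `MessageInfo`, `ServerEvent`, and `applyEvent` are hypothetical and heavily simplified (the real endpoint returns messages together with their parts); only the event type names `server.connected` and `message.updated` are taken from the text above.

```typescript
// Illustrative shapes only; the real MessageV2.Info type has more fields.
type MessageInfo = { id: string; sessionID: string; role: "user" | "assistant" }
type ServerEvent =
  | { type: "server.connected"; properties: Record<string, never> }
  | { type: "message.updated"; properties: { info: MessageInfo } }

// Fold one pushed event into the snapshot that was pulled over REST.
function applyEvent(messages: MessageInfo[], event: ServerEvent): MessageInfo[] {
  if (event.type !== "message.updated") return messages
  const info = event.properties.info
  const index = messages.findIndex((m) => m.id === info.id)
  if (index === -1) return [...messages, info] // message not seen before: append
  const copy = messages.slice()
  copy[index] = info // known message: replace the stale entry
  return copy
}

// Snapshot as it might come back from GET /session/:id/message (simplified).
const snapshot: MessageInfo[] = [{ id: "m1", sessionID: "s1", role: "user" }]

// Events as they might arrive afterwards on the /event stream.
const events: ServerEvent[] = [
  { type: "server.connected", properties: {} },
  { type: "message.updated", properties: { info: { id: "m2", sessionID: "s1", role: "assistant" } } },
  { type: "message.updated", properties: { info: { id: "m2", sessionID: "s1", role: "assistant" } } },
]

const current = events.reduce(applyEvent, snapshot)
```

Keying on the message `id` means a repeated `message.updated` for the same message replaces the entry instead of duplicating it, which is needed because the stream carries incremental updates rather than a replayable history.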
diff --git a/packages/opencode/docs/analysis/golang-project-prompts.md b/packages/opencode/docs/analysis/golang-project-prompts.md new file mode 100644 index 00000000000..2c62c7be75a --- /dev/null +++ b/packages/opencode/docs/analysis/golang-project-prompts.md @@ -0,0 +1,388 @@ +# Golang Project Prompts Analysis + +This document provides a comprehensive analysis of what prompts are sent to the model when working on a Golang project in OpenCode. + +## Table of Contents + +1. [Executive Summary](#1-executive-summary) +2. [System Prompt Construction](#2-system-prompt-construction) +3. [Prompt Components](#3-prompt-components) +4. [Go-Specific Detection](#4-go-specific-detection) +5. [Complete Example Prompt](#5-complete-example-prompt) +6. [Go-Specific Tooling](#6-go-specific-tooling) +7. [Key Finding](#7-key-finding) + +--- + +## 1. Executive Summary + +**Important Finding**: There are **NO language-specific instructions for Go** in the system prompts. The model receives: + +- Generic coding instructions (same for all languages) +- Project structure (which happens to include `.go` files) +- Custom `AGENTS.md` instructions (if provided by the user) +- Go LSP diagnostics (when errors are reported via gopls) + +The model must infer Go best practices from the project context and its training. + +--- + +## 2. 
System Prompt Construction + +### Primary Entry Point + +**File**: `packages/opencode/src/session/prompt.ts` (lines 465-470) + +```typescript +const system = await resolveSystemPrompt({ + providerID: model.providerID, + modelID: model.info.id, + agent, + system: lastUser.system, +}) +``` + +### Resolution Function + +**File**: `packages/opencode/src/session/prompt.ts` (lines 621-641) + +```typescript +async function resolveSystemPrompt(input: { + system?: string + agent: Agent.Info + providerID: string + modelID: string +}) { + let system = SystemPrompt.header(input.providerID) + system.push( + ...(() => { + if (input.system) return [input.system] + if (input.agent.prompt) return [input.agent.prompt] + return SystemPrompt.provider(input.modelID) + })(), + ) + system.push(...(await SystemPrompt.environment())) + system.push(...(await SystemPrompt.custom())) + // max 2 system prompt messages for caching purposes + const [first, ...rest] = system + system = [first, rest.join("\n")] + return system +} +``` + +--- + +## 3. Prompt Components + +The final prompt sent consists of these components in order: + +### 1. Provider Header + +**File**: `packages/opencode/src/session/system.ts` (lines 22-25) + +```typescript +export function header(providerID: string) { + if (providerID.includes("anthropic")) return [PROMPT_ANTHROPIC_SPOOF.trim()] + return [] +} +``` + +### 2. 
Model-Specific Base Prompt + +**File**: `packages/opencode/src/session/system.ts` (lines 27-34) + +```typescript +export function provider(modelID: string) { + if (modelID.includes("gpt-5")) return [PROMPT_CODEX] + if (modelID.includes("gpt-") || modelID.includes("o1") || modelID.includes("o3")) return [PROMPT_BEAST] + if (modelID.includes("gemini-")) return [PROMPT_GEMINI] + if (modelID.includes("claude")) return [PROMPT_ANTHROPIC] + if (modelID.includes("polaris-alpha")) return [PROMPT_POLARIS] + return [PROMPT_ANTHROPIC_WITHOUT_TODO] +} +``` + +**Prompt files by model**: + +| Model | Prompt File | Lines | +|-------|-------------|-------| +| Claude | `src/session/prompt/anthropic.txt` | 106 | +| GPT-4/o1/o3 | `src/session/prompt/beast.txt` | lengthy | +| Gemini | `src/session/prompt/gemini.txt` | 156 | +| Polaris | `src/session/prompt/polaris.txt` | - | +| GPT-5 | `src/session/prompt/codex.txt` | 319 | +| Other | `src/session/prompt/qwen.txt` | - | + +### 3. Environment Context + +**File**: `packages/opencode/src/session/system.ts` (lines 36-59) + +```typescript +export async function environment() { + const project = Instance.project + return [ + [ + `Here is some useful information about the environment you are running in:`, + ``, + ` Working directory: ${Instance.directory}`, + ` Is directory a git repo: ${project.vcs === "git" ? "yes" : "no"}`, + ` Platform: ${process.platform}`, + ` Today's date: ${new Date().toDateString()}`, + ``, + ``, + ` ${ + project.vcs === "git" + ? await Ripgrep.tree({ + cwd: Instance.directory, + limit: 200, + }) + : "" + }`, + ``, + ].join("\n"), + ] +} +``` + +For a Go project, this includes: +- Working directory path +- Git repo status +- Platform (Linux/macOS/Windows) +- Current date +- Project file tree (first 200 files/directories) + +### 4. 
Custom Instructions + +**File**: `packages/opencode/src/session/system.ts` (lines 71-118) + +Custom instructions are loaded from (searched in order): + +**Local files** (project-specific): +- `AGENTS.md` - Agent instructions for the repo +- `CLAUDE.md` - Legacy Claude instructions +- `CONTEXT.md` - Deprecated context file + +**Global files** (user-level): +- `~/.claude/CLAUDE.md` +- `${Global.Path.config}/AGENTS.md` + +--- + +## 4. Go-Specific Detection + +### Go File Detection + +**File**: `packages/opencode/src/lsp/language.ts` (line 35) + +```typescript +".go": "go", +``` + +### Go Project Root Detection + +**File**: `packages/opencode/src/lsp/server.ts` (lines 211-217) + +```typescript +export const Gopls: Info = { + id: "gopls", + root: async (file) => { + const work = await NearestRoot(["go.work"])(file) + if (work) return work + return NearestRoot(["go.mod", "go.sum"])(file) + }, + extensions: [".go"], + // ... +} +``` + +Language detection looks for: +1. `go.work` (Go workspace files) +2. `go.mod` + `go.sum` (standard Go module files) + +### Gopls Language Server + +**File**: `packages/opencode/src/lsp/server.ts` (lines 219-250) + +```typescript +async spawn(root) { + let bin = Bun.which("gopls", { + PATH: process.env["PATH"] + ":" + Global.Path.bin, + }) + if (!bin) { + if (!Bun.which("go")) return + if (Flag.OPENCODE_DISABLE_LSP_DOWNLOAD) return + + log.info("installing gopls") + const proc = Bun.spawn({ + cmd: ["go", "install", "golang.org/x/tools/gopls@latest"], + env: { ...process.env, GOBIN: Global.Path.bin }, + stdout: "pipe", + stderr: "pipe", + stdin: "pipe", + }) + // ... installation logic + } + // ... +} +``` + +Gopls is automatically installed if not present. + +--- + +## 5. Complete Example Prompt + +When working on a Go project with Claude Sonnet, the model receives: + +### Message 1 (System) + +``` + + +You are OpenCode, the best coding agent on the planet. + +You are an interactive CLI tool that helps users with software engineering tasks. 
+Use the instructions below and the tools available to you to assist the user. + +... [full anthropic.txt content - 106 lines of instructions about: + - Tone and style + - Task management + - Tool usage + - Code editing guidelines + - Security considerations] +``` + +### Message 2 (System) + +``` +Here is some useful information about the environment you are running in: + + Working directory: /home/user/mygoproject + Is directory a git repo: yes + Platform: linux + Today's date: Sun Nov 24 2024 + + + mygoproject/ + go.mod + go.sum + main.go + cmd/ + server/ + main.go + internal/ + handler/ + handler.go + service/ + service.go + pkg/ + utils/ + helpers.go + tests/ + integration_test.go + README.md + Dockerfile + .gitignore + + +Instructions from: /home/user/mygoproject/AGENTS.md +... [custom instructions if file exists] +``` + +### Messages 3+ + +User messages with file contents, conversation history, tool calls, etc. + +--- + +## 6. Go-Specific Tooling + +### Go Formatter + +**File**: `packages/opencode/src/format/formatter.ts` (lines 14-21) + +```typescript +export const gofmt: Info = { + name: "gofmt", + command: ["gofmt", "-w", "$FILE"], + extensions: [".go"], + async enabled() { + return Bun.which("gofmt") !== null + }, +} +``` + +OpenCode uses `gofmt` (Go's standard formatter) when available. + +### Tool Context for Go Code + +**File**: `packages/opencode/src/session/prompt.ts` (lines 559-598) + +When processing Go code, the model has access to: + +| Tool | Usage for Go | +|------|--------------| +| **Read** | Read `.go` files | +| **Bash** | Run `go test`, `go build`, `go run`, `go fmt` | +| **Edit** | Modify Go source files | +| **LSP symbols** | Via gopls integration | +| **MCP servers** | Any connected MCP servers | + +The model sees `go.mod`/`go.sum` contents when referenced in conversation. + +--- + +## 7. Key Finding + +### No Go-Specific Instructions + +**Important**: The system prompts contain **no language-specific instructions for Go**. 
The model receives: + +1. **Generic coding instructions** (same for JavaScript, Python, Rust, etc.) +2. **Project structure** (which shows `.go` files, `go.mod`, etc.) +3. **Custom AGENTS.md** (if provided by user) +4. **Go LSP diagnostics** (when errors are reported) + +### How Go Best Practices Are Inferred + +The model must infer Go conventions from: + +1. **Project structure** - Standard Go layout (`cmd/`, `internal/`, `pkg/`) +2. **Existing `.go` files** - When read during conversation +3. **`go.mod` contents** - Module path, dependencies +4. **LSP diagnostics** - Type errors, unused imports from gopls +5. **User's custom instructions** - AGENTS.md can specify Go guidelines +6. **Model's training** - Knowledge of Go idioms, error handling patterns, etc. + +### Recommended AGENTS.md for Go Projects + +Users can add Go-specific instructions in `AGENTS.md`: + +```markdown +## Go Development Guidelines + +- Follow standard Go project layout (cmd/, internal/, pkg/) +- Use `go fmt` for formatting +- Handle errors explicitly, don't ignore them +- Use table-driven tests +- Prefer composition over inheritance +- Use interfaces for dependency injection +- Run `go vet` and `golangci-lint` before committing +``` + +--- + +## Summary Table + +| Aspect | Implementation | Go-Specific Details | +|--------|----------------|---------------------| +| **Language Detection** | File extension `.go` | Detected via LSP extensions | +| **Project Root** | `go.work` first, then `go.mod`/`go.sum` | Searches up directory tree | +| **LSP Server** | gopls (auto-installed) | `go install golang.org/x/tools/gopls@latest` | +| **Formatter** | gofmt | Enabled when `gofmt` is found on `PATH` | +| **System Prompt** | Language-agnostic (base prompt varies by model) | No Go-specific instructions | +| **Environment Context** | File tree + metadata | File tree limited to 200 entries | +| **Custom Instructions** | AGENTS.md/CLAUDE.md | User-provided only | +| **Diagnostics** | gopls errors/warnings | Reported through LSP | + +The design is 
**language-agnostic** - OpenCode treats Go projects the same as JavaScript, Python, or Rust projects, relying on LSP integration and the model's inherent knowledge of language conventions. diff --git a/packages/opencode/docs/analysis/lsp-selection-mechanism.md b/packages/opencode/docs/analysis/lsp-selection-mechanism.md new file mode 100644 index 00000000000..0c6b2feca3f --- /dev/null +++ b/packages/opencode/docs/analysis/lsp-selection-mechanism.md @@ -0,0 +1,475 @@ +# LSP Selection Mechanism Analysis + +This document provides a comprehensive analysis of how OpenCode decides which LSP (Language Server Protocol) server to use for different files and projects. + +## Table of Contents + +1. [Overview](#1-overview) +2. [Configuration Schema](#2-configuration-schema) +3. [Server Selection Algorithm](#3-server-selection-algorithm) +4. [Default LSP Servers](#4-default-lsp-servers) +5. [Root Detection Methods](#5-root-detection-methods) +6. [Custom LSP Configuration](#6-custom-lsp-configuration) +7. [Per-Project Settings](#7-per-project-settings) +8. [Auto-Discovery and Installation](#8-auto-discovery-and-installation) +9. [Multiple Servers Per File](#9-multiple-servers-per-file) +10. [Server Lifecycle](#10-server-lifecycle) + +--- + +## 1. Overview + +OpenCode's LSP selection is based on: + +1. **File extension** - Primary matching criterion +2. **Project root detection** - Finding the appropriate workspace +3. **Configuration** - User-defined server settings +4. **Availability** - Whether the server binary exists + +--- + +## 2.
Configuration Schema + +**File**: `packages/opencode/src/config/config.ts` (lines 565-600) + +```typescript +lsp: z + .union([ + z.literal(false), // Disable all LSPs + z.record( + z.string(), // Server ID (e.g., "gopls", "typescript") + z.union([ + z.object({ + disabled: z.literal(true), // Disable specific server + }), + z.object({ + command: z.array(z.string()), // Command to spawn + extensions: z.array(z.string()).optional(), // File extensions + disabled: z.boolean().optional(), + env: z.record(z.string(), z.string()).optional(), // Environment vars + initialization: z.record(z.string(), z.any()).optional(), // Init options + }), + ]), + ), + ]) + .optional() +``` + +### Configuration Options + +| Option | Type | Description | +|--------|------|-------------| +| `lsp: false` | boolean | Disables all LSP servers globally | +| `lsp..disabled` | boolean | Disables a specific LSP server | +| `lsp..command` | string[] | Custom command to run the LSP | +| `lsp..extensions` | string[] | File extensions to match | +| `lsp..env` | object | Environment variables for the LSP process | +| `lsp..initialization` | object | Initialization options passed to LSP | + +--- + +## 3. 
Server Selection Algorithm + +**File**: `packages/opencode/src/lsp/index.ts` (lines 156-240) + +The `getClients(file)` function implements the selection: + +```typescript +async function getClients(file: string) { + const s = state() + const extension = path.parse(file).ext || file // Step 1: Get extension + const result: LSPClient[] = [] + + for (const [name, server] of Object.entries(s.servers)) { // Step 2: Iterate servers + // Step 3: Extension filtering + if (server.extensions.length && !server.extensions.includes(extension)) continue + + // Step 4: Root detection + const root = await server.root(file) + if (!root) continue + + // Step 5: Skip broken servers + if (s.broken.has(root + server.id)) continue + + // Step 6: Check cache + const key = root + server.id + const existing = s.clients[key] + if (existing) { + result.push(existing) + continue + } + + // Step 7: Check inflight spawns + const inflight = s.spawning.get(key) + if (inflight) { + const client = await inflight + if (client) result.push(client) + continue + } + + // Step 8: Spawn new server + const promise = (async () => { + const handle = await server.spawn(root) + if (!handle) { + s.broken.add(key) + return undefined + } + const client = await LSPClient.create({...}) + s.clients[key] = client + return client + })() + + s.spawning.set(key, promise) + const client = await promise + if (client) result.push(client) + } + + return result +} +``` + +### Selection Flow Summary + +``` +1. Extract file extension (.ts, .go, .py, etc.) +2. Load server configuration from Config +3. For each LSP server: + a. Check if file extension matches server.extensions + b. Call server.root(file) to determine project root + c. Skip if root not found or server previously failed + d. Check cache for existing client at (root, serverID) + e. If cached, return cached client + f. If spawning in progress, wait for completion + g. Otherwise spawn new server process +4. 
Return array of all applicable LSP clients +``` + +--- + +## 4. Default LSP Servers + +**File**: `packages/opencode/src/lsp/server.ts` (lines 13-1168) + +OpenCode ships with 19 built-in LSP server definitions: + +| Server | ID | Extensions | Root Detection | +|--------|-----|------------|----------------| +| **Deno** | `deno` | `.ts`, `.tsx`, `.js`, `.jsx`, `.mjs` | `deno.json`, `deno.jsonc` | +| **TypeScript** | `typescript` | `.ts`, `.tsx`, `.js`, `.jsx`, `.mjs`, `.cjs`, `.mts`, `.cts` | Lockfiles (excludes deno) | +| **Vue** | `vue` | `.vue` | Lockfiles | +| **ESLint** | `eslint` | `.ts`, `.tsx`, `.js`, `.jsx`, `.mts`, `.cts`, `.vue` | Lockfiles | +| **Go** | `gopls` | `.go` | `go.work`, `go.mod`, `go.sum` | +| **Ruby** | `ruby-lsp` | `.rb`, `.rake`, `.gemspec`, `.ru` | `Gemfile` | +| **Python** | `pyright` | `.py`, `.pyi` | `pyproject.toml`, `requirements.txt`, etc. | +| **Elixir** | `elixir-ls` | `.ex`, `.exs` | `mix.exs`, `mix.lock` | +| **Zig** | `zls` | `.zig`, `.zon` | `build.zig` | +| **C#** | `csharp` | `.cs` | `.sln`, `.csproj`, `global.json` | +| **Swift** | `sourcekit-lsp` | `.swift`, `.objc`, `.objcpp` | `Package.swift`, xcodeproj | +| **Rust** | `rust` | `.rs` | `Cargo.toml`, `Cargo.lock` | +| **C/C++** | `clangd` | `.c`, `.cpp`, `.cc`, `.h`, `.hpp` | `compile_commands.json`, `CMakeLists.txt` | +| **Svelte** | `svelte` | `.svelte` | Lockfiles | +| **Astro** | `astro` | `.astro` | Lockfiles | +| **Java** | `jdtls` | `.java` | `pom.xml`, `build.gradle` | +| **YAML** | `yaml-ls` | `.yaml`, `.yml` | Lockfiles | +| **Lua** | `lua-ls` | `.lua` | `.luarc.json`, `.luacheckrc` | +| **PHP** | `php intelephense` | `.php` | `composer.json` | + +--- + +## 5. 
Root Detection Methods + +### NearestRoot Pattern + +**File**: `packages/opencode/src/lsp/server.ts` (lines 23-45) + +```typescript +function NearestRoot(includePatterns: string[], excludePatterns?: string[]) { + return async (file: string) => { + let dir = path.dirname(file) + while (true) { + // Check exclude patterns first + if (excludePatterns) { + for (const pattern of excludePatterns) { + if (await Bun.file(path.join(dir, pattern)).exists()) { + return undefined + } + } + } + // Check include patterns + for (const pattern of includePatterns) { + if (await Bun.file(path.join(dir, pattern)).exists()) { + return dir + } + } + // Walk up directory tree + const parent = path.dirname(dir) + if (parent === dir) break + dir = parent + } + return Instance.directory // Fallback + } +} +``` + +### Language-Specific Root Detection + +**TypeScript** (lines 85-88): +- Looks for: `package-lock.json`, `bun.lockb`, `yarn.lock` +- Excludes: `deno.json`, `deno.jsonc` + +**Go** (lines 213-216): +- Prefers: `go.work` +- Falls back to: `go.mod`, `go.sum` + +**Rust** (lines 586-614): +- Finds workspace root by searching for `[workspace]` in `Cargo.toml` + +--- + +## 6. Custom LSP Configuration + +Custom LSP servers can be configured in `opencode.jsonc`: + +```jsonc +{ + "lsp": { + "my-custom-server": { + "command": ["node", "/path/to/server.js"], + "extensions": [".custom", ".myext"], + "env": { + "CUSTOM_VAR": "value" + }, + "initialization": { + "customOption": "value" + } + } + } +} +``` + +### Configuration Rules + +1. **Built-in override**: If server ID matches a built-in (e.g., "typescript"), it replaces that server +2. **Custom requirement**: Custom servers must specify the `extensions` array +3. 
**Validation**: Config validates that custom servers have extensions + +### Example: Override TypeScript LSP + +```jsonc +{ + "lsp": { + "typescript": { + "command": ["custom-ts-server", "--stdio"], + "extensions": [".ts", ".tsx"], + "initialization": { + "customOption": true + } + } + } +} +``` + +--- + +## 7. Per-Project Settings + +### Configuration Hierarchy + +**File**: `packages/opencode/src/config/config.ts` (lines 24-94) + +Priority order (later overrides earlier): + +1. Global config: `~/.opencode/config.json` +2. Worktree config: `.opencode/opencode.jsonc` +3. Project config: `/opencode.jsonc` +4. Environment variable: `OPENCODE_CONFIG` +5. Flag override: `OPENCODE_CONFIG_CONTENT` +6. Directory configs: All `.opencode` directories up the tree + +### Merge Strategy + +All configs are **deep-merged**. Example: + +``` +~/.opencode/config.json: # Global defaults + lsp: + typescript: { command: [...] } + +/.opencode/opencode.jsonc: # Workspace override + lsp: + typescript: + disabled: true + +/opencode.jsonc: # Project override + lsp: + typescript: + command: ["custom-ts-server"] # Re-enables with custom command +``` + +--- + +## 8. 
Auto-Discovery and Installation + +### Automatic Binary Download + +OpenCode auto-downloads LSP servers on-demand unless disabled: + +**Environment Variable**: `OPENCODE_DISABLE_LSP_DOWNLOAD` + +### Auto-Installation Methods + +| Server | Installation Method | +|--------|---------------------| +| **Gopls** | `go install golang.org/x/tools/gopls@latest` | +| **Pyright** | Downloads npm package to `$OPENCODE_BIN/node_modules/pyright` | +| **Clangd** | Downloads platform-specific binary from GitHub | +| **Zls** | Downloads and extracts platform-specific binary from GitHub | +| **ElixirLS** | Downloads from GitHub, compiles with `mix` | +| **JDTLS** | Downloads from Eclipse | +| **Ruby-LSP** | `gem install ruby-lsp` | +| **C#** | `dotnet tool install csharp-ls` | +| **Vue/Svelte/Astro** | Downloads from npm | + +### Binary Discovery + +Servers check multiple locations: + +```typescript +let bin = Bun.which("gopls", { + PATH: process.env["PATH"] + ":" + Global.Path.bin, +}) +``` + +- System PATH +- OpenCode's bin directory (`~/.opencode/bin`) + +--- + +## 9. Multiple Servers Per File + +OpenCode can run **multiple LSP servers for the same file** simultaneously. + +### Examples + +| File Type | Active Servers | +|-----------|----------------| +| `.ts` file | `typescript`, `eslint` | +| `.vue` file | `vue`, `typescript` | +| `.tsx` file | `typescript`, `eslint` | + +Each server provides different capabilities: +- TypeScript: Type checking, completions, hover +- ESLint: Linting, code style + +### Selection Result + +The `getClients()` function returns an **array** of all applicable clients: + +```typescript +const clients = await getClients("/project/src/app.ts") +// Returns: [typescriptClient, eslintClient] +``` + +--- + +## 10. 
Server Lifecycle + +### Initialization + +**File**: `packages/opencode/src/lsp/client.ts` (lines 76-106) + +```typescript +await connection.sendRequest("initialize", { + rootUri: "file://" + root, + initializationOptions: { + ...input.server.initialization, + }, + capabilities: { + window: { workDoneProgress: true }, + workspace: { configuration: true }, + textDocument: { + synchronization: { + didOpen: true, + didChange: true, + }, + publishDiagnostics: {}, + }, + }, +}) +await connection.sendNotification("initialized") +``` + +### Disabling Servers + +**Global disable**: +```jsonc +{ + "lsp": false +} +``` + +**Individual disable**: +```jsonc +{ + "lsp": { + "typescript": { "disabled": true }, + "eslint": { "disabled": true } + } +} +``` + +### Broken Server Tracking + +**Lines 165-172, 207** in `lsp/index.ts`: + +Failed servers are tracked to avoid repeated spawn attempts: + +```typescript +if (!handle) { + s.broken.add(key) // Key: "{root}{serverId}" + return undefined +} + +// Later, skip broken servers: +if (s.broken.has(root + server.id)) continue +``` + +### Shutdown + +**Lines 120-122** in `lsp/index.ts`: + +```typescript +async function shutdown() { + for (const client of Object.values(s.clients)) { + await client.shutdown() + } +} +``` + +--- + +## Summary: Decision Tree + +| Condition | Action | +|-----------|--------| +| `lsp: false` globally | No LSP servers run | +| Server `disabled: true` | Server removed from available servers | +| File extension not in `extensions` | Server skipped for that file | +| `server.root(file)` returns `undefined` | Server skipped (no project root) | +| Server in `s.broken` set | Server skipped (previously failed) | +| Existing client at (root, serverId) | Cached client reused | +| Server spawn inflight | Wait for spawn to complete | +| Otherwise | New server process spawned | + +--- + +## Key Files and Line Numbers + +| File | Lines | Purpose | +|------|-------|---------| +| `src/lsp/server.ts` | 13-1168 | LSP server 
definitions | +| `src/lsp/index.ts` | 156-240 | Server selection logic | +| `src/config/config.ts` | 565-600 | Configuration schema | +| `src/lsp/language.ts` | 1-106 | Language extensions mapping | +| `src/lsp/client.ts` | 1-216 | LSP client implementation | +| `src/config/config.ts` | 24-94 | Config loading and merging | diff --git a/packages/opencode/docs/analysis/lsp-utilization.md b/packages/opencode/docs/analysis/lsp-utilization.md new file mode 100644 index 00000000000..dc41ee0ddf1 --- /dev/null +++ b/packages/opencode/docs/analysis/lsp-utilization.md @@ -0,0 +1,389 @@ +# LSP (Language Server Protocol) Utilization Analysis + +This document provides a comprehensive analysis of how OpenCode utilizes LSP (Language Server Protocol) for enhanced code intelligence. + +## Table of Contents + +1. [Overview](#1-overview) +2. [LSP Core Integration](#2-lsp-core-integration) +3. [LSP Client Implementation](#3-lsp-client-implementation) +4. [Supported Language Servers](#4-supported-language-servers) +5. [LSP Data Usage](#5-lsp-data-usage) +6. [LSP Tools](#6-lsp-tools) +7. [LSP Lifecycle Management](#7-lsp-lifecycle-management) +8. [Event System](#8-event-system) +9. [Dependencies](#9-dependencies) + +--- + +## 1. Overview + +OpenCode has a comprehensive LSP integration that provides: + +- **19 built-in language servers**, most with automatic installation +- **Diagnostics** automatically injected into edit tool context +- **Hover information** available for type inspection +- **Symbol search** for workspace and document symbols +- **On-demand server spawning** per file extension +- **Configurable per server** with custom commands and initialization options + +--- + +## 2. LSP Core Integration + +### Main API Location + +**File**: `packages/opencode/src/lsp/index.ts` (lines 1-370) + +The LSP namespace provides the primary interface for all LSP functionality.
+ +### Key Features Exposed + +| Function | Lines | Description | +|----------|-------|-------------| +| `LSP.init()` | 125-127 | Initializes LSP state via `Instance.state()` | +| `LSP.diagnostics()` | 256-266 | Aggregates diagnostics from all language servers | +| `LSP.hover()` | 268-280 | Sends `textDocument/hover` requests | +| `LSP.workspaceSymbol()` | 322-332 | Searches for symbols across workspace | +| `LSP.documentSymbol()` | 334-346 | Gets symbols within a specific file | +| `LSP.touchFile()` | 242-254 | Opens/updates files and optionally waits for diagnostics | +| `LSP.Diagnostic.pretty()` | 354-369 | Formats diagnostics with severity levels | + +### Workspace Symbol Filtering + +**Lines 322-332**: Workspace symbols are filtered to specific kinds: +- Classes +- Functions +- Methods +- Interfaces +- Variables +- Constants +- Structs +- Enums + +--- + +## 3. LSP Client Implementation + +### Transport Mechanism + +**File**: `packages/opencode/src/lsp/client.ts` (lines 1-215) + +Uses **stdio-based** transport (lines 41-44): + +```typescript +createMessageConnection( + new StreamMessageReader(input.server.process.stdout), + new StreamMessageWriter(input.server.process.stdin) +) +``` + +Uses `vscode-jsonrpc` for JSON-RPC communication with spawned language server processes. 
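For context, the stream reader and writer above exchange messages using the LSP base protocol, in which each JSON-RPC payload is preceded by a `Content-Length` header and a blank line. The sketch below illustrates that framing only. `encodeMessage` and `decodeMessage` are hypothetical helper names, not part of OpenCode or `vscode-jsonrpc`, which performs this step internally.

```typescript
// Illustrative sketch of LSP base-protocol framing.
// Assumption: helper names are hypothetical; vscode-jsonrpc handles framing itself.

const encoder = new TextEncoder()
const decoder = new TextDecoder()

/** Prefix a JSON-RPC payload with a Content-Length header (byte length of the body). */
function encodeMessage(payload: object): Uint8Array {
  const body = encoder.encode(JSON.stringify(payload))
  const header = encoder.encode(`Content-Length: ${body.length}\r\n\r\n`)
  const framed = new Uint8Array(header.length + body.length)
  framed.set(header, 0)
  framed.set(body, header.length)
  return framed
}

/** Decode one framed message from the start of a buffer; undefined if it is incomplete. */
function decodeMessage(buffer: Uint8Array): { message: unknown; rest: Uint8Array } | undefined {
  // Locate the "\r\n\r\n" separator that ends the header section.
  let headerEnd = -1
  for (let i = 0; i + 3 < buffer.length; i++) {
    if (buffer[i] === 13 && buffer[i + 1] === 10 && buffer[i + 2] === 13 && buffer[i + 3] === 10) {
      headerEnd = i
      break
    }
  }
  if (headerEnd === -1) return undefined

  const header = decoder.decode(buffer.subarray(0, headerEnd))
  const match = /Content-Length: (\d+)/i.exec(header)
  if (!match) throw new Error("missing Content-Length header")

  const length = Number(match[1])
  const bodyStart = headerEnd + 4
  // Wait for more bytes if the body has not fully arrived yet.
  if (buffer.length < bodyStart + length) return undefined

  const body = decoder.decode(buffer.subarray(bodyStart, bodyStart + length))
  return { message: JSON.parse(body), rest: buffer.subarray(bodyStart + length) }
}
```

Because the header counts bytes rather than characters, the sketch measures the encoded body, which matters for non-ASCII source text. A reader built this way returns `undefined` until a whole message has arrived on stdout.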
+ +### LSP Client Initialization + +**Lines 76-116**: Sends LSP `initialize` request with: + +- Root URI and workspace folders +- Process ID +- Capabilities: + - Window: `workDoneProgress: true` + - Workspace: `configuration: true` + - TextDocument: `didOpen`, `didChange`, `publishDiagnostics` +- 5-second timeout (line 107) +- Followed by `initialized` notification (line 118) + +### Notification Handling + +**Diagnostics Publishing** (lines 47-56): +- Listens for `textDocument/publishDiagnostics` notifications +- Tracks diagnostics by file path +- Publishes `LSPClient.Event.Diagnostics` event + +**Window/Workspace Requests** (lines 57-72): +- `window/workDoneProgress/create` → returns null +- `workspace/configuration` → returns initialization options +- `client/registerCapability` and `unregisterCapability` → empty handlers +- `workspace/workspaceFolders` → returns workspace folder info + +### File Management + +**Lines 138-176**: + +```typescript +notify.open(file: string, text: string) +``` + +- Opens or updates files with language ID mapping via `LANGUAGE_EXTENSIONS` +- Tracks file versions to distinguish between `didOpen` and `didChange` +- Clears cached diagnostics on first open (line 165) + +### Diagnostics Waiting + +**Lines 181-201**: + +```typescript +waitForDiagnostics(file: string) +``` + +- 3-second timeout +- Subscribes to `LSPClient.Event.Diagnostics` bus events + +### Lifecycle + +**Lines 202-208**: + +```typescript +shutdown() +``` + +- Calls `connection.end()` +- Calls `connection.dispose()` +- Calls `process.kill()` + +--- + +## 4. 
Supported Language Servers + +**File**: `packages/opencode/src/lsp/server.ts` (lines 1-1168) + +OpenCode supports 19 built-in language servers: + +| Language Server | ID | Extensions | Root Finder | Auto-Install | +|---|---|---|---|---| +| **Deno** | `deno` | `.ts`, `.tsx`, `.js`, `.jsx`, `.mjs` | `deno.json`/`deno.jsonc` | No | +| **TypeScript** | `typescript` | `.ts`, `.tsx`, `.js`, `.jsx`, `.mjs`, `.cjs`, `.mts`, `.cts` | Lockfiles (excludes deno) | Yes (npm) | +| **Vue** | `vue` | `.vue` | Lockfiles | Yes (npm) | +| **ESLint** | `eslint` | `.ts`, `.tsx`, `.js`, `.jsx`, `.mts`, `.cts`, `.vue` | Lockfiles | Yes (VS Code server) | +| **Go (gopls)** | `gopls` | `.go` | `go.work`, `go.mod`/`go.sum` | Yes (`go install`) | +| **Ruby** | `ruby-lsp` | `.rb`, `.rake`, `.gemspec`, `.ru` | `Gemfile` | Yes (`gem install`) | +| **Python (Pyright)** | `pyright` | `.py`, `.pyi` | `pyproject.toml`, `requirements.txt`, etc. | Yes (npm) | +| **Elixir** | `elixir-ls` | `.ex`, `.exs` | `mix.exs`, `mix.lock` | Yes (GitHub) | +| **Zig** | `zls` | `.zig`, `.zon` | `build.zig` | Yes (GitHub) | +| **C#** | `csharp` | `.cs` | `.sln`, `.csproj`, `global.json` | Yes (`dotnet tool`) | +| **Swift** | `sourcekit-lsp` | `.swift`, `.objc`, `.objcpp` | `Package.swift`, xcodeproj | No | +| **Rust** | `rust` | `.rs` | `Cargo.toml`/`Cargo.lock` | No | +| **Clang (C++)** | `clangd` | `.c`, `.cpp`, `.cc`, `.h`, `.hpp` | `compile_commands.json`, `CMakeLists.txt` | Yes (GitHub) | +| **Svelte** | `svelte` | `.svelte` | Lockfiles | Yes (npm) | +| **Astro** | `astro` | `.astro` | Lockfiles | Yes (npm) | +| **Java (JDTLS)** | `jdtls` | `.java` | `pom.xml`, `build.gradle` | Yes (Eclipse) | +| **YAML** | `yaml-ls` | `.yaml`, `.yml` | Lockfiles | Yes (npm) | +| **Lua** | `lua-ls` | `.lua` | `.luarc.json`, `.luacheckrc` | Yes (GitHub) | +| **PHP** | `php intelephense` | `.php` | `composer.json` | Yes (npm) | + +### Root Finding Strategy + +**Lines 23-45**: `NearestRoot()` function: + +- Searches up directory 
tree for specific markers +- Supports `excludePatterns` to skip certain paths +- Falls back to instance directory if no markers found + +### Auto-Download Capability + +Respects `Flag.OPENCODE_DISABLE_LSP_DOWNLOAD` to disable automatic downloads. + +--- + +## 5. LSP Data Usage + +### In Edit Tool + +**File**: `packages/opencode/src/tool/edit.ts` (lines 139-150) + +After file edits, diagnostics are automatically fetched and displayed: + +```typescript +await LSP.touchFile(filePath, true) // Wait for diagnostics +const diagnostics = await LSP.diagnostics() +// Filter for errors (severity 1) and display to model +issues.filter((item) => item.severity === 1).map(LSP.Diagnostic.pretty) +``` + +This provides immediate feedback on syntax errors and type issues after edits. + +### In Prompt Generation + +**File**: `packages/opencode/src/session/prompt.ts` (lines 862-880) + +When file ranges from workspace symbol searches are incomplete: + +```typescript +const symbols = await LSP.documentSymbol(filePathURI) +// Matches symbol line numbers to refine start/end positions +// Uses range data to calculate file offset and limit for Read tool +``` + +### Symbol Source Tracking + +**File**: `packages/opencode/src/session/message-v2.ts` (lines 99-118) + +Symbols are tracked with source metadata: + +```typescript +SymbolSource = z.object({ + path: z.string(), // File path + range: LSP.Range, // Start/end line, character + name: z.string(), // Symbol name + kind: z.number(), // LSP symbol kind +}) +``` + +--- + +## 6. 
LSP Tools + +### Diagnostics Tool + +**File**: `packages/opencode/src/tool/lsp-diagnostics.ts` (lines 1-26) + +- **Tool ID**: `lsp_diagnostics` +- **Parameters**: `path` (string) +- **Execution**: Touches file, waits for diagnostics, returns formatted errors + +### Hover Tool + +**File**: `packages/opencode/src/tool/lsp-hover.ts` (lines 1-31) + +- **Tool ID**: `lsp_hover` +- **Parameters**: `file`, `line`, `character` (numbers) +- **Execution**: Touches file, sends hover request, returns JSON response + +**Note**: Both tools are marked "do not use" - not currently exposed to models directly. + +### Debug Commands + +**File**: `packages/opencode/src/cli/cmd/debug/lsp.ts` (lines 1-47) + +Available CLI commands for debugging: + +```bash +opencode debug lsp diagnostics +opencode debug lsp symbols +opencode debug lsp document-symbols +``` + +--- + +## 7. LSP Lifecycle Management + +### Initialization + +**File**: `packages/opencode/src/project/bootstrap.ts` (line 21) + +```typescript +await LSP.init() // Called during instance bootstrap +``` + +### Per-File Activation + +**Lines 156-240** in `lsp/index.ts`: + +The `getClients(file)` function: + +1. Determines which servers handle a file by extension +2. Spawns servers on-demand based on file extension match +3. Caches spawned clients to avoid duplication +4. Tracks "broken" servers to avoid repeated spawn attempts +5. Uses inflight promises to deduplicate simultaneous spawn requests + +### Configuration Loading + +LSP configuration is loaded from: + +- Config files (`opencode.jsonc`/`opencode.json`) +- Environment variable `OPENCODE_CONFIG` +- `Flag.OPENCODE_CONFIG_CONTENT` for inline config + +Configuration can disable LSP globally or per-server: + +```json +{ + "lsp": { + "typescript": { + "disabled": true, + "command": ["custom-ts-lsp"], + "env": { ... }, + "extensions": [".ts", ".tsx"], + "initialization": { ... 
} + } + } +} +``` + +### Shutdown + +**Lines 120-122** in `lsp/index.ts`: + +- Triggered during instance cleanup +- Calls `client.shutdown()` on all active clients +- Closes connections and kills processes + +--- + +## 8. Event System + +### LSP Events + +**File**: `packages/opencode/src/lsp/index.ts` (lines 14-16) + +```typescript +Event.Updated: Bus.event("lsp.updated", {}) +// Fired when new clients connect +``` + +### Client Events + +**File**: `packages/opencode/src/lsp/client.ts` (lines 27-35) + +```typescript +Event.Diagnostics: Bus.event("lsp.client.diagnostics", { + serverID: string, + path: string +}) +``` + +--- + +## 9. Dependencies + +**File**: `packages/opencode/package.json` + +```json +{ + "devDependencies": { + "vscode-languageserver-types": "3.17.5" // LSP type definitions + }, + "dependencies": { + "vscode-jsonrpc": "8.2.1" // JSON-RPC transport + } +} +``` + +--- + +## Language-to-Extension Mapping + +**File**: `packages/opencode/src/lsp/language.ts` (lines 1-106) + +Maps 100+ file extensions to LSP language IDs. 
Used by: + +- LSP client to determine language ID when opening files +- Session UI for syntax highlighting + +--- + +## Summary + +OpenCode's LSP integration at a glance: + +| Feature | Implementation | +|---------|----------------| +| **Transport** | Stdio-based communication | +| **Servers** | 19 built-in language servers, most with auto-download | +| **Spawning** | On-demand per file extension | +| **Diagnostics** | Automatically injected into edit tool context | +| **Symbols** | Hover and symbol information for code navigation | +| **Configuration** | Per server with custom commands and init options | +| **Error Handling** | Timeouts, broken server tracking | +| **Events** | Bus-based event system for diagnostics updates | From da9ea8170346a763159debe33c2cf32e1f84bf53 Mon Sep 17 00:00:00 2001 From: Claude Date: Mon, 24 Nov 2025 04:28:59 +0000 Subject: [PATCH 03/58] docs: add comprehensive system prompt construction analysis Document the complete system prompt sent to LLMs including: - All prompt template files (anthropic.txt, beast.txt, gemini.txt, etc.) - Step-by-step construction process in resolveSystemPrompt() - Full anthropic.txt content (106 lines) with all sections - Environment context template with variable substitution - Custom instructions loading from AGENTS.md/CLAUDE.md - Final 2-message structure for caching optimization - Complete example of final prompt for Claude on Go project - Model-specific variations (GPT, Gemini, etc.)
--- .../analysis/system-prompt-construction.md | 581 ++++++++++++++++++ 1 file changed, 581 insertions(+) create mode 100644 packages/opencode/docs/analysis/system-prompt-construction.md diff --git a/packages/opencode/docs/analysis/system-prompt-construction.md b/packages/opencode/docs/analysis/system-prompt-construction.md new file mode 100644 index 00000000000..d31e02c9849 --- /dev/null +++ b/packages/opencode/docs/analysis/system-prompt-construction.md @@ -0,0 +1,581 @@ +# System Prompt Construction Analysis + +This document provides a comprehensive analysis of the system prompt sent to LLMs in OpenCode, including the template processing and final prompt structure. + +## Table of Contents + +1. [Overview](#1-overview) +2. [Prompt Template Files](#2-prompt-template-files) +3. [Construction Process](#3-construction-process) +4. [Complete Anthropic Prompt](#4-complete-anthropic-prompt) +5. [Environment Context Template](#5-environment-context-template) +6. [Custom Instructions Loading](#6-custom-instructions-loading) +7. [Final Message Structure](#7-final-message-structure) +8. [Complete Example](#8-complete-example) +9. [Model-Specific Variations](#9-model-specific-variations) + +--- + +## 1. Overview + +The system prompt is constructed from multiple components assembled in a specific order: + +1. **Provider Header** (Anthropic only) +2. **Base Prompt** (model-specific) +3. **Environment Context** (dynamic) +4. **Custom Instructions** (user-defined) + +The final prompt is optimized into **2 system messages** for caching efficiency. + +--- + +## 2. 
Prompt Template Files + +**Location**: `packages/opencode/src/session/prompt/` + +### Main Prompts + +| File | Model | Lines | Purpose | +|------|-------|-------|---------| +| `anthropic.txt` | Claude models | 106 | Main coding assistant prompt | +| `beast.txt` | GPT-4/o1/o3 | lengthy | Autonomous problem-solving | +| `gemini.txt` | Gemini | 156 | Gemini-specific instructions | +| `qwen.txt` | Other models | minimal | Concise responses | +| `polaris.txt` | Polaris-alpha | - | Polaris-specific | +| `codex.txt` | GPT-5 | 319 | Detailed workflows | + +### Utility Prompts + +| File | Purpose | +|------|---------| +| `anthropic_spoof.txt` | Anthropic provider header | +| `summarize.txt` | Conversation summaries | +| `compaction.txt` | Context compression | +| `title.txt` | Thread title generation | +| `plan.txt` | Read-only phase constraint | +| `build-switch.txt` | Plan/build agent switching | + +--- + +## 3. Construction Process + +### Entry Point + +**File**: `packages/opencode/src/session/prompt.ts` (lines 621-641) + +```typescript +async function resolveSystemPrompt(input: { + system?: string + agent: Agent.Info + providerID: string + modelID: string +}) { + let system = SystemPrompt.header(input.providerID) // Step 1 + system.push( + ...(() => { + if (input.system) return [input.system] // Step 2a + if (input.agent.prompt) return [input.agent.prompt] // Step 2b + return SystemPrompt.provider(input.modelID) // Step 2c + })(), + ) + system.push(...(await SystemPrompt.environment())) // Step 3 + system.push(...(await SystemPrompt.custom())) // Step 4 + + // Optimization: Combine into 2 messages for caching + const [first, ...rest] = system + system = [first, rest.join("\n")] + return system +} +``` + +### Step-by-Step Assembly + +**Step 1: Provider Header** + +**File**: `packages/opencode/src/session/system.ts` (lines 22-25) + +```typescript +export function header(providerID: string) { + if (providerID.includes("anthropic")) return [PROMPT_ANTHROPIC_SPOOF.trim()] + 
return [] +} +``` + +Only Anthropic provider gets: `"You are Claude Code, Anthropic's official CLI for Claude."` + +**Step 2: Base Prompt Selection** + +**File**: `packages/opencode/src/session/system.ts` (lines 27-34) + +```typescript +export function provider(modelID: string) { + if (modelID.includes("gpt-5")) return [PROMPT_CODEX] + if (modelID.includes("gpt-") || modelID.includes("o1") || modelID.includes("o3")) + return [PROMPT_BEAST] + if (modelID.includes("gemini-")) return [PROMPT_GEMINI] + if (modelID.includes("claude")) return [PROMPT_ANTHROPIC] + if (modelID.includes("polaris-alpha")) return [PROMPT_POLARIS] + return [PROMPT_ANTHROPIC_WITHOUT_TODO] // Default (qwen.txt) +} +``` + +Priority order: +1. Custom system override (`input.system`) +2. Agent-specific prompt (`input.agent.prompt`) +3. Model-specific default + +**Step 3: Environment Context** (see [Section 5](#5-environment-context-template)) + +**Step 4: Custom Instructions** (see [Section 6](#6-custom-instructions-loading)) + +--- + +## 4. Complete Anthropic Prompt + +**File**: `packages/opencode/src/session/prompt/anthropic.txt` + +This is the main prompt for Claude models (106 lines): + +``` +You are OpenCode, the best coding agent on the planet. + +You are an interactive CLI tool that helps users with software engineering tasks. Use the instructions below and the tools available to you to assist the user. + +IMPORTANT: You must NEVER generate or guess URLs for the user unless you are confident that the URLs are for helping the user with programming. You may use URLs provided by the user in their messages or local files. + +If the user asks for help or wants to give feedback inform them of the following: +- ctrl+p to list available actions +- To give feedback, users should report the issue at + https://github.com/sst/opencode + +When the user directly asks about OpenCode (eg. "can OpenCode do...", "does OpenCode have..."), or asks in second person (eg. 
"are you able...", "can you do..."), or asks how to use a specific OpenCode feature (eg. implement a hook, write a slash command, or install an MCP server), use the WebFetch tool to gather information to answer the question from OpenCode docs. The list of available docs is available at https://opencode.ai/docs + +# Tone and style +- Only use emojis if the user explicitly requests it. Avoid using emojis in all communication unless asked. +- Your output will be displayed on a command line interface. Your responses should be short and concise. You can use Github-flavored markdown for formatting, and will be rendered in a monospace font using the CommonMark specification. +- Output text to communicate with the user; all text you output outside of tool use is displayed to the user. Only use tools to complete tasks. Never use tools like Bash or code comments as means to communicate with the user during the session. +- NEVER create files unless they're absolutely necessary for achieving your goal. ALWAYS prefer editing an existing file to creating a new one. This includes markdown files. + +# Professional objectivity +Prioritize technical accuracy and truthfulness over validating the user's beliefs. Focus on facts and problem-solving, providing direct, objective technical info without any unnecessary superlatives, praise, or emotional validation. It is best for the user if OpenCode honestly applies the same rigorous standards to all ideas and disagrees when necessary, even if it may not be what the user wants to hear. Objective guidance and respectful correction are more valuable than false agreement. Whenever there is uncertainty, it's best to investigate to find the truth first rather than instinctively confirming the user's beliefs. + +# Task Management +You have access to the TodoWrite tools to help you manage and plan tasks. Use these tools VERY frequently to ensure that you are tracking your tasks and giving the user visibility into your progress. 
+These tools are also EXTREMELY helpful for planning tasks, and for breaking down larger complex tasks into smaller steps. If you do not use this tool when planning, you may forget to do important tasks - and that is unacceptable. + +It is critical that you mark todos as completed as soon as you are done with a task. Do not batch up multiple tasks before marking them as completed. + +Examples: + + +user: Run the build and fix any type errors +assistant: I'm going to use the TodoWrite tool to write the following items to the todo list: +- Run the build +- Fix any type errors + +I'm now going to run the build using Bash. + +Looks like I found 10 type errors. I'm going to use the TodoWrite tool to write 10 items to the todo list. + +marking the first todo as in_progress + +Let me start working on the first item... + +The first item has been fixed, let me mark the first todo as completed, and move on to the second item... +.. +.. + +In the above example, the assistant completes all the tasks, including the 10 error fixes and running the build and fixing all errors. + + +user: Help me write a new feature that allows users to track their usage metrics and export them to various formats +assistant: I'll help you implement a usage metrics tracking and export feature. Let me first use the TodoWrite tool to plan this task. +Adding the following todos to the todo list: +1. Research existing metrics tracking in the codebase +2. Design the metrics collection system +3. Implement core metrics tracking functionality +4. Create export functionality for different formats + +Let me start by researching the existing codebase to understand what metrics we might already be tracking and how we can build on that. + +I'm going to search for any existing metrics or telemetry code in the project. + +I've found some existing telemetry code. Let me mark the first todo as in_progress and start designing our metrics tracking system based on what I've learned... 
+ +[Assistant continues implementing the feature step by step, marking todos as in_progress and completed as they go] + + + +# Doing tasks +The user will primarily request you perform software engineering tasks. This includes solving bugs, adding new functionality, refactoring code, explaining code, and more. For these tasks the following steps are recommended: +- +- Use the TodoWrite tool to plan the task if required + +- Tool results and user messages may include tags. tags contain useful information and reminders. They are automatically added by the system, and bear no direct relation to the specific tool results or user messages in which they appear. + + +# Tool usage policy +- When doing file search, prefer to use the Task tool in order to reduce context usage. +- You should proactively use the Task tool with specialized agents when the task at hand matches the agent's description. + +- When WebFetch returns a message about a redirect to a different host, you should immediately make a new WebFetch request with the redirect URL provided in the response. +- You can call multiple tools in a single response. If you intend to call multiple tools and there are no dependencies between them, make all independent tool calls in parallel. Maximize use of parallel tool calls where possible to increase efficiency. However, if some tool calls depend on previous calls to inform dependent values, do NOT call these tools in parallel and instead call them sequentially. For instance, if one operation must complete before another starts, run these operations sequentially instead. Never use placeholders or guess missing parameters in tool calls. +- If the user specifies that they want you to run tools "in parallel", you MUST send a single message with multiple tool use content blocks. For example, if you need to launch multiple agents in parallel, send a single message with multiple Task tool calls. 
+- Use specialized tools instead of bash commands when possible, as this provides a better user experience. For file operations, use dedicated tools: Read for reading files instead of cat/head/tail, Edit for editing instead of sed/awk, and Write for creating files instead of cat with heredoc or echo redirection. Reserve bash tools exclusively for actual system commands and terminal operations that require shell execution. NEVER use bash echo or other command-line tools to communicate thoughts, explanations, or instructions to the user. Output all communication directly in your response text instead. +- VERY IMPORTANT: When exploring the codebase to gather context or to answer a question that is not a needle query for a specific file/class/function, it is CRITICAL that you use the Task tool instead of running search commands directly. + +user: Where are errors from the client handled? +assistant: [Uses the Task tool to find the files that handle client errors instead of using Glob or Grep directly] + + +user: What is the codebase structure? +assistant: [Uses the Task tool] + + +IMPORTANT: Always use the TodoWrite tool to plan and track tasks throughout the conversation. + +# Code References + +When referencing specific functions or pieces of code include the pattern `file_path:line_number` to allow the user to easily navigate to the source code location. + + +user: Where are errors from the client handled? +assistant: Clients are marked as failed in the `connectToServer` function in src/services/process.ts:712. + +``` + +--- + +## 5. Environment Context Template + +**File**: `packages/opencode/src/session/system.ts` (lines 36-59) + +```typescript +export async function environment() { + const project = Instance.project + return [ + [ + `Here is some useful information about the environment you are running in:`, + ``, + ` Working directory: ${Instance.directory}`, + ` Is directory a git repo: ${project.vcs === "git" ? 
"yes" : "no"}`, + ` Platform: ${process.platform}`, + ` Today's date: ${new Date().toDateString()}`, + ``, + ``, + ` ${ + project.vcs === "git" + ? await Ripgrep.tree({ + cwd: Instance.directory, + limit: 200, + }) + : "" + }`, + ``, + ].join("\n"), + ] +} +``` + +### Variables Substituted + +| Variable | Source | Example | +|----------|--------|---------| +| `${Instance.directory}` | Current working directory | `/home/user/myproject` | +| `${project.vcs === "git" ? "yes" : "no"}` | Git status | `yes` | +| `${process.platform}` | OS platform | `linux`, `darwin`, `win32` | +| `${new Date().toDateString()}` | Current date | `Sun Nov 24 2024` | +| File tree | Ripgrep.tree (limit 200) | Indented file listing | + +### Example Output + +``` +Here is some useful information about the environment you are running in: + + Working directory: /home/user/myproject + Is directory a git repo: yes + Platform: linux + Today's date: Sun Nov 24 2024 + + + myproject/ + .git/ + src/ + main.go + handlers/ + api.go + models/ + user.go + go.mod + go.sum + README.md + +``` + +--- + +## 6. Custom Instructions Loading + +**File**: `packages/opencode/src/session/system.ts` (lines 61-118) + +### Search Paths + +**Local files** (project-specific, searched in order): +1. `AGENTS.md` +2. `CLAUDE.md` +3. `CONTEXT.md` (deprecated) + +**Global files** (user-level, searched in order): +1. `~/.opencode/AGENTS.md` (Global.Path.config) +2. 
`~/.claude/CLAUDE.md` + +### Loading Logic + +```typescript +export async function custom() { + const config = await Config.get() + const paths = new Set() + + // Search for local rule files (first match wins per category) + for (const localRuleFile of LOCAL_RULE_FILES) { + const matches = await Filesystem.findUp(localRuleFile, Instance.directory, Instance.worktree) + if (matches.length > 0) { + matches.forEach((path) => paths.add(path)) + break + } + } + + // Search for global rule files + for (const globalRuleFile of GLOBAL_RULE_FILES) { + if (await Bun.file(globalRuleFile).exists()) { + paths.add(globalRuleFile) + break + } + } + + // Config-based instructions + if (config.instructions) { + for (let instruction of config.instructions) { + if (instruction.startsWith("~/")) { + instruction = path.join(os.homedir(), instruction.slice(2)) + } + // ... glob pattern resolution + } + } + + // Format each instruction + const found = Array.from(paths).map((p) => + Bun.file(p) + .text() + .then((x) => "Instructions from: " + p + "\n" + x), + ) + return Promise.all(found) +} +``` + +### Output Format + +Each instruction file is prefixed with its source: + +``` +Instructions from: /home/user/myproject/AGENTS.md +## Project Guidelines + +- Use Go idioms and error handling patterns +- Write table-driven tests +- Run `go fmt` before committing + +Instructions from: /home/user/.claude/CLAUDE.md +## Personal Preferences + +- Always explain your reasoning +- Prefer simple solutions +``` + +--- + +## 7. Final Message Structure + +### Message Assembly + +**File**: `packages/opencode/src/session/prompt.ts` (lines 559-581) + +```typescript +messages: [ + ...system.map( + (x): ModelMessage => ({ + role: "system", + content: x, + }), + ), + ...MessageV2.toModelMessage( + msgs.filter(...) 
// Conversation history + ), +] +``` + +### Structure + +The final system prompt is **2 messages** (for caching optimization): + +**Message 1 (System)**: +- Provider header (Anthropic only) +- Base prompt (anthropic.txt, beast.txt, etc.) + +**Message 2 (System)**: +- Environment context +- Custom instructions (joined with `\n`) + +**Messages 3+ (User/Assistant)**: +- Converted conversation history + +--- + +## 8. Complete Example + +For a Claude model working on a Go project: + +### System Message 1 + +``` +You are Claude Code, Anthropic's official CLI for Claude. +You are OpenCode, the best coding agent on the planet. + +You are an interactive CLI tool that helps users with software engineering tasks. Use the instructions below and the tools available to you to assist the user. + +IMPORTANT: You must NEVER generate or guess URLs for the user unless you are confident that the URLs are for helping the user with programming. You may use URLs provided by the user in their messages or local files. + +If the user asks for help or wants to give feedback inform them of the following: +- ctrl+p to list available actions +- To give feedback, users should report the issue at + https://github.com/sst/opencode + +[... rest of anthropic.txt - 106 lines total ...] + +IMPORTANT: Always use the TodoWrite tool to plan and track tasks throughout the conversation. + +# Code References + +When referencing specific functions or pieces of code include the pattern `file_path:line_number` to allow the user to easily navigate to the source code location. + + +user: Where are errors from the client handled? +assistant: Clients are marked as failed in the `connectToServer` function in src/services/process.ts:712. 
+ +``` + +### System Message 2 + +``` +Here is some useful information about the environment you are running in: + + Working directory: /home/user/mygoproject + Is directory a git repo: yes + Platform: linux + Today's date: Sun Nov 24 2024 + + + mygoproject/ + .git/ + cmd/ + server/ + main.go + internal/ + handler/ + handler.go + service/ + service.go + pkg/ + utils/ + helpers.go + go.mod + go.sum + main.go + README.md + Dockerfile + .gitignore + + +Instructions from: /home/user/mygoproject/AGENTS.md +## Go Development Guidelines + +- Follow standard Go project layout (cmd/, internal/, pkg/) +- Use `go fmt` for formatting +- Handle errors explicitly, don't ignore them +- Use table-driven tests +- Run `go vet` before committing + +Instructions from: /home/user/.claude/CLAUDE.md +## Personal Preferences + +- Always explain your reasoning +- Show file paths with line numbers +``` + +--- + +## 9. Model-Specific Variations + +### Anthropic (Claude) + +- **Header**: "You are Claude Code, Anthropic's official CLI for Claude." +- **Base**: anthropic.txt (full TodoWrite instructions) +- **Focus**: Task management, tool parallelism, code references + +### OpenAI (GPT-4, o1, o3) + +- **Header**: None +- **Base**: beast.txt +- **Focus**: Autonomous problem-solving, extensive research, rigorous testing + +### OpenAI (GPT-5) + +- **Header**: None +- **Base**: codex.txt +- **Focus**: Detailed workflows, sandbox/approvals, AGENTS.md spec + +### Google (Gemini) + +- **Header**: None +- **Base**: gemini.txt +- **Focus**: Gemini-specific capabilities + +### Other Models + +- **Header**: None +- **Base**: qwen.txt (minimal) +- **Focus**: Concise responses (1-3 sentences), safety warnings + +--- + +## Key Implementation Details + +### Caching Optimization + +The system prompt is limited to 2 messages: +- First message: Raw header + base prompt +- Second message: Combined environment + custom instructions + +This enables prompt caching at LLM provider level. 
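The two-message split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the OpenCode implementation: `PromptParts` and `assembleSystem` are hypothetical names, and the only behavior taken from this document is that the header and base prompt form the first message while environment context and custom instructions are joined with `\n` into the second.

```typescript
// Hypothetical sketch; these names are illustrative and not taken from the OpenCode source.
interface PromptParts {
  header: string[] // provider header (per this document, Anthropic only)
  base: string[] // model-specific base prompt
  environment: string[] // environment context block
  custom: string[] // contents of custom instruction files
}

// Collapse all parts into at most two system strings:
// [0] stable prefix (header + base), [1] per-project context (environment + custom).
function assembleSystem(parts: PromptParts): string[] {
  const stable = [...parts.header, ...parts.base].join("\n")
  const dynamic = [...parts.environment, ...parts.custom].join("\n")
  // Drop empty strings so a missing part never yields an empty system message.
  return [stable, dynamic].filter((s) => s.length > 0)
}

const system = assembleSystem({
  header: ["You are Claude Code, Anthropic's official CLI for Claude."],
  base: ["You are OpenCode, the best coding agent on the planet."],
  environment: ["Working directory: /home/user/myproject"],
  custom: ["Instructions from: /home/user/myproject/AGENTS.md"],
})
console.log(system.length)
```

Keeping the first string free of per-project data means it stays byte-identical across sessions for a given model, which is presumably what makes it a usable cache prefix.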
+ +### Dynamic File References + +**File**: `packages/opencode/src/session/prompt.ts` (lines 145-191) + +Prompts support `[[file:path]]` syntax for dynamic file inclusion. + +### Agent Overrides + +Agents can completely replace the base prompt with their own via the `agent.prompt` field. + +--- + +## Files Summary + +| File | Lines | Purpose | +|------|-------|---------| +| `src/session/system.ts` | 1-146 | Core system prompt assembly | +| `src/session/prompt.ts` | 621-641 | resolveSystemPrompt function | +| `src/session/prompt/anthropic.txt` | 106 | Claude base prompt | +| `src/session/prompt/beast.txt` | - | GPT base prompt | +| `src/session/prompt/anthropic_spoof.txt` | 1 | Anthropic header | From e2ad354d391b51127b053d6a026629d59dc0cf3f Mon Sep 17 00:00:00 2001 From: Claude Date: Mon, 24 Nov 2025 05:37:20 +0000 Subject: [PATCH 04/58] docs: add comprehensive design for client-side tools Add design document for enabling clients to register tool definitions with the server and have the server delegate execution back to the client. 
Key aspects covered: - Protocol design with new message types for tool requests/responses - Client tool registry for server-side management - SDK client tools manager for registering and handling tools - Both SSE and WebSocket communication options - Security considerations (auth, sandboxing, rate limiting) - Error handling and timeout management - Usage examples for common scenarios - Phased implementation plan --- docs/design/client-side-tools.md | 1257 ++++++++++++++++++++++++++++++ 1 file changed, 1257 insertions(+) create mode 100644 docs/design/client-side-tools.md diff --git a/docs/design/client-side-tools.md b/docs/design/client-side-tools.md new file mode 100644 index 00000000000..c39fe58c7ab --- /dev/null +++ b/docs/design/client-side-tools.md @@ -0,0 +1,1257 @@ +# Client-Side Tools Design Document + +## Overview + +This document describes the design for client-side tools in OpenCode, where clients can register tool definitions with the server, and the server delegates tool execution back to the client. + +### Goals + +1. **Client Tool Registration**: Allow SDK clients to define and register tools with the server +2. **Server Delegation**: Enable the server to delegate tool execution to the originating client +3. **Bidirectional Communication**: Support real-time communication for tool execution requests/responses +4. **Seamless Integration**: Integrate with existing tool infrastructure (permissions, hooks, streaming) +5. 
**Multi-Client Support**: Handle multiple clients with different tool sets + +### Non-Goals + +- Replacing existing server-side tools +- Cross-client tool sharing (tools are scoped to their registering client) +- Persistent tool registration (tools exist only for session lifetime) + +--- + +## Architecture Overview + +``` +┌─────────────────┐ ┌─────────────────┐ +│ SDK Client │ │ OpenCode │ +│ │ │ Server │ +│ ┌─────────────┐ │ Register Tools │ │ +│ │ Tool Defs │─┼───────────────────►│ ┌─────────────┐ │ +│ └─────────────┘ │ │ │Client Tool │ │ +│ │ │ │Registry │ │ +│ ┌─────────────┐ │ Execute Request │ └─────────────┘ │ +│ │ Tool │◄├────────────────────┤ │ +│ │ Handlers │ │ │ ┌─────────────┐ │ +│ └──────┬──────┘ │ Execute Result │ │Session │ │ +│ │ ├───────────────────►│ │Processor │ │ +│ ▼ │ │ └─────────────┘ │ +│ ┌─────────────┐ │ │ │ +│ │ Local │ │ │ ┌─────────────┐ │ +│ │ Execution │ │ Stream │ │AI Model │ │ +│ └─────────────┘ │◄───────────────────┤ └─────────────┘ │ +└─────────────────┘ └─────────────────┘ +``` + +--- + +## Protocol Design + +### New Message Types + +Add to `/packages/opencode/src/session/message-v2.ts`: + +```typescript +// Client tool definition sent during registration +export type ClientToolDefinition = { + id: string + description: string + parameters: JsonSchema7 // JSON Schema for tool parameters +} + +// Request sent from server to client for tool execution +export type ClientToolExecutionRequest = { + type: "client-tool-request" + requestID: string + sessionID: string + messageID: string + callID: string + tool: string + input: Record +} + +// Response sent from client to server after execution +export type ClientToolExecutionResponse = { + type: "client-tool-response" + requestID: string + result: ClientToolResult | ClientToolError +} + +export type ClientToolResult = { + status: "success" + title: string + output: string + metadata?: Record + attachments?: FilePart[] +} + +export type ClientToolError = { + status: "error" + error: 
string +} +``` + +### New API Endpoints + +Add to server API (in `/packages/opencode/src/server/`): + +```typescript +// POST /client-tools/register +// Register client tools for a session +interface RegisterClientToolsRequest { + sessionID: string + clientID: string + tools: ClientToolDefinition[] +} + +interface RegisterClientToolsResponse { + registered: string[] // Tool IDs that were registered +} + +// POST /client-tools/result +// Submit tool execution result +interface SubmitToolResultRequest { + requestID: string + result: ClientToolResult | ClientToolError +} + +// GET /client-tools/pending/:clientID (SSE endpoint) +// Stream pending tool execution requests to client +// Returns: Server-Sent Events stream of ClientToolExecutionRequest + +// DELETE /client-tools/unregister +// Unregister client tools +interface UnregisterClientToolsRequest { + sessionID: string + clientID: string + toolIDs?: string[] // If omitted, unregister all +} +``` + +### WebSocket Alternative + +For lower latency, support WebSocket connections: + +```typescript +// WS /client-tools/ws/:clientID +// Bidirectional WebSocket for tool requests/responses + +// Client -> Server messages: +type WSClientMessage = + | { type: "register"; tools: ClientToolDefinition[] } + | { type: "result"; requestID: string; result: ClientToolResult | ClientToolError } + | { type: "unregister"; toolIDs?: string[] } + +// Server -> Client messages: +type WSServerMessage = + | { type: "registered"; toolIDs: string[] } + | { type: "request"; request: ClientToolExecutionRequest } + | { type: "error"; error: string } +``` + +--- + +## Server-Side Implementation + +### 1. 
Client Tool Registry + +Create `/packages/opencode/src/tool/client-registry.ts`: + +```typescript +import { z } from "zod" +import { Bus } from "../bus" +import { Tool } from "./tool" +import type { + ClientToolDefinition, + ClientToolError, + ClientToolExecutionRequest, + ClientToolResult, +} from "../session/message-v2" + +export namespace ClientToolRegistry { + // Store client tools by clientID -> toolID -> definition + const registry = new Map<string, Map<string, ClientToolDefinition>>() + + // Pending execution requests by requestID + const pendingRequests = new Map<string, { + request: ClientToolExecutionRequest + resolve: (result: ClientToolResult) => void + reject: (error: Error) => void + timeout: Timer + }>() + + // Event emitter for tool execution requests + export const Event = { + ToolRequest: Bus.event( + "client-tool.request", + z.object({ + clientID: z.string(), + request: z.custom<ClientToolExecutionRequest>(), + }) + ), + } + + /** + * Register tools for a client + */ + export function register( + clientID: string, + tools: ClientToolDefinition[] + ): string[] { + if (!registry.has(clientID)) { + registry.set(clientID, new Map()) + } + + const clientTools = registry.get(clientID)! 
 + const registered: string[] = [] + + for (const tool of tools) { + // Prefix with client ID to avoid collisions + const toolID = `client_${clientID}_${tool.id}` + clientTools.set(toolID, { + ...tool, + id: toolID, + }) + registered.push(toolID) + } + + return registered + } + + /** + * Unregister tools for a client + */ + export function unregister(clientID: string, toolIDs?: string[]): void { + const clientTools = registry.get(clientID) + if (!clientTools) return + + if (toolIDs) { + for (const id of toolIDs) { + // Accept either the prefixed ID returned by register() or the caller's original ID + clientTools.delete(id) + clientTools.delete(`client_${clientID}_${id}`) + } + } else { + registry.delete(clientID) + } + } + + /** + * Get all tools for a client + */ + export function getTools(clientID: string): ClientToolDefinition[] { + const clientTools = registry.get(clientID) + if (!clientTools) return [] + return Array.from(clientTools.values()) + } + + /** + * Get all client tools across all clients + */ + export function getAllTools(): Map<string, ClientToolDefinition> { + const all = new Map<string, ClientToolDefinition>() + for (const [_, clientTools] of registry) { + for (const [toolID, tool] of clientTools) { + all.set(toolID, tool) + } + } + return all + } + + /** + * Find which client owns a tool + */ + export function findClientForTool(toolID: string): string | undefined { + for (const [clientID, clientTools] of registry) { + if (clientTools.has(toolID)) { + return clientID + } + } + return undefined + } + + /** + * Execute a client tool + * Sends request to client and waits for response + */ + export async function execute( + clientID: string, + request: Omit<ClientToolExecutionRequest, "type">, + timeoutMs: number = 30000 + ): Promise<ClientToolResult> { + const fullRequest: ClientToolExecutionRequest = { + type: "client-tool-request", + ...request, + } + + return new Promise<ClientToolResult>((resolve, reject) => { + const timeout = setTimeout(() => { + pendingRequests.delete(request.requestID) + reject(new Error(`Client tool execution timed out after ${timeoutMs}ms`)) + }, timeoutMs) + + pendingRequests.set(request.requestID, { + request: fullRequest, + resolve, + reject, + timeout, + }) + + // Emit event for 
client to receive + Event.ToolRequest.publish({ + clientID, + request: fullRequest, + }) + }) + } + + /** + * Submit result from client + */ + export function submitResult( + requestID: string, + result: ClientToolResult | ClientToolError + ): boolean { + const pending = pendingRequests.get(requestID) + if (!pending) return false + + clearTimeout(pending.timeout) + pendingRequests.delete(requestID) + + if (result.status === "error") { + pending.reject(new Error(result.error)) + } else { + pending.resolve(result) + } + + return true + } + + /** + * Clean up all tools for a client (on disconnect) + */ + export function cleanup(clientID: string): void { + // Cancel all pending requests for this client + for (const [requestID, pending] of pendingRequests) { + if (pending.request.requestID.startsWith(clientID)) { + clearTimeout(pending.timeout) + pending.reject(new Error("Client disconnected")) + pendingRequests.delete(requestID) + } + } + + // Remove all tools + registry.delete(clientID) + } +} +``` + +### 2. Integration with Tool Registry + +Modify `/packages/opencode/src/tool/registry.ts`: + +```typescript +import { ClientToolRegistry } from "./client-registry" + +export namespace ToolRegistry { + // ... existing code ... 
+ + /** + * Get all tools including client tools + */ + export async function tools( + providerID: string, + modelID: string, + clientID?: string + ) { + const serverTools = await all() + const result = await Promise.all( + serverTools.map(async (t) => ({ + id: t.id, + ...(await t.init()), + })), + ) + + // Add client tools if clientID provided + if (clientID) { + const clientTools = ClientToolRegistry.getTools(clientID) + for (const tool of clientTools) { + result.push({ + id: tool.id, + description: tool.description, + parameters: tool.parameters as any, + execute: createClientToolExecutor(clientID, tool.id), + }) + } + } + + return result + } + + /** + * Create executor function for client tool + */ + function createClientToolExecutor(clientID: string, toolID: string) { + return async ( + args: Record, + ctx: Tool.Context + ): Promise => { + const requestID = `${clientID}_${ctx.callID}_${Date.now()}` + + const result = await ClientToolRegistry.execute(clientID, { + requestID, + sessionID: ctx.sessionID, + messageID: ctx.messageID, + callID: ctx.callID!, + tool: toolID, + input: args, + }) + + return { + title: result.title, + metadata: result.metadata ?? {}, + output: result.output, + attachments: result.attachments, + } + } + } +} +``` + +### 3. 
API Routes + +Create `/packages/opencode/src/server/routes/client-tools.ts`: + +```typescript +import { Hono } from "hono" +import { streamSSE } from "hono/streaming" +import { ClientToolRegistry } from "../../tool/client-registry" +import { Identifier } from "../../util/identifier" + +export const clientToolsRouter = new Hono() + +// Register client tools +clientToolsRouter.post("/register", async (c) => { + const body = await c.req.json() + const { sessionID, clientID, tools } = body + + const registered = ClientToolRegistry.register(clientID, tools) + + return c.json({ registered }) +}) + +// Unregister client tools +clientToolsRouter.delete("/unregister", async (c) => { + const body = await c.req.json() + const { sessionID, clientID, toolIDs } = body + + ClientToolRegistry.unregister(clientID, toolIDs) + + return c.json({ success: true }) +}) + +// Submit tool execution result +clientToolsRouter.post("/result", async (c) => { + const body = await c.req.json() + const { requestID, result } = body + + const success = ClientToolRegistry.submitResult(requestID, result) + + if (!success) { + return c.json({ error: "Unknown request ID" }, 404) + } + + return c.json({ success: true }) +}) + +// SSE endpoint for tool execution requests +clientToolsRouter.get("/pending/:clientID", async (c) => { + const clientID = c.req.param("clientID") + + return streamSSE(c, async (stream) => { + // Subscribe to tool request events + const unsubscribe = ClientToolRegistry.Event.ToolRequest.subscribe( + async (event) => { + if (event.clientID === clientID) { + await stream.writeSSE({ + event: "tool-request", + data: JSON.stringify(event.request), + }) + } + } + ) + + // Keep connection alive + const keepAlive = setInterval(async () => { + await stream.writeSSE({ + event: "ping", + data: "", + }) + }, 30000) + + // Cleanup on disconnect + c.req.raw.signal.addEventListener("abort", () => { + unsubscribe() + clearInterval(keepAlive) + ClientToolRegistry.cleanup(clientID) + }) + + // 
Block until client disconnects + await new Promise(() => {}) + }) +}) +``` + +### 4. WebSocket Handler + +Create `/packages/opencode/src/server/routes/client-tools-ws.ts`: + +```typescript +import { Hono } from "hono" +import { createBunWebSocket } from "hono/bun" +import { ClientToolRegistry } from "../../tool/client-registry" + +// Assumes the server runs on Bun: use Hono's Bun adapter rather than the +// Cloudflare Workers one. The exported `websocket` handler must also be +// passed to Bun.serve({ fetch, websocket }). +export const { upgradeWebSocket, websocket } = createBunWebSocket() + +export const clientToolsWSRouter = new Hono() + +clientToolsWSRouter.get( + "/ws/:clientID", + upgradeWebSocket((c) => { + const clientID = c.req.param("clientID") + let unsubscribe: (() => void) | undefined + + return { + onOpen(event, ws) { + // Subscribe to tool requests for this client + unsubscribe = ClientToolRegistry.Event.ToolRequest.subscribe( + (evt) => { + if (evt.clientID === clientID) { + ws.send(JSON.stringify({ + type: "request", + request: evt.request, + })) + } + } + ) + }, + + onMessage(event, ws) { + try { + const message = JSON.parse(event.data as string) + + switch (message.type) { + case "register": { + const registered = ClientToolRegistry.register( + clientID, + message.tools + ) + ws.send(JSON.stringify({ + type: "registered", + toolIDs: registered, + })) + break + } + + case "result": { + ClientToolRegistry.submitResult( + message.requestID, + message.result + ) + break + } + + case "unregister": { + ClientToolRegistry.unregister(clientID, message.toolIDs) + break + } + } + } catch (error) { + ws.send(JSON.stringify({ + type: "error", + error: String(error), + })) + } + }, + + onClose() { + unsubscribe?.() + ClientToolRegistry.cleanup(clientID) + }, + + onError(event) { + unsubscribe?.() + ClientToolRegistry.cleanup(clientID) + }, + } + }) +) +``` + +--- + +## Client SDK Implementation + +### 1. 
Types + +Add to `/packages/sdk/js/src/types.ts`: + +```typescript +export interface ClientToolDefinition { + id: string + description: string + parameters: Record<string, unknown> // JSON Schema +} + +export interface ClientToolHandler { + (input: Record<string, unknown>, context: ClientToolContext): Promise<ClientToolResult> +} + +export interface ClientToolContext { + sessionID: string + messageID: string + callID: string + signal: AbortSignal +} + +export interface ClientToolResult { + title: string + output: string + metadata?: Record<string, unknown> +} + +export interface ClientTool { + definition: ClientToolDefinition + handler: ClientToolHandler +} + +export interface ClientToolsConfig { + /** Timeout for tool execution in ms (default: 30000) */ + timeout?: number + /** Use WebSocket instead of SSE (default: false) */ + useWebSocket?: boolean +} +``` + +### 2. Client Tools Manager + +Create `/packages/sdk/js/src/client-tools.ts`: + +```typescript +import type { + ClientTool, + ClientToolDefinition, + ClientToolHandler, + ClientToolResult, + ClientToolsConfig, +} from "./types" + +export class ClientToolsManager { + private clientID: string + private baseUrl: string + private tools = new Map<string, ClientTool>() + private eventSource?: EventSource + private ws?: WebSocket + private config: Required<ClientToolsConfig> + private abortController = new AbortController() + + constructor( + clientID: string, + baseUrl: string, + config?: ClientToolsConfig + ) { + this.clientID = clientID + this.baseUrl = baseUrl + this.config = { + timeout: config?.timeout ?? 30000, + useWebSocket: config?.useWebSocket ?? 
false, + } + } + + /** + * Register a tool with the server + */ + async register( + id: string, + definition: Omit, + handler: ClientToolHandler + ): Promise { + const tool: ClientTool = { + definition: { id, ...definition }, + handler, + } + this.tools.set(id, tool) + + // If already connected, register immediately + if (this.eventSource || this.ws) { + await this.syncTools() + } + } + + /** + * Unregister a tool + */ + async unregister(id: string): Promise { + this.tools.delete(id) + + if (this.eventSource || this.ws) { + await fetch(`${this.baseUrl}/client-tools/unregister`, { + method: "DELETE", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify({ + clientID: this.clientID, + toolIDs: [id], + }), + }) + } + } + + /** + * Start listening for tool execution requests + */ + async connect(sessionID: string): Promise { + // Register all tools first + await this.syncTools() + + if (this.config.useWebSocket) { + await this.connectWebSocket() + } else { + await this.connectSSE() + } + } + + /** + * Stop listening and cleanup + */ + disconnect(): void { + this.abortController.abort() + this.eventSource?.close() + this.ws?.close() + } + + private async syncTools(): Promise { + const definitions = Array.from(this.tools.values()).map(t => t.definition) + + await fetch(`${this.baseUrl}/client-tools/register`, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify({ + clientID: this.clientID, + tools: definitions, + }), + }) + } + + private async connectSSE(): Promise { + this.eventSource = new EventSource( + `${this.baseUrl}/client-tools/pending/${this.clientID}` + ) + + this.eventSource.addEventListener("tool-request", async (event) => { + const request = JSON.parse(event.data) + await this.handleToolRequest(request) + }) + + this.eventSource.onerror = (error) => { + console.error("Client tools SSE error:", error) + } + } + + private async connectWebSocket(): Promise { + const wsUrl = 
this.baseUrl.replace(/^http/, "ws") + this.ws = new WebSocket(`${wsUrl}/client-tools/ws/${this.clientID}`) + + this.ws.onopen = async () => { + // Register tools via WebSocket + const definitions = Array.from(this.tools.values()).map(t => t.definition) + this.ws!.send(JSON.stringify({ + type: "register", + tools: definitions, + })) + } + + this.ws.onmessage = async (event) => { + const message = JSON.parse(event.data) + + if (message.type === "request") { + await this.handleToolRequest(message.request) + } + } + + this.ws.onerror = (error) => { + console.error("Client tools WebSocket error:", error) + } + } + + private async handleToolRequest(request: { + requestID: string + sessionID: string + messageID: string + callID: string + tool: string + input: Record + }): Promise { + // Extract original tool ID (remove client_ prefix) + const prefixedID = request.tool + const originalID = prefixedID.replace(`client_${this.clientID}_`, "") + + const tool = this.tools.get(originalID) + + if (!tool) { + await this.submitResult(request.requestID, { + status: "error", + error: `Unknown tool: ${originalID}`, + }) + return + } + + try { + // Create abort controller for this execution + const controller = new AbortController() + const timeout = setTimeout(() => { + controller.abort() + }, this.config.timeout) + + const result = await tool.handler(request.input, { + sessionID: request.sessionID, + messageID: request.messageID, + callID: request.callID, + signal: controller.signal, + }) + + clearTimeout(timeout) + + await this.submitResult(request.requestID, { + status: "success", + title: result.title, + output: result.output, + metadata: result.metadata, + }) + } catch (error) { + await this.submitResult(request.requestID, { + status: "error", + error: error instanceof Error ? 
error.message : String(error), + }) + } + } + + private async submitResult( + requestID: string, + result: { status: "success" | "error"; [key: string]: unknown } + ): Promise { + if (this.ws) { + this.ws.send(JSON.stringify({ + type: "result", + requestID, + result, + })) + } else { + await fetch(`${this.baseUrl}/client-tools/result`, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify({ requestID, result }), + }) + } + } +} +``` + +### 3. Integration with OpencodeClient + +Modify `/packages/sdk/js/src/client.ts`: + +```typescript +import { ClientToolsManager } from "./client-tools" + +export class OpencodeClient { + private _client: Client + private _clientTools?: ClientToolsManager + private _clientID: string + + constructor(config: { client: Client }) { + this._client = config.client + this._clientID = crypto.randomUUID() + } + + /** + * Get client tools manager for registering and handling client-side tools + */ + get clientTools(): ClientToolsManager { + if (!this._clientTools) { + const baseUrl = (this._client as any).baseUrl + this._clientTools = new ClientToolsManager(this._clientID, baseUrl) + } + return this._clientTools + } + + /** + * Start a session with client tools support + */ + async startSession(options?: { + tools?: boolean + }): Promise { + const session = await this.session.create() + + if (options?.tools !== false) { + await this.clientTools.connect(session.id) + } + + return { + session, + prompt: (input: string) => this.session.prompt(session.id, input), + close: () => { + this.clientTools.disconnect() + }, + } + } +} + +interface SessionHandle { + session: Session + prompt: (input: string) => Promise + close: () => void +} +``` + +--- + +## Security Considerations + +### 1. 
Client Authentication + + ```typescript + // Validate client owns the session + export function validateClientSession( + clientID: string, + sessionID: string + ): boolean { + const session = Session.get(sessionID) + return session?.clientID === clientID + } + + // Add clientID to session creation + export async function createSession(clientID: string) { + return Session.create({ + clientID, + // ... other fields + }) + } + ``` + + ### 2. Tool Sandboxing + + - Client tools run in the client's environment (inherently sandboxed from server) + - Server tools continue to run on server + - Clear naming convention distinguishes client vs server tools + + ### 3. Input Validation + + ```typescript + // Validate tool input against JSON Schema before sending to client + import Ajv from "ajv" + + const ajv = new Ajv() + + export function validateToolInput( + tool: ClientToolDefinition, + input: Record<string, unknown> + ): boolean { + const validate = ajv.compile(tool.parameters) + return validate(input) + } + ``` + + ### 4. Timeout and Rate Limiting + + ```typescript + // Server-side timeout for client tool execution + const CLIENT_TOOL_TIMEOUT = 30000 // 30 seconds + + // Rate limiting per client + const rateLimiter = new Map<string, { count: number; reset: number }>() + + export function checkRateLimit(clientID: string): boolean { + const limit = rateLimiter.get(clientID) + const now = Date.now() + + if (!limit || now > limit.reset) { + rateLimiter.set(clientID, { + count: 1, + reset: now + 60000, // 1 minute window + }) + return true + } + + if (limit.count >= 100) { // 100 requests per minute + return false + } + + limit.count++ + return true + } + ``` + + ### 5. Permission Integration + + ```typescript + // Add client tool permission to Agent + export interface AgentPermission { + // ...
existing permissions + client_tools: "allow" | "ask" | "deny" +} + +// Check permission before executing client tool +if (agent.permission.client_tools === "deny") { + throw new Error("Client tools are not allowed for this agent") +} + +if (agent.permission.client_tools === "ask") { + await Permission.ask({ + type: "client_tool", + tool: toolID, + sessionID, + messageID, + callID, + }) +} +``` + +--- + +## Error Handling + +### 1. Connection Errors + +```typescript +// Auto-reconnect with exponential backoff +class ClientToolsManager { + private reconnectAttempts = 0 + private maxReconnectAttempts = 5 + + private async reconnect(): Promise { + if (this.reconnectAttempts >= this.maxReconnectAttempts) { + throw new Error("Max reconnection attempts reached") + } + + const delay = Math.min(1000 * Math.pow(2, this.reconnectAttempts), 30000) + await new Promise(resolve => setTimeout(resolve, delay)) + + this.reconnectAttempts++ + await this.connect(this.sessionID) + this.reconnectAttempts = 0 + } +} +``` + +### 2. Tool Execution Errors + +```typescript +// Graceful error handling in tool execution +try { + const result = await tool.handler(input, context) + return { status: "success", ...result } +} catch (error) { + // Log error for debugging + console.error(`Client tool ${toolID} failed:`, error) + + // Return error to server + return { + status: "error", + error: error instanceof Error ? error.message : "Unknown error", + } +} +``` + +### 3. 
Timeout Handling + +```typescript +// Server-side timeout +const timeoutPromise = new Promise((_, reject) => { + setTimeout(() => { + reject(new Error(`Client tool timed out after ${timeout}ms`)) + }, timeout) +}) + +const result = await Promise.race([ + ClientToolRegistry.execute(clientID, request), + timeoutPromise, +]) +``` + +--- + +## Usage Examples + +### Basic Client Tool + +```typescript +import { createOpencode } from "@opencode/sdk" + +const { client, server } = await createOpencode() + +// Register a client tool +await client.clientTools.register( + "get_local_time", + { + description: "Get the current local time on the client machine", + parameters: { + type: "object", + properties: { + timezone: { + type: "string", + description: "Timezone (e.g., 'America/New_York')", + }, + }, + }, + }, + async (input, ctx) => { + const tz = input.timezone as string || "UTC" + const time = new Date().toLocaleString("en-US", { timeZone: tz }) + + return { + title: `Local time (${tz})`, + output: time, + } + } +) + +// Start session with client tools +const { session, prompt, close } = await client.startSession() + +// Use the session - model can now call get_local_time +const response = await prompt("What time is it locally?") + +// Cleanup +close() +server.close() +``` + +### File System Access Tool + +```typescript +import { readFile } from "fs/promises" + +await client.clientTools.register( + "read_local_file", + { + description: "Read a file from the client's local filesystem", + parameters: { + type: "object", + properties: { + path: { + type: "string", + description: "Absolute path to the file", + }, + }, + required: ["path"], + }, + }, + async (input) => { + const path = input.path as string + const content = await readFile(path, "utf-8") + + return { + title: `Read ${path}`, + output: content, + } + } +) +``` + +### Database Query Tool + +```typescript +import { createConnection } from "mysql2/promise" + +const connection = await createConnection({ + host: 
"localhost", + user: "root", + database: "myapp", +}) + +await client.clientTools.register( + "query_database", + { + description: "Execute a read-only SQL query on the local database", + parameters: { + type: "object", + properties: { + query: { + type: "string", + description: "SQL SELECT query to execute", + }, + }, + required: ["query"], + }, + }, + async (input) => { + const query = input.query as string + + // Security: only allow SELECT queries + if (!query.trim().toLowerCase().startsWith("select")) { + throw new Error("Only SELECT queries are allowed") + } + + const [rows] = await connection.execute(query) + + return { + title: "Query results", + output: JSON.stringify(rows, null, 2), + metadata: { rowCount: (rows as any[]).length }, + } + } +) +``` + +--- + +## Implementation Plan + +### Phase 1: Core Infrastructure +1. Add message types to `message-v2.ts` +2. Create `ClientToolRegistry` module +3. Add API routes for registration and results +4. Integrate with `ToolRegistry` + +### Phase 2: SDK Implementation +1. Create `ClientToolsManager` class +2. Add SSE connection support +3. Integrate with `OpencodeClient` +4. Add TypeScript types + +### Phase 3: WebSocket Support +1. Add WebSocket handler on server +2. Add WebSocket connection option in SDK +3. Implement bidirectional messaging + +### Phase 4: Security & Polish +1. Add client authentication +2. Implement rate limiting +3. Add permission integration +4. Add comprehensive error handling + +### Phase 5: Testing & Documentation +1. Unit tests for registry and manager +2. Integration tests for full flow +3. Update SDK documentation +4. 
Add usage examples + +--- + +## Appendix: Modified Files Summary + +### New Files +- `/packages/opencode/src/tool/client-registry.ts` +- `/packages/opencode/src/server/routes/client-tools.ts` +- `/packages/opencode/src/server/routes/client-tools-ws.ts` +- `/packages/sdk/js/src/client-tools.ts` +- `/packages/sdk/js/src/types.ts` (new types) + +### Modified Files +- `/packages/opencode/src/session/message-v2.ts` (new message types) +- `/packages/opencode/src/tool/registry.ts` (integrate client tools) +- `/packages/opencode/src/server/index.ts` (add routes) +- `/packages/sdk/js/src/client.ts` (add clientTools property) +- `/packages/sdk/js/src/index.ts` (export new types) + +--- + +## Future Enhancements + +1. **Tool Discovery**: Allow clients to query available server tools +2. **Tool Streaming**: Support streaming output from client tools +3. **Tool Composition**: Allow client tools to call server tools +4. **Persistent Tools**: Option to persist tool registrations across sessions +5. **Tool Marketplace**: Share and discover community tools +6. 
**Tool Versioning**: Support multiple versions of the same tool From 690e46ab5f7fdbcf9bb36e66c1f9e6aa547d0cd0 Mon Sep 17 00:00:00 2001 From: Claude Date: Mon, 24 Nov 2025 06:25:35 +0000 Subject: [PATCH 05/58] docs: add server-side web service deployment design Comprehensive design documentation for deploying OpenCode as a multi-tenant web service including: - System architecture and component design - Authentication, authorization, and multi-tenancy - Database schema and storage strategies - Horizontal scaling and Kubernetes deployment - Security controls and compliance requirements - API design with versioning and streaming support --- docs/design/server-side-deployment/README.md | 94 ++ docs/design/server-side-deployment/api.md | 743 +++++++++++++++ .../server-side-deployment/architecture.md | 530 +++++++++++ .../server-side-deployment/authentication.md | 695 ++++++++++++++ docs/design/server-side-deployment/scaling.md | 866 ++++++++++++++++++ .../design/server-side-deployment/security.md | 751 +++++++++++++++ docs/design/server-side-deployment/storage.md | 740 +++++++++++++++ 7 files changed, 4419 insertions(+) create mode 100644 docs/design/server-side-deployment/README.md create mode 100644 docs/design/server-side-deployment/api.md create mode 100644 docs/design/server-side-deployment/architecture.md create mode 100644 docs/design/server-side-deployment/authentication.md create mode 100644 docs/design/server-side-deployment/scaling.md create mode 100644 docs/design/server-side-deployment/security.md create mode 100644 docs/design/server-side-deployment/storage.md diff --git a/docs/design/server-side-deployment/README.md b/docs/design/server-side-deployment/README.md new file mode 100644 index 00000000000..066a8a77336 --- /dev/null +++ b/docs/design/server-side-deployment/README.md @@ -0,0 +1,94 @@ +# OpenCode Server-Side Web Service Design + +## Overview + +This document describes the architecture for deploying OpenCode as a multi-tenant web service, enabling 
organizations to provide AI-powered coding assistance to multiple users through a centralized, scalable deployment. + +## Goals + +1. **Multi-tenancy**: Support multiple users and organizations with proper isolation +2. **Scalability**: Handle thousands of concurrent users with horizontal scaling +3. **Security**: Enterprise-grade authentication, authorization, and data protection +4. **Reliability**: High availability with fault tolerance and disaster recovery +5. **Observability**: Comprehensive monitoring, logging, and tracing + +## Current Architecture vs. Target Architecture + +| Aspect | Current (Desktop/CLI) | Target (Web Service) | +|--------|----------------------|---------------------| +| Users | Single user | Multi-tenant | +| Storage | Local filesystem (JSON) | Distributed database | +| Auth | Provider API keys only | User auth + provider delegation | +| Scaling | Single instance | Horizontal scaling | +| State | Per-directory instance | Per-user/workspace scoped | +| Networking | Local only | Internet-facing | + +## Design Documents + +1. **[Architecture](./architecture.md)** - System architecture and component design +2. **[Authentication](./authentication.md)** - User authentication and authorization +3. **[Storage](./storage.md)** - Data persistence and caching strategies +4. **[Scaling](./scaling.md)** - Horizontal scaling and deployment patterns +5. **[Security](./security.md)** - Security controls and compliance +6. 
**[API](./api.md)** - API design and versioning + +## High-Level Architecture + +``` + ┌─────────────────┐ + │ CDN/WAF │ + │ (Cloudflare) │ + └────────┬────────┘ + │ + ┌────────▼────────┐ + │ Load Balancer │ + │ (L7/HTTP) │ + └────────┬────────┘ + │ + ┌────────────────────────┼────────────────────────┐ + │ │ │ + ┌────────▼────────┐ ┌────────▼────────┐ ┌────────▼────────┐ + │ API Server │ │ API Server │ │ API Server │ + │ (Stateless) │ │ (Stateless) │ │ (Stateless) │ + └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ + │ │ │ + └────────────────────────┼────────────────────────┘ + │ + ┌──────────────┬───────────────┼───────────────┬──────────────┐ + │ │ │ │ │ + ┌────────▼────────┐ ┌───▼───┐ ┌───────▼───────┐ ┌────▼────┐ ┌──────▼──────┐ + │ PostgreSQL │ │ Redis │ │ Object Store │ │ Queue │ │ Metrics │ + │ (Sessions) │ │(Cache)│ │ (S3/R2/GCS) │ │ (NATS) │ │ (Prometheus)│ + └─────────────────┘ └───────┘ └───────────────┘ └─────────┘ └─────────────┘ +``` + +## Key Design Decisions + +### 1. Stateless API Servers +API servers are stateless, enabling horizontal scaling. Session state is stored in Redis, persistent data in PostgreSQL. + +### 2. Workspace-Based Multi-Tenancy +Each user has isolated workspaces. Workspaces contain projects, sessions, and configurations. + +### 3. Federated LLM Provider Access +Users can bring their own API keys or use organization-provided quotas with usage tracking. + +### 4. Event-Driven Architecture +Real-time updates via Server-Sent Events (SSE) with Redis Pub/Sub for cross-instance coordination. + +### 5. Git-First Project Model +Projects are identified by Git repositories. The service can integrate with GitHub/GitLab for workspace provisioning. + +## Deployment Options + +1. **Kubernetes** - Recommended for production (see [scaling.md](./scaling.md)) +2. **Docker Compose** - Development and small deployments +3. 
**Serverless** - AWS Lambda/Cloudflare Workers for specific endpoints + + ## Getting Started + + See the individual design documents for detailed specifications: + + - Start with [Architecture](./architecture.md) for system overview + - Review [Authentication](./authentication.md) for auth implementation + - Check [Security](./security.md) for compliance requirements diff --git a/docs/design/server-side-deployment/api.md b/docs/design/server-side-deployment/api.md new file mode 100644 index 00000000000..752fef825eb --- /dev/null +++ b/docs/design/server-side-deployment/api.md @@ -0,0 +1,743 @@ +# API Design + +## Overview + +This document specifies the API design for the OpenCode server-side deployment, including versioning strategy, authentication, error handling, and endpoint specifications. + +## API Versioning + +### Versioning Strategy + +Use URL path versioning for major versions with header-based minor versioning: + +``` +https://api.opencode.io/v1/sessions + ^^ + Major version + +Accept: application/json; version=1.2 + ^^^ + Minor version +``` + +### Version Lifecycle + +| Status | Description | Support | +|--------|-------------|---------| +| Current | Latest stable version | Full support | +| Deprecated | Previous version | 6 months | +| Sunset | End of life | No support | + +### Deprecation Headers + +```typescript +// Response headers for deprecated endpoints +c.header("Deprecation", "Wed, 01 Jan 2025 00:00:00 GMT") +c.header("Sunset", "Tue, 01 Jul 2025 00:00:00 GMT") +c.header("Link", '; rel="successor-version"') +``` + +## Authentication + +### Request Authentication + +```typescript +// Bearer token authentication +app.use("/api/*", async (c, next) => { + const authHeader = c.req.header("Authorization") + + if (!authHeader?.startsWith("Bearer ")) { + throw new AuthError("Missing authorization header", "MISSING_AUTH") + } + + const token = authHeader.substring(7) + + // Check if API key or JWT + if (token.startsWith("oc_")) { + // API key authentication +
const apiKey = await validateApiKey(token) + c.set("auth", { type: "apikey", ...apiKey }) + } else { + // JWT authentication + const jwt = await validateJwt(token) + c.set("auth", { type: "jwt", ...jwt }) + } + + await next() +}) +``` + +### API Key Format + +``` +oc_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxx +^^ ^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +| | | +| | +-- 24 bytes base64url +| +-- Environment (live/test) ++-- Prefix +``` + +## Request/Response Format + +### Request Headers + +``` +Content-Type: application/json +Authorization: Bearer +Accept: application/json +X-Request-ID: # Optional, for tracing +X-Idempotency-Key: # Optional, for idempotent operations +``` + +### Response Headers + +``` +Content-Type: application/json +X-Request-ID: +X-RateLimit-Limit: 100 +X-RateLimit-Remaining: 95 +X-RateLimit-Reset: 1609459200 +``` + +### Pagination + +```typescript +// Cursor-based pagination +interface PaginatedResponse { + data: T[] + pagination: { + cursor?: string + hasMore: boolean + total?: number + } +} + +// Query parameters +interface PaginationParams { + cursor?: string // Opaque cursor + limit?: number // Default: 50, Max: 100 +} + +// Example request +// GET /api/v1/sessions?limit=20&cursor=eyJpZCI6IjEyMyJ9 +``` + +### Filtering & Sorting + +```typescript +// Query parameter format +interface ListParams { + // Filtering + filter?: { + status?: string[] + createdAfter?: string // ISO 8601 + createdBefore?: string + } + // Sorting + sort?: string // Field name + order?: "asc" | "desc" +} + +// Example +// GET /api/v1/sessions?filter[status]=active&sort=createdAt&order=desc +``` + +## Error Handling + +### Error Response Format + +```typescript +interface ErrorResponse { + error: { + code: string // Machine-readable error code + message: string // Human-readable message + details?: unknown // Additional context + requestId: string // For support reference + docs?: string // Link to documentation + } +} +``` + +### Error Codes + +```typescript +// Error code hierarchy 
+const ErrorCodes = { + // Authentication errors (401) + AUTH_MISSING_TOKEN: "Missing authentication token", + AUTH_INVALID_TOKEN: "Invalid or expired token", + AUTH_INSUFFICIENT_SCOPE: "Token lacks required scope", + + // Authorization errors (403) + FORBIDDEN: "Access denied", + ORG_ACCESS_DENIED: "Not a member of this organization", + RESOURCE_ACCESS_DENIED: "No access to this resource", + + // Validation errors (400) + VALIDATION_ERROR: "Request validation failed", + INVALID_PARAMETER: "Invalid parameter value", + MISSING_PARAMETER: "Required parameter missing", + + // Not found errors (404) + NOT_FOUND: "Resource not found", + SESSION_NOT_FOUND: "Session not found", + PROJECT_NOT_FOUND: "Project not found", + + // Conflict errors (409) + CONFLICT: "Resource conflict", + SESSION_ALREADY_EXISTS: "Session already exists", + CONCURRENT_MODIFICATION: "Resource was modified", + + // Rate limiting (429) + RATE_LIMITED: "Too many requests", + QUOTA_EXCEEDED: "Usage quota exceeded", + + // Server errors (500) + INTERNAL_ERROR: "Internal server error", + SERVICE_UNAVAILABLE: "Service temporarily unavailable", + PROVIDER_ERROR: "LLM provider error", +} +``` + +### HTTP Status Codes + +| Code | Usage | +|------|-------| +| 200 | Success with body | +| 201 | Resource created | +| 204 | Success, no body | +| 400 | Validation error | +| 401 | Authentication required | +| 403 | Authorization denied | +| 404 | Resource not found | +| 409 | Conflict | +| 422 | Unprocessable entity | +| 429 | Rate limited | +| 500 | Server error | +| 503 | Service unavailable | + +## Streaming Responses + +### Server-Sent Events + +```typescript +// SSE endpoint for real-time events +app.get("/api/v1/events", async (c) => { + return streamSSE(c, async (stream) => { + // Connection established + await stream.writeSSE({ + event: "connected", + data: JSON.stringify({ timestamp: Date.now() }), + }) + + // Subscribe to events + const unsub = eventBus.subscribe(c.get("userId"), async (event) => { + 
await stream.writeSSE({ + event: event.type, + data: JSON.stringify(event.payload), + id: event.id, + }) + }) + + // Heartbeat every 30 seconds + const heartbeat = setInterval(() => { + stream.writeSSE({ event: "ping", data: "" }) + }, 30000) + + // Cleanup on disconnect + stream.onAbort(() => { + clearInterval(heartbeat) + unsub() + }) + }) +}) +``` + +### Streaming Chat Response + +```typescript +// POST /api/v1/sessions/:id/messages +// Returns streaming response +app.post("/api/v1/sessions/:id/messages", async (c) => { + const { id } = c.req.param() + const body = await c.req.json() + + return streamSSE(c, async (stream) => { + const generator = sessionOrchestrator.chat(id, body) + + for await (const event of generator) { + await stream.writeSSE({ + event: event.type, + data: JSON.stringify(event), + }) + } + + // Signal completion + await stream.writeSSE({ + event: "done", + data: JSON.stringify({ messageId: "..." }), + }) + }) +}) +``` + +### Event Types + +```typescript +type StreamEvent = + | { type: "message.start"; messageId: string } + | { type: "text.delta"; content: string } + | { type: "text.done"; content: string } + | { type: "tool.start"; toolId: string; name: string } + | { type: "tool.input"; content: string } + | { type: "tool.output"; content: string } + | { type: "tool.done"; result: unknown } + | { type: "message.done"; usage: Usage } + | { type: "error"; error: Error } +``` + +## API Endpoints + +### Sessions + +```typescript +// List sessions +// GET /api/v1/sessions +interface ListSessionsResponse { + data: Session[] + pagination: Pagination +} + +// Create session +// POST /api/v1/sessions +interface CreateSessionRequest { + workspaceId: string + title?: string + model?: { + providerId: string + modelId: string + } +} + +// Get session +// GET /api/v1/sessions/:id +interface GetSessionResponse { + data: Session +} + +// Update session +// PATCH /api/v1/sessions/:id +interface UpdateSessionRequest { + title?: string +} + +// Delete session 
+// DELETE /api/v1/sessions/:id + +// Send message (streaming) +// POST /api/v1/sessions/:id/messages +interface SendMessageRequest { + content: string + files?: FileAttachment[] +} + +// List messages +// GET /api/v1/sessions/:id/messages +interface ListMessagesResponse { + data: Message[] + pagination: Pagination +} + +// Abort session +// POST /api/v1/sessions/:id/abort + +// Fork session +// POST /api/v1/sessions/:id/fork +interface ForkSessionRequest { + messageId: string +} + +// Share session +// POST /api/v1/sessions/:id/share +interface ShareSessionResponse { + url: string + expiresAt: string +} +``` + +### Workspaces + +```typescript +// List workspaces +// GET /api/v1/workspaces +interface ListWorkspacesResponse { + data: Workspace[] + pagination: Pagination +} + +// Create workspace +// POST /api/v1/workspaces +interface CreateWorkspaceRequest { + name: string + description?: string + gitConfig?: { + provider: "github" | "gitlab" + repoUrl: string + branch?: string + } +} + +// Get workspace +// GET /api/v1/workspaces/:id + +// Update workspace +// PATCH /api/v1/workspaces/:id + +// Delete workspace +// DELETE /api/v1/workspaces/:id + +// List workspace projects +// GET /api/v1/workspaces/:id/projects +``` + +### Projects + +```typescript +// List projects +// GET /api/v1/projects + +// Create project +// POST /api/v1/projects +interface CreateProjectRequest { + workspaceId: string + name: string + path?: string +} + +// Get project +// GET /api/v1/projects/:id + +// Update project +// PATCH /api/v1/projects/:id + +// Delete project +// DELETE /api/v1/projects/:id +``` + +### Files + +```typescript +// List files in workspace +// GET /api/v1/workspaces/:id/files +interface ListFilesRequest { + path?: string // Directory path + pattern?: string // Glob pattern +} + +// Get file content +// GET /api/v1/workspaces/:id/files/content +interface GetFileContentRequest { + path: string + encoding?: "utf8" | "base64" +} + +// Search in files +// GET 
/api/v1/workspaces/:id/files/search +interface SearchFilesRequest { + query: string + path?: string + type?: string // File type filter +} + +// Git status +// GET /api/v1/workspaces/:id/git/status +``` + +### Providers + +```typescript +// List available providers +// GET /api/v1/providers +interface ListProvidersResponse { + data: Provider[] +} + +// List models for provider +// GET /api/v1/providers/:id/models +interface ListModelsResponse { + data: Model[] +} + +// Get user's provider config +// GET /api/v1/providers/:id/config + +// Set provider API key (BYOK) +// PUT /api/v1/providers/:id/key +interface SetProviderKeyRequest { + apiKey: string +} + +// Delete provider key +// DELETE /api/v1/providers/:id/key +``` + +### Users & Organizations + +```typescript +// Get current user +// GET /api/v1/users/me +interface GetCurrentUserResponse { + data: User +} + +// Update user preferences +// PATCH /api/v1/users/me +interface UpdateUserRequest { + name?: string + preferences?: UserPreferences +} + +// Get organization +// GET /api/v1/organizations/:id + +// List organization members +// GET /api/v1/organizations/:id/members + +// Invite member +// POST /api/v1/organizations/:id/invitations + +// Remove member +// DELETE /api/v1/organizations/:id/members/:userId +``` + +### API Keys + +```typescript +// List API keys +// GET /api/v1/api-keys +interface ListApiKeysResponse { + data: ApiKey[] // Keys shown with prefix only +} + +// Create API key +// POST /api/v1/api-keys +interface CreateApiKeyRequest { + name: string + scopes: Scope[] + expiresAt?: string +} +interface CreateApiKeyResponse { + key: string // Full key shown once + data: ApiKey +} + +// Delete API key +// DELETE /api/v1/api-keys/:id +``` + +### Usage & Billing + +```typescript +// Get usage summary +// GET /api/v1/usage +interface GetUsageRequest { + period?: "day" | "week" | "month" + startDate?: string + endDate?: string +} +interface GetUsageResponse { + data: { + tokens: { + input: number + 
output: number + total: number + } + cost: number + byProvider: Record + byModel: Record + } +} + +// Get usage breakdown +// GET /api/v1/usage/breakdown +interface UsageBreakdownResponse { + data: UsageRecord[] + pagination: Pagination +} +``` + +## Webhooks + +### Webhook Configuration + +```typescript +// Register webhook +// POST /api/v1/webhooks +interface CreateWebhookRequest { + url: string + events: WebhookEvent[] + secret?: string +} + +// Webhook events +type WebhookEvent = + | "session.created" + | "session.completed" + | "session.error" + | "message.created" + | "usage.threshold" +``` + +### Webhook Payload + +```typescript +import * as crypto from "node:crypto" + +interface WebhookPayload { + id: string + type: WebhookEvent + timestamp: string + data: unknown +} + +// Signature verification +// X-Webhook-Signature: sha256= +function verifyWebhook(payload: string, signature: string, secret: string): boolean { + const expected = crypto + .createHmac("sha256", secret) + .update(payload) + .digest("hex") + const received = Buffer.from(signature) + const computed = Buffer.from(`sha256=${expected}`) + // timingSafeEqual throws when lengths differ, so reject a mismatch first + if (received.length !== computed.length) { + return false + } + return crypto.timingSafeEqual(received, computed) +} +``` + +## Rate Limiting + +### Limits by Plan + +| Plan | Requests/min | Messages/day | Tokens/month | +|------|-------------|--------------|--------------| +| Free | 20 | 100 | 100K | +| Team | 100 | 1,000 | 1M | +| Enterprise | Custom | Custom | Custom | + +### Rate Limit Headers + +``` +X-RateLimit-Limit: 100 +X-RateLimit-Remaining: 95 +X-RateLimit-Reset: 1609459200 +Retry-After: 30 +``` + +### Rate Limit Response + +```json +{ + "error": { + "code": "RATE_LIMITED", + "message": "Too many requests", + "details": { + "limit": 100, + "remaining": 0, + "reset": 1609459200, + "retryAfter": 30 + }, + "requestId": "req_xxx" + } +} +``` + +## SDK Examples + +### TypeScript/JavaScript + +```typescript +import { OpenCodeClient } from "@opencode/sdk" + +const client = new OpenCodeClient({ + apiKey: "oc_live_xxx", + baseUrl: "https://api.opencode.io", +}) + +// Create session +const session = await
client.sessions.create({ + workspaceId: "ws_xxx", + title: "Debug authentication", +}) + +// Send message and stream response +const stream = client.sessions.chat(session.id, { + content: "Find and fix the authentication bug", +}) + +for await (const event of stream) { + if (event.type === "text.delta") { + process.stdout.write(event.content) + } +} + +// List sessions +const sessions = await client.sessions.list({ + limit: 20, + filter: { status: ["active"] }, +}) +``` + +### Python + +```python +from opencode import OpenCodeClient + +client = OpenCodeClient(api_key="oc_live_xxx") + +# Create session +session = client.sessions.create( + workspace_id="ws_xxx", + title="Debug authentication" +) + +# Send message and stream response +stream = client.sessions.chat( + session.id, + content="Find and fix the authentication bug" +) + +for event in stream: + if event.type == "text.delta": + print(event.content, end="", flush=True) +``` + +### cURL + +```bash +# Create session +curl -X POST https://api.opencode.io/v1/sessions \ + -H "Authorization: Bearer oc_live_xxx" \ + -H "Content-Type: application/json" \ + -d '{"workspaceId": "ws_xxx", "title": "Debug auth"}' + +# Send message (streaming) +curl -X POST https://api.opencode.io/v1/sessions/sess_xxx/messages \ + -H "Authorization: Bearer oc_live_xxx" \ + -H "Content-Type: application/json" \ + -H "Accept: text/event-stream" \ + -d '{"content": "Find and fix the authentication bug"}' +``` + +## OpenAPI Specification + +The complete OpenAPI 3.1 specification is available at: + +``` +GET /api/v1/openapi.json +GET /api/v1/openapi.yaml +``` + +Interactive documentation (Swagger UI): + +``` +GET /docs +``` diff --git a/docs/design/server-side-deployment/architecture.md b/docs/design/server-side-deployment/architecture.md new file mode 100644 index 00000000000..ef40ab3e6ef --- /dev/null +++ b/docs/design/server-side-deployment/architecture.md @@ -0,0 +1,530 @@ +# System Architecture + +## Component Overview + +### 1. 
API Gateway Layer + +**Purpose**: Entry point for all client requests, handling routing, rate limiting, and initial authentication. + +```typescript +interface GatewayConfig { + rateLimiting: { + requests: number // per window + window: "second" | "minute" | "hour" + byUser: boolean // per-user limits + byOrg: boolean // per-org limits + } + cors: { + origins: string[] + credentials: boolean + } + tls: { + minVersion: "1.2" | "1.3" + ciphers: string[] + } +} +``` + +**Responsibilities**: +- TLS termination +- Request routing +- Rate limiting (token bucket algorithm) +- Request/response logging +- CORS handling +- Request ID injection + +### 2. API Server (Hono) + +**Purpose**: Core business logic, session management, and LLM orchestration. + +```typescript +// Server initialization with multi-tenant support +export function createServer(config: ServerConfig) { + const app = new Hono() + + // Middleware stack + app.use(requestId()) + app.use(logger()) + app.use(authenticate()) // JWT validation + app.use(tenantContext()) // Inject user/org context + app.use(rateLimitMiddleware()) + + // Routes + app.route("/api/v1/sessions", sessionRoutes) + app.route("/api/v1/projects", projectRoutes) + app.route("/api/v1/workspaces", workspaceRoutes) + app.route("/api/v1/providers", providerRoutes) + + return app +} +``` + +**Key Modifications from Current Architecture**: + +| Current | Server-Side | +|---------|-------------| +| `Instance.provide({ directory })` | `TenantContext.provide({ userId, orgId, workspaceId })` | +| File-based storage | Database + Object storage | +| Single event bus | Redis Pub/Sub | +| Local Git operations | Remote Git service integration | + +### 3. Session Orchestrator + +**Purpose**: Manages AI sessions, tool execution, and streaming responses. 
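One way to keep session management predictable is to treat the lifecycle states this section names (Created, Active, Idle, Archived, Aborted, Expired) as an explicit transition table. The sketch below is illustrative only: `SessionState`, `TRANSITIONS`, `canTransition`, and `transition` are hypothetical names, and the edges assume one reading of the lifecycle (Active may be aborted, Idle may expire), which the design does not spell out.

```typescript
// Hypothetical state names mirroring the lifecycle states in this section.
type SessionState = "created" | "active" | "idle" | "archived" | "aborted" | "expired"

// Assumed transition table: Created -> Active -> Idle -> Archived,
// with Active -> Aborted and Idle -> Expired as side exits.
const TRANSITIONS: Record<SessionState, SessionState[]> = {
  created: ["active"],
  active: ["idle", "aborted"],
  idle: ["archived", "expired"],
  archived: [],
  aborted: [],
  expired: [],
}

// True when the table allows moving from `from` to `to`.
function canTransition(from: SessionState, to: SessionState): boolean {
  return TRANSITIONS[from].includes(to)
}

// Returns the new state, or throws on a move the table does not allow.
function transition(from: SessionState, to: SessionState): SessionState {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid session transition: ${from} -> ${to}`)
  }
  return to
}
```

Centralizing the table means an orchestrator could reject an `abort` on an already archived session in one place instead of scattering checks across handlers.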
+ +```typescript +interface SessionOrchestrator { + // Create new session in workspace + create(ctx: TenantContext, input: CreateSessionInput): Promise + + // Send message and stream response + chat(ctx: TenantContext, sessionId: string, message: Message): AsyncGenerator + + // Execute tool with sandboxing + executeTool(ctx: TenantContext, sessionId: string, tool: ToolCall): Promise + + // Abort running session + abort(ctx: TenantContext, sessionId: string): Promise +} +``` + +**Session Lifecycle**: +``` +┌─────────┐ ┌──────────┐ ┌─────────┐ ┌───────────┐ +│ Created │ ──▶ │ Active │ ──▶ │ Idle │ ──▶ │ Archived │ +└─────────┘ └──────────┘ └─────────┘ └───────────┘ + │ │ + ▼ ▼ + ┌──────────┐ ┌──────────┐ + │ Aborted │ │ Expired │ + └──────────┘ └──────────┘ +``` + +### 4. Tool Execution Engine + +**Purpose**: Sandboxed execution of code tools (Bash, file operations, etc.) + +```typescript +interface ToolExecutionConfig { + sandbox: { + type: "docker" | "firecracker" | "gvisor" + image: string + resources: { + cpuLimit: string // "1000m" + memoryLimit: string // "512Mi" + diskLimit: string // "1Gi" + timeout: number // ms + } + network: { + enabled: boolean + egress: string[] // allowed domains + } + } + workspace: { + mount: string // /workspace + readonly: string[] // paths + } +} +``` + +**Execution Flow**: +``` +Tool Request ──▶ Validate ──▶ Acquire Sandbox ──▶ Mount Workspace + │ + ▼ +Tool Response ◀── Cleanup ◀── Capture Output ◀── Execute Command +``` + +### 5. Provider Gateway + +**Purpose**: Manages LLM provider connections with key rotation and failover. 
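
As a rough illustration of the failover behaviour named above, the sketch below tries providers in priority order and falls through to the next one on error. The `Provider` type and `callWithFailover` are hypothetical names used for illustration, not part of this design.

```typescript
// Hypothetical provider shape for illustration only
type Provider = {
  id: string
  call: (prompt: string) => Promise<string>
}

// Try each provider in order and return the first successful result.
// Errors are collected so the caller can see why every provider failed.
async function callWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<{ providerId: string; output: string }> {
  const errors: string[] = []
  for (const provider of providers) {
    try {
      const output = await provider.call(prompt)
      return { providerId: provider.id, output }
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err)
      errors.push(`${provider.id}: ${reason}`)
    }
  }
  throw new Error(`All providers failed (${errors.join("; ")})`)
}
```

A production gateway would likely also distinguish retryable errors (rate limits, timeouts) from fatal ones before moving to the next provider.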
+ +```typescript +interface ProviderGateway { + // Route request to appropriate provider + route(ctx: TenantContext, request: LLMRequest): Promise + + // Stream response from provider + stream(ctx: TenantContext, request: LLMRequest): AsyncGenerator + + // Get available models for user + models(ctx: TenantContext): Promise +} + +interface ProviderConfig { + anthropic: { + apiKey: string | { vault: string } + baseUrl?: string + rateLimit: RateLimit + } + openai: { + apiKey: string | { vault: string } + organization?: string + rateLimit: RateLimit + } + // ... other providers +} +``` + +**Key Management**: +- Organization-level keys stored in Vault/KMS +- User BYOK (Bring Your Own Key) with encryption at rest +- Automatic key rotation support +- Usage attribution per key + +## Data Models + +### Tenant Hierarchy + +``` +Organization +├── Users (members) +├── Teams +├── API Keys +├── Provider Configs +└── Workspaces + ├── Projects + │ ├── Git Config + │ └── Project Settings + └── Sessions + ├── Messages + │ └── Parts + └── Diffs +``` + +### Core Entities + +```typescript +// Organization - top-level tenant +interface Organization { + id: string + name: string + slug: string + plan: "free" | "team" | "enterprise" + settings: OrgSettings + createdAt: Date + updatedAt: Date +} + +// User within organization +interface User { + id: string + orgId: string + email: string + name: string + role: "owner" | "admin" | "member" + preferences: UserPreferences + createdAt: Date + lastActiveAt: Date +} + +// Workspace - isolated environment +interface Workspace { + id: string + orgId: string + name: string + description?: string + gitConfig?: { + provider: "github" | "gitlab" | "bitbucket" + repoUrl: string + branch: string + credentials: EncryptedCredentials + } + settings: WorkspaceSettings + createdAt: Date + updatedAt: Date +} + +// Project within workspace +interface Project { + id: string + workspaceId: string + name: string + path: string + gitCommit?: string + settings: 
ProjectSettings + createdAt: Date + updatedAt: Date +} + +// Session (conversation) +interface Session { + id: string + projectId: string + userId: string + title: string + status: SessionStatus + model: { + providerId: string + modelId: string + } + summary?: SessionSummary + createdAt: Date + updatedAt: Date + expiresAt?: Date +} + +// Message within session +interface Message { + id: string + sessionId: string + role: "user" | "assistant" | "system" + content: MessageContent + metadata: MessageMetadata + createdAt: Date +} + +// Message part (text, tool, file, etc.) +interface MessagePart { + id: string + messageId: string + type: PartType + content: PartContent + order: number + createdAt: Date +} +``` + +## Request Flow + +### Chat Request Flow + +``` +1. Client sends POST /api/v1/sessions/:id/messages + │ +2. API Gateway validates JWT, applies rate limit + │ +3. API Server receives request + │ ├── Validate session ownership + │ ├── Load session context from DB + │ └── Check user quota + │ +4. Session Orchestrator processes message + │ ├── Build prompt with history + │ ├── Select provider/model + │ └── Apply system prompts + │ +5. Provider Gateway streams to LLM + │ ├── Apply org/user API key + │ ├── Track token usage + │ └── Handle retries/failover + │ +6. Tool Execution (if needed) + │ ├── Spawn sandboxed container + │ ├── Mount workspace files + │ ├── Execute tool + │ └── Capture output + │ +7. Stream response to client + │ ├── Publish events to Redis + │ ├── Persist to database + │ └── SSE to client + │ +8. 
Update usage metrics +``` + +### Event Distribution + +```typescript +// Cross-instance event distribution +interface EventDistributor { + // Publish event to all subscribers + publish(channel: string, event: Event): Promise<void> + + // Subscribe to events for user/session + subscribe(channel: string, handler: EventHandler): Unsubscribe +} + +// Redis Pub/Sub channels +const channels = { + session: (sessionId: string) => `session:${sessionId}`, + user: (userId: string) => `user:${userId}`, + workspace: (workspaceId: string) => `workspace:${workspaceId}`, +} +``` + +**SSE Connection Management**: +```typescript +// Server-Sent Events with Redis coordination +app.get("/api/v1/events", async (c) => { + const { userId } = c.get("tenant") + + return streamSSE(c, async (stream) => { + // Subscribe to user's events + const unsub = await eventDistributor.subscribe( + channels.user(userId), + async (event) => { + await stream.writeSSE({ data: JSON.stringify(event) }) + } + ) + + // Heartbeat to keep connection alive + const heartbeat = setInterval(() => { + stream.writeSSE({ event: "ping", data: "" }) + }, 30000) + + // streamSSE closes the stream once this callback resolves, + // so block here until the client disconnects + await new Promise<void>((resolve) => { + stream.onAbort(() => { + clearInterval(heartbeat) + unsub() + resolve() + }) + }) + }) +}) +``` + +## Service Dependencies + +### Required Services + +| Service | Purpose | Recommended | +|---------|---------|-------------| +| PostgreSQL | Primary database | PostgreSQL 15+ | +| Redis | Cache, pub/sub, sessions | Redis 7+ / Valkey | +| Object Storage | File storage, artifacts | S3/R2/GCS | +| Message Queue | Background jobs | NATS / Redis Streams | + +### Optional Services + +| Service | Purpose | Options | +|---------|---------|---------| +| Vault | Secret management | HashiCorp Vault, AWS KMS | +| Git Service | Repo management | GitHub, GitLab, Gitea | +| Metrics | Observability | Prometheus, Datadog | +| Tracing | Distributed tracing | Jaeger, Tempo | + +## Configuration + +### Environment Variables + +```bash +# Server +PORT=3000 +HOST=0.0.0.0 +NODE_ENV=production + +# 
Database +DATABASE_URL=postgresql://user:pass@host:5432/opencode +DATABASE_POOL_SIZE=20 + +# Redis +REDIS_URL=redis://host:6379 +REDIS_CLUSTER=true + +# Object Storage +STORAGE_PROVIDER=s3 +STORAGE_BUCKET=opencode-files +STORAGE_REGION=us-east-1 +AWS_ACCESS_KEY_ID=xxx +AWS_SECRET_ACCESS_KEY=xxx + +# Auth +JWT_SECRET=xxx +JWT_ISSUER=https://auth.opencode.io +OAUTH_GITHUB_CLIENT_ID=xxx +OAUTH_GITHUB_CLIENT_SECRET=xxx + +# LLM Providers (org defaults) +ANTHROPIC_API_KEY=xxx +OPENAI_API_KEY=xxx + +# Feature Flags +ENABLE_SANDBOXED_EXECUTION=true +ENABLE_GIT_INTEGRATION=true +MAX_CONCURRENT_SESSIONS=10 +``` + +### Runtime Configuration + +```typescript +interface ServerConfig { + server: { + port: number + host: string + trustProxy: boolean + } + database: { + url: string + poolSize: number + ssl: boolean + } + redis: { + url: string + cluster: boolean + } + storage: { + provider: "s3" | "r2" | "gcs" | "local" + bucket: string + region: string + } + auth: { + jwtSecret: string + jwtIssuer: string + sessionTtl: number + } + limits: { + maxSessionsPerUser: number + maxMessagesPerSession: number + maxFileSizeMb: number + requestTimeoutMs: number + } + sandbox: { + enabled: boolean + provider: "docker" | "firecracker" + poolSize: number + } +} +``` + +## Deployment Architecture + +### Kubernetes Deployment + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: opencode-api +spec: + replicas: 3 + selector: + matchLabels: + app: opencode-api + template: + metadata: + labels: + app: opencode-api + spec: + containers: + - name: api + image: opencode/api:latest + ports: + - containerPort: 3000 + resources: + requests: + memory: "512Mi" + cpu: "500m" + limits: + memory: "2Gi" + cpu: "2000m" + env: + - name: DATABASE_URL + valueFrom: + secretKeyRef: + name: opencode-secrets + key: database-url + livenessProbe: + httpGet: + path: /health + port: 3000 + readinessProbe: + httpGet: + path: /ready + port: 3000 +``` + +### Service Mesh + +For production deployments, 
consider: +- **Istio/Linkerd** for service mesh +- **mTLS** between services +- **Circuit breakers** for provider calls +- **Retry policies** with exponential backoff diff --git a/docs/design/server-side-deployment/authentication.md b/docs/design/server-side-deployment/authentication.md new file mode 100644 index 00000000000..4cd3e914acd --- /dev/null +++ b/docs/design/server-side-deployment/authentication.md @@ -0,0 +1,695 @@ +# Authentication & Authorization + +## Overview + +The server-side deployment requires a comprehensive auth system supporting multiple authentication methods, organization-based multi-tenancy, and fine-grained access control. + +## Authentication Methods + +### 1. OAuth 2.0 / OIDC + +Primary authentication method for web and desktop clients. + +```typescript +interface OAuthConfig { + providers: { + github: { + clientId: string + clientSecret: string + scopes: ["user:email", "read:org"] + } + google: { + clientId: string + clientSecret: string + scopes: ["email", "profile"] + } + microsoft: { + clientId: string + clientSecret: string + tenant: string + } + // Custom OIDC provider for enterprise + oidc?: { + issuer: string + clientId: string + clientSecret: string + scopes: string[] + } + } +} +``` + +**OAuth Flow**: +``` +1. Client redirects to /auth/login/:provider +2. Server redirects to provider authorization URL +3. User authenticates with provider +4. Provider redirects to /auth/callback/:provider +5. Server exchanges code for tokens +6. Server creates/updates user record +7. Server issues JWT + refresh token +8. Client stores tokens securely +``` + +### 2. API Keys + +For programmatic access (CI/CD, SDK, CLI). 
+ +```typescript +interface ApiKey { + id: string + orgId: string + userId: string + name: string + prefix: string // First 16 chars (`oc_live_` plus 8 random chars) for identification + hash: string // Argon2 hash of full key + scopes: Scope[] + rateLimit?: RateLimit + expiresAt?: Date + lastUsedAt?: Date + createdAt: Date +} + +// Key format: oc_live_ followed by 32 base64url characters (24 random bytes) +// Prefix identifies key type (live/test) +``` + +**Key Generation**: +```typescript +import crypto from "node:crypto" +import argon2 from "argon2" + +async function generateApiKey(input: CreateKeyInput): Promise<{ key: string; record: ApiKey }> { + const key = `oc_live_${crypto.randomBytes(24).toString('base64url')}` + const hash = await argon2.hash(key) + + const record: ApiKey = { + id: generateId(), + orgId: input.orgId, + userId: input.userId, + name: input.name, + prefix: key.substring(0, 16), // `oc_live_` + first 8 random chars + hash, + scopes: input.scopes, + createdAt: new Date(), + } + + await db.apiKeys.insert(record) + + return { key, record } // The full key is returned only here; only its hash is stored +} +``` + +### 3. Personal Access Tokens (PAT) + +User-scoped tokens with limited lifetime. 
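
Because these tokens carry a mandatory expiry, validation has to check the lifetime as well as the hash. A minimal sketch of that check, assuming a hypothetical `isTokenUsable` helper and a trimmed-down token shape; the optional `revokedAt` field is an assumption for illustration and is not part of this design's interface.

```typescript
// Trimmed-down token shape for illustration; `revokedAt` is a hypothetical field
type TokenLifetime = {
  expiresAt: Date
  revokedAt?: Date
}

// A token is usable only if it has not been revoked and has not yet expired.
// `now` is injectable so the check can be tested deterministically.
function isTokenUsable(token: TokenLifetime, now: Date = new Date()): boolean {
  if (token.revokedAt && token.revokedAt.getTime() <= now.getTime()) return false
  return token.expiresAt.getTime() > now.getTime()
}
```

Running this check before the hash comparison avoids spending an expensive hash verification on tokens that are already dead.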
+ +```typescript +interface PersonalAccessToken { + id: string + userId: string + name: string + hash: string + scopes: Scope[] + expiresAt: Date + createdAt: Date +} +``` + +## Token Management + +### JWT Structure + +```typescript +interface JWTPayload { + // Standard claims + iss: string // Issuer + sub: string // User ID + aud: string[] // Audience + exp: number // Expiration + iat: number // Issued at + jti: string // Token ID + + // Custom claims + org_id: string // Organization ID + org_role: OrgRole // Role in organization + scopes: string[] // Granted scopes + session_id?: string // For session-specific tokens +} +``` + +### Token Lifecycle + +```typescript +const tokenConfig = { + access: { + ttl: 15 * 60, // 15 minutes + algorithm: "RS256", + }, + refresh: { + ttl: 7 * 24 * 60 * 60, // 7 days + rotation: true, // Single-use refresh tokens + family: true, // Track token families + }, +} +``` + +**Refresh Token Rotation**: +```typescript +async function refreshTokens(refreshToken: string): Promise { + const payload = await verifyRefreshToken(refreshToken) + + // Check if token was already used (replay attack) + const tokenRecord = await db.refreshTokens.findById(payload.jti) + if (tokenRecord.used) { + // Token reuse detected - revoke entire family + await db.refreshTokens.revokeFamily(tokenRecord.familyId) + throw new AuthError("Token reuse detected", "TOKEN_REUSE") + } + + // Mark current token as used + await db.refreshTokens.markUsed(payload.jti) + + // Issue new token pair + return issueTokens(payload.sub, { + familyId: tokenRecord.familyId, + }) +} +``` + +## Authorization Model + +### Role-Based Access Control (RBAC) + +```typescript +type OrgRole = "owner" | "admin" | "member" | "guest" + +interface Permission { + resource: Resource + action: Action +} + +type Resource = + | "organization" + | "workspace" + | "project" + | "session" + | "user" + | "api_key" + | "provider" + | "billing" + +type Action = + | "create" + | "read" + | "update" + | 
"delete" + | "manage" + | "execute" +``` + +**Role Permissions Matrix**: + +| Permission | Owner | Admin | Member | Guest | +|------------|-------|-------|--------|-------| +| org:manage | yes | no | no | no | +| org:read | yes | yes | yes | yes | +| workspace:create | yes | yes | no | no | +| workspace:delete | yes | yes | no | no | +| project:create | yes | yes | yes | no | +| session:create | yes | yes | yes | yes | +| session:read (own) | yes | yes | yes | yes | +| session:read (all) | yes | yes | no | no | +| api_key:create | yes | yes | yes | no | +| provider:manage | yes | yes | no | no | +| billing:manage | yes | no | no | no | + +### Scope-Based Access (API Keys) + +```typescript +type Scope = + | "sessions:read" + | "sessions:write" + | "projects:read" + | "projects:write" + | "workspaces:read" + | "workspaces:write" + | "files:read" + | "files:write" + | "tools:execute" + | "admin" +``` + +**Scope Validation**: +```typescript +function requireScopes(...required: Scope[]) { + return async (c: Context, next: Next) => { + const granted = c.get("scopes") as Scope[] + + for (const scope of required) { + if (!granted.includes(scope) && !granted.includes("admin")) { + throw new AuthError(`Missing scope: ${scope}`, "INSUFFICIENT_SCOPE") + } + } + + await next() + } +} + +// Usage +app.post("/sessions/:id/messages", + requireScopes("sessions:write"), + sessionController.sendMessage +) +``` + +### Resource-Level Authorization + +```typescript +interface ResourcePolicy { + check(ctx: TenantContext, resource: Resource, action: Action): Promise +} + +class SessionPolicy implements ResourcePolicy { + async check(ctx: TenantContext, session: Session, action: Action): Promise { + // Owners can do anything + if (ctx.orgRole === "owner") return true + + // Check if user owns the session + const isOwner = session.userId === ctx.userId + + switch (action) { + case "read": + // Members can read own sessions, admins can read all + return isOwner || ctx.orgRole === "admin" + + 
case "update": + case "delete": + // Only the session's creator or an org admin can modify + return isOwner || ctx.orgRole === "admin" + + case "execute": + // Only the session's creator can execute tools in the session + return isOwner + + default: + return false + } + } +} +``` + +## Multi-Tenancy + +### Tenant Context + +```typescript +interface TenantContext { + userId: string + orgId: string + orgRole: OrgRole + workspaceId?: string + sessionId?: string + scopes: Scope[] + metadata: { + ip: string + userAgent: string + requestId: string + } +} + +// Middleware to inject tenant context +async function tenantContext(c: Context, next: Next) { + const jwt = c.get("jwt") as JWTPayload + + const ctx: TenantContext = { + userId: jwt.sub, + orgId: jwt.org_id, + orgRole: jwt.org_role, + scopes: jwt.scopes, + metadata: { + ip: c.req.header("x-forwarded-for") || c.req.ip, + userAgent: c.req.header("user-agent") || "", + requestId: c.get("requestId"), + }, + } + + c.set("tenant", ctx) + await next() +} +``` + +### Organization Isolation + +```typescript +// Database queries automatically scoped to organization +class SessionRepository { + constructor(private ctx: TenantContext) {} + + async findById(id: string): Promise<Session | null> { + return db.sessions.findFirst({ + where: { + id, + project: { + workspace: { + orgId: this.ctx.orgId, // Automatic org scoping + }, + }, + }, + }) + } + + async list(filter: SessionFilter): Promise<Session[]> { + // Owners and admins may read all sessions in the org (see permissions matrix) + const canReadAll = this.ctx.orgRole === "owner" || this.ctx.orgRole === "admin" + + return db.sessions.findMany({ + where: { + ...filter, + project: { + workspace: { + orgId: this.ctx.orgId, + }, + }, + // Everyone else only sees their own sessions + ...(!canReadAll && { + userId: this.ctx.userId, + }), + }, + }) + } +} +``` + +## LLM Provider Authentication + +### User BYOK (Bring Your Own Key) + +```typescript +interface UserProviderKey { + id: string + userId: string + providerId: string + encryptedKey: string // AES-256-GCM encrypted + keyId: string // KMS key ID used + createdAt: Date + lastUsedAt?: Date +} + +// Encrypt user's API key before storage +async function 
storeProviderKey( + userId: string, + providerId: string, + apiKey: string +): Promise<void> { + const { ciphertext, keyId } = await kms.encrypt(apiKey) + + await db.userProviderKeys.upsert({ + where: { userId, providerId }, + create: { + id: generateId(), + userId, + providerId, + encryptedKey: ciphertext, + keyId, + createdAt: new Date(), + }, + update: { + encryptedKey: ciphertext, + keyId, + }, + }) +} +``` + +### Organization Default Keys + +```typescript +interface OrgProviderConfig { + orgId: string + providerId: string + encryptedKey: string + keyId: string // KMS key ID used (needed for decryption below) + rateLimit?: RateLimit + allowUserOverride: boolean + usageTracking: boolean +} + +// Key resolution order +async function resolveProviderKey( + ctx: TenantContext, + providerId: string +): Promise<string> { + // 1. Check user BYOK + const userKey = await db.userProviderKeys.findFirst({ + where: { userId: ctx.userId, providerId }, + }) + if (userKey) { + return kms.decrypt(userKey.encryptedKey, userKey.keyId) + } + + // 2. Check org default + const orgConfig = await db.orgProviderConfigs.findFirst({ + where: { orgId: ctx.orgId, providerId }, + }) + if (orgConfig) { + return kms.decrypt(orgConfig.encryptedKey, orgConfig.keyId) + } + + throw new AuthError(`No API key for provider: ${providerId}`, "NO_PROVIDER_KEY") +} +``` + +## Session Management + +### Active Session Tracking + +```typescript +interface UserSession { + id: string + userId: string + tokenFamily: string + device: string + ip: string + location?: string + createdAt: Date + lastActiveAt: Date + expiresAt: Date +} + +// Track active sessions per user +async function createUserSession( + userId: string, + metadata: SessionMetadata +): Promise<UserSession> { + // Enforce max sessions per user + const activeSessions = await db.userSessions.count({ + where: { userId, expiresAt: { gt: new Date() } }, + }) + + if (activeSessions >= MAX_SESSIONS_PER_USER) { + // Revoke oldest active session (same filter as the count above) + const oldest = await db.userSessions.findFirst({ + where: { userId, expiresAt: { gt: new Date() } }, + orderBy: { lastActiveAt: "asc" }, 
+ }) + if (oldest) { + await revokeSession(oldest.id) + } + } + + return db.userSessions.create({ + data: { + id: generateId(), + userId, + tokenFamily: generateId(), + device: metadata.device, + ip: metadata.ip, + createdAt: new Date(), + lastActiveAt: new Date(), + expiresAt: new Date(Date.now() + SESSION_TTL), + }, + }) +} +``` + +### Session Revocation + +```typescript +// Revoke specific session +async function revokeSession(sessionId: string): Promise { + const session = await db.userSessions.findById(sessionId) + if (!session) return + + // Revoke all tokens in family + await db.refreshTokens.updateMany({ + where: { familyId: session.tokenFamily }, + data: { revoked: true }, + }) + + // Delete session + await db.userSessions.delete({ id: sessionId }) + + // Publish revocation event + await redis.publish(`user:${session.userId}:revoke`, { + type: "session_revoked", + sessionId, + }) +} + +// Revoke all sessions for user +async function revokeAllSessions(userId: string): Promise { + const sessions = await db.userSessions.findMany({ + where: { userId }, + }) + + for (const session of sessions) { + await revokeSession(session.id) + } +} +``` + +## Security Controls + +### Rate Limiting + +```typescript +interface RateLimitConfig { + // Per-user limits + user: { + requests: number + window: number + burst?: number + } + // Per-organization limits + org: { + requests: number + window: number + } + // Per-endpoint limits + endpoints: { + [path: string]: { + requests: number + window: number + } + } +} + +// Example config +const rateLimitConfig: RateLimitConfig = { + user: { + requests: 100, + window: 60, // 100 req/min per user + burst: 20, // Allow burst of 20 + }, + org: { + requests: 10000, + window: 3600, // 10k req/hour per org + }, + endpoints: { + "POST /sessions/:id/messages": { + requests: 10, + window: 60, // 10 messages/min + }, + "POST /auth/login": { + requests: 5, + window: 300, // 5 attempts/5min + }, + }, +} +``` + +### Audit Logging + 
+```typescript +interface AuditLog { + id: string + timestamp: Date + userId: string + orgId: string + action: string + resource: string + resourceId?: string + metadata: Record<string, unknown> + ip: string + userAgent: string + status: "success" | "failure" + errorCode?: string +} + +// Log security-sensitive actions (id and timestamp are filled in here) +async function auditLog(entry: Omit<AuditLog, "id" | "timestamp">): Promise<void> { + await db.auditLogs.create({ + data: { + id: generateId(), + timestamp: new Date(), + ...entry, + }, + }) +} + +// Usage +await auditLog({ + userId: ctx.userId, + orgId: ctx.orgId, + action: "session.delete", + resource: "session", + resourceId: sessionId, + metadata: { reason: "user_request" }, + ip: ctx.metadata.ip, + userAgent: ctx.metadata.userAgent, + status: "success", +}) +``` + +### Brute Force Protection + +```typescript +// Failed login tracking +interface FailedAttempt { + identifier: string // email or IP + attempts: number + lastAttempt: Date + lockedUntil?: Date +} + +async function checkBruteForce(identifier: string): Promise<void> { + const record = await redis.get(`failed:${identifier}`) + + // Dates read back from Redis may arrive as serialized strings, so re-parse before comparing + const lockedUntil = record?.lockedUntil ? new Date(record.lockedUntil) : undefined + + if (lockedUntil && lockedUntil > new Date()) { + const waitTime = Math.ceil((lockedUntil.getTime() - Date.now()) / 1000) + throw new AuthError( + `Too many attempts. 
Try again in ${waitTime}s`, + "RATE_LIMITED" + ) + } +} + +async function recordFailedAttempt(identifier: string): Promise { + const key = `failed:${identifier}` + const record = await redis.get(key) || { + identifier, + attempts: 0, + lastAttempt: new Date(), + } + + record.attempts++ + record.lastAttempt = new Date() + + // Progressive lockout + if (record.attempts >= 5) { + const lockoutMinutes = Math.min(Math.pow(2, record.attempts - 5), 60) + record.lockedUntil = new Date(Date.now() + lockoutMinutes * 60 * 1000) + } + + await redis.set(key, record, { ex: 3600 }) +} +``` + +## Implementation Checklist + +- [ ] OAuth 2.0 / OIDC integration +- [ ] API key generation and validation +- [ ] JWT issuance and validation +- [ ] Refresh token rotation +- [ ] Role-based access control +- [ ] Scope-based permissions +- [ ] Multi-tenant isolation +- [ ] Provider key management +- [ ] Session tracking +- [ ] Rate limiting +- [ ] Audit logging +- [ ] Brute force protection diff --git a/docs/design/server-side-deployment/scaling.md b/docs/design/server-side-deployment/scaling.md new file mode 100644 index 00000000000..de6b6162caa --- /dev/null +++ b/docs/design/server-side-deployment/scaling.md @@ -0,0 +1,866 @@ +# Scaling & Deployment + +## Overview + +This document covers horizontal scaling strategies, deployment patterns, and operational considerations for running OpenCode as a production web service. 
+ +## Scaling Architecture + +### Horizontal Scaling Model + +``` + ┌─────────────────┐ + │ Global LB │ + │ (Cloudflare) │ + └────────┬────────┘ + │ + ┌───────────────────┼───────────────────┐ + │ │ │ + ┌────────▼────────┐ ┌────────▼────────┐ ┌────────▼────────┐ + │ Region: US │ │ Region: EU │ │ Region: APAC │ + └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ + │ │ │ + ┌────────▼────────┐ ┌────────▼────────┐ ┌────────▼────────┐ + │ K8s Cluster │ │ K8s Cluster │ │ K8s Cluster │ + │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ + │ │ API Pods │ │ │ │ API Pods │ │ │ │ API Pods │ │ + │ │ (3-20) │ │ │ │ (3-20) │ │ │ │ (3-20) │ │ + │ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │ + │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ + │ │ Worker Pods │ │ │ │ Worker Pods │ │ │ │ Worker Pods │ │ + │ │ (2-10) │ │ │ │ (2-10) │ │ │ │ (2-10) │ │ + │ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │ + └─────────────────┘ └─────────────────┘ └─────────────────┘ +``` + +### Component Scaling Characteristics + +| Component | Scaling Type | Trigger | Min/Max | +|-----------|-------------|---------|---------| +| API Server | Horizontal | CPU/Memory | 3/50 | +| Tool Workers | Horizontal | Queue depth | 2/20 | +| WebSocket Handlers | Horizontal | Connection count | 2/20 | +| PostgreSQL | Vertical + Read Replicas | CPU/Connections | 1 primary | +| Redis | Cluster | Memory | 3 nodes | + +## Kubernetes Deployment + +### Namespace Structure + +```yaml +apiVersion: v1 +kind: Namespace +metadata: + name: opencode + labels: + istio-injection: enabled +--- +apiVersion: v1 +kind: Namespace +metadata: + name: opencode-workers + labels: + istio-injection: enabled +``` + +### API Server Deployment + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: opencode-api + namespace: opencode +spec: + replicas: 3 + strategy: + type: RollingUpdate + rollingUpdate: + maxSurge: 1 + maxUnavailable: 0 + selector: + matchLabels: + app: 
opencode-api + template: + metadata: + labels: + app: opencode-api + version: v1 + annotations: + prometheus.io/scrape: "true" + prometheus.io/port: "9090" + spec: + serviceAccountName: opencode-api + containers: + - name: api + image: ghcr.io/opencode/api:latest + imagePullPolicy: Always + ports: + - name: http + containerPort: 3000 + - name: metrics + containerPort: 9090 + env: + - name: NODE_ENV + value: "production" + - name: DATABASE_URL + valueFrom: + secretKeyRef: + name: opencode-secrets + key: database-url + - name: REDIS_URL + valueFrom: + secretKeyRef: + name: opencode-secrets + key: redis-url + - name: JWT_SECRET + valueFrom: + secretKeyRef: + name: opencode-secrets + key: jwt-secret + resources: + requests: + memory: "512Mi" + cpu: "500m" + limits: + memory: "2Gi" + cpu: "2000m" + livenessProbe: + httpGet: + path: /health/live + port: 3000 + initialDelaySeconds: 10 + periodSeconds: 10 + timeoutSeconds: 5 + failureThreshold: 3 + readinessProbe: + httpGet: + path: /health/ready + port: 3000 + initialDelaySeconds: 5 + periodSeconds: 5 + timeoutSeconds: 3 + failureThreshold: 3 + lifecycle: + preStop: + exec: + command: ["/bin/sh", "-c", "sleep 10"] + affinity: + podAntiAffinity: + preferredDuringSchedulingIgnoredDuringExecution: + - weight: 100 + podAffinityTerm: + labelSelector: + matchLabels: + app: opencode-api + topologyKey: kubernetes.io/hostname + topologySpreadConstraints: + - maxSkew: 1 + topologyKey: topology.kubernetes.io/zone + whenUnsatisfiable: ScheduleAnyway + labelSelector: + matchLabels: + app: opencode-api +``` + +### Horizontal Pod Autoscaler + +```yaml +apiVersion: autoscaling/v2 +kind: HorizontalPodAutoscaler +metadata: + name: opencode-api-hpa + namespace: opencode +spec: + scaleTargetRef: + apiVersion: apps/v1 + kind: Deployment + name: opencode-api + minReplicas: 3 + maxReplicas: 50 + metrics: + - type: Resource + resource: + name: cpu + target: + type: Utilization + averageUtilization: 70 + - type: Resource + resource: + name: 
memory + target: + type: Utilization + averageUtilization: 80 + - type: Pods + pods: + metric: + name: http_requests_per_second + target: + type: AverageValue + averageValue: "100" + behavior: + scaleDown: + stabilizationWindowSeconds: 300 + policies: + - type: Percent + value: 10 + periodSeconds: 60 + scaleUp: + stabilizationWindowSeconds: 60 + policies: + - type: Percent + value: 100 + periodSeconds: 15 + - type: Pods + value: 4 + periodSeconds: 15 + selectPolicy: Max +``` + +### Tool Worker Deployment + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: opencode-worker + namespace: opencode-workers +spec: + replicas: 2 + selector: + matchLabels: + app: opencode-worker + template: + metadata: + labels: + app: opencode-worker + spec: + serviceAccountName: opencode-worker + containers: + - name: worker + image: ghcr.io/opencode/worker:latest + env: + - name: WORKER_TYPE + value: "tool-execution" + - name: REDIS_URL + valueFrom: + secretKeyRef: + name: opencode-secrets + key: redis-url + resources: + requests: + memory: "1Gi" + cpu: "1000m" + limits: + memory: "4Gi" + cpu: "4000m" + securityContext: + privileged: false + runAsNonRoot: true + readOnlyRootFilesystem: true + volumeMounts: + - name: workspace + mountPath: /workspace + - name: tmp + mountPath: /tmp + volumes: + - name: workspace + emptyDir: + sizeLimit: 10Gi + - name: tmp + emptyDir: + sizeLimit: 1Gi +``` + +### Service & Ingress + +```yaml +apiVersion: v1 +kind: Service +metadata: + name: opencode-api + namespace: opencode +spec: + selector: + app: opencode-api + ports: + - name: http + port: 80 + targetPort: 3000 + type: ClusterIP +--- +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: opencode-api + namespace: opencode + annotations: + kubernetes.io/ingress.class: nginx + nginx.ingress.kubernetes.io/proxy-body-size: "100m" + nginx.ingress.kubernetes.io/proxy-read-timeout: "3600" + nginx.ingress.kubernetes.io/proxy-send-timeout: "3600" + cert-manager.io/cluster-issuer: 
letsencrypt-prod +spec: + tls: + - hosts: + - api.opencode.io + secretName: opencode-tls + rules: + - host: api.opencode.io + http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: opencode-api + port: + number: 80 +``` + +## Database Scaling + +### PostgreSQL High Availability + +```yaml +# Using CloudNativePG operator +apiVersion: postgresql.cnpg.io/v1 +kind: Cluster +metadata: + name: opencode-postgres + namespace: opencode +spec: + instances: 3 + primaryUpdateStrategy: unsupervised + + postgresql: + parameters: + max_connections: "200" + shared_buffers: "2GB" + effective_cache_size: "6GB" + maintenance_work_mem: "512MB" + checkpoint_completion_target: "0.9" + wal_buffers: "64MB" + default_statistics_target: "100" + random_page_cost: "1.1" + effective_io_concurrency: "200" + work_mem: "10MB" + min_wal_size: "1GB" + max_wal_size: "4GB" + + storage: + size: 100Gi + storageClass: fast-ssd + + backup: + barmanObjectStore: + destinationPath: s3://opencode-backups/postgres + s3Credentials: + accessKeyId: + name: aws-creds + key: ACCESS_KEY_ID + secretAccessKey: + name: aws-creds + key: SECRET_ACCESS_KEY + wal: + compression: gzip + data: + compression: gzip + retentionPolicy: "30d" + + monitoring: + enablePodMonitor: true +``` + +### Read Replica Configuration + +```typescript +// Database client with read replica routing +const db = createDatabase({ + primary: { + connectionString: process.env.DATABASE_URL, + poolSize: 10, + }, + replicas: [ + { + connectionString: process.env.DATABASE_REPLICA_1_URL, + poolSize: 20, + }, + { + connectionString: process.env.DATABASE_REPLICA_2_URL, + poolSize: 20, + }, + ], + // Route read queries to replicas + router: (query) => { + if (query.type === "SELECT" && !query.inTransaction) { + return "replica" + } + return "primary" + }, +}) +``` + +### Connection Pooling with PgBouncer + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: pgbouncer + namespace: opencode +spec: + replicas: 2 + selector: 
+ matchLabels: + app: pgbouncer + template: + metadata: + labels: + app: pgbouncer + spec: + containers: + - name: pgbouncer + image: pgbouncer/pgbouncer:latest + ports: + - containerPort: 5432 + env: + - name: PGBOUNCER_POOL_MODE + value: "transaction" + - name: PGBOUNCER_MAX_CLIENT_CONN + value: "1000" + - name: PGBOUNCER_DEFAULT_POOL_SIZE + value: "20" + - name: PGBOUNCER_MIN_POOL_SIZE + value: "5" +``` + +## Redis Scaling + +### Redis Cluster + +```yaml +apiVersion: redis.redis.opstreelabs.in/v1beta1 +kind: RedisCluster +metadata: + name: opencode-redis + namespace: opencode +spec: + clusterSize: 3 + clusterVersion: v7 + persistenceEnabled: true + kubernetesConfig: + image: redis:7-alpine + resources: + requests: + cpu: 500m + memory: 1Gi + limits: + cpu: 1000m + memory: 2Gi + storage: + volumeClaimTemplate: + spec: + accessModes: ["ReadWriteOnce"] + resources: + requests: + storage: 10Gi + redisExporter: + enabled: true + image: oliver006/redis_exporter:latest +``` + +## Load Balancing + +### Global Load Balancing (Cloudflare) + +```typescript +// Cloudflare Worker for intelligent routing +export default { + async fetch(request: Request): Promise<Response> { + const url = new URL(request.url) + + // Determine best region based on latency + const region = request.cf?.region || "us" + const backend = getBackendForRegion(region) + + // Add request tracing + const headers = new Headers(request.headers) + headers.set("x-request-id", crypto.randomUUID()) + headers.set("x-forwarded-region", region) + + return fetch(backend + url.pathname + url.search, { + method: request.method, + headers, + body: request.body, + }) + }, +} + +function getBackendForRegion(region: string): string { + const backends: Record<string, string> = { + us: "https://us.api.opencode.io", + eu: "https://eu.api.opencode.io", + apac: "https://apac.api.opencode.io", + } + return backends[region] || backends.us +} +``` + +### Internal Load Balancing + +```yaml +# Istio VirtualService for traffic management +apiVersion: networking.istio.io/v1beta1 +kind: VirtualService 
+metadata: + name: opencode-api + namespace: opencode +spec: + hosts: + - opencode-api + http: + - match: + - headers: + x-api-version: + exact: "v2" + route: + - destination: + host: opencode-api-v2 + port: + number: 80 + - route: + - destination: + host: opencode-api + port: + number: 80 + weight: 100 + retries: + attempts: 3 + perTryTimeout: 10s + retryOn: 5xx,reset,connect-failure + timeout: 300s +``` + +## SSE Connection Scaling + +### Sticky Sessions for SSE + +```yaml +# Nginx Ingress with sticky sessions +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: opencode-events + annotations: + nginx.ingress.kubernetes.io/affinity: "cookie" + nginx.ingress.kubernetes.io/session-cookie-name: "opencode-route" + nginx.ingress.kubernetes.io/session-cookie-expires: "172800" + nginx.ingress.kubernetes.io/session-cookie-max-age: "172800" + nginx.ingress.kubernetes.io/proxy-read-timeout: "3600" +spec: + rules: + - host: events.opencode.io + http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: opencode-api + port: + number: 80 +``` + +### Connection Draining + +```typescript +// Graceful shutdown with connection draining +const connections = new Set() + +async function gracefulShutdown(): Promise<void> { + // Stop accepting new connections + server.close() + + // Notify existing connections + for (const conn of connections) { + conn.send({ type: "server.shutdown", reconnectIn: 5000 }) + } + + // Wait for connections to drain (max 30s) + const deadline = Date.now() + 30000 + while (connections.size > 0 && Date.now() < deadline) { + await sleep(1000) + } + + // Force close remaining + for (const conn of connections) { + conn.close() + } + + process.exit(0) +} + +process.on("SIGTERM", gracefulShutdown) +``` + +## Monitoring & Observability + +### Prometheus Metrics + +```typescript +import { Registry, Counter, Histogram, Gauge } from "prom-client" + +const registry = new Registry() + +// Request metrics +const httpRequestsTotal = new Counter({ 
+ name: "http_requests_total", + help: "Total HTTP requests", + labelNames: ["method", "path", "status"], + registers: [registry], +}) + +const httpRequestDuration = new Histogram({ + name: "http_request_duration_seconds", + help: "HTTP request duration", + labelNames: ["method", "path"], + buckets: [0.01, 0.05, 0.1, 0.5, 1, 5, 10], + registers: [registry], +}) + +// Business metrics +const activeSessions = new Gauge({ + name: "opencode_active_sessions", + help: "Number of active sessions", + registers: [registry], +}) + +const llmTokensTotal = new Counter({ + name: "opencode_llm_tokens_total", + help: "Total LLM tokens consumed", + labelNames: ["provider", "model", "type"], + registers: [registry], +}) + +const toolExecutionDuration = new Histogram({ + name: "opencode_tool_execution_seconds", + help: "Tool execution duration", + labelNames: ["tool"], + buckets: [0.1, 0.5, 1, 5, 10, 30, 60], + registers: [registry], +}) +``` + +### Grafana Dashboards + +```json +{ + "title": "OpenCode Overview", + "panels": [ + { + "title": "Request Rate", + "type": "graph", + "targets": [ + { + "expr": "sum(rate(http_requests_total[5m])) by (status)", + "legendFormat": "{{status}}" + } + ] + }, + { + "title": "P99 Latency", + "type": "graph", + "targets": [ + { + "expr": "histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))" + } + ] + }, + { + "title": "Active Sessions", + "type": "stat", + "targets": [ + { + "expr": "sum(opencode_active_sessions)" + } + ] + }, + { + "title": "Token Usage", + "type": "graph", + "targets": [ + { + "expr": "sum(rate(opencode_llm_tokens_total[1h])) by (provider)", + "legendFormat": "{{provider}}" + } + ] + } + ] +} +``` + +### Distributed Tracing + +```typescript +import { trace, SpanKind, SpanStatusCode } from "@opentelemetry/api" + +const tracer = trace.getTracer("opencode-api") + +async function handleChatRequest(ctx: Context): Promise<Response> { + return tracer.startActiveSpan( + "chat.request", + { kind: SpanKind.SERVER }, + async (span) 
=> { + try { + span.setAttributes({ + "session.id": ctx.params.id, + "user.id": ctx.get("tenant").userId, + }) + + // Process request with child spans + const messages = await tracer.startActiveSpan( + "load.messages", + async (loadSpan) => { + const result = await loadMessages(ctx.params.id) + loadSpan.end() + return result + } + ) + + const response = await tracer.startActiveSpan( + "llm.request", + { kind: SpanKind.CLIENT }, + async (llmSpan) => { + llmSpan.setAttributes({ + "llm.provider": "anthropic", + "llm.model": "claude-3-sonnet", + }) + const result = await callLLM(messages) + llmSpan.setAttributes({ + "llm.tokens.input": result.tokens.input, + "llm.tokens.output": result.tokens.output, + }) + llmSpan.end() + return result + } + ) + + span.setStatus({ code: SpanStatusCode.OK }) + return ctx.json(response) + } catch (error) { + span.setStatus({ code: SpanStatusCode.ERROR, message: error.message }) + throw error + } finally { + span.end() + } + } + ) +} +``` + +## Deployment Strategies + +### Blue-Green Deployment + +```yaml +# Argo Rollouts for blue-green deployment +apiVersion: argoproj.io/v1alpha1 +kind: Rollout +metadata: + name: opencode-api +spec: + replicas: 5 + strategy: + blueGreen: + activeService: opencode-api + previewService: opencode-api-preview + autoPromotionEnabled: false + scaleDownDelaySeconds: 30 + previewReplicaCount: 2 + prePromotionAnalysis: + templates: + - templateName: success-rate + args: + - name: service-name + value: opencode-api-preview +``` + +### Canary Deployment + +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: Rollout +metadata: + name: opencode-api +spec: + strategy: + canary: + steps: + - setWeight: 5 + - pause: { duration: 5m } + - setWeight: 20 + - pause: { duration: 10m } + - setWeight: 50 + - pause: { duration: 10m } + - setWeight: 100 + analysis: + templates: + - templateName: success-rate + startingStep: 1 + canaryService: opencode-api-canary + stableService: opencode-api +``` + +## Disaster Recovery + +### 
Multi-Region Failover + +```typescript +// Health check and failover logic +interface RegionHealth { + region: string + healthy: boolean + latency: number + lastCheck: Date +} + +class RegionManager { + private regions: Map<string, RegionHealth> = new Map() + + async checkHealth(region: string): Promise<RegionHealth> { + const start = Date.now() + try { + const response = await fetch(`https://${region}.api.opencode.io/health`) + return { + region, + healthy: response.ok, + latency: Date.now() - start, + lastCheck: new Date(), + } + } catch { + return { + region, + healthy: false, + latency: -1, + lastCheck: new Date(), + } + } + } + + getBestRegion(): string { + const healthy = Array.from(this.regions.values()) + .filter((r) => r.healthy) + .sort((a, b) => a.latency - b.latency) + + return healthy[0]?.region || "us" // fallback + } +} +``` + +### RTO/RPO Targets + +| Scenario | RTO | RPO | +|----------|-----|-----| +| Single pod failure | 0 (auto-recovery) | 0 | +| Node failure | 2 minutes | 0 | +| AZ failure | 5 minutes | 0 | +| Region failure | 15 minutes | 1 minute | +| Complete outage | 1 hour | 5 minutes | diff --git a/docs/design/server-side-deployment/security.md b/docs/design/server-side-deployment/security.md new file mode 100644 index 00000000000..5583507433a --- /dev/null +++ b/docs/design/server-side-deployment/security.md @@ -0,0 +1,751 @@ +# Security + +## Overview + +This document outlines security controls, threat mitigations, and compliance requirements for the OpenCode server-side deployment. + +## Threat Model + +### Assets to Protect + +1. **User Data**: Sessions, messages, code, credentials +2. **Provider Keys**: API keys for LLM providers +3. **Infrastructure**: Servers, databases, networks +4. **Service Availability**: Protection against DoS + +### Threat Actors + +1. **External Attackers**: Unauthorized access attempts +2. **Malicious Users**: Abuse of legitimate access +3. **Compromised Accounts**: Stolen credentials +4. 
**Insider Threats**: Rogue employees/contractors + +### Attack Vectors + +| Vector | Risk | Mitigation | +|--------|------|------------| +| SQL Injection | High | Parameterized queries, ORM | +| XSS | Medium | Content Security Policy, sanitization | +| CSRF | Medium | SameSite cookies, CSRF tokens | +| Command Injection | Critical | Sandboxed execution | +| API Key Theft | High | Encryption at rest, KMS | +| Session Hijacking | High | Secure cookies, token rotation | +| DoS/DDoS | High | Rate limiting, CDN protection | + +## Network Security + +### Architecture + +``` +Internet + │ + ▼ +┌─────────────┐ +│ WAF/CDN │ ← DDoS protection, bot filtering +│ (Cloudflare)│ +└──────┬──────┘ + │ + ┌──▼──┐ + │ VPC │ + │ │ + │ ┌──┴──────────────────────┐ + │ │ Public Subnet │ + │ │ ┌─────────────────┐ │ + │ │ │ Load Balancer │ │ + │ │ └────────┬────────┘ │ + │ └───────────┼─────────────┘ + │ │ + │ ┌───────────▼─────────────┐ + │ │ Private Subnet │ + │ │ ┌─────────────────┐ │ + │ │ │ API Servers │ │ + │ │ └────────┬────────┘ │ + │ │ │ │ + │ │ ┌────────▼────────┐ │ + │ │ │ Database │ │ + │ │ └─────────────────┘ │ + │ └─────────────────────────┘ + └─────────────────────────────┘ +``` + +### Firewall Rules + +```yaml +# Network policies for Kubernetes +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: api-server-policy + namespace: opencode +spec: + podSelector: + matchLabels: + app: opencode-api + policyTypes: + - Ingress + - Egress + ingress: + - from: + - namespaceSelector: + matchLabels: + name: ingress-nginx + ports: + - port: 3000 + egress: + # Database + - to: + - podSelector: + matchLabels: + app: postgres + ports: + - port: 5432 + # Redis + - to: + - podSelector: + matchLabels: + app: redis + ports: + - port: 6379 + # External LLM APIs + - to: + - ipBlock: + cidr: 0.0.0.0/0 + ports: + - port: 443 +``` + +### TLS Configuration + +```typescript +// Minimum TLS 1.2, prefer 1.3 +const tlsConfig = { + minVersion: "TLSv1.2", + ciphers: [ + 
"TLS_AES_256_GCM_SHA384", + "TLS_CHACHA20_POLY1305_SHA256", + "TLS_AES_128_GCM_SHA256", + "ECDHE-RSA-AES256-GCM-SHA384", + "ECDHE-RSA-AES128-GCM-SHA256", + ].join(":"), + honorCipherOrder: true, +} +``` + +### mTLS for Internal Services + +```yaml +# Istio PeerAuthentication for mTLS +apiVersion: security.istio.io/v1beta1 +kind: PeerAuthentication +metadata: + name: default + namespace: opencode +spec: + mtls: + mode: STRICT +``` + +## Application Security + +### Input Validation + +```typescript +import { z } from "zod" + +// Strict input validation schemas +const CreateSessionSchema = z.object({ + title: z.string() + .min(1) + .max(500) + .regex(/^[\w\s\-.,!?]+$/), + projectId: z.string().uuid(), + model: z.object({ + providerId: z.enum(["anthropic", "openai", "google"]), + modelId: z.string().max(100), + }), +}) + +const MessageSchema = z.object({ + content: z.string().max(100000), // 100KB limit + files: z.array(z.object({ + name: z.string().max(255), + size: z.number().max(10 * 1024 * 1024), // 10MB + mimeType: z.string().regex(/^[\w\-]+\/[\w\-+.]+$/), + })).max(10).optional(), +}) + +// Middleware for validation +function validate(schema: z.ZodSchema) { + return async (c: Context, next: Next) => { + const result = schema.safeParse(await c.req.json()) + if (!result.success) { + throw new ValidationError(result.error) + } + c.set("body", result.data) + await next() + } +} +``` + +### Output Encoding + +```typescript +// Sanitize output for different contexts +import DOMPurify from "isomorphic-dompurify" + +function sanitizeForHtml(input: string): string { + return DOMPurify.sanitize(input, { + ALLOWED_TAGS: ["b", "i", "em", "strong", "code", "pre", "a"], + ALLOWED_ATTR: ["href"], + }) +} + +function sanitizeForJson(input: unknown): unknown { + // Remove any prototype pollution attempts + return JSON.parse(JSON.stringify(input, (key, value) => { + if (key === "__proto__" || key === "constructor" || key === "prototype") { + return undefined + } + return value + 
})) +} +``` + +### Content Security Policy + +```typescript +// CSP headers for web UI +const cspPolicy = { + "default-src": ["'self'"], + "script-src": ["'self'", "'wasm-unsafe-eval'"], + "style-src": ["'self'", "'unsafe-inline'"], + "img-src": ["'self'", "data:", "https:"], + "connect-src": [ + "'self'", + "https://api.anthropic.com", + "https://api.openai.com", + ], + "frame-ancestors": ["'none'"], + "form-action": ["'self'"], + "base-uri": ["'self'"], + "object-src": ["'none'"], +} + +app.use((c, next) => { + const csp = Object.entries(cspPolicy) + .map(([key, values]) => `${key} ${values.join(" ")}`) + .join("; ") + c.header("Content-Security-Policy", csp) + return next() +}) +``` + +### Security Headers + +```typescript +// Security headers middleware +app.use((c, next) => { + // Prevent MIME sniffing + c.header("X-Content-Type-Options", "nosniff") + + // Clickjacking protection + c.header("X-Frame-Options", "DENY") + + // XSS protection (legacy browsers) + c.header("X-XSS-Protection", "1; mode=block") + + // Referrer policy + c.header("Referrer-Policy", "strict-origin-when-cross-origin") + + // Permissions policy + c.header("Permissions-Policy", "camera=(), microphone=(), geolocation=()") + + // HSTS (1 year) + c.header( + "Strict-Transport-Security", + "max-age=31536000; includeSubDomains; preload" + ) + + return next() +}) +``` + +## Sandboxed Code Execution + +### Isolation Strategy + +Tool execution (Bash, file operations) runs in isolated containers to prevent: +- Filesystem escape +- Network access to internal services +- Resource exhaustion +- Privilege escalation + +### Container Security + +```yaml +# Security context for worker pods +apiVersion: v1 +kind: Pod +spec: + securityContext: + runAsNonRoot: true + runAsUser: 1000 + runAsGroup: 1000 + fsGroup: 1000 + seccompProfile: + type: RuntimeDefault + containers: + - name: sandbox + securityContext: + allowPrivilegeEscalation: false + readOnlyRootFilesystem: true + capabilities: + drop: + - ALL + 
resources: + limits: + cpu: "1" + memory: "512Mi" + ephemeral-storage: "1Gi" +``` + +### Firecracker/gVisor Integration + +```typescript +interface SandboxConfig { + // Firecracker microVM settings + firecracker: { + kernelPath: string + rootfsPath: string + vcpuCount: number + memSizeMib: number + networkInterface?: { + hostDevName: string + guestMac: string + } + } + // Or gVisor runtime + gvisor: { + platform: "ptrace" | "kvm" + network: "none" | "host" + } +} +``` + +### Command Filtering + +```typescript +// Block dangerous commands +const blockedCommands = [ + /\brm\s+-rf\s+\//, // rm -rf / + /\bmkfs\b/, + /\bdd\b.*of=\/dev/, + /\b(sudo|su)\b/, + /\bchmod\s+777/, + /\bcurl\b.*\|\s*(bash|sh)/, + /\bwget\b.*\|\s*(bash|sh)/, +] + +function validateCommand(cmd: string): boolean { + for (const pattern of blockedCommands) { + if (pattern.test(cmd)) { + return false + } + } + return true +} +``` + +## Data Protection + +### Encryption at Rest + +```typescript +// All sensitive data encrypted with AES-256-GCM +interface EncryptionConfig { + algorithm: "aes-256-gcm" + keyManagement: "aws-kms" | "hashicorp-vault" | "gcp-kms" + keyRotationDays: 90 +} + +// Encrypt provider API keys +async function encryptApiKey(key: string): Promise { + const kmsKeyId = process.env.KMS_KEY_ID + const { CiphertextBlob, KeyId } = await kms.encrypt({ + KeyId: kmsKeyId, + Plaintext: Buffer.from(key), + EncryptionContext: { + purpose: "provider-api-key", + }, + }) + + return { + ciphertext: CiphertextBlob.toString("base64"), + keyId: KeyId, + } +} +``` + +### Encryption in Transit + +- TLS 1.2+ for all external connections +- mTLS for internal service communication +- Certificate pinning for LLM provider connections + +### Data Classification + +| Classification | Examples | Controls | +|---------------|----------|----------| +| Public | Marketing content | None | +| Internal | Usage metrics | Access control | +| Confidential | User sessions | Encryption, audit logs | +| Restricted | API 
keys, PII | Encryption, KMS, strict access | + +### Key Management + +```typescript +// HashiCorp Vault integration +interface VaultConfig { + address: string + authMethod: "kubernetes" | "token" | "aws-iam" + secretEngine: "kv-v2" + transitEngine: "transit" +} + +class VaultClient { + // Get encryption key for data + async getDataKey(purpose: string): Promise<Buffer> { + const response = await this.client.write( + `transit/datakey/plaintext/${purpose}`, + { context: Buffer.from(purpose).toString("base64") } + ) + return Buffer.from(response.plaintext, "base64") + } + + // Encrypt with transit engine + async encrypt(plaintext: string, keyName: string): Promise<string> { + const response = await this.client.write( + `transit/encrypt/${keyName}`, + { plaintext: Buffer.from(plaintext).toString("base64") } + ) + return response.ciphertext + } +} +``` + +## Secret Management + +### Secret Storage + +```yaml +# External Secrets Operator +apiVersion: external-secrets.io/v1beta1 +kind: ExternalSecret +metadata: + name: opencode-secrets + namespace: opencode +spec: + refreshInterval: 1h + secretStoreRef: + kind: ClusterSecretStore + name: vault-backend + target: + name: opencode-secrets + data: + - secretKey: database-url + remoteRef: + key: opencode/database + property: url + - secretKey: jwt-secret + remoteRef: + key: opencode/auth + property: jwt-secret + - secretKey: anthropic-api-key + remoteRef: + key: opencode/providers + property: anthropic-key +``` + +### Secret Rotation + +```typescript +// Automatic secret rotation +interface RotationConfig { + // Database credentials + database: { + rotationSchedule: "0 0 * * 0", // Weekly + maxAge: 90, // Days + }, + // API keys + apiKeys: { + rotationSchedule: "0 0 1 * *", // Monthly + maxAge: 365, + }, + // JWT signing keys + jwtKeys: { + rotationSchedule: "0 0 1 */3 *", // Quarterly + gracePeriod: 7, // Days to accept old key + }, +} +``` + +## Audit & Compliance + +### Audit Logging + +```typescript +// Comprehensive audit logging 
+interface AuditEvent { + id: string + timestamp: Date + actor: { + userId: string + orgId: string + ip: string + userAgent: string + } + action: string + resource: { + type: string + id: string + } + outcome: "success" | "failure" + metadata: Record<string, unknown> +} + +// Log security-sensitive actions +const auditableActions = [ + "user.login", + "user.logout", + "user.mfa_enabled", + "user.mfa_disabled", + "user.password_changed", + "apikey.created", + "apikey.deleted", + "session.created", + "session.deleted", + "session.shared", + "provider.key_added", + "provider.key_removed", + "org.member_added", + "org.member_removed", + "org.settings_changed", +] +``` + +### Log Aggregation + +```yaml +# Fluent Bit for log collection +apiVersion: v1 +kind: ConfigMap +metadata: + name: fluent-bit-config +data: + fluent-bit.conf: | + [SERVICE] + Flush 5 + Log_Level info + Parsers_File parsers.conf + + [INPUT] + Name tail + Path /var/log/containers/opencode-*.log + Parser docker + Tag opencode.* + Mem_Buf_Limit 5MB + + [OUTPUT] + Name es + Match opencode.* + Host elasticsearch + Port 9200 + Index opencode-logs + Type _doc +``` + +### Compliance Controls + +#### SOC 2 Type II + +- [ ] Access control policies +- [ ] Encryption at rest and in transit +- [ ] Audit logging +- [ ] Incident response plan +- [ ] Vulnerability management +- [ ] Change management + +#### GDPR + +- [ ] Data processing agreements +- [ ] Right to erasure (data deletion) +- [ ] Data portability (export) +- [ ] Consent management +- [ ] Privacy policy +- [ ] DPO appointment + +#### HIPAA (if applicable) + +- [ ] BAA with customers +- [ ] PHI encryption +- [ ] Access controls +- [ ] Audit trails +- [ ] Breach notification + +## Vulnerability Management + +### Dependency Scanning + +```yaml +# GitHub Actions for dependency scanning +name: Security Scan +on: + push: + branches: [main] + schedule: + - cron: "0 0 * * *" + +jobs: + scan: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + + - name: Run Trivy 
vulnerability scanner + uses: aquasecurity/trivy-action@master + with: + scan-type: 'fs' + severity: 'CRITICAL,HIGH' + exit-code: '1' + + - name: Run Snyk + uses: snyk/actions/node@master + env: + SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }} +``` + +### Container Image Scanning + +```yaml +# Scan images before deployment +- name: Scan container image + uses: aquasecurity/trivy-action@master + with: + image-ref: 'ghcr.io/opencode/api:${{ github.sha }}' + format: 'sarif' + output: 'trivy-results.sarif' + +- name: Upload scan results + uses: github/codeql-action/upload-sarif@v2 + with: + sarif_file: 'trivy-results.sarif' +``` + +### Penetration Testing + +- Annual third-party penetration tests +- Quarterly internal security assessments +- Bug bounty program for external researchers + +## Incident Response + +### Incident Classification + +| Severity | Description | Response Time | +|----------|-------------|--------------| +| P1 - Critical | Data breach, complete outage | 15 minutes | +| P2 - High | Partial outage, security vulnerability | 1 hour | +| P3 - Medium | Degraded service, minor vulnerability | 4 hours | +| P4 - Low | Cosmetic issues, minor bugs | 24 hours | + +### Response Procedures + +```typescript +interface IncidentResponse { + // 1. Detection & Alerting + detection: { + source: "monitoring" | "user_report" | "automated_scan" + alertChannels: ["pagerduty", "slack", "email"] + } + + // 2. Triage & Classification + triage: { + severity: "P1" | "P2" | "P3" | "P4" + impactAssessment: string + affectedSystems: string[] + } + + // 3. Containment + containment: { + isolateAffectedSystems: boolean + preserveEvidence: boolean + communicateToStakeholders: boolean + } + + // 4. Eradication + eradication: { + rootCauseAnalysis: string + remediationSteps: string[] + } + + // 5. Recovery + recovery: { + restoreServices: boolean + verifyIntegrity: boolean + monitorForRecurrence: boolean + } + + // 6. 
Post-Incident + postIncident: { + incidentReport: string + lessonsLearned: string[] + preventiveMeasures: string[] + } +} +``` + +### Security Contacts + +```yaml +# PagerDuty escalation policy +escalation_policy: + name: "Security Incidents" + escalation_rules: + - escalation_delay_in_minutes: 5 + targets: + - type: "user_reference" + id: "security-oncall" + - escalation_delay_in_minutes: 15 + targets: + - type: "user_reference" + id: "security-lead" + - escalation_delay_in_minutes: 30 + targets: + - type: "user_reference" + id: "cto" +``` + +## Security Checklist + +### Pre-Deployment + +- [ ] Security review of architecture +- [ ] Threat modeling complete +- [ ] Penetration test passed +- [ ] Dependency vulnerabilities addressed +- [ ] Secrets rotated and secured +- [ ] Network policies configured +- [ ] TLS certificates valid +- [ ] Audit logging enabled +- [ ] Monitoring alerts configured +- [ ] Incident response plan tested + +### Ongoing + +- [ ] Weekly dependency updates +- [ ] Monthly security patches +- [ ] Quarterly access reviews +- [ ] Annual penetration tests +- [ ] Continuous vulnerability scanning +- [ ] Regular backup verification +- [ ] Incident response drills diff --git a/docs/design/server-side-deployment/storage.md b/docs/design/server-side-deployment/storage.md new file mode 100644 index 00000000000..e1631b18d4f --- /dev/null +++ b/docs/design/server-side-deployment/storage.md @@ -0,0 +1,740 @@ +# Storage & Data Persistence + +## Overview + +The server-side deployment replaces the file-based storage system with a distributed storage architecture optimized for multi-tenancy, scalability, and reliability. 
+ +## Storage Architecture + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Application Layer │ +└─────────────────────────────────────────────────────────────┘ + │ + ┌───────────────┼───────────────┐ + │ │ │ + ┌────────▼────────┐ ┌────▼────┐ ┌────────▼────────┐ + │ PostgreSQL │ │ Redis │ │ Object Store │ + │ (Primary DB) │ │ (Cache) │ │ (Files/Blobs) │ + └─────────────────┘ └─────────┘ └─────────────────┘ + │ + ▼ + ┌─────────────────┐ + │ Replicas │ + │ (Read scaling) │ + └─────────────────┘ +``` + +## PostgreSQL Schema + +### Core Tables + +```sql +-- Organizations (tenants) +CREATE TABLE organizations ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + name VARCHAR(255) NOT NULL, + slug VARCHAR(100) UNIQUE NOT NULL, + plan VARCHAR(50) NOT NULL DEFAULT 'free', + settings JSONB NOT NULL DEFAULT '{}', + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +-- Users +CREATE TABLE users ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + org_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE, + email VARCHAR(255) NOT NULL, + name VARCHAR(255), + avatar_url VARCHAR(500), + role VARCHAR(50) NOT NULL DEFAULT 'member', + preferences JSONB NOT NULL DEFAULT '{}', + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + last_active_at TIMESTAMPTZ, + UNIQUE(org_id, email) +); + +-- Workspaces +CREATE TABLE workspaces ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + org_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE, + name VARCHAR(255) NOT NULL, + description TEXT, + git_config JSONB, + settings JSONB NOT NULL DEFAULT '{}', + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +-- Projects +CREATE TABLE projects ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + workspace_id UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE, + name VARCHAR(255) NOT NULL, + path 
VARCHAR(1000) NOT NULL, + git_commit VARCHAR(40), + settings JSONB NOT NULL DEFAULT '{}', + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +-- Sessions +CREATE TABLE sessions ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, + user_id UUID NOT NULL REFERENCES users(id), + parent_id UUID REFERENCES sessions(id), + title VARCHAR(500) NOT NULL, + status VARCHAR(50) NOT NULL DEFAULT 'active', + model_provider VARCHAR(100) NOT NULL, + model_id VARCHAR(100) NOT NULL, + summary JSONB, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + expires_at TIMESTAMPTZ +); + +-- Messages +CREATE TABLE messages ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + session_id UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE, + role VARCHAR(50) NOT NULL, + metadata JSONB NOT NULL DEFAULT '{}', + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + completed_at TIMESTAMPTZ +); + +-- Message Parts (text, tools, files, etc.) 
+CREATE TABLE message_parts ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + message_id UUID NOT NULL REFERENCES messages(id) ON DELETE CASCADE, + type VARCHAR(50) NOT NULL, + content JSONB NOT NULL, + sort_order INTEGER NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +-- Session Diffs (code changes) +CREATE TABLE session_diffs ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + session_id UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE, + message_id UUID NOT NULL REFERENCES messages(id), + file_path VARCHAR(1000) NOT NULL, + diff_content TEXT NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); +``` + +### Authentication Tables + +```sql +-- API Keys +CREATE TABLE api_keys ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + org_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE, + user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE, + name VARCHAR(255) NOT NULL, + prefix VARCHAR(20) NOT NULL, + hash VARCHAR(255) NOT NULL, + scopes VARCHAR(50)[] NOT NULL DEFAULT '{}', + rate_limit JSONB, + expires_at TIMESTAMPTZ, + last_used_at TIMESTAMPTZ, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +-- Refresh Tokens +CREATE TABLE refresh_tokens ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE, + family_id UUID NOT NULL, + hash VARCHAR(255) NOT NULL, + used BOOLEAN NOT NULL DEFAULT FALSE, + revoked BOOLEAN NOT NULL DEFAULT FALSE, + expires_at TIMESTAMPTZ NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +-- User Sessions (login sessions) +CREATE TABLE user_sessions ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE, + token_family UUID NOT NULL, + device VARCHAR(255), + ip INET, + location VARCHAR(255), + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + last_active_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + expires_at TIMESTAMPTZ NOT NULL +); + +-- OAuth Connections 
+CREATE TABLE oauth_connections ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE, + provider VARCHAR(50) NOT NULL, + provider_user_id VARCHAR(255) NOT NULL, + access_token_encrypted TEXT NOT NULL, + refresh_token_encrypted TEXT, + expires_at TIMESTAMPTZ, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + UNIQUE(provider, provider_user_id) +); +``` + +### Provider & Usage Tables + +```sql +-- User Provider Keys (BYOK) +CREATE TABLE user_provider_keys ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE, + provider_id VARCHAR(100) NOT NULL, + encrypted_key TEXT NOT NULL, + key_id VARCHAR(255) NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + last_used_at TIMESTAMPTZ, + UNIQUE(user_id, provider_id) +); + +-- Organization Provider Config +CREATE TABLE org_provider_configs ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + org_id UUID NOT NULL REFERENCES organizations(id) ON DELETE CASCADE, + provider_id VARCHAR(100) NOT NULL, + encrypted_key TEXT NOT NULL, + key_id VARCHAR(255) NOT NULL, + rate_limit JSONB, + allow_user_override BOOLEAN NOT NULL DEFAULT TRUE, + usage_tracking BOOLEAN NOT NULL DEFAULT TRUE, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + UNIQUE(org_id, provider_id) +); + +-- Usage Tracking +CREATE TABLE usage_records ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + org_id UUID NOT NULL REFERENCES organizations(id), + user_id UUID NOT NULL REFERENCES users(id), + session_id UUID REFERENCES sessions(id), + provider_id VARCHAR(100) NOT NULL, + model_id VARCHAR(100) NOT NULL, + tokens_input INTEGER NOT NULL, + tokens_output INTEGER NOT NULL, + tokens_cache_read INTEGER DEFAULT 0, + tokens_cache_write INTEGER DEFAULT 0, + cost_cents INTEGER NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +-- Audit Logs 
+CREATE TABLE audit_logs ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + org_id UUID NOT NULL, + user_id UUID NOT NULL, + action VARCHAR(100) NOT NULL, + resource VARCHAR(100) NOT NULL, + resource_id UUID, + metadata JSONB NOT NULL DEFAULT '{}', + ip INET, + user_agent TEXT, + status VARCHAR(50) NOT NULL, + error_code VARCHAR(100), + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); +``` + +### Indexes + +```sql +-- Performance indexes +CREATE INDEX idx_sessions_project_id ON sessions(project_id); +CREATE INDEX idx_sessions_user_id ON sessions(user_id); +CREATE INDEX idx_sessions_created_at ON sessions(created_at DESC); +CREATE INDEX idx_messages_session_id ON messages(session_id); +CREATE INDEX idx_message_parts_message_id ON message_parts(message_id); +CREATE INDEX idx_session_diffs_session_id ON session_diffs(session_id); + +-- Multi-tenant indexes +CREATE INDEX idx_users_org_id ON users(org_id); +CREATE INDEX idx_workspaces_org_id ON workspaces(org_id); +CREATE INDEX idx_api_keys_prefix ON api_keys(prefix); + +-- Usage and audit indexes +CREATE INDEX idx_usage_records_org_id_created ON usage_records(org_id, created_at DESC); +CREATE INDEX idx_usage_records_user_id_created ON usage_records(user_id, created_at DESC); +CREATE INDEX idx_audit_logs_org_id_created ON audit_logs(org_id, created_at DESC); +CREATE INDEX idx_audit_logs_user_id_created ON audit_logs(user_id, created_at DESC); + +-- Full-text search +CREATE INDEX idx_sessions_title_fts ON sessions USING gin(to_tsvector('english', title)); +``` + +## Redis Data Structures + +### Caching Strategy + +```typescript +interface CacheConfig { + // Session metadata cache + session: { + key: (id: string) => `session:${id}`, + ttl: 3600, // 1 hour + }, + // User preferences cache + user: { + key: (id: string) => `user:${id}`, + ttl: 1800, // 30 minutes + }, + // Provider config cache + provider: { + key: (orgId: string, providerId: string) => `provider:${orgId}:${providerId}`, + ttl: 300, // 5 minutes + }, + // 
Rate limit counters + rateLimit: { + key: (id: string, window: string) => `rl:${id}:${window}`, + ttl: 60, // 1 minute + }, +} +``` + +### Real-time Data + +```typescript +// Active session tracking +interface ActiveSession { + key: `active:session:${sessionId}`, + value: { + userId: string + status: "idle" | "processing" | "streaming" + lastActivity: number + currentMessageId?: string + }, + ttl: 3600 +} + +// SSE connection tracking +interface SSEConnection { + key: `sse:user:${userId}`, + value: Set, + ttl: 86400 +} + +// Pub/Sub channels +const channels = { + session: (id: string) => `events:session:${id}`, + user: (id: string) => `events:user:${id}`, + workspace: (id: string) => `events:workspace:${id}`, +} +``` + +### Job Queue + +```typescript +// Background job queues using Redis Streams +interface JobQueue { + // Session compaction jobs + compaction: { + stream: "jobs:compaction", + group: "compaction-workers", + }, + // Usage aggregation + usage: { + stream: "jobs:usage", + group: "usage-workers", + }, + // Cleanup expired sessions + cleanup: { + stream: "jobs:cleanup", + group: "cleanup-workers", + }, +} +``` + +## Object Storage + +### File Organization + +``` +bucket/ +├── workspaces/ +│ └── {workspaceId}/ +│ └── {projectId}/ +│ ├── files/ # Project files +│ │ └── {hash} +│ └── snapshots/ # Git snapshots +│ └── {snapshotId} +├── sessions/ +│ └── {sessionId}/ +│ ├── attachments/ # User uploads +│ │ └── {attachmentId} +│ └── artifacts/ # Generated files +│ └── {artifactId} +├── exports/ +│ └── {exportId}/ # Session exports +│ └── export.zip +└── avatars/ + └── {userId}/ + └── avatar.{ext} +``` + +### Storage Operations + +```typescript +interface ObjectStorage { + // Upload file + upload(key: string, content: Buffer | Stream, options?: UploadOptions): Promise + + // Download file + download(key: string): Promise + + // Get signed URL for client-side download + getSignedUrl(key: string, expiresIn: number): Promise + + // Delete file + delete(key: string): 
Promise + + // List files by prefix + list(prefix: string): Promise +} + +interface UploadOptions { + contentType?: string + metadata?: Record + acl?: "private" | "public-read" +} +``` + +### Content-Addressable Storage + +```typescript +// Store files by content hash for deduplication +async function storeFile( + workspaceId: string, + projectId: string, + content: Buffer +): Promise { + const hash = crypto.createHash("sha256").update(content).digest("hex") + const key = `workspaces/${workspaceId}/${projectId}/files/${hash}` + + // Check if already exists + const exists = await storage.exists(key) + if (!exists) { + await storage.upload(key, content) + } + + return hash +} +``` + +## Data Access Layer + +### Repository Pattern + +```typescript +// Base repository with tenant scoping +abstract class BaseRepository { + constructor( + protected db: Database, + protected ctx: TenantContext + ) {} + + protected get orgId() { + return this.ctx.orgId + } + + protected get userId() { + return this.ctx.userId + } +} + +// Session repository +class SessionRepository extends BaseRepository { + async findById(id: string): Promise { + return this.db.query` + SELECT s.* + FROM sessions s + JOIN projects p ON s.project_id = p.id + JOIN workspaces w ON p.workspace_id = w.id + WHERE s.id = ${id} + AND w.org_id = ${this.orgId} + `.first() + } + + async create(input: CreateSessionInput): Promise { + return this.db.query` + INSERT INTO sessions ( + project_id, user_id, title, model_provider, model_id + ) VALUES ( + ${input.projectId}, + ${this.userId}, + ${input.title}, + ${input.modelProvider}, + ${input.modelId} + ) + RETURNING * + `.first() + } + + async listByUser(options: ListOptions): Promise { + return this.db.query` + SELECT s.* + FROM sessions s + JOIN projects p ON s.project_id = p.id + JOIN workspaces w ON p.workspace_id = w.id + WHERE w.org_id = ${this.orgId} + AND s.user_id = ${this.userId} + ORDER BY s.created_at DESC + LIMIT ${options.limit} + OFFSET ${options.offset} 
+ `.all() + } +} +``` + +### Caching Layer + +```typescript +// Cache-aside pattern +class CachedSessionRepository { + constructor( + private repo: SessionRepository, + private cache: Redis, + private ctx: TenantContext + ) {} + + async findById(id: string): Promise { + const cacheKey = `session:${id}` + + // Try cache first + const cached = await this.cache.get(cacheKey) + if (cached) return cached + + // Fetch from database + const session = await this.repo.findById(id) + if (session) { + await this.cache.set(cacheKey, session, { ex: 3600 }) + } + + return session + } + + async update(id: string, input: UpdateSessionInput): Promise { + const session = await this.repo.update(id, input) + + // Invalidate cache + await this.cache.del(`session:${id}`) + + // Publish update event + await this.cache.publish(`events:session:${id}`, { + type: "session.updated", + session, + }) + + return session + } +} +``` + +## Migration Strategy + +### From File-Based to Database + +```typescript +// Migration script for existing data +async function migrateFromFiles( + sourceDir: string, + targetDb: Database +): Promise { + const result: MigrationResult = { + sessions: 0, + messages: 0, + parts: 0, + errors: [], + } + + // Read existing sessions + const sessionFiles = await glob(`${sourceDir}/session/**/*.json`) + + for (const file of sessionFiles) { + try { + const data = JSON.parse(await fs.readFile(file, "utf-8")) + + // Map to new schema + const session = mapLegacySession(data) + await targetDb.sessions.create(session) + result.sessions++ + + // Migrate messages + const messageFiles = await glob(`${sourceDir}/message/${data.id}/*.json`) + for (const msgFile of messageFiles) { + const msgData = JSON.parse(await fs.readFile(msgFile, "utf-8")) + const message = mapLegacyMessage(msgData) + await targetDb.messages.create(message) + result.messages++ + + // Migrate parts + const partFiles = await glob(`${sourceDir}/part/${msgData.id}/*.json`) + for (const partFile of partFiles) { + 
const partData = JSON.parse(await fs.readFile(partFile, "utf-8")) + const part = mapLegacyPart(partData) + await targetDb.messageParts.create(part) + result.parts++ + } + } + } catch (error) { + result.errors.push({ file, error: error.message }) + } + } + + return result +} +``` + +## Backup & Recovery + +### Backup Strategy + +```typescript +interface BackupConfig { + // PostgreSQL backups + database: { + schedule: "0 */6 * * *", // Every 6 hours + retention: 30, // Days + method: "pg_dump" | "wal", + }, + // Object storage + objects: { + versioning: true, + retention: 90, // Days + replication: "cross-region", + }, +} +``` + +### Point-in-Time Recovery + +```sql +-- Enable WAL archiving for PITR +ALTER SYSTEM SET archive_mode = on; +ALTER SYSTEM SET archive_command = 'aws s3 cp %p s3://backups/wal/%f'; +ALTER SYSTEM SET wal_level = replica; +``` + +## Data Retention + +### Retention Policies + +```typescript +interface RetentionPolicy { + // Session data + sessions: { + active: "indefinite", + archived: 365, // Days + deleted: 30, // Soft delete grace period + }, + // Usage records + usage: { + detailed: 90, // Days + aggregated: 730, // 2 years + }, + // Audit logs + audit: { + security: 730, // 2 years + general: 90, // Days + }, +} +``` + +### Cleanup Jobs + +```typescript +// Scheduled cleanup job +async function cleanupExpiredData(): Promise { + const cutoff = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) + + // Delete soft-deleted sessions + await db.query` + DELETE FROM sessions + WHERE status = 'deleted' + AND updated_at < ${cutoff} + ` + + // Archive old usage records + await db.query` + INSERT INTO usage_records_archive + SELECT * FROM usage_records + WHERE created_at < ${cutoff} + ` + + await db.query` + DELETE FROM usage_records + WHERE created_at < ${cutoff} + ` + + // Clean up orphaned object storage + await cleanupOrphanedObjects() +} +``` + +## Performance Optimization + +### Query Optimization + +```typescript +// Efficient message loading with 
pagination +async function loadMessages( + sessionId: string, + cursor?: string, + limit: number = 50 +): Promise<{ messages: MessageWithParts[]; nextCursor?: string }> { + const messages = await db.query` + SELECT m.*, + json_agg( + json_build_object( + 'id', mp.id, + 'type', mp.type, + 'content', mp.content, + 'order', mp.sort_order + ) ORDER BY mp.sort_order + ) as parts + FROM messages m + LEFT JOIN message_parts mp ON mp.message_id = m.id + WHERE m.session_id = ${sessionId} + -- IDs are random UUIDs, so page on created_at of the cursor row rather than on id + ${cursor ? sql`AND m.created_at < (SELECT created_at FROM messages WHERE id = ${cursor})` : sql``} + GROUP BY m.id + ORDER BY m.created_at DESC + LIMIT ${limit + 1} + `.all() + + const hasMore = messages.length > limit + if (hasMore) messages.pop() + + return { + messages, + nextCursor: hasMore ? messages[messages.length - 1].id : undefined, + } +} +``` + +### Connection Pooling + +```typescript +// PostgreSQL connection pool config +const poolConfig = { + min: 5, + max: 20, + idleTimeoutMillis: 30000, + connectionTimeoutMillis: 5000, + // Read replicas for queries + replicas: [ + { host: "replica-1.db.internal", port: 5432 }, + { host: "replica-2.db.internal", port: 5432 }, + ], +} +``` From bb0988c79988292ead64be6e6a2f137e2653c747 Mon Sep 17 00:00:00 2001 From: Claude Date: Mon, 24 Nov 2025 06:54:42 +0000 Subject: [PATCH 06/58] docs: add MySQL storage design with BIGINT primary keys Alternative storage design for MySQL deployments with: - Snowflake-style BIGINT ID generation (8 bytes vs 16) - No foreign keys, stored procedures, or triggers - Application-level referential integrity - Efficient cursor-based pagination - Sharding strategy by organization - Connection pooling and read/write splitting --- docs/design/server-side-deployment/README.md | 9 +- .../server-side-deployment/storage-mysql.md | 1005 +++++++++++++++++ 2 files changed, 1010 insertions(+), 4 deletions(-) create mode 100644 docs/design/server-side-deployment/storage-mysql.md diff --git a/docs/design/server-side-deployment/README.md 
b/docs/design/server-side-deployment/README.md index 066a8a77336..23c09053bc4 100644 --- a/docs/design/server-side-deployment/README.md +++ b/docs/design/server-side-deployment/README.md @@ -27,10 +27,11 @@ This document describes the architecture for deploying OpenCode as a multi-tenan 1. **[Architecture](./architecture.md)** - System architecture and component design 2. **[Authentication](./authentication.md)** - User authentication and authorization -3. **[Storage](./storage.md)** - Data persistence and caching strategies -4. **[Scaling](./scaling.md)** - Horizontal scaling and deployment patterns -5. **[Security](./security.md)** - Security controls and compliance -6. **[API](./api.md)** - API design and versioning +3. **[Storage](./storage.md)** - Data persistence and caching strategies (PostgreSQL) +4. **[Storage - MySQL](./storage-mysql.md)** - Alternative MySQL design with BIGINT keys +5. **[Scaling](./scaling.md)** - Horizontal scaling and deployment patterns +6. **[Security](./security.md)** - Security controls and compliance +7. **[API](./api.md)** - API design and versioning ## High-Level Architecture diff --git a/docs/design/server-side-deployment/storage-mysql.md b/docs/design/server-side-deployment/storage-mysql.md new file mode 100644 index 00000000000..00167870aab --- /dev/null +++ b/docs/design/server-side-deployment/storage-mysql.md @@ -0,0 +1,1005 @@ +# MySQL Storage Design + +## Overview + +This document describes an alternative storage design using MySQL optimized for high-scale deployments. The design avoids stored procedures, foreign keys, and triggers for maximum portability and performance, using efficient `BIGINT` primary keys instead of UUIDs. + +## Design Principles + +### Why These Constraints? 
+ +| Constraint | Reason | +|------------|--------| +| No Foreign Keys | Eliminates FK checks on writes, enables easier sharding | +| No Stored Procedures | Application-level logic, better portability | +| No Triggers | Predictable performance, easier debugging | +| BIGINT Keys | 8 bytes vs 16 bytes (UUID), better index performance | + +### Trade-offs + +**Advantages**: +- 50% smaller primary key storage +- Faster index lookups (sequential vs random) +- No FK constraint overhead on inserts +- Easier horizontal sharding +- Better cache locality + +**Considerations**: +- Application must enforce referential integrity +- Need distributed ID generation strategy +- Orphan cleanup requires background jobs + +## ID Generation + +### Snowflake ID Structure + +Use Twitter Snowflake-style IDs for distributed, time-ordered, unique identifiers. The sign bit is the most significant bit and stays 0, leaving 63 usable bits: + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ 64 bits total (signed BIGINT), most significant bit first │ +├───────────────┬─────────────────────┬──────────────┬────────────┤ +│ Sign (1) │ Timestamp (41 bits) │ Worker (10) │ Seq (12) │ +│ Always 0 │ ~69 years │ 1024 workers │ 4096/ms │ +└───────────────┴─────────────────────┴──────────────┴────────────┘ +``` + +### ID Generator Implementation + +```typescript +class SnowflakeGenerator { + private readonly epoch = 1704067200000n // 2024-01-01 00:00:00 UTC + private readonly workerIdBits = 10n + private readonly sequenceBits = 12n + + private readonly maxWorkerId = (1n << this.workerIdBits) - 1n + private readonly maxSequence = (1n << this.sequenceBits) - 1n + + private readonly workerIdShift = this.sequenceBits + private readonly timestampShift = this.sequenceBits + this.workerIdBits + + private workerId: bigint + private sequence = 0n + private lastTimestamp = -1n + + constructor(workerId: number) { + if (workerId < 0 || BigInt(workerId) > this.maxWorkerId) { + throw new Error(`Worker ID must be between 0 and ${this.maxWorkerId}`) + } + this.workerId = BigInt(workerId) + } + + 
nextId(): bigint { + let timestamp = BigInt(Date.now()) - this.epoch + + // Refuse to issue IDs if the clock moved backwards; continuing could produce duplicates + if (timestamp < this.lastTimestamp) { + throw new Error("Clock moved backwards; refusing to generate ID") + } + + if (timestamp === this.lastTimestamp) { + this.sequence = (this.sequence + 1n) & this.maxSequence + if (this.sequence === 0n) { + // Sequence exhausted for this millisecond: wait for the next one + while (timestamp <= this.lastTimestamp) { + timestamp = BigInt(Date.now()) - this.epoch + } + } + } else { + this.sequence = 0n + } + + this.lastTimestamp = timestamp + + return ( + (timestamp << this.timestampShift) | + (this.workerId << this.workerIdShift) | + this.sequence + ) + } + + // Extract timestamp from ID (22 = worker bits + sequence bits) + static getTimestamp(id: bigint): Date { + const epoch = 1704067200000n + const timestamp = (id >> 22n) + epoch + return new Date(Number(timestamp)) + } +} + +// Usage +const idGen = new SnowflakeGenerator(parseInt(process.env.WORKER_ID || "1", 10)) +const sessionId = idGen.nextId() // bigint, sortable by creation time +``` + +### Worker ID Assignment + +```typescript +// Assign worker IDs via environment or coordination service +interface WorkerIdConfig { + // Static assignment via environment + static: { + workerId: number + } + // Dynamic assignment via Redis + redis: { + key: "workers:ids" + ttl: 60 // seconds; refreshed by a heartbeat every 30 seconds + } + // Kubernetes pod ordinal + kubernetes: { + statefulSetName: string + // Pod name: opencode-api-3 → workerId: 3 + } +} + +// Redis-based dynamic assignment +async function acquireWorkerId(redis: Redis): Promise<number> { + for (let id = 0; id < 1024; id++) { + const key = `worker:${id}` + const acquired = await redis.set(key, process.pid, { + nx: true, + ex: 60, + }) + if (acquired) { + // Start heartbeat + setInterval(() => redis.expire(key, 60), 30000) + return id + } + } + throw new Error("No available worker IDs") +} +``` + +## MySQL Schema + +### Core Tables + +```sql +-- Organizations (tenants) +CREATE TABLE organizations ( + id BIGINT NOT NULL PRIMARY KEY, + name VARCHAR(255) NOT NULL, + slug VARCHAR(100) NOT NULL, + plan VARCHAR(50) NOT NULL DEFAULT 'free', + settings JSON NOT NULL, + created_at 
TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3), + + UNIQUE KEY uk_slug (slug), + KEY idx_created_at (created_at) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- Users +CREATE TABLE users ( + id BIGINT NOT NULL PRIMARY KEY, + org_id BIGINT NOT NULL, + email VARCHAR(255) NOT NULL, + name VARCHAR(255), + avatar_url VARCHAR(500), + role VARCHAR(50) NOT NULL DEFAULT 'member', + preferences JSON NOT NULL, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3), + last_active_at TIMESTAMP(3) NULL, + + UNIQUE KEY uk_org_email (org_id, email), + KEY idx_org_id (org_id), + KEY idx_email (email) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- Workspaces +CREATE TABLE workspaces ( + id BIGINT NOT NULL PRIMARY KEY, + org_id BIGINT NOT NULL, + name VARCHAR(255) NOT NULL, + description TEXT, + git_config JSON, + settings JSON NOT NULL, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3), + + KEY idx_org_id (org_id), + KEY idx_org_name (org_id, name) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- Projects +CREATE TABLE projects ( + id BIGINT NOT NULL PRIMARY KEY, + workspace_id BIGINT NOT NULL, + name VARCHAR(255) NOT NULL, + path VARCHAR(1000) NOT NULL, + git_commit VARCHAR(40), + settings JSON NOT NULL, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3), + + KEY idx_workspace_id (workspace_id) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- Sessions +CREATE TABLE sessions ( + id BIGINT NOT NULL PRIMARY KEY, + project_id BIGINT NOT NULL, + user_id BIGINT NOT 
NULL, + parent_id BIGINT NULL, + title VARCHAR(500) NOT NULL, + status VARCHAR(50) NOT NULL DEFAULT 'active', + model_provider VARCHAR(100) NOT NULL, + model_id VARCHAR(100) NOT NULL, + summary JSON, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3), + expires_at TIMESTAMP(3) NULL, + + KEY idx_project_id (project_id), + KEY idx_user_id (user_id), + KEY idx_user_created (user_id, created_at DESC), + KEY idx_status (status), + KEY idx_parent_id (parent_id), + KEY idx_expires_at (expires_at) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- Messages +CREATE TABLE messages ( + id BIGINT NOT NULL PRIMARY KEY, + session_id BIGINT NOT NULL, + role VARCHAR(50) NOT NULL, + metadata JSON NOT NULL, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + completed_at TIMESTAMP(3) NULL, + + KEY idx_session_id (session_id), + KEY idx_session_created (session_id, created_at) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- Message Parts +CREATE TABLE message_parts ( + id BIGINT NOT NULL PRIMARY KEY, + message_id BIGINT NOT NULL, + type VARCHAR(50) NOT NULL, + content JSON NOT NULL, + sort_order INT NOT NULL, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + + KEY idx_message_id (message_id), + KEY idx_message_order (message_id, sort_order) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- Session Diffs +CREATE TABLE session_diffs ( + id BIGINT NOT NULL PRIMARY KEY, + session_id BIGINT NOT NULL, + message_id BIGINT NOT NULL, + file_path VARCHAR(1000) NOT NULL, + diff_content MEDIUMTEXT NOT NULL, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + + KEY idx_session_id (session_id), + KEY idx_message_id (message_id) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; +``` + +### Authentication Tables + +```sql +-- API Keys +CREATE TABLE 
api_keys ( + id BIGINT NOT NULL PRIMARY KEY, + org_id BIGINT NOT NULL, + user_id BIGINT NOT NULL, + name VARCHAR(255) NOT NULL, + prefix VARCHAR(20) NOT NULL, + hash VARCHAR(255) NOT NULL, + scopes JSON NOT NULL, + rate_limit JSON, + expires_at TIMESTAMP(3) NULL, + last_used_at TIMESTAMP(3) NULL, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + + KEY idx_org_id (org_id), + KEY idx_user_id (user_id), + KEY idx_prefix (prefix) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- Refresh Tokens +CREATE TABLE refresh_tokens ( + id BIGINT NOT NULL PRIMARY KEY, + user_id BIGINT NOT NULL, + family_id BIGINT NOT NULL, + hash VARCHAR(255) NOT NULL, + used TINYINT(1) NOT NULL DEFAULT 0, + revoked TINYINT(1) NOT NULL DEFAULT 0, + expires_at TIMESTAMP(3) NOT NULL, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + + KEY idx_user_id (user_id), + KEY idx_family_id (family_id), + KEY idx_hash (hash), + KEY idx_expires_at (expires_at) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- User Sessions (login sessions) +CREATE TABLE user_sessions ( + id BIGINT NOT NULL PRIMARY KEY, + user_id BIGINT NOT NULL, + token_family BIGINT NOT NULL, + device VARCHAR(255), + ip VARCHAR(45), + location VARCHAR(255), + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + last_active_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + expires_at TIMESTAMP(3) NOT NULL, + + KEY idx_user_id (user_id), + KEY idx_token_family (token_family), + KEY idx_expires_at (expires_at) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- OAuth Connections +CREATE TABLE oauth_connections ( + id BIGINT NOT NULL PRIMARY KEY, + user_id BIGINT NOT NULL, + provider VARCHAR(50) NOT NULL, + provider_user_id VARCHAR(255) NOT NULL, + access_token_encrypted TEXT NOT NULL, + refresh_token_encrypted TEXT, + expires_at TIMESTAMP(3) NULL, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + 
updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3), + + UNIQUE KEY uk_provider_user (provider, provider_user_id), + KEY idx_user_id (user_id) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; +``` + +### Provider & Usage Tables + +```sql +-- User Provider Keys (BYOK) +CREATE TABLE user_provider_keys ( + id BIGINT NOT NULL PRIMARY KEY, + user_id BIGINT NOT NULL, + provider_id VARCHAR(100) NOT NULL, + encrypted_key TEXT NOT NULL, + key_id VARCHAR(255) NOT NULL, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + last_used_at TIMESTAMP(3) NULL, + + UNIQUE KEY uk_user_provider (user_id, provider_id) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- Organization Provider Config +CREATE TABLE org_provider_configs ( + id BIGINT NOT NULL PRIMARY KEY, + org_id BIGINT NOT NULL, + provider_id VARCHAR(100) NOT NULL, + encrypted_key TEXT NOT NULL, + key_id VARCHAR(255) NOT NULL, + rate_limit JSON, + allow_user_override TINYINT(1) NOT NULL DEFAULT 1, + usage_tracking TINYINT(1) NOT NULL DEFAULT 1, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3), + + UNIQUE KEY uk_org_provider (org_id, provider_id) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- Usage Records +CREATE TABLE usage_records ( + id BIGINT NOT NULL PRIMARY KEY, + org_id BIGINT NOT NULL, + user_id BIGINT NOT NULL, + session_id BIGINT NULL, + provider_id VARCHAR(100) NOT NULL, + model_id VARCHAR(100) NOT NULL, + tokens_input INT NOT NULL, + tokens_output INT NOT NULL, + tokens_cache_read INT NOT NULL DEFAULT 0, + tokens_cache_write INT NOT NULL DEFAULT 0, + cost_cents INT NOT NULL, + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + + KEY idx_org_created (org_id, created_at), + KEY idx_user_created (user_id, created_at), + KEY idx_session_id (session_id) +) ENGINE=InnoDB 
DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; + +-- Audit Logs +CREATE TABLE audit_logs ( + id BIGINT NOT NULL PRIMARY KEY, + org_id BIGINT NOT NULL, + user_id BIGINT NOT NULL, + action VARCHAR(100) NOT NULL, + resource VARCHAR(100) NOT NULL, + resource_id BIGINT NULL, + metadata JSON NOT NULL, + ip VARCHAR(45), + user_agent TEXT, + status VARCHAR(50) NOT NULL, + error_code VARCHAR(100), + created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3), + + KEY idx_org_created (org_id, created_at), + KEY idx_user_created (user_id, created_at), + KEY idx_action (action), + KEY idx_resource (resource, resource_id) +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; +``` + +## Application-Level Referential Integrity + +### Validation on Insert/Update + +```typescript +// Validate parent exists before insert +class SessionRepository { + async create(input: CreateSessionInput): Promise { + // Validate project exists + const project = await this.db.query` + SELECT id, workspace_id FROM projects WHERE id = ${input.projectId} + `.first() + + if (!project) { + throw new NotFoundError("Project not found", "PROJECT_NOT_FOUND") + } + + // Validate workspace belongs to org (for tenant isolation) + const workspace = await this.db.query` + SELECT id FROM workspaces + WHERE id = ${project.workspaceId} AND org_id = ${this.ctx.orgId} + `.first() + + if (!workspace) { + throw new ForbiddenError("Access denied", "WORKSPACE_ACCESS_DENIED") + } + + // Insert session + const id = this.idGen.nextId() + await this.db.query` + INSERT INTO sessions (id, project_id, user_id, title, model_provider, model_id) + VALUES (${id}, ${input.projectId}, ${this.ctx.userId}, ${input.title}, + ${input.modelProvider}, ${input.modelId}) + ` + + return this.findById(id) + } +} +``` + +### Cascading Deletes + +```typescript +// Manual cascade delete (no FK constraints) +class SessionRepository { + async delete(id: bigint): Promise { + // Verify ownership + const session = await 
this.findById(id) + if (!session) { + throw new NotFoundError("Session not found") + } + + // Delete in order: parts → messages → diffs → session + // Use transaction for atomicity + await this.db.transaction(async (tx) => { + // Get all message IDs for this session + const messageIds = await tx.query<{ id: bigint }>` + SELECT id FROM messages WHERE session_id = ${id} + `.all() + + if (messageIds.length > 0) { + const ids = messageIds.map(m => m.id) + + // Delete parts for all messages + await tx.query` + DELETE FROM message_parts WHERE message_id IN (${ids}) + ` + + // Delete messages + await tx.query` + DELETE FROM messages WHERE session_id = ${id} + ` + } + + // Delete diffs + await tx.query` + DELETE FROM session_diffs WHERE session_id = ${id} + ` + + // Delete session + await tx.query` + DELETE FROM sessions WHERE id = ${id} + ` + }) + } +} +``` + +### Orphan Cleanup Job + +```typescript +// Background job to clean orphaned records +class OrphanCleanupJob { + async run(): Promise { + const result: CleanupResult = { + messageParts: 0, + messages: 0, + diffs: 0, + sessions: 0, + } + + // Find and delete orphaned message_parts + const orphanedParts = await this.db.query` + DELETE mp FROM message_parts mp + LEFT JOIN messages m ON mp.message_id = m.id + WHERE m.id IS NULL + ` + result.messageParts = orphanedParts.affectedRows + + // Find and delete orphaned messages + const orphanedMessages = await this.db.query` + DELETE m FROM messages m + LEFT JOIN sessions s ON m.session_id = s.id + WHERE s.id IS NULL + ` + result.messages = orphanedMessages.affectedRows + + // Find and delete orphaned session_diffs + const orphanedDiffs = await this.db.query` + DELETE sd FROM session_diffs sd + LEFT JOIN sessions s ON sd.session_id = s.id + WHERE s.id IS NULL + ` + result.diffs = orphanedDiffs.affectedRows + + // Find and delete orphaned sessions (no project) + const orphanedSessions = await this.db.query` + DELETE s FROM sessions s + LEFT JOIN projects p ON s.project_id = 
p.id + WHERE p.id IS NULL + ` + result.sessions = orphanedSessions.affectedRows + + return result + } +} + +// Schedule: run every hour +schedule.every("1 hour", () => orphanCleanupJob.run()) +``` + +## Query Patterns + +### Efficient Pagination with BIGINT + +```typescript +// Cursor-based pagination (efficient with BIGINT) +async function listSessions( + userId: bigint, + cursor?: bigint, + limit: number = 50 +): Promise> { + // Snowflake IDs are time-ordered, so we can use them directly + const sessions = await db.query` + SELECT * FROM sessions + WHERE user_id = ${userId} + ${cursor ? sql`AND id < ${cursor}` : sql``} + ORDER BY id DESC + LIMIT ${limit + 1} + `.all() + + const hasMore = sessions.length > limit + if (hasMore) sessions.pop() + + return { + data: sessions, + pagination: { + cursor: hasMore ? sessions[sessions.length - 1].id.toString() : undefined, + hasMore, + }, + } +} +``` + +### Batch Loading with IN Clause + +```typescript +// Efficient batch loading +async function getMessagesWithParts(sessionId: bigint): Promise { + // Load messages + const messages = await db.query` + SELECT * FROM messages + WHERE session_id = ${sessionId} + ORDER BY created_at ASC + `.all() + + if (messages.length === 0) return [] + + // Batch load all parts + const messageIds = messages.map(m => m.id) + const parts = await db.query` + SELECT * FROM message_parts + WHERE message_id IN (${messageIds}) + ORDER BY message_id, sort_order + `.all() + + // Group parts by message + const partsByMessage = new Map() + for (const part of parts) { + const list = partsByMessage.get(part.message_id) || [] + list.push(part) + partsByMessage.set(part.message_id, list) + } + + // Combine + return messages.map(msg => ({ + ...msg, + parts: partsByMessage.get(msg.id) || [], + })) +} +``` + +### Multi-Tenant Queries + +```typescript +// All queries scoped to organization +class TenantScopedRepository { + constructor( + protected db: Database, + protected ctx: TenantContext + ) {} + + // 
Helper to add org scope through joins + protected async withOrgScope( + table: string, + id: bigint + ): Promise<boolean> { + // Different paths to org based on table + const scopeQueries: Record<string, string> = { + sessions: ` + SELECT 1 FROM sessions s + JOIN projects p ON s.project_id = p.id + JOIN workspaces w ON p.workspace_id = w.id + WHERE s.id = ? AND w.org_id = ? + `, + messages: ` + SELECT 1 FROM messages m + JOIN sessions s ON m.session_id = s.id + JOIN projects p ON s.project_id = p.id + JOIN workspaces w ON p.workspace_id = w.id + WHERE m.id = ? AND w.org_id = ? + `, + projects: ` + SELECT 1 FROM projects p + JOIN workspaces w ON p.workspace_id = w.id + WHERE p.id = ? AND w.org_id = ? + `, + workspaces: ` + SELECT 1 FROM workspaces WHERE id = ? AND org_id = ? + `, + } + + const query = scopeQueries[table] + if (!query) { + throw new Error(`Unknown table: ${table}`) + } + + const result = await this.db.execute(query, [id, this.ctx.orgId]) + return result.length > 0 + } +} +``` + +## Index Optimization + +### Covering Indexes + +```sql +-- Covering index for common query patterns +-- Sessions by user with status filter +CREATE INDEX idx_sessions_user_status_covering +ON sessions (user_id, status, created_at DESC, id, title, model_provider, model_id); + +-- Messages with metadata for listing +CREATE INDEX idx_messages_session_created +ON messages (session_id, created_at, id, role); +``` + +### JSON Indexing + +```sql +-- Virtual columns for JSON fields (MySQL 5.7+) +ALTER TABLE sessions +ADD COLUMN summary_files INT +GENERATED ALWAYS AS (JSON_EXTRACT(summary, '$.files')) VIRTUAL; + +CREATE INDEX idx_sessions_summary_files ON sessions (summary_files); + +-- Or use a functional index on JSON_VALUE (MySQL 8.0.21+) +CREATE INDEX idx_organizations_plan +ON organizations ((CAST(JSON_VALUE(settings, '$.plan') AS CHAR(50)))); +``` + +### Composite Index Strategy + +```sql +-- Order matters: equality → range → sort +-- Good: WHERE user_id = ? AND status = ? 
ORDER BY created_at DESC +CREATE INDEX idx_sessions_user_status_created +ON sessions (user_id, status, created_at DESC); + +-- For time-range queries with org scope +CREATE INDEX idx_usage_org_created +ON usage_records (org_id, created_at); + +-- For prefix searches on API keys +CREATE INDEX idx_api_keys_prefix +ON api_keys (prefix(8)); +``` + +## Connection Management + +### Connection Pool Configuration + +```typescript +import mysql from "mysql2/promise" + +const pool = mysql.createPool({ + host: process.env.MYSQL_HOST, + port: parseInt(process.env.MYSQL_PORT || "3306", 10), + user: process.env.MYSQL_USER, + password: process.env.MYSQL_PASSWORD, + database: process.env.MYSQL_DATABASE, + + // Pool settings + connectionLimit: 20, + queueLimit: 0, + waitForConnections: true, + + // Timeouts (mysql2 has no acquireTimeout pool option) + connectTimeout: 10000, + + // Keep-alive + enableKeepAlive: true, + keepAliveInitialDelay: 30000, + + // Character set + charset: "utf8mb4", + + // Timezone + timezone: "+00:00", + + // Named placeholders + namedPlaceholders: true, +}) + +// Health check +async function checkHealth(): Promise<boolean> { + try { + const conn = await pool.getConnection() + try { + await conn.ping() + } finally { + // Always return the connection to the pool, even if ping fails + conn.release() + } + return true + } catch { + return false + } +} +``` + +### Read/Write Splitting + +```typescript +interface DatabaseConfig { + writer: mysql.PoolOptions + readers: mysql.PoolOptions[] +} + +class ReadWritePool { + private writer: mysql.Pool + private readers: mysql.Pool[] + private readerIndex = 0 + + constructor(config: DatabaseConfig) { + this.writer = mysql.createPool(config.writer) + this.readers = config.readers.map(r => mysql.createPool(r)) + } + + // Get writer for INSERT/UPDATE/DELETE + getWriter(): mysql.Pool { + return this.writer + } + + // Round-robin reader selection + getReader(): mysql.Pool { + if (this.readers.length === 0) { + return this.writer + } + const reader = this.readers[this.readerIndex] + this.readerIndex = (this.readerIndex + 1) % this.readers.length + 
return reader + } + + // Smart routing based on query + async query<T>(sql: string, params?: unknown[]): Promise<T[]> { + const isWrite = /^\s*(INSERT|UPDATE|DELETE|REPLACE)/i.test(sql) + const pool = isWrite ? this.getWriter() : this.getReader() + const [rows] = await pool.execute(sql, params) + return rows as T[] + } +} +``` + +## Sharding Strategy + +### Shard Key Selection + +```typescript +// Shard by organization for tenant isolation +interface ShardConfig { + shardKey: "org_id" + shardCount: 16 + shardMap: Map<number, DatabaseConfig> // shard_id → connection +} + +function getShardId(orgId: bigint, shardCount: number): number { + // Simple modulo hashing (not consistent hashing: changing shardCount remaps most orgs) + return Number(orgId % BigInt(shardCount)) +} + +class ShardedDatabase { + private shards: Map<number, ReadWritePool> + + constructor(config: ShardConfig) { + this.shards = new Map() + for (const [shardId, dbConfig] of config.shardMap) { + this.shards.set(shardId, new ReadWritePool(dbConfig)) + } + } + + getPool(orgId: bigint): ReadWritePool { + const shardId = getShardId(orgId, this.shards.size) + const pool = this.shards.get(shardId) + if (!pool) { + throw new Error(`Shard ${shardId} not configured`) + } + return pool + } + + // Cross-shard query (fan-out) + async queryAll<T>(sql: string, params?: unknown[]): Promise<T[]> { + const results = await Promise.all( + Array.from(this.shards.values()).map(pool => + pool.query<T>(sql, params) + ) + ) + return results.flat() + } +} +``` + +### Schema Per Shard + +```sql +-- Each shard has identical schema +-- Shard 0: opencode_shard_0 +-- Shard 1: opencode_shard_1 +-- ... 
+ +-- Global tables (not sharded) in separate database +-- opencode_global: organizations, users, api_keys +``` + +## Migration from UUID + +### Migration Script + +```typescript +// Add bigint columns alongside UUID +async function migrationStep1(): Promise<void> { + await db.query` + ALTER TABLE sessions + ADD COLUMN id_new BIGINT NULL AFTER id, + ADD COLUMN project_id_new BIGINT NULL AFTER project_id, + ADD COLUMN user_id_new BIGINT NULL AFTER user_id + ` +} + +// Populate bigint columns +async function migrationStep2(): Promise<void> { + // Generate mapping: UUID → BIGINT + const idGen = new SnowflakeGenerator(0) + + // Process in batches + let cursor: string | null = null + while (true) { + const sessions = await db.query` + SELECT id, project_id, user_id FROM sessions + WHERE id_new IS NULL + ${cursor ? sql`AND id > ${cursor}` : sql``} + ORDER BY id + LIMIT 1000 + `.all() + + if (sessions.length === 0) break + + for (const session of sessions) { + const newId = idGen.nextId() + await db.query` + UPDATE sessions SET id_new = ${newId} + WHERE id = ${session.id} + ` + } + + cursor = sessions[sessions.length - 1].id + } +} + +// Swap columns +async function migrationStep3(): Promise<void> { + await db.query` + ALTER TABLE sessions + DROP COLUMN id, + CHANGE COLUMN id_new id BIGINT NOT NULL, + ADD PRIMARY KEY (id) + ` +} +``` + +## Performance Considerations + +### Batch Inserts + +```typescript +// Bulk insert for message parts +async function insertParts(parts: MessagePart[]): Promise<void> { + if (parts.length === 0) return + + const values = parts.map(p => [ + p.id, + p.message_id, + p.type, + JSON.stringify(p.content), + p.sort_order, + ]) + + await db.query` + INSERT INTO message_parts (id, message_id, type, content, sort_order) + VALUES ${values} + ` +} +``` + +### Query Optimization Tips + +```sql +-- Use STRAIGHT_JOIN to force join order when optimizer chooses poorly +SELECT STRAIGHT_JOIN s.* +FROM sessions s +JOIN projects p ON s.project_id = p.id +JOIN workspaces w ON 
p.workspace_id = w.id +WHERE w.org_id = ?; + +-- Use index hints if needed +SELECT * FROM sessions USE INDEX (idx_sessions_user_status_created) +WHERE user_id = ? AND status = 'active' +ORDER BY created_at DESC; + +-- Avoid SELECT * in production +SELECT id, title, status, created_at FROM sessions WHERE user_id = ?; +``` + +### Monitoring Queries + +```sql +-- Find slow queries +SELECT * FROM performance_schema.events_statements_summary_by_digest +ORDER BY SUM_TIMER_WAIT DESC +LIMIT 10; + +-- Check index usage +SELECT * FROM sys.schema_unused_indexes; + +-- Table sizes +SELECT + table_name, + ROUND(data_length / 1024 / 1024, 2) AS data_mb, + ROUND(index_length / 1024 / 1024, 2) AS index_mb +FROM information_schema.tables +WHERE table_schema = 'opencode' +ORDER BY data_length DESC; +``` From b7830c342a04b9e4e7d2f3724decf00d7a06cf60 Mon Sep 17 00:00:00 2001 From: Claude Date: Mon, 24 Nov 2025 04:15:37 +0000 Subject: [PATCH 07/58] docs: add comprehensive subagent API reference and new client feasibility analysis - Document all server-side APIs for session, message, and task management - Document all client-side APIs including Bus events, Storage, and Provider - Analyze event system for real-time subagent monitoring - Provide implementation guide for new clients - Assess feasibility of building clients in various languages (TypeScript, Python, Go, Rust) - Include architecture patterns and feature parity matrix - Document existing SDK (@opencode-ai/sdk) and web client (packages/desktop) - Add packages overview showing all available client implementations --- docs/NEW_CLIENT_FEASIBILITY.md | 650 +++++++++++++++++ docs/SUBAGENT_API_REFERENCE.md | 1226 ++++++++++++++++++++++++++++++++ 2 files changed, 1876 insertions(+) create mode 100644 docs/NEW_CLIENT_FEASIBILITY.md create mode 100644 docs/SUBAGENT_API_REFERENCE.md diff --git a/docs/NEW_CLIENT_FEASIBILITY.md b/docs/NEW_CLIENT_FEASIBILITY.md new file mode 100644 index 00000000000..e23e73a022e --- /dev/null +++ 
b/docs/NEW_CLIENT_FEASIBILITY.md @@ -0,0 +1,650 @@ +# New Client Feasibility Analysis + +This document analyzes the feasibility of building a new client for OpenCode with full subagent and task management support. + +## Executive Summary + +**Verdict: Highly Feasible** + +OpenCode's architecture is well-suited for alternative client implementations. The HTTP API is comprehensive, events are streamed via SSE, and all schemas are well-defined with Zod. + +**Important:** OpenCode already has: +- A **generated TypeScript SDK** (`@opencode-ai/sdk`) with all API methods +- A **SolidJS web client** (`packages/desktop`) with full subagent support +- A **TUI client** in the core package + +New clients can either use the existing SDK (for TypeScript/JavaScript) or implement their own HTTP client based on the OpenAPI spec. + +--- + +## Current Architecture + +### Communication Patterns + +``` +┌─────────────┐ HTTP/SSE ┌──────────────┐ +│ Client │ ◄─────────────────────► │ Server │ +│ (TUI/Web) │ │ (Hono) │ +└─────────────┘ └──────────────┘ + │ + ┌───────┴───────┐ + │ │ + ┌─────▼─────┐ ┌─────▼─────┐ + │ Session │ │ Agent │ + │ Manager │ │ Executor │ + └───────────┘ └───────────┘ +``` + +### Key Components + +| Component | Role | Client Access | +|-----------|------|---------------| +| Server (Hono) | HTTP API gateway | Direct HTTP | +| Bus | Event pub/sub | SSE streaming | +| Storage | Persistence | Via API only | +| Session Manager | Session CRUD | HTTP endpoints | +| Prompt Executor | LLM execution | POST /message | +| Agent Registry | Agent config | GET /agent | + +--- + +## API Completeness Analysis + +### Session Management: Complete + +| Operation | Endpoint | Status | +|-----------|----------|--------| +| Create | POST /session | Available | +| Read | GET /session/:id | Available | +| List | GET /session | Available | +| Update | PATCH /session/:id | Available | +| Delete | DELETE /session/:id | Available | +| Fork | POST /session/:id/fork | Available | +| Children | 
GET /session/:id/children | Available | +| Share | POST /session/:id/share | Available | + +### Message Execution: Complete + +| Operation | Endpoint | Status | +|-----------|----------|--------| +| Create & Execute | POST /session/:id/message | Available (streams) | +| List Messages | GET /session/:id/message | Available | +| Get Message | GET /session/:id/message/:msgID | Available | +| Execute Command | POST /session/:id/command | Available | +| Execute Shell | POST /session/:id/shell | Available | +| Abort | POST /session/:id/abort | Available | +| Revert | POST /session/:id/revert | Available | + +### Event Streaming: Complete + +| Operation | Endpoint | Status | +|-----------|----------|--------| +| Session Events | GET /event | SSE stream | +| Global Events | GET /global/event | SSE stream | +| Status Polling | GET /session/status | Available | + +### Agent Configuration: Complete + +| Operation | Endpoint | Status | +|-----------|----------|--------| +| List Agents | GET /agent | Available | +| Permissions | POST /session/:id/permissions/:id | Available | + +--- + +## New Client Capabilities + +### Tier 1: Basic Client (1-2 weeks) + +**Features:** +- Session CRUD +- Message sending/receiving +- Basic streaming output +- Agent selection + +**APIs Required:** +- POST/GET/DELETE /session +- POST /session/:id/message +- GET /session/:id/message + +**Complexity:** Low + +--- + +### Tier 2: Full-Featured Client (3-4 weeks) + +**Additional Features:** +- Real-time event streaming +- Subagent monitoring +- File diff visualization +- Permission handling +- Session forking + +**APIs Required:** +- All Tier 1 APIs +- GET /event (SSE) +- GET /session/:id/children +- POST /session/:id/fork +- POST /session/:id/permissions/:id + +**Complexity:** Medium + +--- + +### Tier 3: Advanced Client (5-8 weeks) + +**Additional Features:** +- Custom agent creation +- Model management +- Cost analytics +- Session sharing +- Compaction handling + +**APIs Required:** +- All Tier 2 
APIs +- Full event handling +- Share endpoints +- Usage aggregation logic + +**Complexity:** High + +--- + +## Implementation Approaches + +### Approach 1: Use Existing SDK (TypeScript/JavaScript) + +For TypeScript/JavaScript projects, use the existing generated SDK: + +```typescript +import { createOpencodeClient } from "@opencode-ai/sdk/client" + +const client = createOpencodeClient({ + baseUrl: "http://localhost:4096", + directory: "/path/to/project", +}) + +// Full type safety and all methods available +const session = await client.session.create() +const response = await client.session.prompt({ + path: { id: session.id }, + body: { parts: [{ type: "text", text: "Hello" }] } +}) +``` + +**Pros:** +- Pre-built, tested, and maintained +- Full TypeScript types +- Generated from OpenAPI spec +- Handles authentication and headers + +**Cons:** +- TypeScript/JavaScript only + +**Recommended for:** Web apps, Electron apps, Node.js tools, VS Code extensions + +--- + +### Approach 2: HTTP Client Only (Other Languages) + +**Pros:** +- Simplest implementation +- Works in any language +- No special dependencies + +**Cons:** +- Must poll for some operations +- No direct storage access + +**Recommended for:** Python, Go, Rust clients, mobile apps, integrations + +--- + +### Approach 3: WebSocket Enhancement + +Currently OpenCode uses SSE for events. 
A WebSocket client could be built: + +**Implementation:** +```typescript +// Forward SSE events over an existing WebSocket +class WebSocketAdapter { + private sse?: EventSource + + constructor(private ws: WebSocket) {} + + connect() { + this.sse = new EventSource("/event") + this.sse.onmessage = (e) => { + this.ws.send(e.data) + } + } +} +``` + +**Pros:** +- Bidirectional communication +- Better mobile support + +**Cons:** +- Additional server changes needed + +--- + +### Approach 4: Direct Integration + +Import OpenCode modules directly: + +```typescript +import { Session, SessionPrompt, Bus } from "@opencode/core" + +// Direct access to all internals +const session = await Session.create({ title: "My Session" }) +Bus.subscribe(Session.Event.Created, handleCreated) +``` + +**Pros:** +- Full access to internals +- Best performance +- No network overhead + +**Cons:** +- Node.js/Bun only +- Tight coupling to internals + +**Recommended for:** CLI tools, IDE plugins + +--- + +## Language-Specific Implementations + +### TypeScript/JavaScript + +```typescript +import { createOpencodeClient } from "@opencode-ai/sdk/client" + +const client = createOpencodeClient({ baseUrl: "http://localhost:4096" }) +const session = await client.session.create() +const message = await client.session.prompt({ + path: { id: session.id }, + body: { parts: [{ type: "text", text: "Hello" }] }, +}) +``` + +**Advantages:** Native types, existing SDK patterns + +--- + +### Python + +```python +import opencode + +client = opencode.Client("http://localhost:3000") +session = client.sessions.create() +message = client.messages.prompt( + session_id=session.id, + parts=[{"type": "text", "text": "Hello"}] +) + +# Event streaming +for event in client.events.stream(): + if event.type == "message.part.updated": + print(event.properties.part.text) +``` + +**Advantages:** Large AI/ML ecosystem + +--- + +### Go + +```go +client := opencode.NewClient("http://localhost:3000") +session, _ := client.Sessions.Create(nil) +message, _ := client.Messages.Prompt(opencode.PromptInput{ + SessionID: 
session.ID, + Parts: []opencode.Part{ + {Type: "text", Text: "Hello"}, + }, +}) + +// Event streaming +events := client.Events.Subscribe() +for event := range events { + switch e := event.(type) { + case *opencode.MessagePartUpdated: + fmt.Println(e.Part.Text) + } +} +``` + +**Advantages:** Performance, concurrency + +--- + +### Rust + +```rust +let client = OpenCodeClient::new("http://localhost:3000"); +let session = client.sessions().create(None).await?; +let message = client.messages().prompt(PromptInput { + session_id: session.id, + parts: vec![Part::Text { text: "Hello".into() }], +}).await?; + +// Event streaming +let mut events = client.events().subscribe().await?; +while let Some(event) = events.next().await { + match event { + Event::MessagePartUpdated { part, .. } => { + println!("{}", part.text); + } + _ => {} + } +} +``` + +**Advantages:** Performance, safety, WASM support + +--- + +## Client Architecture Patterns + +### Pattern 1: Thin Client + +``` +┌─────────────────┐ +│ Thin Client │ +│ (just HTTP) │ +└────────┬────────┘ + │ + ▼ +┌─────────────────┐ +│ OpenCode API │ +└─────────────────┘ +``` + +All logic in server. Client only renders. + +**Use case:** Web dashboards, monitoring tools + +--- + +### Pattern 2: Smart Client + +``` +┌─────────────────┐ +│ Smart Client │ +│ ┌─────────────┐ │ +│ │ Local State │ │ +│ │ Cache │ │ +│ └─────────────┘ │ +└────────┬────────┘ + │ + ▼ +┌─────────────────┐ +│ OpenCode API │ +└─────────────────┘ +``` + +Local state, caching, optimistic updates. + +**Use case:** TUI, IDE plugins + +--- + +### Pattern 3: Offline-First Client + +``` +┌─────────────────┐ +│ Offline Client │ +│ ┌─────────────┐ │ +│ │ Local Store │ │ +│ │ (SQLite) │ │ +│ └─────────────┘ │ +└────────┬────────┘ + │ Sync + ▼ +┌─────────────────┐ +│ OpenCode API │ +└─────────────────┘ +``` + +Full offline support with sync. 
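One way to picture the sync step of this pattern is an outbox that holds prompts while the client is offline and replays them in order once the server is reachable. The sketch below is illustrative only: `Outbox`, `PendingPrompt`, and `Sender` are hypothetical names, and an in-memory array stands in for the local store (SQLite in the diagram).

```typescript
// Hypothetical outbox: not part of OpenCode. Queues prompts while offline
// and replays them oldest-first when a sender becomes available.
type PendingPrompt = { sessionID: string; text: string }
type Sender = (item: PendingPrompt) => Promise<void>

class Outbox {
  // In-memory stand-in for a persistent local store
  private queue: PendingPrompt[] = []

  enqueue(item: PendingPrompt): void {
    this.queue.push(item)
  }

  get size(): number {
    return this.queue.length
  }

  // Sends queued items in order and stops at the first failure, so the
  // failed item and everything after it stay queued for the next attempt.
  async flush(send: Sender): Promise<number> {
    let sent = 0
    while (this.queue.length > 0) {
      const next = this.queue[0]
      if (!next) break
      try {
        await send(next)
      } catch {
        break
      }
      this.queue.shift()
      sent++
    }
    return sent
  }
}
```

A real client would pass a `send` function that posts to the OpenCode API and would call `flush` whenever connectivity returns.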
+ +**Use case:** Mobile apps, distributed teams + +--- + +## Feature Parity Matrix + +| Feature | Current TUI | New Client Possible | +|---------|-------------|---------------------| +| Session management | Yes | Yes | +| Real-time streaming | Yes | Yes | +| Subagent monitoring | Yes | Yes | +| File diff view | Yes | Yes | +| Cost tracking | Yes | Yes | +| Permission dialogs | Yes | Yes | +| Vim keybindings | Yes | Implementation choice | +| Markdown rendering | Yes | Implementation choice | +| Syntax highlighting | Yes | Implementation choice | +| Theme customization | Yes | Implementation choice | +| Session navigation | Yes | Yes | + +--- + +## Challenges and Solutions + +### Challenge 1: Streaming Response Parsing + +**Problem:** POST /message returns streaming JSON chunks. + +**Solution:** +```typescript +async function* streamPrompt(input: PromptInput) { + const response = await fetch(url, { + method: "POST", + body: JSON.stringify(input), + }) + + const reader = response.body!.getReader() + const decoder = new TextDecoder() + let buffer = "" + + while (true) { + const { done, value } = await reader.read() + if (done) break + + buffer += decoder.decode(value, { stream: true }) + const lines = buffer.split("\n") + buffer = lines.pop()! + + for (const line of lines) { + if (line.trim()) { + yield JSON.parse(line) + } + } + } +} +``` + +--- + +### Challenge 2: Event Reconnection + +**Problem:** SSE connections can drop. 
+ +**Solution:** +```typescript +class ResilientEventSource { + private eventSource?: EventSource + private retryDelay = 1000 + + constructor(private url: string) {} + + connect() { + this.eventSource = new EventSource(this.url) + + this.eventSource.onerror = () => { + // Close and reconnect manually with exponential backoff (capped at 30s) + this.eventSource?.close() + setTimeout(() => this.connect(), this.retryDelay) + this.retryDelay = Math.min(this.retryDelay * 2, 30000) + } + + this.eventSource.onopen = () => { + this.retryDelay = 1000 + } + } +} +``` + +--- + +### Challenge 3: Parent-Child Session Tracking + +**Problem:** Clients need to track parent-child relationships for subagent monitoring. + +**Solution:** +```typescript +class SessionTree { + private sessions: Map<string, Session.Info> = new Map() + private children: Map<string, Set<string>> = new Map() + + add(session: Session.Info) { + this.sessions.set(session.id, session) + if (session.parentID) { + if (!this.children.has(session.parentID)) { + this.children.set(session.parentID, new Set()) + } + this.children.get(session.parentID)!.add(session.id) + } + } + + getChildren(id: string): Session.Info[] { + const childIds = this.children.get(id) ?? new Set<string>() + return [...childIds].map(childId => this.sessions.get(childId)!) + } + + getAncestors(id: string): Session.Info[] { + const result: Session.Info[] = [] + let current = this.sessions.get(id) + while (current?.parentID) { + current = this.sessions.get(current.parentID) + if (current) result.push(current) + } + return result + } +} +``` + +--- + +### Challenge 4: Permission Handling + +**Problem:** Server may pause execution for permission requests. 
+ +**Solution:** +```typescript +class PermissionHandler { + private pending: Map<string, { + resolve: (approved: boolean) => void + permission: Permission + }> = new Map() + + async handle(event: PermissionEvent) { + const permission = event.properties.permission + + // Show UI dialog + const approved = await this.showDialog(permission) + + // Send response + await fetch(`/session/${permission.sessionID}/permissions/${permission.id}`, { + method: "POST", + body: JSON.stringify({ approved }), + }) + } + + private async showDialog(permission: Permission): Promise<boolean> { + // Implementation depends on UI framework + throw new Error("showDialog is not implemented") + } +} +``` + +--- + +## Estimated Development Effort + +### TypeScript Web Client + +| Component | Effort | Priority | +|-----------|--------|----------| +| HTTP client wrapper | 2-3 days | P0 | +| SSE event handling | 1-2 days | P0 | +| Session state management | 2-3 days | P0 | +| Message rendering | 3-5 days | P0 | +| Subagent monitoring | 2-3 days | P1 | +| Permission dialogs | 1-2 days | P1 | +| File diff viewer | 3-5 days | P1 | +| Cost dashboard | 1-2 days | P2 | +| Session sharing | 1 day | P2 | + +**Total: 2-4 weeks** for full-featured client + +--- + +### Python SDK + +| Component | Effort | Priority | +|-----------|--------|----------| +| HTTP client | 2-3 days | P0 | +| Async streaming | 2-3 days | P0 | +| Type definitions | 1-2 days | P0 | +| Event handling | 1-2 days | P0 | +| Documentation | 2-3 days | P1 | + +**Total: 1-2 weeks** for SDK + +--- + +## Recommendations + +### For Web Client + +1. Use React/Vue/Svelte with reactive state +2. Implement SSE event batching for performance +3. Use virtual scrolling for message lists +4. Consider Monaco editor for code blocks + +### For CLI Client + +1. Use Ink (React for CLI) or Bubble Tea (Go) +2. Implement local caching +3. Support pipe/redirect for automation +4. Consider TUI framework like Ratatui (Rust) + +### For IDE Plugin + +1. Use direct module import for performance +2. Integrate with IDE's existing event loop +3. 
Leverage IDE's UI components +4. Support multiple concurrent sessions + +--- + +## Conclusion + +Building a new OpenCode client is highly feasible due to: + +1. **Complete HTTP API** - All operations exposed via REST +2. **Real-time Events** - SSE provides live updates +3. **Well-Defined Schemas** - Zod schemas can generate types +4. **Clear Architecture** - Parent-child session model is straightforward +5. **Flexible Permission System** - Async permission handling + +**Recommended starting point:** +1. Implement basic session/message CRUD +2. Add SSE event streaming +3. Build subagent monitoring +4. Add permission handling +5. Enhance with file diffs, costs, sharing + +The modular API design ensures any client can achieve feature parity with the existing TUI while potentially adding new capabilities like web UIs, mobile apps, or IDE integrations. diff --git a/docs/SUBAGENT_API_REFERENCE.md b/docs/SUBAGENT_API_REFERENCE.md new file mode 100644 index 00000000000..1ac698f32f8 --- /dev/null +++ b/docs/SUBAGENT_API_REFERENCE.md @@ -0,0 +1,1226 @@ +# OpenCode Subagent & Task Management API Reference + +This document provides a comprehensive reference for all client and server APIs related to subagents and task management in OpenCode. + +## Table of Contents + +1. [Existing Clients & SDK](#existing-clients--sdk) +2. [Architecture Overview](#architecture-overview) +3. [Server-Side APIs](#server-side-apis) +4. [Client-Side APIs](#client-side-apis) +5. [Event System](#event-system) +6. 
[New Client Implementation Guide](#new-client-implementation-guide) + +--- + +## Existing Clients & SDK + +OpenCode already provides multiple client implementations and a generated SDK: + +### Packages Overview + +| Package | Type | Description | +|---------|------|-------------| +| `packages/opencode` | Core | Main OpenCode server and TUI client | +| `packages/desktop` | Web Client | SolidJS web client for browser/Electron | +| `packages/sdk/js` | SDK | Generated TypeScript SDK from OpenAPI | +| `packages/ui` | Components | Shared UI component library | +| `packages/console` | Console | Management console web app | +| `packages/tauri` | Desktop | Tauri-based desktop application | +| `packages/enterprise` | Enterprise | Enterprise features | +| `sdks/vscode` | IDE | VS Code extension | + +### Generated SDK (`@opencode-ai/sdk`) + +The SDK is auto-generated from OpenAPI specs using `@hey-api/openapi-ts`: + +```typescript +import { createOpencodeClient } from "@opencode-ai/sdk/client" + +const client = createOpencodeClient({ + baseUrl: "http://localhost:4096", + directory: "/path/to/project", +}) + +// All methods are typed and available +const sessions = await client.session.list() +const session = await client.session.create() +const messages = await client.session.messages({ path: { id: session.id } }) +``` + +**SDK Classes:** +- `Global` - Global events +- `Project` - Project management +- `Config` - Configuration +- `Tool` - Tool management +- `Instance` - Instance control +- `Path` - Path utilities +- `Session` - Session CRUD and messaging +- `Command` - Commands +- `Provider` - Model providers +- `Find` - Search functionality +- `File` - File operations +- `App` - App info and agents +- `Mcp` - MCP server management +- `Lsp` - LSP status +- `Formatter` - Formatter status +- `Tui` - TUI control +- `Auth` - Authentication +- `Event` - Event subscription + +### Desktop Web Client (`packages/desktop`) + +A full SolidJS web application with: + +- **Session 
management** - Create, list, navigate sessions +- **Message rendering** - Real-time streaming messages with `` +- **File browser** - Open, view, and edit files +- **Diff review** - Side-by-side and unified diff views +- **Drag-and-drop tabs** - Reorderable file tabs +- **Progress tracking** - Context usage and token counts +- **Keyboard shortcuts** - Vim-style navigation + +```typescript +// Desktop client uses the SDK +import { createOpencodeClient } from "@opencode-ai/sdk/client" +import { useSDK, SDKProvider } from "./context/sdk" + +// Context provides SDK to all components +const { client, event } = useSDK() + +// Make API calls +const session = await client.session.create() +await client.session.prompt({ + path: { id: session.id }, + body: { parts: [{ type: "text", text: "Hello" }] } +}) +``` + +--- + +## Architecture Overview + +OpenCode uses a parent-child session architecture for subagent management: + +``` +Parent Session (sessionID: "session_abc") +│ +├─ User Message +├─ Assistant Response +│ └─ Task Tool Invocation +│ ├─ Child Session 1 (parentID: "session_abc") +│ ├─ Child Session 2 (parentID: "session_abc") +│ └─ Child Session 3 (parentID: "session_abc") +│ +└─ Results aggregated back to parent +``` + +### Key Concepts + +- **Session**: Container for a conversation with messages and parts +- **Message**: User or assistant turn in a session +- **Part**: Individual content blocks (text, tool calls, reasoning, etc.) +- **Agent**: Configuration for AI behavior (primary, subagent, or all modes) +- **Task Tool**: Mechanism for spawning child sessions + +--- + +## Server-Side APIs + +### Session Management + +#### Session.create() + +Creates a new session, optionally as a child of another session. 
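For clients that do not use the SDK, the call can be sketched as a plain HTTP request. The endpoint (`POST /session`) and the optional `parentID` and `title` fields are the ones given in this reference; the helper name `buildCreateSessionRequest`, the `HttpRequest` shape, and the base URL are assumptions made for illustration.

```typescript
// Hypothetical helper: not part of OpenCode or its SDK.
interface CreateSessionInput {
  parentID?: string // set to make the new session a child of an existing one
  title?: string
}

interface HttpRequest {
  url: string
  method: "POST"
  headers: Record<string, string>
  body: string
}

function buildCreateSessionRequest(
  baseUrl: string,
  input: CreateSessionInput = {},
): HttpRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once
    url: `${baseUrl.replace(/\/+$/, "")}/session`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // JSON.stringify omits undefined fields, so an empty input sends "{}"
    body: JSON.stringify(input),
  }
}

// Assumed local server address; adjust to wherever the server is listening.
const createRequest = buildCreateSessionRequest("http://localhost:4096", {
  parentID: "session_abc123",
  title: "Child task",
})
console.log(createRequest.url, createRequest.body)
```

The resulting object can be handed to `fetch` as `fetch(createRequest.url, { method: createRequest.method, headers: createRequest.headers, body: createRequest.body })`. Omitting `parentID` creates a top-level session instead of a child.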
+ +**Location:** `packages/opencode/src/session/index.ts:122-136` + +```typescript +const create = fn( + z.object({ + parentID: Identifier.schema("session").optional(), + title: z.string().optional(), + }).optional(), + async (input) => Session.Info +) +``` + +**HTTP Endpoint:** `POST /session` + +**Request Body:** +```json +{ + "parentID": "session_abc123", // Optional: parent for child sessions + "title": "My Session" // Optional: custom title +} +``` + +**Response:** `Session.Info` + +--- + +#### Session.get() + +Retrieves a session by ID. + +**Location:** `packages/opencode/src/session/index.ts:210-213` + +```typescript +const get = fn(Identifier.schema("session"), async (id) => Session.Info) +``` + +**HTTP Endpoint:** `GET /session/:id` + +--- + +#### Session.list() + +Lists all sessions in the current project. + +**Location:** `packages/opencode/src/session/index.ts:303-308` + +```typescript +async function* list(): AsyncGenerator +``` + +**HTTP Endpoint:** `GET /session` + +--- + +#### Session.update() + +Updates session properties. + +**Location:** `packages/opencode/src/session/index.ts:270-280` + +```typescript +async function update( + id: string, + editor: (session: Info) => void +): Promise +``` + +**HTTP Endpoint:** `PATCH /session/:id` + +--- + +#### Session.remove() + +Deletes a session and all its children. + +**Location:** `packages/opencode/src/session/index.ts:321-342` + +```typescript +const remove = fn(Identifier.schema("session"), async (sessionID) => void) +``` + +**HTTP Endpoint:** `DELETE /session/:id` + +--- + +#### Session.fork() + +Creates a new session by copying messages up to a point. 
+ +**Location:** `packages/opencode/src/session/index.ts:138-167` + +```typescript +const fork = fn( + z.object({ + sessionID: Identifier.schema("session"), + messageID: Identifier.schema("message").optional(), + }), + async (input) => Session.Info +) +``` + +**HTTP Endpoint:** `POST /session/:id/fork` + +--- + +#### Session.children() + +Gets all child sessions of a parent. + +**HTTP Endpoint:** `GET /session/:id/children` + +--- + +#### Session.messages() + +Retrieves messages for a session. + +**Location:** `packages/opencode/src/session/index.ts:287-301` + +```typescript +const messages = fn( + z.object({ + sessionID: Identifier.schema("session"), + limit: z.number().optional(), + }), + async (input) => MessageV2.WithParts[] +) +``` + +**HTTP Endpoint:** `GET /session/:id/message?limit=` + +--- + +### Session.Info Schema + +```typescript +const Info = z.object({ + id: Identifier.schema("session"), + projectID: z.string(), + directory: z.string(), + parentID: Identifier.schema("session").optional(), + summary: z.object({ + additions: z.number(), + deletions: z.number(), + files: z.number(), + diffs: Snapshot.FileDiff.array().optional(), + }).optional(), + share: z.object({ url: z.string() }).optional(), + title: z.string(), + version: z.string(), + time: z.object({ + created: z.number(), + updated: z.number(), + compacting: z.number().optional(), + }), + revert: z.object({ + messageID: z.string(), + partID: z.string().optional(), + snapshot: z.string().optional(), + diff: z.string().optional(), + }).optional(), +}) +``` + +--- + +### Prompt Execution + +#### SessionPrompt.prompt() + +Creates a user message and starts the execution loop. 
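As a rough illustration of what a client sends to start this loop, the sketch below assembles a request for `POST /session/:id/message` with a single text part. The field names (`parts`, `agent`, `model`) follow the `PromptInput` schema in this reference; `buildPromptRequest`, the trimmed-down `PromptBody` type, and the base URL are hypothetical, and only text parts are modelled.

```typescript
// Hypothetical types and helper, modelled on a subset of PromptInput.
type TextPart = { type: "text"; text: string }

interface PromptBody {
  parts: TextPart[]
  agent?: string
  model?: { providerID: string; modelID: string }
}

function buildPromptRequest(baseUrl: string, sessionID: string, body: PromptBody) {
  return {
    // The session ID travels in the path, not in the JSON body
    url: `${baseUrl}/session/${encodeURIComponent(sessionID)}/message`,
    method: "POST" as const,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  }
}

// Assumed local server address and session ID, for illustration only.
const promptRequest = buildPromptRequest("http://localhost:4096", "session_abc123", {
  parts: [{ type: "text", text: "Hello" }],
})
console.log(promptRequest.url)
```

Because the endpoint streams its response, a caller would typically read the response body incrementally rather than awaiting a single JSON document.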
+ +**Location:** `packages/opencode/src/session/prompt.ts:193-205` + +```typescript +const PromptInput = z.object({ + sessionID: Identifier.schema("session"), + messageID: Identifier.schema("message").optional(), + model: z.object({ + providerID: z.string(), + modelID: z.string(), + }).optional(), + agent: z.string().optional(), + noReply: z.boolean().optional(), + system: z.string().optional(), + tools: z.record(z.string(), z.boolean()).optional(), + parts: z.array(TextPart | FilePart | AgentPart | SubtaskPart), +}) + +const prompt = fn(PromptInput, async (input) => MessageV2.WithParts) +``` + +**HTTP Endpoint:** `POST /session/:id/message` (streams JSON) + +--- + +#### SessionPrompt.loop() + +Main execution loop for processing agent responses. + +**Location:** `packages/opencode/src/session/prompt.ts:232-612` + +```typescript +const loop = fn(Identifier.schema("session"), async (sessionID) => MessageV2.WithParts) +``` + +**Execution Flow:** +1. Fetch last user & assistant messages +2. Check for pending subtasks/compaction +3. Resolve system prompts & tools +4. Stream text from LLM +5. Process tool calls +6. Handle errors and retries +7. Continue until completion + +--- + +#### SessionPrompt.command() + +Executes a slash command. + +**Location:** `packages/opencode/src/session/prompt.ts:1292-1396` + +```typescript +const CommandInput = z.object({ + messageID: Identifier.schema("message").optional(), + sessionID: Identifier.schema("session"), + agent: z.string().optional(), + model: z.string().optional(), + arguments: z.string(), + command: z.string(), +}) + +async function command(input: CommandInput): Promise +``` + +**HTTP Endpoint:** `POST /session/:id/command` + +--- + +#### SessionPrompt.shell() + +Executes a shell command and records output. 
+ +**Location:** `packages/opencode/src/session/prompt.ts:1106-1290` + +```typescript +const ShellInput = z.object({ + sessionID: Identifier.schema("session"), + agent: z.string(), + model: z.object({ + providerID: z.string(), + modelID: z.string(), + }).optional(), + command: z.string(), +}) + +async function shell(input: ShellInput): Promise +``` + +**HTTP Endpoint:** `POST /session/:id/shell` + +--- + +### Task Tool API + +The Task tool enables spawning subagent sessions. + +**Location:** `packages/opencode/src/tool/task.ts:13-115` + +#### Parameters + +```typescript +z.object({ + description: z.string(), // Short task description (3-5 words) + prompt: z.string(), // Full task prompt + subagent_type: z.string(), // Agent name (e.g., "general") + session_id: z.string().optional(), // Continue existing session +}) +``` + +#### Return Value + +```typescript +{ + title: string, + metadata: { + summary: ToolPart[], + sessionId: string, + }, + output: string, +} +``` + +#### Execution Flow + +1. Get subagent configuration by type +2. Create child session (or reuse existing) +3. Execute `SessionPrompt.prompt()` in child session +4. Monitor tool execution via Bus subscription +5. 
Return output with task metadata + +--- + +### Agent APIs + +#### Agent.get() + +**Location:** `packages/opencode/src/agent/agent.ts:182-184` + +```typescript +async function get(agent: string): Promise +``` + +--- + +#### Agent.list() + +**Location:** `packages/opencode/src/agent/agent.ts:186-188` + +```typescript +async function list(): Promise +``` + +**HTTP Endpoint:** `GET /agent` + +--- + +#### Agent.Info Schema + +```typescript +const Info = z.object({ + name: z.string(), + description: z.string().optional(), + mode: z.enum(["subagent", "primary", "all"]), + builtIn: z.boolean(), + topP: z.number().optional(), + temperature: z.number().optional(), + color: z.string().optional(), + permission: z.object({ + edit: Config.Permission, + bash: z.record(z.string(), Config.Permission), + webfetch: Config.Permission.optional(), + doom_loop: Config.Permission.optional(), + external_directory: Config.Permission.optional(), + }), + model: z.object({ + modelID: z.string(), + providerID: z.string(), + }).optional(), + prompt: z.string().optional(), + tools: z.record(z.string(), z.boolean()), + options: z.record(z.string(), z.any()), +}) +``` + +**Agent Modes:** +- `primary` - User-selectable, initiates conversations +- `subagent` - Called by other agents for subtasks +- `all` - Can function as both + +--- + +### Message APIs + +#### MessageV2.Info Schema + +```typescript +// User message +const User = Base.extend({ + role: z.literal("user"), + time: z.object({ created: z.number() }), + agent: z.string(), + model: z.object({ providerID: z.string(), modelID: z.string() }), +}) + +// Assistant message +const Assistant = Base.extend({ + role: z.literal("assistant"), + time: z.object({ created: z.number(), completed: z.number().optional() }), + error: z.discriminatedUnion("name", [...]).optional(), + parentID: z.string(), + modelID: z.string(), + providerID: z.string(), + mode: z.string(), + path: z.object({ cwd: z.string(), root: z.string() }), + cost: z.number(), + tokens: 
z.object({ + input: z.number(), + output: z.number(), + reasoning: z.number(), + cache: z.object({ read: z.number(), write: z.number() }), + }), +}) +``` + +--- + +#### Message Part Types + +| Type | Description | Key Fields | +|------|-------------|------------| +| `TextPart` | Plain text output | `text`, `synthetic` | +| `ReasoningPart` | Extended thinking | `text`, `time` | +| `FilePart` | File references | `filename`, `mime` | +| `ToolPart` | Tool invocations | `tool`, `state`, `callID` | +| `SnapshotPart` | Filesystem snapshots | `snapshot` | +| `PatchPart` | Diff patches | `hash`, `files` | +| `SubtaskPart` | Subtask references | `prompt`, `agent` | +| `StepStartPart` | Step markers | `snapshot` | +| `StepFinishPart` | Step completion | `cost`, `tokens` | + +--- + +### HTTP Endpoints Summary + +#### Session Endpoints + +| Method | Path | Operation | +|--------|------|-----------| +| POST | `/session` | Create session | +| GET | `/session` | List sessions | +| GET | `/session/:id` | Get session | +| PATCH | `/session/:id` | Update session | +| DELETE | `/session/:id` | Delete session | +| GET | `/session/:id/children` | Get children | +| POST | `/session/:id/fork` | Fork session | +| POST | `/session/:id/share` | Share session | +| POST | `/session/:id/abort` | Abort execution | + +#### Message Endpoints + +| Method | Path | Operation | +|--------|------|-----------| +| GET | `/session/:id/message` | List messages | +| GET | `/session/:id/message/:msgID` | Get message | +| POST | `/session/:id/message` | Create & execute | +| POST | `/session/:id/command` | Execute command | +| POST | `/session/:id/shell` | Execute shell | +| POST | `/session/:id/revert` | Revert message | + +#### Event Endpoints + +| Method | Path | Operation | +|--------|------|-----------| +| GET | `/event` | Subscribe to events (SSE) | +| GET | `/global/event` | Global events (SSE) | +| GET | `/session/status` | Session status | + +--- + +## Client-Side APIs + +### Bus/Event System + +The 
Bus system provides typed pub/sub messaging. + +**Location:** `packages/opencode/src/bus/index.ts` + +#### Bus.event() + +Define a typed event. + +```typescript +function event( + type: Type, + properties: Properties +): EventDefinition +``` + +**Example:** +```typescript +const Created = Bus.event("session.created", z.object({ info: Session.Info })) +``` + +--- + +#### Bus.publish() + +Broadcast an event to all subscribers. + +```typescript +async function publish( + def: Definition, + properties: z.output +): Promise +``` + +**Example:** +```typescript +await Bus.publish(Session.Event.Created, { info: newSession }) +``` + +--- + +#### Bus.subscribe() + +Listen for specific events. + +```typescript +function subscribe( + def: Definition, + callback: (event: EventPayload) => void +): () => void // Returns unsubscribe function +``` + +**Example:** +```typescript +const unsubscribe = Bus.subscribe(Session.Event.Created, (event) => { + console.log("Session created:", event.properties.info.id) +}) +``` + +--- + +#### Bus.once() + +One-time event listener. + +```typescript +function once( + def: Definition, + callback: (event: EventPayload) => "done" | undefined +): void +``` + +--- + +#### Bus.subscribeAll() + +Listen to all events (wildcard). 
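The publish/subscribe behavior described in this section, including the wildcard listener, can be approximated with a small in-memory table of callbacks. This is a simplified sketch, not the actual `Bus` implementation: it drops the Zod-typed event definitions, and the reserved `"*"` key for wildcard listeners is an assumption made for illustration.

```typescript
// Hypothetical, simplified bus. The real Bus uses typed event definitions.
type BusEvent = { type: string; properties: unknown }
type Listener = (event: BusEvent) => void

// Event type -> registered callbacks. "*" is reserved for wildcard listeners (assumption).
const listeners = new Map<string, Set<Listener>>()

// Register a callback for one event type and return an unsubscribe function.
function subscribe(type: string, callback: Listener): () => void {
  const set = listeners.get(type) ?? new Set<Listener>()
  set.add(callback)
  listeners.set(type, set)
  return () => {
    set.delete(callback)
  }
}

// Wildcard subscription reuses the same table under the reserved key.
function subscribeAll(callback: Listener): () => void {
  return subscribe("*", callback)
}

// Deliver an event to listeners of its own type, then to wildcard listeners.
function publish(type: string, properties: unknown): void {
  const event: BusEvent = { type, properties }
  for (const key of [type, "*"]) {
    listeners.get(key)?.forEach((fn) => fn(event))
  }
}
```

Returning the unsubscribe function from `subscribe` keeps listener cleanup local to the caller, which matches the `() => void` return type shown in the signatures in this section.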
+ +```typescript +function subscribeAll(callback: (event: any) => void): () => void +``` + +--- + +### Defined Events + +#### Session Events + +```typescript +const Event = { + Created: Bus.event("session.created", z.object({ info: Info })), + Updated: Bus.event("session.updated", z.object({ info: Info })), + Deleted: Bus.event("session.deleted", z.object({ info: Info })), + Diff: Bus.event("session.diff", z.object({ + sessionID: z.string(), + diff: Snapshot.FileDiff.array(), + })), + Error: Bus.event("session.error", z.object({ + sessionID: z.string().optional(), + error: MessageV2.Assistant.shape.error, + })), +} +``` + +#### Message Events + +```typescript +const Event = { + Updated: Bus.event("message.updated", z.object({ info: Info })), + Removed: Bus.event("message.removed", z.object({ + sessionID: z.string(), + messageID: z.string(), + })), + PartUpdated: Bus.event("message.part.updated", z.object({ + part: Part, + delta: z.string().optional(), + })), + PartRemoved: Bus.event("message.part.removed", z.object({ + sessionID: z.string(), + messageID: z.string(), + partID: z.string(), + })), +} +``` + +--- + +### Storage API + +File-based JSON storage system. + +**Location:** `packages/opencode/src/storage/storage.ts` + +#### Storage.read() + +```typescript +async function read(key: string[]): Promise +``` + +**Example:** +```typescript +const session = await Storage.read(["session", projectID, sessionID]) +``` + +--- + +#### Storage.write() + +```typescript +async function write(key: string[], content: T): Promise +``` + +--- + +#### Storage.update() + +Atomic read-modify-write. + +```typescript +async function update( + key: string[], + fn: (draft: T) => void +): Promise +``` + +--- + +#### Storage.list() + +List records by prefix. + +```typescript +async function list(prefix: string[]): Promise +``` + +**Example:** +```typescript +const sessions = await Storage.list(["session", projectID]) +// Returns: [["session", "proj_abc", "sess_123"], ...] 
+``` + +--- + +### Provider API + +Model and provider management. + +**Location:** `packages/opencode/src/provider/provider.ts` + +#### Provider.getModel() + +```typescript +async function getModel( + providerID: string, + modelID: string +): Promise<{ + modelID: string + providerID: string + info: ModelsDev.Model + language: LanguageModel + npm?: string +}> +``` + +--- + +#### Provider.list() + +```typescript +async function list(): Promise<{ + [providerID: string]: { + source: Source + info: ModelsDev.Provider + options: Record + } +}> +``` + +--- + +#### Provider.defaultModel() + +```typescript +async function defaultModel(): Promise<{ + providerID: string + modelID: string +}> +``` + +--- + +### Worker/RPC API + +For multi-process communication. + +**Location:** `packages/opencode/src/util/rpc.ts` + +#### Rpc.listen() + +Server-side RPC handler (in worker). + +```typescript +function listen(rpc: Definition): void +``` + +**Example:** +```typescript +Rpc.listen({ + async server(input: { port: number }) { + return { url: `http://localhost:${input.port}` } + }, +}) +``` + +--- + +#### Rpc.client() + +Client-side RPC caller (main thread). + +```typescript +function client(target: Worker): { + call( + method: Method, + input: Parameters[0] + ): Promise> +} +``` + +**Example:** +```typescript +const client = Rpc.client(worker) +const result = await client.call("server", { port: 3000 }) +``` + +--- + +### State Management Contexts + +TUI state management using Solid.js contexts. + +#### useSDK() + +SDK client and event subscription. + +**Location:** `packages/opencode/src/cli/cmd/tui/context/sdk.tsx` + +```typescript +const { client, event } = useSDK() +// client: OpencodeClient - HTTP client for API calls +// event: EventEmitter - Batched event emissions +``` + +--- + +#### useSync() + +Global state synchronization. 
+ +**Location:** `packages/opencode/src/cli/cmd/tui/context/sync.tsx` + +```typescript +const sync = useSync() + +// Access data +sync.data.session // Session[] +sync.data.message // { [sessionID]: Message[] } +sync.data.part // { [messageID]: Part[] } +sync.data.agent // Agent[] +sync.data.provider // Provider[] +sync.data.permission // { [sessionID]: Permission[] } + +// Session utilities +sync.session.get(id) // Get session by ID +sync.session.status(id) // "idle" | "working" | "compacting" +await sync.session.sync(id) // Fetch messages for session + +// Bootstrap +await sync.bootstrap() // Load initial data +``` + +--- + +#### useLocal() + +Local preferences (model, agent). + +**Location:** `packages/opencode/src/cli/cmd/tui/context/local.tsx` + +```typescript +const local = useLocal() + +// Model management +local.model.current() // Current model +local.model.set(model) // Set model +local.model.cycle(1) // Cycle to next model + +// Agent management +local.agent.current() // Current agent +local.agent.set(name) // Set agent +local.agent.list() // Available agents +``` + +--- + +#### useRoute() + +Navigation state. + +**Location:** `packages/opencode/src/cli/cmd/tui/context/route.tsx` + +```typescript +const route = useRoute() + +route.data // Current route +route.navigate({ type: "session", sessionID: "..." 
}) +``` + +--- + +## Event System + +### Event Flow + +``` +Tool Execution / State Change + ↓ + Bus.publish() + ↓ + GlobalBus.emit() → Other processes + ↓ + Local subscribers + ↓ + SSE to HTTP clients +``` + +### Subscribing via HTTP (SSE) + +```typescript +const eventSource = new EventSource("/event") +eventSource.onmessage = (e) => { + const event = JSON.parse(e.data) + switch (event.type) { + case "session.created": + handleSessionCreated(event.properties.info) + break + case "message.part.updated": + handlePartUpdated(event.properties.part) + break + } +} +``` + +### Event Types for Subagent Monitoring + +| Event | Description | Payload | +|-------|-------------|---------| +| `session.created` | Child session created | `{ info: Session.Info }` | +| `message.updated` | Message state changed | `{ info: MessageV2.Info }` | +| `message.part.updated` | Part updated (streaming) | `{ part: Part, delta?: string }` | +| `session.diff` | File changes | `{ sessionID, diff: FileDiff[] }` | +| `session.error` | Error occurred | `{ sessionID?, error }` | + +--- + +## New Client Implementation Guide + +### Minimum Required APIs + +To build a new client with full subagent support, implement these core integrations: + +#### 1. Session Management + +```typescript +interface SessionClient { + create(input?: { parentID?: string; title?: string }): Promise + get(id: string): Promise + list(): Promise + children(id: string): Promise + remove(id: string): Promise +} +``` + +#### 2. Message Execution + +```typescript +interface MessageClient { + prompt(input: { + sessionID: string + parts: Part[] + agent?: string + model?: { providerID: string; modelID: string } + }): Promise + + messages(sessionID: string, limit?: number): Promise + abort(sessionID: string): Promise +} +``` + +#### 3. Event Subscription + +```typescript +interface EventClient { + subscribe(callback: (event: BusEvent) => void): () => void + + // Or via SSE + connect(): EventSource +} +``` + +#### 4. 
Agent Configuration + +```typescript +interface AgentClient { + list(): Promise<Agent.Info[]> + get(name: string): Promise<Agent.Info> +} +``` + +### Implementation Example + +```typescript +class OpencodeClient { + private baseUrl: string + private eventSource?: EventSource + + constructor(baseUrl: string) { + this.baseUrl = baseUrl + } + + // Session APIs + async createSession(parentID?: string): Promise<Session.Info> { + const res = await fetch(`${this.baseUrl}/session`, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify({ parentID }), + }) + return res.json() + } + + async getSession(id: string): Promise<Session.Info> { + const res = await fetch(`${this.baseUrl}/session/${id}`) + return res.json() + } + + async listSessions(): Promise<Session.Info[]> { + const res = await fetch(`${this.baseUrl}/session`) + return res.json() + } + + async getChildren(sessionID: string): Promise<Session.Info[]> { + const res = await fetch(`${this.baseUrl}/session/${sessionID}/children`) + return res.json() + } + + // Message APIs + async prompt(input: { + sessionID: string + parts: Part[] + agent?: string + }): Promise<MessageV2.WithParts> { + const res = await fetch(`${this.baseUrl}/session/${input.sessionID}/message`, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify(input), + }) + return res.json() + } + + async getMessages(sessionID: string): Promise<MessageV2.WithParts[]> { + const res = await fetch(`${this.baseUrl}/session/${sessionID}/message`) + return res.json() + } + + // Event subscription + subscribeToEvents(callback: (event: any) => void): () => void { + this.eventSource = new EventSource(`${this.baseUrl}/event`) + + this.eventSource.onmessage = (e) => { + callback(JSON.parse(e.data)) + } + + return () => { + this.eventSource?.close() + } + } + + // Agent APIs + async listAgents(): Promise<Agent.Info[]> { + const res = await fetch(`${this.baseUrl}/agent`) + return res.json() + } +} +``` + +### Subagent Monitoring + +To monitor subagent execution in real-time: + +```typescript +class SubagentMonitor { + private client: 
OpencodeClient + private parentSessionID: string + + constructor(client: OpencodeClient, parentSessionID: string) { + this.client = client + this.parentSessionID = parentSessionID + } + + async watchSubagents(callback: (event: SubagentEvent) => void): Promise<() => void> { + const children = new Set() + + // Get existing children + const existing = await this.client.getChildren(this.parentSessionID) + existing.forEach(s => children.add(s.id)) + + // Subscribe to events + return this.client.subscribeToEvents((event) => { + switch (event.type) { + case "session.created": + if (event.properties.info.parentID === this.parentSessionID) { + children.add(event.properties.info.id) + callback({ + type: "child_created", + session: event.properties.info, + }) + } + break + + case "message.part.updated": + if (children.has(event.properties.part.sessionID)) { + callback({ + type: "child_progress", + sessionID: event.properties.part.sessionID, + part: event.properties.part, + }) + } + break + + case "session.error": + if (children.has(event.properties.sessionID)) { + callback({ + type: "child_error", + sessionID: event.properties.sessionID, + error: event.properties.error, + }) + } + break + } + }) + } +} +``` + +### Key Considerations for New Clients + +1. **Streaming Support**: Handle streaming responses for real-time output +2. **Event Batching**: Batch rapid events to avoid UI thrashing +3. **Session Tree Navigation**: Support parent-child relationships +4. **Permission Handling**: Respond to permission requests via `/session/:id/permissions/:permissionID` +5. **Error Recovery**: Handle network errors, retries, and reconnection +6. 
**Cost Tracking**: Aggregate costs across parent and child sessions + +### Feature Matrix + +| Feature | API Required | Complexity | +|---------|-------------|------------| +| Basic sessions | Session CRUD | Low | +| Message execution | POST /message | Medium | +| Real-time updates | SSE /event | Medium | +| Subagent spawning | Task tool | High | +| Permission handling | Permission endpoints | Medium | +| File diffs | Session.Diff events | Medium | +| Cost tracking | Message tokens | Low | +| Session sharing | Share endpoints | Low | + +--- + +## Conclusion + +OpenCode provides a comprehensive API surface for building clients with full subagent support: + +- **15+ HTTP endpoints** for session and message management +- **10+ event types** for real-time monitoring +- **Typed schemas** with Zod validation +- **Parent-child session** architecture +- **Flexible agent configuration** + +A new client can leverage these APIs to implement: +- Multi-session management +- Real-time streaming output +- Subagent progress monitoring +- Cost aggregation +- File change tracking +- Custom UI experiences + +The modular architecture makes it straightforward to implement clients in any language or framework that supports HTTP and Server-Sent Events. From 4050746c0979e541e99d9e490ce116e04cda2b59 Mon Sep 17 00:00:00 2001 From: Claude Date: Mon, 24 Nov 2025 20:51:10 +0000 Subject: [PATCH 08/58] docs: add comprehensive todo and task tools documentation Comprehensive analysis of OpenCode's todo and task tool systems including: - TodoWrite and TodoRead tool definitions and data models - Task tool for subagent spawning - Internal storage design and event bus architecture - Usage guidelines and prompts from todowrite.txt, todoread.txt, task.txt - System integration points and UI rendering - Data flow diagrams and common patterns - File references with line numbers This documentation provides a complete reference for understanding how OpenCode implements task management and agent delegation. 
--- docs/todo-and-task-tools-full.md | 376 +++++++++++++++++++++++++++++++ 1 file changed, 376 insertions(+) create mode 100644 docs/todo-and-task-tools-full.md diff --git a/docs/todo-and-task-tools-full.md b/docs/todo-and-task-tools-full.md new file mode 100644 index 00000000000..ceb78221060 --- /dev/null +++ b/docs/todo-and-task-tools-full.md @@ -0,0 +1,376 @@ +# OpenCode Todo and Task Tools - Comprehensive Documentation + +## Overview + +OpenCode provides two primary tool systems for task management and delegation: + +1. **Todo Tools** (`TodoWrite` and `TodoRead`) - For tracking and managing tasks within a session +2. **Task Tool** - For launching autonomous subagents to handle complex multi-step tasks + +--- + +## Table of Contents + +- [Todo Tools](#todo-tools) + - [Data Model](#data-model) + - [Tool Definitions](#tool-definitions) + - [Usage Guidelines](#usage-guidelines) + - [Storage Design](#storage-design) +- [Task Tool](#task-tool) + - [Definition & Architecture](#definition--architecture) + - [Usage Guidelines](#task-usage-guidelines) +- [System Integration](#system-integration) +- [File References](#file-references) + +--- + +## Todo Tools + +### Data Model + +**File**: `packages/opencode/src/session/todo.ts:6-14` + +```typescript +export const Info = z.object({ + content: z.string().describe("Brief description of the task"), + status: z.string().describe("pending, in_progress, completed, cancelled"), + priority: z.string().describe("high, medium, low"), + id: z.string().describe("Unique identifier"), +}) +``` + +### Tool Definitions + +#### TodoWriteTool (`packages/opencode/src/tool/todo.ts:6-24`) + +- **Parameters**: `todos` array with content, status, priority, id +- **Returns**: Count of incomplete todos, JSON output, metadata +- **Side Effects**: Persists to storage, publishes bus event + +#### TodoReadTool (`packages/opencode/src/tool/todo.ts:26-39`) + +- **Parameters**: None +- **Returns**: Current todo list +- **Side Effects**: None (read-only) 
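As a rough illustration of the two tool contracts above, the following sketch models TodoWrite as a whole-list replacement that reports how many todos remain, and TodoRead as a side-effect-free lookup. The `store` map, `writeTodos`, and `readTodos` are hypothetical stand-ins (the real tools persist through `Storage` and publish a bus event), and treating every status other than `completed` as incomplete is an assumption, not something the source confirms.

```typescript
// Shape mirrors the Info schema in the Data Model section.
type Todo = { id: string; content: string; status: string; priority: string }

// Hypothetical in-memory stand-in for per-session todo storage.
const store = new Map<string, Todo[]>()

// Whole-list replacement: the caller always passes the complete list for the session.
function writeTodos(sessionID: string, todos: Todo[]): { title: string; output: string } {
  store.set(sessionID, todos)
  // Assumption: anything not marked "completed" counts as incomplete.
  const remaining = todos.filter((t) => t.status !== "completed").length
  return { title: `${remaining} todos`, output: JSON.stringify(todos, null, 2) }
}

// Read-only lookup that falls back to an empty list for unknown sessions.
function readTodos(sessionID: string): Todo[] {
  return store.get(sessionID) ?? []
}
```

Because each write replaces the full list, a caller that wants to mark one item complete has to resend every other item unchanged.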
+ +### Usage Guidelines + +**Source**: `packages/opencode/src/tool/todowrite.txt` + +#### ✅ When to Use + +1. Complex multi-step tasks (3+ steps) +2. Non-trivial tasks requiring planning +3. User explicitly requests it +4. Multiple tasks provided by user +5. After receiving new instructions +6. After completing tasks (mark complete) +7. When starting work (mark in_progress) + +#### ❌ When NOT to Use + +1. Single straightforward task +2. Trivial task +3. <3 trivial steps +4. Purely conversational + +#### Task Management Rules + +1. **Status Tracking**: Update real-time, mark complete immediately +2. **Single Focus**: Only ONE task in_progress at a time +3. **Sequential Work**: Complete current before starting new + +### Storage Design + +**File**: `packages/opencode/src/session/todo.ts:26-35` + +```typescript +export async function update(input: { sessionID: string; todos: Info[] }) { + await Storage.write(["todo", input.sessionID], input.todos) + Bus.publish(Event.Updated, input) +} + +export async function get(sessionID: string) { + return Storage.read(["todo", sessionID]) + .then((x) => x || []) + .catch(() => []) +} +``` + +**Storage Location**: `~/.opencode/storage/todo/{sessionID}.json` + +--- + +## Task Tool + +### Definition & Architecture + +**File**: `packages/opencode/src/tool/task.ts` + +The Task tool spawns autonomous subagents for complex multi-step tasks. + +#### Key Implementation Details + +1. **Session Creation** (lines 38-42): + - Creates child session with parentID + - Title includes subagent name + - Can resume existing sessions via session_id parameter + +2. **Tool Restrictions** (lines 88-92): + ```typescript + tools: { + todowrite: false, // Prevent recursive nesting + todoread: false, + task: false, + ...agent.tools, + } + ``` + +3. **Progress Tracking** (lines 55-67): + - Subscribes to MessageV2.Event.PartUpdated + - Tracks tool calls in subagent + - Updates metadata with summary + +4. 
**Cancellation Support** (lines 74-78): + - Respects abort signals + - Cleans up listeners + +#### Parameters + +```typescript +{ + description: string // Short (3-5 words) description + prompt: string // Detailed task instructions + subagent_type: string // Agent type to use + session_id?: string // Optional: resume existing +} +``` + +#### Returns + +```typescript +{ + title: string // Task description + output: string // Agent response + metadata + metadata: { + summary: ToolPart[] // All tool calls + sessionId: string // Child session ID + } +} +``` + +### Task Usage Guidelines + +**Source**: `packages/opencode/src/tool/task.txt` + +#### When to Use + +- Execute custom slash commands +- Complex multi-step autonomous tasks matching agent descriptions + +#### When NOT to Use + +- Reading specific file paths (use Read/Glob) +- Searching for specific class definitions (use Glob) +- Searching within 2-3 specific files (use Read) +- Tasks not matching agent descriptions + +#### Best Practices + +1. **Concurrency**: Launch multiple agents in parallel when possible +2. **Detailed Prompts**: Provide highly detailed task descriptions +3. **Specify Intent**: Clearly state if agent should write code or just research +4. **Trust Results**: Agent outputs should generally be trusted +5. **User Communication**: Summarize results for user (agent output not visible to them) + +--- + +## System Integration + +### Event Bus Architecture + +**File**: `packages/opencode/src/session/todo.ts:16-24` + +```typescript +export const Event = { + Updated: Bus.event("todo.updated", z.object({ + sessionID: z.string(), + todos: z.array(Info), + })), +} +``` + +- Publishes on every TodoWrite +- Enables real-time UI updates +- Session-scoped events + +### Tool Registry + +**File**: `packages/opencode/src/tool/registry.ts` + +Tools are registered centrally and made available to all agents unless explicitly disabled. 
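One detail of the tool restriction map shown under Key Implementation Details is worth spelling out: in a JavaScript object literal, later keys win, so spreading `agent.tools` after the `false` defaults lets an agent's own configuration override them. The sketch below demonstrates that language behavior with hypothetical helper names; whether OpenCode relies on or guards against this override is not stated in the source.

```typescript
type ToolMap = Record<string, boolean>

// Same key order as the snippet in the Task tool section: defaults first, agent config spread last.
function resolveSubagentTools(agentTools: ToolMap): ToolMap {
  return { todowrite: false, todoread: false, task: false, ...agentTools }
}

// Hypothetical stricter variant: restrictions come last, so agent config cannot re-enable them.
function resolveSubagentToolsStrict(agentTools: ToolMap): ToolMap {
  return { ...agentTools, todowrite: false, todoread: false, task: false }
}

// An agent configuration that explicitly enables the task tool.
const configured: ToolMap = { bash: true, task: true }

const lenient = resolveSubagentTools(configured) // task stays enabled
const strict = resolveSubagentToolsStrict(configured) // task is forced off
```

If the restriction is meant to be unconditional, the second ordering is the one that enforces it; the first treats the three `false` values as defaults only.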
+ +### UI Rendering + +**File**: `packages/opencode/src/cli/cmd/tui/routes/session/index.tsx:1596-1622` + +```tsx + + {(todo) => ( + + [{todo.status === "completed" ? "✓" : " "}] {todo.content} + + )} + +``` + +Visual indicators: +- ✓ for completed +- Green color for in_progress +- Muted color for pending + +--- + +## File References + +### Core Files + +| File | Purpose | Lines | +|------|---------|-------| +| `packages/opencode/src/tool/todo.ts` | Tool definitions | 40 | +| `packages/opencode/src/session/todo.ts` | Data model & storage | 37 | +| `packages/opencode/src/tool/task.ts` | Task tool definition | 116 | +| `packages/opencode/src/storage/storage.ts` | File-based storage | 227 | + +### Prompt Files + +| File | Purpose | Size | +|------|---------|------| +| `packages/opencode/src/tool/todowrite.txt` | TodoWrite usage guidelines | 8,846 bytes | +| `packages/opencode/src/tool/todoread.txt` | TodoRead usage guidelines | 977 bytes | +| `packages/opencode/src/tool/task.txt` | Task tool guidelines | 3,506 bytes | + +### System Prompts + +| File | Todo Instructions | +|------|-------------------| +| `packages/opencode/src/session/prompt/anthropic.txt` | ✓ Full instructions | +| `packages/opencode/src/session/prompt/anthropic-20250930.txt` | ✓ Enhanced version | +| `packages/opencode/src/session/prompt/polaris.txt` | ✓ Similar instructions | + +--- + +## Data Flow Diagram + +``` +┌─────────────┐ +│ User Input │ +└──────┬──────┘ + │ + ▼ +┌─────────────────┐ +│ TodoWrite/Read │ +│ Tool Execution │ +└──────┬──────────┘ + │ + ├──────────────┐ + ▼ ▼ +┌──────────────┐ ┌──────────────┐ +│ Todo.update()│ │ Todo.get() │ +│ Todo.get() │ │ │ +└──────┬───────┘ └──────┬───────┘ + │ │ + ▼ ▼ +┌──────────────────────────────┐ +│ Storage.write/read() │ +│ ~/.opencode/storage/todo/ │ +│ {sessionID}.json │ +└──────┬───────────────────────┘ + │ + ▼ +┌──────────────────┐ +│ Bus.publish() │ +│ Event.Updated │ +└──────┬───────────┘ + │ + ▼ +┌──────────────────────┐ +│ Tool Returns │ +│ { 
title, output, │ +│ metadata: todos } │ +└──────┬───────────────┘ + │ + ▼ +┌──────────────────────┐ +│ TUI Renders │ +│ with checkmarks ✓ │ +│ and color coding │ +└──────────────────────┘ +``` + +--- + +## Key Design Decisions + +### 1. Session-Scoped Storage +- Each session has independent todo list +- Stored at `~/.opencode/storage/todo/{sessionID}.json` +- Enables parallel sessions without conflicts + +### 2. Complete List Replacement +- TodoWrite replaces entire list (not incremental updates) +- Simplifies consistency and reduces edge cases +- Agent is responsible for managing complete state + +### 3. Task Tool Restrictions +- Subagents cannot use todowrite, todoread, or task tools +- Prevents recursive nesting and complexity +- Forces clear separation of concerns + +### 4. Event-Driven UI Updates +- Bus events enable real-time synchronization +- TUI subscribes to Event.Updated +- No polling required + +### 5. Single In-Progress Rule +- Only one task should be in_progress at a time +- Enforces sequential completion +- Prevents context-switching confusion + +--- + +## Common Patterns + +### Pattern 1: Multi-Step Task +```typescript +// 1. User provides complex request +// 2. Agent creates todo list with TodoWrite +// 3. Agent marks first task in_progress +// 4. Agent completes first task +// 5. Agent marks first completed, second in_progress +// 6. Repeat until all complete +``` + +### Pattern 2: Task Delegation +```typescript +// 1. Main agent identifies complex subtask +// 2. Launches Task tool with specific subagent +// 3. Subagent works independently (no nested todos/tasks) +// 4. Results returned to main agent +// 5. Main agent continues with results +``` + +### Pattern 3: Progress Checking +```typescript +// 1. Agent uses TodoRead at conversation start +// 2. Reviews pending/in_progress items +// 3. Continues where left off +// 4. 
Marks items completed as work progresses +``` + +--- + +*Generated from OpenCode source code analysis* +*Last updated: 2025-11-24* From 38b3009e9b33cc8516bef83f90f6e202aa7f4917 Mon Sep 17 00:00:00 2001 From: Claude Date: Mon, 24 Nov 2025 20:54:12 +0000 Subject: [PATCH 09/58] docs: add comprehensive OpenCode architecture whitepaper Create a detailed 14-section whitepaper synthesizing all architectural analysis into a cohesive document covering: 1. System Overview - Architecture style, technology stack, components 2. Core Architecture - Instance model, HTTP API, message flow 3. Session Management - Lifecycle, sequential processing, multi-client 4. MCP Server Integration - Configuration, lifecycle, tool registration 5. LSP Integration - 19 language servers, selection algorithm, usage 6. System Prompt Construction - Assembly pipeline, model-specific prompts 7. Event System - Bus architecture, event flow, client subscription 8. Storage Layer - File-based JSON storage, lock implementation 9. Concurrency Control - Multi-layer locking, race prevention 10. Multi-Server Considerations - Statefulness analysis, deployment options 11. Security Model - Permission system, MCP/LSP security 12. Performance Characteristics - Bottlenecks, optimizations, scalability 13. Design Decisions - Language-agnostic prompts, file storage, locking 14. Future Considerations - Enhancement opportunities, evolution phases Includes comprehensive diagrams, decision rationales, trade-off analysis, and complete reference appendices for files, events, and configuration. 
--- .../opencode/docs/architecture-whitepaper.md | 1068 +++++++++++++++++ 1 file changed, 1068 insertions(+) create mode 100644 packages/opencode/docs/architecture-whitepaper.md diff --git a/packages/opencode/docs/architecture-whitepaper.md b/packages/opencode/docs/architecture-whitepaper.md new file mode 100644 index 00000000000..ca6f730441c --- /dev/null +++ b/packages/opencode/docs/architecture-whitepaper.md @@ -0,0 +1,1068 @@ +# OpenCode Architecture Whitepaper + +**Version**: 1.0 +**Date**: November 2025 +**Status**: Technical Analysis + +--- + +## Executive Summary + +OpenCode is an AI-powered coding assistant that integrates Language Server Protocol (LSP) capabilities, Model Context Protocol (MCP) servers, and large language models to provide code assistance. This whitepaper analyzes OpenCode's architecture, design decisions, and operational characteristics. + +**Key Characteristics**: +- **Stateful architecture** requiring session affinity +- **Event-driven** real-time updates with SSE +- **LSP integration** with 19+ language servers +- **MCP support** for extensible tool integration +- **File-based storage** with in-memory locking +- **Multi-client support** with sequential message processing + +--- + +## Table of Contents + +1. [System Overview](#1-system-overview) +2. [Core Architecture](#2-core-architecture) +3. [Session Management](#3-session-management) +4. [MCP Server Integration](#4-mcp-server-integration) +5. [LSP Integration](#5-lsp-integration) +6. [System Prompt Construction](#6-system-prompt-construction) +7. [Event System](#7-event-system) +8. [Storage Layer](#8-storage-layer) +9. [Concurrency Control](#9-concurrency-control) +10. [Multi-Server Considerations](#10-multi-server-considerations) +11. [Security Model](#11-security-model) +12. [Performance Characteristics](#12-performance-characteristics) +13. [Design Decisions](#13-design-decisions) +14. 
[Future Considerations](#14-future-considerations) + +--- + +## 1. System Overview + +### 1.1 Architecture Style + +OpenCode employs a **monolithic stateful architecture** with the following characteristics: + +- **Single-process execution** per project instance +- **File-based persistence** for session data +- **In-memory state management** for active sessions +- **Event-driven communication** via Server-Sent Events (SSE) +- **Plugin-based extensibility** via MCP and LSP + +### 1.2 Technology Stack + +| Component | Technology | +|-----------|------------| +| **Runtime** | Bun (JavaScript runtime) | +| **Transport** | HTTP/1.1 with SSE | +| **Storage** | JSON files (XDG base directories) | +| **Locking** | In-memory reader-writer locks | +| **LSP Communication** | JSON-RPC over stdio | +| **MCP Communication** | HTTP/SSE or stdio | +| **Event Bus** | In-memory pub/sub | + +### 1.3 Key Components + +``` +┌─────────────────────────────────────────────────────┐ +│ OpenCode Server │ +├─────────────────────────────────────────────────────┤ +│ ┌──────────────┐ ┌──────────────┐ ┌──────────┐ │ +│ │ Session │ │ LSP │ │ MCP │ │ +│ │ Management │ │ Integration │ │ Servers │ │ +│ └──────────────┘ └──────────────┘ └──────────┘ │ +│ ┌──────────────┐ ┌──────────────┐ ┌──────────┐ │ +│ │ Storage │ │ Event Bus │ │ Prompt │ │ +│ │ Layer │ │ (Pub/Sub) │ │ System │ │ +│ └──────────────┘ └──────────────┘ └──────────┘ │ +│ ┌──────────────┐ ┌──────────────┐ │ +│ │ Locking │ │ Tool │ │ +│ │ Mechanism │ │ Registry │ │ +│ └──────────────┘ └──────────────┘ │ +└─────────────────────────────────────────────────────┘ + │ │ │ + ▼ ▼ ▼ + ┌────────┐ ┌─────────┐ ┌──────────┐ + │ File │ │ LSP │ │ MCP │ + │ System │ │ Servers │ │ Servers │ + └────────┘ └─────────┘ └──────────┘ +``` + +--- + +## 2. 
Core Architecture + +### 2.1 Project Instance Model + +**File**: `packages/opencode/src/project/instance.ts` + +OpenCode uses a **per-directory instance model**: + +- Each working directory has its own `Instance` +- Instance maintains isolated state via `Instance.state()` +- State is scoped by initialization function (singleton per init) +- Cleanup via `Instance.dispose()` on process exit + +**State Hierarchy**: +``` +Instance (per directory) +├── SessionPrompt state (session locks, callbacks) +├── MCP state (clients, status) +├── LSP state (servers, broken tracking) +├── Bus state (subscriptions) +└── Storage state (directory path) +``` + +### 2.2 HTTP API Surface + +**File**: `packages/opencode/src/server/server.ts` + +| Endpoint | Method | Purpose | +|----------|--------|---------| +| `/session` | GET | List sessions | +| `/session/:id` | GET | Get session info | +| `/session/:id/message` | GET | Get messages (paginated) | +| `/session/:id/message` | POST | Send message (streaming) | +| `/session/:id/diff` | GET | Get session diffs | +| `/session/:id/todo` | GET | Get session todos | +| `/event` | GET | SSE event stream | +| `/global/event` | GET | Global SSE stream | +| `/mcp` | GET | MCP server status | +| `/mcp` | POST | Add MCP server | + +### 2.3 Message Flow + +``` +User Request + ↓ +POST /session/:id/message + ↓ +SessionPrompt.prompt() + ↓ +┌─────────────────────┐ +│ Lock Acquisition │ (start() function) +│ - Check busy state │ +│ - Queue if busy │ +└─────────────────────┘ + ↓ +┌─────────────────────┐ +│ Prompt Construction │ +│ - System prompt │ +│ - Tool resolution │ +│ - Message history │ +└─────────────────────┘ + ↓ +┌─────────────────────┐ +│ LLM API Call │ +│ - Stream response │ +│ - Handle tool calls │ +└─────────────────────┘ + ↓ +┌─────────────────────┐ +│ Storage & Events │ +│ - Write to disk │ +│ - Publish events │ +│ - Resolve callbacks │ +└─────────────────────┘ + ↓ +Response to Client +``` + +--- + +## 3. 
Session Management + +### 3.1 Session Lifecycle + +**File**: `packages/opencode/src/session/index.ts` + +**Phases**: +1. **Creation**: `Session.create()` → writes JSON to storage +2. **Active**: Messages processed via `SessionPrompt.prompt()` +3. **Idle**: No active processing, can receive new messages +4. **Archived**: Historical data retained + +**Storage Structure**: +``` +~/.local/share/opencode/storage/ +├── session/ +│ └── {projectID}/ +│ └── {sessionID}.json +├── message/ +│ └── {sessionID}/ +│ └── {messageID}.json +└── part/ + └── {messageID}/ + └── {partID}.json +``` + +### 3.2 Sequential Message Processing + +**File**: `packages/opencode/src/session/prompt.ts` (lines 207-238) + +**Key Mechanism**: Session-level lock with callback queue + +```typescript +// Per-session state; shape inferred from its usage in sections 9.2 and 9.3 +const state: Record<string, { + abort: AbortController + callbacks: { resolve: (value: unknown) => void; reject: (reason?: unknown) => void }[] +}> = {} + +function start(sessionID: string) { + if (state[sessionID]) return undefined // Already busy + const controller = new AbortController() + state[sessionID] = { abort: controller, callbacks: [] } + return controller.signal +} +``` + +**Behavior**: +- The first caller acquires the session lock +- Later callers are queued in `callbacks[]` +- When processing completes, all queued callbacks are resolved +- This guarantees sequential processing per session + +### 3.3 Multi-Client Support + +**Multiple connections allowed** via: +- Separate SSE connections per client +- Bus pub/sub broadcasts events to all connected clients +- Shared file storage for persistence + +**Historical data retrieval**: +- NO automatic replay on connect +- Clients must fetch history via REST APIs +- Pull-based for history, push-based for updates + +--- + +## 4. MCP Server Integration + +### 4.1 MCP Architecture + +**File**: `packages/opencode/src/mcp/index.ts` + +**MCP (Model Context Protocol)** lets external providers expose tools to the model. + +**Configuration Types**: + +```typescript +// Local subprocess +{ + type: "local", + command: ["npx", "mcp-server"], + environment: { ... 
}, + timeout: 5000 +} + +// Remote HTTP/SSE +{ + type: "remote", + url: "https://example.com/mcp", + headers: { "Authorization": "..." }, + timeout: 5000 +} +``` + +### 4.2 Connection Lifecycle + +**Initialization** (on first tool access): +``` +1. Load config from opencode.jsonc +2. For each MCP server: + a. Validate configuration + b. Create transport (HTTP/SSE/Stdio) + c. Create MCP client via @ai-sdk/mcp + d. Fetch tools with timeout + e. Store client + status +``` + +**Transport Selection**: + +| Server Type | Transport 1 | Transport 2 | +|-------------|-------------|-------------| +| **Remote** | StreamableHTTPClientTransport | SSEClientTransport (fallback) | +| **Local** | StdioClientTransport | - | + +### 4.3 Tool Registration + +**File**: `packages/opencode/src/session/prompt.ts` (lines 727-789) + +**Tool Naming**: `{sanitized_client_name}_{sanitized_tool_name}` + +**Integration Flow**: +``` +MCP.tools() + ↓ +For each client: + ↓ +client.tools() + ↓ +Sanitize names (replace non-alphanumeric) + ↓ +resolveTools() + ↓ +Wrap with plugin hooks + ↓ +Available to LLM +``` + +**Plugin Hooks**: +- `tool.execute.before` - Pre-execution hook +- `tool.execute.after` - Post-execution hook + +### 4.4 Error Handling + +**Status Tracking**: +```typescript +Status = + | { status: "connected" } + | { status: "disabled" } + | { status: "failed", error: string } +``` + +**Failure Modes**: +- Connection timeout (5s default) +- Tool fetch timeout (configurable) +- Transport failures (both transports tried) +- Subprocess spawn failures + +--- + +## 5. 
LSP Integration + +### 5.1 LSP Architecture + +**File**: `packages/opencode/src/lsp/index.ts` + +**Supported Features**: +- Diagnostics (errors, warnings) +- Hover information (type inspection) +- Workspace symbols (cross-file search) +- Document symbols (file outline) + +### 5.2 Language Server Matrix + +| Language | Server | Extensions | Auto-Install | +|----------|--------|------------|--------------| +| **TypeScript** | typescript-language-server | .ts, .tsx, .js, .jsx | Yes (npm) | +| **Go** | gopls | .go | Yes (`go install`) | +| **Python** | pyright | .py, .pyi | Yes (npm) | +| **Rust** | rust-analyzer | .rs | No (expects installed) | +| **C/C++** | clangd | .c, .cpp, .h, .hpp | Yes (GitHub) | +| **Java** | jdtls | .java | Yes (Eclipse) | +| **Ruby** | ruby-lsp | .rb | Yes (`gem install`) | + +**19 total language servers** supported. + +### 5.3 Server Selection Algorithm + +**File**: `packages/opencode/src/lsp/index.ts` (lines 156-240) + +``` +getClients(file) { + extension = extract_extension(file) + + for server in configured_servers: + if extension not in server.extensions: + continue + + root = server.root(file) // Project root detection + if not root: + continue + + if broken.has(root + server.id): + continue // Previously failed + + if cached_client exists: + return cached_client + + if spawn_inflight: + wait for spawn + else: + spawn new server + + return client +} +``` + +**Root Detection**: Searches up directory tree for: +- Go: `go.work`, `go.mod`, `go.sum` +- TypeScript: `package-lock.json`, lockfiles +- Rust: `Cargo.toml` (with workspace detection) +- Python: `pyproject.toml`, `requirements.txt` + +### 5.4 LSP Data Usage + +**In Edit Tool** (`packages/opencode/src/tool/edit.ts`): +```typescript +await LSP.touchFile(filePath, true) // Wait for diagnostics +const diagnostics = await LSP.diagnostics() +const errors = diagnostics.filter(d => d.severity === 1) +// Errors automatically shown to LLM +``` + +**In Prompt Generation** 
(`packages/opencode/src/session/prompt.ts`): +- Document symbols used for range refinement +- Workspace symbols for code navigation +- Range data for Read tool offset calculation + +### 5.5 Transport + +**JSON-RPC over stdio**: +```typescript +createMessageConnection( + new StreamMessageReader(process.stdout), + new StreamMessageWriter(process.stdin) +) +``` + +**Notification Handling**: +- `textDocument/publishDiagnostics` → tracked by file +- `window/workDoneProgress/create` → ignored +- `workspace/configuration` → returns init options + +--- + +## 6. System Prompt Construction + +### 6.1 Prompt Assembly Pipeline + +**File**: `packages/opencode/src/session/prompt.ts` (lines 621-641) + +``` +resolveSystemPrompt() { + messages = [] + + // Step 1: Provider header + messages.push(SystemPrompt.header(providerID)) + + // Step 2: Base prompt (priority order) + if (custom_system_override): + messages.push(custom_system) + else if (agent.prompt): + messages.push(agent.prompt) + else: + messages.push(SystemPrompt.provider(modelID)) + + // Step 3: Environment context + messages.push(SystemPrompt.environment()) + + // Step 4: Custom instructions + messages.push(SystemPrompt.custom()) + + // Optimization: Combine into 2 messages for caching + return [messages[0], messages.slice(1).join("\n")] +} +``` + +### 6.2 Model-Specific Prompts + +| Model | Header | Base Prompt | Focus | +|-------|--------|-------------|-------| +| **Claude** | "Claude Code" | anthropic.txt (106 lines) | TodoWrite, parallelism, code refs | +| **GPT-4/o1/o3** | None | beast.txt | Autonomous, research-heavy | +| **GPT-5** | None | codex.txt (319 lines) | Structured workflows | +| **Gemini** | None | gemini.txt (156 lines) | Gemini-specific | +| **Others** | None | qwen.txt | Concise (1-3 sentences) | + +### 6.3 Environment Context + +**File**: `packages/opencode/src/session/system.ts` (lines 36-59) + +**Variables Substituted**: +- `${Instance.directory}` → Working directory +- `${project.vcs}` → Git 
repository status +- `${process.platform}` → OS platform +- `${new Date().toDateString()}` → Current date +- File tree via Ripgrep (limit: 200 files) + +### 6.4 Custom Instructions + +**Search Order**: + +**Local** (project-specific): +1. `AGENTS.md` +2. `CLAUDE.md` +3. `CONTEXT.md` (deprecated) + +**Global** (user-level): +1. `~/.opencode/AGENTS.md` +2. `~/.claude/CLAUDE.md` + +**Format**: Each file prefixed with `"Instructions from: {path}\n{content}"` + +### 6.5 Anthropic Prompt Content + +**Key Sections** (from `anthropic.txt`): + +1. **Identity**: "OpenCode, the best coding agent on the planet" +2. **Tone & Style**: Concise, no emojis, markdown +3. **Professional Objectivity**: Facts over validation +4. **Task Management**: Heavy TodoWrite usage +5. **Tool Policy**: + - Parallel calls for independent operations + - Task tool for codebase exploration + - Specialized tools over bash +6. **Code References**: `file_path:line_number` format + +--- + +## 7. Event System + +### 7.1 Event Architecture + +**File**: `packages/opencode/src/bus/index.ts` + +**Components**: +- **Bus**: Local pub/sub within process +- **GlobalBus**: Cross-directory EventEmitter +- **SSE**: Server-Sent Events for clients + +### 7.2 Event Flow + +``` +Event Source (Session.updateMessage) + ↓ +Bus.publish(MessageV2.Event.Updated, { info }) + ↓ +┌────────────────────┬────────────────────┐ +│ │ │ +▼ ▼ ▼ +Local Subscribers GlobalBus.emit Store to subscriptions map + ↓ ↓ +Plugin Hooks Cross-directory broadcast + ↓ ↓ +Processing Other instances +``` + +### 7.3 Event Types + +**Session Events**: +- `session.created` +- `session.updated` +- `session.deleted` +- `session.diff` +- `session.error` + +**Message Events**: +- `message.updated` +- `message.removed` +- `message.part.updated` +- `message.part.removed` + +**LSP Events**: +- `lsp.updated` +- `lsp.client.diagnostics` + +### 7.4 Client Subscription + +**File**: `packages/opencode/src/server/server.ts` (lines 1973-1995) + +```typescript +GET 
/event → streamSSE(async (stream) => { + // Send connection ack + stream.writeSSE({ + data: JSON.stringify({ type: "server.connected" }) + }) + + // Subscribe to all events + const unsub = Bus.subscribeAll(async (event) => { + await stream.writeSSE({ data: JSON.stringify(event) }) + }) + + // Cleanup on disconnect + stream.onAbort(() => { + unsub() + }) +}) +``` + +**Key Behavior**: No historical replay, only future events. + +--- + +## 8. Storage Layer + +### 8.1 Storage Architecture + +**File**: `packages/opencode/src/storage/storage.ts` + +**Storage Location**: XDG base directories +- `~/.local/share/opencode/storage/` (Linux) +- `~/Library/Application Support/opencode/storage/` (macOS) + +**Format**: JSON files with hierarchical structure + +### 8.2 Storage Operations + +| Operation | Lock Type | Atomicity | +|-----------|-----------|-----------| +| `Storage.read()` | Read lock (shared) | Read-only | +| `Storage.update()` | Write lock (exclusive) | Read-modify-write | +| `Storage.write()` | Write lock (exclusive) | Atomic write | + +### 8.3 Lock Implementation + +**File**: `packages/opencode/src/util/lock.ts` + +**Reader-Writer Lock**: +```typescript +Lock = { + readers: number, + writer: boolean, + waitingReaders: (() => void)[], + waitingWriters: (() => void)[] +} +``` + +**Characteristics**: +- **Multiple concurrent readers** allowed +- **Single exclusive writer** (blocks all) +- **Writer priority** (prevents starvation) +- **In-memory only** (no cross-process protection) + +### 8.4 Critical Limitation + +**No distributed locking** - locks are process-local: +- Multiple server processes can corrupt data +- No file-level OS locks (`flock`/`fcntl`) +- No distributed coordination (Redis, etc.) + +--- + +## 9. 
Concurrency Control + +### 9.1 Concurrency Layers + +| Layer | Mechanism | Scope | Guarantees | +|-------|-----------|-------|-----------| +| **Session Message** | Single-threaded loop + callback queue | Per session | Sequential processing | +| **File I/O** | Reader-Writer Lock | Per file | Concurrent reads, exclusive writes | +| **Event Publishing** | Bus pub/sub + Promise.all | Global | Atomic notification | +| **State Storage** | Directory-scoped Instance.state | Per project | Singleton per init function | +| **HTTP Connections** | SSE streams + individual subscriptions | Per connection | Independent delivery | + +### 9.2 Race Condition Prevention + +**Session Level**: +```typescript +// Only one message processed at a time +if (state[sessionID]) { + // Queue this request + return new Promise((resolve, reject) => { + state[sessionID].callbacks.push({ resolve, reject }) + }) +} +``` + +**File Level**: +```typescript +using _ = await Lock.write(target) // Exclusive access +const content = await Bun.file(target).json() +fn(content) // Modify +await Bun.write(target, JSON.stringify(content)) +``` + +**Event Level**: +```typescript +const pending = subscribers.map(sub => sub(event)) +await Promise.all(pending) // Wait for all handlers +``` + +### 9.3 Cancellation + +**AbortController per session**: +```typescript +state[sessionID] = { + abort: new AbortController(), + callbacks: [] +} + +// User cancels +SessionPrompt.cancel(sessionID) +state[sessionID].abort.abort() // Propagates to LLM call +``` + +--- + +## 10. 
Multi-Server Considerations + +### 10.1 Statefulness Analysis + +**OpenCode is HIGHLY STATEFUL** due to: + +| Component | Storage | Cross-Process | +|-----------|---------|---------------| +| **Session locks** | In-memory Map | ❌ No | +| **File locks** | In-memory Map | ❌ No | +| **Callback queues** | In-memory Map | ❌ No | +| **Session status** | In-memory Map | ❌ No | +| **Bus subscriptions** | In-memory Map | ❌ No | +| **AbortControllers** | In-memory objects | ❌ No | +| **Session data** | File system | ✅ Yes | +| **Message data** | File system | ✅ Yes | + +### 10.2 Multi-Server Problems + +**Scenario: Two servers handle same session** + +| Time | Server A | Server B | Problem | +|------|----------|----------|---------| +| T0 | Acquires session lock | - | - | +| T1 | Processing message | Acquires session lock | **Concurrent processing** | +| T2 | Writes session.json | Writes session.json | **Last write wins** | +| T3 | - | Server A's changes lost | **Data corruption** | + +**Additional Issues**: +- Cancellation doesn't propagate +- Callback queues lost +- Events not distributed +- Diagnostics inconsistent + +### 10.3 Deployment Requirements + +**Option 1: Single Server** (Recommended) +- Simple, no coordination needed +- All guarantees preserved + +**Option 2: Session Affinity** +- Load balancer sticky sessions +- Cookie or IP-based routing +- Same guarantees within session + +**Option 3: Full Distribution** (Not Supported) +Would require: +- Distributed file locking (Redis, ZooKeeper) +- Shared state store (Redis, database) +- Global event pub/sub (NATS, Kafka) +- Session migration protocol + +--- + +## 11. Security Model + +### 11.1 Permission System + +**Agent-level permissions**: +```typescript +agent.permission = { + edit: boolean, // Edit files + bash: boolean, // Run bash commands + webfetch: boolean, // Fetch web content +} +``` + +**Tool inheritance**: MCP tools inherit agent permissions. 
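The permission flags above can be read as a gate in front of tool execution. The following is a minimal sketch under stated assumptions, not OpenCode's actual implementation: the `AgentPermission` type simply mirrors the three booleans listed in this section, and `isToolAllowed` and `allowedTools` are hypothetical helpers introduced for illustration. Passing tools without a matching flag through unchanged is also an assumption.

```typescript
// Hypothetical shape mirroring the three flags listed in section 11.1.
type AgentPermission = {
  edit: boolean
  bash: boolean
  webfetch: boolean
}

type GatedTool = keyof AgentPermission

// Hypothetical helper: true when the agent may run the given gated tool.
function isToolAllowed(permission: AgentPermission, tool: GatedTool): boolean {
  return permission[tool]
}

// Hypothetical helper: keep only the tools the agent is permitted to use.
// Assumption: tools without a matching flag are passed through unchanged.
function allowedTools(permission: AgentPermission, tools: string[]): string[] {
  return tools.filter((name) => {
    if (name === "edit" || name === "bash" || name === "webfetch") {
      return isToolAllowed(permission, name)
    }
    return true
  })
}

// A read-only agent: no file edits, no shell, web fetches allowed.
const readOnly: AgentPermission = { edit: false, bash: false, webfetch: true }
console.log(allowedTools(readOnly, ["read", "edit", "bash", "webfetch"]))
```

A gate like this would have to run before the plugin hooks in section 4.3 for the flags to be enforced on MCP tools as well. Whether OpenCode orders it that way is not confirmed by this analysis.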
+ +### 11.2 MCP Security + +**Limitations**: +- No built-in authentication +- Custom headers for auth (user-provided) +- Subprocess isolation for local servers +- No sandboxing beyond process boundaries + +**Warnings**: +- MCP servers run with full process privileges +- No capability-based security +- Trust model: User configures, system executes + +### 11.3 LSP Security + +**Isolation**: +- Language servers run as subprocesses +- Stderr suppressed (line 190 in lsp/client.ts) +- Environment variables controllable +- No network access restrictions + +### 11.4 File Access + +**No access control** beyond filesystem permissions: +- Read/Write tools access any file +- No chroot or jail +- No path traversal protection +- Relies on filesystem permissions + +--- + +## 12. Performance Characteristics + +### 12.1 Bottlenecks + +| Component | Bottleneck | Impact | +|-----------|------------|--------| +| **Session Processing** | Sequential per session | One message at a time | +| **File I/O** | Writer blocks all readers | Lock contention | +| **LSP Startup** | Server spawn + initialization | 1-5s delay | +| **MCP Tool Fetch** | 5s timeout default | Startup latency | +| **File Tree** | 200 file limit | Incomplete context | + +### 12.2 Optimizations + +**Prompt Caching**: +- System prompts limited to 2 messages +- First message: Header + base prompt +- Second message: Environment + custom instructions +- Enables provider-level caching + +**LSP Reuse**: +- Clients cached by (root, serverID) +- Inflight spawns deduplicated +- Broken servers tracked to avoid retry + +**MCP Reuse**: +- Clients cached in Instance.state +- Tools fetched once per server +- Cleanup on Instance.dispose() + +**Parallel Tool Calls**: +- LLM can invoke multiple tools simultaneously +- Independent operations execute in parallel +- Results aggregated before response + +### 12.3 Scalability + +**Vertical Scaling**: +- Single process per directory +- Concurrent sessions across directories +- Memory grows with 
active sessions + +**Horizontal Scaling**: +- Not supported (stateful architecture) +- Requires session affinity or refactoring +- See [Section 10.3](#103-deployment-requirements) + +--- + +## 13. Design Decisions + +### 13.1 Language-Agnostic Prompts + +**Decision**: No language-specific instructions in system prompts + +**Rationale**: +- Models have inherent language knowledge +- Project structure provides context +- Users can add custom instructions +- Reduces prompt complexity +- Enables universal workflows + +**Trade-off**: May miss language-specific best practices + +### 13.2 File-Based Storage + +**Decision**: JSON files instead of database + +**Benefits**: +- Simple deployment (no DB setup) +- Human-readable format +- Easy backup/sync +- Version control friendly + +**Trade-offs**: +- No ACID transactions +- No complex queries +- Manual indexing +- Lock limitations + +### 13.3 In-Memory Locking + +**Decision**: Process-local locks instead of OS locks + +**Rationale**: +- Simpler implementation +- Faster (no syscalls) +- Sufficient for single-server + +**Trade-off**: Prevents horizontal scaling + +### 13.4 Sequential Session Processing + +**Decision**: One message at a time per session + +**Rationale**: +- Prevents context confusion +- Simpler state management +- Natural conversation flow +- Easier error recovery + +**Trade-off**: Lower throughput per session + +### 13.5 No Historical Replay + +**Decision**: Pull-based history, push-based updates + +**Rationale**: +- No event storage overhead +- Clients control what they fetch +- Reduces server memory +- Simplifies event bus + +**Trade-off**: Requires explicit sync on connect + +### 13.6 Two-Message System Prompt + +**Decision**: Combine environment + custom into one message + +**Rationale**: +- Enables prompt caching +- Most providers cache by message prefix +- 2-message structure maximizes cache hits + +**Trade-off**: Less granular caching control + +--- + +## 14. 
Future Considerations + +### 14.1 Potential Enhancements + +**Horizontal Scaling**: +- Distributed locking (Redis/etcd) +- Shared state store +- Event streaming (Kafka/NATS) +- Session migration + +**Storage Improvements**: +- SQLite for indexed queries +- Compression for old sessions +- Configurable retention policies +- Backup/restore tools + +**Security Enhancements**: +- MCP authentication framework +- Capability-based security +- Path allowlist/denylist +- Audit logging + +**Performance Optimizations**: +- Incremental file tree updates +- Lazy LSP server loading +- Tool result streaming +- Parallel session processing + +### 14.2 Architectural Evolution + +**Phase 1: Current** (Single-server stateful) +- ✅ Simple deployment +- ✅ Strong consistency +- ❌ No horizontal scaling + +**Phase 2: Stateful with sticky sessions** +- ✅ Multiple servers +- ✅ Session affinity +- ⚠️ Requires load balancer + +**Phase 3: Fully distributed** +- ✅ True horizontal scaling +- ✅ High availability +- ❌ Significant complexity increase +- ❌ Eventual consistency challenges + +**Recommendation**: Phase 1 is sufficient for most deployments. + +--- + +## Conclusion + +OpenCode demonstrates a pragmatic architecture that prioritizes: + +1. **Simplicity**: File-based storage, in-memory state +2. **Correctness**: Sequential processing, explicit locking +3. **Extensibility**: MCP/LSP plugin systems +4. **Developer Experience**: Rich tooling, comprehensive prompts + +The stateful design trades horizontal scalability for implementation simplicity and operational correctness. For most deployment scenarios (individual developers, small teams), this is an appropriate trade-off. + +The architecture's main strength is its **comprehensive integration**: LSP for code intelligence, MCP for extensible tools, and model-specific prompts for model guidance. Together these give the assistant detailed project context. 
+ +**Key Takeaway**: OpenCode is designed for single-server deployments with session affinity, not for large-scale multi-tenancy. Its architecture excels at providing rich, context-aware assistance with strong consistency guarantees. + +--- + +## Appendix A: File Reference Index + +| Component | Primary Files | +|-----------|---------------| +| **Session Management** | `src/session/index.ts`, `src/session/prompt.ts` | +| **MCP Integration** | `src/mcp/index.ts`, `src/mcp/client.ts` | +| **LSP Integration** | `src/lsp/index.ts`, `src/lsp/client.ts`, `src/lsp/server.ts` | +| **Storage Layer** | `src/storage/storage.ts` | +| **Locking** | `src/util/lock.ts` | +| **Event System** | `src/bus/index.ts`, `src/bus/global.ts` | +| **System Prompt** | `src/session/system.ts`, `src/session/prompt/*.txt` | +| **HTTP Server** | `src/server/server.ts` | +| **Configuration** | `src/config/config.ts` | +| **Project Instance** | `src/project/instance.ts`, `src/project/state.ts` | + +--- + +## Appendix B: Event Type Reference + +```typescript +// Session Events +"session.created" → { info: Session.Info } +"session.updated" → { info: Session.Info } +"session.deleted" → { info: Session.Info } +"session.diff" → { sessionID, diff } +"session.error" → { sessionID, error } + +// Message Events +"message.updated" → { info: MessageV2.Info } +"message.removed" → { sessionID, messageID } +"message.part.updated" → { part, delta } +"message.part.removed" → { messageID, partID } + +// LSP Events +"lsp.updated" → {} +"lsp.client.diagnostics" → { serverID, path } + +// Server Events (SSE) +"server.connected" → {} +``` + +--- + +## Appendix C: Configuration Schema + +```jsonc +{ + // MCP Server Configuration + "mcp": { + "server-name": { + "type": "local" | "remote", + "command": ["cmd", "args"], // local only + "url": "https://...", // remote only + "headers": {}, // remote only + "environment": {}, // local only + "enabled": true, + "timeout": 5000 + } + }, + + // LSP Server Configuration + 
"lsp": { + "server-id": { + "disabled": false, + "command": ["lsp-server"], + "extensions": [".ext"], + "env": {}, + "initialization": {} + } + }, + + // Custom Instructions + "instructions": [ + "~/global-instructions.md", + "project-specific.md" + ] +} +``` + +--- + +**Document Version**: 1.0 +**Last Updated**: November 2024 +**Based on OpenCode**: Latest main branch analysis From 2ca1a3c65b467291e519a0f7894db5e9a35e96da Mon Sep 17 00:00:00 2001 From: Claude Date: Mon, 24 Nov 2025 21:25:18 +0000 Subject: [PATCH 10/58] docs: add comprehensive custom system prompt feature plan Add detailed feature plan for implementing custom system and initial prompt templates per session. This will enable users to create specialized agents (e.g., data analyst, security auditor) by providing custom prompt templates when starting a session. Key features: - Session-level custom prompt templates (persistent) - Support for file-based and inline prompts - Template resolution from project/global directories - Auto-detection of file vs inline prompts - Backward compatible with existing sessions - Comprehensive implementation plan with ~145 LOC The plan includes: - Current architecture analysis - Technical design and schema changes - Implementation roadmap (3 phases) - API changes and CLI integration - Security considerations - Testing strategy - Example templates for data analyst and security auditor Ready for review and implementation. 
--- feature/custom-system-prompt.md | 1109 +++++++++++++++++++++++++++++++ 1 file changed, 1109 insertions(+) create mode 100644 feature/custom-system-prompt.md diff --git a/feature/custom-system-prompt.md b/feature/custom-system-prompt.md new file mode 100644 index 00000000000..785176832c0 --- /dev/null +++ b/feature/custom-system-prompt.md @@ -0,0 +1,1109 @@ +# Feature Plan: Custom System and Initial Prompt Templates Per Session + +## Executive Summary + +This document outlines the design and implementation plan for enabling custom system and initial instruction prompts on a per-session basis. This feature will allow users to create specialized agents (e.g., data analyst, Python expert, security auditor) by providing custom prompt templates when starting a session. + +**Status:** Planning +**Priority:** Medium +**Complexity:** Medium +**Estimated Files to Modify:** 4-6 + +--- + +## Table of Contents + +1. [Current Architecture](#current-architecture) +2. [Problem Statement](#problem-statement) +3. [Proposed Solution](#proposed-solution) +4. [Technical Design](#technical-design) +5. [Implementation Plan](#implementation-plan) +6. [API Changes](#api-changes) +7. [Backward Compatibility](#backward-compatibility) +8. [Testing Strategy](#testing-strategy) +9. [Future Enhancements](#future-enhancements) + +--- + +## Current Architecture + +### System Prompt Loading Mechanism + +**Location:** `/packages/opencode/src/session/prompt.ts:621-641` + +The `resolveSystemPrompt()` function assembles system prompts in the following **priority order**: + +```typescript +async function resolveSystemPrompt(input: { + system?: string // 1. Per-request override (highest priority) + agent: Agent.Info // 2. 
Agent-specific prompt + providerID: string + modelID: string +}) { + let system = SystemPrompt.header(providerID) // Provider-specific header + + system.push( + ...(() => { + if (input.system) return [input.system] // Step 1: Custom override + if (input.agent.prompt) return [input.agent.prompt] // Step 2: Agent prompt + return SystemPrompt.provider(modelID) // Step 3: Model-specific default + })() + ) + + system.push(...(await SystemPrompt.environment())) // Step 4: Environment context + system.push(...(await SystemPrompt.custom())) // Step 5: Custom instructions + + // Optimization: Combine into 2 messages for prompt caching + const [first, ...rest] = system + system = [first, rest.join("\n")] + return system +} +``` + +### Prompt Template Files + +**Location:** `/packages/opencode/src/session/prompt/*.txt` + +| Template File | Model Target | Size | Purpose | +|--------------|--------------|------|---------| +| `anthropic.txt` | Claude | 8.2 KB | General coding assistant | +| `beast.txt` | GPT-4/o1/o3 | 11 KB | Autonomous problem-solving | +| `gemini.txt` | Gemini | 15 KB | Gemini-specific instructions | +| `codex.txt` | GPT-5 | 24 KB | Detailed workflows | +| `qwen.txt` | Other | 9.7 KB | Minimal prompt | +| `polaris.txt` | Polaris-alpha | 8.3 KB | Polaris-specific | + +**Selection Logic:** `/packages/opencode/src/session/system.ts:27-34` + +```typescript +export function provider(modelID: string) { + if (modelID.includes("gpt-5")) return [PROMPT_CODEX] + if (modelID.includes("gpt-") || modelID.includes("o1") || modelID.includes("o3")) return [PROMPT_BEAST] + if (modelID.includes("gemini-")) return [PROMPT_GEMINI] + if (modelID.includes("claude")) return [PROMPT_ANTHROPIC] + if (modelID.includes("polaris-alpha")) return [PROMPT_POLARIS] + return [PROMPT_ANTHROPIC_WITHOUT_TODO] // Default +} +``` + +### Session Schema + +**Location:** `/packages/opencode/src/session/index.ts:37-75` + +```typescript +export const Info = z.object({ + id: 
Identifier.schema("session"), + projectID: z.string(), + directory: z.string(), + parentID: Identifier.schema("session").optional(), + summary: z.object({...}).optional(), + share: z.object({...}).optional(), + title: z.string(), + version: z.string(), + time: z.object({...}), + revert: z.object({...}).optional(), +}) +``` + +### Session Creation Flow + +**API Endpoint:** `POST /session` +**Handler:** `/packages/opencode/src/server/server.ts:516-521` + +```typescript +validator("json", Session.create.schema.optional()), +async (c) => { + const body = c.req.valid("json") ?? {} + const session = await Session.create(body) // Currently accepts: {parentID?, title?} + return c.json(session) +} +``` + +**Session.create Function:** `/packages/opencode/src/session/index.ts:122-135` + +```typescript +export const create = fn( + z.object({ + parentID: Identifier.schema("session").optional(), + title: z.string().optional(), + }).optional(), + async (input) => { + return createNext({ + parentID: input?.parentID, + directory: Instance.directory, + title: input?.title, + }) + } +) +``` + +--- + +## Problem Statement + +### Current Limitations + +1. **No Persistent Session-Level Customization** + - The `system` parameter in `PromptInput` must be passed on **every message request** + - No way to set a custom prompt once during session creation and have it persist + - Cumbersome for multi-turn conversations with specialized agents + +2. **Agent Configs Are Global** + - Agent configurations in `~/.opencode/agent/*.md` are project/user-wide + - Cannot create ephemeral, one-off specialized sessions without modifying configs + - No way to experiment with different prompts without file system changes + +3. **Template Reusability** + - Users cannot easily create and reference reusable prompt templates + - No mechanism to version or share prompt templates across teams + +### Use Cases + +1. 
**Data Analyst Agent**
+   ```bash
+   # User wants to start a session with data analysis focus
+   opencode --prompt templates/data-analyst.txt
+   ```
+
+2. **Security Auditor**
+   ```bash
+   # Security-focused session for code review
+   opencode --prompt security-auditor
+   ```
+
+3. **Domain-Specific Agents**
+   ```bash
+   # Medical records processing (HIPAA-compliant)
+   # Financial analysis (SOX-compliant)
+   # Legal document review
+   ```
+
+4. **A/B Testing Prompts**
+   - Test different prompt variations without editing config files
+   - Compare agent behavior with different system prompts
+
+---
+
+## Proposed Solution
+
+### Design Principles
+
+1. **Persistent but Optional:** Custom prompts are stored in session metadata and fall back to existing behavior when absent
+2. **File-Based Templates:** Support loading prompts from files for reusability
+3. **Inline Prompts:** Support inline prompt strings for quick experiments
+4. **Backward Compatible:** Zero breaking changes to the existing API
+5. **Composable:** Custom prompts work with the existing environment/instruction system
+
+### Solution Overview
+
+Add **session-level custom prompt templates** that:
+- Are specified once during session creation
+- Persist in session metadata
+- Rank below agent prompts and above model-specific defaults
+- Support both file paths and inline strings
+
+### Priority Order (Updated)
+
+```
+1. Per-request `system` parameter (API override)
+2. Agent-specific `agent.prompt` (from agent config)
+3. ✨ NEW: Session-level `customPrompt` (from session metadata)
+4. Model-specific default (anthropic.txt, beast.txt, etc.)
+5. Environment context (git status, file tree, etc.)
+6. Custom instructions (AGENTS.md, CLAUDE.md, etc.)
+```
+
+Entries 1-4 are alternatives: the first one present is used. Entries 5 and 6 are always appended afterwards.
+
+---
+
+## Technical Design
+
+### 1. Schema Changes
+
+#### Session.Info Schema Extension
+
+**File:** `/packages/opencode/src/session/index.ts`
+
+```typescript
+export const Info = z.object({
+  id: Identifier.schema("session"),
+  projectID: z.string(),
+  directory: z.string(),
+  parentID: Identifier.schema("session").optional(),
+
+  // ✨ NEW: Custom prompt template
+  customPrompt: z.object({
+    type: z.enum(["file", "inline"]),
+    value: z.string(), // File path or inline prompt text
+    loadedAt: z.number().optional(), // Timestamp for cache invalidation
+  }).optional(),
+
+  summary: z.object({...}).optional(),
+  share: z.object({...}).optional(),
+  title: z.string(),
+  version: z.string(),
+  time: z.object({...}),
+  revert: z.object({...}).optional(),
+})
+```
+
+#### Session.create Schema Extension
+
+**File:** `/packages/opencode/src/session/index.ts`
+
+```typescript
+export const create = fn(
+  z.object({
+    parentID: Identifier.schema("session").optional(),
+    title: z.string().optional(),
+
+    // ✨ NEW: Custom prompt options
+    customPrompt: z.union([
+      z.string(), // Shorthand: file path or inline text (auto-detect)
+      z.object({
+        type: z.enum(["file", "inline"]),
+        value: z.string(),
+      }),
+    ]).optional(),
+  }).optional(),
+  async (input) => {
+    // Implementation details below...
+  }
+)
+```
+
+### 2. 
Prompt Loading Logic
+
+#### New Helper: `SystemPrompt.fromSession()`
+
+**File:** `/packages/opencode/src/session/system.ts`
+
+```typescript
+export async function fromSession(sessionID: string): Promise<string | null> {
+  const session = await Session.get(sessionID)
+  if (!session.customPrompt) return null
+
+  if (session.customPrompt.type === "inline") {
+    return session.customPrompt.value
+  }
+
+  if (session.customPrompt.type === "file") {
+    const filePath = await resolveTemplatePath(session.customPrompt.value)
+
+    // Cache check (optional optimization)
+    const fileStats = await Bun.file(filePath).stat()
+    if (session.customPrompt.loadedAt && fileStats.mtime.getTime() <= session.customPrompt.loadedAt) {
+      // File hasn't changed, could use cached version
+    }
+
+    const content = await Bun.file(filePath).text()
+    return content
+  }
+
+  return null
+}
+
+// Async because Bun.file(...).exists() returns a Promise
+async function resolveTemplatePath(value: string): Promise<string> {
+  // Priority order for file resolution:
+  // 1. Absolute path: /path/to/template.txt
+  // 2. Home directory: ~/templates/data-analyst.txt
+  // 3. Project .opencode/prompts/: template.txt → .opencode/prompts/template.txt
+  // 4. Global ~/.opencode/prompts/: template.txt → ~/.opencode/prompts/template.txt
+
+  if (path.isAbsolute(value)) return value
+  if (value.startsWith("~/")) return path.join(os.homedir(), value.slice(2))
+
+  // Check project-level prompts
+  const projectPrompt = path.join(Instance.directory, ".opencode", "prompts", value)
+  if (await Bun.file(projectPrompt).exists()) return projectPrompt
+
+  // Check global prompts
+  const globalPrompt = path.join(Global.Path.config, "prompts", value)
+  if (await Bun.file(globalPrompt).exists()) return globalPrompt
+
+  // Fallback: treat as relative to cwd
+  return path.resolve(Instance.directory, value)
+}
+```
+
+#### Updated `resolveSystemPrompt()`
+
+**File:** `/packages/opencode/src/session/prompt.ts`
+
+```typescript
+async function resolveSystemPrompt(input: {
+  system?: string
+  agent: Agent.Info
+  providerID: string
+  modelID: string
+  sessionID: string // ✨ NEW: Need session ID to load custom prompt
+}) {
+  let system = SystemPrompt.header(input.providerID)
+
+  system.push(
+    // The inner function must be async (and awaited) because step 3 awaits
+    ...(await (async () => {
+      if (input.system) return [input.system] // 1. Per-request override
+      if (input.agent.prompt) return [input.agent.prompt] // 2. Agent prompt
+
+      // ✨ NEW: 3. Session-level custom prompt
+      const sessionPrompt = await SystemPrompt.fromSession(input.sessionID)
+      if (sessionPrompt) return [sessionPrompt]
+
+      return SystemPrompt.provider(input.modelID) // 4. Model default
+    })())
+  )
+
+  system.push(...(await SystemPrompt.environment())) // 5. Environment
+  system.push(...(await SystemPrompt.custom())) // 6. Custom instructions
+
+  const [first, ...rest] = system
+  system = [first, rest.join("\n")]
+  return system
+}
+```
+
+**Note:** `sessionID` must be passed to `resolveSystemPrompt()`; it is already available in the calling context at line 495.
+
+### 3. 
Session Creation Logic
+
+#### Updated `createNext()`
+
+**File:** `/packages/opencode/src/session/index.ts`
+
+```typescript
+export async function createNext(input: {
+  id?: string
+  title?: string
+  parentID?: string
+  directory: string
+  customPrompt?: { // ✨ NEW
+    type: "file" | "inline"
+    value: string
+  }
+}) {
+  const result: Info = {
+    id: Identifier.descending("session", input.id),
+    version: Installation.VERSION,
+    projectID: Instance.project.id,
+    directory: input.directory,
+    parentID: input.parentID,
+    title: input.title ?? createDefaultTitle(!!input.parentID),
+
+    // ✨ NEW: Store custom prompt metadata
+    customPrompt: input.customPrompt ? {
+      type: input.customPrompt.type,
+      value: input.customPrompt.value,
+      loadedAt: Date.now(),
+    } : undefined,
+
+    time: {
+      created: Date.now(),
+      updated: Date.now(),
+    },
+  }
+
+  await Storage.write(["session", Instance.project.id, result.id], result)
+  // ... rest of existing logic
+  return result
+}
+```
+
+### 4. Auto-Detection Logic
+
+**File:** `/packages/opencode/src/session/index.ts`
+
+```typescript
+function parseCustomPromptInput(input: string | { type: string; value: string }) {
+  if (typeof input === "object") {
+    return input as { type: "file" | "inline"; value: string }
+  }
+
+  // Auto-detect: if it looks like a file path, treat as file
+  // Otherwise, treat as inline prompt
+  // Caveat: the last check classifies every single-line string as a file,
+  // so single-line inline prompts must use the explicit { type: "inline" } form
+
+  const isFilePath =
+    input.startsWith("/") || // Absolute path
+    input.startsWith("~/") || // Home directory
+    input.startsWith("./") || // Relative path
+    input.startsWith("../") || // Parent directory
+    input.endsWith(".txt") || // Common extension
+    input.endsWith(".md") ||
+    !input.includes("\n") // Single line = likely a path
+
+  return {
+    type: isFilePath ? "file" as const : "inline" as const,
+    value: input,
+  }
+}
+
+export const create = fn(
+  z.object({
+    parentID: Identifier.schema("session").optional(),
+    title: z.string().optional(),
+    customPrompt: z.union([
+      z.string(),
+      z.object({
+        type: z.enum(["file", "inline"]),
+        value: z.string(),
+      }),
+    ]).optional(),
+  }).optional(),
+  async (input) => {
+    const customPrompt = input?.customPrompt
+      ? parseCustomPromptInput(input.customPrompt)
+      : undefined
+
+    return createNext({
+      parentID: input?.parentID,
+      directory: Instance.directory,
+      title: input?.title,
+      customPrompt,
+    })
+  }
+)
+```
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Implementation (Priority: High)
+
+#### Task 1.1: Extend Session Schema
+**File:** `/packages/opencode/src/session/index.ts`
+
+- [ ] Add `customPrompt` field to `Session.Info` schema (lines 37-71)
+- [ ] Add `customPrompt` parameter to `Session.create` schema (lines 122-135)
+- [ ] Add `customPrompt` parameter to `createNext()` function (lines 175-208)
+- [ ] Implement `parseCustomPromptInput()` helper function
+- [ ] Update session storage to persist custom prompt metadata
+
+**Complexity:** Low
+**Risk:** Low (additive change, backward compatible)
+
+#### Task 1.2: Implement Prompt Loading
+**File:** `/packages/opencode/src/session/system.ts`
+
+- [ ] Add `fromSession()` function to load session-level prompts
+- [ ] Implement `resolveTemplatePath()` helper for file resolution
+- [ ] Add error handling for missing/invalid template files
+- [ ] Add logging for prompt loading (debugging)
+
+**Complexity:** Medium
+**Risk:** Medium (file I/O, path resolution edge cases)
+
+#### Task 1.3: Update Prompt Resolution
+**File:** `/packages/opencode/src/session/prompt.ts`
+
+- [ ] Pass `sessionID` to `resolveSystemPrompt()` function (line 621)
+- [ ] Call `SystemPrompt.fromSession()` in priority order (line 629-633)
+- [ ] Update all call sites of `resolveSystemPrompt()` to include sessionID
+- [ ] Verify prompt caching still 
works correctly
+
+**Complexity:** Low
+**Risk:** Low (small change to existing function)
+
+#### Task 1.4: API Validation
+**File:** `/packages/opencode/src/server/server.ts`
+
+- [ ] Verify OpenAPI schema includes new `customPrompt` field (line 516)
+- [ ] Test API endpoint with new parameter
+- [ ] Add validation for file path security (no directory traversal)
+
+**Complexity:** Low
+**Risk:** Medium (security validation important)
+
+### Phase 2: CLI Integration (Priority: Medium)
+
+#### Task 2.1: Add CLI Flag
+**File:** `/packages/opencode/src/cli/cmd/*.ts` (TBD - find CLI entry point)
+
+- [ ] Add `--prompt