Runtime selection
Second supports three builder runtimes:
| Runtime | Runtime ID | Model format | Parameter controls |
|---|---|---|---|
| Claude Code | claude-code | Claude model IDs such as claude-opus-4-8 | Effort and thinking |
| Codex CLI | codex-cli | OpenAI model IDs such as gpt-5.4 | Reasoning effort and Codex sandbox |
| OpenCode | opencode | OpenCode provider/model IDs such as openai/gpt-5.5 | Model variant |
The composer UI is driven by apps/web/src/lib/agent/runtime-registry.ts. It groups models by runtime and renders only the parameter controls supported by the selected runtime. The composer and chat transport send runtimeId, runtimeModel, and runtimeParams on every app creation, settings update, and chat POST.
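As a rough illustration of how a registry like this could drive the composer, the sketch below models the three runtimes from the table and looks up the parameter controls for a selected runtime. The type and field names (RuntimeEntry, controls, controlsFor) and the parameter values are assumptions made for illustration, not the actual contents of runtime-registry.ts.

```typescript
// Hypothetical registry shape; names are illustrative, not taken from runtime-registry.ts.
type RuntimeId = "claude-code" | "codex-cli" | "opencode";

interface RuntimeEntry {
  label: string;
  exampleModel: string; // runtime-native model ID format
  controls: string[]; // parameter controls the composer should render
}

const RUNTIMES: Record<RuntimeId, RuntimeEntry> = {
  "claude-code": { label: "Claude Code", exampleModel: "claude-opus-4-8", controls: ["effort", "thinking"] },
  "codex-cli": { label: "Codex CLI", exampleModel: "gpt-5.4", controls: ["reasoningEffort", "sandbox"] },
  "opencode": { label: "OpenCode", exampleModel: "openai/gpt-5.5", controls: ["variant"] },
};

// Return only the controls supported by the selected runtime.
function controlsFor(runtimeId: RuntimeId): string[] {
  return RUNTIMES[runtimeId].controls;
}

// The three fields sent on app creation, settings updates, and chat POSTs.
interface RuntimeSettings {
  runtimeId: RuntimeId;
  runtimeModel: string;
  runtimeParams: Record<string, string>;
}

// Example payload fragment; the parameter values are placeholders.
const exampleSettings: RuntimeSettings = {
  runtimeId: "codex-cli",
  runtimeModel: "gpt-5.4",
  runtimeParams: { reasoningEffort: "high", sandbox: "workspace-write" },
};
```

Keeping the control list next to each runtime means the composer never has to special-case a runtime by name; it renders whatever controlsFor returns.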
The local onboarding runtime choice is also saved as a browser preference so the app composer opens with the selected runtime instead of falling back to the project default.
How command runtimes work under the hood
Claude uses the Claude Agent SDK. Codex is launched through the Codex CLI app-server protocol over stdio, which is the same local Codex runtime surface used by the Codex SDK but without adding an extra SDK dependency in the worker. OpenCode is launched in non-interactive JSON mode. The worker normalizes all runtime output into the same Claude-shaped worker SSE events so the existing chat bridge and AI element cards continue to render streamed text, plans, terminal commands, file edits, app data tools, integration setup, and done_building.
OpenCode support requires an OpenCode CLI version whose opencode run --help includes --format json. Older OpenCode binaries are reported during onboarding as installed but not usable for the OpenCode runtime, and the worker returns a clear runtime error instead of starting a non-streamable plain-text run. OpenCode model discovery uses opencode models --verbose, filters to models whose metadata reports capabilities.toolcall: true, and exposes each model’s variants as the OpenCode intelligence control. The selected variant is passed to opencode run --variant; auto omits the flag and lets OpenCode choose the model default.
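The OpenCode checks described above can be sketched as three small pure functions. This is a minimal illustration under stated assumptions: the OpenCodeModel shape is a guess at what opencode models --verbose returns, and the help-text check only looks for the flag and value named in this document rather than a known help layout.

```typescript
// Assumed shape of one discovered model; the real verbose output may differ.
interface OpenCodeModel {
  id: string; // native provider/model ID, e.g. "openai/gpt-5.5"
  capabilities?: { toolcall?: boolean };
  variants?: string[];
}

// The runtime is usable only if the help text for `opencode run` mentions `--format json`.
// Hypothetical check: it only verifies that the flag and the json value both appear.
function supportsJsonFormat(helpText: string): boolean {
  return /--format\b/.test(helpText) && /\bjson\b/.test(helpText);
}

// Keep only models whose metadata reports tool-call support.
function usableModels(models: OpenCodeModel[]): OpenCodeModel[] {
  return models.filter((m) => m.capabilities?.toolcall === true);
}

// "auto" omits the flag so OpenCode chooses the model default.
function variantArgs(variant: string): string[] {
  return variant === "auto" ? [] : ["--variant", variant];
}
```

The result of variantArgs would be appended to the rest of the opencode run argument list, so the auto case needs no special handling at the call site.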
Claude Agent SDK
Understanding model selection requires understanding what query() does at the process level.
Every call spawns a new process
The Claude Agent SDK does not keep a long-running connection to the Anthropic API. Each query() call spawns a brand new CLI process:
- Sends POST https://api.anthropic.com/v1/messages with the specified model
- Claude responds with text and/or tool calls
- CLI executes tools locally (Read, Edit, Bash, etc.)
- CLI appends tool results and sends another API call
- Repeat until Claude responds with no tool calls
- Process exits
The spawned process is the claude CLI binary.
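The request/tool loop listed above can be sketched as follows. This is a simplified model, not the CLI's implementation: the message shapes are reduced to plain strings, and the model call and tool runner are injected (and synchronous, for brevity) so the control flow can be followed without network access.

```typescript
// Minimal stand-ins for the Messages API exchange; shapes are simplified for illustration.
interface ToolCall { name: string; input: string }
interface ModelReply { text: string; toolCalls: ToolCall[] }
interface Turn { role: "user" | "assistant" | "tool"; content: string }

// Injected so the loop can be exercised without a real API or CLI.
type CallModel = (model: string, history: Turn[]) => ModelReply;
type RunTool = (call: ToolCall) => string;

function runAgentLoop(model: string, prompt: string, callModel: CallModel, runTool: RunTool): Turn[] {
  const history: Turn[] = [{ role: "user", content: prompt }];
  for (;;) {
    // Each API call carries the full history so far.
    const reply = callModel(model, history);
    history.push({ role: "assistant", content: reply.text });
    // No tool calls means the turn is finished and the process can exit.
    if (reply.toolCalls.length === 0) return history;
    // Tools run locally; their results are appended before the next call.
    for (const call of reply.toolCalls) {
      history.push({ role: "tool", content: runTool(call) });
    }
  }
}
```

Because the history grows with every iteration, a tool-heavy turn makes several API calls, each larger than the last.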
Sessions are files on disk
The CLI writes every API request and response to a JSONL file. Each recorded response includes the model field and the full usage object. This file is the CLI’s own record of what happened; our code does not write it.
The API is stateless
Anthropic’s Messages API has no server-side sessions. Every API call includes the entire conversation history as the messages array. Resuming a session means re-sending all previous messages as input tokens.
Prompt caching mitigates this: system prompts, tool definitions, and early messages get cached at 0.1x the input price. In practice, most resumed conversations hit the cache heavily.
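To make the effect of caching concrete, here is a small worked calculation. It uses the Sonnet 4.6 input and cache-read rates from the pricing table in this document; the token counts and the 95% hit rate are made-up inputs, and cache-write charges are not modeled.

```typescript
// USD per million tokens, from the Sonnet 4.6 row of the pricing table.
const SONNET_INPUT_PER_MTOK = 3;
const SONNET_CACHE_READ_PER_MTOK = 0.3; // 0.1x the input price

// Input cost of one resumed call, split into cache hits and fresh tokens.
function resumedInputCostUsd(cachedTokens: number, freshTokens: number): number {
  return (cachedTokens * SONNET_CACHE_READ_PER_MTOK + freshTokens * SONNET_INPUT_PER_MTOK) / 1_000_000;
}

// Illustrative inputs: 100k tokens of history, uncached vs. 95% served from cache.
const uncachedCost = resumedInputCostUsd(0, 100_000); // 0.30 USD
const mostlyCachedCost = resumedInputCostUsd(95_000, 5_000); // about 0.0435 USD
```

In this example the cached resume costs roughly one seventh of the uncached one, which is why re-sending the full history is affordable in practice.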
Model selection and switching
Available models
The runtime registry includes Claude Code, Codex CLI, and OpenCode defaults. It stores runtime-native IDs, display names, descriptions, defaults, and parameter constraints. OpenCode also has a dynamic model picker that reads the installed OpenCode catalog/config at runtime. Dynamically discovered OpenCode models keep their native provider/model IDs instead of being collapsed back to the static defaults.
Claude pricing metadata is available for cost display:
| Display name | Model ID | Description | Input / MTok | Output / MTok | Cache read / MTok |
|---|---|---|---|---|---|
| Opus 4.8 | claude-opus-4-8 | Most capable for long-horizon agentic work | $5 | $25 | $0.50 |
| Opus 4.6 | claude-opus-4-6 | Previous Opus release, still available | $5 | $25 | $0.50 |
| Sonnet 4.6 | claude-sonnet-4-6 | Most efficient for everyday tasks | $3 | $15 | $0.30 |
| Haiku 4.5 | claude-haiku-4-5 | Fastest for quick answers | $1 | $5 | $0.10 |
Opus 4.8 supports xhigh effort, adaptive thinking, and summarized thinking display. Runtime defaults and model display names are defined in lib/agent/runtime-registry.ts.
Model-specific capabilities
Some features are only available on certain models:
| Feature | Available on | Fallback for other models |
|---|---|---|
| Effort: xhigh | Opus 4.8 | high |
| Effort: max | Opus 4.8, Opus 4.6, Sonnet 4.6 | high |
| Thinking: adaptive | Opus 4.8, Opus 4.6, Sonnet 4.6 | disabled |
| Thinking: enabled | Opus 4.6, Sonnet 4.6 | adaptive on Opus 4.8 |
Adaptive thinking should be paired with display: "summarized". Without that flag, Claude may spend thinking tokens but return empty thinking text to the UI.
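The capability table can be read as a pair of fallback functions. The sketch below encodes the table rows directly; the low and medium effort names, the function names, and the handling of cases the table does not cover are assumptions flagged in the comments.

```typescript
// "low" and "medium" are assumed effort names; the table only mentions high, xhigh, and max.
type Effort = "low" | "medium" | "high" | "xhigh" | "max";
type Thinking = "disabled" | "adaptive" | "enabled";

// Availability copied from the capability table.
const EFFORT_SUPPORT: Record<string, string[]> = {
  xhigh: ["claude-opus-4-8"],
  max: ["claude-opus-4-8", "claude-opus-4-6", "claude-sonnet-4-6"],
};
const ADAPTIVE_MODELS = ["claude-opus-4-8", "claude-opus-4-6", "claude-sonnet-4-6"];
const ENABLED_MODELS = ["claude-opus-4-6", "claude-sonnet-4-6"];

function resolveEffort(model: string, effort: Effort): Effort {
  const allowed = EFFORT_SUPPORT[effort];
  // Assumption: efforts without a table row are accepted by every model.
  if (!allowed || allowed.includes(model)) return effort;
  return "high";
}

function resolveThinking(model: string, thinking: Thinking): Thinking {
  if (thinking === "adaptive") {
    return ADAPTIVE_MODELS.includes(model) ? "adaptive" : "disabled";
  }
  if (thinking === "enabled") {
    if (ENABLED_MODELS.includes(model)) return "enabled";
    if (model === "claude-opus-4-8") return "adaptive";
    // Assumption: the table gives no fallback for other models, so this sketch disables thinking.
    return "disabled";
  }
  return "disabled";
}
```

Resolving the fallback before the request is built keeps a stale UI selection (for example xhigh left over from Opus 4.8) from reaching a model that does not accept it.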
How switching works
The user selects a runtime model and runtime-specific parameters from the composer. Each message carries the normalized runtime settings through the full stack.
For example, when the user switches from Sonnet to Opus 4.8 mid-conversation, the CLI is launched with --model claude-opus-4-8 --resume <sessionId>. The CLI reads the session JSONL (which includes all previous Sonnet messages), sends the full history to the API with the new model, and continues the conversation. Effort and thinking settings take effect on the same call.
Second stores provider-native session state per runtime on the run document. When the user keeps using a runtime whose native session state is current, the next message sends only the latest user prompt plus that runtime’s session state. When the user switches to another runtime, Second uses the persisted provider-agnostic UIMessage[] transcript as the handoff layer. The chat route builds a bounded neutral transcript for the messages that the target runtime has not already seen, then appends the latest user message. The target runtime receives that handoff as plain prompt context plus its own provider session state when one exists.
Second does not write vendor-private session files to “convert” a Claude session into a Codex or OpenCode session. The durable source of truth is the stored UIMessage[] plus the workspace files on disk; provider session state is an optimization for native resume, not the tenant boundary or the only conversation record.
No re-run and no conversation restart. Same-runtime switches use native resume when possible; cross-runtime switches use the neutral transcript handoff and continue from the same Second run.
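The cross-runtime handoff described above can be sketched as a function over the stored transcript. This is an illustration only: TranscriptMessage is a simplified stand-in for UIMessage, and seenCount and maxMessages are hypothetical parameters, since the document does not specify how the bound is chosen or how "already seen" is tracked.

```typescript
// Simplified stand-in for the persisted UIMessage[] transcript; the real type has more fields.
interface TranscriptMessage { role: "user" | "assistant"; text: string }

// Build plain prompt context for a runtime that has only seen the first `seenCount` messages.
// `maxMessages` bounds the handoff; the actual bounding rule is an assumption here.
function buildHandoffPrompt(
  transcript: TranscriptMessage[],
  seenCount: number,
  latestUserMessage: string,
  maxMessages: number,
): string {
  const unseen = transcript.slice(seenCount);
  // Keep only the most recent unseen messages.
  const bounded = unseen.slice(Math.max(0, unseen.length - maxMessages));
  const lines = bounded.map((m) => `${m.role}: ${m.text}`);
  // The latest user message always comes last.
  lines.push(`user: ${latestUserMessage}`);
  return lines.join("\n");
}
```

When the target runtime is already up to date (seenCount equals the transcript length), the function returns only the latest user message, which matches the same-runtime case where native resume carries the rest.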
Why custom fetch (not transport body)
The Vercel AI SDK’s useChat hook captures the DefaultChatTransport instance on first render and never swaps it. If you create a new transport when the model changes, useChat ignores it.
The solution: create one stable transport (memoized on chatApi only) with a custom fetch function that reads current values from React refs on every request:
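The stable-transport pattern can be sketched without React or the AI SDK. In this minimal version, Ref mirrors the shape of a React ref, RequestLike and FetchLike are reduced stand-ins for the real fetch types, and makeRuntimeFetch is a hypothetical helper name; the point is only that the wrapper is created once and reads the ref at call time.

```typescript
// A React ref is an object with a mutable `current` field.
interface Ref<T> { current: T }

interface RuntimeSettings {
  runtimeId: string;
  runtimeModel: string;
  runtimeParams: Record<string, string>;
}

// Minimal request shape so the sketch does not depend on DOM or Node typings.
interface RequestLike { method?: string; body?: string }
type FetchLike<R> = (url: string, init: RequestLike) => R;

// Wrap a fetch so every request reads the latest settings when it is sent.
// The wrapper itself never changes, so the transport holding it can stay memoized.
function makeRuntimeFetch<R>(baseFetch: FetchLike<R>, settingsRef: Ref<RuntimeSettings>): FetchLike<R> {
  return (url, init) => {
    const original = init.body ? JSON.parse(init.body) : {};
    // Inject runtimeId, runtimeModel, and runtimeParams into the JSON body.
    const body = JSON.stringify({ ...original, ...settingsRef.current });
    return baseFetch(url, { ...init, body });
  };
}
```

In a component, the ref's current value would be updated on each render and the wrapped function handed to the transport once, so a model change affects the next request without replacing the transport.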
Composer layout
- + button: Attach files (placeholder, not wired yet).
- Model dropdown (components/model-selector.tsx): shared between the workspace composer and the chat composer. Shows Claude and Codex models inline and opens a searchable OpenCode model dialog for larger OpenCode catalogs. Includes an “Add runtime” dialog with setup notes for Claude Code, Codex CLI, and OpenCode.
- Runtime parameter dropdowns (components/runtime-parameter-selectors.tsx): rendered from runtime-registry.ts. Claude shows effort and thinking. Codex CLI shows reasoning effort and sandbox mode. OpenCode shows the selected model variant.
- Submit button: Circle with ArrowUp icon. Switches to Pause while streaming. Clicking during a stream calls stop() to abort.
Local provider setup
During onboarding in local mode (SECOND_AUTH_MODE=none), a provider setup screen at /onboarding/provider auto-detects what’s available:
- Claude CLI on PATH: checked via which claude on the worker, or SECOND_CLAUDE_PATH when an operator pins a custom executable path
- Codex CLI on PATH: checked via which codex on the worker, or SECOND_CODEX_PATH when configured
- OpenCode CLI on PATH with JSON events: checked via which opencode and opencode run --help on the worker, or SECOND_OPENCODE_PATH when configured. OpenCode model discovery is available through the worker’s /opencode/models endpoint and returns only model metadata, not auth files or config contents.
- Runtime auth env hints: ANTHROPIC_API_KEY, CODEX_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY, and GEMINI_API_KEY are reported only as booleans, never values
If the user is already logged in through the Claude CLI (claude login), everything works automatically and no API key is needed. The SDK spawns the user’s local claude binary, which uses their existing auth.
If ANTHROPIC_API_KEY is set, it takes priority — the CLI switches to API billing regardless of whether the user is also logged in via subscription.
Codex CLI can use its own login state or CODEX_API_KEY/OPENAI_API_KEY, depending on the installed CLI configuration. Detection runs codex login status and checks stdout and stderr because Codex may print login status on stderr even when the command succeeds. It reports only a boolean auth result; it never returns token values or reads auth file contents. OpenCode uses the provider credentials required by the selected provider/model ID.
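The boolean-only reporting of auth environment variables can be sketched as below. The function name is hypothetical, and treating empty or whitespace-only values as unset is an assumption of this sketch.

```typescript
const AUTH_ENV_KEYS = [
  "ANTHROPIC_API_KEY",
  "CODEX_API_KEY",
  "OPENAI_API_KEY",
  "GOOGLE_API_KEY",
  "GEMINI_API_KEY",
];

// Report presence only; the values themselves never leave this function.
function authEnvHints(env: Record<string, string | undefined>): Record<string, boolean> {
  const hints: Record<string, boolean> = {};
  for (const key of AUTH_ENV_KEYS) {
    const value = env[key];
    // Assumption: empty or whitespace-only values count as unset.
    hints[key] = typeof value === "string" && value.trim().length > 0;
  }
  return hints;
}
```

Since the returned object contains only booleans, it can be serialized into a detection response without any risk of echoing a key.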
This screen only exists in local mode. In enterprise deployments (SECOND_AUTH_MODE=external), the API key is configured before deployment and the screen is skipped entirely.
Files involved
| File | Role |
|---|---|
| apps/worker/src/index.ts | GET /detect-provider: detects claude, codex, opencode, and auth-mode booleans |
| apps/web/src/app/api/setup/detect-provider/route.ts | Proxies to worker |
| apps/web/src/app/onboarding/provider/page.tsx | Server component: guards, renders setup |
| apps/web/src/components/provider-setup.tsx | Client component: calls detect, shows results |
Billing modes
Second separates runtime authentication from token/cost visibility. Runtimes can emit token counts and API-equivalent dollar estimates even when the local CLI usage is covered by a subscription plan.
| Runtime | Local subscription mode | API billing mode |
|---|---|---|
| Claude Code | SECOND_AUTH_MODE=none, no ANTHROPIC_API_KEY, Claude CLI logged in via Claude.ai | ANTHROPIC_API_KEY configured |
| Codex CLI | SECOND_AUTH_MODE=none, no CODEX_API_KEY/OPENAI_API_KEY, Codex CLI logged in with ChatGPT | CODEX_API_KEY or OPENAI_API_KEY configured |
| OpenCode | Not treated as subscription-backed by Second | Provider key required by the selected provider/model |
The billing mode is resolved in page.tsx from server environment flags, then AppWorkspace applies the billing display per model row. This matters for mixed-runtime runs: a Claude subscription row and a Codex ChatGPT-login row can both be struck through, while an API-key-backed OpenCode row still displays as billable.
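The billing table can be expressed as a small decision function. This is a sketch under assumptions: the cliLoggedIn flag and the function name are hypothetical, and any combination that does not match the subscription column is treated as API billing.

```typescript
type RuntimeId = "claude-code" | "codex-cli" | "opencode";
type BillingMode = "subscription" | "api";

interface BillingInputs {
  authMode: string; // value of SECOND_AUTH_MODE
  env: Record<string, string | undefined>;
  cliLoggedIn: boolean; // hypothetical flag: the runtime's CLI reports a subscription login
}

function resolveBillingMode(runtimeId: RuntimeId, inputs: BillingInputs): BillingMode {
  const has = (key: string): boolean => Boolean(inputs.env[key]);
  // OpenCode is never treated as subscription-backed.
  if (runtimeId === "opencode") return "api";
  const hasApiKey = runtimeId === "claude-code"
    ? has("ANTHROPIC_API_KEY")
    : has("CODEX_API_KEY") || has("OPENAI_API_KEY");
  const subscription = inputs.authMode === "none" && !hasApiKey && inputs.cliLoggedIn;
  // Assumption: anything that does not match the subscription column is shown as API billing.
  return subscription ? "subscription" : "api";
}
```

Resolving the mode per runtime rather than per run is what lets a mixed-runtime run show some rows struck through and others as billable.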
Usage tracking
Where the data comes from
Claude emits a result message at the end of every SDK query() call.
The modelUsage field is computed by the runtime adapter from provider runtime events. Claude includes cost and token data from the Claude CLI result. Codex app-server exposes token usage but not a dollar value, so Second estimates OpenAI cost from the selected model’s current input, cached-input, and output token rates. OpenCode emits the same result shape when its JSON stream exposes usage data; when a runtime does not expose cost and Second has no pricing metadata for the selected model, Second records token counts when available and zero cost.
How it’s captured
Usage is accumulated on the run document with MongoDB’s $inc operator. Each runtime turn adds to the run’s totals, so multiple messages in a run accumulate correctly.
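An accumulating update of this kind might be built as follows. The TurnUsage shape, the usage.* field paths, and the per-model bucket layout are all hypothetical; the actual AgentRunDocument schema may use different names.

```typescript
// Hypothetical per-turn usage shape; field names are illustrative.
interface TurnUsage {
  model: string;
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  costUsd: number;
}

// Build a MongoDB update that adds one turn's usage to the run totals and to a per-model bucket.
function buildUsageInc(turn: TurnUsage): { $inc: Record<string, number> } {
  // Dots separate path segments in MongoDB field paths, so IDs such as "gpt-5.4" are escaped.
  const modelKey = turn.model.replace(/\./g, "_");
  const perModel = `usage.byModel.${modelKey}`;
  return {
    $inc: {
      "usage.inputTokens": turn.inputTokens,
      "usage.outputTokens": turn.outputTokens,
      "usage.cacheReadTokens": turn.cacheReadTokens,
      "usage.totalCostUsd": turn.costUsd,
      [`${perModel}.inputTokens`]: turn.inputTokens,
      [`${perModel}.outputTokens`]: turn.outputTokens,
      [`${perModel}.costUsd`]: turn.costUsd,
    },
  };
}
```

The returned object would be passed as the update argument when writing the run document. Because $inc is applied atomically to a single document, two turns finishing close together both land in the totals.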
Schema
Usage is stored in the usage field on AgentRunDocument.
Querying for billing
The billing query sums per-app cost across all runs.
UI: info panel
A small ⓘ icon in the top-right corner of the app page opens a dropdown showing:
- Total run cost
- Input / output / cache-read token counts
- Per-model breakdown (model name, token count, cost)
The panel reads from GET /chat, which includes usage in the response.
Verification and debugging
Five levels of proof that the correct runtime/model was used, from lowest (closest to the metal) to highest:
1. Runtime-native logs
Claude writes every raw API response to a session JSONL file. The model field in each response is what the Anthropic API returned.
The msg_01... ID is assigned by Anthropic’s API. Different IDs = different API calls. Different model values = different models served the request.
Codex CLI and OpenCode keep their own runtime/session records depending on the installed CLI configuration. Use those native logs together with Second’s stored sessionState when debugging resume behavior.
2. Provider console
For API-key backed runtimes, use the provider’s console or usage logs. For example, Anthropic logs Claude API calls with model, token counts, and cost.
3. MongoDB
The run document’s usage field accumulates modelUsage from all result messages in the run.
4. Worker terminal
The worker logs each request.
5. Browser Network tab
Open devtools → Network → filter by chat. Inspect the POST request payload. It contains runtimeId, runtimeModel, and runtimeParams injected by the custom fetch.