Gateway is Smithers’ headless control plane. Reach for it (instead of startServer()) when long-lived clients (bots, dashboards, schedulers, and custom UIs) need to authenticate once, stream events over WebSocket with resilient reconnection, decide approvals, inject signals, access metrics, and manage cron schedules across many registered workflows. Custom UIs, whether built on the vanilla SDK or the React hooks, rely on the Gateway for pushed updates and for guards against stale data. For the single-workflow Hono-based HTTP surface, see Serve Mode (createServeApp() / bunx smithers-orchestrator up --serve).
API reference: Server & Gateway and Gateway Client list every gateway and client export, its options, and links to source and tests.
Quick start
Gateway client SDK
Programmatic clients (bots, schedulers, dashboards, third-party UIs) talk to the Gateway through the typed client SDK over the same RPC and WebSocket API. For the full custom-UI guide (declarative queries, pushed updates, stale guards, reconnect/resume, backpressure, optimistic mutations, auth, vanilla JS + React hooks) see Custom UIs.
| Package | Exports |
|---|---|
| smithers-orchestrator/gateway-client | SmithersGatewayClient, SmithersGatewayConnection, GatewayRpcError, gatewayBackoffDelay, RPC frame/type-map types, extension envelope helpers/types, GatewayUiBootConfig, SmithersGatewayClientOptions, createGatewayCollection, gatewayCollectionDefs, flattenGatewayRunNode, snapshotToGatewayRunNode, reconcileSnapshotNodes, collection row types, syncBackoffDelay, syncKeyFingerprint, syncKeyMatches, gatewayKeys, createSmithersGatewayTransport |
| smithers-orchestrator/gateway-react | SmithersGatewayProvider, createGatewayReactRoot, useGatewayRun, useGatewayRuns, useGatewayWorkflows, useGatewayApprovals, useGatewayNodeOutput, useGatewayRunEvents, useGatewayActions, useGatewayRpc, useSmithersGateway, useGatewayExtensionResource, useGatewayExtensionAction, useGatewayExtensionStream, SyncProvider, createGatewayCollections, useSyncClient, useSyncQuery, useSyncMutation, useSyncSubscription, useGatewayQuery, useGatewayMutation, useGatewayRunStream, useGatewayRunTree, useGatewayConnectionStatus |
RPC methods (TOON)
health remains available as a utility RPC and GET /health is available without auth. The legacy method names are still accepted for compatibility (runs.create, runs.get, runs.list, runs.cancel, runs.rerun, runs.diff, frames.list, frames.get, attempts.list, attempts.get, workflows.list, approvals.list, approvals.decide, signals.send, cron.list, cron.add, cron.remove, cron.trigger, getDevToolsSnapshot, jumpToFrame, devtools.jumpToFrame, devtools.getNodeOutput, devtools.getNodeDiff), but new clients should use the v1 names above.
Scopes
* grants every scope. Pass a method name string in the scopes array (e.g. "launchRun") to grant access to exactly that RPC call. Legacy wildcard method grants such as cron.* continue to match legacy method names; typed scopes are the contract to use for new integrations. Legacy ranked grants (read, execute, approve, admin) are accepted so older tokens keep working.
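The grant forms described above can be illustrated with a small matcher. This is a sketch only, not the gateway's implementation: the function name scopeGrantsMethod is hypothetical, prefix matching for legacy wildcards is an assumption, and typed scopes and the legacy ranked grants (read, execute, approve, admin) are deliberately not modeled because their mapping is not specified here.

```typescript
// Hypothetical helper: decides whether a scopes array grants one RPC method.
// Models only "*", an exact method-name grant, and a legacy "prefix.*" wildcard.
function scopeGrantsMethod(scopes: readonly string[], method: string): boolean {
  return scopes.some((scope) => {
    if (scope === "*") return true; // "*" grants every scope
    if (scope === method) return true; // e.g. "launchRun" grants exactly launchRun
    if (scope.endsWith(".*")) {
      // Assumption: "cron.*" matches any legacy method starting with "cron."
      return method.startsWith(scope.slice(0, -1));
    }
    return false;
  });
}
```

For example, scopeGrantsMethod(["cron.*"], "cron.list") is true, while the same grant does not cover the v1 name cronRun.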
rewindRun (destructive rewind)
Rewinds a run to a prior frame and makes it resumable from that point.
This is destructive: it truncates frames, attempts, output rows, and
diff-cache entries beyond the target; reverts JJ sandboxes; marks the
run running again; and emits a TimeTravelJumped event so
streamDevTools subscribers rebaseline.
Caller identity is authorized per-request: the connection must have
run:admin scope and must also be the run owner (userId matches
ownerId) or have role: "admin". Scope alone never grants access.
The legacy aliases jumpToFrame and devtools.jumpToFrame route to
rewindRun.
Request:
Response (JumpResult):
Event: run.time_travel_jumped with
{ runId, fromFrameNo, toFrameNo, timestampMs, caller }.
Quota: 10 rewinds per run per caller per hour (default window). Exceeded
→ RateLimited.
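To make the quota concrete, here is a minimal in-memory sketch of a per-run, per-caller limiter. The class name RewindQuota, the sliding-window behavior, and the in-memory storage are all assumptions for illustration; the source only states the default of 10 rewinds per run per caller per hour.

```typescript
// Hypothetical limiter: 10 rewinds per (runId, callerId) per hour by default.
class RewindQuota {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly hits = new Map<string, number[]>();

  constructor(limit: number = 10, windowMs: number = 60 * 60 * 1000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the attempt when the caller is under quota;
  // returns false when the caller would receive RateLimited.
  tryAcquire(runId: string, callerId: string, nowMs: number): boolean {
    // runId cannot contain "|" (see the runId pattern), so the key is unambiguous.
    const key = `${runId}|${callerId}`;
    const recent = (this.hits.get(key) ?? []).filter(
      (t) => nowMs - t < this.windowMs,
    );
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(nowMs);
    this.hits.set(key, recent);
    return true;
  }
}
```

Passing the clock in as nowMs keeps the sketch deterministic and easy to test.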
Failure modes and HTTP status:
| Code | Meaning | HTTP |
|---|---|---|
| InvalidRunId | runId fails /^[a-z0-9_-]{1,64}$/. | 400 |
| InvalidFrameNo | frameNo is not a non-negative i32 integer. | 400 |
| ConfirmationRequired | Caller omitted confirm: true. | 400 |
| FrameOutOfRange | frameNo > latest frame, or run has no frames. | 400 |
| Unauthorized | Caller is neither the run owner nor an admin (audit row still written). | 401 |
| RunNotFound | runId does not exist. | 404 |
| Busy | Another rewind is in flight for this run. | 409 |
| RateLimited | Caller exceeded rewind quota (default 10/hour). | 429 |
| UnsupportedSandbox | A sandbox cannot be reverted (missing / untrackable jjPointer). | 501 |
| VcsError | A JJ revert call failed; DB/reconciler rolled back. | 500 |
| RewindFailed | Rewind failed and rollback was partial; run marked needs_attention. | 500 |
Audit rows are written to _smithers_time_travel_audit with result ∈ { success, failed, partial, in_progress }.
An in-progress row is inserted before any mutation and updated in place
on completion; startup recovery flips any leftover in_progress rows to
partial.
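A client can catch the three input-validation failures from the failure-mode table before sending a rewindRun request. The sketch below mirrors the documented rules (the runId pattern, a non-negative i32 frameNo, and confirm: true) and the code-to-HTTP-status mapping. The names validateRewindParams and REWIND_HTTP_STATUS are hypothetical, and the order in which the checks run is an assumption.

```typescript
type RewindErrorCode =
  | "InvalidRunId" | "InvalidFrameNo" | "ConfirmationRequired"
  | "FrameOutOfRange" | "Unauthorized" | "RunNotFound" | "Busy"
  | "RateLimited" | "UnsupportedSandbox" | "VcsError" | "RewindFailed";

// HTTP statuses copied from the failure-mode table.
const REWIND_HTTP_STATUS: Record<RewindErrorCode, number> = {
  InvalidRunId: 400,
  InvalidFrameNo: 400,
  ConfirmationRequired: 400,
  FrameOutOfRange: 400,
  Unauthorized: 401,
  RunNotFound: 404,
  Busy: 409,
  RateLimited: 429,
  UnsupportedSandbox: 501,
  VcsError: 500,
  RewindFailed: 500,
};

const RUN_ID_PATTERN = /^[a-z0-9_-]{1,64}$/;
const I32_MAX = 2147483647;

// Returns the first validation error code, or null when the params look valid.
// Only the checks that need no server state are covered here.
function validateRewindParams(params: {
  runId?: unknown;
  frameNo?: unknown;
  confirm?: unknown;
}): RewindErrorCode | null {
  if (typeof params.runId !== "string" || !RUN_ID_PATTERN.test(params.runId)) {
    return "InvalidRunId";
  }
  const frameNo = params.frameNo;
  if (
    typeof frameNo !== "number" ||
    !Number.isInteger(frameNo) ||
    frameNo < 0 ||
    frameNo > I32_MAX
  ) {
    return "InvalidFrameNo";
  }
  if (params.confirm !== true) return "ConfirmationRequired";
  return null;
}
```

The remaining codes (FrameOutOfRange, Unauthorized, Busy, and so on) depend on server-side state and can only be observed in the response.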
Node output
getNodeOutput returns the DevTools Output-tab payload for a single task iteration:
Error codes
Gateway v1 RPC errors use stable code strings and HTTP status mappings:
Versioned wire shapes
All DevTools wire types carry version: 1.
DevToolsSnapshot (v1):
DevToolsDelta (v1):
DevToolsEvent (v1), frames pushed over devtools.event:
The server first sends a snapshot event, then emits delta events
per frame. The server re-baselines (emits a full snapshot instead of a
delta) after 50 delta events, when a delta is larger than a fresh snapshot,
or when the gateway observes TimeTravelJumped for the run.
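The three re-baseline triggers can be expressed as a single predicate. This is a sketch under assumptions: the names StreamState and shouldRebaseline are hypothetical, "after 50 delta events" is read as 50 deltas since the last snapshot, and "larger" is measured in serialized bytes.

```typescript
// Hypothetical per-subscriber bookkeeping for a DevTools stream.
interface StreamState {
  deltasSinceSnapshot: number; // deltas emitted since the last full snapshot
  sawTimeTravelJump: boolean; // set when TimeTravelJumped was observed for the run
}

const MAX_DELTAS_BEFORE_REBASELINE = 50;

// Returns true when the next frame should be a full snapshot instead of a delta.
function shouldRebaseline(
  state: StreamState,
  deltaBytes: number,
  snapshotBytes: number,
): boolean {
  return (
    state.sawTimeTravelJump ||
    state.deltasSinceSnapshot >= MAX_DELTAS_BEFORE_REBASELINE ||
    deltaBytes > snapshotBytes
  );
}
```

On the client side, the practical consequence is that a snapshot event always replaces local state wholesale, while delta events are applied on top of the most recent snapshot.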
WebSocket protocol
Three frame types share the same socket:
- req: { type: "req", id, method, params? } from client.
- res: { type: "res", id, ok, payload?, error? } from server, correlated by id.
- event: { type: "event", event, payload?, seq, stateVersion } server-pushed; seq is per connection, stateVersion is global.
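The frame shapes above map naturally onto a discriminated union, and the id correlation between req and res can be handled with a small pending-request table. The sketch below is illustrative only: PendingRequests and dispatchFrame are hypothetical names, and treating id as a string is an assumption. The typed client SDK already handles this for you; the sketch just shows the mechanics.

```typescript
type RequestFrame = { type: "req"; id: string; method: string; params?: unknown };
type ResponseFrame = { type: "res"; id: string; ok: boolean; payload?: unknown; error?: unknown };
type EventFrame = { type: "event"; event: string; payload?: unknown; seq: number; stateVersion: number };
type GatewayFrame = RequestFrame | ResponseFrame | EventFrame;

// Hypothetical helper that pairs each response with the request that caused it.
class PendingRequests {
  private readonly callbacks = new Map<string, (res: ResponseFrame) => void>();
  private counter = 0;

  // Builds a request frame and remembers the callback under its id.
  create(method: string, params: unknown, onResponse: (res: ResponseFrame) => void): RequestFrame {
    const id = String(++this.counter);
    this.callbacks.set(id, onResponse);
    return params === undefined
      ? { type: "req", id, method }
      : { type: "req", id, method, params };
  }

  // Delivers a response to its waiting callback; false if the id is unknown.
  resolve(frame: ResponseFrame): boolean {
    const callback = this.callbacks.get(frame.id);
    if (!callback) return false;
    this.callbacks.delete(frame.id);
    callback(frame);
    return true;
  }
}

// Routes one raw socket message to the response table or the event handler.
function dispatchFrame(
  raw: string,
  pending: PendingRequests,
  onEvent: (event: EventFrame) => void,
): void {
  const frame = JSON.parse(raw) as GatewayFrame;
  switch (frame.type) {
    case "res":
      pending.resolve(frame);
      break;
    case "event":
      onEvent(frame);
      break;
    case "req":
      // A client does not expect request frames from the server; ignore them.
      break;
  }
}
```

A caller would serialize the frame returned by create() with JSON.stringify, send it over the socket, and pass each incoming message to dispatchFrame.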
On open, the server sends connect.challenge ({ nonce, ts }). The client replies with a connect request carrying minProtocol, maxProtocol, client metadata, auth, and an optional subscribe: string[] to filter events by runId. The server returns a hello payload (protocol, features, policy.heartbeatMs, auth with sessionToken/role/scopes/userId, snapshot).
After connect, the gateway emits tick events every heartbeatMs. launchRun, submitApproval, submitSignal, and cronRun automatically subscribe the connection to the affected runId. Server-pushed event names:
| Event | Category |
|---|---|
| connect.challenge | Connection |
| tick | Connection |
| run.event | Run lifecycle |
| run.heartbeat | Run lifecycle |
| run.gap_resync | Run lifecycle |
| run.error | Run lifecycle |
| run.completed | Run lifecycle |
| run.time_travel_jumped | Run lifecycle |
| node.started | Run lifecycle |
| node.finished | Run lifecycle |
| node.failed | Run lifecycle |
| task.output | Run lifecycle |
| task.heartbeat | Run lifecycle |
| approval.requested | Approval |
| approval.decided | Approval |
| approval.auto_approved | Approval |
| cron.triggered | Cron |
| devtools.event | DevTools |
POST /rpc accepts the same body shape ({ id, method, params }) and returns the same ResponseFrame. Auth headers: Authorization: Bearer <token> or x-smithers-key: <token> (or trusted-proxy headers in trusted-proxy mode).
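For one-off calls without a socket, the HTTP surface needs only a JSON body and an auth header. The helper below is a sketch: buildRpcRequest is a hypothetical name, the string id and the Content-Type header are assumptions, and it only builds the request description without sending it.

```typescript
// Hypothetical helper: builds the URL and fetch-style init for POST /rpc.
function buildRpcRequest(
  baseUrl: string,
  token: string,
  method: string,
  params?: unknown,
  id: string = "1",
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  return {
    // Strip trailing slashes so "http://host/" and "http://host" both work.
    url: `${baseUrl.replace(/\/+$/, "")}/rpc`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json", // assumption: JSON body
        authorization: `Bearer ${token}`, // x-smithers-key: <token> is the documented alternative
      },
      // Same body shape as a WebSocket request frame: { id, method, params }.
      body: JSON.stringify({ id, method, params }),
    },
  };
}
```

In a runtime with fetch, the result can be passed along as fetch(url, init), and the JSON response parsed as a ResponseFrame.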
GatewayOptions
JWT auth reads scopes from scope, role from role, and user id from sub unless the *Claim options override those claim names. Missing JWT role falls back to defaultRole and then operator; missing JWT scopes fall back to defaultScopes and then []. Trusted-proxy auth reads trustedHeaders as [user, scopes, role]; missing role falls back to defaultRole and then operator, and missing scopes fall back to defaultScopes and then ["*"].
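The fallback chains above differ between the two modes only in the final scopes default. The sketch below spells that out. The names parseScopes and resolveIdentity are hypothetical, and two behaviors are assumptions: array and space/comma-separated scope values are accepted in both modes, and an empty string counts as an empty scope list rather than a missing one.

```typescript
type AuthMode = "jwt" | "trusted-proxy";

// Accepts an array or a space/comma-separated string; undefined means "missing".
function parseScopes(claim: unknown): string[] | undefined {
  if (Array.isArray(claim)) {
    return claim.filter((s): s is string => typeof s === "string");
  }
  if (typeof claim === "string") {
    return claim.split(/[\s,]+/).filter((s) => s.length > 0);
  }
  return undefined;
}

// Applies the documented fallbacks: role -> defaultRole -> "operator";
// scopes -> defaultScopes -> [] (jwt) or ["*"] (trusted-proxy).
function resolveIdentity(
  mode: AuthMode,
  raw: { role?: string; scopes?: unknown },
  defaults: { defaultRole?: string; defaultScopes?: string[] },
): { role: string; scopes: string[] } {
  const role = raw.role ?? defaults.defaultRole ?? "operator";
  const scopes =
    parseScopes(raw.scopes) ??
    defaults.defaultScopes ??
    (mode === "jwt" ? [] : ["*"]);
  return { role, scopes };
}
```

The asymmetry matters in practice: a trusted-proxy request with no scopes header and no defaultScopes ends up with every scope, so setting defaultScopes explicitly is the safer configuration.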
allowedOrigins is available in every mode (token, jwt, trusted-proxy) as defense-in-depth. It defaults to [], which enforces no Origin allowlist. When non-empty, the gateway rejects any HTTP RPC or WebSocket upgrade whose browser Origin header is not on the list; requests with no Origin header (server-to-server / CLI callers) are always allowed. Set it to your operator-UI origin(s) when exposing a token/jwt gateway to a browser.
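The allowlist rules reduce to three cases, sketched below. The function name isOriginAllowed is hypothetical, and exact string comparison of the Origin value is an assumption about how entries are matched.

```typescript
// Hypothetical check mirroring the allowedOrigins rules described above.
function isOriginAllowed(
  allowedOrigins: readonly string[],
  originHeader: string | undefined,
): boolean {
  // Default []: no Origin allowlist is enforced.
  if (allowedOrigins.length === 0) return true;
  // No Origin header: server-to-server and CLI callers are always allowed.
  if (originHeader === undefined) return true;
  // Browser request: the Origin must be on the list (exact match assumed).
  return allowedOrigins.includes(originHeader);
}
```

Because requests without an Origin header always pass, the allowlist guards against cross-site browser requests only; it does not replace token or JWT auth.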
Runs started through the gateway expose ctx.auth = { triggeredBy, role, scopes, createdAt }. <Approval> may further restrict decisions with allowedScopes and allowedUsers, which the gateway enforces before accepting submitApproval.
headersTimeout and requestTimeout are applied to the underlying Node HTTP server when gateway.listen() starts. Keep both below the corresponding reverse-proxy idle/read timeouts so slow clients are closed by Smithers first.
Notes
- Cron: gateway.register(name, wf, { schedule }) writes a cron row keyed gateway:<name>; the gateway polls between 1 s and 15 s (clamped from heartbeatMs). Cron-fired runs get ctx.auth.role = "system", triggeredBy = "cron:gateway", scopes = ["*"].
- JWT mode currently validates alg=HS256, HMAC, iss, aud, exp, nbf. Scope claims may be arrays or space/comma-separated strings.
- Trusted-proxy mode is only safe behind something you control (Cloudflare Access, internal API gateway) that strips and rewrites identity headers.
- DevTools streams: see Versioned wire shapes for re-baseline triggers; over-capacity subscribers receive BackpressureDisconnect.
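The cron polling interval mentioned in the Cron note is a simple clamp of heartbeatMs. As a sketch (the function name cronPollIntervalMs is hypothetical, and it assumes the interval is exactly heartbeatMs limited to the 1 s to 15 s range):

```typescript
// Hypothetical helper: clamps heartbeatMs into the documented 1 s .. 15 s range.
function cronPollIntervalMs(heartbeatMs: number): number {
  const MIN_POLL_MS = 1000;
  const MAX_POLL_MS = 15000;
  return Math.min(MAX_POLL_MS, Math.max(MIN_POLL_MS, heartbeatMs));
}
```

A very short heartbeat therefore does not make cron polling more frequent than once per second, and a long heartbeat does not delay it beyond 15 seconds.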