Smithers ships a built-in MCP stdio server. Passing --mcp makes the CLI speak the Model Context Protocol over stdin/stdout instead of running interactively. Any MCP-aware client can connect, discover workflows, start runs, watch progress, resolve approvals, and revert bad attempts through structured tool calls. Use the MCP server when an AI agent should drive Smithers autonomously. Use the HTTP Server when human-written code or webhooks need REST endpoints.

Setup

Start the server

bunx smithers-orchestrator --mcp
This starts the semantic surface: a stable, structured tool set for AI agent consumption, documented on this page. The --surface flag selects the semantic surface, a raw CLI-mirroring surface, or both:
# Semantic tools only (default)
bunx smithers-orchestrator --mcp --surface semantic

# Raw CLI-mirroring tools only
bunx smithers-orchestrator --mcp --surface raw

# Both surfaces registered on the same server
bunx smithers-orchestrator --mcp --surface both
Use --surface raw only for direct CLI parity. Prefer the semantic surface for new integrations: every tool returns a { ok, data, error } envelope with Zod-validated input and output schemas. Scope the semantic surface when a client should not receive every Smithers control tool:
# Expose only selected semantic tools
bunx smithers-orchestrator --mcp --allowed-tools list_workflows,get_run

# Expose only tools annotated as read-only
bunx smithers-orchestrator --mcp --read-only
--allowed-tools accepts a comma-separated list of semantic tool names. Passing an empty allowlist intentionally exposes no semantic tools. --read-only removes semantic tools with write or control side effects, such as starting runs, resolving approvals, or reverting attempts. With --surface both, these controls apply to the semantic toolset only; raw CLI-mirroring tools are still registered by the raw surface.
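The combined effect of the two flags can be modeled as a filter over the registered semantic tools. The sketch below is illustrative only and is not the Smithers implementation: `ToolInfo`, `SurfaceOptions`, and `selectSemanticTools` are hypothetical names, read-only is assumed to mean `readOnlyHint: true`, and a tool is assumed to need to pass both checks when both flags are set.

```typescript
// Hypothetical model of a registered semantic tool (only the fields used here).
interface ToolInfo {
  name: string;
  readOnlyHint: boolean;
}

interface SurfaceOptions {
  // undefined = flag not passed; [] = empty allowlist, which exposes nothing.
  allowedTools?: string[];
  readOnly?: boolean;
}

// Assumption: with both flags set, a tool must pass both checks.
function selectSemanticTools(tools: ToolInfo[], opts: SurfaceOptions): ToolInfo[] {
  return tools.filter((tool) => {
    if (opts.allowedTools !== undefined && !opts.allowedTools.includes(tool.name)) {
      return false;
    }
    if (opts.readOnly === true && !tool.readOnlyHint) {
      return false;
    }
    return true;
  });
}

const sampleTools: ToolInfo[] = [
  { name: "list_workflows", readOnlyHint: true },
  { name: "get_run", readOnlyHint: true },
  { name: "run_workflow", readOnlyHint: false },
];

// Models --read-only: run_workflow is dropped.
console.log(selectSemanticTools(sampleTools, { readOnly: true }).map((t) => t.name));
// Models an empty --allowed-tools value: nothing is exposed.
console.log(selectSemanticTools(sampleTools, { allowedTools: [] }).length);
```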

Register manually

For clients that read JSON config directly:
{
  "mcpServers": {
    "smithers": {
      "command": "bunx",
      "args": ["smithers-orchestrator", "--mcp"]
    }
  }
}
Project-scoped install (e.g. a monorepo where Smithers is a dev dependency; ensure smithers-orchestrator is in the local package.json):
{
  "mcpServers": {
    "smithers": {
      "command": "bunx",
      "args": ["smithers-orchestrator", "--mcp"]
    }
  }
}

If mcp add fails

bunx smithers-orchestrator mcp add hands the launch command to a registration helper that expects it as a single argument. If a runner or shell word-splits it, the helper sees a bare --mcp token and aborts:
Registering MCP server...
code: MCP_ADD_FAILED
message: "error: unknown option '--mcp'"
Register with your agent’s own CLI instead. The -- separator tells the agent that everything after it is the launch command, so it never parses --mcp as one of its own flags:
codex mcp add smithers -- bunx smithers-orchestrator --mcp
claude mcp add smithers -- bunx smithers-orchestrator --mcp
Any MCP-aware CLI follows the same <agent> mcp add <name> -- <command> shape. Or write the JSON or TOML config by hand using the snippets above. The Smithers CLI prints these fallback commands automatically whenever mcp add fails.

Tool Registration

On start, each tool is registered with its input schema, output schema, and MCP annotations. Every tool carries:
  • inputSchema: Zod object describing accepted parameters.
  • outputSchema: Zod schema for the structured response envelope.
  • annotations: MCP annotation metadata (readOnlyHint, destructiveHint, idempotentHint, openWorldHint).

Structured tool envelope

Every tool returns the same shape:
{
  ok: boolean;
  data?: { ... };     // present on success
  error?: {           // present on failure
    code: string;
    message: string;
    details?: Record<string, unknown> | null;
    docsUrl?: string | null;
  };
}
The response is also echoed as a text content block, so clients that do not parse structuredContent still receive the JSON payload.
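A client might wrap the envelope in a small helper so that failures surface as exceptions. This is a minimal sketch under stated assumptions: `unwrapEnvelope` and `parseTextEnvelope` are hypothetical helper names, and the `"UNKNOWN"` fallback is a placeholder for a malformed envelope, not a code Smithers emits.

```typescript
interface ToolError {
  code: string;
  message: string;
  details?: Record<string, unknown> | null;
  docsUrl?: string | null;
}

interface ToolEnvelope<T> {
  ok: boolean;
  data?: T;
  error?: ToolError;
}

// Return the payload on success; throw with the structured code on failure.
function unwrapEnvelope<T>(envelope: ToolEnvelope<T>): T {
  if (envelope.ok && envelope.data !== undefined) {
    return envelope.data;
  }
  // "UNKNOWN" is a placeholder for a malformed envelope, not a Smithers code.
  const code = envelope.error?.code ?? "UNKNOWN";
  const message = envelope.error?.message ?? "tool call failed without an error body";
  throw new Error(`${code}: ${message}`);
}

// Fallback for clients that only read the echoed text content block.
function parseTextEnvelope<T>(text: string): ToolEnvelope<T> {
  return JSON.parse(text) as ToolEnvelope<T>;
}

const okText = '{"ok":true,"data":{"runs":[]}}';
const runs = unwrapEnvelope(parseTextEnvelope<{ runs: unknown[] }>(okText)).runs;
console.log(runs.length);
```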

Tool annotations

| Annotation | Tools | Meaning |
| --- | --- | --- |
| readOnlyHint: true | Most query tools | Tool does not modify state |
| readOnlyHint: false, openWorldHint: true | run_workflow | Launches external processes |
| readOnlyHint: false, destructiveHint: true, idempotentHint: false | resolve_approval, revert_attempt, rewind_run, restore_checkpoint, time_travel | Mutates persisted state irreversibly |
| readOnlyHint: false, idempotentHint: false | fork_run, replay_run | Creates new run/branch state |

Tool Reference

list_workflows

List all Smithers workflows discovered in the working directory.
Input: none
Output:
{
  workflows: Array<{
    id: string;
    metadataVersion: number;
    displayName: string;
    scope: "local" | "global";
    entryFile: string;
    path: string;
    sourceType: string;
    description: string;
    tags: string[];
    aliases: string[];
  }>;
}
Use the returned id values as the workflowId parameter for run_workflow.

run_workflow

Start or resume a discovered workflow. Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| workflowId | string | required | Workflow ID from list_workflows |
| input | Record<string, unknown> | {} | Workflow input object |
| prompt | string | - | Shorthand: sets input.prompt when input is not provided |
| runId | string | auto | Custom run ID |
| resume | boolean | false | Resume an existing run; requires runId |
| force | boolean | false | Force-start even if a run with this ID already exists |
| waitForTerminal | boolean | false | Block until the run reaches a terminal state |
| waitForStartMs | number | 1000 | For background launches, how long to wait for the run row to appear in the database |
| maxConcurrency | number | - | Max concurrent nodes |
| rootDir | string | - | Root directory for tool sandboxing and path resolution |
| logDir | string | - | Directory for log files |
| allowNetwork | boolean | false | Allow network access in bash tool |
| maxOutputBytes | number | - | Cap on node output size |
| toolTimeoutMs | number | - | Per-tool call timeout |
| hot | boolean | false | Enable hot-reloading of the workflow file |
Output:
{
  workflow: {
    id: string;
    metadataVersion: number;
    displayName: string;
    scope: "local" | "global";
    entryFile: string;
    path: string;
    sourceType: string;
    description: string;
    tags: string[];
    aliases: string[];
  };
  runId: string;
  launchMode: "background" | "waited";
  requestedResume: boolean;
  status: string;
  observedRun: RunSummary | null;
  result: { runId, status, output?, error? } | null;
}
Background vs. waited launch

By default (waitForTerminal: false) the tool launches the workflow and returns immediately with launchMode: "background". observedRun reflects the run state polled during waitForStartMs. Use watch_run to track progress.

Set waitForTerminal: true to block until the workflow finishes; result is then populated and launchMode is "waited".

Run option forwarding

rootDir, logDir, allowNetwork, maxOutputBytes, toolTimeoutMs, and hot are forwarded verbatim to runWorkflow. They override values baked into the workflow file.
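The two launch modes suggest different follow-up actions for a client. The sketch below shows one way to branch on the run_workflow response; `RunWorkflowData` is trimmed to the fields used here, and `NextStep` and `nextStepAfterLaunch` are hypothetical names rather than part of Smithers.

```typescript
// Subset of the run_workflow output used for this decision.
interface RunWorkflowData {
  runId: string;
  launchMode: "background" | "waited";
  status: string;
  result: { runId: string; status: string; output?: unknown; error?: unknown } | null;
}

type NextStep =
  | { kind: "watch"; tool: "watch_run"; args: { runId: string } }
  | { kind: "done"; status: string; output: unknown };

function nextStepAfterLaunch(data: RunWorkflowData): NextStep {
  // A waited launch already carries the terminal result.
  if (data.launchMode === "waited" && data.result !== null) {
    return { kind: "done", status: data.result.status, output: data.result.output };
  }
  // A background launch returned early, so follow the run with watch_run.
  return { kind: "watch", tool: "watch_run", args: { runId: data.runId } };
}

const launched: RunWorkflowData = {
  runId: "smi_abc123",
  launchMode: "background",
  status: "running",
  result: null,
};
console.log(nextStepAfterLaunch(launched).kind);
```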

list_runs

List recent runs with summary data. Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| limit | number (1–200) | 20 | Max runs to return |
| status | string | - | Filter by status (running, finished, failed, etc.) |
Output:
{
  runs: RunSummary[];
}
RunSummary fields: runId, workflowName, workflowPath, parentRunId, status, createdAtMs, startedAtMs, finishedAtMs, heartbeatAtMs, activeNodeId, activeNodeLabel, pendingApprovalCount, waitingTimers, countsByState.

get_run

Get the full detail record for a specific run, including steps, approvals, timers, loop state, lineage, config, and error. Input:
| Parameter | Type | Description |
| --- | --- | --- |
| runId | string | Run ID |
Output:
{
  run: RunSummary & {
    steps: Array<{ nodeId, iteration, state, lastAttempt, updatedAtMs, outputTable, label }>;
    approvals: PendingApproval[];
    loops: Array<{ loopId, iteration, maxIterations }>;
    continuedFromRunIds: string[];
    activeDescendantRunId: string | null;
    config: unknown | null;
    error: unknown | null;
  };
}

watch_run

Poll a run at a fixed interval until it reaches a terminal state or a timeout expires. Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run to watch |
| intervalMs | number | 1000 | Poll interval (minimum enforced by runtime) |
| timeoutMs | number | 30000 | Wall-clock budget before giving up |
Output:
{
  runId: string;
  intervalMs: number;
  pollCount: number;
  reachedTerminal: boolean;
  timedOut: boolean;
  finalRun: RunSummary;
  snapshots: Array<{ observedAtMs: number; run: RunSummary }>;
}
When timedOut is true the run is still active, so call watch_run again or raise timeoutMs. Terminal statuses: any status other than running, waiting-approval, waiting-event, or waiting-timer, including finished, failed, cancelled, and continued.
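The terminal-status rule and the retry condition above can be written as two small predicates. This is a client-side sketch, not Smithers code; `isTerminalStatus`, `shouldWatchAgain`, and `WatchOutcome` are hypothetical names, and the non-terminal list is copied from the paragraph above.

```typescript
// Statuses that are NOT terminal, per the rule above; everything else is terminal.
const NON_TERMINAL_STATUSES = ["running", "waiting-approval", "waiting-event", "waiting-timer"];

function isTerminalStatus(status: string): boolean {
  return !NON_TERMINAL_STATUSES.includes(status);
}

// Subset of the watch_run output used to decide on a follow-up call.
interface WatchOutcome {
  reachedTerminal: boolean;
  timedOut: boolean;
  finalRun: { status: string };
}

// True when the watch budget ran out before the run finished.
function shouldWatchAgain(outcome: WatchOutcome): boolean {
  return outcome.timedOut && !outcome.reachedTerminal;
}

console.log(isTerminalStatus("continued"));
console.log(isTerminalStatus("waiting-approval"));
```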

explain_run

Return a structured diagnosis explaining why a run is blocked, waiting, or stale. Input:
| Parameter | Type | Description |
| --- | --- | --- |
| runId | string | Run ID |
Output:
{
  diagnosis: {
    runId: string;
    status: string;
    summary: string;
    generatedAtMs: number;
    blockers: Array<{
      kind: string;
      nodeId: string;
      iteration: number | null;
      reason: string;
      waitingSince: number;
      unblocker: string;
      context?: string;
      signalName?: string | null;
      dependencyNodeId?: string | null;
      firesAtMs?: number | null;
      remainingMs?: number | null;
      attempt?: number | null;
      maxAttempts?: number | null;
    }>;
    currentNodeId: string | null;
  };
}
summary is a human-readable sentence. blockers lists every node preventing progress; unblocker describes what action or event would unblock it.

list_pending_approvals

List approvals that are waiting for a human decision, optionally filtered by run, workflow, or node. Input: All parameters optional. Omit all to list every pending approval across all runs.
| Parameter | Type | Description |
| --- | --- | --- |
| runId | string | Filter by run ID |
| workflowName | string | Filter by workflow name |
| nodeId | string | Filter by node ID |
Output:
{
  approvals: Array<{
    runId: string;
    nodeId: string;
    iteration: number;
    status: string;
    requestedAtMs: number | null;
    decidedAtMs: number | null;
    note: string | null;
    decidedBy: string | null;
    request: unknown;
    decision: unknown;
    autoApproved?: boolean;
    workflowName: string | null;
    runStatus: string | null;
    nodeLabel: string | null;
  }>;
}

resolve_approval

Approve or deny a pending approval. This tool is destructive and non-idempotent. Input:
| Parameter | Type | Description |
| --- | --- | --- |
| action | "approve" \| "deny" | Required. Decision to record |
| runId | string | Filter to a specific run |
| workflowName | string | Filter by workflow name |
| nodeId | string | Filter by node ID |
| iteration | number | Filter by loop iteration |
| note | string | Optional note to record with the decision |
| decidedBy | string | Identity of the decision-maker |
| decision | unknown | Structured decision payload passed back to the workflow |
Ambiguity guard: zero matches returns an INVALID_INPUT error. More than one match also returns INVALID_INPUT, with the candidates listed in details.matches; add runId, nodeId, or iteration to narrow the selection. The tool never guesses when multiple approvals match.
Output:
{
  action: "approve" | "deny";
  approval: PendingApproval;   // with updated status, decidedAtMs, note, decidedBy
  run: RunSummary | null;
}
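The ambiguity guard amounts to "exactly one match or an error". The sketch below models that rule on an in-memory list; it is an illustration rather than the server's code, and `ApprovalRef`, `ApprovalFilter`, `Selection`, `selectApproval`, and the error message strings are hypothetical.

```typescript
interface ApprovalRef {
  runId: string;
  workflowName: string | null;
  nodeId: string;
  iteration: number;
}

interface ApprovalFilter {
  runId?: string;
  workflowName?: string;
  nodeId?: string;
  iteration?: number;
}

type Selection =
  | { ok: true; approval: ApprovalRef }
  | { ok: false; code: "INVALID_INPUT"; message: string; matches: ApprovalRef[] };

function selectApproval(pending: ApprovalRef[], filter: ApprovalFilter): Selection {
  // An omitted filter field matches every approval.
  const matches = pending.filter(
    (a) =>
      (filter.runId === undefined || a.runId === filter.runId) &&
      (filter.workflowName === undefined || a.workflowName === filter.workflowName) &&
      (filter.nodeId === undefined || a.nodeId === filter.nodeId) &&
      (filter.iteration === undefined || a.iteration === filter.iteration),
  );
  const only = matches[0];
  if (matches.length === 1 && only !== undefined) {
    return { ok: true, approval: only };
  }
  // Zero or several matches: report the candidates instead of guessing.
  const message =
    matches.length === 0
      ? "no pending approval matches the filter"
      : "multiple pending approvals match; narrow with runId, nodeId, or iteration";
  return { ok: false, code: "INVALID_INPUT", message, matches };
}

const pendingApprovals: ApprovalRef[] = [
  { runId: "smi_abc123", workflowName: "bugfix", nodeId: "deploy", iteration: 0 },
  { runId: "smi_abc123", workflowName: "bugfix", nodeId: "review", iteration: 0 },
];

console.log(selectApproval(pendingApprovals, { runId: "smi_abc123" }).ok);
console.log(selectApproval(pendingApprovals, { runId: "smi_abc123", nodeId: "deploy" }).ok);
```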

ask_human

Block the current run and ask a human to make a decision, then wait for their answer. Use this whenever the agent is blocked, uncertain, missing information, or about to take an irreversible or destructive action, instead of guessing. The tool creates a durable, pending human request and returns only once it is resolved.

When run inside a Smithers task, the run/node context is taken from the SMITHERS_RUN_ID / SMITHERS_NODE_ID / SMITHERS_ITERATION environment variables Smithers injects into the agent; pass runId/nodeId/iteration explicitly to override, or rely on single-active-run autodetection.

The orchestrating agent resolves the request on the human’s behalf: relay the question to the human in conversation, collect their decision, then run bunx smithers-orchestrator human answer <requestId> --value '<json>' (or bunx smithers-orchestrator human cancel <requestId>) yourself; never instruct the human to run these. bunx smithers-orchestrator human inbox lists everything waiting.

Input:
| Parameter | Type | Description |
| --- | --- | --- |
| prompt | string | Required. The decision or question to put to a human |
| context | string | Extra context appended to the prompt |
| choices | string[] | Fixed choices; restricts the human’s answer to one of these |
| runId | string | Run to attach to (default: SMITHERS_RUN_ID or the single active run) |
| nodeId | string | Node to attach to (default: SMITHERS_NODE_ID) |
| iteration | number | Loop iteration (default: SMITHERS_ITERATION or 0) |
| timeoutSeconds | number | Seconds before the request expires (0/unset = no timeout) |
| pollSeconds | number | Poll interval while blocking (default 3s) |
Output:
{
  requestId: string;
  runId: string;
  nodeId: string;
  iteration: number;
  status: "answered" | "cancelled" | "expired" | "missing" | "aborted";
  decision: "approved" | "blocked";   // "blocked" => do not proceed
  response: unknown | null;            // the human's answer when status is "answered"
  answeredBy: string | null;
}
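An agent consuming this output needs a clear stop condition. The sketch below takes a deliberately conservative reading, which is an assumption rather than documented behavior: it proceeds only when status is "answered" and decision is "approved", and treats every other combination as a stop. `mayProceed` is a hypothetical helper name.

```typescript
type AskHumanStatus = "answered" | "cancelled" | "expired" | "missing" | "aborted";

// Subset of the ask_human output used for the stop/continue decision.
interface AskHumanData {
  requestId: string;
  status: AskHumanStatus;
  decision: "approved" | "blocked";
  response: unknown | null;
  answeredBy: string | null;
}

// Conservative client-side rule (an assumption): anything other than an
// answered, approved request means the agent should not proceed.
function mayProceed(result: AskHumanData): boolean {
  return result.status === "answered" && result.decision === "approved";
}

const expired: AskHumanData = {
  requestId: "req_1", // hypothetical request ID
  status: "expired",
  decision: "blocked",
  response: null,
  answeredBy: null,
};
console.log(mayProceed(expired));
```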

get_node_detail

Get enriched detail for a single node, including all attempts, tool calls, token usage, scorer results, and validated output. Input:
| Parameter | Type | Description |
| --- | --- | --- |
| runId | string | Required |
| nodeId | string | Required |
| iteration | number | Loop iteration (default: latest) |
Output:
{
  detail: {
    node: { runId, nodeId, iteration, state, lastAttempt, updatedAtMs, outputTable, label };
    status: string;
    durationMs: number | null;
    attemptsSummary: { total, failed, cancelled, succeeded, waiting };
    attempts: unknown[];
    toolCalls: unknown[];
    tokenUsage: unknown;
    scorers: unknown[];
    output: {
      validated: unknown | null;
      raw: unknown | null;
      source: "cache" | "output-table" | "none";
      cacheKey: string | null;
    };
    approval: PendingApproval | null;
    limits: {
      toolPayloadBytesHuman: number;
      validatedOutputBytesHuman: number;
    };
  };
}

revert_attempt

Revert the workspace and frame history back to the state captured at a specific attempt. This is destructive and non-idempotent. Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run containing the node |
| nodeId | string | required | Node to revert |
| iteration | number | 0 | Loop iteration |
| attempt | number | required | Attempt number to revert to (must be ≥ 1) |
Output:
{
  runId: string;
  nodeId: string;
  iteration: number;
  attempt: number;
  success: boolean;
  error?: string;
  jjPointer?: string;
  run: RunSummary | null;
}

fork_run

Create a branched run from a time-travel snapshot checkpoint without starting it. Input:
| Parameter | Type | Description |
| --- | --- | --- |
| parentRunId | string | Source run ID |
| frameNo | number | Snapshot frame number |
| resetNodes | string[] | Node IDs to reset to pending in the fork |
| inputOverrides | Record<string, unknown> | Input fields to overlay on the snapshot input |
| branchLabel | string | Optional branch label |
Output: { runId, parentRunId, parentFrameNo, branch, snapshot, run }

replay_run

Fork a run from a checkpoint for replay, optionally restoring VCS state. Resume the returned runId with run_workflow when needed. Input: same as fork_run, plus:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| restoreVcs | boolean | false | Restore the working copy to the source frame revision |
| cwd | string | - | Working directory used for VCS restore |
Output: { runId, parentRunId, parentFrameNo, branch, snapshot, vcsRestored, vcsPointer, vcsError?, run }

rewind_run

Rewind a run to a previous frame, deleting later frames and invalidating derived state. This is destructive and requires confirm: true. Input:
| Parameter | Type | Description |
| --- | --- | --- |
| runId | string | Run to rewind |
| frameNo | number | Target frame number |
| confirm | boolean | Must be true |
Output: { result, run }

restore_checkpoint

Restore the worktree to a durability checkpoint for a node. If seq is omitted, the latest matching checkpoint is used. Input:
| Parameter | Type | Description |
| --- | --- | --- |
| runId | string | Run containing the checkpoint |
| nodeId | string | Node whose checkpoint should be restored |
| iteration | number | Optional loop iteration |
| seq | number | Optional checkpoint sequence |
Output: { runId, nodeId, iteration, seq, commitId, cwd, success, error? }

list_snapshots

List durability workspace checkpoints for a run, with matching VCS operation IDs when available.
Input: { runId: string }
Output: { snapshots: Array<{ seq, nodeId, iteration, attempt, tier, source, label, commitId, operationId, cwd, createdAtMs }> }

get_timeline

Return the time-travel timeline for a run, optionally including all child forks recursively. Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run ID |
| tree | boolean | false | Include child forks recursively |
Output: { timeline: unknown }

time_travel

Reset a run back to a prior node attempt and optionally restore VCS state. If the run is still marked running, pass force: true. Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run ID |
| nodeId | string | required | Node to travel back to |
| iteration | number | 0 | Loop iteration |
| attempt | number | latest | Attempt number |
| restoreVcs | boolean | true | Restore filesystem state |
| resetDependents | boolean | true | Reset dependent nodes too |
| force | boolean | false | Allow time travel when the run is still running |
Output: { result, run }

list_artifacts

List structured output artifacts produced by nodes in a run. Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run ID |
| nodeId | string | - | Limit to a specific node |
| includeRaw | boolean | false | Include raw (pre-validation) output values |
Output:
{
  artifacts: Array<{
    artifactId: string;   // "<runId>:<nodeId>:<iteration>"
    kind: "node-output";
    runId: string;
    nodeId: string;
    iteration: number;
    label: string | null;
    state: string;
    outputTable: string | null;
    source: "cache" | "output-table" | "none";
    cacheKey: string | null;
    value: unknown | null;
    rawValue?: unknown | null;   // only when includeRaw=true
  }>;
}
Only nodes with an outputTable and a non-none output source are included.
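The inclusion rule and the artifactId format can be expressed as two one-line helpers. This is a sketch based only on the shapes documented above; `NodeOutputInfo`, `isListedArtifact`, and `artifactIdFor` are hypothetical names, and the sample outputTable value is made up.

```typescript
// Subset of the artifact fields needed for the two rules.
interface NodeOutputInfo {
  runId: string;
  nodeId: string;
  iteration: number;
  outputTable: string | null;
  source: "cache" | "output-table" | "none";
}

// A node is listed only when it has an output table and a non-"none" source.
function isListedArtifact(node: NodeOutputInfo): boolean {
  return node.outputTable !== null && node.source !== "none";
}

// Builds the documented "<runId>:<nodeId>:<iteration>" identifier.
function artifactIdFor(node: NodeOutputInfo): string {
  return `${node.runId}:${node.nodeId}:${node.iteration}`;
}

const analyzeNode: NodeOutputInfo = {
  runId: "smi_abc123",
  nodeId: "analyze",
  iteration: 0,
  outputTable: "analysis", // hypothetical table name
  source: "output-table",
};

console.log(isListedArtifact(analyzeNode), artifactIdFor(analyzeNode));
```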

get_chat_transcript

Return the structured agent chat transcript for a run, grouped by attempts. Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run ID |
| all | boolean | false | Include all attempts, not just those with known output events |
| includeStderr | boolean | true | Include stderr messages |
| tail | number | - | Return only the last N messages |
Output:
{
  runId: string;
  attempts: Array<{
    attemptKey: string;
    nodeId: string;
    iteration: number;
    attempt: number;
    state: string;
    startedAtMs: number;
    finishedAtMs: number | null;
    cached: boolean;
    meta: unknown | null;
  }>;
  messages: Array<{
    id: string;
    attemptKey: string;
    nodeId: string;
    iteration: number;
    attempt: number;
    role: "user" | "assistant" | "stderr";
    stream: "stdout" | "stderr" | null;
    timestampMs: number;
    text: string;
    source: "prompt" | "event" | "responseText";
  }>;
}
Messages are sorted by timestampMs. Use tail to limit context window usage on long transcripts.
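Because messages arrive as one flat, time-ordered list, a client that wants per-attempt conversations can regroup them by attemptKey. The sketch below is one way to do that; `TranscriptMessage` is trimmed to the fields used, `groupByAttempt` is a hypothetical helper, and the attemptKey values in the sample are invented placeholders since their real format is not specified here.

```typescript
// Subset of the message fields needed for grouping.
interface TranscriptMessage {
  attemptKey: string;
  role: "user" | "assistant" | "stderr";
  timestampMs: number;
  text: string;
}

// Bucket messages by attemptKey, preserving the incoming timestamp order.
function groupByAttempt(messages: TranscriptMessage[]): Record<string, TranscriptMessage[]> {
  const groups: Record<string, TranscriptMessage[]> = {};
  for (const message of messages) {
    const bucket = groups[message.attemptKey] ?? [];
    bucket.push(message);
    groups[message.attemptKey] = bucket;
  }
  return groups;
}

const transcript: TranscriptMessage[] = [
  { attemptKey: "a1", role: "user", timestampMs: 1, text: "Fix the bug" },
  { attemptKey: "a2", role: "user", timestampMs: 2, text: "Retry with logs" },
  { attemptKey: "a1", role: "assistant", timestampMs: 3, text: "Patched" },
];

const grouped = groupByAttempt(transcript);
console.log(Object.keys(grouped));
```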

get_run_events

Return the raw structured event history for a run with optional filtering. Input:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| runId | string | required | Run ID |
| afterSeq | number | - | Only events with seq greater than this value |
| limit | number (1–10000) | 200 | Max events to return |
| nodeId | string | - | Filter to events for a specific node |
| types | string[] | - | Filter to specific event types (e.g. ["NodeFinished", "NodeFailed"]) |
| sinceTimestampMs | number | - | Only events at or after this timestamp |
Output:
{
  runId: string;
  events: Array<{
    runId: string;
    seq: number;
    timestampMs: number;
    type: string;
    payload: unknown | null;
  }>;
}
Paginate via afterSeq: pass the seq of the last received event to fetch the next page.
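The afterSeq cursor lends itself to a simple collection loop. The sketch below assumes events come back in ascending seq order and that a page shorter than limit means the history is exhausted; neither is stated above. `GetRunEvents` stands in for whatever function your MCP client uses to call get_run_events and unwrap its data, and `fetchAllEvents` is a hypothetical helper.

```typescript
interface RunEvent {
  runId: string;
  seq: number;
  timestampMs: number;
  type: string;
  payload: unknown | null;
}

// Hypothetical adapter around the client's get_run_events tool call.
type GetRunEvents = (args: {
  runId: string;
  afterSeq?: number;
  limit: number;
}) => Promise<{ runId: string; events: RunEvent[] }>;

async function fetchAllEvents(
  getRunEvents: GetRunEvents,
  runId: string,
  limit = 200,
): Promise<RunEvent[]> {
  const all: RunEvent[] = [];
  let afterSeq: number | undefined;
  while (true) {
    // Omit afterSeq on the first call, then pass the last seen seq.
    const args = afterSeq === undefined ? { runId, limit } : { runId, afterSeq, limit };
    const page = await getRunEvents(args);
    const last = page.events[page.events.length - 1];
    if (last === undefined) break; // empty page: nothing more to read
    all.push(...page.events);
    afterSeq = last.seq;
    // Assumption: a short page means there are no further events.
    if (page.events.length < limit) break;
  }
  return all;
}
```

To tail a live run instead of stopping, keep the last afterSeq and call again later; the short-page break only marks the end of what was recorded at that moment.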

Usage Examples

List workflows and start a run

> list_workflows {}

{
  "ok": true,
  "data": {
    "workflows": [
      { "id": "bugfix", "displayName": "bugfix", "entryFile": "./workflows/bugfix.tsx", "sourceType": "user" }
    ]
  }
}

> run_workflow { "workflowId": "bugfix", "prompt": "Fix the auth token expiry bug" }

{
  "ok": true,
  "data": {
    "runId": "smi_abc123",
    "launchMode": "background",
    "status": "running",
    ...
  }
}

Watch until complete

> watch_run { "runId": "smi_abc123", "timeoutMs": 120000 }

{
  "ok": true,
  "data": {
    "reachedTerminal": true,
    "timedOut": false,
    "finalRun": { "status": "finished", ... }
  }
}

Resolve a pending approval

> list_pending_approvals { "runId": "smi_abc123" }

{
  "ok": true,
  "data": {
    "approvals": [
      { "nodeId": "deploy", "iteration": 0, "nodeLabel": "Deploy to production", ... }
    ]
  }
}

> resolve_approval { "action": "approve", "runId": "smi_abc123", "nodeId": "deploy", "decidedBy": "alice", "note": "Looks good" }

{
  "ok": true,
  "data": {
    "action": "approve",
    "approval": { "status": "approved", "decidedAtMs": 1707500100000, ... },
    "run": { "status": "running", ... }
  }
}

Debug a blocked run

> explain_run { "runId": "smi_abc123" }

{
  "ok": true,
  "data": {
    "diagnosis": {
      "summary": "Run is waiting for a human approval on node 'deploy'.",
      "blockers": [
        {
          "kind": "approval",
          "nodeId": "deploy",
          "reason": "Node requires human approval before proceeding.",
          "unblocker": "Call resolve_approval with action=approve or action=deny."
        }
      ]
    }
  }
}

Revert a failed attempt

> get_node_detail { "runId": "smi_abc123", "nodeId": "analyze" }

{
  "ok": true,
  "data": {
    "detail": {
      "attemptsSummary": { "total": 3, "failed": 2, "succeeded": 1 },
      ...
    }
  }
}

> revert_attempt { "runId": "smi_abc123", "nodeId": "analyze", "attempt": 1 }

{
  "ok": true,
  "data": {
    "success": true,
    "run": { "status": "running", ... }
  }
}

Error Codes

Errors follow the structured envelope. Common codes:
| Code | Meaning |
| --- | --- |
| RUN_NOT_FOUND | No run or workflow exists with the given ID |
| INVALID_INPUT | Missing required field, failed validation, or ambiguous approval filter |
| WORKFLOW_MISSING_DEFAULT | Workflow file has no default export |