Definition

A headless agent runtime is a backend execution system that consumes a declarative agent definition, validates it, compiles it into an executable workflow, executes it without UI dependency, persists execution state, enforces permissions, and exposes runtime state through APIs, events, and traces.

It is the execution authority of the agentic system.

It is independent from the visual editor (see Visual agent system designers architecture). The editor produces definitions; the runtime executes them.

agent definition (JSON)
→ validation
→ compilation
→ runtime execution
→ persistence
→ observability
→ external interfaces

The runtime must be deterministic at the orchestration layer and bounded at the LLM decision layer.


Core Responsibilities

The runtime owns:

  1. definition loading
  2. schema validation
  3. compilation
  4. execution scheduling
  5. state persistence
  6. retries
  7. human approval checkpoints
  8. permission enforcement
  9. tool execution
  10. observability
  11. audit logging
  12. termination

The runtime does not own:

  1. canvas rendering
  2. node dragging
  3. visual graph editing
  4. design-time UX

Canonical Input

The canonical input is the stable declarative specification of the agent system. It is not framework-specific code; it is a portable execution contract.

{
  "graphId":      "refund-agent",
  "version":      4,
  "stateSchema":  {},
  "nodes":        [],
  "edges":        [],
  "permissions":  {},
  "runtimePolicy": {}
}

Required properties:

  1. versioned
  2. immutable after publish
  3. schema-validated
  4. environment-independent
  5. compiler-target neutral

The runtime never executes raw frontend objects.
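As an illustration, the contract could be typed as below. Only the top-level keys come from the example above; the nested shapes (NodeDef, EdgeDef) and the deepFreeze helper are assumptions, and in-process freezing is just one way to approximate "immutable after publish".

```typescript
// Hypothetical typing of the canonical definition. Nested shapes are assumptions.
interface NodeDef { id: string; type: string; config: Record<string, unknown>; }
interface EdgeDef { from: string; to: string; on?: string; }

interface AgentDefinition {
  readonly graphId: string;
  readonly version: number;
  readonly stateSchema: Record<string, unknown>;
  readonly nodes: readonly NodeDef[];
  readonly edges: readonly EdgeDef[];
  readonly permissions: Record<string, unknown>;
  readonly runtimePolicy: Record<string, unknown>;
}

// Recursively freeze a value so a published definition cannot be mutated in place.
function deepFreeze<T>(value: T): T {
  const v: unknown = value;
  if (typeof v === "object" && v !== null) {
    for (const child of Object.values(v)) deepFreeze(child);
    Object.freeze(v);
  }
  return value;
}

const published: AgentDefinition = deepFreeze({
  graphId: "refund-agent",
  version: 4,
  stateSchema: {},
  nodes: [],
  edges: [],
  permissions: {},
  runtimePolicy: {},
});
```

A real runtime would more likely enforce immutability in storage (append-only versions) than in memory.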


Validation Layer

Validation runs before compilation and before publish. A validation failure must block publication, not just produce a warning.

Structural

  1. exactly one start node
  2. terminal path exists
  3. no orphan nodes
  4. no illegal cycles
  5. node and edge references resolve

Semantic

  1. node configs valid against per-type schemas
  2. output schemas resolvable
  3. prompt contracts valid
  4. tool definitions complete
  5. state references resolve

Security

  1. forbidden tools blocked
  2. approval required for sensitive tools
  3. tenant boundaries enforced
  4. model usage allowed
  5. secret access rules valid

Implementations: Zod or JSON Schema for schema checks; Open Policy Agent for policy checks.
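A minimal sketch of some of the structural checks, written without a schema library. The Graph shape and the "start" / "end" type names are assumptions. Cycle detection is omitted because what counts as an illegal cycle depends on the target.

```typescript
// Hypothetical minimal graph shape; "start" and "end" type names are assumptions.
interface GNode { id: string; type: string; }
interface GEdge { from: string; to: string; }
interface Graph { nodes: GNode[]; edges: GEdge[]; }

// Returns blocking errors; an empty list means these checks passed.
function validateStructure(g: Graph): string[] {
  const errors: string[] = [];
  const ids = new Set(g.nodes.map((n) => n.id));

  // Exactly one start node.
  const starts = g.nodes.filter((n) => n.type === "start");
  if (starts.length !== 1) {
    errors.push(`expected exactly one start node, found ${starts.length}`);
  }

  // Edge references resolve.
  for (const e of g.edges) {
    if (!ids.has(e.from)) errors.push(`edge references unknown node: ${e.from}`);
    if (!ids.has(e.to)) errors.push(`edge references unknown node: ${e.to}`);
  }

  // Breadth-first walk from the start node to collect reachable nodes.
  const reachable = new Set<string>();
  const queue: string[] = starts.length === 1 ? starts.map((n) => n.id) : [];
  while (queue.length > 0) {
    const id = queue.shift() as string;
    if (reachable.has(id)) continue;
    reachable.add(id);
    for (const e of g.edges) if (e.from === id) queue.push(e.to);
  }

  if (starts.length === 1) {
    // No orphan nodes, and at least one terminal node is reachable.
    for (const n of g.nodes) {
      if (!reachable.has(n.id)) errors.push(`orphan node: ${n.id}`);
    }
    const hasTerminal = g.nodes.some((n) => n.type === "end" && reachable.has(n.id));
    if (!hasTerminal) errors.push("no terminal node reachable from start");
  }
  return errors;
}
```

Publishing would be refused whenever the returned list is non-empty.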


Compiler Layer

The compiler transforms the canonical definition into an executable runtime graph for a specific target engine.

The compiler must be deterministic: same definition → same compiled output.

Targets

Target              Best for                                                       See also
LangGraph           Long-running agents, tool loops, retrieval, HIL                1.2.10
XState              Deterministic approval flows, explicit transitions, UI sync    1.2.9
Temporal            Enterprise durability, distributed retries, infra reliability  -
AWS Step Functions  Cross-service AWS-native orchestration                         -
Custom              Domain-specific orchestrators                                  -

The compiler-target mapping rules for LangGraph and XState are described in the Visual agent system designers architecture note; the runtime depends on those mappings being deterministic.
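The determinism requirement can be checked mechanically. The sketch below compiles to a toy adjacency map (not a real LangGraph or XState artifact) and sorts everything it emits, so even a reordered definition yields byte-identical output. That is stricter than "same definition, same output" and is an assumption of this example, as are all type names.

```typescript
// Hypothetical input and output shapes; real compiler targets differ.
interface Edge { from: string; to: string; }
interface Definition {
  graphId: string;
  version: number;
  nodes: { id: string }[];
  edges: Edge[];
}
interface CompiledGraph { key: string; transitions: Record<string, string[]>; }

// Emit a sorted adjacency map so input ordering cannot leak into the output.
function compile(def: Definition): CompiledGraph {
  const transitions: Record<string, string[]> = {};
  const nodeIds = def.nodes.map((n) => n.id).sort();
  for (const id of nodeIds) {
    transitions[id] = def.edges
      .filter((e) => e.from === id)
      .map((e) => e.to)
      .sort();
  }
  return { key: `${def.graphId}@${def.version}`, transitions };
}

const a: Definition = {
  graphId: "refund-agent",
  version: 4,
  nodes: [{ id: "start" }, { id: "review" }, { id: "done" }],
  edges: [{ from: "start", to: "review" }, { from: "review", to: "done" }],
};
// Same content, different ordering.
const b: Definition = { ...a, nodes: [...a.nodes].reverse(), edges: [...a.edges].reverse() };

const same = JSON.stringify(compile(a)) === JSON.stringify(compile(b));
```

Comparing serialized output (or a hash of it) in CI is one cheap way to catch nondeterminism introduced by a compiler change.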


Definition vs Runtime Instance

Two distinct persistence concerns: the definition and the runtime instance.

Definition:

{ "graphId": "refund-agent", "version": 4 }

Runtime instance:

{
  "runId":       "run_9241",
  "graphId":     "refund-agent",
  "version":     4,
  "currentNode": "awaitingApproval",
  "status":      "paused"
}

A definition is a system contract. A runtime instance is an operational record. One definition can back many instances, potentially millions.

Every instance pins the graph version at start time. A redesign mid-flight does not corrupt in-flight executions.
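Version pinning can be sketched as follows. The in-memory maps stand in for a definition store, and publish / startRun are hypothetical names, not part of any listed framework.

```typescript
interface DefinitionRef { graphId: string; version: number; }
interface RunInstance extends DefinitionRef {
  runId: string;
  currentNode: string;
  status: "running" | "paused" | "completed";
}

// In-memory stand-ins for a definition store, keyed by "graphId@version".
const definitions = new Map<string, { startNode: string }>();
const latestVersion = new Map<string, number>();

function publish(graphId: string, version: number, startNode: string): void {
  const key = `${graphId}@${version}`;
  if (definitions.has(key)) throw new Error(`${key} is already published and immutable`);
  definitions.set(key, { startNode });
  latestVersion.set(graphId, Math.max(version, latestVersion.get(graphId) ?? 0));
}

// The version is resolved once, at start time, and copied onto the instance.
function startRun(runId: string, graphId: string): RunInstance {
  const version = latestVersion.get(graphId);
  if (version === undefined) throw new Error(`no published version of ${graphId}`);
  const def = definitions.get(`${graphId}@${version}`);
  if (def === undefined) throw new Error(`missing definition ${graphId}@${version}`);
  return { runId, graphId, version, currentNode: def.startNode, status: "running" };
}

publish("refund-agent", 4, "intake");
const run = startRun("run_9241", "refund-agent");
publish("refund-agent", 5, "triage"); // a redesign published mid-flight
// run.version is still 4; only runs started from now on pick up version 5.
```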


Execution Engine

The execution engine processes runtime instances through valid execution states.

load state
→ execute node
→ validate output
→ choose next transition
→ persist checkpoint
→ continue or pause

Required capabilities:

  1. synchronous execution
  2. asynchronous execution
  3. pause / resume
  4. retries
  5. timeouts
  6. cancellations
  7. rollback paths

The engine surface for these concerns is detailed in 1.1.1 State machines in agentic systems.
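The loop at the top of this section can be sketched as a synchronous function. Every name here (RunState, NodeSpec, runUntilPauseOrEnd) is hypothetical, and retries, timeouts, cancellation, and rollback are left out to keep the ordering visible.

```typescript
// All names are hypothetical; this only illustrates the ordering of the loop.
type Status = "running" | "paused" | "completed";
interface RunState { currentNode: string; status: Status; data: Record<string, unknown>; }
interface NodeResult { output: Record<string, unknown>; next: string | null; pause?: boolean; }
interface NodeSpec {
  run: (state: RunState) => NodeResult;
  isValidOutput: (output: Record<string, unknown>) => boolean;
}

function runUntilPauseOrEnd(
  loaded: RunState,                        // load state
  nodes: Record<string, NodeSpec>,
  checkpoint: (state: RunState) => void,
  maxSteps: number = 100,                  // hard bound on steps per invocation
): RunState {
  let state = loaded;
  for (let step = 0; step < maxSteps && state.status === "running"; step++) {
    const spec = nodes[state.currentNode];
    if (spec === undefined) throw new Error(`unknown node: ${state.currentNode}`);

    const result = spec.run(state);        // execute node
    if (!spec.isValidOutput(result.output)) {
      throw new Error(`invalid output from ${state.currentNode}`); // validate output
    }

    // Choose next transition: null means the run is finished.
    const status: Status =
      result.next === null ? "completed" : result.pause === true ? "paused" : "running";
    state = {
      currentNode: result.next ?? state.currentNode,
      status,
      data: { ...state.data, ...result.output },
    };
    checkpoint(state);                     // persist checkpoint, then continue or pause
  }
  return state;
}
```

Resuming a paused run amounts to loading its checkpoint, setting the status back to "running", and calling the same function again.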


LLM Boundary

The LLM boundary is the deterministic interface where model output enters the runtime.

LLM output → structured event → validation → transition

Not:

LLM output → direct execution authority

Before execution continues, the runtime validates that the event is allowed in the current state, the payload matches its schema, permissions are satisfied, and the approval policy is respected.

For the full event-boundary pattern, see 1.1.1 State machines in agentic systems.
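A minimal sketch of the boundary, assuming the model is asked to emit a JSON event. The state names, the REQUEST_REFUND event, and the admit function are invented for illustration. Permission and approval checks would follow the same reject-by-default shape and are omitted.

```typescript
interface ProposedEvent { type: string; payload: unknown; }
type Verdict = { ok: true; event: ProposedEvent } | { ok: false; reason: string };

// Hypothetical per-state allowlist and per-event payload validators.
const allowedEvents: Record<string, string[]> = {
  collectingDetails: ["REQUEST_REFUND"],
  awaitingApproval: [],
};
const payloadValidators: Record<string, (p: unknown) => boolean> = {
  REQUEST_REFUND: (p) =>
    typeof p === "object" && p !== null &&
    typeof (p as { amount?: unknown }).amount === "number",
};

// Turn raw model output into a validated event, or reject it with a reason.
function admit(rawModelOutput: string, currentState: string): Verdict {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawModelOutput);
  } catch {
    return { ok: false, reason: "output is not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, reason: "output is not an object" };
  }
  const candidate = parsed as { type?: unknown; payload?: unknown };
  if (typeof candidate.type !== "string") {
    return { ok: false, reason: "missing event type" };
  }
  if (!(allowedEvents[currentState] ?? []).includes(candidate.type)) {
    return { ok: false, reason: `event ${candidate.type} not allowed in state ${currentState}` };
  }
  const validate = payloadValidators[candidate.type];
  if (validate === undefined || !validate(candidate.payload)) {
    return { ok: false, reason: "payload failed schema validation" };
  }
  return { ok: true, event: { type: candidate.type, payload: candidate.payload } };
}
```

Only a Verdict with ok set to true is ever handed to the transition logic; free-form model text never reaches it.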


Tool Execution

Tool execution is isolated from LLM authority. The LLM proposes; the runtime executes.

Tool Registry

The tool registry is the authoritative catalog of executable tools.

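// RefundSchema and runRefund are assumed to be defined elsewhere.
const toolRegistry = {
  refund_customer: {
    schema:           RefundSchema,   // input schema
    requiresApproval: true,           // approval rule
    executor:         runRefund,      // executor binding
    timeout:          30_000,         // timeout policy (presumably milliseconds)
    retryPolicy:      'exponential'   // retry policy
  }
};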

Per tool entry:

  1. tool name
  2. input schema
  3. permission rules
  4. approval rules
  5. idempotency rules
  6. executor binding
  7. timeout policy
  8. retry policy

Only registered tools may execute. Arbitrary tool execution is forbidden.

The full tool authority semantics — model-proposed action validation, allowlist enforcement, idempotency — are covered in 1.1.1 State machines in agentic systems.
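"The LLM proposes; the runtime executes" can be sketched as a dispatch function over a registry. The entry shape is simplified from the registry above (a predicate replaces the schema, and timeout and retry handling are omitted), and ToolCall, ToolOutcome, and dispatch are hypothetical names.

```typescript
// Simplified, hypothetical registry entry: a predicate stands in for the schema.
interface ToolEntry {
  isValidInput: (input: unknown) => boolean;
  requiresApproval: boolean;
  executor: (input: unknown) => unknown;
}
interface ToolCall { tool: string; input: unknown; approved: boolean; }
type ToolOutcome =
  | { status: "executed"; result: unknown }
  | { status: "needs_approval" }
  | { status: "rejected"; reason: string };

const registry = new Map<string, ToolEntry>([
  ["refund_customer", {
    isValidInput: (i) =>
      typeof i === "object" && i !== null &&
      typeof (i as { orderId?: unknown }).orderId === "string",
    requiresApproval: true,
    executor: (i) => `refunded ${(i as { orderId: string }).orderId}`,
  }],
]);

// The model's proposal is data; only this function decides whether anything runs.
function dispatch(call: ToolCall): ToolOutcome {
  const entry = registry.get(call.tool);
  if (entry === undefined) {
    return { status: "rejected", reason: `unregistered tool: ${call.tool}` };
  }
  if (!entry.isValidInput(call.input)) {
    return { status: "rejected", reason: "input failed schema validation" };
  }
  if (entry.requiresApproval && !call.approved) {
    return { status: "needs_approval" };
  }
  return { status: "executed", result: entry.executor(call.input) };
}
```

In this sketch the approved flag would be set by the runtime from a persisted approval record, never from model output.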


Human Approval

Approval is a runtime state, not a prompt instruction. The runtime persists the run at the approval checkpoint and resumes from the persisted state when the approval event arrives.

ready_to_execute
→ awaiting_manager_approval
→ approved
→ execute

Approval state must survive crashes and restarts. Approver identity and timestamp must be captured in the audit log. Replay must not re-execute side effects on an already-approved branch.

LangGraph and Temporal both expose pause/resume primitives for this pattern (interrupts in LangGraph, signals in Temporal).
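A framework-free sketch of approval as persisted state follows. The Map stands in for a durable store, and the record fields and function names are assumptions. Records are serialized on every write to mimic a round trip through storage.

```typescript
type ApprovalState = "awaiting_manager_approval" | "approved" | "executed";
interface ApprovalRecord {
  runId: string;
  state: ApprovalState;
  approver?: string;
  approvedAt?: string;
}

// In-memory stand-in for a durable store; a real runtime would use a database.
const store = new Map<string, string>();
const save = (r: ApprovalRecord): void => { store.set(r.runId, JSON.stringify(r)); };
const load = (runId: string): ApprovalRecord => {
  const raw = store.get(runId);
  if (raw === undefined) throw new Error(`unknown run ${runId}`);
  return JSON.parse(raw) as ApprovalRecord;
};

let sideEffects = 0; // counts how often the guarded action actually ran

function requestApproval(runId: string): void {
  save({ runId, state: "awaiting_manager_approval" });
}

// Captures approver identity and timestamp; duplicate or late events are ignored.
function approve(runId: string, approver: string, now: Date): void {
  const rec = load(runId);
  if (rec.state !== "awaiting_manager_approval") return;
  save({ ...rec, state: "approved", approver, approvedAt: now.toISOString() });
}

// Runs the side effect at most once per approval in this sketch. A crash
// between the effect and the save would still need an idempotent executor.
function executeIfApproved(runId: string): void {
  const rec = load(runId);
  if (rec.state !== "approved") return; // not approved yet, or already executed
  sideEffects += 1;
  save({ ...rec, state: "executed" });
}
```

Because every decision reads from the store rather than from process memory, a restart between requestApproval and approve does not lose the checkpoint.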


State Persistence

Persistence happens before unsafe actions, not after.

Per run:

  1. current node
  2. prior transitions
  3. state snapshot
  4. tool outputs
  5. approval decisions
  6. retry counters
  7. execution metadata
  8. timestamps

Storage:

  • Local: Postgres, Redis, SQLite
  • Google Cloud: Firestore, Cloud SQL, BigQuery
  • Workflow-native: Temporal, AWS Step Functions

For the broader durability concerns — idempotency, out-of-order events, recovery semantics — see 1.1.1 State machines in agentic systems.
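"Before unsafe actions, not after" can be sketched as an intent record written ahead of the side effect. The Checkpoint fields are a subset of the per-run list above; pendingAction and runUnsafeAction are hypothetical names, and the Map stands in for a durable table.

```typescript
interface Checkpoint {
  runId: string;
  currentNode: string;
  pendingAction: string | null; // set while a side effect is in flight
  retryCount: number;
  updatedAt: string;
}

const checkpoints = new Map<string, Checkpoint>(); // stand-in for a durable table
const log: string[] = [];                          // records the order of operations

function persist(cp: Checkpoint): void {
  checkpoints.set(cp.runId, { ...cp });
  log.push(`persist:${cp.pendingAction ?? "none"}`);
}

function runUnsafeAction(
  cp: Checkpoint,
  action: string,
  effect: () => void,
  now: () => Date,
): Checkpoint {
  // 1. Record the intent first, so a crash during the effect leaves evidence.
  const intent: Checkpoint = { ...cp, pendingAction: action, updatedAt: now().toISOString() };
  persist(intent);
  // 2. Perform the side effect.
  effect();
  log.push(`effect:${action}`);
  // 3. Clear the intent once the effect has completed.
  const done: Checkpoint = { ...intent, pendingAction: null, updatedAt: now().toISOString() };
  persist(done);
  return done;
}
```

On recovery, a checkpoint with a non-null pendingAction marks an action whose outcome is unknown. The runtime can then check with the tool or rely on idempotency instead of blindly re-running it.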


Scheduler

The scheduler determines when and where execution continues. It is infrastructure, not agent logic.

Responsibilities:

  1. enqueue runs
  2. distribute work across workers
  3. delay retries with backoff
  4. enforce concurrency limits
  5. prevent duplicate execution (idempotent on runId)
  6. resume paused workflows on event arrival

Common implementations: BullMQ, Redis queues, Pub/Sub, SQS, Temporal task queues.
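Three of these responsibilities (backoff, concurrency limits, duplicate prevention) fit in a small in-memory sketch. RunQueue and backoffMs are hypothetical, the base and cap values are illustrative, and a production scheduler would delegate to one of the queue systems listed above.

```typescript
// Exponential backoff with a cap; base and cap values are illustrative.
function backoffMs(attempt: number, baseMs: number = 1_000, capMs: number = 60_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

class RunQueue {
  private readonly known = new Set<string>();   // queued or running
  private readonly running = new Set<string>();
  private readonly waiting: string[] = [];
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  // Returns false when the run is already queued or running (idempotent on runId).
  enqueue(runId: string): boolean {
    if (this.known.has(runId)) return false;
    this.known.add(runId);
    this.waiting.push(runId);
    return true;
  }

  // Hands out the next run, or undefined when idle or at the concurrency limit.
  take(): string | undefined {
    if (this.running.size >= this.maxConcurrent) return undefined;
    const next = this.waiting.shift();
    if (next !== undefined) this.running.add(next);
    return next;
  }

  // Frees a concurrency slot; the runId may be enqueued again afterwards.
  complete(runId: string): void {
    if (this.running.delete(runId)) this.known.delete(runId);
  }
}
```

With the defaults, backoffMs yields 1 s, 2 s, 4 s, and so on, and stops growing at 60 s.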


Multi-Tenant Isolation

Multi-tenant isolation prevents one tenant's workflows, memory, tools, and model context from affecting another tenant. This is mandatory for enterprise SaaS.

Required boundaries:

  1. state isolation
  2. vector store isolation
  3. tool access isolation
  4. secret isolation
  5. trace isolation
  6. model context isolation

Quotas applied per tenant: tokens, tool calls, runs, cost.
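Per-tenant quotas and tenant-scoped keys can be sketched as below. The meter names mirror the quota list above; TenantQuotas, tryConsume, and the key format are assumptions.

```typescript
type Meter = "tokens" | "toolCalls" | "runs" | "costUsd";
type Limits = Record<Meter, number>;

class TenantQuotas {
  private readonly usage = new Map<string, Limits>();
  private readonly limits: Limits;

  constructor(limits: Limits) {
    this.limits = limits;
  }

  // Records the usage and returns true, or returns false (recording nothing)
  // when the request would push this tenant past its limit.
  tryConsume(tenantId: string, meter: Meter, amount: number): boolean {
    const current = this.usage.get(tenantId) ?? { tokens: 0, toolCalls: 0, runs: 0, costUsd: 0 };
    if (current[meter] + amount > this.limits[meter]) return false;
    const next: Limits = { ...current };
    next[meter] = current[meter] + amount;
    this.usage.set(tenantId, next);
    return true;
  }
}

// Every storage key carries the tenant prefix, so a lookup built for one
// tenant cannot address another tenant's state.
const stateKey = (tenantId: string, runId: string): string =>
  `tenant/${tenantId}/run/${runId}`;
```

The same prefixing idea applies to vector store namespaces, secrets, and traces, though each backend has its own isolation mechanism.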


Observability

The observability layer records execution history for debugging, replay, evaluation, and audit.

Required trace events:

  1. node started
  2. node completed
  3. edge selected
  4. tool executed
  5. tool failed
  6. approval requested
  7. approval granted
  8. LLM output received
  9. retry triggered
  10. run terminated

Exporters and backends: OpenTelemetry, LangSmith, Helicone, Braintrust, Cloud Logging, Grafana, Prometheus.

The trace stream is also the input to visual replay on the canvas — see Visual agent system designers architecture.
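The required events map naturally onto a discriminated event type. The sketch below is an in-memory recorder, not an OpenTelemetry integration. The snake_case event names, the TraceRecorder class, and the detail payload are assumptions.

```typescript
type TraceEventType =
  | "node_started" | "node_completed" | "edge_selected"
  | "tool_executed" | "tool_failed"
  | "approval_requested" | "approval_granted"
  | "llm_output_received" | "retry_triggered" | "run_terminated";

interface TraceEvent {
  runId: string;
  seq: number;                      // position within the run, for ordered replay
  type: TraceEventType;
  at: string;                       // ISO 8601 timestamp
  detail: Record<string, unknown>;
}

class TraceRecorder {
  private readonly events: TraceEvent[] = [];
  private readonly runId: string;
  private readonly now: () => Date;

  constructor(runId: string, now: () => Date = () => new Date()) {
    this.runId = runId;
    this.now = now;
  }

  record(type: TraceEventType, detail: Record<string, unknown> = {}): TraceEvent {
    const event: TraceEvent = {
      runId: this.runId,
      seq: this.events.length,
      type,
      at: this.now().toISOString(),
      detail,
    };
    this.events.push(event);
    return event;
  }

  // Replay view: the ordered list of nodes the run entered.
  path(): string[] {
    return this.events
      .filter((e) => e.type === "node_started")
      .map((e) => String(e.detail["node"]));
  }
}
```

A canvas replay could consume the same ordered events that path() filters here.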


Runtime API

The runtime API exposes execution control to external systems. No endpoint may bypass policy enforcement.

POST /definitions/publish
POST /runs/start
POST /runs/:id/pause
POST /runs/:id/resume
POST /runs/:id/cancel
GET  /runs/:id/state
GET  /runs/:id/traces
POST /approvals/:id/approve
POST /approvals/:id/reject

Streaming surfaces (Server-Sent Events, WebSocket) typically expose live trace events for the canvas. Webhook callbacks deliver completion and approval-requested events to external systems.


Stacks

Node / TypeScript

  1. Fastify or NestJS
  2. LangGraph.js or XState
  3. Postgres
  4. Redis
  5. BullMQ
  6. OpenTelemetry
  7. Zod
  8. Docker

Google Cloud

  1. Cloud Run
  2. Pub/Sub
  3. Firestore
  4. BigQuery
  5. Vertex AI
  6. Cloud Logging
  7. IAM
  8. Secret Manager

On-Prem

  1. Kubernetes
  2. Postgres
  3. Redis
  4. Ollama
  5. vLLM
  6. Prometheus
  7. Grafana
  8. Vault

The runtime architecture remains the same. Only infrastructure changes.


Production Rule

Good:

editor → canonical definition → validation → compiler → runtime

Bad:

editor → arbitrary generated code → eval

The runtime executes bounded declarative definitions. It does not execute user-generated code.


Final Architecture

published definition
→ validation
→ compiler
→ execution engine
→ scheduler
→ tool execution
→ persistence
→ approval checkpoints
→ observability
→ runtime API

The visual editor is optional. The headless runtime is the system.


Cross-references

  • Visual agent system designers architecture (this submodule) — produces the canonical definitions consumed here.
  • 1.2.10 LangGraph — primary compiler target for agent loops and HIL workflows.
  • 1.2.9 XState — primary compiler target for deterministic transition control.
  • 1.1.1 State machines in agentic systems — runtime concerns: durability, idempotency, tool authority, event boundary, approval gates.

References

  1. LangGraph overview (JS) — docs.langchain.com/oss/javascript/langgraph/overview
  2. XState v5 — stately.ai/docs/xstate
  3. Temporal — temporal.io
  4. Open Policy Agent — openpolicyagent.org
  5. OpenTelemetry — opentelemetry.io