Definition
A headless agent runtime is a backend execution system that consumes a declarative agent definition, validates it, compiles it into an executable workflow, executes it without UI dependency, persists execution state, enforces permissions, and exposes runtime state through APIs, events, and traces.
It is the execution authority of the agentic system.
It is independent from the visual editor (see Visual agent system designers architecture). The editor produces definitions; the runtime executes them.
agent definition (JSON)
→ validation
→ compilation
→ runtime execution
→ persistence
→ observability
→ external interfaces
The runtime must be deterministic at the orchestration layer and bounded at the LLM decision layer.
Core Responsibilities
The runtime owns:
- definition loading
- schema validation
- compilation
- execution scheduling
- state persistence
- retries
- human approval checkpoints
- permission enforcement
- tool execution
- observability
- audit logging
- termination
The runtime does not own:
- canvas rendering
- node dragging
- visual graph editing
- design-time UX
Canonical Input
The canonical input is the stable declarative specification of the agent system. It is not framework-specific code; it is a portable execution contract.
{
  "graphId": "refund-agent",
  "version": 4,
  "stateSchema": {},
  "nodes": [],
  "edges": [],
  "permissions": {},
  "runtimePolicy": {}
}
Required properties:
- versioned
- immutable after publish
- schema-validated
- environment-independent
- compiler-target neutral
The runtime never executes raw frontend objects.
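The shape above can be written down as a type. The following is a minimal sketch, not the actual contract: the names `AgentDefinition`, `NodeDef`, `EdgeDef`, `deepFreeze`, and `publish` are hypothetical, and the fields inside nodes and edges are assumptions. Deep freezing is one illustrative way to get "immutable after publish" inside a single process; a real system would also enforce it in storage.

```typescript
// Hypothetical types mirroring the JSON above; node and edge fields are assumptions.
interface NodeDef { id: string; type: string; config?: Record<string, unknown>; }
interface EdgeDef { from: string; to: string; }

interface AgentDefinition {
  graphId: string;
  version: number;
  stateSchema: Record<string, unknown>;
  nodes: NodeDef[];
  edges: EdgeDef[];
  permissions: Record<string, unknown>;
  runtimePolicy: Record<string, unknown>;
}

// Recursively freeze a plain JSON value so it cannot be mutated in place.
function deepFreeze<T>(value: T): T {
  if (typeof value === "object" && value !== null && !Object.isFrozen(value)) {
    Object.freeze(value);
    for (const key of Object.keys(value as object)) {
      deepFreeze((value as Record<string, unknown>)[key]);
    }
  }
  return value;
}

// Publishing copies the draft (definitions are plain JSON) and freezes the copy,
// so later edits to the editor's draft object cannot leak into the published version.
function publish(draft: AgentDefinition): Readonly<AgentDefinition> {
  const copy = JSON.parse(JSON.stringify(draft)) as AgentDefinition;
  return deepFreeze(copy);
}
```

The JSON round trip also guarantees that nothing non-serializable from the frontend (functions, class instances) survives into the published definition.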
Validation Layer
Validation runs before compilation and before publish. Failure must block publication, not warn.
Structural
- exactly one start node
- terminal path exists
- no orphan nodes
- no illegal cycles
- node and edge references resolve
Semantic
- node configs valid against per-type schemas
- output schemas resolvable
- prompt contracts valid
- tool definitions complete
- state references resolve
Security
- forbidden tools blocked
- approval required for sensitive tools
- tenant boundaries enforced
- model usage allowed
- secret access rules valid
Implementations: Zod, JSON Schema, Open Policy Agent for policy checks.
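As an illustration of the structural checks, here is a minimal dependency-free sketch. It assumes hypothetical node types `"start"` and `"end"` and a hypothetical `validateStructure` function; it covers start-node count, reference resolution, orphan detection, and terminal reachability, and deliberately leaves out cycle legality, which depends on which loops the definition format permits.

```typescript
interface GraphNode { id: string; type: string; }
interface GraphEdge { from: string; to: string; }
interface Graph { nodes: GraphNode[]; edges: GraphEdge[]; }

// Returns a list of structural errors; an empty list means the graph passes.
function validateStructure(graph: Graph): string[] {
  const errors: string[] = [];
  const ids = new Set(graph.nodes.map((n) => n.id));

  // Exactly one start node.
  const starts = graph.nodes.filter((n) => n.type === "start");
  if (starts.length !== 1) {
    errors.push(`expected exactly one start node, found ${starts.length}`);
  }

  // Every edge endpoint must refer to an existing node.
  for (const e of graph.edges) {
    if (!ids.has(e.from)) errors.push(`edge source "${e.from}" does not resolve`);
    if (!ids.has(e.to)) errors.push(`edge target "${e.to}" does not resolve`);
  }

  const start = starts[0];
  if (starts.length === 1 && start !== undefined) {
    // Breadth-first walk from the start node to collect reachable nodes.
    const reachable = new Set<string>([start.id]);
    const queue: string[] = [start.id];
    while (queue.length > 0) {
      const current = queue.shift() as string;
      for (const e of graph.edges) {
        if (e.from === current && ids.has(e.to) && !reachable.has(e.to)) {
          reachable.add(e.to);
          queue.push(e.to);
        }
      }
    }
    // Nodes never reached from start are orphans.
    for (const n of graph.nodes) {
      if (!reachable.has(n.id)) errors.push(`orphan node "${n.id}"`);
    }
    // At least one terminal node must be reachable.
    const terminalReachable = graph.nodes.some(
      (n) => n.type === "end" && reachable.has(n.id)
    );
    if (!terminalReachable) errors.push("no terminal node reachable from start");
  }
  return errors;
}
```

A publish endpoint would refuse the definition whenever the returned list is non-empty, which matches the rule that failures block publication rather than warn.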
Compiler Layer
The compiler transforms the canonical definition into an executable runtime graph for a specific target engine.
The compiler must be deterministic: same definition → same compiled output.
Targets
| Target | Best for | See also |
|---|---|---|
| LangGraph | Long-running agents, tool loops, retrieval, human-in-the-loop (HIL) | 1.2.10 |
| XState | Deterministic approval flows, explicit transitions, UI sync | 1.2.9 |
| Temporal | Enterprise durability, distributed retries, infra reliability | |
| AWS Step Functions | Cross-service AWS-native orchestration | |
| Custom | Domain-specific orchestrators | |
The compiler-target mapping rules for LangGraph and XState are described in the Visual agent system designers architecture note; the runtime depends on those mappings being deterministic.
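One way to keep compilation deterministic is to make the compiler a pure function with no clocks, random IDs, or dependence on input ordering. The sketch below assumes a hypothetical target-neutral `CompiledGraph` shape (a sorted transition table); real targets such as LangGraph or XState would emit their own structures.

```typescript
interface Edge { from: string; to: string; }
interface Definition {
  graphId: string;
  version: number;
  nodes: { id: string }[];
  edges: Edge[];
}

// Hypothetical compiled form: a transition table with stable ordering.
interface CompiledGraph {
  key: string;                        // "<graphId>@<version>"
  transitions: [string, string[]][];  // [nodeId, sorted successor ids], sorted by nodeId
}

// Pure function: no timestamps, no generated IDs, no mutation of the input.
function compile(def: Definition): CompiledGraph {
  // map() returns a new array, so sort() does not reorder def.nodes.
  const nodeIds = def.nodes.map((n) => n.id).sort();
  const transitions = nodeIds.map((id): [string, string[]] => [
    id,
    def.edges.filter((e) => e.from === id).map((e) => e.to).sort(),
  ]);
  return { key: `${def.graphId}@${def.version}`, transitions };
}
```

Because the output is sorted, two serializations of the same definition that differ only in array order compile to byte-identical output, which makes compiled artifacts easy to cache and compare.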
Definition vs Runtime Instance
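Definitions and runtime instances are two distinct persistence concerns.
Definition (identified by graph ID and version):
{ "graphId": "refund-agent", "version": 4 }
Runtime instance:
{
  "runId": "run_9241",
  "graphId": "refund-agent",
  "version": 4,
  "currentNode": "awaitingApproval",
  "status": "paused"
}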
A definition is a system contract. A runtime instance is an operational record. One definition can produce many runtime instances, potentially millions.
Every instance pins the graph version at start time. A redesign mid-flight does not corrupt in-flight executions.
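Version pinning can be sketched as follows. The `DefinitionStore` class and `startRun` function are hypothetical, and the in-memory maps stand in for durable storage; the point is only that a run records its version once, at start, and always resolves its definition through that pinned version.

```typescript
interface Definition { graphId: string; version: number; }

interface RunInstance {
  runId: string;
  graphId: string;
  version: number;            // pinned at start, never updated
  currentNode: string;
  status: "running" | "paused" | "completed";
}

class DefinitionStore {
  private byKey = new Map<string, Definition>();
  private latest = new Map<string, number>();

  publish(def: Definition): void {
    const key = `${def.graphId}@${def.version}`;
    // Published versions are immutable: republishing the same version is an error.
    if (this.byKey.has(key)) throw new Error(`${key} is already published`);
    this.byKey.set(key, def);
    const prev = this.latest.get(def.graphId) ?? 0;
    this.latest.set(def.graphId, Math.max(prev, def.version));
  }

  latestVersion(graphId: string): number {
    const v = this.latest.get(graphId);
    if (v === undefined) throw new Error(`unknown graph ${graphId}`);
    return v;
  }

  get(graphId: string, version: number): Definition {
    const def = this.byKey.get(`${graphId}@${version}`);
    if (def === undefined) throw new Error(`unknown definition ${graphId}@${version}`);
    return def;
  }
}

// New runs pin whatever version is latest at the moment they start.
function startRun(store: DefinitionStore, graphId: string, runId: string): RunInstance {
  return {
    runId,
    graphId,
    version: store.latestVersion(graphId),
    currentNode: "start",
    status: "running",
  };
}
```

A run started on version 4 keeps loading `refund-agent@4` through `store.get(run.graphId, run.version)` even after version 5 is published.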
Execution Engine
The execution engine processes runtime instances through valid execution states.
load state
→ execute node
→ validate output
→ choose next transition
→ persist checkpoint
→ continue or pause
Required capabilities:
- synchronous execution
- asynchronous execution
- pause / resume
- retries
- timeouts
- cancellations
- rollback paths
The engine surface for these concerns is detailed in 1.1.1 State machines in agentic systems.
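The loop above can be sketched as a small synchronous driver. This is an illustration under assumptions, not a real engine: `NodeHandler`, `NodeResult`, and `runUntilPauseOrEnd` are hypothetical names, handlers are synchronous, and retries, timeouts, and cancellation are omitted.

```typescript
type Status = "running" | "paused" | "completed";

interface RunState {
  currentNode: string;
  status: Status;
  data: Record<string, unknown>;
}

interface NodeResult {
  output: Record<string, unknown>;
  next: string | null;   // null means the run is finished
  pause?: boolean;       // true means stop before entering `next`
}

type NodeHandler = (data: Record<string, unknown>) => NodeResult;

function runUntilPauseOrEnd(
  initial: RunState,
  handlers: Map<string, NodeHandler>,
  checkpoint: (state: RunState) => void,
  maxSteps = 100
): RunState {
  let state = initial;                                   // load state
  let steps = 0;
  while (state.status === "running") {
    if (steps++ >= maxSteps) throw new Error("step limit exceeded");
    const handler = handlers.get(state.currentNode);
    if (handler === undefined) {
      throw new Error(`no handler for node "${state.currentNode}"`);
    }
    const result = handler(state.data);                  // execute node
    if (result.next !== null && !handlers.has(result.next)) {
      // validate output: the chosen transition must exist
      throw new Error(`unknown transition target "${result.next}"`);
    }
    const status: Status =                               // choose next transition
      result.next === null ? "completed" : result.pause ? "paused" : "running";
    state = {
      currentNode: result.next ?? state.currentNode,
      status,
      data: { ...state.data, ...result.output },
    };
    checkpoint(state);                                   // persist checkpoint
  }
  return state;                                          // paused or completed
}
```

Resuming is then a matter of loading the last checkpoint, setting `status` back to `"running"`, and calling the same function again. The step limit is a simple guard against runaway loops.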
LLM Boundary
The LLM boundary is the deterministic interface where model output enters the runtime.
LLM output → structured event → validation → transition
Not:
LLM output → direct execution authority
The runtime validates: event allowed in current state, payload schema valid, permissions satisfied, approval policy respected. Only then does execution continue.
For the full event-boundary pattern, see 1.1.1 State machines in agentic systems.
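The four checks can be sketched as a single admission function. All names here (`ModelEvent`, `EventRule`, `BoundaryContext`, `admit`) are hypothetical, and `isValidPayload` is a stand-in for a real schema validator such as Zod or JSON Schema. Raw model text only becomes a transition if every check passes.

```typescript
interface ModelEvent { type: string; payload: unknown; }

interface EventRule {
  allowedIn: string[];                             // states where the event is legal
  isValidPayload: (payload: unknown) => boolean;   // stand-in for a schema check
  requiredPermission: string;
  requiresApproval: boolean;
}

interface BoundaryContext {
  currentState: string;
  permissions: Set<string>;
  approvalGranted: boolean;
}

type Decision = { ok: true; event: ModelEvent } | { ok: false; reason: string };

// Turn raw model text into a structured event, or null if it is not one.
function parseModelOutput(raw: string): ModelEvent | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const candidate = parsed as Record<string, unknown>;
      const eventType = candidate["type"];
      if (typeof eventType === "string") {
        return { type: eventType, payload: candidate["payload"] };
      }
    }
  } catch {
    // Not JSON: handled like any other unstructured output.
  }
  return null;
}

function admit(raw: string, rules: Map<string, EventRule>, ctx: BoundaryContext): Decision {
  const event = parseModelOutput(raw);
  if (event === null) return { ok: false, reason: "output is not a structured event" };
  const rule = rules.get(event.type);
  if (rule === undefined) return { ok: false, reason: `unknown event "${event.type}"` };
  if (rule.allowedIn.indexOf(ctx.currentState) === -1) {
    return { ok: false, reason: `event not allowed in state "${ctx.currentState}"` };
  }
  if (!rule.isValidPayload(event.payload)) return { ok: false, reason: "invalid payload" };
  if (!ctx.permissions.has(rule.requiredPermission)) {
    return { ok: false, reason: "permission denied" };
  }
  if (rule.requiresApproval && !ctx.approvalGranted) {
    return { ok: false, reason: "approval required" };
  }
  return { ok: true, event };
}
```

A rejected decision carries a reason that can be traced and, if the design calls for it, fed back to the model for another attempt.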
Tool Execution
Tool execution is isolated from LLM authority. The LLM proposes; the runtime executes.
Tool Registry
The tool registry is the authoritative catalog of executable tools.
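// RefundSchema and runRefund are assumed to be defined elsewhere
// (for example, a Zod schema and an executor function).
const toolRegistry = {
  refund_customer: {
    schema: RefundSchema,        // input schema, checked before execution
    requiresApproval: true,      // runtime pauses for human approval first
    executor: runRefund,         // the only code path that performs the refund
    timeout: 30_000,             // presumably milliseconds
    retryPolicy: 'exponential'   // backoff strategy applied on failure
  }
};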
Per tool entry:
- tool name
- input schema
- permission rules
- approval rules
- idempotency rules
- executor binding
- timeout policy
- retry policy
Only registered tools may execute. Arbitrary tool execution is forbidden.
The full tool authority semantics — model-proposed action validation, allowlist enforcement, idempotency — are covered in 1.1.1 State machines in agentic systems.
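To show how "the LLM proposes; the runtime executes" can look in code, here is a self-contained sketch. It is an assumption-laden illustration: `ToolEntry`, `executeProposal`, and the `RefundInput` fields are hypothetical, a hand-written type guard replaces the schema, and the executor returns a canned result instead of calling a payment system.

```typescript
interface ToolEntry {
  requiresApproval: boolean;
  // Validates the proposed input, then runs the executor; throws on invalid input.
  run: (input: unknown) => unknown;
}

interface RefundInput { orderId: string; amountCents: number; }

// Stand-in for a schema check on the tool input.
function isRefundInput(v: unknown): v is RefundInput {
  if (typeof v !== "object" || v === null) return false;
  const r = v as Record<string, unknown>;
  const orderId = r["orderId"];
  const amount = r["amountCents"];
  return typeof orderId === "string" && typeof amount === "number" && amount > 0;
}

// The allowlist: anything not in this map cannot run.
const registry = new Map<string, ToolEntry>([
  ["refund_customer", {
    requiresApproval: true,
    run: (input) => {
      if (!isRefundInput(input)) throw new Error("invalid input for refund_customer");
      return { orderId: input.orderId, refundedCents: input.amountCents };
    },
  }],
]);

type ToolOutcome =
  | { status: "executed"; result: unknown }
  | { status: "needs_approval" }
  | { status: "rejected"; reason: string };

// The model supplies toolName and input; the runtime decides whether anything runs.
function executeProposal(toolName: string, input: unknown, approved: boolean): ToolOutcome {
  const entry = registry.get(toolName);
  if (entry === undefined) {
    return { status: "rejected", reason: `tool "${toolName}" is not registered` };
  }
  if (entry.requiresApproval && !approved) return { status: "needs_approval" };
  try {
    return { status: "executed", result: entry.run(input) };
  } catch (err) {
    return { status: "rejected", reason: err instanceof Error ? err.message : String(err) };
  }
}
```

Timeouts, retries, and idempotency keys from the registry entry are left out to keep the sketch short; they would wrap the `entry.run` call.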
Human Approval
Approval is a runtime state, not a prompt instruction. The runtime persists the run's state at the approval checkpoint and resumes from that persisted state when the approval event arrives.
ready_to_execute
→ awaiting_manager_approval
→ approved
→ execute
Approval state must survive crashes and restarts. Approver identity and timestamp must be captured in the audit log. Replay must not re-execute side effects on an already-approved branch.
LangGraph and Temporal both expose pause/resume primitives for this pattern.
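The approval requirements can be sketched as a small state holder. `ApprovalGate` and its methods are hypothetical, and the in-memory map stands in for durable storage. The sketch shows three behaviors: duplicate approval events are ignored, the approver and timestamp are written to an audit list, and replaying `execute` does not repeat the side effect.

```typescript
type ApprovalStatus = "awaiting_manager_approval" | "approved" | "rejected" | "executed";

interface ApprovalRun {
  runId: string;
  status: ApprovalStatus;
  approver?: string;
  decidedAt?: string;
}

interface AuditEntry {
  runId: string;
  approver: string;
  decision: "approved" | "rejected";
  at: string;
}

class ApprovalGate {
  private runs = new Map<string, ApprovalRun>();   // stand-in for durable storage
  readonly audit: AuditEntry[] = [];

  private mustGet(runId: string): ApprovalRun {
    const run = this.runs.get(runId);
    if (run === undefined) throw new Error(`unknown run ${runId}`);
    return run;
  }

  requestApproval(runId: string): void {
    this.runs.set(runId, { runId, status: "awaiting_manager_approval" });
  }

  decide(runId: string, approver: string, decision: "approved" | "rejected", at: string): ApprovalRun {
    const run = this.mustGet(runId);
    // A duplicate or late decision event changes nothing.
    if (run.status !== "awaiting_manager_approval") return run;
    const next: ApprovalRun = { ...run, status: decision, approver, decidedAt: at };
    this.runs.set(runId, next);
    this.audit.push({ runId, approver, decision, at });
    return next;
  }

  execute(runId: string, effect: () => void): ApprovalRun {
    const run = this.mustGet(runId);
    if (run.status === "executed") return run;     // replay: side effect already ran
    if (run.status !== "approved") throw new Error(`run ${runId} is not approved`);
    effect();
    const next: ApprovalRun = { ...run, status: "executed" };
    this.runs.set(runId, next);
    return next;
  }
}
```

In this simplified form a crash between `effect()` and the status update could still repeat the effect on recovery, so a production design would pair it with an idempotency key on the side effect itself.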
State Persistence
Persistence happens before unsafe actions, not after.
Per run:
- current node
- prior transitions
- state snapshot
- tool outputs
- approval decisions
- retry counters
- execution metadata
- timestamps
Storage:
- Local: Postgres, Redis, SQLite
- Google Cloud: Firestore, Cloud SQL, BigQuery
- Workflow-native: Temporal, AWS Step Functions
For the broader durability concerns — idempotency, out-of-order events, recovery semantics — see 1.1.1 State machines in agentic systems.
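"Persistence happens before unsafe actions" can be illustrated with a write-ahead intent record. The names `Checkpoint`, `CheckpointStore`, and `performAction` are hypothetical, and the store serializes to an in-memory map where a real runtime would write to one of the backends listed above.

```typescript
interface Checkpoint {
  runId: string;
  currentNode: string;
  pendingAction: string | null;   // action key recorded before the action runs
  completedActions: string[];     // action keys known to have finished
}

class CheckpointStore {
  private rows = new Map<string, string>();   // stand-in for a database table

  save(cp: Checkpoint): void {
    this.rows.set(cp.runId, JSON.stringify(cp));
  }

  load(runId: string): Checkpoint | null {
    const row = this.rows.get(runId);
    return row === undefined ? null : (JSON.parse(row) as Checkpoint);
  }
}

function performAction(
  store: CheckpointStore,
  cp: Checkpoint,
  actionKey: string,
  action: () => void
): Checkpoint {
  // Already recorded as done: never run it again.
  if (cp.completedActions.indexOf(actionKey) !== -1) return cp;

  // 1. Persist the intent BEFORE touching the outside world.
  const intent: Checkpoint = { ...cp, pendingAction: actionKey };
  store.save(intent);

  // 2. Run the unsafe action. If this throws or the process dies,
  //    the stored checkpoint still shows what was in flight.
  action();

  // 3. Persist completion.
  const done: Checkpoint = {
    ...intent,
    pendingAction: null,
    completedActions: [...intent.completedActions, actionKey],
  };
  store.save(done);
  return done;
}
```

After a crash, recovery finds `pendingAction` set and knows exactly which action may or may not have happened. Retrying it is only safe if the action is idempotent on `actionKey`, which is why the key is stored.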
Scheduler
The scheduler determines when and where execution continues. It is infrastructure, not agent logic.
Responsibilities:
- enqueue runs
- distribute work across workers
- delay retries with backoff
- enforce concurrency limits
- prevent duplicate execution (idempotent on run_id)
- resume paused workflows on event arrival
Common implementations: BullMQ, Redis queues, Pub/Sub, SQS, Temporal task queues.
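Three of these responsibilities (backoff, concurrency limits, and deduplication on the run ID) can be sketched without any queue library. `backoffDelayMs` and `RunQueue` are hypothetical names, and the base and cap values are arbitrary defaults; production schedulers such as BullMQ or Temporal task queues provide their own equivalents.

```typescript
// Exponential backoff: base * 2^attempt, capped at capMs. Attempt 0 is the first retry.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 60000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

class RunQueue {
  private readonly maxConcurrent: number;
  private pending: string[] = [];
  private known = new Set<string>();    // run IDs that are queued or active
  private active = new Set<string>();   // run IDs currently being worked on

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  // Returns false when the run is already queued or active (idempotent on run ID).
  enqueue(runId: string): boolean {
    if (this.known.has(runId)) return false;
    this.known.add(runId);
    this.pending.push(runId);
    return true;
  }

  // Hands out the next run, or null if the queue is empty or at the concurrency limit.
  next(): string | null {
    if (this.active.size >= this.maxConcurrent) return null;
    const runId = this.pending.shift();
    if (runId === undefined) return null;
    this.active.add(runId);
    return runId;
  }

  // Marks a run as finished so its slot frees up and it may be enqueued again later.
  complete(runId: string): void {
    if (this.active.delete(runId)) this.known.delete(runId);
  }
}
```

Releasing the run ID on completion lets a paused run be enqueued again when its resume event arrives, while still blocking double delivery during a single execution.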
Multi-Tenant Isolation
Multi-tenant isolation prevents one tenant's workflows, memory, tools, and model context from affecting another tenant. This is mandatory for enterprise SaaS, not optional.
Required boundaries:
- state isolation
- vector store isolation
- tool access isolation
- secret isolation
- trace isolation
- model context isolation
Quotas applied per tenant: tokens, tool calls, runs, cost.
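Per-tenant quotas can be sketched as a check-then-consume counter keyed by tenant. `TenantQuotas` and the meter names are hypothetical, cost is tracked in integer cents to avoid floating-point drift, and the in-memory maps stand in for a shared store with atomic updates.

```typescript
type Meter = "tokens" | "toolCalls" | "runs" | "costCents";
type Usage = Record<Meter, number>;

const METERS: Meter[] = ["tokens", "toolCalls", "runs", "costCents"];

class TenantQuotas {
  private limits = new Map<string, Usage>();
  private usage = new Map<string, Usage>();

  setLimits(tenantId: string, limits: Usage): void {
    this.limits.set(tenantId, limits);
    this.usage.set(tenantId, { tokens: 0, toolCalls: 0, runs: 0, costCents: 0 });
  }

  // All-or-nothing: either every meter stays within its limit and usage is
  // recorded, or nothing is recorded and the caller must refuse the work.
  tryConsume(tenantId: string, delta: Partial<Usage>): boolean {
    const limits = this.limits.get(tenantId);
    const used = this.usage.get(tenantId);
    if (limits === undefined || used === undefined) return false;   // unknown tenant: deny
    for (const m of METERS) {
      if (used[m] + (delta[m] ?? 0) > limits[m]) return false;
    }
    for (const m of METERS) {
      used[m] += delta[m] ?? 0;
    }
    return true;
  }

  usageFor(tenantId: string): Usage | null {
    const used = this.usage.get(tenantId);
    return used === undefined ? null : { ...used };
  }
}
```

Because every counter is looked up by tenant ID, one tenant exhausting its budget has no effect on another, and an unknown tenant is denied by default.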
Observability
The observability layer records execution history for debugging, replay, evaluation, and audit.
Required trace events:
- node started
- node completed
- edge selected
- tool executed
- tool failed
- approval requested
- approval granted
- LLM output received
- retry triggered
- run terminated
Exporters: OpenTelemetry, LangSmith, Helicone, Braintrust, Cloud Logging, Grafana, Prometheus.
The trace stream is also the input to visual replay on the canvas — see Visual agent system designers architecture.
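The required trace events map naturally onto a discriminated union. The sketch below is an assumption about shape, not a defined wire format: the `kind` strings, the per-event fields, and the `TraceLog` class are hypothetical, and an exporter such as OpenTelemetry would replace the in-memory array.

```typescript
type TraceEvent =
  | { kind: "node_started"; node: string }
  | { kind: "node_completed"; node: string }
  | { kind: "edge_selected"; from: string; to: string }
  | { kind: "tool_executed"; tool: string }
  | { kind: "tool_failed"; tool: string; error: string }
  | { kind: "approval_requested"; approvalId: string }
  | { kind: "approval_granted"; approvalId: string; approver: string }
  | { kind: "llm_output_received"; node: string }
  | { kind: "retry_triggered"; node: string; attempt: number }
  | { kind: "run_terminated"; reason: string };

interface TraceRecord {
  runId: string;
  tenantId: string;
  seq: number;        // per-run sequence number, so replay order is unambiguous
  event: TraceEvent;
}

class TraceLog {
  private records: TraceRecord[] = [];
  private nextSeq = new Map<string, number>();

  record(runId: string, tenantId: string, event: TraceEvent): TraceRecord {
    const seq = this.nextSeq.get(runId) ?? 0;
    this.nextSeq.set(runId, seq + 1);
    const rec: TraceRecord = { runId, tenantId, seq, event };
    this.records.push(rec);
    return rec;
  }

  // Reads are always scoped by tenant as well as run, to keep traces isolated.
  forRun(tenantId: string, runId: string): TraceRecord[] {
    return this.records.filter((r) => r.tenantId === tenantId && r.runId === runId);
  }
}
```

A per-run sequence number gives visual replay a total order even when timestamps collide, and scoping reads by tenant keeps the trace store in line with the isolation boundaries described earlier.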
Runtime API
The runtime API exposes execution control to external systems. No endpoint may bypass policy enforcement.
POST /definitions/publish
POST /runs/start
POST /runs/:id/pause
POST /runs/:id/resume
POST /runs/:id/cancel
GET /runs/:id/state
GET /runs/:id/traces
POST /approvals/:id/approve
POST /approvals/:id/reject
Streaming surfaces (Server-Sent Events, WebSocket) typically expose live trace events for the canvas. Webhook callbacks deliver completion and approval-requested events to external systems.
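One way to guarantee that no endpoint bypasses policy enforcement is to route every request through a single dispatcher that checks permissions before any handler runs. The following framework-free sketch covers three of the endpoints above; `ApiRequest`, `Route`, `dispatch`, the permission strings, and the canned response bodies are all hypothetical.

```typescript
interface Caller { tenantId: string; permissions: string[]; }
interface ApiRequest { method: "GET" | "POST"; path: string; caller: Caller; }
interface ApiResponse { code: number; body: unknown; }

interface Route {
  method: "GET" | "POST";
  pattern: RegExp;
  permission: string;   // every route must declare one; there is no "public" option
  handle: (id: string | undefined, req: ApiRequest) => ApiResponse;
}

// Handlers return canned data here; a real runtime would call the engine.
const routes: Route[] = [
  {
    method: "POST",
    pattern: /^\/runs\/start$/,
    permission: "runs:start",
    handle: () => ({ code: 202, body: { runId: "run_9241" } }),
  },
  {
    method: "GET",
    pattern: /^\/runs\/([^/]+)\/state$/,
    permission: "runs:read",
    handle: (id) => ({ code: 200, body: { runId: id, status: "paused" } }),
  },
  {
    method: "POST",
    pattern: /^\/approvals\/([^/]+)\/approve$/,
    permission: "approvals:decide",
    handle: (id) => ({ code: 200, body: { approvalId: id, decision: "approved" } }),
  },
];

// Single entry point: the permission check happens before any handler is invoked.
function dispatch(req: ApiRequest): ApiResponse {
  for (const route of routes) {
    if (route.method !== req.method) continue;
    const match = route.pattern.exec(req.path);
    if (match === null) continue;
    if (req.caller.permissions.indexOf(route.permission) === -1) {
      return { code: 403, body: { error: "forbidden" } };
    }
    return route.handle(match[1], req);
  }
  return { code: 404, body: { error: "not found" } };
}
```

Making `permission` a required field of `Route` means a new endpoint cannot be added without stating its policy, which is the property the rule above asks for.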
Stacks
Node / TypeScript
- Fastify or NestJS
- LangGraph.js or XState
- Postgres
- Redis
- BullMQ
- OpenTelemetry
- Zod
- Docker
Google Cloud
- Cloud Run
- Pub/Sub
- Firestore
- BigQuery
- Vertex AI
- Cloud Logging
- IAM
- Secret Manager
On-Prem
- Kubernetes
- Postgres
- Redis
- Ollama
- vLLM
- Prometheus
- Grafana
- Vault
The runtime architecture remains the same. Only infrastructure changes.
Production Rule
Good:
editor → canonical definition → validation → compiler → runtime
Bad:
editor → arbitrary generated code → eval
The runtime executes bounded declarative definitions. It does not execute user-generated code.
Final Architecture
published definition
→ validation
→ compiler
→ execution engine
→ scheduler
→ tool execution
→ persistence
→ approval checkpoints
→ observability
→ runtime API
The visual editor is optional. The headless runtime is the system.
Cross-references
- Visual agent system designers architecture (this submodule) — produces the canonical definitions consumed here.
- 1.2.10 LangGraph — primary compiler target for agent loops and HIL workflows.
- 1.2.9 XState — primary compiler target for deterministic transition control.
- 1.1.1 State machines in agentic systems — runtime concerns: durability, idempotency, tool authority, event boundary, approval gates.
References
- LangGraph overview (JS): docs.langchain.com/oss/javascript/langgraph/overview
- XState v5: stately.ai/docs/xstate
- Temporal: temporal.io
- Open Policy Agent: openpolicyagent.org
- OpenTelemetry: opentelemetry.io