Headless agent runtime – Miguel Armengol

Definition

A headless agent runtime is a backend execution system that consumes a declarative agent definition, validates it, compiles it into an executable workflow, executes it without UI dependency, persists execution state, enforces permissions, and exposes runtime state through APIs, events, and traces.

It is the execution authority of the agentic system.

It is independent from the visual editor (see Visual agent system designers architecture). The editor produces definitions; the runtime executes them.

agent definition (JSON)
→ validation
→ compilation
→ runtime execution
→ persistence
→ observability
→ external interfaces

The runtime must be deterministic at the orchestration layer and bounded at the LLM decision layer: the model may only choose among actions the runtime already permits.

Core Responsibilities

The runtime owns:

definition loading
schema validation
compilation
execution scheduling
state persistence
retries
human approval checkpoints
permission enforcement
tool execution
observability
audit logging
termination

The runtime does not own:

canvas rendering
node dragging
visual graph editing
design-time UX

Canonical Input

The canonical input is the stable declarative specification of the agent system. It is not framework-specific code; it is a portable execution contract.

{
  "graphId":      "refund-agent",
  "version":      4,
  "stateSchema":  {},
  "nodes":        [],
  "edges":        [],
  "permissions":  {},
  "runtimePolicy": {}
}

Required properties:

versioned
immutable after publish
schema-validated
environment-independent
compiler-target neutral

The runtime never executes raw frontend objects.
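A minimal sketch of how the canonical input could be typed and frozen at publish time. The top-level field names follow the JSON above; the node and edge fields (id, type, from, to) and the helper names deepFreeze and publishDefinition are assumptions for illustration, and schema validation is left out here.

```typescript
// Hypothetical shapes: only the top-level keys come from the definition above.
interface AgentNode { id: string; type: string; config?: Record<string, unknown>; }
interface AgentEdge { from: string; to: string; }

interface AgentDefinition {
  graphId: string;
  version: number;
  stateSchema: Record<string, unknown>;
  nodes: AgentNode[];
  edges: AgentEdge[];
  permissions: Record<string, unknown>;
  runtimePolicy: Record<string, unknown>;
}

// Recursively freeze a value so a published definition cannot be mutated in place.
function deepFreeze(value: unknown): void {
  if (value !== null && typeof value === "object" && !Object.isFrozen(value)) {
    Object.freeze(value);
    for (const child of Object.values(value)) {
      deepFreeze(child);
    }
  }
}

// Round-trip through JSON so only plain data is kept, never live frontend objects.
function publishDefinition(raw: unknown): Readonly<AgentDefinition> {
  const plain = JSON.parse(JSON.stringify(raw)) as AgentDefinition;
  deepFreeze(plain);
  return plain;
}
```

The JSON round-trip drops functions and class instances, which is one simple way to keep editor-side objects out of the runtime.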

Validation Layer

Validation runs before compilation and before publish. A validation failure must block publication; it must not be downgraded to a warning.

Structural

exactly one start node
terminal path exists
no orphan nodes
no illegal cycles
node and edge references resolve

Semantic

node configs valid against per-type schemas
output schemas resolvable
prompt contracts valid
tool definitions complete
state references resolve

Security

forbidden tools blocked
approval required for sensitive tools
tenant boundaries enforced
model usage allowed
secret access rules valid

Implementations: Zod, JSON Schema, Open Policy Agent for policy checks.
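As an illustration of the structural checks, the sketch below validates a graph with plain TypeScript instead of a schema library. It assumes nodes carry a type field with "start" and "end" values, which the source does not specify, and it leaves out cycle detection. An empty result means the definition may proceed; any entry should block publication.

```typescript
// Hypothetical node and edge shapes; the "start" and "end" types are assumptions.
interface VNode { id: string; type: string; }
interface VEdge { from: string; to: string; }
interface Graph { nodes: VNode[]; edges: VEdge[]; }

function validateStructure(graph: Graph): string[] {
  const errors: string[] = [];
  const ids = new Set(graph.nodes.map((n) => n.id));

  // Exactly one start node.
  const starts = graph.nodes.filter((n) => n.type === "start");
  if (starts.length !== 1) {
    errors.push(`expected exactly one start node, found ${starts.length}`);
  }

  // Every edge endpoint must reference a declared node.
  for (const e of graph.edges) {
    if (!ids.has(e.from)) errors.push(`edge source "${e.from}" does not resolve`);
    if (!ids.has(e.to)) errors.push(`edge target "${e.to}" does not resolve`);
  }

  // Breadth-first walk from the start node to collect reachable nodes.
  const reachable = new Set<string>();
  const queue: string[] = starts.length === 1 ? starts.map((n) => n.id) : [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    if (reachable.has(current)) continue;
    reachable.add(current);
    for (const e of graph.edges) {
      if (e.from === current && ids.has(e.to)) queue.push(e.to);
    }
  }

  if (starts.length === 1) {
    // Orphans: declared nodes that the start node can never reach.
    for (const n of graph.nodes) {
      if (!reachable.has(n.id)) errors.push(`orphan node "${n.id}"`);
    }
    // A terminal path exists if at least one end node is reachable.
    if (!graph.nodes.some((n) => n.type === "end" && reachable.has(n.id))) {
      errors.push("no terminal path from start");
    }
  }
  return errors;
}
```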

Compiler Layer

The compiler transforms the canonical definition into an executable runtime graph for a specific target engine.

The compiler must be deterministic: same definition → same compiled output.
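One way to check this property in tests is to compare canonical serializations of the compiled output. The sketch below is an assumption about how that could be done, not part of any target engine: canonicalize is a hypothetical helper that sorts object keys and expects JSON-compatible input.

```typescript
// Serialize a JSON-compatible value with object keys in sorted order, so that
// two values with the same content always produce the same string.
// Array order is preserved because it is usually significant (for example, edges).
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const keys = Object.keys(record).sort();
    return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(record[k])).join(",") + "}";
  }
  return JSON.stringify(value);
}
```

Compiling the same definition twice and comparing the two canonical strings gives a cheap regression test for accidental nondeterminism such as key-order or timestamp leakage.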

Targets

Target	Best for	See also
LangGraph	Long-running agents, tool loops, retrieval, HIL	1.2.10
XState	Deterministic approval flows, explicit transitions, UI sync	1.2.9
Temporal	Enterprise durability, distributed retries, infra reliability
AWS Step Functions	Cross-service AWS-native orchestration
Custom	Domain-specific orchestrators

The compiler-target mapping rules for LangGraph and XState are described in the Visual agent system designers architecture note; the runtime depends on those mappings being deterministic.

Definition vs Runtime Instance

Two distinct persistence concerns: the definition record (first below) and the runtime instance record (second below).

{ "graphId": "refund-agent", "version": 4 }

{
  "runId":       "run_9241",
  "graphId":     "refund-agent",
  "version":     4,
  "currentNode": "awaitingApproval",
  "status":      "paused"
}

A definition is a system contract. A runtime instance is an operational record. One definition can back many instances, potentially millions.

Every instance pins the graph version at start time. A redesign mid-flight does not corrupt in-flight executions.
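A minimal sketch of version pinning, using in-memory maps as a stand-in for the definition store. The helper names (publish, startRun, definitionFor) and the initial node name "start" are hypothetical.

```typescript
interface PublishedDefinition { graphId: string; version: number; }

interface RunInstance {
  runId: string;
  graphId: string;
  version: number; // pinned when the run starts
  currentNode: string;
  status: "running" | "paused" | "completed";
}

// In-memory stand-ins for a definition store; keys are "graphId@version".
const definitionStore = new Map<string, PublishedDefinition>();
const latestVersion = new Map<string, number>();

function publish(def: PublishedDefinition): void {
  const key = `${def.graphId}@${def.version}`;
  if (definitionStore.has(key)) {
    throw new Error(`${key} is already published and immutable`);
  }
  definitionStore.set(key, def);
  const previous = latestVersion.get(def.graphId) ?? 0;
  latestVersion.set(def.graphId, Math.max(previous, def.version));
}

// New runs pick up the latest published version and record it on the instance.
function startRun(runId: string, graphId: string): RunInstance {
  const version = latestVersion.get(graphId);
  if (version === undefined) throw new Error(`no published definition for ${graphId}`);
  return { runId, graphId, version, currentNode: "start", status: "running" };
}

// Resolve the definition from the pinned version, never from "latest".
function definitionFor(run: RunInstance): PublishedDefinition {
  const def = definitionStore.get(`${run.graphId}@${run.version}`);
  if (!def) throw new Error(`missing definition ${run.graphId}@${run.version}`);
  return def;
}
```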

Execution Engine

The execution engine processes runtime instances through valid execution states.

load state
→ execute node
→ validate output
→ choose next transition
→ persist checkpoint
→ continue or pause

Required capabilities:

synchronous execution
asynchronous execution
pause / resume
retries
timeouts
cancellations
rollback paths

The engine surface for these concerns is detailed in 1.1.1 State machines in agentic systems.
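The loop above can be sketched as a small synchronous stepper. This is an illustrative simplification under stated assumptions: the NodeSpec shape, the pause flag, and the maxSteps guard are hypothetical, and a real engine would also handle asynchronous nodes, timeouts, and retries.

```typescript
type RunStatus = "running" | "paused" | "completed" | "failed";

interface RunState {
  runId: string;
  currentNode: string;
  status: RunStatus;
  data: Record<string, unknown>;
}

interface NodeResult { output: Record<string, unknown>; pause?: boolean; }

interface NodeSpec {
  execute: (data: Record<string, unknown>) => NodeResult;
  validate: (output: Record<string, unknown>) => boolean;
  next: (output: Record<string, unknown>) => string | null; // null means terminal
}

function runUntilBlocked(
  state: RunState,
  nodes: Record<string, NodeSpec>,
  persist: (s: RunState) => void,
  maxSteps = 100,
): RunState {
  let current = state; // load state
  for (let step = 0; step < maxSteps && current.status === "running"; step++) {
    const node = nodes[current.currentNode];
    if (!node) {
      current = { ...current, status: "failed" };
      persist(current);
      break;
    }
    const result = node.execute(current.data); // execute node
    if (!node.validate(result.output)) { // validate output
      current = { ...current, status: "failed" };
      persist(current);
      break;
    }
    const target = node.next(result.output); // choose next transition
    current = {
      ...current,
      data: { ...current.data, ...result.output },
      currentNode: target ?? current.currentNode,
      status: target === null ? "completed" : result.pause ? "paused" : "running",
    };
    persist(current); // persist checkpoint, then continue or pause
  }
  return current;
}
```

Resuming is the same call with the persisted state and status set back to "running", which keeps pause and resume on a single code path.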

LLM Boundary

The LLM boundary is the deterministic interface where model output enters the runtime.

LLM output → structured event → validation → transition

Not:

LLM output → direct execution authority

The runtime validates: event allowed in current state, payload schema valid, permissions satisfied, approval policy respected. Only then does execution continue.
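A minimal sketch of these four checks, applied in order to a raw model output string. The rule and context shapes, the function name admitModelOutput, and the permission string format are assumptions for illustration.

```typescript
interface EventRule {
  allowedIn: string[]; // states in which the event is legal
  isValidPayload: (payload: unknown) => boolean;
  requiredPermission: string;
  requiresApproval: boolean;
}

interface BoundaryContext {
  currentState: string;
  permissions: Set<string>;
  approvalGranted: boolean;
}

type Admission =
  | { accepted: true; type: string; payload: unknown }
  | { accepted: false; reason: string };

function admitModelOutput(raw: string, rules: Map<string, EventRule>, ctx: BoundaryContext): Admission {
  // LLM output -> structured event
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { accepted: false, reason: "output is not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { accepted: false, reason: "output is not a structured event" };
  }
  const { type, payload } = parsed as { type?: unknown; payload?: unknown };
  if (typeof type !== "string") return { accepted: false, reason: "event has no type" };

  // structured event -> validation
  const rule = rules.get(type);
  if (!rule) return { accepted: false, reason: `unknown event "${type}"` };
  if (!rule.allowedIn.includes(ctx.currentState)) {
    return { accepted: false, reason: `event not allowed in state "${ctx.currentState}"` };
  }
  if (!rule.isValidPayload(payload)) return { accepted: false, reason: "payload schema invalid" };
  if (!ctx.permissions.has(rule.requiredPermission)) {
    return { accepted: false, reason: "permission not satisfied" };
  }
  if (rule.requiresApproval && !ctx.approvalGranted) {
    return { accepted: false, reason: "approval policy not satisfied" };
  }

  // validation -> transition (the caller performs the transition)
  return { accepted: true, type, payload };
}
```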

For the full event-boundary pattern, see 1.1.1 State machines in agentic systems.

Tool Execution

Tool execution is isolated from LLM authority. The LLM proposes; the runtime executes.

Tool Registry

The tool registry is the authoritative catalog of executable tools.

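// RefundSchema and runRefund are assumed to be defined elsewhere.
const toolRegistry = {
  refund_customer: {
    schema:           RefundSchema,   // input schema checked before execution
    requiresApproval: true,           // gated behind a human approval checkpoint
    executor:         runRefund,      // function that performs the side effect
    timeout:          30_000,         // presumably milliseconds
    retryPolicy:      'exponential'   // backoff strategy for retries
  }
};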

Per tool entry:

tool name
input schema
permission rules
approval rules
idempotency rules
executor binding
timeout policy
retry policy

Only registered tools may execute. Arbitrary tool execution is forbidden.
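The allowlist rule can be sketched as a single gate that every model-proposed call passes through. This is a simplified illustration: the ToolEntry shape uses a plain validator function instead of a schema object, the executor is synchronous for brevity, and the names executeProposedTool and demoRegistry are hypothetical.

```typescript
interface ToolEntry {
  isValidInput: (input: unknown) => boolean;
  requiresApproval: boolean;
  executor: (input: unknown) => unknown;
}

type ToolResult =
  | { status: "executed"; output: unknown }
  | { status: "needs_approval" }
  | { status: "rejected"; reason: string };

function executeProposedTool(
  registry: Map<string, ToolEntry>,
  name: string,
  input: unknown,
  approved: boolean,
): ToolResult {
  const entry = registry.get(name);
  // Only registered tools may execute.
  if (!entry) return { status: "rejected", reason: `tool "${name}" is not registered` };
  if (!entry.isValidInput(input)) return { status: "rejected", reason: "input failed schema validation" };
  // The runtime, not the model, decides whether approval has been satisfied.
  if (entry.requiresApproval && !approved) return { status: "needs_approval" };
  return { status: "executed", output: entry.executor(input) };
}

// Example registry with one tool; the refund logic is a stub.
const demoRegistry = new Map<string, ToolEntry>([
  ["refund_customer", {
    isValidInput: (input) =>
      typeof input === "object" && input !== null &&
      typeof (input as { amount?: unknown }).amount === "number",
    requiresApproval: true,
    executor: (input) => ({ refunded: (input as { amount: number }).amount }),
  }],
]);
```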

The full tool authority semantics — model-proposed action validation, allowlist enforcement, idempotency — are covered in 1.1.1 State machines in agentic systems.

Human Approval

Approval is a runtime state, not a prompt instruction. The runtime persists its state at the approval checkpoint and resumes from that persisted state when the approval event arrives.

ready_to_execute
→ awaiting_manager_approval
→ approved
→ execute

Approval state must survive crashes and restarts. Approver identity and timestamp must be captured in the audit log. Replay must not re-execute side effects on an already-approved branch.

LangGraph and Temporal both expose pause/resume primitives for this pattern.
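Independently of any engine, the approval gate can be sketched as a persisted record with guarded transitions. The map below stands in for a durable table, the "executed" and "rejected" statuses are assumptions added for illustration, and marking the record before the side effect gives at-most-once behavior on replay.

```typescript
type ApprovalStatus = "awaiting_manager_approval" | "approved" | "rejected" | "executed";

interface ApprovalRecord {
  runId: string;
  status: ApprovalStatus;
  approver?: string;  // identity captured for the audit log
  decidedAt?: string; // timestamp captured for the audit log
}

// Stand-in for a durable table; a real runtime would use a database so the
// record survives crashes and restarts.
const approvalStore = new Map<string, ApprovalRecord>();

function requestApproval(runId: string): void {
  if (!approvalStore.has(runId)) {
    approvalStore.set(runId, { runId, status: "awaiting_manager_approval" });
  }
}

function recordApproval(runId: string, approver: string, decidedAt: string): boolean {
  const record = approvalStore.get(runId);
  if (!record || record.status !== "awaiting_manager_approval") return false;
  approvalStore.set(runId, { runId, status: "approved", approver, decidedAt });
  return true;
}

function executeIfApproved(runId: string, sideEffect: () => void): boolean {
  const record = approvalStore.get(runId);
  if (!record || record.status !== "approved") return false;
  // Persist the transition before the side effect, so a replay sees "executed".
  approvalStore.set(runId, { ...record, status: "executed" });
  sideEffect();
  return true;
}
```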

State Persistence

State is persisted before an unsafe (side-effecting) action runs, not after, so a crash during the action leaves a checkpoint to recover from.

Per run:

current node
prior transitions
state snapshot
tool outputs
approval decisions
retry counters
execution metadata
timestamps

Storage:

Local: Postgres, Redis, SQLite
Google Cloud: Firestore, Cloud SQL, BigQuery
Workflow-native: Temporal, AWS Step Functions

For the broader durability concerns — idempotency, out-of-order events, recovery semantics — see 1.1.1 State machines in agentic systems.
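The ordering rule can be sketched as "append the checkpoint, then act". The Checkpoint fields mirror part of the per-run list above; the seq counter, the in-memory CheckpointLog class, and the helper checkpointThenAct are assumptions for illustration rather than a prescribed storage design.

```typescript
interface Checkpoint {
  runId: string;
  seq: number; // monotonically increasing per run
  currentNode: string;
  state: Record<string, unknown>;
  retryCount: number;
  timestamp: string;
}

class CheckpointLog {
  private readonly entries = new Map<string, Checkpoint[]>();

  // Rejects stale or duplicate sequence numbers instead of overwriting history.
  append(cp: Checkpoint): void {
    const history = this.entries.get(cp.runId) ?? [];
    const last = history[history.length - 1];
    if (last !== undefined && cp.seq <= last.seq) {
      throw new Error(`stale checkpoint ${cp.seq} for ${cp.runId}`);
    }
    this.entries.set(cp.runId, [...history, cp]);
  }

  // The most recent checkpoint is the recovery point after a restart.
  latest(runId: string): Checkpoint | undefined {
    const history = this.entries.get(runId) ?? [];
    return history[history.length - 1];
  }
}

// Persist first; if the append throws, the unsafe action never runs.
function checkpointThenAct<T>(log: CheckpointLog, cp: Checkpoint, action: () => T): T {
  log.append(cp);
  return action();
}
```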

Scheduler

The scheduler determines when and where execution continues. It is infrastructure, not agent logic.

Responsibilities:

enqueue runs
distribute work across workers
delay retries with backoff
enforce concurrency limits
prevent duplicate execution (idempotent on runId)
resume paused workflows on event arrival
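A minimal in-memory sketch of three of these responsibilities: backoff delays, concurrency limits, and deduplication on runId. The class and function names are hypothetical, the base and cap values are illustrative, and a production scheduler would sit on a durable queue such as the ones listed in this section.

```typescript
// Exponential backoff with a cap; the base and cap values are illustrative.
function backoffDelayMs(attempt: number, baseMs = 1_000, capMs = 60_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

class RunQueue {
  private readonly maxConcurrent: number;
  private readonly pending = new Map<string, number>(); // runId -> earliest start (ms)
  private readonly active = new Set<string>();

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  // Returns false when the run is already queued or executing (deduplication on runId).
  enqueue(runId: string, notBeforeMs: number): boolean {
    if (this.pending.has(runId) || this.active.has(runId)) return false;
    this.pending.set(runId, notBeforeMs);
    return true;
  }

  // Hands out the next ready run, or undefined if none is ready or the limit is reached.
  claim(nowMs: number): string | undefined {
    if (this.active.size >= this.maxConcurrent) return undefined;
    for (const [runId, notBeforeMs] of Array.from(this.pending.entries())) {
      if (notBeforeMs <= nowMs) {
        this.pending.delete(runId);
        this.active.add(runId);
        return runId;
      }
    }
    return undefined;
  }

  // Releases the concurrency slot; a failed run can then be re-enqueued with a delay.
  release(runId: string): void {
    this.active.delete(runId);
  }
}
```

A retry is then a release followed by an enqueue at the current time plus backoffDelayMs(attempt).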

Common implementations: BullMQ, Redis queues, Pub/Sub, SQS, Temporal task queues.

Multi-Tenant Isolation

Multi-tenant isolation prevents one tenant's workflows, memory, tools, and model context from affecting another tenant. It is mandatory for enterprise SaaS.

Required boundaries:

state isolation
vector store isolation
tool access isolation
secret isolation
trace isolation
model context isolation

Quotas applied per tenant: tokens, tool calls, runs, cost.
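Per-tenant quotas can be sketched as a check-and-record step keyed by tenant. The Quota field names follow the four quota kinds above; the class name, the deny-by-default rule for unknown tenants, and the absence of reset windows are assumptions for illustration.

```typescript
interface Quota { tokens: number; toolCalls: number; runs: number; cost: number; }

const QUOTA_KEYS: (keyof Quota)[] = ["tokens", "toolCalls", "runs", "cost"];

class TenantQuotas {
  private readonly limits = new Map<string, Quota>();
  private readonly usage = new Map<string, Quota>();

  setLimit(tenantId: string, limit: Quota): void {
    this.limits.set(tenantId, limit);
  }

  // Checks and records usage in one step; nothing is recorded if any limit would be exceeded.
  tryConsume(tenantId: string, delta: Partial<Quota>): boolean {
    const limit = this.limits.get(tenantId);
    if (!limit) return false; // unknown tenants are denied by default
    const used = this.usage.get(tenantId) ?? { tokens: 0, toolCalls: 0, runs: 0, cost: 0 };
    const next: Quota = { ...used };
    for (const key of QUOTA_KEYS) {
      next[key] = used[key] + (delta[key] ?? 0);
      if (next[key] > limit[key]) return false;
    }
    this.usage.set(tenantId, next);
    return true;
  }
}
```

Because usage is stored per tenant id, one tenant exhausting its budget has no effect on another tenant's counters.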

Observability

The observability layer records execution history for debugging, replay, evaluation, and audit.

Required trace events:

node started
node completed
edge selected
tool executed
tool failed
approval requested
approval granted
LLM output received
retry triggered
run terminated

Exporters: OpenTelemetry, LangSmith, Helicone, Braintrust, Cloud Logging, Grafana, Prometheus.

The trace stream is also the input to visual replay on the canvas — see Visual agent system designers architecture.
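The required trace events can be modeled as a closed union so that emitters cannot invent event names. The snake_case identifiers, the per-run seq counter, the tenantId field, and the in-memory TraceRecorder are assumptions for illustration; a real deployment would forward these events to one of the exporters above.

```typescript
type TraceEventType =
  | "node_started" | "node_completed" | "edge_selected"
  | "tool_executed" | "tool_failed"
  | "approval_requested" | "approval_granted"
  | "llm_output_received" | "retry_triggered" | "run_terminated";

interface TraceEvent {
  tenantId: string;
  runId: string;
  seq: number; // position within the run, used for ordered replay
  type: TraceEventType;
  at: string;
  detail: Record<string, unknown>;
}

class TraceRecorder {
  private readonly events: TraceEvent[] = [];
  private readonly nextSeq = new Map<string, number>();

  record(
    tenantId: string,
    runId: string,
    type: TraceEventType,
    at: string,
    detail: Record<string, unknown> = {},
  ): TraceEvent {
    const seq = this.nextSeq.get(runId) ?? 0;
    this.nextSeq.set(runId, seq + 1);
    const event: TraceEvent = { tenantId, runId, seq, type, at, detail };
    this.events.push(event);
    return event;
  }

  // Returns a run's events in recorded order, scoped to one tenant (trace isolation).
  forRun(tenantId: string, runId: string): TraceEvent[] {
    return this.events.filter((e) => e.tenantId === tenantId && e.runId === runId);
  }
}
```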

Runtime API

The runtime API exposes execution control to external systems. No endpoint may bypass policy enforcement.

POST /definitions/publish
POST /runs/start
POST /runs/:id/pause
POST /runs/:id/resume
POST /runs/:id/cancel
GET  /runs/:id/state
GET  /runs/:id/traces
POST /approvals/:id/approve
POST /approvals/:id/reject

Streaming surfaces (Server-Sent Events, WebSocket) typically expose live trace events for the canvas. Webhook callbacks deliver completion and approval-requested events to external systems.

Stacks

Node / TypeScript

Fastify or NestJS
LangGraph.js or XState
Postgres
Redis
BullMQ
OpenTelemetry
Zod
Docker

Google Cloud

Cloud Run
Pub/Sub
Firestore
BigQuery
Vertex AI
Cloud Logging
IAM
Secret Manager

On-Prem

Kubernetes
Postgres
Redis
Ollama
vLLM
Prometheus
Grafana
Vault

The runtime architecture remains the same across these stacks; only the infrastructure changes.

Production Rule

Good:

editor → canonical definition → validation → compiler → runtime

Bad:

editor → arbitrary generated code → eval

The runtime executes bounded declarative definitions. It does not execute user-generated code.

Final Architecture

published definition
→ validation
→ compiler
→ execution engine
→ scheduler
→ tool execution
→ persistence
→ approval checkpoints
→ observability
→ runtime API

The visual editor is optional. The headless runtime is the system.

Cross-references

Visual agent system designers architecture (this submodule) — produces the canonical definitions consumed here.
1.2.10 LangGraph — primary compiler target for agent loops and HIL workflows.
1.2.9 XState — primary compiler target for deterministic transition control.
1.1.1 State machines in agentic systems — runtime concerns: durability, idempotency, tool authority, event boundary, approval gates.

References

LangGraph overview (JS) — docs.langchain.com/oss/javascript/langgraph/overview
XState v5 — stately.ai/docs/xstate
Temporal — temporal.io
Open Policy Agent — openpolicyagent.org
OpenTelemetry — opentelemetry.io