Carrier Structured Agents

AI that compiles with your system.

Declare an agent in .carrier: its input type, structured output, the LLM client it talks to, the bounded tool surface, the guardrails, the resilience budget, and the telemetry. The compiler turns that into a durable runtime — Rust, Java, or Node — with audit trails, workflow state, and OpenAPI / MCP metadata for free.


What is a Carrier Structured Agent

A CSA is a typed, governed, observable AI agent declared in .carrier source. The compiler enforces the input/output contract, narrows the tool surface, validates the structured output, and records every run as durable workflow state.

Typed end-to-end

Carrier types both the input and the structured output. The runtime asks the LLM for structured JSON and validates it against the declared output type before completing the run.

Governed by policy

require_auth, tenant_scope, max_tool_calls, and deny_tools_after_output live in the agent’s guards block in source, not buried in a framework config.

Observable by default

Every run captures tokens, tool calls, guard failures, and outcomes into Carrier’s OTLP spans + structured JSON logs. Durable state lives in carrier_workflow_state.

Compiled for production

Same source compiles to a native Rust binary (Tokio + Axum), a Spring Boot service, or a Fastify dev server. No Python glue, no LangChain runtime in the prod path.

Bounded tool surface

A CSA can only call the fn or action declarations listed in its tools block — and the compiler verifies they exist on the underlying LLM client.

Workflow-grade resilience

Per-agent timeouts, retry attempts, exponential backoff, and a typed fallback_output when the model is unavailable. The agent never half-runs.

Anatomy of an agent

One declaration. Carrier validates the LLM client reference, the tool surface, and the input/output schema at compile time.

src/agents/support_triage.carrier

agent · input · output · tools · guards

type SupportMessage {
  user_prompt: String
  conversation_id: String?
}
 
type TicketDecision {
  action:           String
  summary:          String
  confidence:       Float
  suggested_reply:  String?
}
 
agent SupportTriage {
  input:  SupportMessage
  output: TicketDecision
  llm:    RoutedSupportAgent
  prompt: input.user_prompt
 
  tools {
    action create_ticket
    fn     search_help_docs
  }
 
  guards {
    require_auth:           true
    tenant_scope:           current_user.tenant_id
    max_tool_calls:         6
    deny_tools_after_output: true
    output_must_match:      TicketDecision
  }
 
  resilience {
    timeout_ms: 12_000
    retry attempts: 2 backoff_ms: 250
    fallback_output: {
      action:          "escalate"
      summary:         "Agent unavailable"
      confidence:      0.0
      suggested_reply: null
    }
  }
 
  telemetry {
    emit_tokens:         true
    emit_tool_calls:     true
    emit_guard_failures: true
  }
}

input / output

input binds a Carrier type or model that has a user_prompt: String field. output is the Carrier type against which Carrier validates the model’s structured output.
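To make the output contract concrete, here is a minimal, language-neutral sketch in Python of what validating a model response against TicketDecision could look like. It is not Carrier's runtime code: parse_ticket_decision is a hypothetical helper, and rejecting unknown fields is an assumption rather than documented behavior.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class TicketDecision:
    # Mirrors the TicketDecision type declared in the agent source.
    action: str
    summary: str
    confidence: float
    suggested_reply: Optional[str] = None


def parse_ticket_decision(raw: str) -> TicketDecision:
    """Parse model output and reject anything that does not match the declared shape."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")
    allowed = {"action", "summary", "confidence", "suggested_reply"}
    unknown = set(data) - allowed
    if unknown:  # assumption: extra fields are treated as a contract violation
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    for name in ("action", "summary"):
        if not isinstance(data.get(name), str):
            raise ValueError(f"{name} must be a string")
    confidence = data.get("confidence")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    reply = data.get("suggested_reply")
    if reply is not None and not isinstance(reply, str):
        raise ValueError("suggested_reply must be a string or null")
    return TicketDecision(data["action"], data["summary"], float(confidence), reply)
```

A response such as {"action": "escalate", "summary": "x", "confidence": "high"} fails here because confidence is not a number, which is the kind of malformed output the declared type is meant to catch.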

llm

llm must reference an llm client; here it is RoutedSupportAgent. A routed wrapper inherits the budget + downgrade policy and adds fallback dispatch on outage, 429 rate limiting, or budget pressure.

prompt

prompt: input.user_prompt threads the typed input into the LLM call. Tool calls execute under the same auth + tenant context as the route that triggered the agent.

Guards & guardrails

Compiler-checked policy on every agent. The guards block is the difference between an agent and a chat completion.

Guard	What it does
require_auth: true	Fails the run when no caller auth context is present.
tenant_scope: current_user.tenant_id	Pins the agent run to the caller's tenant; emitted in metadata + OTLP.
max_tool_calls: N	Bounds the iteration / tool-call loop. Default: 6.
deny_tools_after_output	Once the LLM returns a valid structured output, no further tool calls are allowed.
output_must_match: T	Carrier validates structured JSON output against the declared type T.
budget_tokens (legacy)	Per-run token cap; defaults to 8192. Pairs with llm-client tenant budgets.
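The guards in the table above can be read as checks around the tool-call loop. The following Python sketch illustrates three of them (require_auth, max_tool_calls, deny_tools_after_output) over a scripted list of model turns. Guards, Step, GuardFailure, and run_with_guards are hypothetical names for illustration only; tenant_scope and output_must_match are left out to keep it short, and none of this is Carrier's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


class GuardFailure(Exception):
    """Raised when a run violates one of its declared guards."""


@dataclass
class Guards:
    require_auth: bool = True
    max_tool_calls: int = 6
    deny_tools_after_output: bool = True


@dataclass
class Step:
    """One model turn: either a tool request or a final structured output."""
    tool: Optional[str] = None
    output: Optional[dict] = None


def run_with_guards(steps, guards, tools, caller=None):
    """Replay scripted model turns, enforcing the guards before each tool call."""
    if guards.require_auth and caller is None:
        raise GuardFailure("require_auth: no caller auth context")
    calls = 0
    output = None
    for step in steps:
        if step.tool is not None:
            if output is not None and guards.deny_tools_after_output:
                raise GuardFailure("deny_tools_after_output: tool call after output")
            if step.tool not in tools:
                raise GuardFailure(f"tool not listed in tools block: {step.tool}")
            if calls >= guards.max_tool_calls:
                raise GuardFailure("max_tool_calls exceeded")
            calls += 1
            tools[step.tool]()
        elif step.output is not None:
            output = step.output
    return output, calls
```

The point of the sketch is ordering: every tool request is checked against the declared limits before it executes, so a looping or misbehaving model fails the run instead of silently doing more work.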

Why it matters

In practice this is the gap between “impressive demo” and “shippable product.” Without these guards an LLM agent can call the wrong tool, leak across tenants, loop on retries, blow a budget, or return malformed JSON the rest of the system can’t consume. CSAs turn each of those into policy that is declared in source and checked by the compiler.

Resilience & telemetry

Agents run as durable workflows. Timeouts, retries, fallbacks, and OTLP spans are first-class — not bolt-on middleware.

resilience

The resilience block takes timeout_ms, retry attempts: N backoff_ms: M, and a typed fallback_output that Carrier returns when the run fails. The fallback must typecheck as the declared output, so callers always get a valid response.
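As a rough illustration of retry with exponential backoff and a typed fallback, here is a small Python sketch. It assumes that attempts counts retries after the first try and that the delay doubles each time starting from backoff_ms; the source does not spell out either detail. timeout_ms is omitted for brevity, and run_with_resilience is a hypothetical name.

```python
import time


def run_with_resilience(call, fallback_output, attempts=2, backoff_ms=250, sleep=time.sleep):
    """Try `call` once, then retry up to `attempts` times with doubling delays.

    Returns the fallback instead of raising, so the caller always receives
    a value of the declared output shape.
    """
    delay_ms = backoff_ms
    for attempt in range(attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == attempts:
                break  # retries exhausted, fall through to the fallback
            sleep(delay_ms / 1000)
            delay_ms *= 2
    return fallback_output
```

With attempts=2 and backoff_ms=250, a call that always fails waits 250 ms and then 500 ms before the fallback is returned. Passing sleep as a parameter keeps the sketch testable without real delays.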

telemetry

emit_tokens, emit_tool_calls, and emit_guard_failures wire the agent into Carrier’s OTLP spans + token-usage metrics + audit log without you writing logging glue.

Bounded tool surface

A CSA can only call tools that already exist on its LLM client, and only the ones you explicitly list. That keeps the agent reviewable: everything it can call is visible in one tools block.

src/support/agents.carrier

llm client routed · budget · downgrade

llm client SupportAgent {
  provider:       "openai"
  wire_format:    "openai"
  model:          env("LLM_MODEL", "gpt-4.1-mini")
  api_key:        env("LLM_API_KEY", "")
  max_tokens:     600
  max_turns:      8
  temperature:    0.2
 
  budget_per_tenant_per_day_usd: 5.0
  over_budget_behavior:          downgrade("SupportAgentFallback")
 
  tool search_help_docs(term: String) -> String[] = search_help_docs
}
 
llm client SupportAgentFallback {
  provider:    "openai"
  wire_format: "openai"
  model:       "gpt-4.1-mini"
  api_key:     env("OPENAI_API_KEY", "")
  max_tokens:  600
  max_turns:   4
  temperature: 0.2
}
 
llm client routed RoutedSupportAgent {
  primary:             SupportAgent
  fallback:            SupportAgentFallback
  route_by:            cost_vs_latency(target_per_request_usd: 0.05)
  on_primary_outage:   fallback
  on_rate_limit:       fallback
  on_budget_pressure:  fallback
}
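The dispatch behavior that the routed client above declares can be sketched in a few lines of Python. This is an illustration under assumptions, not Carrier's routing logic: RoutedClient, ProviderOutage, and RateLimited are hypothetical names, budget pressure is modeled simply as "today's spend has reached the daily ceiling", and route_by: cost_vs_latency is not modeled at all.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


class ProviderOutage(Exception):
    """Stand-in for a primary-provider outage."""


class RateLimited(Exception):
    """Stand-in for an HTTP 429 from the provider."""


@dataclass
class RoutedClient:
    primary: Callable[[str], str]
    fallback: Callable[[str], str]
    daily_budget_usd: float = 5.0

    def complete(self, prompt: str, spent_today_usd: float) -> Tuple[str, str]:
        """Return (client_used, reply), preferring the primary client."""
        # Budget pressure: skip the primary once the tenant's ceiling is reached.
        if spent_today_usd >= self.daily_budget_usd:
            return "fallback", self.fallback(prompt)
        try:
            return "primary", self.primary(prompt)
        except (ProviderOutage, RateLimited):
            # Outage or rate limit: dispatch the same prompt to the fallback.
            return "fallback", self.fallback(prompt)
```

Returning which client served the request is a design choice of the sketch; it makes the routing decision easy to log or assert on.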

tools

action create_ticket binds a real Carrier action with a typed signature. fn search_help_docs binds a pure function. Both run under the caller’s auth + tenant context.

No string glue

Tool schemas are derived from Carrier types, so there is no handwritten JSON schema for the model to misinterpret. If the model’s output has the wrong shape, the run returns the agent’s typed fallback_output.
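The idea of deriving a tool schema from types instead of writing it by hand can be shown with Python type hints standing in for Carrier types. The sketch below is only an analogy: tool_schema and the exact schema layout are assumptions, and the search_help_docs body is a placeholder for the real fn.

```python
from typing import List, get_type_hints

# Mapping from Python scalar types to JSON Schema fragments.
JSON_TYPES = {
    str: {"type": "string"},
    int: {"type": "integer"},
    float: {"type": "number"},
    bool: {"type": "boolean"},
}


def search_help_docs(term: str) -> List[str]:
    """Placeholder for the fn of the same name declared in Carrier source."""
    return [f"How to: {term}"]


def tool_schema(fn) -> dict:
    """Build a JSON-Schema-style tool description from a function's type hints."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # only parameters belong in the input schema
    return {
        "name": fn.__name__,
        "parameters": {
            "type": "object",
            "properties": {name: JSON_TYPES[tp] for name, tp in hints.items()},
            "required": list(hints),
        },
    }
```

Because the schema is computed from the signature, renaming or retyping a parameter changes the schema in the same step, which is the drift problem that handwritten schemas tend to have.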

Policy-aware

When a tool is an action, all the action’s policy + tenant rules apply. The agent cannot bypass policy blocks by routing a request through an LLM.

Calling an agent

Routes call agents the same way they call actions — under their auth context, in their transaction, recorded in audit, scoped to their tenant.

src/30_routes/support.carrier

route → agent.run

route POST "/support/triage" protect Auth -> TicketDecision {
  input: SupportMessage
  summary: "Triage a patient message into a typed routing decision"
 
  handler {
    return SupportTriage.run(input)
  }
}
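From a client's point of view, the route above is an ordinary authenticated POST that takes a SupportMessage and returns a TicketDecision. The Python sketch below only builds the request and does not send it. The base URL, the bearer-token header, and build_triage_request are all assumptions for illustration; the source does not say how protect Auth expects credentials to be presented.

```python
import json
import urllib.request


def build_triage_request(base_url, token, user_prompt, conversation_id=None):
    """Build (but do not send) a POST /support/triage request."""
    # Body fields mirror the SupportMessage type; conversation_id is optional.
    body = {"user_prompt": user_prompt, "conversation_id": conversation_id}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/support/triage",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the Auth policy accepts a bearer token.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Sending the request with urllib.request.urlopen against a running service would be expected to return JSON with the TicketDecision fields: action, summary, confidence, and suggested_reply.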

Agents + RAG

A CSA can be the LLM target of a rag declaration: retriever, embedder, and agent in one typed pipeline. The retriever runs first, the retrieved context is fitted to the context_window_tokens budget, then the agent dispatches with full tool guards.

src/support/rag.carrier

rag · retriever · llm

rag SupportAnswerer {
  retriever:             Doc.similar_with_scores
  embed_with:            embed_text
  llm:                   SupportAgent
  context_window_tokens: 4000
  rerank:                score_threshold(0.7)
  top_k:                 8
}
 
route POST "/support/answer" public -> String {
  input: SimilarDocRequest
  handler {
    let response = SupportAnswerer.respond(input.query)
    return response.text
  }
}
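The rerank, top_k, and context_window_tokens settings in the rag declaration above describe a selection step between retrieval and the LLM call. A possible reading of that step, sketched in Python: drop results below the score threshold, keep the best top_k, then add documents until the token budget is used. The order of these operations, the whitespace-based token count, and the select_context name are assumptions, not documented Carrier behavior.

```python
def select_context(
    scored_docs,
    score_threshold=0.7,
    top_k=8,
    context_window_tokens=4000,
    count_tokens=lambda text: len(text.split()),  # crude stand-in for a real tokenizer
):
    """Pick the documents to place in the prompt, best score first."""
    # 1. Rerank: discard anything under the score threshold.
    kept = [(doc, score) for doc, score in scored_docs if score >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    # 2. Keep at most top_k, then 3. pack documents while they fit the budget.
    selected, used = [], 0
    for doc, _ in kept[:top_k]:
        cost = count_tokens(doc)
        if used + cost > context_window_tokens:
            continue  # skip a document that does not fit; a smaller one may
        selected.append(doc)
        used += cost
    return selected
```

Skipping an oversized document rather than stopping at it is one of several reasonable packing policies; stopping at the first document that does not fit would preserve strict score order instead.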

Chat threads & UI

Expose an agent inside a generated UI as a typed chat-thread participant. Identity, thread id, and reply channel are wired from the host chat session — durability lands in carrier_workflow_state.

src/ui/patient_chat.carrier

ui · expose agent · thread_participant

ui PatientChat {
  framework: yew
 
  expose agent SupportTriage as thread_participant {
    identity:         from_chat_session
    thread:           injected
    response_channel: thread_reply
  }
 
  safety {
    redact_pii:   auto
    require_auth: auto
    tenant_scope: auto
  }
}

Continuity

If the agent input declares conversation_id: String?, the generated manifest marks that field as the host thread/conversation binding so subsequent turns continue the same conversation.

Targets · Rust, Java, Node

Same source. Three runtimes. Production stays Rust; Spring Boot is the JVM-native target; Node is the dev server.

Rust

Native binary on the Tokio async runtime with an Axum router, sqlx, and OTLP. Executable CSA runs are fully supported here, with low memory use and low tail latency.

Java / Spring Boot

JDBC + HikariCP. Metadata-only legacy agents work today; executable CSA runs currently fail closed with explicit unsupported-feature diagnostics.

Node / Fastify

~200 ms restart for dev iteration. Same migration SQL as the Rust target. Executable CSA runs currently fail closed — develop the agent surface, ship the runtime in Rust.

vs. handwritten / framework agents

What you don't write when you use a CSA.

Concern	Carrier	Handwritten / framework
Tool schema	✓ Derived from Carrier types	Handwritten JSON Schema, drifts from code
Auth + tenant	✓ Inherited from caller	Re-implemented per agent
Output validation	✓ Compiler-checked against output type	Try/catch around JSON.parse
Tool surface	✓ Bounded by the tools block	All client tools by default
Durability	✓ carrier_workflow_state row per run	In-process state lost on crash
Telemetry	✓ OTLP spans, token + tool counters, audit row	Bring-your-own logger
Fallback	✓ Typed fallback_output, always returns the contract	Caller handles partial errors
Budgets	✓ Per-tenant USD ceilings via routed LLM client	Manual rate-limiting middleware

Read further

Where to go next

Agents share the same compiler, runtime, and audit surface as the rest of Carrier. Every concept here links back to a real construct.

The Book — chapter on AI agents

A long-form explanation of how Carrier treats LLMs, agents, and RAG as language features rather than libraries.

Workflows & runtime

Agents share the saga / retry / compensation runtime — see how it works under the hood.

Language reference · llm client

The underlying typed LLM client surface — budgets, routed wrappers, and tool schemas.