Overview
The Agents domain is Backside’s in-process agent executor. You register one agent_config per (tenant, kind), attach your own Anthropic API key, and POST to /api/v1/agents/{kind}/runs to enqueue work. A worker daemon picks the run up via Postgres LISTEN/NOTIFY, drives the LLM ↔ tool loop, and records every step.
See the Agents guide for the narrative overview. This page is the reference.
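As a minimal sketch of the enqueue call, the following builds the POST request with Python's standard library without sending it. Only the path /api/v1/agents/{kind}/runs comes from this page; the host, the bearer-token Authorization header, and the shape of the request body are assumptions for illustration.

```python
import json
import urllib.request

# Hypothetical host: this page does not state the API base URL.
BASE_URL = "https://backside.example.com"


def build_enqueue_request(kind: str, api_token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/v1/agents/{kind}/runs."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/agents/{kind}/runs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",  # assumed auth scheme
        },
        method="POST",
    )


# The body shape ({"input": {}}) is a placeholder, not a documented schema.
req = build_enqueue_request("contact_deduper", "example-token", {"input": {}})
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; the response schema is not covered here.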
Data Model
Entities
| Entity | Description |
|---|---|
| Agent Config | Per-(tenant, kind) configuration: system prompt overrides, tool set, budget, deadline, guardrail set, disable state. One row per kind. |
| Agent Run | One row per enqueue. Carries status, claim state, cost rollup, failure category. |
| Agent Step | Content-hashed step in a run — either an LLM turn or a tool call. Token usage and cost per step. |
| Agent Event | Structured observability row. Guardrail decisions, auto-disable flips, waiting transitions. |
Agent kinds
Today, one kind is GA:
| Kind | Purpose |
|---|---|
| contact_deduper | Scans contacts for likely duplicates and proposes merges. |
Key Concepts
BYOK (Bring Your Own Key)
Every agent run authenticates to Anthropic with a key you provide. Backside encrypts it with your tenant’s DEK (AES-256-GCM) and loads it only at the worker boundary. Rotate with PUT /api/v1/agent-configs/{kind}/byok-key. Revoke with DELETE. There is no shared fallback key in production: if the tenant key is missing or revoked, runs fail with auth_failed.
Probe a key with POST /api/v1/agent-configs/validate-key before wiring it into a config. The probe is a one-shot live call to Anthropic. The response is either {"status": "valid"} or {"status": "invalid", "category": ..., "message": ...} where category is one of invalid_key, revoked_or_expired, insufficient_permissions, quota_exhausted, provider_unavailable, or rate_limited. The first four are terminal (require operator action); the last two are transient and retry-safe.
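The terminal versus transient split can be turned into a small client-side decision helper. The sketch below uses the status and category values listed above; the function name and the returned action labels are hypothetical.

```python
# Categories as listed on this page.
TERMINAL = {"invalid_key", "revoked_or_expired", "insufficient_permissions", "quota_exhausted"}
TRANSIENT = {"provider_unavailable", "rate_limited"}


def next_action(response: dict) -> str:
    """Map a validate-key response to 'proceed', 'retry', or 'fix_key'."""
    if response.get("status") == "valid":
        return "proceed"
    category = response.get("category")
    if category in TRANSIENT:
        return "retry"      # safe to probe again later
    if category in TERMINAL:
        return "fix_key"    # operator action required
    raise ValueError(f"unexpected validate-key response: {response!r}")


print(next_action({"status": "invalid", "category": "rate_limited", "message": "slow down"}))
```

Raising on an unknown category, rather than guessing, keeps the caller from silently retrying a key that needs operator attention.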
Run state machine
| State | Who owns the row | Notes |
|---|---|---|
| queued | No worker | Freshly inserted by enqueue_run |
| running | A worker (holds lease) | worker_id + lease_expires_at set; renewed every 10s |
| waiting | No worker | An approval_gate guardrail paused the run. Lease cleared |
| succeeded | No worker (terminal) | Final output available; cost_usd_cents rolled up |
| failed | No worker (terminal) | failure_category tells you why |
| cancelled | No worker (terminal) | Operator or API caller killed it |
Every state transition is applied as a conditional UPDATE ... WHERE status = $old, which ensures no two processes ever own the same run.
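The conditional UPDATE ... WHERE status = $old guard can be illustrated with a small compare-and-swap sketch. It uses an in-memory SQLite database as a stand-in for Postgres, and the agent_runs table and its id and status columns are assumed names, not the actual schema.

```python
import sqlite3


def transition(conn: sqlite3.Connection, run_id: int, old: str, new: str) -> bool:
    """Move a run from `old` to `new` only if it is still in `old`."""
    cur = conn.execute(
        "UPDATE agent_runs SET status = ? WHERE id = ? AND status = ?",
        (new, run_id, old),
    )
    conn.commit()
    # Exactly one row changes for the winner; a loser matches zero rows.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agent_runs (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO agent_runs (id, status) VALUES (1, 'queued')")

first_claim = transition(conn, 1, "queued", "running")
second_claim = transition(conn, 1, "queued", "running")
print(first_claim, second_claim)  # True False
```

The second claim fails because the row no longer matches status = 'queued', so only one caller ever sees a successful transition.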
Failure categories
Every terminal failure carries a failure_category:
| Category | Meaning |
|---|---|
| auth_failed | BYOK key was rejected by Anthropic |
| tool_failed | A tool call returned an unrecoverable error |
| guardrail_blocked | An enforced guardrail rule blocked a call |
| config_error | The config is malformed (missing tool, unknown kind, etc.) |
| timeout | The deadline elapsed before the run could finish |
| budget_exhausted | Rolled cost exceeded the configured budget cap |
Guardrails
Every config carries a GuardrailSet, a list of rules enforced before every tool call. Five primitives:
- allowlist/denylist: name-based tool gating
- rate_limit: sliding window per-(run, tool) via Dragonfly
- approval_gate: pauses the run to waiting on match
- io_validation: JSON Schema check against tool input
- quiet_hours: tenant-local time window block
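To make the pre-call check concrete, here is a simplified sketch covering three of the primitives (allowlist, denylist, approval_gate). The GuardrailSet fields, the evaluation order, the returned labels, and the tool names are all assumptions for illustration; the real rule format is not specified on this page.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class GuardrailSet:
    """Hypothetical, simplified rule container."""
    allowlist: Optional[Set[str]] = None          # None means no allowlist configured
    denylist: Set[str] = field(default_factory=set)
    approval_gate: Set[str] = field(default_factory=set)


def check_tool_call(rules: GuardrailSet, tool: str) -> str:
    """Return 'block', 'wait', or 'allow' for a proposed tool call."""
    if tool in rules.denylist:
        return "block"
    if rules.allowlist is not None and tool not in rules.allowlist:
        return "block"
    if tool in rules.approval_gate:
        return "wait"   # would move the run to the waiting state
    return "allow"


rules = GuardrailSet(
    allowlist={"search_contacts", "merge_contacts"},  # hypothetical tool names
    approval_gate={"merge_contacts"},
)
for tool in ("search_contacts", "merge_contacts", "delete_contact"):
    print(tool, check_tool_call(rules, tool))
```

Checking blocking rules before the approval gate is one possible ordering; it avoids pausing a run for a call that would be rejected anyway.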
Durable replay
Every step writes to agent_steps keyed by (run_id, seq) with a UNIQUE (tenant_id, run_id, content_hash) constraint. If a worker crashes mid-run, the next one replays every journaled step deterministically, skipping rather than re-executing. The hash is computed over canonical JSON of the step inputs, so replays are bit-identical.
Large tool payloads spill out-of-line to agent_step_payloads — the main step row stays small, the payload table stores the bulk.
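The content-hash and skip-on-replay idea can be sketched as follows. The choice of SHA-256, the exact canonicalization (sorted keys, compact separators), and the in-memory journal dict are assumptions; the page only states that the hash covers canonical JSON of the step inputs.

```python
import hashlib
import json

journal = {}  # stand-in for agent_steps: content_hash -> recorded result


def content_hash(step_inputs: dict) -> str:
    """Hash a canonical JSON encoding so key order does not matter."""
    canonical = json.dumps(step_inputs, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_step(step_inputs: dict, execute):
    """Return the journaled result if present; otherwise execute and record."""
    key = content_hash(step_inputs)
    if key in journal:
        return journal[key]          # replay: skip re-execution
    journal[key] = execute(step_inputs)
    return journal[key]


calls = []
def fake_tool(inputs):               # hypothetical tool with a visible side effect
    calls.append(inputs)
    return {"matches": 2}

first = run_step({"tool": "search_contacts", "args": {"q": "ada", "limit": 5}}, fake_tool)
replayed = run_step({"args": {"limit": 5, "q": "ada"}, "tool": "search_contacts"}, fake_tool)
print(first == replayed, len(calls))  # True 1
```

The second call differs only in key order, hashes to the same value, and is served from the journal without invoking the tool again.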
Cost tracking
Per-step cost is estimated from Anthropic’s published rates (input, output, cache-read, cache-write) and the model used. When a run reaches a terminal state, its cost_usd_cents is the sum over all steps. The agent_runs_billing_daily view rolls totals per (tenant_id, agent_kind) per day; use it for your own billing dashboards.
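A worked example of the per-step estimate and the run-level sum is sketched below. The rate table is a placeholder with made-up numbers for a hypothetical model name, not Anthropic's actual pricing, and the StepUsage type is an assumption.

```python
from dataclasses import dataclass

# Placeholder rates in US cents per million tokens; substitute the published rates.
RATES_CENTS_PER_MTOK = {
    "example-model": {"input": 300, "output": 1500, "cache_read": 30, "cache_write": 375},
}


@dataclass
class StepUsage:
    model: str
    input: int = 0
    output: int = 0
    cache_read: int = 0
    cache_write: int = 0


def step_cost_cents(usage: StepUsage) -> float:
    """Estimate one step's cost from its token counts and the model's rates."""
    rates = RATES_CENTS_PER_MTOK[usage.model]
    return sum(getattr(usage, kind) * rate for kind, rate in rates.items()) / 1_000_000


steps = [
    StepUsage("example-model", input=10_000, output=2_000),
    StepUsage("example-model", input=4_000, output=500, cache_read=10_000),
]
run_cost_cents = sum(step_cost_cents(s) for s in steps)
print(run_cost_cents)  # 8.25
```

With these placeholder rates the first step costs 6.0 cents and the second 2.25 cents, so the run total is 8.25 cents.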
Auto-disable circuit breaker
If a (tenant, agent_kind) pair accumulates 10 failed runs in 48 hours, the worker stamps agent_configs.disabled_at = now() and writes a disable_reason. New POST /runs calls return 409 Conflict until an operator nulls both columns. There is no automatic recovery: the circuit stays open until human action re-closes it.
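The trip condition can be sketched as a sliding-window count. The threshold and window come from the text above; treating the threshold as "10 or more" and the window boundary as inclusive are assumptions, as is the function name.

```python
from datetime import datetime, timedelta, timezone
from typing import List

FAILURE_THRESHOLD = 10
WINDOW = timedelta(hours=48)


def should_disable(failed_at: List[datetime], now: datetime) -> bool:
    """True if enough failures for one (tenant, agent_kind) fall inside the window."""
    recent = [t for t in failed_at if timedelta(0) <= now - t <= WINDOW]
    return len(recent) >= FAILURE_THRESHOLD


now = datetime(2026, 4, 1, 12, 0, tzinfo=timezone.utc)
ten_recent = [now - timedelta(hours=h) for h in range(10)]
nine_recent_one_old = [now - timedelta(hours=h) for h in range(9)] + [now - timedelta(hours=49)]

print(should_disable(ten_recent, now), should_disable(nine_recent_one_old, now))  # True False
```

A failure 49 hours old falls outside the window, so nine recent failures plus one stale one do not trip the breaker.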
Required fields
Minimum config:
Multi-tenancy
Every agent table has RLS. Every worker query runs inside a transaction with SET LOCAL app.tenant_id = ... plus a belt-and-suspenders (run_id, tenant_id) check before acting. A bug in the tenancy layer cannot leak rows across tenants because the database enforces isolation.
Operational notes
- Runs survive API and worker restarts. queued runs sit in the table until a worker wakes; running runs with expired leases get released to queued by the orphan recovery job
- The worker exposes an internal health endpoint on :4500 for container healthchecks; it is not reachable over the public internet
- Deployment is decoupled from the API: the worker is a separate Docker container (backside-agent-worker), and you can scale it horizontally without touching the API tier
- Every run emits OpenTelemetry GenAI semconv spans: agent.run, gen_ai.chat, gen_ai.tool. Ship them to the observability stack of your choice
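The orphan recovery behavior mentioned in the first note can be sketched in memory. The Run type and the release_orphans function are hypothetical; the field names worker_id and lease_expires_at follow the state table, and the real job presumably runs as a database update rather than a Python loop.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Run:
    id: int
    status: str
    worker_id: Optional[str] = None
    lease_expires_at: Optional[datetime] = None


def release_orphans(runs: List[Run], now: datetime) -> List[int]:
    """Return running runs whose lease has expired to the queued state."""
    released = []
    for run in runs:
        expired = run.lease_expires_at is not None and run.lease_expires_at < now
        if run.status == "running" and expired:
            run.status = "queued"
            run.worker_id = None
            run.lease_expires_at = None
            released.append(run.id)
    return released


now = datetime(2026, 4, 1, 12, 0, tzinfo=timezone.utc)
runs = [
    Run(1, "running", "worker-a", now - timedelta(seconds=30)),  # lease expired
    Run(2, "running", "worker-b", now + timedelta(seconds=10)),  # lease still live
    Run(3, "queued"),
]
released_ids = release_orphans(runs, now)
print(released_ids)  # [1]
```

Only the run with the expired lease is requeued; the live lease and the already queued run are left alone.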
Limits
- One GA kind (contact_deduper) as of April 2026; more land as they graduate
- Approval resume endpoint is not exposed in Phase 2. waiting runs need operator unblocking; the client-facing resume call lands in Phase 3
- Mid-run budget action hooks are not exposed; the cap fires only at terminal rollup
- Model routing is static per-run; automatic multi-model failover is a design-phase item
