When to use this
Reach for a Backside agent run when:
- You want a task to execute against your Backside data (contacts, CRM, tasks, calendar, notes) without standing up your own inference stack
- The work is bounded: a dedupe pass, a follow-up generation, a weekly cleanup, a daily briefing — something that starts, does a thing, and stops
- You need the work to survive process restarts and mid-run failures
- You want per-run cost caps, guardrails on which tools can run, and structured audit trails
Skip it when:
- You already have an agent runner (Claude Code, Cursor, a custom loop). In that case, connect it to the MCP server and skip the agent executor: your runner orchestrates, Backside serves data.
- You want open-ended conversational access with a human in the loop.
Available agent kinds
Today, one kind is generally available:
| Kind | What it does |
|---|---|
| contact_deduper | Scans your contacts for likely duplicates (email/phone/name collisions) and proposes merges. |
Your keys run the show (BYOK)
Agent runs use your Anthropic API key. Backside never charges you for LLM tokens; the cost lands on your Anthropic invoice, not ours. You attach a key once per kind with a PUT /api/v1/agent-configs/{kind}/byok-key call; Backside encrypts it with your tenant DEK and uses it on every run. Rotate or revoke any time.
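A minimal sketch of building that PUT request with Python's standard library. Only the path /api/v1/agent-configs/{kind}/byok-key comes from the text above; the base URL, the bearer-token header, and the api_key body field are assumptions for illustration, so check the API reference for the real request shape.

```python
import json
import urllib.request

# Hypothetical base URL; substitute your real Backside host.
BASE_URL = "https://backside.example.com"


def build_byok_request(kind: str, anthropic_key: str, backside_token: str) -> urllib.request.Request:
    """Build (but do not send) the PUT that attaches a BYOK key to an agent kind."""
    # The body field name "api_key" is an assumption, not a documented schema.
    body = json.dumps({"api_key": anthropic_key}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/agent-configs/{kind}/byok-key",
        data=body,
        method="PUT",
        headers={
            # Bearer auth is an assumption about how Backside authenticates callers.
            "Authorization": f"Bearer {backside_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it. Keeping construction separate from sending makes the request easy to inspect before a real key goes over the wire.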
A probe call verifies the key works before you enqueue a real run. It returns a ProbeResult: either {"status": "valid"} or {"status": "invalid", "category": ..., "message": ...}. The category is one of invalid_key, revoked_or_expired, insufficient_permissions, quota_exhausted (terminal: these require action on your side) or provider_unavailable, rate_limited (transient: safe to retry).
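The terminal/transient split above maps naturally onto a small decision helper. The sketch below assumes the probe response has already been parsed into a dict; the return values "enqueue", "retry", and "fix_key" are hypothetical labels chosen for this example, not part of the API.

```python
# Categories copied from the ProbeResult description above.
TERMINAL = {"invalid_key", "revoked_or_expired", "insufficient_permissions", "quota_exhausted"}
TRANSIENT = {"provider_unavailable", "rate_limited"}


def next_action(probe: dict) -> str:
    """Return 'enqueue', 'retry', or 'fix_key' for a ProbeResult-shaped dict."""
    if probe.get("status") == "valid":
        return "enqueue"
    if probe.get("category") in TRANSIENT:
        return "retry"
    # Terminal categories, and anything unrecognized, need a human to act.
    return "fix_key"
```

Treating unknown categories as terminal is a conservative choice: it avoids retrying forever against a category this code does not understand.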
Run lifecycle
| Status | Meaning |
|---|---|
| queued | Your POST /runs call accepted the work; no worker has claimed it yet |
| running | A worker holds a 30-second lease and is driving the executor loop |
| waiting | A guardrail paused the run for approval (see below). Lease is cleared; no worker is spending time on it |
| succeeded | Terminal. Final output is available; cost was rolled up |
| failed | Terminal. failure_category explains why (auth_failed, tool_failed, guardrail_blocked, config_error, timeout, budget_exhausted) |
| cancelled | Terminal. Operator or API caller killed the run via POST /runs/{id}/cancel |
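Client code usually needs to tell terminal statuses from in-progress ones and surface failure_category on failure. A minimal sketch, assuming a run is available as a dict with the status and failure_category fields named in the table; the summary strings are illustrative only.

```python
# Statuses and failure categories copied from the lifecycle table above.
TERMINAL_STATUSES = frozenset({"succeeded", "failed", "cancelled"})
FAILURE_CATEGORIES = frozenset({
    "auth_failed", "tool_failed", "guardrail_blocked",
    "config_error", "timeout", "budget_exhausted",
})


def summarize_run(run: dict) -> str:
    """One-line, human-readable summary of a run record."""
    status = run["status"]
    if status not in TERMINAL_STATUSES:
        return f"{status} (in progress)"
    if status == "failed":
        category = run.get("failure_category", "unknown")
        note = "" if category in FAILURE_CATEGORIES else " (unrecognized category)"
        return f"failed: {category}{note}"
    return status
```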
Enqueueing a run
The POST /runs call returns immediately. A worker wakes via Postgres LISTEN/NOTIFY within milliseconds and starts the run. Poll GET /runs/{id} for status or set up a webhook.
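If you poll rather than use a webhook, the loop can be sketched as below. The fetch_run callable is a stand-in for whatever performs GET /runs/{id} and returns the parsed body; its name, the interval, and the poll limit are assumptions for this example.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_run(
    fetch_run: Callable[[], dict],
    interval_s: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the run reaches a terminal status or max_polls is used up."""
    if max_polls < 1:
        raise ValueError("max_polls must be at least 1")
    for attempt in range(max_polls):
        run = fetch_run()
        if run["status"] in TERMINAL_STATUSES:
            return run
        if attempt < max_polls - 1:
            sleep(interval_s)  # skip the pointless sleep after the final poll
    raise TimeoutError(f"run still {run['status']!r} after {max_polls} polls")
```

A run parked in waiting never becomes terminal on its own, so the poll limit is what keeps this loop from spinning indefinitely while a run awaits approval.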
Inspecting a run
Guardrails
Every agent config carries a guardrail set: a list of rules the worker enforces before each tool call. Six primitives:
| Rule | Purpose |
|---|---|
| allowlist | Only listed tools may run |
| denylist | Listed tools may never run |
| rate_limit | Sliding-window cap per tool per run |
| approval_gate | Matching call pauses the run to waiting for human approval |
| io_validation | JSON Schema check on tool input before the call lands |
| quiet_hours | Block tool calls outside a tenant-local time window |
Each rule runs in one of two modes:
- shadow: the evaluator logs its decision but doesn’t block. Use this to test a new rule against live traffic.
- enforce: block or suspend as the rule specifies.
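To make the interaction between rules and modes concrete, here is a rough sketch of an evaluator covering three of the six primitives. The rule dict shape (type, tools, mode) and the Decision class are hypothetical; Backside's real config schema and evaluator may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    action: str = "allow"  # "allow", "block", or "suspend"
    shadow_hits: list = field(default_factory=list)  # rules that would have fired


def rule_verdict(rule: dict, tool: str) -> str:
    """What a single rule wants for this tool call, ignoring its mode."""
    kind, tools = rule["type"], set(rule.get("tools", []))
    if kind == "allowlist":
        return "allow" if tool in tools else "block"
    if kind == "denylist":
        return "block" if tool in tools else "allow"
    if kind == "approval_gate":
        return "suspend" if tool in tools else "allow"
    return "allow"  # the other primitives are out of scope for this sketch


def evaluate(rules: list, tool: str) -> Decision:
    """Combine all rules; a missing mode is treated as enforce."""
    decision = Decision()
    for rule in rules:
        verdict = rule_verdict(rule, tool)
        if verdict == "allow":
            continue
        if rule.get("mode") == "shadow":
            decision.shadow_hits.append(rule["type"])  # log only, never block
        elif decision.action != "block":
            decision.action = verdict  # a block outranks a suspend
    return decision
```

With a shadow-mode denylist, a matching call still returns action "allow" but records the rule in shadow_hits, which is the signal you would review before switching the rule to enforce.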
Cost tracking and auto-disable
Every step’s cost (in USD cents) is recorded on agent_steps. When a run reaches a terminal status, the run row gets a rolled-up cost_usd_cents. You can pull daily totals per (tenant, agent_kind) from the billing view for your own dashboards.
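The rollup is plain integer arithmetic over cents, which can be sketched as follows. The per-step field name cost_usd_cents and the tenant, agent_kind, and day keys on each run are assumptions for this example; the actual column names in agent_steps and the billing view may differ.

```python
from collections import defaultdict


def rollup_cost_cents(steps: list[dict]) -> int:
    """Sum per-step costs into the run-level total, staying in integer cents."""
    return sum(step["cost_usd_cents"] for step in steps)


def daily_totals(runs: list[dict]) -> dict:
    """Aggregate run totals per (tenant, agent_kind, day) for a dashboard."""
    totals = defaultdict(int)
    for run in runs:
        key = (run["tenant"], run["agent_kind"], run["day"])
        totals[key] += run["cost_usd_cents"]
    return dict(totals)
```

Keeping amounts as integer cents avoids floating-point drift when many small step costs are added together.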
On the operational side, Backside watches for runaway agents. If a (tenant, agent_kind) pair accumulates 10 failed runs in 48 hours, the config auto-disables: disabled_at gets stamped and future POST /runs calls return 409 Conflict with a reason. Manual operator action is required to re-enable — no self-healing. This is the runaway-agent circuit breaker.
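The threshold check behind that circuit breaker can be approximated as below. The 10-failure and 48-hour figures come from the text; treating the window as a sliding one that ends at the current time, and the function name itself, are assumptions.

```python
from datetime import datetime, timedelta

FAILURE_THRESHOLD = 10
WINDOW = timedelta(hours=48)


def should_auto_disable(failed_at: list[datetime], now: datetime) -> bool:
    """True when enough failures fall inside the trailing 48-hour window."""
    cutoff = now - WINDOW
    recent = [t for t in failed_at if cutoff < t <= now]
    return len(recent) >= FAILURE_THRESHOLD
```

If your own monitoring runs a check like this, you can alert before the tenth failure rather than discovering the disabled config through a 409 response.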
What’s in a step
Each agent_step is content-hashed. If a worker crashes mid-run and another picks it up, the replay logic skips every step whose hash matches a prior record and resumes from the first un-journaled point. You don’t rebuild state by hand. The database is the state.
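The replay idea can be illustrated with a small sketch. The hashing scheme here (SHA-256 over canonical JSON) and the function names are assumptions; the source does not say how Backside actually computes step hashes.

```python
import hashlib
import json


def step_hash(step: dict) -> str:
    """Hash a step's content; sorted keys make the hash order-independent."""
    canonical = json.dumps(step, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def resume_index(planned_steps: list[dict], journal_hashes: list[str]) -> int:
    """Index of the first planned step that is not already journaled."""
    for i, step in enumerate(planned_steps):
        if i >= len(journal_hashes) or journal_hashes[i] != step_hash(step):
            return i
    return len(planned_steps)
```

A worker picking up a crashed run would execute planned_steps from resume_index onward, leaving the already-journaled prefix untouched.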
Limits and deferred features
- One run type at a time per kind. Keep your configs focused.
- Approval resume endpoint is not yet exposed. Runs in waiting currently need operator intervention to unblock. This lands in Phase 3.
- Mid-run budget action hooks are not exposed. Configs still honor budget_usd_cents as a terminal cap: if the rolled-up total exceeds the cap, the run fails with budget_exhausted.
- Model routing is static. The executor talks to Anthropic via the key you provide. Multi-model failover is on the roadmap under the model-failover.md design doc; contact us if you need it.
Next steps
- API reference for agents — the full endpoint list
- Multi-tenancy — how tenant isolation applies to agent runs
- MCP server — use your existing agent runner with Backside data
