API Security
Purpose
This document captures the Phase 5 security boundary for the Phoenix-backed DecisionGraph service.
The public API is multi-tenant, authenticated, and intentionally conservative around replay and rebuild controls.
Trust Boundary
The first HTTP service boundary assumes:
- clients authenticate with static service-account bearer tokens
- every versioned API request includes x-tenant-id
- tenant authorization happens before controller logic
- Phoenix controllers stay thin and delegate service logic to dg_api
The unversioned GET /api/healthz route is a deployment-health route and not a tenant-scoped product API.
Route Risk Levels
Lowest risk:
- GET /api/v1/traces/:trace_id
- GET /api/v1/graph/context
- GET /api/v1/graph/edges
- GET /api/v1/precedents
- GET /api/v1/projections/health
Medium risk:
- POST /api/v1/events
Highest risk:
- POST /api/v1/admin/replays
- GET /api/v1/admin/replays/:job_id
- POST /api/v1/admin/replays/:job_id/cancel
Auth Model
Every versioned route requires:
- Authorization: Bearer <token>
- x-tenant-id: <tenant>
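As a minimal sketch of the header contract above, the following plain Elixir module extracts the bearer token and tenant ID from a list of request headers. The module name AuthHeaders, its function names, and its error atoms are hypothetical and chosen for illustration; the actual auth plug may be structured differently.

```elixir
defmodule AuthHeaders do
  @moduledoc """
  Hypothetical helper, not part of dg_api: extracts the credentials that
  every versioned route requires from a list of {name, value} headers.
  """

  # Returns {:ok, %{token: ..., tenant_id: ...}} or {:error, reason}.
  def extract(headers) when is_list(headers) do
    with {:ok, token} <- bearer_token(headers),
         {:ok, tenant_id} <- tenant_id(headers) do
      {:ok, %{token: token, tenant_id: tenant_id}}
    end
  end

  defp bearer_token(headers) do
    case find(headers, "authorization") do
      "Bearer " <> token when token != "" -> {:ok, token}
      _ -> {:error, :missing_bearer_token}
    end
  end

  defp tenant_id(headers) do
    case find(headers, "x-tenant-id") do
      tenant when is_binary(tenant) and tenant != "" -> {:ok, tenant}
      _ -> {:error, :missing_tenant}
    end
  end

  # Header names are compared case-insensitively.
  defp find(headers, name) do
    Enum.find_value(headers, fn {key, value} ->
      if String.downcase(key) == name, do: value
    end)
  end
end
```

A request that lacks either header fails before any token lookup or tenant check runs, which keeps the rejection path cheap.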
Service accounts are configured under:
- beam/config/dev.exs
- beam/config/test.exs
Each account carries:
- roles
- tenant_ids
- permissions
Role checks gate route families:
- reader for read APIs
- writer for ingestion
- admin for replay controls
Permission checks harden sensitive admin actions further:
- projection_replay for catch-up replay, replay status, and replay cancel
- projection_rebuild for rebuild creation, rebuild status, and rebuild cancel
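The two layers of checks can be sketched as follows. This is an illustration only: the module name AccessPolicy, the family and action atoms, and the representation of roles and permissions as lists of strings on the account are assumptions, not the service's actual data model.

```elixir
defmodule AccessPolicy do
  @moduledoc """
  Hypothetical sketch of the role and permission gates; names are illustrative.
  """

  # Route family -> role required to reach it.
  @required_role %{read: "reader", write: "writer", admin: "admin"}

  # Sensitive admin action -> additional permission.
  @required_permission %{replay: "projection_replay", rebuild: "projection_rebuild"}

  # Role gate for a whole route family (:read, :write, or :admin).
  def authorize_family(account, family) do
    role = Map.fetch!(@required_role, family)
    if role in account.roles, do: :ok, else: {:error, :forbidden}
  end

  # Admin actions need the admin role plus the matching permission.
  def authorize_admin_action(account, action) do
    permission = Map.fetch!(@required_permission, action)

    with :ok <- authorize_family(account, :admin) do
      if permission in account.permissions, do: :ok, else: {:error, :forbidden}
    end
  end
end
```

In this sketch an account holding the admin role but only projection_replay can start and cancel catch-up replays, yet is still refused a rebuild.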
Tenant Isolation
Tenant isolation is enforced twice:
- the auth plug rejects accounts that are not allowed to use the requested tenant
- replay status and cancel lookups are scoped to the requested tenant before any result is returned
That second rule matters because replay jobs are addressed by job_id. A caller that knows another tenant's job ID should still receive not_found.
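A minimal sketch of the second rule, assuming a hypothetical in-memory map of jobs in place of the real replay store. The point is that the tenant comparison is part of the lookup itself, so a job owned by another tenant is indistinguishable from a job that does not exist.

```elixir
defmodule ReplayJobs do
  @moduledoc """
  Hypothetical in-memory stand-in for the replay job store.
  """

  # jobs is a map of job_id => %{tenant_id: ..., status: ...}.
  # The pin (^tenant_id) makes the match fail for any other tenant's job,
  # so the caller sees the same :not_found as for an unknown job ID.
  def fetch(jobs, tenant_id, job_id) do
    case Map.get(jobs, job_id) do
      %{tenant_id: ^tenant_id} = job -> {:ok, job}
      _ -> {:error, :not_found}
    end
  end
end
```

Returning a distinct error such as forbidden for the cross-tenant case would confirm that the job ID exists, which is exactly the leak this rule avoids.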
Replay Safeguards
Replay and rebuild routes now enforce these safeguards:
- admin role is required
- an explicit replay permission is required
- rebuild can be disabled per environment through :dg_api, :admin_controls
- a human-readable reason is required by default for replay and rebuild requests
- operator metadata is persisted into replay run metadata
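The safeguards above can be read as an ordered chain of checks. The sketch below shows one way to express that chain; the module name ReplayGuard, the error atoms, the "reason" parameter key, and the shape of the returned operator metadata are all assumptions made for illustration.

```elixir
defmodule ReplayGuard do
  @moduledoc """
  Hypothetical sketch of the safeguard chain for replay and rebuild requests.
  """

  @permissions %{replay: "projection_replay", rebuild: "projection_rebuild"}

  # controls is a keyword list using the documented keys
  # allow_rebuild and require_reason.
  def check(account, mode, params, controls) when mode in [:replay, :rebuild] do
    allow_rebuild? = Keyword.get(controls, :allow_rebuild, false)
    require_reason? = Keyword.get(controls, :require_reason, true)
    reason = params["reason"]

    cond do
      "admin" not in account.roles -> {:error, :forbidden}
      Map.fetch!(@permissions, mode) not in account.permissions -> {:error, :forbidden}
      mode == :rebuild and not allow_rebuild? -> {:error, :rebuild_disabled}
      require_reason? and blank?(reason) -> {:error, :reason_required}
      true -> {:ok, operator_metadata(account, mode, reason)}
    end
  end

  # Operator metadata to persist alongside the replay run.
  defp operator_metadata(account, mode, reason) do
    %{mode: mode, reason: reason, requested_by: account.id, roles: account.roles}
  end

  defp blank?(value), do: not is_binary(value) or String.trim(value) == ""
end
```

Checking the role and permission before the environment control means an unauthorized caller cannot learn whether rebuild is enabled in a given environment.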
The default global control lives in:
beam/config/config.exs
Current default:
- allow_rebuild: false
- require_reason: true
The development and test configurations override allow_rebuild to true so local flows remain usable.
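Assuming the controls are stored as a keyword list under the :dg_api, :admin_controls key, the default in beam/config/config.exs might look like the fragment below. The exact layout of the real file is not shown in this document, so treat this as a sketch of the shape rather than a copy of it.

```elixir
# beam/config/config.exs (sketch; surrounding configuration omitted)
import Config

config :dg_api, :admin_controls,
  allow_rebuild: false,
  require_reason: true
```

An environment override would then only need to restate the key it changes, because Config deep-merges keyword lists. Placing the override in the per-environment config files is an assumption here.

```elixir
# Per-environment override for development and test (sketch)
import Config

config :dg_api, :admin_controls, allow_rebuild: true
```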
Audit Capture
Sensitive admin actions emit audit records through:
- logger entries with api_action, account_id, job_id, request_id, and tenant_id
- telemetry events under [:decision_graph, :api, :admin, :audit]
Current audited actions:
- replay/rebuild start
- replay cancel
Audit metadata is also written into replay run metadata where available:
- reason
- request_id
- requesting account identity and roles
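A rough sketch of how one audited action could produce both a logger entry and a telemetry event. The module name AdminAudit, the context map, the log message text, and the count measurement are assumptions; only the metadata field names and the event name come from this document.

```elixir
defmodule AdminAudit do
  @moduledoc """
  Hypothetical sketch of audit capture for a sensitive admin action.
  """
  require Logger

  @event [:decision_graph, :api, :admin, :audit]

  # Builds the documented metadata fields from a request context map.
  def metadata(action, ctx) do
    %{
      api_action: action,
      account_id: ctx.account_id,
      job_id: ctx.job_id,
      request_id: ctx.request_id,
      tenant_id: ctx.tenant_id
    }
  end

  def record(action, ctx) do
    meta = metadata(action, ctx)

    # Logger accepts per-message metadata as a keyword list.
    Logger.info("admin audit: #{action}", Map.to_list(meta))

    # :telemetry is a separate dependency; this guard keeps the sketch
    # runnable as a plain script when the library is not available.
    if Code.ensure_loaded?(:telemetry) do
      apply(:telemetry, :execute, [@event, %{count: 1}, meta])
    end

    meta
  end
end
```

Building the metadata once and passing the same map to both sinks keeps the logger entry and the telemetry event consistent for a given request_id.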
Rate Limiting
The API uses simple fixed-window ETS rate limiting keyed by:
- API scope
- service-account ID
- tenant ID
- current minute window
Configured buckets:
- read
- write
- admin
This is intentionally basic Phase 5 protection. It is enough to prevent accidental bursts and low-effort abuse, but it is not a replacement for upstream gateway controls.
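For readers unfamiliar with the technique, a fixed-window counter in ETS can be sketched in a few lines. The module name, table name, and per-minute limits below are placeholders rather than the service's configured values; only the key structure (scope, account, tenant, minute window) follows the description above.

```elixir
defmodule FixedWindowLimiter do
  @moduledoc """
  Hypothetical sketch of fixed-window rate limiting backed by ETS.
  """

  @table :dg_rate_limit_sketch

  # Per-minute limits for each bucket; the numbers are placeholders.
  @limits %{read: 600, write: 120, admin: 10}

  # Creates the named table; call once before check/4.
  def init do
    :ets.new(@table, [:set, :public, :named_table])
    :ok
  end

  # now is Unix time in seconds; it is a parameter so the sketch is testable.
  def check(scope, account_id, tenant_id, now \\ System.system_time(:second)) do
    window = div(now, 60)
    key = {scope, account_id, tenant_id, window}

    # Atomically increment the counter, inserting {key, 0} first if absent.
    count = :ets.update_counter(@table, key, {2, 1}, {key, 0})

    if count <= Map.fetch!(@limits, scope), do: :ok, else: {:error, :rate_limited}
  end
end
```

Because the window is part of the key, counters reset implicitly when the minute rolls over. This sketch never deletes old keys, so a real implementation would also need periodic cleanup. A caller can also spend a full quota at the end of one window and another at the start of the next, which is the usual trade-off of fixed windows.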
Threat Notes
This Phase 5 boundary does not yet provide:
- token rotation workflows
- signed request bodies
- IP allowlists
- per-endpoint quota policies
- durable audit exports
- external identity provider integration
Those remain later hardening work. For now, shared environments should treat the BEAM service as an authenticated internal platform service rather than an internet-exposed public API.