ADR-013: NATS auth model for heddle.contrib.events
Status: Accepted (Sprint 3, 2026-05-19).
Pairs with: runbook: nats-acl-configuration
(concrete config); heddle-contrib-events-m2-architecture-v7.md
§4.5 (publish-ACL backstop on *.InternalFinalized).
Context
Sprint 3 adds the structural belt-and-braces defence for
framework-internal events: a NATS publish ACL that prevents
non-framework callers from publishing *.InternalFinalized events.
The application-layer defence (Aggregate.apply() provenance check
raising CorruptAggregateAlert) is unchanged.
Implementing the publish ACL forces a question that v7 left open: how is heddle's deployment authenticated against NATS? Two realistic shapes:
- Multi-account. One NATS account per logical boundary (framework / application / observer / workshop). Cross-account flow uses NATS account exports/imports. Strong isolation, at the cost of one extra message hop per cross-boundary publish.
- Multi-user, single-account. One NATS account; multiple users inside it with different publish/subscribe permission blocks. Permission boundary is per-user; no message-fabric isolation.
The decision shapes operator runbooks, the Sprint 3 ACL config, and any future commercial deployment.
Decision
Adopt multi-user, single-account. Heddle ships with four predefined user roles inside one NATS account:
- framework: P2/P3 framework projectors, framework command handlers. Publishes everything heddle owns; subscribes to all.
- application: domain command issuers and regular application code. Publishes events except *.InternalFinalized; publishes commands and dedup announcements; subscribes to all.
- observer: PF observers and similar ingest paths (Sprint 4a). Same publish posture as application minus dedup, but narrower subscribe (events only).
- workshop: the Workshop UI and CLI. Publishes commands only; subscribes to all for live-view rendering.
The concrete NATS server config and verification commands live in
the nats-acl-configuration runbook.
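The four-role publish posture above can be modelled in a few lines of Python. This is a minimal sketch for reasoning about the ACL, not the shipped configuration: it assumes NATS-style subject matching (`*` matches one token, `>` matches one or more trailing tokens) and deny-over-allow precedence. Apart from `*.InternalFinalized`, every subject pattern here (`*.*` for events, `cmd.>` for commands, `dedup.>` for dedup announcements) is a hypothetical naming scheme chosen for illustration, as are the names `subject_matches`, `PublishPermissions` and `ROLES`.

```python
from dataclasses import dataclass


def subject_matches(pattern: str, subject: str) -> bool:
    """NATS-style match: '*' is exactly one token, '>' is one or more trailing tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and must consume at least one token.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


@dataclass(frozen=True)
class PublishPermissions:
    allow: tuple[str, ...]
    deny: tuple[str, ...] = ()

    def permits(self, subject: str) -> bool:
        # A matching deny entry wins over any allow entry.
        if any(subject_matches(p, subject) for p in self.deny):
            return False
        return any(subject_matches(p, subject) for p in self.allow)


# Hypothetical subject layout: "<Aggregate>.<Event>" for events,
# "cmd.>" for commands, "dedup.>" for dedup announcements.
ROLES = {
    "framework": PublishPermissions(allow=(">",)),
    "application": PublishPermissions(
        allow=("*.*", "cmd.>", "dedup.>"),
        deny=("*.InternalFinalized",),
    ),
    "observer": PublishPermissions(
        allow=("*.*", "cmd.>"),
        deny=("*.InternalFinalized", "dedup.>"),
    ),
    "workshop": PublishPermissions(allow=("cmd.>",)),
}
```

For example, `ROLES["application"].permits("Order.InternalFinalized")` is false while the same call on `ROLES["framework"]` is true, which is the whole point of the backstop. Subscribe permissions are left out of the sketch; the runbook remains the authority for the real config.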
Alternatives considered
Multi-account (rejected for M2)
One NATS account per role; cross-account flow via exports/imports.
- Rejected because the trust boundary in heddle's deployment is between components within one team's deployment, not between tenants. Multi-account adds operational complexity (managing the export/import graph, debugging routing across accounts) for an isolation property heddle does not need at this scale.
- Per-message latency: each cross-account publish takes an extra hop through the export/import routing. Single-digit microseconds in practice, but it's non-zero, and the framework pays it on every projector→command-handler round-trip. With multi-user-single-account the message stays on the same account and skips the hop.
- The benefit of multi-account (strong isolation between unrelated workloads sharing a NATS cluster) applies when heddle is one of several tenants on shared infrastructure. The current target deployment (Naimor SMB on-prem) is single-tenant.
Single user, ACL-less (rejected)
One NATS user; trust the application code not to publish
*.InternalFinalized from non-framework call sites.
- Rejected because it leaves the publish-ACL backstop unwired. v7 §4.5 explicitly requires belt-and-braces: application-layer provenance check plus structural defence. The ACL is the structural half. Removing it means the only defence against a forged InternalFinalized is whatever the receiving aggregate's apply() notices. That check is still enforced, but the cost of a misfire (recovery via the §4.12 runbook) is high enough that structural prevention is worth the modest operator-config work.
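To make the application-layer half of the belt-and-braces concrete, here is a minimal sketch of a provenance check in the spirit of Aggregate.apply() raising CorruptAggregateAlert. How heddle actually establishes provenance is not described in this ADR, so the `origin` field on the event, the `Event` class, and the `applied` list are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


class CorruptAggregateAlert(Exception):
    """Raised when an internal event arrives without framework provenance."""


@dataclass(frozen=True)
class Event:
    subject: str  # e.g. "Order.InternalFinalized"
    origin: str   # hypothetical provenance marker, e.g. "framework"


@dataclass
class Aggregate:
    applied: list = field(default_factory=list)

    def apply(self, event: Event) -> None:
        # Framework-internal events are only accepted from the framework itself.
        is_internal = event.subject.endswith(".InternalFinalized")
        if is_internal and event.origin != "framework":
            raise CorruptAggregateAlert(
                f"{event.subject} published by non-framework origin {event.origin!r}"
            )
        self.applied.append(event.subject)
```

With the publish ACL in place, a forged event should be rejected at the broker and never reach this check; without it, this check is the only line of defence, and tripping it means running the recovery runbook.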
Per-worker user (rejected)
One NATS user per heddle component (one for each projector, one for each application worker, etc.).
- Rejected because the granularity is theatre. The publish ACL needs to distinguish "framework-finalises" from "everyone else"; finer-grained users add credential management overhead without adding security properties. The four-role shape is the natural cut.
Consequences
Enables:
- The publish ACL on *.InternalFinalized is operationally workable from a single config file. Operators don't need to understand NATS account exports/imports to deploy heddle.
- Future commercial multi-tenant deployment can wrap each tenant in its own NATS account with the same four-role shape inside. The migration is config-only: application code doesn't change.
- The Workshop and CLI get a deliberately narrow surface (the workshop user only publishes commands), which the runbook encodes as an operator constraint, not a heddle-internal assumption.
Costs:
- Heddle must document the four-user shape as part of its deployment surface. New operators read one runbook (nats-acl-configuration) to understand what credentials their deployment needs.
- The single-account isolation property is weaker than multi-account. A compromise of the framework credential lets an attacker fabricate any heddle subject. The application-layer provenance check still catches forged InternalFinalized events; rotation policy is the operator's responsibility.
- A future multi-tenant migration is non-zero work, but it's config-only and bounded: the worst case is "split the one account into N accounts and add exports/imports for any flow that crosses tenant boundaries." No application code change.
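The "same four-role shape inside each tenant account" idea can be sketched as a small generator. The output mirrors the general shape of a NATS server `accounts` block (account name, then a `users` list with per-user `permissions`), but it is an illustration rather than a deployable config: the function name `accounts_for_tenants`, the `<tenant>-<role>` user naming, and every subject pattern other than `*.InternalFinalized` are hypothetical. Credentials, subscribe permissions, and any exports/imports are deliberately omitted.

```python
import copy

# Hypothetical publish permissions per role; only "*.InternalFinalized"
# comes from the ADR, the other patterns are placeholders.
ROLE_PUBLISH = {
    "framework": {"allow": [">"]},
    "application": {
        "allow": ["*.*", "cmd.>", "dedup.>"],
        "deny": ["*.InternalFinalized"],
    },
    "observer": {
        "allow": ["*.*", "cmd.>"],
        "deny": ["*.InternalFinalized", "dedup.>"],
    },
    "workshop": {"allow": ["cmd.>"]},
}


def accounts_for_tenants(tenants):
    """Build one account per tenant, each holding the same four users."""
    accounts = {}
    for tenant in tenants:
        users = [
            {
                "user": f"{tenant}-{role}",
                # Deep copy so tenants never share mutable permission lists.
                "permissions": {"publish": copy.deepcopy(perms)},
            }
            for role, perms in ROLE_PUBLISH.items()
        ]
        accounts[tenant] = {"users": users}
    return {"accounts": accounts}
```

Because the role table is the only input, moving from one account to N changes the generated structure but nothing about how application code publishes, which matches the "config-only" claim above.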
Out of scope
- JWT vs static-password auth: orthogonal. The four-role shape works with either.
- NATS leaf-node deployment: orthogonal. The ACL config travels with the account; leaf vs hub is an operational topology concern.
- Audit logging of publish-permission rejections: a future follow-up if CorruptAggregateAlert events start showing up in production. Today, the ACL silently rejects and the application-layer alert is the visible signal.