The event envelope
Every event consumer shares one canonical envelope from @voyant-travel/core/events:
name, data, metadata, and emittedAt. Metadata carries category, source (workflow, service, route, subscriber, system), and correlation or causation identifiers when useful. Use the shared envelope rather than inventing package-local event shapes.
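As a rough illustration of the envelope fields listed above, the shape could be sketched as follows. The type names, the optional correlationId and causationId properties, the ISO-string timestamp, and the makeEnvelope helper are all assumptions made for this example; the real exports of @voyant-travel/core/events may differ.

```typescript
// Hypothetical shapes: field names follow the prose above, but the actual
// types exported by @voyant-travel/core/events may look different.
type EventCategory = "domain" | "internal";
type EventSource = "workflow" | "service" | "route" | "subscriber" | "system";

interface EventMetadata {
  category: EventCategory;
  source: EventSource;
  correlationId?: string; // assumed name: ties together events from one request or run
  causationId?: string;   // assumed name: the event or action that directly caused this one
}

interface EventEnvelope<TData> {
  name: string;
  data: TData;
  metadata: EventMetadata;
  emittedAt: string; // assumed representation: ISO 8601 timestamp
}

// Illustrative helper (not a framework function) that stamps emittedAt.
function makeEnvelope<TData>(
  name: string,
  data: TData,
  metadata: EventMetadata,
  now: Date = new Date(),
): EventEnvelope<TData> {
  return { name, data, metadata, emittedAt: now.toISOString() };
}

const settledEnvelope = makeEnvelope(
  "invoice.settled",
  { invoiceId: "inv_123" },
  { category: "domain", source: "service", correlationId: "req_456" },
  new Date("2024-01-01T00:00:00.000Z"),
);
```

Keeping category and source inside metadata, rather than in data, lets a consumer route or filter events without knowing the payload type.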
Domain events versus internal events
The category field is chosen on purpose, because not every event has the same audience.
- domain: a business milestone other modules or integrations may reasonably care about. For example invoice.settled, booking.documents.sent, product.created.
- internal: a process signal useful to subscribers, diagnostics, or automation, but not part of the core business language. For example invoice.document.generated, contract.document.generated.
The event bus is fire-and-forget
The default EventBus is in-process, and its semantics are explicit:
- Handlers run sequentially.
- Subscriber errors are caught and logged.
- Subscribers do not affect the emitter’s outcome.
- Emission does not imply durable delivery or retry.
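The four semantics above can be made concrete with a toy bus. This is a minimal sketch, not the framework's EventBus: ToyEventBus, its methods, and the loggedErrors array are hypothetical, and handlers are synchronous here only to keep the example short.

```typescript
// Toy stand-in for illustration only; NOT the real EventBus implementation.
type ToyHandler = (event: { name: string; data: unknown }) => void;

class ToyEventBus {
  private handlers = new Map<string, ToyHandler[]>();
  readonly loggedErrors: string[] = []; // stands in for a real logger

  subscribe(name: string, handler: ToyHandler): void {
    const list = this.handlers.get(name) ?? [];
    list.push(handler);
    this.handlers.set(name, list);
  }

  // Runs handlers one after another in subscription order. A throwing handler
  // is caught and logged, so it never reaches the emitter and never stops
  // later handlers. Nothing is persisted or retried.
  emit(name: string, data: unknown): void {
    for (const handler of this.handlers.get(name) ?? []) {
      try {
        handler({ name, data });
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        this.loggedErrors.push(`${name}: ${message}`);
      }
    }
  }
}

const toyBus = new ToyEventBus();
const handlerCalls: string[] = [];
toyBus.subscribe("product.created", () => { handlerCalls.push("first"); });
toyBus.subscribe("product.created", () => { throw new Error("sync failed"); });
toyBus.subscribe("product.created", () => { handlerCalls.push("third"); });

// The emitter sees no exception even though the second handler failed.
toyBus.emit("product.created", { productId: "prod_1" });
```

After the emit call, the first and third handlers have both run and the failure exists only as a log entry, which is exactly why correctness-critical work should not live in a subscriber.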
Two rules that keep events safe
Emit after the durable state change
An event describes a fact that is already true in durable storage. The pattern is always: write first, then announce. financeDocumentService creates the invoice rendition before emitting the internal invoice.document.generated; financeSettlementService writes the payment and updates invoice state before emitting the domain invoice.settled; the booking-documents service persists the delivery row before emitting booking.documents.sent. The event is a signal to observers, never the mechanism that makes the thing true.
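The write-first, announce-second ordering might look like the sketch below. The in-memory invoiceTable and emittedEvents array are hypothetical stand-ins for the database and the bus, and settleInvoice is an illustrative function, not the actual financeSettlementService.

```typescript
// Hypothetical in-memory stand-ins for durable storage and the event bus.
interface Invoice {
  id: string;
  status: "open" | "settled";
}

const invoiceTable = new Map<string, Invoice>([
  ["inv_1", { id: "inv_1", status: "open" }],
]);
const emittedEvents: { name: string; invoiceId: string }[] = [];

function settleInvoice(invoiceId: string): void {
  const invoice = invoiceTable.get(invoiceId);
  if (!invoice) {
    // Failing before the write means nothing is announced either.
    throw new Error(`unknown invoice ${invoiceId}`);
  }

  // 1. Durable state change first (a real service would commit a transaction).
  invoiceTable.set(invoiceId, { ...invoice, status: "settled" });

  // 2. Only then announce the fact that is now true.
  emittedEvents.push({ name: "invoice.settled", invoiceId });
}

settleInvoice("inv_1");

let missingInvoiceRejected = false;
try {
  settleInvoice("inv_missing");
} catch {
  missingInvoiceRejected = true; // no write happened, so no event was emitted
}
```

Because the emit sits after the write, an observer that receives invoice.settled can always read the settled state back from storage.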
Subscribers are observers, not the correctness boundary
Subscribers are a good fit for secondary reactions: notifications, follow-up sync, cache invalidation, read-model refresh, diagnostics. They are a poor fit for anything that must succeed before the caller can treat the main operation as complete. The CMS sync plugins are the canonical example: payloadCmsPlugin and sanityCmsPlugin subscribe to product.created / updated / deleted and catch-and-log their own failures. A failed content sync is an operational issue, not a reason to invalidate the core product write.
If a side effect is part of the correctness boundary, do not hide it in a fire-and-forget subscriber. Move it to a durable path.
When to use durable execution instead
The moment a side effect needs retries, durable execution, delayed execution, explicit job identity or idempotency, or queue-backed isolation from the request path, it leaves the event bus. Voyant already has the right boundary for this:
- @voyant-travel/core/orchestration exposes the JobRunner for durable background jobs.
- @voyant-travel/core/workflows and the Workflows SDK provide durable, step-based orchestration with retries, sleeps, and resumability.
A workflow can still emit events (with source: "workflow"), and a subscriber can kick off follow-up reactions, but the durable, retryable part of the work lives inside the workflow’s steps, not inside a subscriber.
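To show what explicit job identity and retries buy over a fire-and-forget subscriber, here is a deliberately small in-memory sketch. ToyJobQueue, ToyJob, enqueue, and drain are hypothetical names invented for this example; they are not the JobRunner or Workflows SDK API, and a real runner would persist jobs and attempts rather than hold them in memory.

```typescript
// Toy stand-in, NOT the JobRunner API: it only illustrates identity and retries.
interface ToyJob {
  jobId: string;   // explicit identity, doubling as the idempotency key
  run: () => void; // throws to signal a retryable failure
}

class ToyJobQueue {
  private pending = new Map<string, ToyJob>();
  readonly completed: string[] = [];
  readonly failed: string[] = [];

  // Enqueueing a jobId that is already pending or completed is a no-op.
  enqueue(job: ToyJob): boolean {
    if (this.pending.has(job.jobId) || this.completed.includes(job.jobId)) {
      return false;
    }
    this.pending.set(job.jobId, job);
    return true;
  }

  // Runs every pending job, retrying each up to maxAttempts times.
  drain(maxAttempts: number): void {
    for (const [jobId, job] of Array.from(this.pending.entries())) {
      let done = false;
      for (let attempt = 1; attempt <= maxAttempts && !done; attempt++) {
        try {
          job.run();
          done = true;
        } catch {
          // A real runner would record the attempt and back off here.
        }
      }
      (done ? this.completed : this.failed).push(jobId);
      this.pending.delete(jobId);
    }
  }
}

let receiptAttempts = 0;
const toyQueue = new ToyJobQueue();
const sendReceipt: ToyJob = {
  jobId: "send-receipt:inv_1",
  run: () => {
    receiptAttempts += 1;
    if (receiptAttempts < 3) throw new Error("mail provider unavailable");
  },
};

const firstEnqueue = toyQueue.enqueue(sendReceipt);
const secondEnqueue = toyQueue.enqueue(sendReceipt); // duplicate, ignored
toyQueue.drain(5);
```

The job fails twice, succeeds on the third attempt, and the duplicate enqueue never runs. None of that is available to a subscriber whose error is simply caught and logged.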
The action ledger
Events answer “what happened” for integration and reaction. They are explicitly not an audit trail. When operators need to answer “who did what, why, under which authority, and can we undo or compensate it,” that is the action ledger, a cross-module, actor-centered record of important actions. Voyant keeps four histories separate, and the ledger is the fourth:
Domain state
The real business tables (bookings, invoices, payments). The source of truth.
Domain events
Business and process facts emitted after durable changes. For reaction, not audit.
Workflow journal
Execution history of a durable orchestration: which steps ran, retried, or compensated.
Action ledger
Who initiated an action, which authority allowed it, what changed, and whether it can be reversed.
Roles: attribution and authority
A ledger entry records the principal and the authority, not just the operation. The shared spine carries the smallest stable facts: the action name and kind, status, evaluated risk, the actor type (staff, customer, partner, supplier), the principal type and id (user, API key, agent, workflow, system), the session or API-token id, delegation, the route or tool name, the workflow run and step, correlation and causation ids, idempotency fields, the target, the checked capability, and the authorization source.
This maps directly from the request context you already have: userId, apiTokenId, sessionId, callerType, internal-request markers, and any delegation chain. A central principle runs through it: AI agents get no special trust. An agent is an ordinary principal with principal_type = "agent", explicit delegated authority, bounded capabilities, and a mandatory ledger record for every sensitive read or mutation. It never inherits a staff session implicitly.
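The mapping from request context to ledger principal, including the no-special-trust rule for agents, could be sketched like this. Every type and field name below (RequestContext, LedgerPrincipal, agentId, delegatedBy, toLedgerPrincipal) is a hypothetical choice for illustration; only the concepts come from the text above.

```typescript
// Hypothetical shapes: the real request context and ledger spine may differ.
type ActorType = "staff" | "customer" | "partner" | "supplier";
type PrincipalType = "user" | "api_key" | "agent" | "workflow" | "system";

interface RequestContext {
  callerType: PrincipalType;
  userId?: string;
  apiTokenId?: string;
  sessionId?: string;
  agentId?: string;     // assumed field for agent callers
  delegatedBy?: string; // assumed field: principal that delegated authority
}

interface LedgerPrincipal {
  actor_type: ActorType;
  principal_type: PrincipalType;
  principal_id: string;
  session_id?: string;
  api_token_id?: string;
  delegated_by?: string;
}

function toLedgerPrincipal(ctx: RequestContext, actorType: ActorType): LedgerPrincipal {
  if (ctx.callerType === "agent") {
    if (!ctx.agentId || !ctx.delegatedBy) {
      throw new Error("agent calls require an agent id and explicit delegation");
    }
    // Deliberately do not copy ctx.sessionId: an agent never inherits a
    // staff session, it only carries its own delegated authority.
    return {
      actor_type: actorType,
      principal_type: "agent",
      principal_id: ctx.agentId,
      delegated_by: ctx.delegatedBy,
    };
  }
  return {
    actor_type: actorType,
    principal_type: ctx.callerType,
    principal_id: ctx.userId ?? ctx.apiTokenId ?? "system",
    session_id: ctx.sessionId,
    api_token_id: ctx.apiTokenId,
  };
}

const agentPrincipal = toLedgerPrincipal(
  { callerType: "agent", agentId: "agent_7", delegatedBy: "user_1", sessionId: "sess_staff" },
  "staff",
);

let undelegatedAgentRejected = false;
try {
  toLedgerPrincipal({ callerType: "agent", agentId: "agent_7" }, "staff");
} catch {
  undelegatedAgentRejected = true;
}
```

Even though the agent call arrived with a staff session id in its context, the resulting principal carries only the agent id and the delegation, and an agent without delegation is rejected outright.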
Reversibility
Reversal is a domain-level concept, never database rollback. Each ledgered mutation declares how it can be undone:
- Revert when the old state can safely be restored (a catalog overlay value from overlay history, a draft revision rolled back).
- Compensate when the original action had external side effects (cancel an upstream hold, void an unpaid invoice, issue a credit note, refund through the domain cancellation flow).
- Irreversible when the action is historical truth (a delivered email, an issued signature, an external capture past the settlement window).
The ledger records reversal_state, reversal_outcome, and the links between an action and the action that reversed it. Partial compensation is normal (a paid cancellation might refund 50% per policy). Ledger entries are append-only, so corrections, reversals, and approvals create new linked entries rather than rewriting history. When something cannot be undone, the operator UI says so and offers follow-up actions rather than pretending there is an undo button.
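The append-only rule and the three reversal modes can be illustrated with a small in-memory ledger. This is a sketch under assumptions: ToyLedgerEntry, the reverses link, the "reverse:" action prefix, and the decision to treat a reversal entry as itself irreversible are all invented for the example.

```typescript
// Hypothetical entry shape; only the concepts come from the text above.
type ReversalMode = "revert" | "compensate" | "irreversible";

interface ToyLedgerEntry {
  id: string;
  action: string;
  reversal: ReversalMode; // how this entry could be undone
  reverses?: string;      // id of the entry this one reverses, if any
  outcome?: string;       // free-text outcome, for example a partial refund
}

const toyLedger: ToyLedgerEntry[] = [];

// Entries are frozen on append: history is never rewritten in place.
function appendEntry(entry: ToyLedgerEntry): void {
  toyLedger.push(Object.freeze({ ...entry }));
}

// A reversal is a NEW entry linked to its target, never an update.
function reverseEntry(targetId: string, newId: string, outcome: string): void {
  const target = toyLedger.find((e) => e.id === targetId);
  if (!target) throw new Error(`unknown ledger entry ${targetId}`);
  if (target.reversal === "irreversible") {
    throw new Error(`${target.action} cannot be undone; offer a follow-up action`);
  }
  appendEntry({
    id: newId,
    action: `reverse:${target.action}`,
    reversal: "irreversible", // assumption of this sketch: reversals are final
    reverses: targetId,
    outcome,
  });
}

appendEntry({ id: "act_1", action: "booking.cancel", reversal: "compensate" });
reverseEntry("act_1", "act_2", "refunded 50% per policy");

appendEntry({ id: "act_3", action: "email.deliver", reversal: "irreversible" });
let irreversibleRejected = false;
try {
  reverseEntry("act_3", "act_4", "attempted undo");
} catch {
  irreversibleRejected = true;
}
```

The original act_1 entry is untouched after compensation; the partial refund lives on the linked act_2 entry, and the attempt to undo the delivered email fails loudly instead of silently pretending to succeed.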
Consistency model
The write path depends on the action’s evaluated risk. High-risk and critical mutations write the ledger entry in the same transaction as the domain mutation: if the action cannot be durably recorded, it is not committed. Sensitive reads (PII reveals, credential access, private documents, agent retrieval contexts) usually have no mutation to piggyback on, so they use a standalone synchronous ledger write, and the response withholds the sensitive value until the entry is durable. Low-risk logging can be best-effort, but only when the capability’s policy explicitly allows loss.
The action ledger is a planning reference and a phased build, not a single shipped table. The durable spine plus committed profile details and payload references is the audit truth; a relay or outbox decouples heavier enrichment and export from the write path without becoming the source of truth.
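The write-path rules in this section reduce to a small decision function. The sketch below is illustrative only: the type names, the three risk levels, and chooseLedgerWritePath are hypothetical, and the fallback for low-risk mutations whose policy does not allow loss is an assumption, since the text does not specify it.

```typescript
// Hypothetical names; only the three write paths come from the text above.
type Risk = "low" | "high" | "critical";
type ActionKind = "mutation" | "sensitive_read";
type WritePath = "same_transaction" | "standalone_sync" | "best_effort";

function chooseLedgerWritePath(
  kind: ActionKind,
  risk: Risk,
  policyAllowsLoss: boolean,
): WritePath {
  // Sensitive reads have no domain transaction to join, so the ledger write
  // is synchronous and the value is withheld until the entry is durable.
  if (kind === "sensitive_read") return "standalone_sync";

  // High-risk and critical mutations commit together with the ledger entry.
  if (risk === "high" || risk === "critical") return "same_transaction";

  // Low risk: best effort only when the capability policy explicitly allows
  // loss. Falling back to the transactional path is this sketch's assumption.
  return policyAllowsLoss ? "best_effort" : "same_transaction";
}

const refundPath = chooseLedgerWritePath("mutation", "critical", false);
const piiRevealPath = chooseLedgerWritePath("sensitive_read", "high", false);
const viewLogPath = chooseLedgerWritePath("mutation", "low", true);
```

Note that policyAllowsLoss is ignored for everything except low-risk mutations, so a permissive policy cannot downgrade a high-risk write.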
Review heuristics
When you add or review event-related behavior:
Domain or internal?
Choose the event category so consumers know whether it is a business fact or a process signal.
Can subscriber failure be tolerated?
If not, the work is not a subscriber. Move it to a job or workflow.
Does it need retries or scheduling?
Durable, retryable, or delayed side effects belong on JobRunner or a workflow.
Next steps
Workflows
Durable, step-based orchestration for retryable background work.
Services
Where events are emitted, after the durable write.
Auth and identity
The actor and principal context the action ledger records.
Glossary
The shared travel vocabulary behind event names.