Event Sourcing Guide — Build Reliable Event-Driven Systems

May 28, 2026 · 16 min read · By Michael Lip

Event sourcing is an architectural pattern where the state of a system is derived entirely from a sequence of immutable events rather than stored as a mutable record. Instead of writing UPDATE orders SET status = 'paid' WHERE id = 123, you append an event {"type":"PaymentReceived","orderId":123,"amount":4900} to an event log. The current state is computed by replaying all events from the beginning of time. This approach gives you a complete, auditable history of every state transition, the ability to reconstruct state at any point in the past, and a natural integration point for event-driven webhooks and message queues.

The simulator below lets you build an event store interactively. Select an event type and data, append it to the log, and watch the projected state update in real time. Use the replay slider or step controls to travel backwards through history and see the state at any previous point. This mirrors how event sourcing works in production: the state panel reflects which events have been applied, not a row stored in a database.

[Interactive widget: Event Store Simulator, Order Aggregate. It shows the event log alongside the projected state at the selected sequence number.]

What is Event Sourcing?

In a traditional CRUD system, the database stores the current state of each entity. When state changes, you overwrite the previous state with the new state. The history of how you got to the current state is lost unless you implement separate audit logging. Event sourcing inverts this relationship: the event log is the source of truth, and current state is a derived view computed by replaying the log from the beginning.

Consider an order management system. In a CRUD approach, the orders table has a row per order with columns for status, total, and customer. When the order is paid, you run UPDATE orders SET status = 'paid', paid_at = NOW(). The previous status (pending) is gone. You cannot know when it transitioned, why, or what other fields changed simultaneously without an external audit trail. In an event sourcing approach, you append a PaymentReceived event with the amount, timestamp, and payment method. The "paid" status is a derived fact: the order is paid because a PaymentReceived event exists in its log. The full payment history, including failed attempts and retries, is preserved as immutable events.
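As a rough illustration of the contrast above, the following Python sketch derives an order's status from its log instead of reading a stored column. The event names OrderCreated and PaymentFailed, the payload fields, and the derive_status helper are assumptions made for this example, not part of any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    type: str
    data: dict


def derive_status(events):
    """Compute the status from the log; nothing is read from a status column."""
    status = "unknown"
    for event in events:
        if event.type == "OrderCreated":
            status = "pending"
        elif event.type == "PaymentReceived":
            status = "paid"
        # PaymentFailed leaves the status unchanged but stays in the history.
    return status


log = [
    Event("OrderCreated", {"orderId": 123, "total": 4900}),
    Event("PaymentFailed", {"orderId": 123, "reason": "card_declined"}),
    Event("PaymentReceived", {"orderId": 123, "amount": 4900, "method": "card"}),
]

status = derive_status(log)
failed_attempts = [e for e in log if e.type == "PaymentFailed"]
```

The failed attempt remains queryable after the order is paid, which is the information an in-place UPDATE would have discarded.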

Core Concepts: Event Log, Aggregate, and Projection

The event log (or event store) is an append-only sequence of events. Events are immutable once written. Each event has a sequence number, a timestamp, an aggregate ID (the entity it belongs to), and a payload describing the state change. Writing to the event store is always an append — you never update or delete an event.
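A minimal in-memory sketch of such a store might look like the following. The class and field names (EventStore, StoredEvent, event_id) are hypothetical, and a real store would persist events durably rather than hold them in a dict.

```python
import time
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)          # frozen: events cannot be mutated after creation
class StoredEvent:
    event_id: str
    aggregate_id: str
    sequence: int
    timestamp: float
    type: str
    payload: dict


class EventStore:
    """Append-only store: there is deliberately no update or delete method."""

    def __init__(self):
        self._streams = {}       # aggregate_id -> list of StoredEvent

    def append(self, aggregate_id, type, payload):
        stream = self._streams.setdefault(aggregate_id, [])
        event = StoredEvent(
            event_id=str(uuid.uuid4()),
            aggregate_id=aggregate_id,
            sequence=len(stream) + 1,   # per-aggregate sequence number
            timestamp=time.time(),
            type=type,
            payload=payload,
        )
        stream.append(event)
        return event

    def read(self, aggregate_id):
        # Return a tuple so callers cannot modify the stored stream.
        return tuple(self._streams.get(aggregate_id, []))


store = EventStore()
created = store.append("order-123", "OrderCreated", {"total": 4900})
paid = store.append("order-123", "PaymentReceived", {"amount": 4900})
```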

An aggregate is the consistency boundary for a group of related events. In the simulator above, the aggregate is a single order identified by an aggregate ID. All events in the aggregate's stream are causally related: the order progresses from Created to Confirmed to Shipped to Delivered. The aggregate enforces business rules about which state transitions are valid. You cannot append a PaymentReceived event to an order that has already been canceled (one whose stream contains an OrderCanceled event): the aggregate's current state, derived by replaying its event stream, determines whether a new event is valid.
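The rule about payments on canceled orders could be sketched as follows. OrderAggregate, InvalidTransition, and receive_payment are illustrative names, and events are simplified to (type, payload) tuples; this is one possible shape, not a prescribed design.

```python
class InvalidTransition(Exception):
    """Raised when a command would produce an event the current state forbids."""


class OrderAggregate:
    def __init__(self, history=()):
        self.status = None       # None means the order does not exist yet
        self.new_events = []     # events produced by commands, not yet persisted
        for event_type, payload in history:
            self._apply(event_type, payload)

    def _apply(self, event_type, payload):
        # State changes only here, so replay and new commands share one code path.
        if event_type == "OrderCreated":
            self.status = "created"
        elif event_type == "OrderCanceled":
            self.status = "canceled"
        # Other event types leave the status unchanged in this sketch.

    def _record(self, event_type, payload):
        self._apply(event_type, payload)
        self.new_events.append((event_type, payload))

    def receive_payment(self, amount):
        if self.status is None:
            raise InvalidTransition("order does not exist")
        if self.status == "canceled":
            raise InvalidTransition("cannot accept payment for a canceled order")
        self._record("PaymentReceived", {"amount": amount})


open_order = OrderAggregate([("OrderCreated", {"total": 4900})])
open_order.receive_payment(4900)     # accepted and recorded in new_events

canceled = OrderAggregate([("OrderCreated", {"total": 4900}), ("OrderCanceled", {})])
try:
    canceled.receive_payment(4900)
    rejected = False
except InvalidTransition:
    rejected = True
```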

A projection (or read model) is derived state computed by applying events in sequence. The simplest projection applies every event in order and updates an in-memory state object using an event handler function. The result is the current state of the aggregate. The simulator above shows this in real time: each time you move the replay slider, the projection function replays all events up to that sequence number and computes the resulting state from scratch. This is the core insight of event sourcing: state is always computable from events; events are the ground truth.
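A projection of this kind is essentially a fold over the event list. The sketch below assumes dict-shaped events and a hypothetical up_to parameter that plays the role of the replay slider position.

```python
from functools import reduce


def apply(state, event):
    """Event handler: return the next state without mutating the previous one."""
    etype, data = event["type"], event["data"]
    if etype == "OrderCreated":
        return {**state, "status": "created", "total": data["total"], "paid": 0}
    if etype == "PaymentReceived":
        return {**state, "paid": state.get("paid", 0) + data["amount"]}
    if etype == "OrderShipped":
        return {**state, "status": "shipped"}
    return state                         # unknown events are ignored


def project(events, up_to=None):
    """Replay from scratch, optionally stopping after the first up_to events."""
    selected = events if up_to is None else events[:up_to]
    return reduce(apply, selected, {})


events = [
    {"type": "OrderCreated", "data": {"total": 4900}},
    {"type": "PaymentReceived", "data": {"amount": 4900}},
    {"type": "OrderShipped", "data": {}},
]

state_at_1 = project(events, up_to=1)    # state after the first event only
state_now = project(events)              # state after the whole log
```

Because apply returns a new dict each time, replaying to an earlier position never disturbs a state computed for a later one.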

Why Event Sourcing Works Well with Webhooks

Webhooks and event sourcing are natural complements. Webhooks deliver the same kind of information that event sourcing stores internally: discrete, immutable records of things that happened. When Stripe fires a payment_intent.succeeded webhook, that is a domain event in Stripe's event store. Your system receives it and can either process it immediately or append it to your own event store as a received external event. This creates a fully traceable audit trail where every state change in your system is attributable to a specific received webhook event.

The idempotency requirement of webhook processing aligns directly with event sourcing semantics. In event sourcing, each event has a globally unique ID. If you receive the same event twice (a common occurrence when webhooks are retried), you check whether an event with that ID already exists in your event store before appending. If it exists, you discard the duplicate. This deduplication is structurally enforced by the event store rather than requiring ad hoc idempotency logic in every handler. The event ID from the webhook provider becomes the event ID in your event store.
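The duplicate check described above can be sketched with a set of seen IDs. The class name, the append_once method, and the evt_001 identifier are made up for illustration; a persistent store would typically enforce the same rule with a unique index on the event ID.

```python
class DedupingEventStore:
    def __init__(self):
        self.events = []
        self._seen_ids = set()

    def append_once(self, event_id, type, payload):
        """Append the event unless its ID is already present.

        Returns True when the event was stored, False for a duplicate.
        """
        if event_id in self._seen_ids:
            return False                 # retried delivery: discard
        self._seen_ids.add(event_id)
        self.events.append({"id": event_id, "type": type, "payload": payload})
        return True


store = DedupingEventStore()
first_delivery = store.append_once("evt_001", "PaymentReceived", {"amount": 4900})
retry_delivery = store.append_once("evt_001", "PaymentReceived", {"amount": 4900})
```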

Replaying Events: The Superpower of Event Sourcing

The ability to replay the entire event history and recompute state from scratch is the most powerful operational capability that event sourcing provides. In a traditional database, if a bug corrupts data, you need a database backup from before the bug was introduced and then must manually re-apply every transaction that occurred after the backup. With event sourcing, when the bug is in a projection function, you fix the function and replay all events through the corrected version. The state recomputes correctly from the same immutable event log. If the bug instead caused wrong events to be written, the log itself is inaccurate, and the usual remedy is to append compensating events rather than rewrite history.
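To make the repair path concrete, here is a small sketch in which a projection ignores refunds and is then corrected. The RefundIssued event and both function names are assumptions for the example; the log itself is never modified.

```python
log = [
    {"type": "PaymentReceived", "amount": 4900},
    {"type": "RefundIssued", "amount": 1000},
    {"type": "PaymentReceived", "amount": 2500},
]


def balance_buggy(events):
    # Bug: refunds are silently ignored, so the balance is overstated.
    return sum(e["amount"] for e in events if e["type"] == "PaymentReceived")


def balance_fixed(events):
    total = 0
    for e in events:
        if e["type"] == "PaymentReceived":
            total += e["amount"]
        elif e["type"] == "RefundIssued":
            total -= e["amount"]
    return total


wrong = balance_buggy(log)    # what the read model showed before the fix
right = balance_fixed(log)    # same log, replayed through the corrected function
```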

Replay also enables temporal queries: "What was the state of this order at 3 PM yesterday?" is trivially answered by replaying all events with timestamps before 3 PM yesterday. No time-travel queries, no point-in-time recovery operations, no reconstructing state from change data capture logs — just replay the event log up to the target timestamp.
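A temporal query of this kind might be sketched as below. The example assumes events carry a timezone-aware at field and are stored in time order; the ts helper and the dates are invented for the illustration.

```python
from datetime import datetime, timezone


def ts(hour, minute=0):
    """Helper for the example: a UTC timestamp on one fixed, made-up day."""
    return datetime(2026, 5, 27, hour, minute, tzinfo=timezone.utc)


STATUS_AFTER = {
    "OrderCreated": "created",
    "PaymentReceived": "paid",
    "OrderShipped": "shipped",
}

events = [
    {"at": ts(9), "type": "OrderCreated"},
    {"at": ts(14, 30), "type": "PaymentReceived"},
    {"at": ts(16), "type": "OrderShipped"},
]


def status_at(events, as_of):
    """Replay only the events that happened strictly before as_of."""
    status = None
    for event in events:
        if event["at"] >= as_of:
            break                        # the log is ordered, so we can stop
        status = STATUS_AFTER.get(event["type"], status)
    return status


status_at_3pm = status_at(events, ts(15))
```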

New projections can be built from existing event history without requiring historical data migration. If you need a new read model — for example, a per-customer lifetime value view that aggregates all orders — you create the new projection function and replay all historical events through it. The event history already contains all the data needed to populate the new view, regardless of when you decided to build it.
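The lifetime value example could be sketched like this, assuming OrderCreated events carry a customerId and PaymentReceived events carry an orderId and amount (field names chosen for the illustration).

```python
from collections import defaultdict


def lifetime_value(events):
    """New read model built purely from existing history."""
    customer_of = {}                 # orderId -> customerId
    totals = defaultdict(int)        # customerId -> total paid
    for e in events:
        if e["type"] == "OrderCreated":
            customer_of[e["orderId"]] = e["customerId"]
        elif e["type"] == "PaymentReceived":
            totals[customer_of[e["orderId"]]] += e["amount"]
    return dict(totals)


history = [
    {"type": "OrderCreated", "orderId": 1, "customerId": "c1"},
    {"type": "PaymentReceived", "orderId": 1, "amount": 4900},
    {"type": "OrderCreated", "orderId": 2, "customerId": "c2"},
    {"type": "OrderCreated", "orderId": 3, "customerId": "c1"},
    {"type": "PaymentReceived", "orderId": 3, "amount": 1500},
    {"type": "PaymentReceived", "orderId": 2, "amount": 700},
]

ltv = lifetime_value(history)
```

Nothing about the stored events had to change to support the new view; only the function is new.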

CQRS: Command Query Responsibility Segregation

Event sourcing is commonly paired with CQRS, which separates the write model (handling commands that produce events) from the read model (projections that serve queries). The write side validates commands against the current aggregate state, generates events, and appends them to the event store. The read side subscribes to events, updates materialized views, and serves queries. These two sides can scale independently: the write side is optimized for consistency and validation, while the read side is optimized for query performance and can maintain denormalized, cache-friendly views.

For webhook-driven systems, CQRS means your webhook handler is a command handler: it receives an external event, validates it, transforms it into internal domain events, and appends those to the event store. Downstream projections (materialized views, search indexes, notification queues) are updated asynchronously by subscribing to the event store's output. This decouples the webhook receipt (which must be fast to avoid timeouts) from the processing (which can be slow and retried independently).
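The split between fast receipt and later processing can be sketched with an in-process queue standing in for the event store's subscription feed. The webhook body here is a simplified, hypothetical shape (flat id, type, orderId, and amount fields), not the real payload of any provider, and all function names are assumptions.

```python
from collections import deque

event_store = []          # write model: append-only log
outbox = deque()          # stand-in for the store's subscription feed
order_status_view = {}    # read model served to queries

REQUIRED_FIELDS = {"id", "type", "orderId", "amount"}


def handle_webhook(body):
    """Write side: validate, translate to a domain event, append, return quickly."""
    if not REQUIRED_FIELDS.issubset(body):
        return False                          # malformed: nothing is appended
    if body["type"] != "payment_intent.succeeded":
        return False                          # not handled in this sketch
    event = {
        "type": "PaymentReceived",
        "orderId": body["orderId"],
        "amount": body["amount"],
        "sourceEventId": body["id"],          # keeps the link to the webhook
    }
    event_store.append(event)
    outbox.append(event)
    return True


def run_projections():
    """Read side: drain pending events and update the view; runs separately."""
    while outbox:
        event = outbox.popleft()
        if event["type"] == "PaymentReceived":
            order_status_view[event["orderId"]] = "paid"


accepted = handle_webhook(
    {"id": "evt_001", "type": "payment_intent.succeeded", "orderId": 123, "amount": 4900}
)
view_before = dict(order_status_view)   # the view has not caught up yet
run_projections()
```

The gap between view_before and the final view is the eventual consistency that this arrangement accepts in exchange for a fast webhook response.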

Event Versioning and Schema Evolution

Because events are immutable and stored permanently, schema changes are more complex than in CRUD systems. When you change the structure of a PaymentReceived event, all existing events in the store still have the old schema. The two standard approaches are upcasting (transforming old event formats to the current format at read time) and copy-and-transform (writing a new event store whose events have been transformed to the new schema). A common split is to use upcasting for small, backward-compatible changes and to reserve copy-and-transform for major schema overhauls.
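An upcaster can be as small as a function applied to each event as it is read. The sketch below assumes a hypothetical change in which version 2 of PaymentReceived gained a currency field, and that historical payments were all in USD; both are assumptions for the example.

```python
def upcast(event):
    """Return the event in the current schema without touching the stored copy."""
    version = event.get("version", 1)    # events written before versioning count as v1
    if event["type"] == "PaymentReceived" and version == 1:
        data = {**event["data"], "currency": "USD"}   # assumed historical default
        return {**event, "version": 2, "data": data}
    return event


stored = [
    {"type": "PaymentReceived", "data": {"amount": 4900}},
    {"type": "PaymentReceived", "version": 2,
     "data": {"amount": 2500, "currency": "EUR"}},
]

# Projections only ever see the upcast form.
current = [upcast(e) for e in stored]
```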

Designing events for longevity from the start reduces versioning pain: use explicit version fields in event payloads, prefer additive changes (adding new optional fields), avoid renaming or removing fields, and treat event schemas as public API contracts. Every event you write today will be replayed by future versions of your projection code, so the event schema is part of your long-term compatibility surface.

Frequently Asked Questions

What is the difference between event sourcing and event-driven architecture?

Event-driven architecture (EDA) is a broad pattern where components communicate by producing and consuming events. Event sourcing is a specific storage pattern where an entity's state is derived from its event history. You can use EDA without event sourcing (storing state in a regular database and using events only for inter-service communication), or you can use event sourcing internally while also publishing events externally as part of an EDA. Many mature event-driven systems use both: event sourcing for internal state management and event publishing for cross-service communication.

How large does the event log grow, and is performance a problem?

Event logs grow indefinitely, which is by design. For write performance, appending to an event store is fast because it is sequential I/O. For read performance, replaying all events from the beginning of the log becomes slow as the log grows. The standard solution is snapshotting: periodically save a snapshot of the aggregate's current state. When loading an aggregate, start from the latest snapshot rather than from event 0, then replay only the events that occurred after the snapshot. Replay cost is then bounded by the number of events since the last snapshot rather than by the total length of the history.
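Snapshot-based loading can be sketched as follows. The snapshot shape (a sequence number plus a state dict) and the load function are assumptions for the example; the second return value counts replayed events only to make the saving visible.

```python
INITIAL_STATE = {"paid": 0}


def apply(state, event):
    if event["type"] == "PaymentReceived":
        return {**state, "paid": state["paid"] + event["amount"]}
    return state


def load(events, snapshot=None):
    """Rebuild state, starting from a snapshot when one is available."""
    if snapshot is None:
        state, start = dict(INITIAL_STATE), 0
    else:
        state, start = dict(snapshot["state"]), snapshot["sequence"]
    replayed = 0
    for event in events[start:]:         # only events after the snapshot
        state = apply(state, event)
        replayed += 1
    return state, replayed


events = [{"type": "PaymentReceived", "amount": 100} for _ in range(1000)]

# Snapshot taken after the first 990 events.
snapshot = {"sequence": 990, "state": load(events[:990])[0]}

state_full, replayed_full = load(events)
state_fast, replayed_fast = load(events, snapshot)
```

Both loads produce the same state; the snapshot only changes how many events have to be replayed to get there.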

How do you handle deletion in event sourcing?

Because events are immutable, you cannot delete data in the traditional sense. For GDPR right-to-erasure requirements, common approaches include: (1) storing sensitive data encrypted with a per-user key and "deleting" it by destroying the key; (2) keeping PII in a separate store that references event IDs, so the PII can be deleted without touching the events; or (3) using tombstone events that mark data as deleted and ensuring projections honor these tombstones. Design your event schema upfront with compliance requirements in mind, as retrofitting deletion support into an existing event-sourced system is substantially harder.
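Approach (2) might look roughly like this. The store layout, the function names, and the "[erased]" placeholder are all assumptions for the sketch; whether a given design satisfies a particular regulation is a separate question that code alone does not settle.

```python
events = []       # immutable log: contains no personal data
pii_store = {}    # event_id -> personal data, deletable independently


def record_order(event_id, order_id, email):
    pii_store[event_id] = {"email": email}
    events.append({"id": event_id, "type": "OrderCreated", "orderId": order_id})


def erase_pii(event_id):
    """Erasure request: remove the personal data, leave the event untouched."""
    pii_store.pop(event_id, None)


def order_view(event):
    """Projection that joins the event with whatever PII still exists."""
    email = pii_store.get(event["id"], {}).get("email", "[erased]")
    return {"orderId": event["orderId"], "email": email}


record_order("evt_001", 123, "customer@example.com")
before = order_view(events[0])
erase_pii("evt_001")
after = order_view(events[0])
```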

Can event sourcing work with existing relational databases?

Yes. An event store is conceptually simple: a table with columns for aggregate_id, sequence_number, event_type, payload (JSON), and created_at, with a unique constraint on (aggregate_id, sequence_number). That constraint also provides a basic form of optimistic concurrency control, because two writers that try to append the same sequence number to one aggregate cannot both succeed. PostgreSQL, MySQL, and SQLite can all serve as event stores for moderate volumes. Dedicated event store databases (EventStoreDB, Axon Server) provide features such as optimistic concurrency control, built-in projections, and stream subscriptions out of the box, but are not required for getting started with event sourcing.
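The table described above can be tried out with Python's built-in sqlite3 module. The append helper and its expected_version parameter are assumptions for the sketch; the unique constraint is what rejects a write based on a stale version.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        aggregate_id    TEXT    NOT NULL,
        sequence_number INTEGER NOT NULL,
        event_type      TEXT    NOT NULL,
        payload         TEXT    NOT NULL,
        created_at      TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
        UNIQUE (aggregate_id, sequence_number)
    )
""")


def append(aggregate_id, expected_version, event_type, payload):
    """Append at expected_version + 1; return False if that slot is taken."""
    try:
        with conn:   # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO events (aggregate_id, sequence_number, event_type, payload) "
                "VALUES (?, ?, ?, ?)",
                (aggregate_id, expected_version + 1, event_type, json.dumps(payload)),
            )
        return True
    except sqlite3.IntegrityError:
        return False   # another writer appended first; reload and retry


first = append("order-123", 0, "OrderCreated", {"total": 4900})
stale = append("order-123", 0, "OrderCanceled", {})      # based on an outdated version
second = append("order-123", 1, "PaymentReceived", {"amount": 4900})
```

A caller that gets False back would normally reload the stream, re-validate the command against the new state, and try again.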

What are the downsides of event sourcing?

Event sourcing adds complexity: projections must be maintained and kept consistent, queries require materialized views rather than direct database queries, and event schema versioning requires ongoing discipline. It is overkill for simple CRUD applications where auditability and replay are not valuable. The best fit is systems with complex business logic, strong auditability requirements, multiple consumers of state changes, and the need to rebuild read models retroactively — patterns common in financial systems, e-commerce order management, and any system that processes webhooks as the primary data input.
