BACKGROUND
The Event Sourcing architectural pattern is well known, but it has no rigid, well-defined standard specification. As a result, we often find systems that allegedly use an Event Sourcing approach but actually miss some or even many of the main principles behind it.
What is not Event Sourcing
- Your system uses commands as requests for operations across different parts of the system. Commands and events are frequently confused. Commands are imperative requests that may or may not involve data writes. The concept of a command comes from specific protocols and patterns, e.g. RPC.
- "A message broker makes ours an Event Sourcing system." That is not true. Message brokers provide asynchronous messaging, but it is entirely up to the rest of the architecture to take advantage of this feature. Simply put, messages are not events. We need something more.
- We constantly change the state of our data entities in place, perhaps tracking actions in other tables that record who changed something and when. This is a clear and frequently found anti-pattern that has nothing to do with Event Sourcing.
One of the causes is software engineers' lack of experience with Event Sourcing systems. Training, theory and imaginative solutions are needed, but a change in the staff's mindset is critical. My advice is: do not rely on CVs and LinkedIn profiles. Prepare a good set of questions and architectural cases for your team and make sure their skills in this field are real. If they are not, invest in training and in communicating practices. Theory is important.
Challenges
In conclusion, a good implementation of an Event Sourcing system needs a plan for Event Handling. The challenges that plan has to solve are:
- Description of the structure of events as a reference for all the actors in the system.
- The same for the types of events. What can I expect from an event that starts a given operation? What data is needed, and how is it reflected in the event structure?
- A store for the occurrences of events over time from all possible operations in the system.
- How the events are reflected in data entities, and how they are stored and read.
DESCRIPTION
We have designed and implemented the solutions for these challenges in order to go deeper into the Event Sourcing paradigm.
Event Internal Structure
WHAT
Definition and description of the internal structure of events. All events in the system share the same header structure. It contains information about the event type, tenant, business, and most importantly the correlation ID (please read this post about the importance of the correlation ID in Event Sourcing systems).
Is the same definition of a given event valid for different methods (read, write)?
While the event header structure is common to all event types, the content of the body can change across different methods for the same operation. For instance:
- List of parameters in the body -> Read operation
- Event structure -> Insertion of a new event in the data entity
It is really simple, and that is because we are not following imperative RPC approaches. When the app (event handler) finds the event structure defined for a data entity (table), it inserts the event, and that is all. The event content can be a new item, the modification of an existing one, or its deletion.
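As a minimal sketch of the structure described above, assuming Python dataclasses: the header fields `event_type`, `tenant`, `business` and `correlation_id` come from the description, while `occurred_at`, the helper `new_event` and the sample event type names are hypothetical and only there for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional
import uuid


@dataclass(frozen=True)
class EventHeader:
    """Header shared by every event type."""
    event_type: str
    tenant: str
    business: str
    correlation_id: str
    occurred_at: str  # assumption: an ISO-8601 UTC timestamp, not named in the post


@dataclass(frozen=True)
class Event:
    """An occurrence: a common header plus a body that varies by method."""
    header: EventHeader
    body: Dict[str, Any] = field(default_factory=dict)


def new_event(event_type: str, tenant: str, business: str,
              body: Dict[str, Any],
              correlation_id: Optional[str] = None) -> Event:
    """Hypothetical helper: builds an event, creating a correlation ID if none is given."""
    header = EventHeader(
        event_type=event_type,
        tenant=tenant,
        business=business,
        correlation_id=correlation_id or str(uuid.uuid4()),
        occurred_at=datetime.now(timezone.utc).isoformat(),
    )
    return Event(header=header, body=dict(body))


# Read operation: the body is a list of parameters.
read_event = new_event("customer.read", "tenant-a", "retail",
                       {"parameters": {"customer_id": "42"}})

# Write operation: the body is the event to insert in the data entity.
# It reuses the correlation ID so both events can be traced together.
write_event = new_event("customer.address_changed", "tenant-a", "retail",
                        {"customer_id": "42", "address": "1 Main St"},
                        correlation_id=read_event.header.correlation_id)
```

The frozen dataclasses are a design choice for the sketch: an attempt to reassign a header field raises an error, which mirrors the idea that an event is a fact that already happened.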
WHY
Any Event Sourcing system is based on well-defined data structures for events, designed to reflect their nature as occurrences rather than imperative commands.
HOW
We store the Event Catalog and the Event Main Reference in Git.
We include the definition of the body content as a reference in the Event Catalog.
Event Catalog
WHAT
It is a description of the internal structure of the defined events.
WHY
Engineers in the team need a reference for data in event handling across the operations in the system.
HOW
JSON files in Git, to track changes and make the catalog generally available.
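A catalog entry and a check against it could look like the sketch below. The JSON layout, the type names and the function `validate` are assumptions made for illustration, not the actual catalog format; only the idea of a JSON reference per event type comes from the text.

```python
import json

# Hypothetical catalog file content: one entry per event type.
CATALOG_JSON = """
{
  "customer.address_changed": {
    "header": ["event_type", "tenant", "business", "correlation_id"],
    "body": {"customer_id": "str", "address": "str"}
  }
}
"""

# Maps the type names used in the catalog to Python types.
TYPES = {"str": str, "int": int, "float": float, "bool": bool}


def validate(event, catalog):
    """Return a list of problems; an empty list means the event matches the catalog."""
    header = event.get("header", {})
    body = event.get("body", {})
    entry = catalog.get(header.get("event_type"))
    if entry is None:
        return [f"unknown event type: {header.get('event_type')!r}"]

    problems = []
    for name in entry["header"]:
        if name not in header:
            problems.append(f"missing header field: {name}")
    for name, type_name in entry["body"].items():
        if name not in body:
            problems.append(f"missing body field: {name}")
        elif not isinstance(body[name], TYPES[type_name]):
            problems.append(f"wrong type for body field: {name}")
    return problems


catalog = json.loads(CATALOG_JSON)
event = {
    "header": {"event_type": "customer.address_changed", "tenant": "tenant-a",
               "business": "retail", "correlation_id": "c-1"},
    "body": {"customer_id": "42", "address": "1 Main St"},
}
problems = validate(event, catalog)  # no problems for this event
```

Keeping the check as a pure function over plain dictionaries means the same JSON files tracked in Git can be loaded by any service that handles events.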
Event Store
WHAT
Running operations are based on events that are managed by message brokers.
WHY
We need to store our communications (the source of our events) as soon as they are sent to the communication channels. They are stored as time series so they can be audited and related to specific changes in the system. Immutability of events is an essential factor, as we remarked in another post in this blog.
HOW
We use Azure Data Lake as unstructured storage with a directory structure based on time. Using U-SQL and Power BI, we build reports and visualizations of the event flows, providing full visibility and traceability.
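The time-based directory structure can be illustrated with a small sketch. It uses the local filesystem as a stand-in for the data lake and makes no use of any Azure SDK; the `year/month/day/hour` layout, the file naming and the functions `event_path` and `store_event` are assumptions for illustration only.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def event_path(root, occurred_at, correlation_id):
    """Build <root>/YYYY/MM/DD/HH/<timestamp>_<correlation_id>.json (occurred_at in UTC)."""
    folder = Path(root) / occurred_at.strftime("%Y/%m/%d/%H")
    name = f"{occurred_at.strftime('%Y%m%dT%H%M%S%fZ')}_{correlation_id}.json"
    return folder / name


def store_event(root, event, occurred_at):
    """Write the event once; an existing file is never overwritten."""
    path = event_path(root, occurred_at, event["header"]["correlation_id"])
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "x" raises FileExistsError if the file is already there,
    # which keeps stored events immutable in this sketch.
    with open(path, "x", encoding="utf-8") as handle:
        json.dump(event, handle)
    return path


with tempfile.TemporaryDirectory() as root:
    when = datetime(2024, 5, 17, 13, 30, tzinfo=timezone.utc)
    sample = {"header": {"event_type": "customer.address_changed",
                         "correlation_id": "c-1"},
              "body": {"customer_id": "42", "address": "1 Main St"}}
    stored = store_event(root, sample, when)
    relative_parts = stored.relative_to(root).parts
```

Partitioning by time keeps audits and reports over a given period limited to the folders for that period.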
Putting it all together
By using the event definition, the event catalog and the event store, we are starting to handle a normalized Event Sourcing system correctly.
But we need something more.
Event Data Driven Architecture
Data entities oriented to Event Sourcing reflect the time-based nature of changes in the state of data models. The usual tendency here is to keep events and data state separate, encouraging the segregation of event handling from the effect of events on the data state. This is another important anti-pattern, directly inherited from relational modeling and its concepts. Our advice is: keep the event structure and the contextual information in the same table. Query-oriented databases (e.g. Cassandra) fit this kind of design well. Accept the true nature of the event in time as a regular dimension of the data design.
The simple steps in Event Handling are:
- Validation from reference in the Event Catalog.
- Storage in Event Store as the event occurs.
- Record of the event in the data entity oriented to Event Sourcing in the database.
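The three steps above can be sketched as a single handler. This is a simplified illustration with in-memory lists standing in for the Event Catalog, the Event Store and the database table; the class `EventHandler` and all field names are hypothetical.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


class EventHandler:
    """Runs the three steps in order: validate, store, record."""

    def __init__(self, catalog: Dict[str, List[str]]) -> None:
        self.catalog = catalog                        # event type -> required body fields
        self.event_store: List[Event] = []            # stand-in for the time-based store
        self.entity_table: List[Dict[str, Any]] = []  # stand-in for the database table

    def handle(self, event: Event) -> None:
        header, body = event["header"], event["body"]

        # 1. Validation against the reference in the Event Catalog.
        required = self.catalog.get(header["event_type"])
        if required is None:
            raise ValueError(f"unknown event type: {header['event_type']}")
        missing = [name for name in required if name not in body]
        if missing:
            raise ValueError(f"missing body fields: {missing}")

        # 2. Storage in the Event Store as the event occurs.
        self.event_store.append(event)

        # 3. Record in the data entity: event and context kept in the same row.
        self.entity_table.append({
            "event_type": header["event_type"],
            "correlation_id": header["correlation_id"],
            "occurred_at": header["occurred_at"],
            **body,
        })


handler = EventHandler({"customer.address_changed": ["customer_id", "address"]})
handler.handle({
    "header": {"event_type": "customer.address_changed",
               "correlation_id": "c-1",
               "occurred_at": "2024-05-17T13:30:00+00:00"},
    "body": {"customer_id": "42", "address": "1 Main St"},
})
```

Because validation comes first, an event that does not match the catalog reaches neither the store nor the table.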
In Event Sourcing oriented data modeling, we tightly couple event storage to the data. This has an important cost: we need to calculate the current state of the data entities for every read operation. Common solutions, such as the scheduled insertion of snapshots that hold the calculated current state, materialized views and similar measures, solve the problem in a straightforward way and minimize the amount of data involved in each operation. On the other side, the advantages are huge. We can go back in time by calculating the state of data entities at any moment, we get full traceability of operations, we can verify the immutability of the operations in the database for audit purposes, and much more. Finally, this is possible:
"Event Sourcing ensures that all changes to application state are stored as a sequence of events. Not just can we query these events, we can also use the event log to reconstruct past states, and as a foundation to automatically adjust the state to cope with retroactive changes."
Martin Fowler (2005)
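Reconstructing the state of an entity at a given moment, optionally starting from a snapshot, can be sketched as a fold over its events. The event shape (`at`, `kind`, `data`), the three kinds of change and the functions `apply_event` and `state_at` are assumptions for illustration; timestamps are ISO date strings in one format, so they compare correctly as text.

```python
from typing import Any, Dict, Iterable, Optional, Tuple

Event = Dict[str, Any]
State = Dict[str, Any]


def apply_event(state: State, event: Event) -> State:
    """Return a new state with one event applied; the input state is not mutated."""
    if event["kind"] == "deleted":
        return {}
    new_state = dict(state)
    if event["kind"] in ("created", "modified"):
        new_state.update(event["data"])
    return new_state


def state_at(events: Iterable[Event], as_of: str,
             snapshot: Optional[Tuple[str, State]] = None) -> State:
    """Fold all events up to and including as_of, starting from a snapshot if usable."""
    start, state = None, {}
    # A snapshot taken after as_of cannot be used for a query about the past.
    if snapshot is not None and snapshot[0] <= as_of:
        start, state = snapshot[0], dict(snapshot[1])
    for event in sorted(events, key=lambda e: e["at"]):
        if start is not None and event["at"] <= start:
            continue  # already included in the snapshot
        if event["at"] > as_of:
            break
        state = apply_event(state, event)
    return state


events = [
    {"at": "2024-01-01", "kind": "created", "data": {"name": "Ann", "city": "Oslo"}},
    {"at": "2024-02-01", "kind": "modified", "data": {"city": "Rome"}},
    {"at": "2024-03-01", "kind": "deleted", "data": {}},
]
in_january = state_at(events, "2024-01-15")   # {'name': 'Ann', 'city': 'Oslo'}
in_february = state_at(events, "2024-02-15")  # {'name': 'Ann', 'city': 'Rome'}
```

With a snapshot, only the events after it need to be replayed, which is how scheduled snapshots keep read operations small.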
CONCLUSIONS
- By strictly following the principles of Event Sourcing, we gain important advantages in data management, control and audit of operations, and rebuild capabilities for our systems.
- The mindset inherited from RPC is a real problem. To mitigate it, read, read and read. Arranging frequent meetings with the team to explain clearly how your Event Sourcing implementation works will help as well. Explaining the architecture and specific problems out loud makes it easier to find genuinely Event Sourcing oriented solutions, always keeping in mind that we need to leave our RPC inheritance behind.
- New team members will usually have a similar lack of experience and the same mindset. Team planning has to take this into account.
- Putting the architectural components described above into operation has made it possible to start running Event Sourcing systems with the planned advantages.