Overselling Event Sourcing
I wrote this article for my blog in June 2020. Here is the link to it.
This post is a part of the Myth Busting series, which deals with the misconceptions about Event Sourcing that circulate here and there. Each post either addresses a common misconception or a particular article on a public resource.
I will address specific issues with a given article. You might want to read the original article to get the context.
A good example of this is the combination of the Event Sourcing and the Command Query Responsibility Segregation (CQRS) architecture patterns.
Event Sourcing is not an architecture pattern.
I keep repeating that Event Sourcing is a way to persist the state of entities, nothing more really. Yes, it’s mind-bending at the start, but it is what it is.
If you have a repository interface, like OrderRepository with methods like Update, you can convert it to use Event Sourcing. I demonstrated it in my book. If you have queries in your repository, here comes trouble. I don’t like having queries in repositories simply because repositories represent collections of entities and queries often need to go beyond that. And, here comes CQRS!
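To make the repository point concrete, here is a minimal sketch in Python of how such a conversion could look. It is not the code from my book: the event classes, the `load` method, the stream naming and the in-memory store are all hypothetical and only illustrate the idea that the repository keeps its shape while events, not rows, get persisted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    customer_id: str


@dataclass(frozen=True)
class ItemAdded:
    order_id: str
    sku: str
    quantity: int


@dataclass
class Order:
    order_id: str = ""
    customer_id: str = ""
    items: dict = field(default_factory=dict)
    changes: list = field(default_factory=list)  # new events, not yet persisted

    # Command methods record an event instead of mutating state directly.
    def place(self, order_id, customer_id):
        self._record(OrderPlaced(order_id, customer_id))

    def add_item(self, sku, quantity):
        self._record(ItemAdded(self.order_id, sku, quantity))

    def _record(self, event):
        self.apply(event)
        self.changes.append(event)

    # State is derived only by applying events, new or loaded from the store.
    def apply(self, event):
        if isinstance(event, OrderPlaced):
            self.order_id, self.customer_id = event.order_id, event.customer_id
        elif isinstance(event, ItemAdded):
            self.items[event.sku] = self.items.get(event.sku, 0) + event.quantity


class InMemoryEventStore:
    """Hypothetical stand-in for a real event store."""

    def __init__(self):
        self.streams = {}

    def append(self, stream, events):
        self.streams.setdefault(stream, []).extend(events)

    def read(self, stream):
        return list(self.streams.get(stream, []))


class OrderRepository:
    """Same repository shape as before; only the persistence differs."""

    def __init__(self, store):
        self.store = store

    def load(self, order_id):
        order = Order()
        for event in self.store.read(f"order-{order_id}"):
            order.apply(event)
        return order

    def update(self, order):
        # Persist the new events rather than overwriting the current state.
        self.store.append(f"order-{order.order_id}", order.changes)
        order.changes = []
```

The calling code still loads an order, invokes a method on it and calls `update`. Nothing outside the repository needs to know that the state is stored as a stream of events.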
CQRS is not an architecture pattern either. CQRS simply states that when you update your entity state — you optimise the operation for a transaction. But when you run a query — you optimise it for reporting. That’s the exact reason why I don’t advise queries in repositories. Reporting models too often have zero interest in your domain model but have a lot of interest in the state of the system as a whole. Therefore, queries often go beyond boundaries of your entities and limiting your queries to work with a single entity type simply becomes a burden.
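A small sketch of that split, with hypothetical names throughout. The write side touches exactly one entity per operation, while the read side answers a reporting question that cuts across entities. Both sides share one in-memory dict here only to keep the sketch short; a real read side would typically query its own store or projection.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    customer_id: str
    total: float
    paid: bool = False


class OrderCommands:
    """Write side: each operation changes a single entity."""

    def __init__(self, orders):
        self.orders = orders  # dict keyed by order id, stands in for a repository

    def place(self, order_id, customer_id, total):
        self.orders[order_id] = Order(order_id, customer_id, total)

    def mark_paid(self, order_id):
        self.orders[order_id].paid = True


class SalesQueries:
    """Read side: shaped for reporting, not bound to one entity."""

    def __init__(self, orders):
        self.orders = orders

    def paid_revenue_per_customer(self):
        report = defaultdict(float)
        for order in self.orders.values():
            if order.paid:
                report[order.customer_id] += order.total
        return dict(report)
```

The query class has no business living in `OrderRepository`: it returns a report, not orders.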
CQRS has a dual nature. On the one hand, Greg Young said a number of times that CQRS is just a stepping stone towards Event Sourcing. On the other hand, CQRS is not a part of Event Sourcing. It just becomes very useful when you build event-sourced systems. Check Jimmy’s post from 2012 to know more.
Yet there is hardly any talk on the ton of hidden complexity with these patterns.
That’s not true. The Event Sourcing community keeps exposing issues and ways to overcome them, based on their own experience. DDD Europe organised the first event dedicated to Event Sourcing, which I was privileged to curate, and we had quite a few case-studies and experience-sharing sessions.
Dennis Doomen, for example, keeps sharing his experience on many forums, so go check it out.
Another thing that bothers me in the phrase above is ton. What is a ton? Doesn’t the ORM tool that a seasoned developer (referenced later in the article) uses have a ton of complexity? I bet it does. Damn it, our computers are complex beyond the comprehension of 90% of the people who create the software that those computers execute. Did it ever stop us from writing software? I hardly observe it happening. In fact, Event Sourcing is unknown to many. As with anything unknown, many do it wrong. As with anything done wrong, it creates trouble. I made many mistakes designing and building event-sourced systems until I realised that most of the complexity is artificial and can be avoided.
Typically, loose coupling of microservice is achieved by communicating through https using the REST pattern and versioning the endpoints the microservice exposes. This enables microservices to evolve freely without breaking each other and thereby solve the dependency hell that many monolithic systems struggle with. Despite loose coupling, microservices still depend on each other for data.
Well, here comes big trouble. Check Sam Newman’s principles of microservices. The industry has aligned on the idea that microservices should be built around specific business capabilities and own their data. Check the Decompose by business capability pattern by Chris Richardson.
What is the main issue with services using RPC calls to exchange information? Well, it’s the high degree of coupling introduced by RPC by its nature. The whole group of services or even the whole system can go down if only one of the services stops working. This approach diminishes the whole idea of independent components.
In my practice I hardly encounter any need to use RPC for inter-service communication. Partially because I often use Event Sourcing, more about it later. But we always use asynchronous communication and exchange information between services using events, even without Event Sourcing.
For example, an order microservice in an e-commerce system needs customer data from the customer microservice. These dependencies between microservices are not ideal. Other microservices can go down and synchronous RESTful requests over https do not scale well due to their blocking nature. If there was a way to completely eliminate dependencies between microservices completely the result would be a more robust architecture with less bottlenecks.
You don’t need Event Sourcing to fix this issue. Event-driven systems are perfectly capable of doing that. Event Sourcing can eliminate some of the associated issues like two-phase commits, but again, not a requirement to remove the temporal coupling from your system.
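Here is a rough sketch of the order and customer example done with events and plain state-based persistence, so no Event Sourcing is involved. All names are hypothetical, and the in-process broker is only a stand-in for a real message broker, which would deliver asynchronously.

```python
from collections import defaultdict


class InProcessBroker:
    """Hypothetical stand-in for a message broker; delivery here is immediate."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self.handlers[event_type]:
            handler(payload)


class CustomerService:
    def __init__(self, broker):
        self.broker = broker
        self.customers = {}  # plain state-based storage, no Event Sourcing

    def change_address(self, customer_id, address):
        # Saving state and publishing are two separate operations here.
        # That is the two-phase commit issue, which this sketch does not solve.
        self.customers[customer_id] = {"address": address}
        self.broker.publish(
            "CustomerAddressChanged",
            {"customer_id": customer_id, "address": address},
        )


class OrderService:
    def __init__(self, broker):
        self.addresses = {}  # local copy of just the customer data orders need
        self.orders = []
        broker.subscribe("CustomerAddressChanged", self.on_address_changed)

    def on_address_changed(self, event):
        self.addresses[event["customer_id"]] = event["address"]

    def place_order(self, order_id, customer_id):
        # No call to the customer service: it can be down and orders still work.
        order = {"order_id": order_id, "ship_to": self.addresses[customer_id]}
        self.orders.append(order)
        return order
```

The order service never asks the customer service for anything at the moment an order is placed, which is exactly the temporal coupling that the quoted paragraph worries about.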
Let’s start with the diagram:
I have to say one thing — it is not how you do Event Sourcing.
Oskar is completely right about describing state mutations represented by domain events as the essence of the pattern. However, the important part is that domain events live inside the context because they are an integral part of the domain model. Therefore, domain events are rarely exposed to the outside world as-is, because doing so couples the domain model to the service contract: the events become the contract. That’s the last thing you want to do. Separating events that go to the outside world, publishing these events as the service contract and keeping them stable is the right thing to do.
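A minimal sketch of that separation, with hypothetical event names. The internal domain events can change as the model evolves, while the public event is a separate, versioned type that exposes only what other services are allowed to depend on.

```python
from dataclasses import dataclass
from typing import Optional


# Internal domain events: free to change together with the domain model.
@dataclass(frozen=True)
class OrderItemAdded:
    order_id: str
    sku: str
    quantity: int


@dataclass(frozen=True)
class OrderShipped:
    order_id: str
    warehouse_bin: str  # internal detail, of no interest to other services
    carrier: str


# Public contract: published to the outside world and kept stable.
@dataclass(frozen=True)
class OrderShippedV1:
    order_id: str
    carrier: str


def to_public(event) -> Optional[object]:
    """Translate a domain event to its public counterpart, if it has one."""
    if isinstance(event, OrderShipped):
        return OrderShippedV1(event.order_id, event.carrier)
    return None  # everything else stays private to the context
```

Renaming `warehouse_bin` or splitting `OrderShipped` in two then only requires touching `to_public`, not every consumer of the contract.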
Another issue is that the power of Event Sourcing in eliminating two-phase commits is gone here. You can see on the diagram that persisting an event and publishing it to the bus are two distinct operations. What we always do is persist events to the store and use the store as the source of events. It is possible with products that support real-time event feeds, like EventStoreDB. It also eliminates the need to have any kind of event bus inside the service to build the read model (shown in green). It is also important that more often than not, the read-side projection must process events in order and no product that is called bus can do that.
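The idea can be sketched as follows. This is not the EventStoreDB client API: the store, the tuple-shaped events and the polling `catch_up` method are hypothetical stand-ins for a real-time subscription. What the sketch does show is a single write (the append) and a projection that reads from the store in order, remembering its position with a checkpoint.

```python
class InMemoryEventStore:
    """Hypothetical store with one global, ordered, append-only log."""

    def __init__(self):
        self.log = []

    def append(self, event):
        # The only write: there is no separate publish step to keep in sync.
        self.log.append(event)

    def read_from(self, position):
        # Returns (position, event) pairs starting at the given position.
        return list(enumerate(self.log))[position:]


class OrderTotalsProjection:
    """Read model fed from the store itself, strictly in log order."""

    def __init__(self):
        self.totals = {}
        self.checkpoint = 0  # position of the next event to process

    def catch_up(self, store):
        for position, event in store.read_from(self.checkpoint):
            self.handle(event)
            self.checkpoint = position + 1

    def handle(self, event):
        kind, order_id, amount = event
        if kind == "ItemAdded":
            self.totals[order_id] = self.totals.get(order_id, 0) + amount
        elif kind == "OrderCancelled":
            # Order matters: a cancellation processed before the additions
            # would leave a total behind for a cancelled order.
            self.totals.pop(order_id, None)
```

Because the projection pulls from the log and tracks its own checkpoint, it can be stopped, restarted or rebuilt from position zero without the write side being aware of it.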
Here is the corrected, simplified diagram of how it should be:
I can put the next sentence to the same bucket:
Event sourcing takes care of the write logic, where events are persisted in an event store and broadcasted using a publish/subscribe approach to inform microservices that there is a change in data.
Again, publish-subscribe is not related to Event Sourcing whatsoever, and it is a valid pattern on its own.
Immutability of history
Not being able to change events without significant time investment makes event sourcing especially unforgiving. Found an event modelling design decision that you regret in hindsight or doesn’t play nice with that new requirement you didn’t anticipate? That thing has happened, you have to deal with it now and with the complexity it brings.
It is true, but twisted. Some say that Event Sourcing gives you a time machine. You can get back to the event log and see what happened and for what reason. I’d say it gives you the story of your domain, for exactly the same purpose. The paradox of time travel is that you can twist the past and encounter unpredicted consequences when you get back to “now time”. And that is what I claim is happening in the world of state-oriented systems. We change “now” in the hope that we have corrected it to the state it should be in, ignoring what has happened before. For sure, we believe that most of the time we “correct” the system state, but how is it different from cheating with financial figures, trying to adjust debts to profits? We rarely know how the system got to a state that we presume is incorrect, yet that never stopped developers from “fixing” the system state.
I cannot argue that from time to time it is a tempting decision to make and seems to be the fastest way to bring the system back on track. However, more than once I witnessed that one fix followed the other and developers were never able to find the root cause of the issue. They kept fixing the “incorrect” state repeatedly, even trying to automate it.
Dealing with ledger-based systems that keep all changes and even use it as the source of truth is drastically different. Yes, you cannot rewrite the past (although you actually can). But you can make deliberate corrective actions, also known as compensating actions, which are also recorded and can be used as the evidence of such correction. We might argue that it is more complex, but it’s hardly possible to debate around those corrections being way more explicit and deliberate, compared with changing data in tables, leaving no trace of such an intrusion.
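A tiny illustration of a compensating action, using hypothetical event types and amounts in cents. The wrong entry stays in the ledger, and the correction is a new, explicit entry that carries its own reason.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmountDeposited:
    amount: int  # in cents


@dataclass(frozen=True)
class DepositCorrected:
    amount: int  # signed adjustment, in cents
    reason: str


def balance(events):
    # The current state is always derived from the full history.
    return sum(event.amount for event in events)


ledger = [AmountDeposited(10_000), AmountDeposited(50_000)]  # second one is a typo

# Do not edit or delete the wrong entry: record a deliberate correction instead.
ledger.append(DepositCorrected(-45_000, "Deposit of 5,000 entered as 50,000"))
```

After the correction the balance is right, and the ledger still tells both what went wrong and how it was fixed, which an in-place update of a table row would not.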
To complement the thought, I can share my experience. More often than not, I find myself in situations when I chose the wrong model and used the state-based persistence because it was “easy”. Instead of recording events, I kept changing the system state. Guess what, a twist in the model, which appeared to be not exactly right (or entirely wrong) without having a history of changes, brought me to a very uncomfortable position of not being able to migrate to the new model simply because I didn’t have enough data. Domain events are often richer than any single piece of state. You usually keep the decision log as events with all the details about those decisions. When reflecting it in a piece of state, you might find some of those bits of information useless and leave them behind. After that, there’s no coming back. Also, moving from a pure last-state-record model to a temporal model, which has entities like day or week is virtually impossible. Again, because there’s not enough data since every temporal change was overwriting the previous one without leaving any trace.
I’d also like to refer to Greg’s article Why can’t I update an event?.
But this is definitely not the case for most of the teams out there. Event Sourcing is a big mental leap for developers. Not every developer is fluent in DDD as we have been working with databases and CRUD operations for a long time.
I am not sure how to comment on this one. Building a distributed system with microservices and keeping CRUD models? I can hardly believe it can actually work. Check some of the videos below to learn more:
- DDD Norway meetup: Microservices Without DDD is Risky Business! by Trond Hjorteland
- GOTO 2015: DDD and Microservices: At Last, Some Boundaries! by Eric Evans
Search Google for more.
Maybe the whole discussion of this imaginary complexity of event-driven communication versus RPC calls between microservices is driven by incorrect service boundaries. Those, in turn, lead to overly chatty cross-service communication and circular dependencies, resulting in a highly coupled distributed system where none of the components is really cohesive.
In fact, DDD and Event Sourcing are orthogonal. Indeed, Greg Young was an active advocate for both simply because it makes sense. DDD is not about microservices, neither is it about architecture. It is about understanding the domain, the business and the users of your software. Those are fundamental aspects of quality for any software, unless you’re building a Hello World app.
In fact, Oskar mentions that:
Developers are often asked to work on domains that they are less familiar with or develop a system to support a new business model that is still very susceptible to change.
Being vaguely acquainted with the domain leads to vaguely purposeful software. We have all dealt with such systems, where actions are cryptic, everything looks like a representation of a table in an RDBMS and where a person’s birthday gets changed on the same screen as the user’s password. Nobody likes that kind of software. We like nice modern apps where everything is structured and targeted to do what we want quickly and efficiently. I can hardly imagine that, without a proper understanding of the domain, a team of developers can build anything that gets close to such a goal.
Another challenge is selecting a fit-for-purpose solution to support Event Sourcing. As events will be the backbone of the architecture; it needs to be highly available and scalable.
Again, it depends, right? Putting “new business models that tend to change” and “must scale” in one basket is unfair. If you’re building a prototype, you don’t care that much about scaling. After all, premature optimisation is the root of all evil. When you actually get tens of thousands of users doing hundreds of thousands of transactions per day, you might think about scaling. Even then, proven products like EventStoreDB can serve you for a very long time without much of a worry about scale.
At first glance, Apache Kafka would seem like a good fit for Event Sourcing
No, it doesn’t, simply because it doesn’t have concepts of event streams and deals with topics and partitions instead.
Check the Apache Kafka is not for Event Sourcing article by Jesper Hammarbäck to know more.
Purpose-built software like EventStoreDB might be the choice to go with when it comes to getting things right from the start and also addressing the scale later.
The final challenge is the General Data Protection Regulation (GDPR) privacy legislation
The impact of GDPR on event-sourced systems is highly overrated. Yes, if you design your system in a way that private data is scattered across a plethora of contexts and streams, it will create problems. Systems that are designed with privacy in mind (like they should be) don’t really have issues with that. If you keep all the private information in a single stream per person, you delete it and all your read models get updated (deleted) accordingly. You can crypto-shred private information using asymmetric keys and then delete the key. In that case, you can still spread the encrypted private information across multiple streams. I’d say the challenge is there for any system, and it just needs to be properly addressed.
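The crypto-shredding flow can be sketched like this. Two loud assumptions: every name here is hypothetical, and the cipher is a toy symmetric keystream built from the standard library so that the example runs without dependencies. It is not the asymmetric scheme mentioned above and must not be used for real data; a vetted cryptography library belongs in its place. The flow is the point: one key per person, encrypted values stored in events, and deleting the key makes every copy unreadable.

```python
import hashlib
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream for illustration only, NOT a vetted cipher.
    return hashlib.shake_256(key + nonce).digest(length)


def encrypt(key: bytes, plaintext: str) -> bytes:
    data = plaintext.encode("utf-8")
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt(key: bytes, blob: bytes) -> str:
    nonce, data = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream)).decode("utf-8")


class KeyStore:
    """Hypothetical key store holding one key per person."""

    def __init__(self):
        self.keys = {}

    def key_for(self, person_id):
        if person_id not in self.keys:
            self.keys[person_id] = secrets.token_bytes(32)
        return self.keys[person_id]

    def get(self, person_id):
        return self.keys.get(person_id)

    def forget(self, person_id):
        # The "shredding": events stay in place, but become unreadable.
        self.keys.pop(person_id, None)


def read_email(keys, event):
    key = keys.get(event["person_id"])
    if key is None:
        return None  # key was deleted, the private data is gone for good
    return decrypt(key, event["email"])
```

Events in any number of streams can carry values encrypted with the same per-person key, and none of those events has to be rewritten when the person asks to be forgotten.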
Do most microservice architectures need Event Sourcing? No, they don’t.
No doubt in that. Microservices have nothing to do with Event Sourcing.
I firmly believe that for the majority of microservice architecture, having RESTful https dependencies between microservices is not the “your system is still a monolith” death sentence that some make it out to be.
It actually is. As I mentioned before, RPC calls between services introduce functional and temporal coupling, making the whole system unreliable as a single service can bring the whole goliath down in no time. Designing systems as a set of truly independent components can be done without Event Sourcing and REST or any other kind of RPC has nothing to do with it.
Reduce the number of dependencies between services through proper service sizing using practices like Bounded Context.
That is exactly right. If you do that, there’s no need for RPC calls over REST (or, more correctly put, HTTP) or anything else.
It’s important to keep a fit-for-purpose mindset, and not buy-in to the complex architecture patterns sales pitch. Silver bullets don’t exist in software engineering, tradeoffs do.
I can sign under every word in this phrase. I am just wondering what Event Sourcing has to do with it. CRUD-based systems are more likely to be unfit for building proper systems, yet they seem to have survived for many decades anyway. I might also mention that microservices as such are a much more doubtful pattern for building software, as many systems can be delivered as a monolith, at least in their early stages. Event-sourced or not, distributed systems are complex by nature, and I’d urge you to avoid distribution in the early stages of the system’s lifetime until you feel a need for it.