The CAP theorem, and how it applies to microservices
It’s not unusual for developers and architects who jump into microservices for the first time to “want it all” in terms of performance, uptime and resiliency. After all, these are the goals that drive a software team’s decision to pursue this type of architecture. The unfortunate truth is that trying to build an application that perfectly embodies all of these traits will eventually steer the team toward failure.
This phenomenon is summed up in the CAP theorem, which states that a distributed system can simultaneously guarantee at most two of three properties: consistency, availability and partition tolerance. According to CAP, “having it all” is impossible. And because network partitions can never be ruled out in a distributed deployment, the practical choice usually comes down to consistency versus availability.
When it comes to microservices, the CAP theorem seems to pose an unsolvable problem: Which of these qualities can you afford to trade away? The essential point is that you can’t avoid the choice. You’ll have to face it at the design stage, and you’ll need to think carefully about the type of application you’re building, as well as its most essential needs.
In this article, we’ll review the basics of how the CAP theorem applies to microservices, and then examine the concepts and guidelines you can follow when it’s time to make a decision.
CAP theory and microservices
Let’s start by reviewing the three qualities CAP specifically refers to:
- Consistency means that all clients see the same data at the same time, no matter the path of their request. This is critical for applications that do frequent updates.
- Availability means that every request receives a valid response, even when one or more components are down. This is particularly important if an application’s user population has a low tolerance for outages (such as a retail portal).
- Partition tolerance means that the application will operate even during a network failure that results in lost or delayed messages between services. This comes into play for applications that integrate with a large number of distributed, independent components.
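To make the consistency/availability trade concrete, here is a deliberately tiny sketch, not a real database: a key-value store with two replicas and a simulated partition flag. The class and method names are invented for illustration. A “CP” store refuses requests it cannot keep consistent, while an “AP” store keeps answering with possibly stale data:

```python
class Replica:
    """One copy of the data set."""
    def __init__(self):
        self.data = {}

class ReplicatedStore:
    """Toy two-replica store; mode is 'CP' or 'AP'."""
    def __init__(self, mode):
        self.mode = mode
        self.primary = Replica()
        self.secondary = Replica()
        self.partitioned = False  # simulated network fault

    def write(self, key, value):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than let replicas diverge.
            raise ConnectionError("partition: write rejected")
        self.primary.data[key] = value
        if not self.partitioned:
            self.secondary.data[key] = value  # replication succeeds

    def read_secondary(self, key):
        if self.partitioned and self.mode == "CP":
            raise ConnectionError("partition: read rejected")
        return self.secondary.data.get(key)  # may be stale under AP
```

During a partition, the AP store stays available but a read from the secondary can return a value the primary has already overwritten; the CP store errors out instead, staying consistent at the cost of availability.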
Databases often sit at the center of the CAP problem. Microservices commonly rely on NoSQL databases, since they’re designed to scale horizontally and support distributed application processes. Partition tolerance is a “must have” in these systems because network failures between distributed nodes are, in practice, inevitable.
You can certainly design these kinds of databases for consistency and partition tolerance, or for availability and partition tolerance. But designing for both consistency and availability just isn’t an option.
The PACELC theorem
This hard requirement for partition tolerance in distributed systems gave rise to what is known as the PACELC theorem, a sibling to the CAP theorem. The acronym PACELC stands for “if partitioned (P), then availability (A) versus consistency (C); else (E), latency (L) versus consistency (C).” In other words: During a partition, a distributed system must choose between availability and consistency; otherwise, the choice is between latency and consistency.
Under PACELC, retaining strict consistency in a distributed system forces you to sacrifice either availability (during a partition) or latency, and with it user experience (during normal operation). The key term here is “operation”: While latency is a primary concern during normal operations, a failure can quickly make availability the overall priority. So, why not create models for both scenarios?
It may help to frame CAP concepts in both “normal” and “fault” modes, given that faults in a distributed system are essentially inevitable. This enables you to create two database and microservices implementation models: one that handles normal operation, and another that kicks in during failures. For example, you can design your database to optimize consistency during a partition failure, and then continue to focus on mitigating latency during normal operation.
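The two-mode idea can be sketched as a pair of configuration profiles selected by a health check. The knob names below (consistency_level, read_timeout_ms) mimic the tunable-consistency settings many distributed databases expose, but this config object and its values are hypothetical, not any real driver’s API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DbSettings:
    consistency_level: str  # e.g. "ONE" (fast) vs "QUORUM" (consistent)
    read_timeout_ms: int

# Normal mode: mitigate latency with cheap single-replica reads.
NORMAL_MODE = DbSettings(consistency_level="ONE", read_timeout_ms=50)

# Fault mode: optimize consistency, accepting slower responses.
FAULT_MODE = DbSettings(consistency_level="QUORUM", read_timeout_ms=2000)

def settings_for(partition_detected: bool) -> DbSettings:
    """Switch implementation models when a partition is detected."""
    return FAULT_MODE if partition_detected else NORMAL_MODE
```

In a real system the “partition detected” signal would come from monitoring or the database driver itself; the point of the sketch is that the two models are designed up front rather than improvised mid-outage.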
Applying PACELC to microservices
If we use PACELC rather than “pure CAP” to define databases, we can classify them according to how they make the trades.
- In PACELC terms, relational database management systems and NoSQL databases that implement ACID (atomicity, consistency, isolation, durability) are designed to assure consistency, classifying them as PC/EC. Typical business applications, like human resources apps and ticketing systems, will likely use this model, particularly when multiple users update data through different component instances. Google’s Bigtable database is a good example of this.
- Databases such as MongoDB and the in-memory data grid Hazelcast fit into a PA/EC model, which is best suited for things like e-commerce apps, which need high availability even during network or component failures.
- Real-time applications, such as IoT systems, fit into the PC/EL model that databases like PNUTS provide. This model suits any application where consistency across replicas is critical.
- Database systems based on the PA/EL model, such as Dynamo and Cassandra, are best for real-time applications that don’t experience frequent updates, since consistency will be less of an issue.
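The classifications above can be condensed into a lookup table. The labels simply restate the list; the small helper function spells out what each PACELC class trades, and is an illustrative utility rather than anything from a real library:

```python
# PACELC labels for the systems named above (rough classifications only).
PACELC_CLASS = {
    "RDBMS (ACID)": "PC/EC",
    "Bigtable": "PC/EC",
    "MongoDB": "PA/EC",
    "Hazelcast": "PA/EC",
    "PNUTS": "PC/EL",
    "Dynamo": "PA/EL",
    "Cassandra": "PA/EL",
}

def trades(pacelc: str) -> str:
    """Spell out a PACELC label in words, e.g. 'PA/EL'."""
    p, e = pacelc.split("/")
    during = "availability" if p == "PA" else "consistency"
    normally = "latency" if e == "EL" else "consistency"
    return f"during a partition favors {during}; otherwise favors {normally}"
```

For example, `trades(PACELC_CLASS["Cassandra"])` describes a system that keeps serving requests during a partition and keeps responses fast the rest of the time, at the cost of strict consistency.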
Know the tradeoffs
The bottom line is this: It’s critical to know exactly what you’re trading in a PACELC-guided application, and to know which scenarios call for which sacrifice. Here are three things to remember when making your decision:
- Consistency is most valuable where many users update the same data elements.
- Availability is critical for applications involving consumers (who get frustrated easily) and also for some IoT applications.
- Latency is most likely critical for real-time and IoT applications where processing delays must be kept to a minimum.
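As a crude, illustrative heuristic only, the three reminders above can be mapped to a starting-point PACELC class. A real decision weighs many more factors (data model, team experience, operational cost), so treat this as a conversation starter, not a rule:

```python
def suggest_pacelc(priority: str) -> str:
    """Map the dominant requirement to a candidate PACELC class.

    priority: 'consistency', 'availability' or 'latency'.
    """
    return {
        "consistency": "PC/EC",   # many users updating shared data
        "availability": "PA/EC",  # consumer-facing, outage-sensitive
        "latency": "PA/EL",       # real-time / IoT processing
    }[priority]
```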
Make your database choice wisely. Then, design your microservices workflows and framework to ensure you don’t compromise your goals.