An Introduction to Serverless Microservices
Microservices can be a bit confusing. This post outlines a few principles of microservices and explores how we might implement them using serverless.
Thinking about microservices, especially their communication patterns, can be a bit of a mind-bending experience for developers. The idea of splitting an application into several (if not hundreds of) independent services can leave even the most experienced developer scratching their head and questioning their choices. Add serverless event-driven architecture into the mix, which eliminates state between invocations and introduces a new per-function concurrency model that supports near limitless scaling, and it's not surprising that many developers find this confusing. 😕 But it doesn't have to be. 😀
In this post, we'll outline a few principles of microservices and then discuss how we might implement them using serverless. If you are familiar with microservices and how they communicate, this post should highlight how these patterns are adapted to fit a serverless model. If you're new to microservices, hopefully you'll get enough of the basics to start you on your serverless microservices journey. We'll also touch on the idea of orchestration versus choreography and when one might be a better choice than the other with serverless architectures. I hope you'll walk away from this realizing both the power of the serverless microservices approach and that the fundamentals are actually quite simple. 👊
The Monolith has Limits 🦍
The monolithic approach to building software is a tried-and-true method. A single codebase that can communicate internally, maintain application state, and take advantage of shared libraries not only seems highly logical but, in practice, more than adequately satisfies many use cases. While monoliths are often a small development team's first choice, as applications grow, we begin to encounter several problems with this approach.
Monolithic applications, by definition, are a single unit. This means that deploying a monolith to production is an all or nothing proposition. In some cases, this might not be a big deal. But as your application grows with new features and data structure changes, the likelihood of breaking something becomes much higher. Automated testing and User Acceptance Testing (UAT) can obviously help, but any changes to shared libraries, databases, or other system-wide dependencies can introduce significant risk.
Unless you are positive that your application will only be accessed by a few people, at some point, you need to think about scalability. In the Internet Age, applications can suddenly require a considerable amount of resources to handle incoming traffic. With monolithic applications, we must replicate the codebase across multiple servers and balance the load between them. Even if each server maintains application state, it is no longer shared across all servers, which depending on your application's design, may cause problems. You could use something like "sticky sessions" with your load balancer, but this won't help if a server fails and the user can no longer access the same machine.
Furthermore, it is unlikely that all parts of your application require the same amount of resources. So for example, if you have a component that needs to process images and perform resizing, spikes in traffic to that component will require you to scale your entire application, even if only one small part needs the extra resources. This can be incredibly inefficient, especially since your password reset component (that gets accessed only a few times a day) is now replicated across 50 servers.
Many small teams find it easy to work on a small codebase together. However, as the codebase gets a bit larger or the team grows to more than a few people, individuals or groups often begin to take ownership of system components. These splits could be geographic (if you have multiple offices), but even in smaller organizations, it's not uncommon for teams to form around specific parts of your application. This is almost certainly the case with larger organizations, which assign teams responsibility for individual services and products that make up a much larger overall system.
Monoliths can cause a number of issues in these scenarios. For example, shared libraries quickly become dependencies that are hard to change. If several teams are incorporating the database access layer library, a small change to it could potentially break another team's implementation. A significant amount of research may be needed to guarantee your change won't cause a problem, so many teams find it easier to create their own copy to ensure compatibility. This will often lead to deviations that almost never get reconciled. Also, managing deployments gets more difficult as multiple teams are merging in changes that need to go through the review and deployment process. A small bug fix to one team's component might need to wait for the weekly (or even more infrequent) deployment to get into production.
Another issue with monoliths is that you typically lock yourself into a narrow set of technologies. If your entire application is built in PHP, adding a component in Python becomes a bit more difficult. We could certainly have our servers support multiple programming languages, but that means we have to update all of our systems and worry about the extra processing required. The same is true of adding additional data sources. If some teams prefer Postgres over MySQL, our monolith now requires additional dependencies to support them, not only increasing the size of the codebase, but also adding another shared library with shared ownership.
The Flexibility of Microservices 🤸‍♀️
Microservices introduce us to a new kind of flexibility within our applications. They are loosely coupled services that are typically lightweight, highly modular, and self-contained. Before we talk about communication patterns, let's look at how microservices help us combat some of the limitations of monoliths.
Unlike monolith components, microservices should be independently deployable. This means that one team can own an entire service and release it fifty times a day if they want to. Releasing an updated version of the "billing service" only affects the billing service. As long as the team maintains a consistent (and backwards-compatible) interface for service-to-service communication (we'll talk about this more later), then what happens inside this service has no effect on the rest of the system.
In our example above, we mentioned an image processing and resizing component within our monolith. If a separate "image processing service" were created, it could scale appropriately to handle the load, either by horizontally scaling and load balancing the traffic, or even by simply provisioning a more powerful server to support the occasional spikes. This concern will almost completely go away when we introduce serverless, but let's not get ahead of ourselves. 😉
As service boundaries start to form around your application, components can be split into microservices, each with independent codebases. This makes it incredibly easy for teams to "own" a service. Shared libraries now must be installed versus simply required. This means that teams can take advantage of versioning, either by using package managers like NPM or Pip, or by tagging shared libraries with version numbers in your code repositories. If you need to make a change to the data access layer library, a new version could be created and released without forcing other teams to use the latest version. This may require some coordination, but doesn't introduce any immediate risk as in the monolithic approach.
Technology is almost never one size fits all. What may be a good fit for our billing system might be a horrible choice for our machine learning component. The same is true for database technologies. A NoSQL solution may work perfectly for our authentication system, but be of little use when it comes to our reporting engine. Microservices free us from adopting a single technology across our entire stack. Instead of limiting teams, we open up new possibilities by letting each team choose which programming languages, databases, and third-party libraries work best for them. There should be some practical limits (you probably don't want twenty different tech stacks in your organization), but overall, this flexibility provides you with many more options.
Loose Coupling and High Cohesion 🔗
Probably the two most important properties of microservices are the concepts of loose coupling and high cohesion.
Loose coupling means that services should know as little about the rest of the system as possible. A change to one service should have no effect on another service. In fact, you should be able to completely swap out a service (different tech, different database, etc.) so long as the communication interface is well-defined and maintains backwards-compatibility. Loosely coupled systems are highly flexible, and in most cases, easily scalable.
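To make the "swap out a service" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the AccountLookup interface, the two user service classes, and the billing_email_for helper are illustrative names, and plain dictionaries stand in for real databases. The point is only that the caller depends on the interface and on the fields it has always used, so the implementation behind it can change.

```python
from typing import Dict, Protocol


class AccountLookup(Protocol):
    """The only thing callers know about the user service: this interface."""

    def get_account(self, account_id: str) -> Dict[str, str]: ...


class SqlUserService:
    """Hypothetical original implementation; a dict stands in for a SQL table."""

    def __init__(self) -> None:
        self._rows = {"a-1": ("Ada", "ada@example.com")}

    def get_account(self, account_id: str) -> Dict[str, str]:
        name, email = self._rows[account_id]
        return {"account_id": account_id, "name": name, "email": email}


class DocumentUserService:
    """Hypothetical replacement with different internals.

    It adds a new field ("plan") but keeps every field the old version returned,
    which is what keeps the interface backwards-compatible.
    """

    def __init__(self) -> None:
        self._docs = {"a-1": {"name": "Ada", "email": "ada@example.com", "plan": "pro"}}

    def get_account(self, account_id: str) -> Dict[str, str]:
        return {"account_id": account_id, **self._docs[account_id]}


def billing_email_for(users: AccountLookup, account_id: str) -> str:
    # The billing side depends only on the interface and the "email" field.
    return users.get_account(account_id)["email"]


old_email = billing_email_for(SqlUserService(), "a-1")
new_email = billing_email_for(DocumentUserService(), "a-1")
```

In a real system the interface would be an HTTP API or a message format rather than a Python class, but the same rule applies: add fields freely, and never remove or rename the ones existing callers rely on.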
High cohesion means that common functionality should be contained within the same service. If we have multiple services that perform similar tasks, or have to communicate with each other very frequently, it often makes sense to combine them into a single service. This avoids duplicating efforts when changes occur and can reduce network traffic. On the other hand, if a service is performing multiple disparate functions, then splitting them into multiple services generally makes sense. This helps maintain service boundaries and avoids tightly coupling components.
Microservice Communication Patterns ☎️
If you follow the basic principles of microservice design, then you may be asking yourself, "what do I do if I need account information from my user service inside my billing service?" This is the question that often stumps developers and sends them running back to the monolith. You can't simply query the database for this information since you no longer have access to the database that stores it. This is why well-defined communication interfaces are so important to microservice architectures.
There are a number of different ways in which we can retrieve information from another microservice. A service can define an API that allows other services to retrieve information directly from it. Or, when changes are made that other services need to know about, the service can publish a message to a message bus notifying other subscribed services. Similarly, a service could send information to a queue and let another service consume the messages. These communications may be synchronous, where the caller waits for another service to return information, or asynchronous, where the caller simply triggers another process and moves on. Any of these communication patterns could be used to facilitate eventual consistency, the idea that data copied across multiple services converges to the same state over time rather than immediately.
Communication between microservices should be as lightweight as possible and should typically be aware of just the contents of the message being passed. The term "dumb pipes" is often used to describe this type of communication. The brains should be contained within the microservices themselves. This means that we should avoid adding logic to our message routing and simply use the basic pub/sub functionality within our chosen message brokers.
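As a rough illustration of "dumb pipes," the sketch below uses a tiny in-memory message bus in Python. It is not a real broker: MessageBus, the "account.updated" topic, and the billing handler are all made-up names for this example. The bus only delivers messages by topic, while the decision about what to do with a message lives in the subscribing service.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class MessageBus:
    """A deliberately 'dumb' pipe: it delivers messages by topic and nothing else."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        # No filtering, transformation, or business rules live in the pipe.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(dict(message))  # each subscriber gets its own shallow copy
        return len(handlers)


# Hypothetical billing service: it keeps its own copy of the data it needs.
billing_accounts: Dict[str, str] = {}


def billing_on_account_updated(message: dict) -> None:
    # The "brains" are here, inside the service, not in the bus.
    billing_accounts[message["account_id"]] = message["email"]


bus = MessageBus()
bus.subscribe("account.updated", billing_on_account_updated)

# Hypothetical user service announcing a change it has made.
delivered = bus.publish("account.updated", {"account_id": "a-1", "email": "new@example.com"})
```

After the publish call, the billing service holds its own copy of the account's email address, which is one simple way the eventual consistency described above can be achieved.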
Serverless Microservices ⚡️
We've barely scratched the surface of microservices here. I'm sure you still have several questions about error handling, resiliency, and much more. There is a great book called Building Microservices by Sam Newman that covers these topics in detail and is worth the read if you're heading down the microservices path. However, while serverless microservices share many of the same common patterns as microservices in general, there are some key differences (and advantages) to implementing them using serverless technology.
While we want to avoid creating "mini monoliths" that do too many things, microservices typically will follow a very similar pattern to monolithic approaches. The distinction being that they're obviously much smaller in scope, and focus on implementing limited functionality for a specific purpose. Serverless microservices are a bit different in this regard. While they too are small and specific, we generally take this a step further and create several serverless functions within each service, splitting functionality into even smaller components. This often has to do with the need for each function to handle different types of events.
In my post, Serverless Microservice Patterns for AWS, I outline 19 different patterns that demonstrate ways in which you can structure your microservices using available technologies at AWS. Many similar features are available through other providers as well, so they could most likely be adapted. I would suggest reading that post when you get a chance to see how you could incorporate these patterns into your microservice designs.
Splitting a service into several small functions has a number of advantages. A traditional microservice, for example, might have a subroutine to handle some complex calculation or process. As with monoliths, your entire service would need to scale in order to handle additional traffic to this one component. Since each serverless function should have a highly specific purpose, just that function needs to scale (which it would do automatically). This also means that the logic within that function is completely isolated, increasing code readability and maintainability. From a security standpoint, each individual function can be configured to limit access to other resources, helping to ensure that it can only perform its specific task. And speaking of configurability, each function can have its own memory settings and timeouts, which can dramatically increase performance and reduce overall costs.
Individual functions within a serverless microservice can even span VPCs or not be in one altogether. This allows you to do all kinds of interesting things when it comes to accessing resources as well as mitigating cold starts by appropriately isolating your functions. Functions can also communicate by synchronously or asynchronously invoking one another. Some people believe this to be an anti-pattern, but I disagree. Most microservices communicate with each other over HTTP, and invoking another function is no different. There are other ways to handle this too, which we'll discuss in a minute.
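On AWS, one way to invoke a function from another function is the Lambda Invoke API, where the InvocationType parameter selects a synchronous ("RequestResponse") or asynchronous ("Event") call. The Python sketch below assumes a client that behaves like boto3's Lambda client; the invoke_function helper and the function name in the usage note are hypothetical, and the client is passed in so the example does not depend on AWS credentials.

```python
import json
from typing import Any, Optional


def invoke_function(client: Any, function_name: str, payload: dict, wait: bool = True) -> Optional[dict]:
    """Invoke another function, either synchronously or asynchronously.

    `client` is assumed to behave like boto3's Lambda client,
    e.g. boto3.client("lambda").
    """
    response = client.invoke(
        FunctionName=function_name,
        # "RequestResponse" waits for the result; "Event" hands the call off
        # and returns without waiting for the target function to finish.
        InvocationType="RequestResponse" if wait else "Event",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    if not wait:
        return None  # fire-and-forget: there is no result to read
    # For synchronous calls, the result is a stream in the "Payload" key.
    return json.loads(response["Payload"].read())
```

With a real client, a call might look like invoke_function(boto3.client("lambda"), "billing-start-process", {"account_id": "a-1"}, wait=False), where "billing-start-process" is a made-up function name.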
DRY Principle (Don't Repeat Yourself) 🤐
Another concern that is often raised by developers who are new to serverless has to do with the duplication of functionality within each function. For example, if you have five functions that need to communicate with your database, then you have to include the database connection code in all five of your functions. Perhaps the same is true of your custom logging engine or other shared libraries. This is absolutely true, but the structure of our microservices makes this a non-issue.
Serverless functions are completely autonomous, and can run independently without needing access to any external dependencies at runtime. This means that any required libraries need to be packaged within each function. When we are building our serverless microservices, it is perfectly acceptable (and encouraged) to have a shared library within each service that individual functions can use to increase code reusability and minimize duplication. Even though there may be multiple functions, you typically deploy your entire microservice at the same time. Each function will include whatever components it needs from the shared library and package them during the deployment.
There is a clear distinction between this approach and that of sharing libraries across microservices. A shared library within a service is owned by the team who owns the microservice. They would also be responsible for the test suite that verifies that any changes to the shared library don't break other functions. As I mentioned earlier, if there is a shared module or component that multiple services use, it is best to package that and version it so that teams can upgrade to newer versions at their own pace.
Router and Receiver Functions 🚦
Another common pattern with serverless microservices, which also falls under the DRY principle, is to create what I call router and receiver functions. We may create an API for our billing service, for example. This API might be public and require credentials. We might create our own event parser for requests made through an API Gateway, or we could use a framework like Lambda API. Either way, we've created a relatively well-defined interface for other clients and services to communicate with. However, what if we need to trigger a billing process from another service and we don't want to communicate over a public API? Or, what if our billing service needs to "listen" for customer account changes? This is where router and receiver functions come in.
Since we can easily access a shared library within each service, we can create multiple interfaces into our services by simply reusing standard functionality we've already built. In the case of triggering a billing process, we might have an endpoint in our API that allows us to do that. Code reuse 101 tells us that things like database interactions, cache controls, event parsing, and the like, should be abstracted into reusable components. By taking this one step further and encapsulating an entire process into a reusable component, we can easily have our API function parse the event, call our component, and then format and return an appropriate response. We can then reuse that same component within another function that can receive events with a completely different format. Our "router" function can now be accessed directly by another function, bypassing our API Gateway and the overhead and latency that comes with it.
Similarly, "receiver" functions can utilize the same shared components to perform tasks, but the distinction here is that these functions would subscribe to a message bus or queue for event processing. In AWS, for example, a receiver function might subscribe to an SNS topic, allowing it to parse the unique format of an SNS message, execute the appropriate action, and then return a valid response. Functions like these, which accept different formats depending on the "pipe" through which an event was sent, allow us to keep our pipes dumb while giving us plenty of flexibility within our microservices.
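Here is a minimal sketch of that idea in Python, assuming AWS Lambda style handlers. The start_billing component and the account_id and amount_cents fields are hypothetical stand-ins; the event shapes follow the API Gateway proxy format (a JSON string in "body") and the SNS format (a JSON string in Records[].Sns.Message). All three handlers reuse the same shared component and differ only in how they parse the incoming event and shape the response.

```python
import json


def start_billing(account_id: str, amount_cents: int) -> dict:
    """Shared component: the whole 'start billing' process lives here.

    Hypothetical stand-in; a real version would talk to a database or a
    payment provider.
    """
    if amount_cents <= 0:
        raise ValueError("amount_cents must be positive")
    return {"account_id": account_id, "amount_cents": amount_cents, "status": "started"}


def api_handler(event: dict, context: object = None) -> dict:
    """Public interface: parses an API Gateway proxy event, returns an HTTP-style response."""
    body = json.loads(event.get("body") or "{}")
    try:
        result = start_billing(body["account_id"], body["amount_cents"])
    except (KeyError, ValueError) as err:
        return {"statusCode": 400, "body": json.dumps({"error": str(err)})}
    return {"statusCode": 200, "body": json.dumps(result)}


def router_handler(event: dict, context: object = None) -> dict:
    """Internal interface: invoked directly by another function with a plain payload."""
    return start_billing(event["account_id"], event["amount_cents"])


def receiver_handler(event: dict, context: object = None) -> list:
    """Subscribed interface: unwraps SNS records and runs the same component for each."""
    results = []
    for record in event["Records"]:
        message = json.loads(record["Sns"]["Message"])
        results.append(start_billing(message["account_id"], message["amount_cents"]))
    return results
```

Because the business logic sits in one place, adding another interface (a queue consumer, for instance) would mostly be a matter of writing another thin parser around start_billing.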
Orchestration versus Choreography 💃🕺
Orchestration and choreography are another often confusing pair of concepts in microservices. Orchestration is the idea of using some sort of controller to coordinate the behavior of multiple components or services. This is often used as a way to maintain state between multiple microservices. In general, I like to stay away from orchestration as much as possible. There are times when it may be necessary, but it has the negative side effect of tightly coupling services, making coordinating changes much more difficult and violating one of the central tenets of microservices: loose coupling.
Choreography, on the other hand, is a much more elegant way to handle the flow of information through your system. In many cases we don't need to wait for a response when calling other services. We may drop a message in a message bus or queue and then just go on about our business. Or, we may simply invoke another service directly depending on our use case. Either way, we are not waiting for these services to complete before the initiating service can continue. This could kick off a chain of events that may take a few seconds or a few hours to complete. As Sam Newman says, "each service is smart enough to understand its role in the whole dance."
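A small Python sketch can show what a choreographed chain looks like. The services, topics, and order ID below are invented for illustration, and the in-process publish function runs handlers inline, whereas a real message bus or queue would deliver them asynchronously. What matters is that no controller knows the whole flow: each service reacts to an event and, where appropriate, emits the next one.

```python
from collections import defaultdict
from typing import Callable, Dict, List

subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)
log: List[str] = []  # stands in for the tracing/monitoring you would need


def publish(topic: str, message: dict) -> None:
    # In-process stand-in for a message bus; a real one would deliver asynchronously.
    for handler in list(subscribers[topic]):
        handler(message)


def order_service_place_order(order_id: str) -> str:
    # The initiating service announces what happened and moves on.
    # It does not know (or care) which services are listening.
    publish("order.placed", {"order_id": order_id})
    return "accepted"


def billing_on_order_placed(message: dict) -> None:
    log.append(f"billing charged {message['order_id']}")
    # Billing knows its own role: once paid, announce it.
    publish("payment.completed", message)


def shipping_on_payment_completed(message: dict) -> None:
    log.append(f"shipping scheduled {message['order_id']}")


subscribers["order.placed"].append(billing_on_order_placed)
subscribers["payment.completed"].append(shipping_on_payment_completed)

status = order_service_place_order("o-42")
```

Note that the only feedback the caller gets is "accepted," which mirrors the trade-off of this approach: the response tells you the process has started, not that it has finished.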
There are, of course, disadvantages to the choreographed approach. For one, it must be asynchronous, meaning any immediate feedback would only be an indication that the process has started. You also need to track the events and monitor the entire process as it communicates between services. AWS has the ability to capture failed calls to Lambda functions using Dead Letter Queues (DLQs), which can make this easier to track. Plus, there are several serverless observability tools that can help in this regard as well. But like any distributed system, these are typical challenges that need to be addressed. In my opinion, the benefits far outweigh the risks.
As I mentioned earlier, sometimes orchestration is necessary, and even then, it is often to a very small degree. For example, I don't always consider a synchronous call to another service a form of orchestration. When using the Aggregator pattern, whether you are accessing one of your own services or a third-party API, you are technically orchestrating. But then that also means that making a database call would be orchestration too, which I don't think is a fair categorization. However, sometimes multiple services do need some help coordinating. That is where state machines come in.
State Machines 🤖
AWS and Microsoft Azure both have a concept of state machines. AWS calls them Step Functions and Azure calls them Durable Functions, but the underlying idea is the same. State machines can orchestrate the behavior of several functions, allowing you to add "state" between the steps (or transitions). This is a very powerful concept, especially when you are waiting for responses from several functions before moving on to the next step. State machines can also perform complex branching, fan-out/fan-in, wait timers, error handling and more. It is also important to note that these implementations of state machines are asynchronous, meaning that you wouldn't use them in a request/response scenario.
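For a sense of what a fan-out/fan-in state machine looks like, here is a sketch of a Step Functions definition in Amazon States Language, written as a Python dictionary so it can be serialized to JSON. The state names and ARNs are placeholders (the account ID, region, and function names are made up), and missing_targets is a hypothetical helper added only to sanity-check the transitions locally.

```python
import json

# Placeholder ARN prefix: the region, account ID, and function names are made up.
ARN = "arn:aws:lambda:us-east-1:123456789012:function:"

definition = {
    "Comment": "Fan out to two functions, then aggregate their results",
    "StartAt": "ProcessInParallel",
    "States": {
        "ProcessInParallel": {
            "Type": "Parallel",
            # Each branch is its own small state machine; all branches must
            # finish before the machine moves on to the "Next" state.
            "Branches": [
                {
                    "StartAt": "ResizeImage",
                    "States": {
                        "ResizeImage": {"Type": "Task", "Resource": ARN + "resize-image", "End": True}
                    },
                },
                {
                    "StartAt": "ExtractMetadata",
                    "States": {
                        "ExtractMetadata": {"Type": "Task", "Resource": ARN + "extract-metadata", "End": True}
                    },
                },
            ],
            "Next": "AggregateResults",
        },
        # Receives a list containing the output of each branch.
        "AggregateResults": {"Type": "Task", "Resource": ARN + "aggregate-results", "End": True},
    },
}


def missing_targets(machine: dict) -> list:
    """Return StartAt/Next targets that do not name a state at the same level."""
    states = machine["States"]
    targets = [machine["StartAt"]] + [s["Next"] for s in states.values() if "Next" in s]
    missing = [t for t in targets if t not in states]
    for state in states.values():
        for branch in state.get("Branches", []):
            missing.extend(missing_targets(branch))
    return missing


asl_json = json.dumps(definition, indent=2)
```

The state machine, rather than any individual function, holds the "state" between steps here: the two branch functions know nothing about each other or about the aggregation step that follows.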
I am a huge fan of state machines in that they allow you to stitch together multiple serverless functions almost as if you were building a service flow in a traditional application. They work really well for complex chains of actions, most notably the execution of parallel events and the ability to aggregate the results. I find that using state machines is best within a service boundary, performing a complex operation as part of a single microservice. I wish I could say that using state machines to coordinate multiple services is an anti-pattern, but I have seen implementations where complex data flows across multiple microservices are masterfully handled using them. In many cases, simple choreography would also work, but these are choices that would be specific to your implementation.
Final Thoughts 🤔
There is a lot to think about when building any new application, but in today's world, anything that you put on the Internet needs to consider the possibility of scale. A few weeks ago, one of my blog posts made it to number 7 on Hacker News. Within seconds, I had hundreds of concurrent users that ended up crashing the nginx service on my server. While I don't believe teams should be spending months optimizing for scale, they should certainly consider what their technology choices might mean for it. Microservice architectures are not always the right choice for small teams, especially if there isn't a lot of experience with them. However, starting to experiment and slowly replacing parts of your monolith (see the Strangler pattern), can help your teams become more agile and productive.
Coupling this approach with serverless not only gives you the benefits of microservice architectures, but also significantly mitigates the scaling problem, reduces your infrastructure management requirements, and can ultimately save you money by reducing over-provisioning of services. It takes a bit of planning, and sometimes a bit of rethinking the way we're used to building applications, but in the end, serverless microservices will provide you with unparalleled flexibility as you grow your application.
I hope you enjoyed this post. I know there was a lot of information in here, but I hope it gave you some insights into the benefits of the serverless microservices approach and how you might be able to get started with it. Please let me know your thoughts in the comments, or feel free to contact me with any questions or feedback.