How To: Tag Your Lambda Functions for Smarter Serverless Applications

As our serverless applications start to grow in complexity and scope, we often find ourselves publishing dozens if not hundreds of functions to handle our expanding workloads. It's no secret that serverless development workflows have been a challenge for a lot of organizations. Some best practices are starting to emerge, but many development teams are simply mixing their existing workflows with frameworks like Serverless and AWS SAM to build, test and deploy their serverless applications.

Beyond workflows, another challenge serverless developers encounter as their applications expand is simply keeping all of their functions organized. You may have several functions and resources as part of a microservice contained in its own git repo. Or you might simply put all your functions in a single repository for better common library sharing. Regardless of how code is organized locally, much of that structure is lost when all your functions end up in one long list in the AWS Lambda console. In this post we'll look at how we can use AWS's resource tagging to apply structure to our deployed functions. This not only gives us more insight into our applications, but can also be used to apply Cost-Allocation Tags to our billing reports. 👍

What's different about tagging serverless resources? 🤔

Tagging AWS resources is not new. In fact, it has been a staple in many organizations for everything from automation to security to cost tracking. AWS has a really good guide that outlines some tagging strategies and gives examples of the types of tags that can be used. But while some organizations may have made tagging a central part of their AWS infrastructure, serverless applications present new challenges for operations teams.

A core tenet of serverless is to embrace infrastructure as code. As the underlying hardware is abstracted away, much of the provisioning of resources now falls on the developer. This has a number of benefits, but at the same time requires more discipline across the organization to maintain security and other operational standards. Tagging is no exception. Even if your organization doesn't maintain strict resource tagging standards, it's certainly good practice for developers to adopt as they build out their serverless applications.

Tagging our Lambda functions 🏷

Lambda functions, like other AWS resources, allow tags to add metadata to our infrastructure. AWS has a guide that explains how to add tags using the AWS Console or the CLI. But in reality, we are most likely going to want to automate these and not apply them manually on every deployment. The two most popular solutions for managing Lambda deployments are the Serverless Framework and Amazon's own Serverless Application Model (SAM). There are other solutions, but we'll focus on these two for now.

SAM templates are relatively straightforward if you have some experience with CloudFormation. According to the documentation, tags are a "map of string to string" when applied to Lambda functions:

yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'
Resources:
  MyFunction:
    Type: 'AWS::Serverless::Function'
    Properties:
      Handler: index.handler
      Runtime: nodejs8.10
      CodeUri: 's3://my-bucket/function.zip'
      Tags:
        TAG1: Tag Value
        TAG2: Tag2 Value

The Serverless Framework is just as simple, plus it allows you to assign GLOBAL tags that will apply to all functions in your serverless.yml.

Global tags set at the provider level:

yaml
provider:
  name: aws
  runtime: nodejs8.10
  region: us-east-1
  tags:
    GLOBAL-TAG1: foo
    GLOBAL-TAG2: bar

Function-specific tags set at the function level:

yaml
functions:
  myFunction:
    name: myFunction
    handler: myFunction.handler
    tags:
      TAG1: Tag Value
      TAG2: Tag2 Value

Serverless also provides a way to add tags at the "stack" level using stackTags. This parameter is also at the provider level, but will add tags to every resource generated by your CloudFormation template. This includes any resource that supports tags including DynamoDB, S3 buckets, functions and more:

yaml
provider:
  name: aws
  runtime: nodejs8.10
  region: us-east-1
  stackTags:
    RESOURCE-TAG1: foo
    RESOURCE-TAG2: bar

Lambda tagging best practices 😎

AWS outlines a number of best practices for tagging strategies in the guide I mentioned earlier. Below are the few that apply more directly to Lambda functions as well as a few practices I recommend.

Always use a standardized, case-sensitive format for tags, and implement it consistently across all resource types.

Being consistent is important (especially with case) as this affects how you can query tagged resources. In the Lambda console, case doesn't have any effect on tags. For example, if one function is tagged with STAGE=dev and another is tagged with stage=DEV, both will show if you filter by either combination. This is NOT the case with the API and CLI as they are both case-sensitive and would only show the exact combination.
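
One way to enforce a consistent format is to normalize tags in a small script before they ever reach a template. The sketch below is only an illustration: the convention (upper-case keys, lower-case values for a chosen set of keys such as STAGE) and the function name `normalize_tags` are my own assumptions, not anything AWS or the frameworks prescribe.

```python
def normalize_tags(tags, lowercase_values_for=("STAGE",)):
    """Return a copy of `tags` that follows one hypothetical house convention.

    Keys are upper-cased. Values are lower-cased only for the keys listed in
    `lowercase_values_for`, so resource names such as queue names keep their case.
    """
    normalized = {}
    for key, value in tags.items():
        new_key = key.strip().upper()
        if new_key in normalized:
            # "stage" and "STAGE" would collapse into one key; refuse to guess.
            raise ValueError(f"duplicate tag key after normalization: {new_key}")
        new_value = str(value).strip()
        if new_key in lowercase_values_for:
            new_value = new_value.lower()
        normalized[new_key] = new_value
    return normalized
```

With this convention, `{"stage": "DEV"}` and `{"STAGE": "dev"}` both end up as `STAGE=dev`, so a case-sensitive CLI or API query finds both functions.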

Consider tag dimensions that support the ability to manage resource access control, cost tracking, automation, and organization.

There are some really cool things you can do with tags, Lambda, and access control, such as limiting the ability for certain roles to invoke functions, access other resources, etc. This is something to think about as applications grow and you want to apply more fine-grained access control. Cost tracking is another great way to use tags for Lambda. You can enable Cost-Allocation Tags that would allow you to track costs based on everything from individual Lambda function invocation, to cost per microservice, or even cost by HTTP method type.

Err on the side of using too many tags rather than too few tags.

Tags don't cost you anything, nor do they affect performance of your functions. The more tags you have, the more detailed you can get when using those tags. Querying across multiple dimensions can yield some interesting insights, so don't be afraid to add several tags. You can always remove the ones you don't need later, but you can't recapture past invocations.

Tag relationships and other dependencies

Lambda functions don't often exist in a vacuum. They are event-driven and require some type of trigger to invoke them. Tagging these relationships and dependencies is a great way to quickly see what services and resources are related. For example, adding a TRIGGER tag with a value of the SQS queue or SNS topic name is useful for grouping. Adding the API Gateway ROUTE that is used to access the function can make debugging easier.

Categorize functions by purpose

I like to use tags that quickly let me know the purpose of each function. Is the function for "data processing", "data enrichment", "image processing" or some other use? Having tags that can group functions by purpose is super helpful with managing costs. In addition, I like adding a PUBLIC tag with either a true or false value to let me know which functions are accessed via API Gateway or Lambda@Edge.

Tag microservices

If you are using a microservices design, I find it useful to tag all the functions that are part of that microservice with a unique name. This allows me to quickly see which functions are grouped together and break down costs by SERVICE.
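
If you adopt practices like these, a small check in your CI pipeline can catch functions that are missing tags. This is a minimal sketch that assumes your serverless.yml has already been parsed into a plain dict (with whatever YAML library you prefer), and that function-level tags are layered on top of provider-level tags. The required tag names and the helper name `missing_tags` are hypothetical.

```python
REQUIRED_TAGS = ("SERVICE", "PURPOSE", "PUBLIC")  # hypothetical team standard


def missing_tags(config, required=REQUIRED_TAGS):
    """Map each function name to the required tags it lacks.

    `config` is a dict shaped like a parsed serverless.yml. Provider-level
    tags are treated as defaults that function-level tags can override.
    """
    global_tags = (config.get("provider") or {}).get("tags") or {}
    report = {}
    for name, function in (config.get("functions") or {}).items():
        effective = {**global_tags, **(function.get("tags") or {})}
        missing = [tag for tag in required if tag not in effective]
        if missing:
            report[name] = missing
    return report
```

An empty result means every function carries the full set. Anything else can fail the build with a readable list of what to add.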

Tagging limitations 🙁

As with all good things that AWS brings us, there are a few limitations:

Resources are limited to 50 tags each
Tag keys and values are case-sensitive
Tag keys can contain up to 128 characters
Tag values can contain up to 256 characters
Tag keys cannot be blank
Tag values CAN be blank
Tag keys and values must satisfy the regular expression pattern: ^([\p{L}\p{Z}\p{N}_.:/=+\-@]*)$. This means no commas, asterisks, or semicolons.

The last one is particularly bothersome to me. I would prefer to tag API routes with values like /users/* to indicate that this function handles all user routes, but instead I often use /users/+. Also, you can't use commas, so you'll need to use a different strategy when adding multiple values.
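
The limits above are easy to check before a deployment fails on them. Here is a rough sketch in Python. The standard `re` module doesn't understand `\p{L}` style classes, so this version approximates the pattern with `unicodedata` categories (letters, separators, numbers) plus the listed punctuation. Treat it as an approximation of the rules rather than AWS's own validation, and note that `validate_tags` is a made-up helper name.

```python
import unicodedata

MAX_TAGS = 50
MAX_KEY_LENGTH = 128
MAX_VALUE_LENGTH = 256
EXTRA_ALLOWED = set("_.:/=+-@")


def _is_allowed_char(ch):
    # \p{L}, \p{Z} and \p{N} correspond to Unicode categories starting with L, Z, N.
    return unicodedata.category(ch)[0] in "LZN" or ch in EXTRA_ALLOWED


def validate_tags(tags):
    """Return a list of human-readable problems; an empty list means the tags look OK."""
    problems = []
    if len(tags) > MAX_TAGS:
        problems.append(f"{len(tags)} tags exceeds the limit of {MAX_TAGS}")
    for key, value in tags.items():
        if not key:
            problems.append("tag key is blank")
        if len(key) > MAX_KEY_LENGTH:
            problems.append(f"{key!r}: key is longer than {MAX_KEY_LENGTH} characters")
        if len(value) > MAX_VALUE_LENGTH:
            problems.append(f"{key!r}: value is longer than {MAX_VALUE_LENGTH} characters")
        bad = sorted({ch for ch in key + value if not _is_allowed_char(ch)})
        if bad:
            problems.append(f"{key!r}: disallowed characters {bad}")
    return problems
```

For example, `validate_tags({"ROUTE": "/users/*"})` reports the asterisk, while `validate_tags({"ROUTE": "/users/+"})` comes back empty.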

Speaking of multiple values, the documentation appears to be incorrect in that it shows an example of multiple values being separated by commas. This does NOT work, so I'm assuming the docs were written incorrectly, or it was changed at some point and they weren't updated. Either way, dealing with multiple values is a bit of a pain. You can't use semicolons either, so adding either a hyphen or a colon has been my preferred choice. As with case sensitivity, the Lambda console behaves a bit differently. It allows for partial value matches, but the CLI does not. So if you had a TRIGGER tag that stores multiple values like mySQSQueue:mySNSTopic, filtering on TRIGGER with a value of mySNSTopic would work in the console, but not with the CLI.
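
Since the CLI only matches whole values, one possible workaround is to query by tag key alone and then split the value yourself. The helpers below sketch that idea with a colon as the delimiter. The function names are my own invention, and the delimiter is just the convention described above.

```python
DELIMITER = ":"


def join_values(values):
    """Pack several values into a single tag value, e.g. for a TRIGGER tag."""
    for value in values:
        if DELIMITER in value:
            raise ValueError(f"value {value!r} contains the delimiter {DELIMITER!r}")
    return DELIMITER.join(values)


def split_values(tag_value):
    """Unpack a tag value written by join_values; a blank value yields an empty list."""
    return tag_value.split(DELIMITER) if tag_value else []


def has_value(tag_value, wanted):
    """True only if `wanted` is one of the packed values (no substring matches)."""
    return wanted in split_values(tag_value)
```

Matching on whole packed values avoids false positives, so looking for `mySNS` will not accidentally match `mySNSTopic`.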

Making use of tagged resources 🤓

As I mentioned earlier, a really great benefit of resource tags is the integration with billing. AWS gives detailed instructions on how to set this up here, so I won't cover that again. I will say that, for me, this is an invaluable feature. It is especially handy when analyzing the cost of worker functions that handle services like data processing pipelines.

The other really cool thing you can do is query your functions using the API or the CLI. This can make for some interesting tooling to integrate with your CI/CD pipeline or even cloud management and automation tools. The API is documented here and the CLI requires using the resourcegroupstaggingapi service with the get-resources command documented here.

I like using the CLI for quick and easy queries:

sh
aws resourcegroupstaggingapi get-resources --profile myprofile --tag-filters Key=ROUTE,Values=/users/+

You can also just query based on whether or not a tag exists:

sh
aws resourcegroupstaggingapi get-resources --profile myprofile --tag-filters Key=TRIGGER
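
If you'd rather build tooling around the API than shell out to the CLI, the same queries can be sketched in Python. This example assumes the third-party boto3 SDK and takes the client as a parameter so the function itself has no hard dependency on it. The helper name `find_arns_by_tag` is hypothetical, and the response fields (`ResourceTagMappingList`, `ResourceARN`, `PaginationToken`) are those of the Resource Groups Tagging API's GetResources operation.

```python
def find_arns_by_tag(client, key, values=None):
    """Return the ARNs of all resources carrying tag `key`.

    `client` is expected to behave like
    boto3.client("resourcegroupstaggingapi"). If `values` is given, only
    resources whose tag value exactly matches one of them are returned.
    """
    tag_filter = {"Key": key}
    if values:
        tag_filter["Values"] = list(values)

    arns = []
    token = ""
    while True:
        kwargs = {"TagFilters": [tag_filter]}
        if token:
            kwargs["PaginationToken"] = token
        response = client.get_resources(**kwargs)
        for mapping in response.get("ResourceTagMappingList", []):
            arns.append(mapping["ResourceARN"])
        # An empty or missing token means there are no more pages.
        token = response.get("PaginationToken", "")
        if not token:
            return arns


# Usage (requires boto3 and AWS credentials; "myprofile" mirrors the CLI example):
#   import boto3
#   session = boto3.Session(profile_name="myprofile")
#   client = session.client("resourcegroupstaggingapi")
#   print(find_arns_by_tag(client, "TRIGGER"))
```

Passing the client in also makes the function easy to exercise with a stub in unit tests, without touching a real AWS account.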

Tagging other resources 📦

Another great use of tags with serverless applications is to use them to group other resources together. This is particularly useful with microservice architectures. For example, if you have a microservice with four Lambda functions that use a Kinesis stream and a DynamoDB table, applying a SERVICE: my-microservice-name tag to all of them would be great for cost analysis purposes and for organization as well.
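
To see how a shared SERVICE tag pulls related resources together, here is a small sketch that groups resource ARNs by that tag. It assumes input shaped like the `ResourceTagMappingList` entries returned by the Resource Groups Tagging API (a `ResourceARN` plus a list of `Key`/`Value` pairs). The function name and the "(untagged)" bucket are my own choices for illustration.

```python
from collections import defaultdict


def group_by_tag(mappings, key="SERVICE", missing="(untagged)"):
    """Group resource ARNs by the value of one tag.

    Resources that don't carry the tag are collected under `missing`,
    which makes gaps in your tagging easy to spot.
    """
    groups = defaultdict(list)
    for mapping in mappings:
        tags = {tag["Key"]: tag["Value"] for tag in mapping.get("Tags", [])}
        groups[tags.get(key, missing)].append(mapping["ResourceARN"])
    return dict(groups)
```

Feeding it the functions, stream, and table of a microservice should produce a single group keyed by the service name, with anything you forgot to tag showing up separately.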

The good news is that CloudFormation supports Kinesis and DynamoDB table tagging. This means that you can add the tags to your serverless.yml or SAM templates to automate this process. CloudFormation does not support SQS or API Gateway tags yet, although both services support tags. If you wanted, you could tag these resources manually through the console or use the resource tagging API. Not the best scenario, but useful if you want the benefits of grouping all related resources.

SNS does not support tags at all 😡, which doesn't make sense, but I'm sure it eventually will. Until it does, adding a tag that references any dependent topics would be my preferred workaround.

Final Thoughts

I hope you see the benefits of tagging and how it can help you organize and analyze costs for your serverless applications. There are certainly other ways to achieve organization using naming conventions, for example, but I prefer tagging as it gives you a tremendous amount of flexibility around automation, access, and cost analysis as well. If you're not using a tagging strategy with your serverless applications, it's definitely something you should consider. 👊

What are your tagging strategies and best practices for your Lambda functions? Are you using tags to automate or control access? Let me know in the comments!

If you're looking for other serverless and Lambda best practices, check out the 15 Key Takeaways from the Serverless Talk at AWS Startup Day. Also, be sure to read my Securing Serverless: A Newbie's Guide to make sure your serverless apps are following security best practices.
