Securing Serverless: A Newbie's Guide
A practical guide to security for your serverless applications. Learn best practices, how to mitigate risks, and how to keep your users' data safe.
So you've decided to build a serverless application. That's awesome! May I be the first to welcome you to the future. 🤖 I bet you've done a lot of research. You've probably even deployed a few test functions to AWS Lambda or Google Cloud Functions and you're ready to actually build something useful. You probably still have a bunch of unanswered questions, and that's cool. We can still build some really great applications even if we only know the basics. However, when we start working with new things we typically make a bunch of dumb mistakes. While some are relatively innocuous, security mistakes can cause some serious damage.
I've been working with serverless applications since AWS launched Lambda in early 2015. Over the last few years I've developed many serverless applications covering a wide range of use cases. The most important thing I've learned: SECURE YOUR FUNCTIONS! I can tell you from personal experience, getting burned by an attack is no bueno. I'd hate to see it happen to you. 😢
To make sure it doesn't happen to you, I've put together a list of 🔒Serverless Security Best Practices. This is not a comprehensive list, but it covers the things you ABSOLUTELY must do. I also give you some more things to think about as you continue on your serverless journey. 🚀
The Basics
Serverless is a new paradigm when it comes to building, deploying and maintaining applications. While there are some major benefits of using serverless (like no more patching or worrying about long-running compromised servers), it also introduces additional complexities in how we manage security and maintain our applications. Serverless also isn't a magic bullet against OWASP's Top Ten security risks (2017). All of these risks are still relevant and may even be harder to detect.
Below are the absolute minimum steps you need to take to protect your serverless application. I'm not trying to scare you, but try to imagine all life as you know it stopping instantaneously, and every molecule in your body exploding at the speed of light. That won't actually happen if you fail to follow these recommendations, but it will feel like it. 💥
Least Privilege Principle
You've probably heard this before, especially if you've worked with AWS or other cloud services. The concept is pretty simple: "every module must be able to access only the information and resources that are necessary for its legitimate purpose." (Wikipedia). Every AWS Lambda function requires an IAM Role. Make sure that you assign only the permissions that the function MUST have. People love to use things like `Action: "sns:*"`. That means the function can do ANYTHING to the Simple Notification Service, including creating new topics, deleting topics, and sending SMS messages. If your function doesn't need to do everything (which it almost certainly doesn't), then don't use a damn wildcard for permissions.
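To make that concrete, here's a minimal sketch of a scoped-down role in a `serverless.yml` (the runtime, topic name, region, and account ID are placeholders I made up), assuming the function only needs to publish to a single SNS topic:

```yaml
# serverless.yml (sketch): grant ONLY sns:Publish on ONE topic, instead of sns:*
provider:
  name: aws
  runtime: nodejs8.10
  iamRoleStatements:
    - Effect: Allow
      Action:
        - sns:Publish # the single action this function actually needs
      Resource:
        - arn:aws:sns:us-east-1:123456789012:order-notifications # one specific topic
```

If the function later needs another permission, add that one action explicitly rather than reaching for a wildcard.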
Beware of Third-Party Packages
Isn't it great when other people write libraries for us so we don't have to write our own? Yes, it is, but did you know that most of the people who write these packages aren't very good at security? In the summer of 2017, a post on Github discussed an exploit that allowed the author to gain access to NPM accounts through a series of relatively simple attacks. This gave him the ability to affect nearly 54% of the NPM ecosystem. If a malicious user had taken advantage of this, your application could have been compromised simply because you used `require` to pull in a dependency.
Does that mean I can't use third-party dependencies in my serverless apps? Well, yes and no. There are some pretty amazing packages out there (not just for Node.js), but as a developer, you need to be aware of the security risks, especially when it comes to dependency chains. My first step when evaluating 3rd party packages is to look at the number of dependencies. If there are a lot of them (especially from multiple authors), then the risk is higher. You also need to look at the dependencies' dependencies. Then the dependencies' dependencies' dependencies. This could obviously go on and on and drive you insane. I typically avoid packages with lots of dependencies. If I find a package that does a lot of stuff, I'll often look at what packages it uses, and implement my own solution using its components. Sure it's more work, but it makes me feel warm and fuzzy.
But I have to use XYZ package because it does the most awesomest thing ever!!! Okay, I get it, I've been there too. If you are going to use a third-party module with a lot of dependencies, then use package locks or `npm shrinkwrap`. This will allow you to "lock" your dependencies so that no new updates creep into your code until you've had a chance to review them. If you are using `nvm` (Node Version Manager) to develop your local Node.js apps against v6.10 (or earlier) to match your provider's runtime, just remember that package locks weren't automatic until npm v5, so you'll have to implement this yourself.
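For reference, here are the commands I'm talking about, as a minimal sketch (the package name is just a placeholder):

```bash
# Pin an exact version instead of a semver range
npm install --save-exact some-package

# On npm < 5, generate npm-shrinkwrap.json to lock the full dependency tree
npm shrinkwrap

# Inspect a candidate package's direct dependencies before adopting it
npm view some-package dependencies

# See where a package (and its dependencies) ended up in your installed tree
npm ls some-package
```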
See also A9:2017-Using Components with Known Vulnerabilities
Protect User Data ✋
Users are trusting you with their private information, whether that be their email address, phone number, credit card data, or pants size. You have a responsibility to keep that data safe and secure. Your serverless application is obviously going to need access to data to do anything exciting. This means you'll be passing data between other cloud services like Elasticache, RDS, and DynamoDB as well as potentially using other third-party APIs and webhooks. Data in transit should always be encrypted using a secure protocol like TLS. Traffic to and from your API Gateway always uses HTTPS, which is great, but make sure any front-end pages accessing that data are running HTTPS as well. DynamoDB now lets you automatically encrypt data at rest, but if you're using something like MySQL, make sure to hash passwords with a strong salted hashing function and encrypt other sensitive data with a secure key.
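If you end up handling password hashing yourself (a battle-tested library like bcrypt is usually the better choice), here's a minimal sketch using Node's built-in `crypto` module; note that `scryptSync` requires Node 10.5+, and the helper names are my own:

```javascript
const crypto = require('crypto');

// Hash a password with a random per-user salt (scrypt is built into Node 10.5+)
const hashPassword = (password) => {
  const salt = crypto.randomBytes(16).toString('hex');
  const hash = crypto.scryptSync(password, salt, 64).toString('hex');
  return `${salt}:${hash}`; // store salt and hash together
};

// Verify a login attempt against the stored salt:hash value
const verifyPassword = (password, stored) => {
  const [salt, hash] = stored.split(':');
  const candidate = crypto.scryptSync(password, salt, 64).toString('hex');
  // timingSafeEqual avoids leaking information through comparison timing
  return crypto.timingSafeEqual(Buffer.from(hash, 'hex'), Buffer.from(candidate, 'hex'));
};
```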
Another common mistake is exposing sensitive data through logs and alerts. If you're using a third-party framework, it's possible that built-in logging (which is great for development) can be a huge security hole. Exposing clear text passwords and other sensitive data to logs opens up another attack vector for hackers. Alerts can be even more dangerous. I always fire off an alert when something goes wrong with one of my serverless apps, but I never send sensitive data or a full stack dump. That information gets sent via SMS and/or email and becomes much easier to steal. Plus you're now trusting your email and mobile providers with your customers' sensitive data.
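One easy habit is to strip sensitive fields before anything gets logged or alerted on. A minimal sketch (the field list is just an example):

```javascript
// Remove sensitive fields from an object before logging or alerting on it
const REDACTED_FIELDS = ['password', 'creditCard', 'ssn', 'token'];

const redact = (data) => {
  const clean = { ...data };
  for (const field of REDACTED_FIELDS) {
    if (field in clean) clean[field] = '[REDACTED]';
  }
  return clean;
};

console.log(JSON.stringify(redact({ email: 'user@example.com', password: 'hunter2' })));
// => {"email":"user@example.com","password":"[REDACTED]"}
```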
See also A3:2017- Sensitive Data Exposure
You might say to yourself, "that's cool, I'll just disable logging altogether and run my application in the dark." Au contraire mon frère, because...
Logging is your Best Friend 📝
Another major problem with serverless applications is that logging is up to the developer to implement. This means that unless you `console.log` something, your application executes and then fades into the wind. Server-based applications typically have all kinds of logging that we can use to determine if something nefarious is happening. With serverless apps, we need to build our own logging mechanism so that we can properly monitor our app.
Here are a few suggestions on things you should definitely log:
- Logins: be sure to log things like the IP address, device, etc.
- Failed logins: log the number of failed attempts, IP address, device type, etc.
- Account modifications: things like updated passwords, email changes, etc.
- Other database interactions: confirmation of inserted, modified and deleted records
- Financial transactions: e.g. credit cards, PayPal, Apple Pay, etc.: log transaction numbers, IP address, the amount, and the user account
In addition to simply logging information, make sure you can actually DO something with this data. While capturing the number of failed login attempts is helpful for a forensic audit, it will do little good if someone is brute force attacking your authentication system. Start by limiting logins from the same username to a maximum number of failed attempts. Next, keep a running counter of login failures from the same IP block. If that number reaches some threshold, send yourself an alert so you can investigate. Speaking of thresholds, things like database connections, queries per second, memory consumption, and average execution time are good indicators of suspicious activity. Set appropriate alarms in AWS CloudWatch so that spikes in these metrics will notify you immediately.
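As a rough illustration of the failed-login threshold idea, here's a sketch using Redis via the `ioredis` client; the key names, limits, and the `sendAlert` stub are all my own inventions, not any framework's API:

```javascript
const Redis = require('ioredis');
const redis = new Redis(process.env.REDIS_URL); // reused across warm invocations

const MAX_FAILURES = 10;        // threshold before we alert
const WINDOW_SECONDS = 15 * 60; // rolling 15-minute window

// Hypothetical alert helper; in practice this might publish to an SNS topic
const sendAlert = async (message) => console.error('ALERT:', message);

const recordFailedLogin = async (username, ipBlock) => {
  // Structured log entry that CloudWatch (or your log aggregator) can query later
  console.log(JSON.stringify({ event: 'failed_login', username, ipBlock }));

  // Count failures per IP block within the window
  const key = `failed-logins:${ipBlock}`;
  const failures = await redis.incr(key);
  if (failures === 1) await redis.expire(key, WINDOW_SECONDS);

  if (failures >= MAX_FAILURES) {
    await sendAlert(`Possible brute force from ${ipBlock}: ${failures} failed logins`);
  }
};
```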
Finally, make sure you capture USEFUL error messages. It's fine to return a "500: something went really wrong here" message to your users, but your system should capture as much detail as possible. This should include the stack trace, the input supplied (minus clear text passwords), the state of the application, the logged-in user, and any other data that you can capture. When a major error occurs, send yourself a summary alert, but be sure not to include any sensitive data.
See also A10:2017- Insufficient Logging & Monitoring
Write Good Code 👨🏻‍💻
This is just plain ole good advice. Writing secure, well-tested code is critically important to securing your application. Code should be defensive, meaning it should be expecting someone to feed it bogus data. If the data pattern is unexpected, it should throw an error and notify you immediately. Your code should sanitize and escape all user-supplied data. As Fox Mulder once said, "trust no one, and be careful of SQL injection attacks." I'm paraphrasing, of course. But seriously, people will throw all kinds of junk at your app. You need to make sure you strip out or escape potentially malicious code (like SQL commands and `<script>` tags), set maximum string lengths (so many people forget this), and validate inputs with type and range checks.
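Here's a minimal sketch of that kind of defensive checking; the field names, limits, and escaping rules are invented for the example:

```javascript
// Validate and normalize untrusted input before doing anything else with it
const validateCommentInput = (input) => {
  const { userId, rating, comment } = input;

  // Type and range checks
  if (!Number.isInteger(userId) || userId <= 0) throw new Error('Invalid userId');
  if (!Number.isInteger(rating) || rating < 1 || rating > 10) throw new Error('Rating must be 1-10');

  // Length check plus basic escaping of characters commonly used in injection payloads
  if (typeof comment !== 'string' || comment.length > 200) throw new Error('Comment too long');
  const escaped = comment.replace(/[<>'"`;]/g, (c) => `&#${c.charCodeAt(0)};`);

  return { userId, rating, comment: escaped };
};
```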
Here are some other things you should consider when writing code for serverless apps:
- Use proper parsing: Never use something dumb like `eval()`. `eval()` could execute something on the backend and wreak havoc. Use `JSON.parse()` instead; it will only convert properly formatted user-supplied JSON.
- Minimize side effects: Try to write pure functions with no side effects (other than logging). Impure functions mutate variables, state and data outside of their lexical scope, which can make debugging and testing code very difficult. Given the same input, a pure function will return the same output every single time. This makes writing predictable tests much easier and allows you to simply add new tests when you start logging real-world input to your functions.
- Be careful of frozen connections and variables: AWS Lambda "freezes" connections and variables outside of your main handler function and reuses these variables in successive executions. This feature is awesome because we can reuse database connections and save time by not needing to re-establish a connection every time the function runs. However, improperly used, this feature can leak data between user accounts and cause debugging headaches and corrupted user data. Never assign data specific to a user in a variable outside your main handler's scope.
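To make that last point concrete, here's a sketch of the pattern; the table name and event shape (an API Gateway proxy event) are assumptions for the example. The database connection lives outside the handler so it can be reused, while anything user-specific stays inside the handler:

```javascript
const mysql = require('mysql2/promise');

// OK outside the handler: the connection survives between warm invocations
let connection;

// NOT OK outside the handler: this would leak one user's data into the next invocation
// let currentUser;

exports.handler = async (event) => {
  if (!connection) {
    connection = await mysql.createConnection({
      host: process.env.DB_HOST,
      user: process.env.DB_USER,
      password: process.env.DB_PASS,
      database: process.env.DB_NAME
    });
  }

  // User-specific data is scoped to this invocation only
  const userId = event.pathParameters.id;
  const [rows] = await connection.execute('SELECT * FROM Users WHERE user_id = ? LIMIT 1', [userId]);
  return { statusCode: 200, body: JSON.stringify(rows[0] || {}) };
};
```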
See also A8:2017- Insecure Deserialization
Access should be a Privilege, not a Right
Chances are you'll be exposing functions to the public through an API. This means that anyone with an Internet connection can start banging up against your system, some for legitimate purposes, others just because they're jackasses. While we want our actual users to have a smooth experience, let's not make it easy for those with ill intent to take advantage of us. This means implementing proper authentication and access control.
If you're not familiar with authentication, read up on things like OAuth, JWT, and Bearer tokens. You need to make sure that you use an authentication method to authenticate EVERY endpoint. This gets tricky with serverless apps because you need to build authentication into every function that gets accessed directly while accounting for the ephemeral nature of your functions. Unlike a server-based application, there is no session management built in to serverless. I typically store active tokens in Redis and check them against every request. This lets me enforce token timeouts, count invocations, and manually expire tokens. If you are new to this, I do not suggest that you build your own authentication system. AWS Cognito is a good solution and fairly easy to implement.
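As a rough sketch of that token check, assuming tokens were written to Redis at login under a key like `token:<value>` (my own convention, not a standard), and assuming an API Gateway proxy event:

```javascript
const Redis = require('ioredis');
const redis = new Redis(process.env.REDIS_URL);

// Verify the Bearer token on EVERY request before doing any real work
const authenticate = async (event) => {
  const header = (event.headers && event.headers.Authorization) || '';
  const token = header.replace('Bearer ', '');
  if (!token) return null;

  // Tokens are stored with a TTL at login, so expired tokens simply disappear
  const session = await redis.get(`token:${token}`);
  return session ? JSON.parse(session) : null;
};

exports.handler = async (event) => {
  const user = await authenticate(event);
  if (!user) return { statusCode: 401, body: JSON.stringify({ error: 'Unauthorized' }) };

  // ...authenticated work happens here...
  return { statusCode: 200, body: JSON.stringify({ message: `Hello, ${user.username}` }) };
};
```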
A word of advice: don't rely solely on API keys if you are allowing users to modify data. Backend API calls for certain types of systems make authentication easier by using a static API key, but these can get compromised easily. If you do allow keys, limit what they can do and then provide additional authentication for actions that can destroy or modify data. Also, be aware of CSRF and never use something like cookie-based authentication.
Now that we've locked down access to the API itself, we also need to be aware of what our users can do once they're authenticated. Building in ACLs, or Access Control Lists, is a great way to add extra security to your API. This is obviously a much larger discussion, but the bottom line is that not every user should be able to do everything. If you have admin functions built into your API, you want to make sure that an average user doesn't have those same rights. Quick and dirty solution: assign a list of "permission ids" to each user, cache that with their token, protect every action in your system by checking against that list.
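The quick-and-dirty version of that check might look something like this; the permission ids are invented, and `user` is the object cached alongside the token in the previous sketch:

```javascript
// Throw if the authenticated user doesn't hold the required permission id
const requirePermission = (user, permission) => {
  const permissions = user.permissions || []; // e.g. ['users:read', 'orders:write']
  if (!permissions.includes(permission)) {
    const err = new Error('Forbidden');
    err.statusCode = 403;
    throw err;
  }
};

// Usage inside a handler: only admins get to delete accounts
// requirePermission(user, 'accounts:delete');
```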
See also A2:2017-Broken Authentication and A5:2017-Broken Access Control
Protect Your Keys, Usernames and Passwords
Remember that time we talked about the Least Privilege Principle? The reason that's so important is because access keys get leaked all the time. Sometimes someone does something really stupid and checks them into Github, other times someone hardcodes it into a script on a server that gets compromised. And third-party modules? Yeah, they can easily expose keys as well. Does this mean we should just give up on life? Of course not. But we do need to take steps to make sure that our keys are as secure as possible.
Here are some suggestions to keep access keys safe and minimize security risks to our applications:
- Every developer should have separate keys: This is a little more work, but it makes it easy to shut off someone's access AND it is great for limiting what developers can do with their keys.
- Have separate keys for separate projects/products/components: Again, this is a little more work, but having different keys for different concerns mitigates the risk of a compromised key, limiting the scope of damage.
- Rotate keys on a regular basis: I know, this one is a lot of work, but it is extremely important to keeping our apps secure. Think about it, real hackers don't want you to know that you were hacked. Most are in this for profit, not glory, which means stealth attacks are much more common and therefore harder to detect. Not every compromised key will result in spinning up hundreds of virtual machines to mine Bitcoin, leaving most victims unaware that their customer data is being stolen. If you rotate your keys on a monthly (or more frequent 😬) basis, you can shut off a hacker's access (even if you don't know you were hacked). Pick someone to be your "keymaster", call them Vinz Clortho, laugh at them because they have this crappy job, and then make them rotate keys on a regular basis and securely distribute them to developers.
- Follow the Least Privilege Principle: I'm going to keep repeating this until you get it stuck in your head like a Bee Gees' song. Developers almost certainly don't need to be able to create EC2 instances or VPCs with their access keys. In the extremely rare case that they do, create separate keys in addition to their normal developer keys. And for the love of all that is holy, DO NOT ever use the wildcard `*` for resources or actions!
- Separate development and production environments: We'll talk about this a bit more later, but limiting access to production environments is Cloud Security 101. Many companies create multiple cloud service accounts that completely separate development resources from production ones. This means that we developers can make all kinds of dumb mistakes without waking up the next day with the "we just got hacked and it was my fault" hangover. 🤦🏻‍♂️
Great, so now our keys are relatively safe, but what about usernames and passwords to our databases, external API keys, and other sensitive information? For some reason, many developers' first instinct is to hard-code these into their scripts. Don't do that. First of all, you're most likely going to check that into your git repository, which isn't very smart. And second, every developer with access to your code repository will know the credentials for that service... forever... until you change it. Both of these open up security risks that can expose clear text credentials for systems that store user and other sensitive data.
Here are some tips to securing your credentials and protecting backend systems:
- Use AWS Systems Manager Parameter Store to store credentials: For AWS users, you can store encrypted values in the SSM Parameter Store and then give your Lambda functions access to them. It's also possible to store these as environment variables so they do not need to be queried at runtime. There's a quick sketch of this right after the list.
- Again, Least Privilege Principle: The same thing applies when accessing other backend systems. If your application only needs to `insert` and `select` data from your MySQL database, create a user and password that ONLY has those permissions. Bonus if you limit it to certain tables too. Try using something like AWS Secrets Manager to automate this.
- Separate development and production: I may sound like a broken record, but having separate systems (preferably in different accounts) that provide less restrictive access to wild and crazy developers, and Fort Knox-esque protection to production systems, significantly limits your risk of compromised credentials.
Implement CORS
Not the beer. 🍺 Cross-Origin Resource Sharing, or CORS, is a "mechanism that uses additional HTTP headers to let a user agent gain permission to access selected resources from a server on a different origin (domain) than the site currently in use." (MDN Web Docs) Essentially, these extra headers tell the web browser whether or not your API is accessible from the domain it is calling from. CORS does not apply to programmatic API access (e.g. cURL calls), but it is a very important security component when accessing your API from a web browser.
A web browser will send a preflight `OPTIONS` request to your API. Your API should respond with headers like `Access-Control-Allow-Origin`, `Access-Control-Allow-Methods` and `Access-Control-Allow-Headers`. The `Access-Control-Allow-Origin` header tells the browser which domains can access the API. If the current domain doesn't match, the browser logs an error. This is an important security feature. First, you most likely don't want someone else building a tool that duplicates access to your API. This can expose your users to all kinds of phishing attacks and other ways to compromise security. Second, a rogue script or plugin could attempt to steal user tokens and call your API on their behalf. CORS isn't foolproof, but it is a piece of the larger security puzzle.
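With a Lambda proxy integration behind API Gateway, you typically return those headers yourself. A minimal sketch, where the allowed origin is a placeholder for your real front-end domain:

```javascript
// Only allow browser access from our own front-end domain
const CORS_HEADERS = {
  'Access-Control-Allow-Origin': 'https://app.example.com', // placeholder domain
  'Access-Control-Allow-Methods': 'GET,POST,OPTIONS',
  'Access-Control-Allow-Headers': 'Content-Type,Authorization'
};

exports.handler = async (event) => {
  // Answer the browser's preflight check
  if (event.httpMethod === 'OPTIONS') {
    return { statusCode: 204, headers: CORS_HEADERS, body: '' };
  }

  return {
    statusCode: 200,
    headers: CORS_HEADERS,
    body: JSON.stringify({ message: 'Hello from a CORS-aware function' })
  };
};
```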
Common Attack Vectors
Earlier we discussed writing good code and how we should never trust data sent by a user to our systems. The reason for this is due to a number of attacks that don't require compromising infrastructure security, but instead, simply take advantage of poorly written code. There are several common types of attacks, but the two most popular are Injection Attacks and Cross-Site Scripting Attacks (XSS).
Injection Attacks
Injection attacks can take many forms, but with web-accessible APIs it typically involves an attacker sending SQL or system commands through your existing parameters. For example, if you return a user's data using `SELECT * FROM Users WHERE user_id = ${request.user};`, an attacker could pass in `request.user` as `1 OR 2`, allowing them to gain access to another user's data. Or they could send in `1; DELETE FROM Users;`, which would delete all the data from your `Users` table! These types of attacks can affect NoSQL, ORMs, OS commands, and others.
Serverless also introduces a new injection attack vector called Event Injection. Since serverless is event-driven, there are even more ways that a function can be triggered and invoked with untrusted user-supplied data. Read my post Event Injection: A New Serverless Attack Vector for more information.
The good news is that these types of attacks are relatively easy to thwart. Be sure to:
- Use prepared statements: This parameterizes inputs instead of concatenating user data into SQL queries (there's a sketch right after this list).
- Escape all user input: `;DELETE FROM Users` becomes `';DELETE FROM Users'`; the added single quotes make a big difference.
- Add LIMITs to queries: If you expect only ONE result (like retrieving a user's info), then `LIMIT` your query to ONE record.
- Check type, range, and length: If you are expecting a number between 1 and 10, validate that the input is a number between 1 and 10. If your text field has a maximum of 200 characters, make sure the input doesn't exceed 200 characters.
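Here's a quick sketch of the difference using the `mysql2` client (the `connection` argument would come from `mysql.createConnection()`, as in the earlier frozen-connection sketch):

```javascript
// Vulnerable: user input is concatenated straight into the SQL string,
// so an attacker controls the query (e.g. "1 OR 2" returns other users' rows).
const getUserUnsafe = async (connection, userId) => {
  const [rows] = await connection.query(`SELECT * FROM Users WHERE user_id = ${userId}`);
  return rows[0];
};

// Safe: the input is bound as a parameter, and LIMIT 1 caps the result
// since we only ever expect a single record back.
const getUser = async (connection, userId) => {
  const [rows] = await connection.execute(
    'SELECT * FROM Users WHERE user_id = ? LIMIT 1',
    [userId]
  );
  return rows[0];
};
```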
See also A1:2017- Injection
XSS - Cross-Site Scripting Attacks
XSS attacks are usually the second phase of a successful injection attack. They come in a few flavors: Reflected XSS, where unsanitized user input is returned back in an HTML response; Stored XSS, where unsanitized data is stored in a system's datastore and later returned and displayed to a user; and DOM-based XSS, where SPAs and JavaScript apps dynamically execute malicious code. Following the suggestions above for preventing injection attacks is a good first step, but your front end can be vulnerable even if the backend is secure. A successful XSS attack can effectively take over a user's session and give an attacker authorized access to your API.
See also A7:2017- Cross-Site Scripting (XSS) for ways to mitigate XSS in your front-end systems.
DoS Attacks
I don't want to spend a lot of time on this topic, but it is worth mentioning. I'm sure you already know this, but a DoS (or Denial-of-Service) attack is when an attacker tries to make your service unavailable by flooding it with requests. A DDoS (or Distributed Denial-of-Service) attack is the same thing, just from multiple sources. While DoS attacks are typically not a major "security" risk, meaning they are unlikely to result in a system or data breach, there are some things to consider in regard to your serverless application.
- Serverless applications can scale almost indefinitely: This is good news and bad news. While your application might be able to scale up to defeat a DoS attack, your wallet might not. Thousands of requests per second could rack up some HUGE bills.
- Data sources have a max capacity too: Even if you scale up your functions to handle a DoS attack, you still run the risk of overwhelming your backend data stores. Caching can help, but you should think about rate limiting database calls per user as well.
- You can rate limit your API: AWS API Gateway lets you rate limit the number of API calls per second. This can help to mitigate charges, but it still results in your service being unavailable to your users during an attack. AWS supposedly automatically protects against DDoS attacks, but I'm not sure to what extent.
Turning Security up to an 11 🔊
If you've done the basics, which I agree is a lot to do, then you're well on your way to having a secure serverless application. If you'd like to take your security to the next level, then here are a few more suggestions.
Use the ⚡Serverless Framework
The Serverless framework is amazing! I use it with every serverless project that I work on because it makes organizing, deploying, and securing my applications a lot easier. Learn more at Serverless.com.
Implement CI/CD
Continuous Integration and Continuous Deployment go hand in hand with separating development and production environments. Code reviews, automated testing, and automatic deployment to a production system will help to ensure that production keys, usernames/passwords, and other sensitive credentials aren't unnecessarily exposed.
Create Different IAM Roles per Function
Least Privilege Principle!!! Most functions have different needs; creating a single role for all the functions in your application can open up security holes. By creating a different role for every function, especially on a team with multiple developers, you mitigate risk by restricting each function to its intended purpose.
Delete Old Functions
Old functions are a liability. As soon as a function is no longer necessary, remove it from your cloud service and delete its IAM role. You can always redeploy it later. Old functions can contain stale code that could compromise updated data structures, bypass new security enhancements, and more. Avoid the risk and remove the function.
Where Do We Go From Here?
So that's it! You're on your way to becoming a serverless security expert! 😀 Hopefully you know more now than you did a few minutes ago and will feel more confident building your serverless apps.
If you want to learn more about serverless security, I suggest you read some of the following articles and whitepapers by people who know a lot more about security than I do:
- Yan Cui's excellent Many-faced threats to Serverless security - October 25, 2017
- Hacking Serverless Runtimes whitepaper by Andrew Krug and Graham Jones - July 15, 2017
- Serverless Security implications—from infra to OWASP by Guy Podjarny - April 19, 2017
- The Ten Most Critical Security Risks in Serverless Architectures by PureSec - January 17, 2018
Did I miss something? Do you disagree with me? Did I turn you off to serverless architecture? Did my multiple 🚫 Ghostbusters references upset you? Let me know in the comments.
If you're new to serverless, read my post 10 Things You Need To Know When Building Serverless Applications to jumpstart your serverless knowledge.