Talk: "AWS Lambda @ Scale: Designing for High Load"
Event: Nerdearla 2023 (https://nerdear.la)
Author: Alexis Cuadrado
In the realm of serverless computing, AWS Lambda stands as a powerhouse, providing the backend for many of today's most scalable and robust applications. It is a fully managed compute service capable of running your code at any scale, but its surface-level simplicity can be misleading: under a high influx of requests, adequately managing the intricate dynamics of the Lambda concurrency model is key to success. In this talk, we will demystify Lambda concurrency and throughput to help you design your applications for optimal performance under high load. We will explore strategies for optimizing individual Lambda functions for latency and illustrate how to strategically employ concurrency controls. Tailored for AWS developers, architects, and system administrators, this session seeks to transform AWS Lambda from a 'black box' into a transparent and fine-tuned engine for your serverless applications.
10. OPTIMIZE HANDLER LOGIC
EMPLOY EFFICIENT ALGORITHMS
Put those hard-won whiteboarding skills to use
MULTI-THREADING
Parallelize I/O operations (e.g. S3 downloads)
AVOID ORCHESTRATION
Use Step Functions
MOVE WORK OUTSIDE HANDLER
1. Download Code -> 2. Start Environment -> 3. Set up Runtime -> 4. Run Static Code -> 5. Run Handler
Reusable objects should be statically initialized
LAMBDA @ SCALE #NERDEARLA
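Two of the tips on this slide (parallelize I/O, initialize reusable objects statically) can be sketched together in Python. This is a minimal sketch, not code from the talk: `fetch_object` is a hypothetical stand-in for a blocking call such as an S3 download, the event shape (`{"keys": [...]}`) is assumed, and the pool size of 8 is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_object(key: str) -> bytes:
    """Hypothetical stand-in for a blocking I/O call (e.g. an S3 download)."""
    time.sleep(0.05)  # simulate network latency
    return f"contents of {key}".encode()


# Static code: runs once per execution environment (on cold start) and is
# reused by every later invocation in that environment. SDK clients and
# connection pools would typically be created here as well.
EXECUTOR = ThreadPoolExecutor(max_workers=8)  # pool size is an assumption


def handler(event, context):
    keys = event["keys"]
    # The downloads overlap instead of running one after another;
    # map() returns results in the same order as the input keys.
    bodies = list(EXECUTOR.map(fetch_object, keys))
    return {"count": len(bodies), "total_bytes": sum(len(b) for b in bodies)}
```

Threads suit this case because the work is I/O-bound: each thread spends most of its time waiting on the network, so a handler with eight simulated downloads finishes in roughly the time of one.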
11. TURN UP THE RAM!
Example Python Function: Return all prime numbers between 0 and 10K
MEMORY (MB)    EXECUTION DURATION (MS)
128            170
256            80
512            40
1024           20
1536           17    <- lowest latency; beware of negative returns beyond this point
3008           17
MEMORY = vCPU + NETWORK THROUGHPUT
1. Download Code -> 2. Start Environment -> 3. Set up Runtime -> 4. Run Static Code -> 5. Run Handler
TIP: Enable Lambda Insights for profiling
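The function measured in the table is not shown on the slide. A CPU-bound function of that kind could look like the sketch below. It uses a sieve of Eratosthenes, which is an assumption and not necessarily the algorithm that was benchmarked; the handler's event shape (`{"limit": ...}`) is also hypothetical.

```python
def primes_up_to(limit: int) -> list[int]:
    """Return all primes p with 2 <= p <= limit (sieve of Eratosthenes)."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Cross out every multiple of n, starting at n * n.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]


def handler(event, context):
    # Pure CPU work: duration depends on the vCPU share, which Lambda
    # scales with the memory setting.
    limit = event.get("limit", 10_000)
    primes = primes_up_to(limit)
    return {"count": len(primes), "largest": primes[-1] if primes else None}
```

Running the same handler at several memory settings and comparing the reported durations is one way to reproduce a table like the one above for your own workload.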
14. ON COLD START FREQUENCY
WARM: HOW CAN WE GET MORE OF THIS?
COLD: ... AND LESS OF THIS?
Isn’t that what we’re all asking in our own lives?
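Before trying to reduce cold starts it helps to measure how often they happen. One common approach, sketched here under assumptions (the log field names are made up for illustration), relies on the fact that module-level code runs once per execution environment, so the first invocation that sees a fresh module is the cold one.

```python
import json
import time

# Static code: executed once per execution environment, i.e. on a cold start.
_ENV_STARTED_AT = time.time()
_invocations = 0


def handler(event, context):
    global _invocations
    _invocations += 1
    cold = _invocations == 1  # only the first call in this environment is cold

    # One structured log line per invocation; the field names are hypothetical.
    # Counting lines with cold_start=true against all lines gives the
    # cold start frequency.
    print(json.dumps({
        "cold_start": cold,
        "invocations_in_env": _invocations,
        "env_age_seconds": round(time.time() - _ENV_STARTED_AT, 3),
    }))
    return {"cold_start": cold, "invocations_in_env": _invocations}
```

Calling the handler twice in the same process shows the flag flipping from cold to warm, which mirrors what happens when Lambda reuses an execution environment.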
26. HOW LAMBDA PRICING WORKS
COST PER TRANSACTION (also COST PER EXECUTION) = COMPUTE CHARGES + REQUEST CHARGES
COMPUTE CHARGES: EXECUTION DURATION x ALLOCATED MEMORY
REQUEST CHARGES: fixed fee per request
Rates vary based on Region and CPU Architecture
Free Tier available
Eligible for Savings Plans
EXECUTION DURATION determines LATENCY
ALLOCATED MEMORY influences LATENCY
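The pricing relationship on this slide can be applied to the memory/duration measurements from slide 11. The sketch below is illustrative only: the two rates are assumptions (the slide notes that rates vary by Region and CPU architecture, so check the current pricing page), and the Free Tier and Savings Plans are ignored.

```python
# Assumed rates for illustration; real rates depend on Region and architecture.
GB_SECOND_RATE = 0.0000166667      # USD per GB-second of compute (assumption)
REQUEST_RATE = 0.20 / 1_000_000    # USD per request (assumption)


def cost_per_invocation(memory_mb, duration_ms,
                        gb_second_rate=GB_SECOND_RATE,
                        request_rate=REQUEST_RATE):
    """Compute charges (memory x duration) plus the fixed request charge."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * gb_second_rate + request_rate


# (memory in MB, execution duration in ms) from the prime-number example.
MEASUREMENTS = [(128, 170), (256, 80), (512, 40),
                (1024, 20), (1536, 17), (3008, 17)]

for memory_mb, duration_ms in MEASUREMENTS:
    per_million = cost_per_invocation(memory_mb, duration_ms) * 1_000_000
    print(f"{memory_mb:>5} MB  {duration_ms:>4} ms  "
          f"${per_million:.3f} per 1M invocations")
```

Because memory x duration is the same at 256 MB, 512 MB, and 1024 MB, those settings cost the same per invocation while latency drops from 80 ms to 20 ms. At 1536 MB the compute charge rises by roughly 27% for a 3 ms gain, and at 3008 MB it nearly doubles again with no gain at all, which is the "negative returns" region.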