Stable invocation target while you push new Lambda code.
→Publish numbered, immutable versions; expose an alias that points to a version. Callers invoke the alias ARN.
Why: Versions are frozen snapshots of code + config; aliases provide indirection so callers never invoke `$LATEST` directly.
Reference↗
Gradual rollout of a new Lambda version with auto-rollback on errors.
→Alias with weighted routing between two versions (e.g. 90/10). CodeDeploy `CodeDeployDefault.LambdaCanary10Percent5Minutes` or `CodeDeployDefault.LambdaLinear*` shifts traffic and rolls back when a CloudWatch alarm fires.
Why: Built-in traffic shifting + alarm-driven rollback removes hand-coded canary logic.
Reference↗
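The weighted alias above can be sketched as the arguments to Lambda's `UpdateAlias` call. This helper only builds the request; the function name, alias name, and version numbers are placeholders, and the result would be passed to `boto3.client("lambda").update_alias(**request)` in an environment with credentials.

```python
def weighted_alias_request(function_name, alias, stable_version,
                           canary_version, canary_weight):
    """Build kwargs for Lambda UpdateAlias with two-version weighted routing."""
    if not 0.0 <= canary_weight <= 1.0:
        raise ValueError("canary_weight must be between 0.0 and 1.0")
    return {
        "FunctionName": function_name,
        "Name": alias,
        # The alias's primary version receives the remaining traffic.
        "FunctionVersion": stable_version,
        # The additional version receives canary_weight of invocations.
        "RoutingConfig": {
            "AdditionalVersionWeights": {canary_version: canary_weight}
        },
    }


# Placeholder names: 90% of traffic stays on version 7, 10% goes to version 8.
request = weighted_alias_request("orders-handler", "live", "7", "8", 0.10)
print(request)
```

CodeDeploy performs the same weight changes on your behalf, so a hand-written call like this is mainly useful for manual experiments.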
Inject config (DB URL, feature flags) without redeploys.
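→Lambda environment variables. Encrypted at rest with KMS (AWS managed key by default); for sensitive values, use a customer managed key with the encryption helpers so the value is encrypted client-side before it is sent to Lambda, then decrypt it in function code.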
Reference↗
Share NumPy / pandas / common runtime across many Lambdas.
→Package as a Lambda Layer; up to 5 layers per function, and the function plus all its layers must total ≤250 MB unzipped. Each layer version has its own ARN.
Reference↗
Latency-sensitive synchronous Lambda — no cold starts allowed.
→Provisioned Concurrency on the alias (or a published version). Pre-initializes N execution environments; billed per GB-second of configured concurrency, whether or not it is used.
Why: Eliminates cold starts at a predictable cost. Use Application Auto Scaling on the alias to flex with load.
Reference↗
Java or Python Lambda with heavy init code; need fast cold start without paying for Provisioned Concurrency.
→Enable SnapStart on a published version. AWS snapshots the initialized runtime and resumes from it.
Why: No extra charge for Java; for Python and .NET, snapshot caching and each restore are billed. Cuts cold starts from seconds to <1s without paying for idle environments.
Reference↗
Lambda needs to consume a Kinesis stream / DynamoDB Stream / SQS queue / MSK topic.
→Event source mapping (pull-based). Lambda polls; batch size + maximum batching window tune throughput vs latency. Failed batches: On-Failure destination for stream and Kafka sources; for SQS, a dead-letter queue (redrive policy) on the source queue.
Why: For pull sources, the service can't directly invoke Lambda; the mapping is Lambda's polling adapter.
Reference↗
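As a rough illustration of the tuning knobs above, this helper builds the arguments for `CreateEventSourceMapping` on a Kinesis stream. The ARNs and function name are placeholders and the retry settings are illustrative defaults, not recommendations; the dict would be passed to `boto3.client("lambda").create_event_source_mapping(**request)`.

```python
def stream_mapping_request(stream_arn, function_name, failure_queue_arn,
                           batch_size=100, window_seconds=5):
    """Build kwargs for a Kinesis -> Lambda event source mapping."""
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",
        # Larger batches and longer windows raise throughput but add latency.
        "BatchSize": batch_size,
        "MaximumBatchingWindowInSeconds": window_seconds,
        # Illustrative retry settings for stream sources.
        "MaximumRetryAttempts": 3,
        "BisectBatchOnFunctionError": True,
        # Batches that still fail are described in a record sent here.
        "DestinationConfig": {"OnFailure": {"Destination": failure_queue_arn}},
    }


# Placeholder ARNs for illustration only.
request = stream_mapping_request(
    "arn:aws:kinesis:us-east-1:123456789012:stream/clicks",
    "clicks-consumer",
    "arn:aws:sqs:us-east-1:123456789012:clicks-failures",
)
print(request["BatchSize"], request["MaximumBatchingWindowInSeconds"])
```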
Async Lambda success/failure routing without Lambda DLQ.
→OnSuccess / OnFailure destinations on the function. Targets: SNS, SQS, EventBridge, another Lambda. Includes invocation context.
Why: Destinations capture the full event + response; legacy DLQ only captures the event payload.
Reference↗
Pick API Gateway type for a new REST API.
→HTTP API: cheaper, faster, JWT auth built-in, simpler. REST API: full features (mapping templates, request validators, WAF, private endpoints, X-Ray, API caching).
Why: Default to HTTP API unless you need a REST-only feature. WebSocket APIs are a separate product for stateful real-time.
Reference↗
Promote API changes from dev → test → prod without redeploying separate APIs.
→Stages on a single API. Create a deployment and point a stage at it to publish; stage variables hold environment-specific values like Lambda alias names.
Reference↗
Backend Lambda expects a different shape than what the client sends.
→Request/response mapping template (REST API only). VTL with `$input`, `$context`, `$util` to transform JSON.
Why: Mapping templates run in API Gateway — no extra Lambda hop, no extra latency or cost.
Reference↗
Validate a custom token (not Cognito, not IAM) before routing the request.
→Lambda authorizer. TOKEN type reads a single header; REQUEST type reads headers, query strings, stage variables, and request context. Returns IAM policy + principalId. The policy is cached per identity source for the configured TTL.
Reference↗
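A minimal sketch of a TOKEN-type authorizer handler, assuming a hard-coded token table purely for illustration (a real authorizer would verify a signature or call an identity service). The `authorizationToken` and `methodArn` event fields and the returned policy shape follow the REST API authorizer contract.

```python
# Hypothetical token table, used only to keep the example self-contained.
VALID_TOKENS = {"secret-partner-token": "partner-123"}


def handler(event, context):
    """Return an IAM policy allowing or denying the requested method."""
    token = event.get("authorizationToken", "")
    principal = VALID_TOKENS.get(token)
    effect = "Allow" if principal else "Deny"
    return {
        "principalId": principal or "anonymous",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": event["methodArn"],
            }],
        },
    }


# Local invocation with a hand-built event; the ARN is a placeholder.
sample_event = {
    "type": "TOKEN",
    "authorizationToken": "secret-partner-token",
    "methodArn": "arn:aws:execute-api:us-east-1:123456789012:abc123/prod/GET/orders",
}
print(handler(sample_event, None)["policyDocument"]["Statement"][0]["Effect"])
```

Because the returned policy is cached, scoping `Resource` to a single method ARN can deny later calls to other methods by the same token; widening the resource is a design choice to make deliberately.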
Validate a Cognito User Pool JWT on every request.
→Cognito User Pool authorizer (REST) or JWT authorizer (HTTP). API Gateway validates the token; no Lambda needed.
Why: Native validation is cheaper and faster than a Lambda authorizer for the common JWT case.
Reference↗
Throttle/quota a partner API consumer.
→Usage Plan + API Key. Plan ties keys to a stage with rate limit (req/sec) + burst + quota (req/day or month).
Reference↗
Reduce backend load for repeated GET requests.
→Stage-level cache (REST API). TTL configurable; cache key derived from method + path + selected query/header params.
Reference↗
Update an item only if a precondition holds (e.g. status == "PENDING").
→PutItem/UpdateItem with `ConditionExpression`. Failure raises `ConditionalCheckFailedException`.
Why: Server-side check avoids read-modify-write races without locking.
Reference↗
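The precondition check can be sketched as the arguments to `UpdateItem`. The table and key names here are hypothetical; the only detail that is not optional is the `#s` alias, since `status` is a DynamoDB reserved word. The dict would be passed to `boto3.client("dynamodb").update_item(**request)`.

```python
def conditional_status_update(table, order_id, expected, new):
    """Build UpdateItem kwargs that only apply if status == expected."""
    return {
        "TableName": table,
        "Key": {"order_id": {"S": order_id}},  # hypothetical key schema
        "UpdateExpression": "SET #s = :new",
        # Evaluated server-side, atomically with the write.
        "ConditionExpression": "#s = :expected",
        # `status` is reserved, so it must be referenced through an alias.
        "ExpressionAttributeNames": {"#s": "status"},
        "ExpressionAttributeValues": {
            ":new": {"S": new},
            ":expected": {"S": expected},
        },
    }


request = conditional_status_update("orders", "o-42", "PENDING", "SHIPPED")
print(request["ConditionExpression"])
```

When the condition fails, the client raises `ConditionalCheckFailedException`, which callers typically treat as "someone else got there first" rather than as a transient error.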
All-or-nothing across multiple DynamoDB items.
→`TransactWriteItems` / `TransactGetItems`. Up to 100 items / 4 MB; 2× the WCU/RCU cost of normal writes/reads.
Reference↗
Increment a counter without read-modify-write.
→UpdateExpression `ADD #count :inc` with `ExpressionAttributeNames` mapping `#count` to `count` (`count` is a reserved word). Server applies the delta atomically.
Reference↗
Need an additional access pattern beyond the primary key.
→GSI: alternate partition + sort key, eventually consistent, separate capacity, can be added any time. LSI: same partition key, alternate sort key, strong-consistency option, must be created at table creation.
Reference↗
Index only items that have a particular attribute (e.g. only ACTIVE orders).
→Sparse index: omit the attribute on items you want excluded. Items without the indexed attribute don't appear in the GSI/LSI.
Reference↗
Bulk read/write many items.
→`BatchGetItem` (up to 100 items / 16 MB) and `BatchWriteItem` (up to 25 items / 16 MB). Not atomic; partial failures are returned in `UnprocessedKeys` (reads) or `UnprocessedItems` (writes) and must be resubmitted.
Reference↗
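A hedged sketch of the resubmit loop for batch writes. `batch_write_all` accepts any object with a `batch_write_item(RequestItems=...)` method, which matches the boto3 DynamoDB client; `FlakyClient` is a stand-in invented here so the example runs without AWS.

```python
import time


def chunks(items, size=25):
    """Yield successive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def batch_write_all(client, table, items, max_rounds=5):
    """Write items 25 at a time, resubmitting UnprocessedItems with backoff."""
    for chunk in chunks(items):
        pending = {table: [{"PutRequest": {"Item": item}} for item in chunk]}
        for attempt in range(max_rounds):
            response = client.batch_write_item(RequestItems=pending)
            pending = response.get("UnprocessedItems") or {}
            if not pending:
                break
            time.sleep(min(0.05 * 2 ** attempt, 1.0))  # simple capped backoff
        else:
            raise RuntimeError("items still unprocessed after retries")


class FlakyClient:
    """Stand-in client: rejects the last request of its first call only."""

    def __init__(self):
        self.calls = 0
        self.written = []

    def batch_write_item(self, RequestItems):
        self.calls += 1
        (table, requests), = RequestItems.items()
        if self.calls == 1 and len(requests) > 1:
            accepted, rejected = requests[:-1], requests[-1:]
        else:
            accepted, rejected = requests, []
        self.written.extend(r["PutRequest"]["Item"] for r in accepted)
        return {"UnprocessedItems": {table: rejected} if rejected else {}}


client = FlakyClient()
batch_write_all(client, "orders", [{"id": {"N": str(n)}} for n in range(30)])
print(client.calls, len(client.written))
```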
Prevent lost updates from concurrent writers.
→Version attribute + `ConditionExpression: version = :v`, incrementing the version in the same write. On `ConditionalCheckFailedException`, re-read the item and retry.
Reference↗
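The read, modify, conditional-write loop can be illustrated with an in-memory stand-in for the table. `FakeTable` and `ConditionalCheckFailed` are invented for this sketch; against DynamoDB the conditional put would be a write with `ConditionExpression` and the exception would be `ConditionalCheckFailedException`.

```python
class ConditionalCheckFailed(Exception):
    """Stands in for DynamoDB's ConditionalCheckFailedException."""


class FakeTable:
    """In-memory table with a version-checked write."""

    def __init__(self):
        self.items = {}

    def get(self, key):
        return dict(self.items[key])

    def put_if_version(self, key, item, expected_version):
        current = self.items.get(key)
        if current is not None and current["version"] != expected_version:
            raise ConditionalCheckFailed(key)
        self.items[key] = dict(item)


def update_with_retry(table, key, mutate, max_attempts=3):
    """Apply `mutate` under optimistic locking, re-reading on conflict."""
    for _ in range(max_attempts):
        item = table.get(key)
        expected = item["version"]
        mutate(item)
        item["version"] = expected + 1
        try:
            table.put_if_version(key, item, expected)
            return item
        except ConditionalCheckFailed:
            continue  # another writer won; re-read and try again
    raise RuntimeError("gave up after repeated version conflicts")


table = FakeTable()
table.items["acct-1"] = {"balance": 100, "version": 1}
interfered = []


def add_ten(item):
    if not interfered:
        # Simulate another writer committing between our read and our write.
        table.items["acct-1"] = {"balance": 50, "version": 2}
        interfered.append(True)
    item["balance"] += 10


result = update_with_retry(table, "acct-1", add_ten)
print(result)
```

The first attempt is rejected because the stored version moved from 1 to 2; the second attempt re-reads the newer item and applies the change on top of it.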
Trigger downstream actions on every DynamoDB change.
→DynamoDB Streams + Lambda event source mapping. Stream view: NEW_IMAGE / OLD_IMAGE / NEW_AND_OLD_IMAGES / KEYS_ONLY.
Reference↗
Browser uploads/downloads directly to S3 without your server proxying bytes.
→Presigned URL from the SDK (`getSignedUrl`) for GET or PUT. Expiry up to 7 days when signed with IAM user credentials (SigV4); with role-derived temporary credentials the URL stops working once the session expires.
Why: Off-loads bandwidth from your backend; URL is a temporary capability scoped to one object + method.
Reference↗
Upload a large file (≫100 MB) reliably from the SDK.
→`CreateMultipartUpload` → parallel `UploadPart` → `CompleteMultipartUpload`. SDK high-level transfer manager handles part sizing automatically.
Why: Required >5 GB; recommended ≥100 MB. Failed parts re-upload independently. Set lifecycle to abort incomplete multiparts to reclaim storage.
Reference↗
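The part sizing that a transfer manager handles can be approximated with a little arithmetic. This sketch assumes S3's multipart limits of a 5 MiB minimum part size (except the last part) and at most 10,000 parts; the 8 MiB preferred size is an arbitrary choice for illustration.

```python
import math

MIB = 1024 * 1024
MIN_PART = 5 * MIB      # minimum part size; the last part may be smaller
MAX_PARTS = 10_000      # maximum number of parts in one multipart upload


def plan_parts(object_size, preferred_part_size=8 * MIB):
    """Return (part_size, part_count) that fits the multipart limits."""
    part_size = max(preferred_part_size, MIN_PART)
    # Grow the part size if the object would otherwise need too many parts.
    part_size = max(part_size, math.ceil(object_size / MAX_PARTS))
    part_count = max(1, math.ceil(object_size / part_size))
    return part_size, part_count


small = plan_parts(100 * MIB)    # 100 MiB object with 8 MiB parts
large = plan_parts(1024 ** 4)    # 1 TiB object forces larger parts
print(small, large)
```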
Run code when an object is created/deleted in S3.
→S3 Event Notifications → Lambda / SNS / SQS / EventBridge. Filter by prefix and suffix.
Reference↗
Browser app fetches from S3 across origins (`fetch('https://bucket.s3...')`); CORS preflight fails.
→Configure bucket CORS rules: allowed origins, methods (GET/PUT), headers, and exposed headers.
Reference↗
Filter rows from a 50 GB CSV/JSON/Parquet object without downloading it.
→S3 Select with SQL. Returns matching rows only; pay for bytes scanned + bytes returned. AWS has closed S3 Select to new customers, so confirm it is available on your account.
Reference↗
Sign in a user from a public mobile/web client without sending the password.
→Cognito User Pool with `USER_SRP_AUTH` flow. Client computes SRP proof; backend never sees the password. Returns ID + access + refresh tokens.
Reference↗
Federated user (Google/Apple/Cognito UP) needs temporary AWS credentials to call AWS APIs directly from a mobile app.
→Cognito Identity Pool. Exchanges identity provider token → IAM role → temporary AWS credentials via STS.
Why: User Pools authenticate users; Identity Pools authorize them to AWS resources.
Reference↗
Pick a Step Functions workflow type.
→Standard: long-running (≤1 year), exactly-once, $0.025/1k transitions, full history. Express: ≤5 min, at-least-once or at-most-once, billed per request + duration; for high-volume ETL/streaming.
Reference↗
Workflow step fails; want retry with backoff and route to a recovery state.
→`Retry` array (per-state, with `BackoffRate` + `MaxAttempts`) and `Catch` for terminal failure routing. Match by `ErrorEquals` (e.g. `States.TaskFailed`, custom error names).
Reference↗
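To make the backoff concrete, here is a small sketch of a Task state with `Retry` and `Catch`, written as a Python dict, plus the wait schedule it implies. The state names and Lambda ARN are placeholders, and the calculation ignores optional fields such as `MaxDelaySeconds` and jitter.

```python
import json


def retry_delays(interval_seconds, backoff_rate, max_attempts):
    """Seconds waited before each retry: interval * rate ** attempt_index."""
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


# Placeholder state definition in Amazon States Language.
charge_card_state = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:charge-card",
    "Retry": [{
        "ErrorEquals": ["States.TaskFailed"],
        "IntervalSeconds": 2,
        "BackoffRate": 2.0,
        "MaxAttempts": 3,
    }],
    # After retries are exhausted, route any error to a recovery state.
    "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "RefundAndNotify"}],
    "Next": "SendReceipt",
}

policy = charge_card_state["Retry"][0]
delays = retry_delays(policy["IntervalSeconds"], policy["BackoffRate"],
                      policy["MaxAttempts"])
print(delays)
print(json.dumps(charge_card_state["Catch"]))
```

With these values the retries wait 2, 4, and 8 seconds before the `Catch` clause takes over.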
Apply the same workflow to each item in an array, with concurrency cap.
→Map state with `ItemsPath` and `MaxConcurrency`. Distributed Map handles 10k+ items with S3-backed input.
Reference↗
Trigger Lambda on either a cron schedule or matching incoming events.
→EventBridge rule. Schedule: `rate(...)` or `cron(...)`. Pattern: JSON event filter; match on source, detail-type, detail fields.
Reference↗
Route events from SQS / Kinesis / DynamoDB Streams / MSK to a target with optional filter + transform.
→EventBridge Pipes. Source → Filter → Enrichment (Lambda/Step Functions) → Target. No Lambda needed for the simple cases.
Reference↗
Process messages strictly in order per customer, with deduplication.
→SQS FIFO queue. `MessageGroupId` scopes ordering: messages within a group are delivered strictly in order, while different groups are processed in parallel. `MessageDeduplicationId` (or content-based dedup) drops duplicates within a 5-minute window.
Reference↗
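A sketch of the `SendMessage` arguments for per-customer ordering. The queue URL and payload are placeholders; the deduplication ID here is a SHA-256 of the serialized body, which mirrors what content-based deduplication does, though any stable ID of up to 128 characters would work. The dict would be passed to `boto3.client("sqs").send_message(**message)`.

```python
import hashlib
import json


def fifo_message(queue_url, customer_id, payload):
    """Build SendMessage kwargs: one ordering group per customer."""
    body = json.dumps(payload, sort_keys=True)  # stable serialization
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        # Messages sharing a group ID are delivered in order.
        "MessageGroupId": customer_id,
        # Identical bodies within the dedup window are dropped.
        "MessageDeduplicationId": hashlib.sha256(body.encode("utf-8")).hexdigest(),
    }


# Placeholder queue URL for illustration only.
URL = "https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo"
first = fifo_message(URL, "cust-7", {"order": 1, "action": "create"})
again = fifo_message(URL, "cust-7", {"action": "create", "order": 1})
print(first["MessageDeduplicationId"] == again["MessageDeduplicationId"])
```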
Consumer pulls a message but crashes before deleting it.
→Message hidden for VisibilityTimeout seconds, then reappears for redelivery. Tune to longest expected processing time + buffer.
Why: Too short → duplicate processing. Too long → slow recovery on crash. ChangeMessageVisibility extends in-flight if needed.
Reference↗
One event must reach multiple consumers (Lambdas / SQS queues / HTTP endpoints).
→SNS topic with multiple subscribers. Subscription filter policies route only matching messages per subscriber.
Reference↗
Tune Kinesis Data Streams capacity for write throughput.
→Each shard = 1 MB/s or 1000 records/s in, 2 MB/s out. Add shards (split) or use On-Demand mode for auto-scaling.
Reference↗
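The per-shard limits translate into a quick sizing calculation for provisioned mode. This sketch uses only the figures quoted above (1 MB/s or 1,000 records/s in, 2 MB/s out per shard) and assumes a single shared-throughput consumer; the workload numbers are made up.

```python
import math


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s=0.0):
    """Smallest shard count that satisfies every per-shard limit."""
    return max(
        1,
        math.ceil(write_mb_per_s / 1.0),   # 1 MB/s ingest per shard
        math.ceil(records_per_s / 1000),   # 1,000 records/s ingest per shard
        math.ceil(read_mb_per_s / 2.0),    # 2 MB/s egress per shard
    )


# Hypothetical workload: 3.5 MB/s in, 2,500 records/s, 6 MB/s out.
print(shards_needed(3.5, 2500, 6.0))
```

Here ingest bandwidth is the binding constraint: 3.5 MB/s rounds up to 4 shards, while the record rate and read throughput would each need only 3.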