Created by Rishi Tiwari

Last updated 5/2026

English

What you'll learn

Learn AWS Lambda Durable Functions to write stateful, long-running workflows directly in plain JavaScript without complex external tools.
Master the checkpoint-and-replay model to understand exactly how AWS Lambda automatically saves progress, exits, and resumes without losing state.
Build complex serverless applications including the Saga pattern for automatic rollbacks, parallel API calls, and Human in the loop.
Ensure production-grade reliability using AWS CDK for IaC, the native testing framework, and CloudWatch for observability.

Course content

16 sections • 67 lectures • 4h 52m total length

AWS CLI Installation • 3:49
In this lecture, you'll install and configure the AWS CLI and verify your setup.
Documentation and Code • 0:43
You'll also get access to the Lambda durable functions course GitHub repository, Durable Lambda resources, and official AWS documentation used throughout the course.
Cost • 0:13

Why AWS Lambda Durable Functions? • 3:06
Discover why AWS Lambda Durable Functions exists and the problem it solves. We trace the evolution from stateless Lambda functions (with no built-in state, manual retry logic, and a 15-minute hard limit) through Step Functions (powerful but requiring you to learn a separate JSON/YAML language called ASL), and finally to Durable Functions: workflows written in pure JavaScript, no new language to learn.
What is a Lambda Durable Function? • 1:14
Create Your First Lambda Durable Function • 13:31
Hands-on from start to finish. You'll create a durable function in the AWS Console. Then you'll paste a working handler using withDurableExecution and context.step, invoke it, and watch the execution history in the console timeline.
Built-in Deduplication • 2:50
Lambda Durable Functions has idempotency baked in through the --durable-execution-name flag. Submit the same execution name twice, and you get back the same execution, not two separate runs. This lecture demonstrates how the durable execution name acts as a global idempotency key for your entire workflow.
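As a rough illustration of the idea only (not how the Lambda service implements it), deduplication by execution name can be modelled as a lookup keyed by that name. The startExecution helper and the in-memory executions map are hypothetical names invented for this sketch.

```javascript
// Toy model: an in-memory registry standing in for the durable execution store.
const executions = new Map();
let nextId = 1;

// Hypothetical helper: starting an execution under a name that already exists
// returns the existing record instead of creating a second run.
function startExecution(name, input) {
  if (executions.has(name)) {
    return executions.get(name); // duplicate submit: the same execution comes back
  }
  const record = { id: nextId++, name, input, status: 'RUNNING' };
  executions.set(name, record);
  return record;
}

const firstSubmit = startExecution('order-1001', { total: 120 });
const secondSubmit = startExecution('order-1001', { total: 120 });
console.log(firstSubmit === secondSubmit); // true
console.log(executions.size);              // 1
```

Choosing a name derived from the business key (an order ID, for example) is what makes the name usable as an idempotency key for the whole workflow.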
Synchronous vs Asynchronous Invocation of Durable Lambda • 2:24
Not all invocations are equal. This lecture breaks down the difference between RequestResponse (synchronous) and Event (asynchronous) invocation for durable functions. You'll learn when synchronous invocation works (only when your total execution timeout is 15 minutes or less) and why longer workflows must use async invocation. We cover the exact CLI flags and the error you'll hit if you get it wrong ("You cannot synchronously invoke a durable function with an executionTimeout greater than 15 minutes").
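The rule above boils down to a single comparison. A minimal sketch, assuming the timeout is expressed in seconds; chooseInvocationType is a hypothetical helper, not part of any AWS SDK.

```javascript
// 15 minutes, the synchronous limit quoted above, expressed in seconds.
const SYNC_LIMIT_SECONDS = 15 * 60;

// Hypothetical helper: synchronous invocation is only possible at or under the
// limit; longer workflows must be invoked asynchronously.
// (Event is also a valid choice for short workflows; this picks the sync option
// whenever it is available.)
function chooseInvocationType(executionTimeoutSeconds) {
  return executionTimeoutSeconds <= SYNC_LIMIT_SECONDS ? 'RequestResponse' : 'Event';
}

console.log(chooseInvocationType(300));   // RequestResponse
console.log(chooseInvocationType(86400)); // Event
```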

Avoid Non-Deterministic Code Outside of Durable Step • 5:43
One of the most common mistakes in Lambda Durable Functions is the hardest to debug. Because the handler always re-runs from the top on every replay, any code outside a step that produces a different value each time (like Date.now(), Math.random(), crypto.randomUUID(), or an API call) will return a different result on replay than it did on the first invocation, causing a NonDeterministicExecutionError. In this lecture, you'll see exactly why this breaks the checkpoint-replay model and learn the fix: move all non-deterministic code inside a context.step so the result is checkpointed and returned from cache on every subsequent replay.
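To make the replay behaviour concrete, here is a toy model of checkpoint-and-replay. It does not use the real SDK: createContext and its synchronous step() are hypothetical stand-ins for context.step, and a simple counter stands in for Date.now() or Math.random().

```javascript
// Toy replay model: `checkpoints` stands in for the durable checkpoint store.
function createContext(checkpoints) {
  return {
    step(name, fn) {
      if (checkpoints.has(name)) return checkpoints.get(name); // replay: cached result
      const result = fn();           // first run: execute the step body
      checkpoints.set(name, result); // and checkpoint its result
      return result;
    },
  };
}

let counter = 0;
const nextValue = () => ++counter; // stands in for Date.now() or Math.random()

function handler(context) {
  const outside = nextValue();                       // re-evaluated on every replay
  const inside = context.step('make-id', nextValue); // evaluated once, then cached
  return { outside, inside };
}

const checkpoints = new Map();
const firstRun = handler(createContext(checkpoints));
const replay = handler(createContext(checkpoints));
console.log(firstRun); // { outside: 1, inside: 2 }
console.log(replay);   // { outside: 3, inside: 2 }
```

The value produced inside the step is identical on both runs, while the value produced outside it drifts. That drift is what the replay model cannot tolerate.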
Avoid Variable Manipulation From Durable Step • 1:05
A subtle trap that looks completely harmless until your workflow replays. If you declare a let variable outside a step and mutate it from inside the step body, that mutation is silently discarded on replay, leaving the variable empty when the next step tries to use it. This lecture demonstrates the broken pattern (let x; context.step(...) { x = result }) side-by-side with the correct one (const x = await context.step(...) { return result }). The rule is simple: always return values from steps, never mutate the outer scope.
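The same kind of toy model shows why the mutation is lost. This is a sketch under the same assumptions as any simplified replay model: createContext is a hypothetical, synchronous stand-in for the SDK's context, and on replay it returns the cached result without running the step body.

```javascript
// Toy replay model: on replay the step body is skipped and the cached result returned.
function createContext(checkpoints) {
  return {
    step(name, fn) {
      if (checkpoints.has(name)) return checkpoints.get(name);
      const result = fn();
      checkpoints.set(name, result);
      return result;
    },
  };
}

function brokenHandler(context) {
  let total;                                      // declared outside the step
  context.step('compute', () => { total = 42; }); // mutates outer scope, returns undefined
  return total;
}

function fixedHandler(context) {
  const total = context.step('compute', () => 42); // value flows through the return
  return total;
}

const brokenStore = new Map();
const brokenFirst = brokenHandler(createContext(brokenStore));
const brokenReplay = brokenHandler(createContext(brokenStore));
console.log(brokenFirst, brokenReplay); // 42 undefined

const fixedStore = new Map();
const fixedFirst = fixedHandler(createContext(fixedStore));
const fixedReplay = fixedHandler(createContext(fixedStore));
console.log(fixedFirst, fixedReplay); // 42 42
```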
Branching of Durable Workflow • 1:47
Durable workflows are just code, so regular JavaScript if/else branching works exactly as you'd expect, and this lecture shows you how. You'll build a media-type router where the workflow takes a different durable step depending on the incoming event: text content goes to Amazon Bedrock for analysis, images go to Amazon Rekognition, and unsupported types exit early without running any steps at all. Key insight: early returns before any context.step call are perfectly valid, no checkpoint is written, and no charge is incurred.
Error Handling in Durable Step • 6:39
What happens when a step throws?
By default, it retries 5 times with exponential backoff.
More Durable Operations • 0:26
In the next few sections, we will go through the fundamentals of different durable operations.

Different Types of Wait Operations • 2:04
Lambda Durable Functions gives you three distinct ways to pause a workflow, plus a fourth operation, invoke, that waits on another function. Each is designed for a different scenario.
context.wait pauses for a fixed duration (no compute charges during the pause).
waitForCondition repeatedly checks an external system until a condition is met, i.e., the polling pattern.
waitForCallback suspends the workflow indefinitely until an outside system explicitly signals it to resume.
invoke waits for another Lambda function to finish executing.
Wait for Condition Operation • 11:56
A deep-dive into context.waitForCondition.
Demo: Wait For Condition Operation With AWS Polly • 7:59
In this AWS integration demo, you'll invoke Amazon Polly's StartSpeechSynthesisTask (an async API that returns immediately with a TaskId) inside a context.step, then use waitForCondition with createWaitStrategy to poll GetSpeechSynthesisTask every 5–15 seconds until the task status reaches "completed".
The full IAM policy for Polly + S3 is included.
This is the canonical pattern for any AWS service that returns a job ID first and completes asynchronously: Textract, Transcribe, Rekognition Video, Glue jobs, and more.
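The polling pattern itself can be sketched without any AWS calls. The block below is a simplified model, not the SDK's createWaitStrategy: nextDelaySeconds, pollUntil, and the fake task are hypothetical, the backoff rate of 2 is an assumption, and the loop adds up delays on a virtual clock instead of actually sleeping.

```javascript
// Hypothetical wait strategy: exponential backoff clamped to the 5-15 second range.
function nextDelaySeconds(attempt) {
  const initial = 5;
  const max = 15;
  const rate = 2; // assumed backoff rate
  return Math.min(max, initial * rate ** (attempt - 1));
}

// Virtual-clock polling loop: sums the delays instead of sleeping.
function pollUntil(checkStatus, isDone, maxAttempts) {
  let waitedSeconds = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = checkStatus();
    if (isDone(status)) return { status, attempts: attempt, waitedSeconds };
    waitedSeconds += nextDelaySeconds(attempt);
  }
  throw new Error(`condition not met after ${maxAttempts} attempts`);
}

// Fake task that needs three polls before it reports completion.
const statuses = ['scheduled', 'inProgress', 'inProgress', 'completed'];
const fakeGetTask = () => statuses.shift();

const pollResult = pollUntil(fakeGetTask, (s) => s === 'completed', 10);
console.log(pollResult); // { status: 'completed', attempts: 4, waitedSeconds: 30 }
```

Capping the number of attempts matters in practice: a job that never completes should fail the workflow rather than poll forever.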
Callback Operation • 7:13
waitForCallback is how you integrate with anything outside AWS: a human reviewer, a third-party webhook, a payment gateway, or a mobile app.
Callback Using waitForCallback Composite Operation • 1:55
Resolve Callback Using AWS CLI or AWS CloudShell • 2:50
With a live callback waiting, this lecture walks through how to send the resolution from the command line.
Project1: Human In The Loop Workflow • 14:39
The first capstone project, pulling together everything from Section 4. You'll build a complete human-in-the-loop pipeline.
Invoke Operation • 4:47
context.invoke lets you kick off another Lambda function and wait for it to finish. The durable function checkpoints and suspends (no charges) until the invoked function completes.

Types of Concurrent Operations • 0:54
Sequential steps mean the total workflow time is the sum of every step's duration. This lecture introduces the concurrency model in Lambda Durable Functions: how multiple operations launch within the same Lambda invocation.
context.parallel • 4:43
context.parallel runs branches concurrently, i.e., each branch is a different function doing a different job. You'll build an order pre-check workflow that simultaneously verifies inventory, validates payment, and confirms shipping availability, then collects all three results with result.getResults().
Key concepts: each branch receives its own isolated child context (ctx), branch functions are defined inline or as named functions, and all branches start in the same Lambda invocation.
You'll see the execution history showing all three steps running concurrently rather than one after another.
Parallel Operation Configurations • 9:10
context.parallel has a completionConfig option that controls how failures are handled across branches.
context.map Operation • 1:54
context.map applies the same operation to every item in an array concurrently, with each item getting its own isolated child context and its own named step.
You'll process an array of three orders (shoes, shirt, jacket) in parallel, with each item fulfilled in fulfill-0, fulfill-1, fulfill-2 steps that you can track individually in the execution history.
You'll learn the difference between parallel (heterogeneous branches, fixed list) and map (homogeneous operation, dynamic array).

Child Context • 2:51
context.runInChildContext groups multiple steps and wait operations under a single named logical unit, like a sub-workflow with its own isolated checkpoint counter. On replay, the entire child context is replayed as one atomic unit rather than step by step, making it more efficient and keeping execution history clean and collapsible. This lecture covers two reasons you'd reach for it directly.
First: grouping. You'll build an order processor where validation and charging run inside a process-order child context, the result surfaces as a single entry in the execution timeline, and the parent just receives the final return value.
Second: concurrency correctness. The replay model assigns sequential IDs to operations in the order they are called, but concurrent branches resolve in a different order each run. When two branches each have multiple chained steps sharing the parent counter, the IDs get mismatched on replay, and each step gets the wrong cached result. Wrapping each branch in its own child context gives it an isolated counter, so the parent only tracks two IDs (one per branch) regardless of internal execution order. The payoff at the end: context.parallel and context.map do this automatically, and you never need to call runInChildContext manually for parallel work.
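The ID mismatch is easier to see with a small simulation. This is a deliberately simplified model, not the SDK's actual ID scheme: each array lists which branch issues its next step, in call order, and assignShared and assignIsolated are hypothetical helpers that hand out IDs from a shared counter or from one counter per branch.

```javascript
// Shared counter: the ID a step receives depends on how the branches interleave.
function assignShared(order) {
  let counter = 0;
  const stepsSeen = {};
  const ids = {};
  for (const branch of order) {
    stepsSeen[branch] = (stepsSeen[branch] || 0) + 1;
    ids[`${branch}.step${stepsSeen[branch]}`] = ++counter;
  }
  return ids;
}

// One counter per branch (the child-context idea): the ID depends only on the
// branch's own sequence of steps.
function assignIsolated(order) {
  const counters = {};
  const ids = {};
  for (const branch of order) {
    counters[branch] = (counters[branch] || 0) + 1;
    ids[`${branch}.step${counters[branch]}`] = `${branch}-${counters[branch]}`;
  }
  return ids;
}

const firstRunOrder = ['A', 'B', 'A', 'B']; // interleaving on the first invocation
const replayOrder = ['A', 'B', 'B', 'A'];   // branches resolve differently on replay

console.log(assignShared(firstRunOrder)['A.step2'], assignShared(replayOrder)['A.step2']);
// 3 4
console.log(assignIsolated(firstRunOrder)['A.step2'], assignIsolated(replayOrder)['A.step2']);
// A-2 A-2
```

With the shared counter, the second step of branch A would look up the checkpoint that belongs to branch B on replay. With isolated counters its ID never moves.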

Errors • 3:29
What actually happens when a step throws?
This lecture walks through the default error behavior.
Retries • 8:19
Add a retryStrategy to any step and the SDK handles retry logic for you, no try/catch, no manual loops. This lecture covers createRetryStrategy with all its options: maxAttempts, initialDelay, maxDelay, backoffRate, and JitterStrategy.FULL (randomizes delay to prevent thundering herd).
You'll also learn the two ways to filter which errors are retryable: retryableErrorTypes (uses instanceof for class-based errors like NetworkError) and retryableErrors (matches on error message substrings or regex like /timeout/i). Non-matching errors bypass the strategy entirely and fail immediately.
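The arithmetic behind these options can be sketched in a few lines. This is an illustration of exponential backoff with full jitter and of the two filter styles, not the SDK's implementation: backoffDelay and isRetryable are hypothetical helpers, delays are assumed to be in seconds, and treating the two filters as "either one matches" is also an assumption.

```javascript
// Exponential backoff capped at maxDelay, then full jitter: a random point in [0, cap).
// `random` is injectable so the example stays deterministic.
function backoffDelay(attempt, { initialDelay, maxDelay, backoffRate }, random = Math.random) {
  const capped = Math.min(maxDelay, initialDelay * backoffRate ** (attempt - 1));
  return random() * capped;
}

class NetworkError extends Error {}

// Class-based filter uses instanceof; message filter accepts substrings or regexes.
function isRetryable(error, { retryableErrorTypes = [], retryableErrors = [] }) {
  if (retryableErrorTypes.some((Type) => error instanceof Type)) return true;
  return retryableErrors.some((pattern) =>
    pattern instanceof RegExp ? pattern.test(error.message) : error.message.includes(pattern)
  );
}

const backoffOptions = { initialDelay: 1, maxDelay: 30, backoffRate: 2 };
const upperBound = () => 1; // pins the jitter to the top of its window
const delays = [1, 2, 3, 4, 5, 6].map((n) => backoffDelay(n, backoffOptions, upperBound));
console.log(delays); // [ 1, 2, 4, 8, 16, 30 ]

const filters = { retryableErrorTypes: [NetworkError], retryableErrors: [/timeout/i] };
console.log(isRetryable(new NetworkError('connection reset'), filters)); // true
console.log(isRetryable(new Error('Gateway Timeout'), filters));         // true
console.log(isRetryable(new Error('validation failed'), filters));       // false
```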
Retries - Custom Retry Strategy • 5:26
createRetryStrategy covers most cases but sometimes you need different logic per error type and per attempt count simultaneously.
A custom retry strategy is just a function (error, attemptCount) => { shouldRetry, delay }. This lecture builds one from scratch.
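A sketch of what such a function might look like, following the (error, attemptCount) => { shouldRetry, delay } shape described above. The error classes are invented for the example, and expressing delay as a plain number of seconds is an assumption; check the SDK documentation for the exact type it expects.

```javascript
// Hypothetical error classes for the example.
class RateLimitError extends Error {}
class ValidationError extends Error {}

// Different logic per error type and per attempt count in one place.
const customRetry = (error, attemptCount) => {
  if (error instanceof ValidationError) {
    return { shouldRetry: false }; // bad input never succeeds on retry
  }
  if (error instanceof RateLimitError) {
    return { shouldRetry: attemptCount < 10, delay: 30 }; // patient, fixed delay
  }
  // Anything else: a few quick attempts with exponential delay (assumed seconds).
  return { shouldRetry: attemptCount < 3, delay: 2 ** attemptCount };
};

console.log(customRetry(new ValidationError('missing field'), 1)); // { shouldRetry: false }
console.log(customRetry(new RateLimitError('slow down'), 9));      // { shouldRetry: true, delay: 30 }
console.log(customRetry(new Error('socket hang up'), 2));          // { shouldRetry: true, delay: 4 }
```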
Retry Presets • 1:18
The SDK ships two built-in shortcuts so you don't always have to configure from scratch.
Step Semantics • 5:43
The most important concept for non-idempotent operations.
By default, context.step uses AtLeastOncePerRetry. If Lambda crashes mid-step with no checkpoint saved, the step re-executes on replay. Safe for idempotent operations, dangerous for payments or SMS.
StepSemantics.AtMostOncePerRetry changes this: the SDK writes a START checkpoint before running the step body, so on replay it sees the START, skips re-execution, and throws StepInterruptedError instead.
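The difference between the two semantics can be simulated with a toy checkpoint store. Nothing here is the real SDK: createContext is a hypothetical synchronous stand-in, StepInterruptedError is a local class that borrows the name used above, and a thrown error inside chargeCard stands in for the Lambda process dying after the side effect but before the result is checkpointed.

```javascript
class StepInterruptedError extends Error {} // local stand-in for the SDK error

function createContext(checkpoints) {
  return {
    step(name, fn, semantics = 'AtLeastOncePerRetry') {
      const entry = checkpoints.get(name);
      if (entry && entry.status === 'DONE') return entry.result;
      if (entry && entry.status === 'START') {
        // A START with no result means a previous attempt may have run the body.
        throw new StepInterruptedError(`${name} was started but never finished`);
      }
      if (semantics === 'AtMostOncePerRetry') checkpoints.set(name, { status: 'START' });
      const result = fn();
      checkpoints.set(name, { status: 'DONE', result });
      return result;
    },
  };
}

let charges = 0;
let crashOnce = true;
function chargeCard() {
  charges += 1; // the side effect happens...
  if (crashOnce) {
    crashOnce = false;
    throw new Error('simulated crash'); // ...then the process "dies" before the checkpoint
  }
  return 'charged';
}

// Runs the same step on two consecutive invocations that share one checkpoint store.
function run(semantics) {
  charges = 0;
  crashOnce = true;
  const checkpoints = new Map();
  const outcomes = [];
  for (let invocation = 0; invocation < 2; invocation++) {
    try {
      outcomes.push(createContext(checkpoints).step('charge', chargeCard, semantics));
    } catch (err) {
      outcomes.push(err.constructor.name);
    }
  }
  return { charges, outcomes };
}

console.log(run('AtLeastOncePerRetry')); // { charges: 2, outcomes: [ 'Error', 'charged' ] }
console.log(run('AtMostOncePerRetry'));  // { charges: 1, outcomes: [ 'Error', 'StepInterruptedError' ] }
```

Under at-least-once the card is charged twice. Under at-most-once it is charged once and the workflow gets an explicit error to handle, which is the safer failure mode for payments or SMS.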
Step Semantics Demo • 5:58
The four-scenario live experiment that makes semantics concrete.

What is Serialization and Deserialization (SerDes)? • 6:09
Checkpoints are stored as stringified JSON, and JSON does not know about JavaScript classes.
When a step returns an instance of your Order class, the checkpoint saves a plain JSON object. On replay, the SDK reads that JSON back, but the result is now a plain object: the data fields are there, but there is no label() method and no Order prototype.
Calling order.label() throws TypeError: order.label is not a function. This lecture demonstrates the problem live: an Order class with a label() method, returned from a step, works perfectly on the first invocation and crashes on replay. This is the exact scenario where SerDes is required.
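The underlying behaviour is plain JavaScript and can be reproduced without the SDK. In this small example the Order class and its fields are made up for illustration.

```javascript
class Order {
  constructor(id, item) {
    this.id = id;
    this.item = item;
  }
  label() {
    return `#${this.id} ${this.item}`;
  }
}

const original = new Order(7, 'shoes');

// What a JSON checkpoint round trip does to a class instance.
const roundTripped = JSON.parse(JSON.stringify(original));

console.log(original.label());              // #7 shoes
console.log(roundTripped.item);             // shoes
console.log(roundTripped instanceof Order); // false
console.log(typeof roundTripped.label);     // undefined
```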
Custom Serialization and Deserialization (SerDes) Strategy • 2:40
A SerDes is an object with two methods: serialize and deserialize. You can implement your own logic for these methods.
createClassSerdes Helper Function • 1:52
Writing serialize/deserialize manually for every class gets repetitive. createClassSerdes(MyClass) is the SDK's built-in shortcut. It generates the SerDes for you by calling new MyClass() and using Object.assign to copy parsed JSON fields onto the instance.
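Based on that description, a helper of this kind might look roughly like the sketch below. This is not the SDK's source: sketchClassSerdes is a hypothetical name, and the real serialize and deserialize methods may be asynchronous or take extra arguments.

```javascript
// Hypothetical re-creation of the idea: build a SerDes for any class by
// instantiating it and copying the parsed JSON fields onto the instance.
function sketchClassSerdes(Class) {
  return {
    serialize: (value) => JSON.stringify(value),
    deserialize: (data) => Object.assign(new Class(), JSON.parse(data)),
  };
}

class Order {
  constructor(id, item) {
    this.id = id;
    this.item = item;
  }
  label() {
    return `#${this.id} ${this.item}`;
  }
}

const orderSerdes = sketchClassSerdes(Order);
const checkpointed = orderSerdes.serialize(new Order(7, 'shoes'));
const revived = orderSerdes.deserialize(checkpointed);

console.log(revived instanceof Order); // true
console.log(revived.label());          // #7 shoes
```

One consequence of this approach is that the class constructor must tolerate being called with no arguments, since the fields are filled in afterwards.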
createClassSerdesWithDates • 1:38
A SerDes helper function that preserves Date methods.
Serdes Use Case: Reduce Checkpoint Storage Cost By Compressing The Data • 2:03
SerDes is not just for class restoration. The serialize method can return any string, which means you can run the data through any transformation before checkpointing it. This lecture uses Node's built-in zlib to gzip-compress a 500-row Report object before it hits the checkpoint store: serialize runs gzipSync(json).toString('base64'), deserialize runs gunzipSync(Buffer.from(data, 'base64')).
Large step results (reports, API payloads, document content) can be compressed 60–80% before storage, directly cutting checkpoint storage costs.
The same pattern extends to encryption, S3 offload for very large payloads, or any custom wire format your downstream system requires.
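The compression round trip described above can be tried in isolation with Node's zlib module. The gzipSerdes object and the shape of the sample report rows are assumptions made for this sketch; how such an object is passed to a step depends on the SDK and is not shown here.

```javascript
const zlib = require('node:zlib');

// SerDes pair: JSON -> gzip -> base64 string, and back again.
const gzipSerdes = {
  serialize: (value) => zlib.gzipSync(JSON.stringify(value)).toString('base64'),
  deserialize: (data) =>
    JSON.parse(zlib.gunzipSync(Buffer.from(data, 'base64')).toString('utf8')),
};

// A 500-row report with repetitive content, the kind of data that compresses well.
const report = Array.from({ length: 500 }, (_, i) => ({
  row: i,
  region: 'eu-west-1',
  status: 'OK',
}));

const plainJson = JSON.stringify(report);
const stored = gzipSerdes.serialize(report);
const restoredReport = gzipSerdes.deserialize(stored);

console.log(stored.length < plainJson.length); // true
console.log(restoredReport.length);            // 500
```

The actual saving depends heavily on the data: repetitive rows shrink a lot, while already-compressed or random content may barely shrink at all.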

Saga Pattern • 6:10
When a distributed workflow touches multiple external systems, a failure midway through leaves everything in an inconsistent state. The Saga pattern solves this by building a compensation stack as steps succeed: each successful step pushes a corresponding undo operation. If any step throws, the catch block runs all compensations in reverse order.
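The compensation stack can be sketched in plain JavaScript. This version is synchronous for brevity (in a durable function each action and compensation would be an awaited step), and runSaga along with the step names is hypothetical.

```javascript
// Runs actions in order; on failure, runs the recorded compensations in reverse.
function runSaga(steps) {
  const compensations = [];
  try {
    for (const { name, action, compensate } of steps) {
      action();
      compensations.push({ name, compensate }); // only pushed once the action succeeded
    }
    return { status: 'COMPLETED' };
  } catch (error) {
    const undone = [];
    while (compensations.length > 0) {
      const { name, compensate } = compensations.pop(); // last in, first out
      compensate();
      undone.push(name);
    }
    return { status: 'ROLLED_BACK', reason: error.message, undone };
  }
}

const sagaLog = [];
const sagaResult = runSaga([
  {
    name: 'reserve-inventory',
    action: () => sagaLog.push('reserve'),
    compensate: () => sagaLog.push('release'),
  },
  {
    name: 'charge-payment',
    action: () => sagaLog.push('charge'),
    compensate: () => sagaLog.push('refund'),
  },
  {
    name: 'book-shipping',
    action: () => { throw new Error('no carrier available'); },
    compensate: () => sagaLog.push('cancel-shipping'),
  },
]);

console.log(sagaResult.undone); // [ 'charge-payment', 'reserve-inventory' ]
console.log(sagaLog);           // [ 'reserve', 'charge', 'refund', 'release' ]
```

The failed step's own compensation never runs because it was never pushed. Only work that actually succeeded gets undone.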
Saga Pattern Example • 9:50
This lecture walks through a Saga pattern demo.

Requirements

Basic understanding of AWS services including Lambda, S3, DynamoDB and IAM
Fundamentals of JavaScript
Active AWS account for creating resources

Description

AWS Lambda Durable Functions: From Zero to Hero

You're building a workflow that includes payment confirmation, human approval, and an external API call that takes hours. On regular Lambda, you reach for SQS, DynamoDB state tables, and Step Functions JSON and end up with five services stitched together with glue code that breaks in ways you can't predict.

Until now, coordinating multi-step asynchronous processes on AWS meant manually stitching together SQS queues, maintaining custom state tables in DynamoDB, or wrestling with massive, unreadable Step Functions JSON/YAML definitions. Worse, you were constantly fighting the hard 15-minute execution limit of standard stateless Lambda functions.

AWS Lambda Durable Functions change everything.

This new execution model lets you write long-running, stateful workflows that can run for up to one full year entirely in code. Workflows automatically pause while waiting for payments, human approvals, callbacks, or external events, then resume exactly where they left off without losing state. They survive failures, handle retries natively, and charge you zero compute costs while suspended.

No orchestration glue code. No custom state tables. No workflow spaghetti. Just durable serverless workflows written using the Lambda programming model you already know.

This course takes you from the fundamentals of Durable Functions to building resilient, production-ready serverless systems.

Why Learn From This Course?

This isn't a shallow overview of the documentation.

The curriculum is built by an AWS-certified engineer whose AWS courses have been featured on freeCodeCamp and who contributed bug fixes directly to the official AWS Durable Functions SDK and docs.

You'll learn how Durable Functions behave in real-world environments, inspect execution histories and CloudWatch logs, understand replay behavior, troubleshoot failures, implement retries, and apply production-grade engineering patterns that go far beyond simple demos.

What You Will Build

QuantaSneaks Drop E-Commerce System: Build a distributed sneaker drop platform that coordinates multiple Lambda functions, manages workflow state, integrates AI risk scoring, and implements a real human-in-the-loop approval process.

When an order is rejected, a Saga Pattern automatically compensates payment.

What You'll Learn

• Checkpoint & Replay Internals

• Durable Operations & Workflow Design

• waitForCondition, waitForCallback & Heartbeats

• Parallel Execution & Map Operations

• Retry Strategies & Failure Recovery

• Idempotency & Execution Semantics

• Saga Pattern & Distributed Transactions

• Human-in-the-Loop Workflows

• Testing Durable Functions

• CloudWatch Observability & Execution History

• Infrastructure as Code with AWS CDK

Requirements

• Basic AWS knowledge (Lambda, IAM, CloudWatch)

• JavaScript fundamentals

• AWS Account (Free Tier is sufficient)

If you want to master one of AWS's most powerful new serverless capabilities, this course is for you. The window to learn it before it becomes mainstream is right now. Enroll and get ahead of it.

Who this course is for:

Backend Developers & AWS Engineers who are tired of over-engineering architectures by stitching together SQS, Step Functions, and DynamoDB just to coordinate a few Lambda functions.
AI Engineers building LLM pipelines or human-in-the-loop applications who need workflows that can suspend for hours or days without burning unnecessary compute.
AWS Cloud Architects who want to gain a massive competitive edge by mastering a brand-new, game-changing AWS service before it becomes mainstream.

Course content

Course Prerequisites • 3 lectures • 5min

Getting Started with AWS Lambda Durable Functions • 5 lectures • 23min

Durable Steps • 5 lectures • 16min

Mastering Wait Operations in Durable Lambda Functions • 8 lectures • 53min

Concurrent Operations in Durable Lambda Functions • 4 lectures • 17min

Project2: Employee Reimbursement Manager (Document Approval Workflow Pattern) • 2 lectures • 22min

Workflows with Child Contexts • 1 lecture • 3min

Error Handling, Retries, and Execution Semantics • 6 lectures • 30min

Serialization and Deserialization (SerDes) in Durable Lambda Functions • 5 lectures • 14min

Implementing the Saga Pattern with Durable Lambda Functions • 2 lectures • 16min