
Introduction to real-world bot traffic and the key drivers behind the July 2025 surge (AI data demand, Google updates, and security gaps), with strategies for managing AI bots on modern infrastructure.
Walkthrough of the course lab components: Flask app, Terraform modules, and the final AWS architecture with CloudFront and WAF.
Covers identification and classification, auto-scaling, caching with CloudFront, and the degraded content strategy, including solutions to the missing-assets problem.
Covers perimeter defenses with WAF, application-layer defenses with Bot Control SDK integration, and the strategic bot policy framework of allow, block, and degrade.
Docker Compose setup, running the container, verifying the test page, and reviewing the Dockerfile and Gunicorn entry point.
Application routes, templates, the WAF integration URL variable, and how the test HTML page is structured.
Installing tfenv, managing multiple Terraform versions, switching between versions, and verifying the installation.
AWS CLI profile setup, exporting environment variables, verifying account identity, and configuring env.tfvars.
Creating the ECR repository, running the build and push scripts, and verifying the image in the AWS Console.
Terraform workflow overview, version requirements, and expectations for the infrastructure deployment sections.
Creating the S3 state bucket, DynamoDB lock table, VPC, subnets, route tables, internet gateway, and security groups.
Route 53 hosted zone creation, NS record update at the domain registrar, and DNS-validated SSL certificates for ALB and CloudFront.
Application Load Balancer deployment with HTTP-to-HTTPS redirect, certificate attachment, S3 access logging, and CloudWatch alarms.
Clarifies why the project name prefix for some variables differs in this version of the code. All concepts remain fully applicable.
Launch template configuration, user data script, IAM role with ECR and CloudWatch permissions, and Auto Scaling group setup.
Target group creation, health check configuration at the /ping endpoint, and host-header-based listener rules on the ALB.
Instance verification, Auto Scaling group status, key pair setup, SSH configuration, and confirming the Docker container is running.
Distributions, behaviors, origins, cache policies, origin request policies, and response header policies explained.
CloudFront distribution with ALB and S3 origins, multiple cache behaviors, Route 53 DNS alias, and S3 log buckets.
WAF web ACL creation, Kinesis Firehose log streaming to S3, IP sets for allow and block lists, and CloudWatch alarms.
Resource inspection in the console, verifying the test page via CloudFront, and implementing maintenance mode using ALB rules and S3.
Auto-scaling inertia explained: metric delay, cooldown periods, container startup time, and why scaling cannot respond to two-minute bot spikes.
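The inertia argument comes down to adding up delays. The sketch below does that arithmetic in Python; every number in it is an assumed placeholder rather than a figure from the course lab, and the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalingDelays:
    """Assumed delays, in seconds, between a spike starting and new capacity serving traffic."""
    metric_period_s: int = 60      # assumed CloudWatch metric period
    evaluation_periods: int = 2    # assumed periods the alarm must breach
    instance_boot_s: int = 90      # assumed EC2 launch and user data time
    container_start_s: int = 30    # assumed image pull and app startup
    health_check_s: int = 30       # assumed time to pass target group health checks

    def time_to_capacity(self) -> int:
        """Total seconds before a newly launched instance can take requests."""
        return (
            self.metric_period_s * self.evaluation_periods
            + self.instance_boot_s
            + self.container_start_s
            + self.health_check_s
        )


def capacity_arrives_in_time(delays: ScalingDelays, spike_duration_s: int) -> bool:
    """True only if the new instance is ready before the spike has ended."""
    return delays.time_to_capacity() <= spike_duration_s


if __name__ == "__main__":
    delays = ScalingDelays()
    print(delays.time_to_capacity())              # 270 with the assumed defaults
    print(capacity_arrives_in_time(delays, 120))  # False for a two-minute spike
```

With these assumed values the new instance is ready well after a two-minute burst has passed, which is the reason caching and edge filtering carry the load in the later sections.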
Real ALB metrics and application logs showing a PetalBot and AhrefsBot spike, CPU saturation, and the inability of auto-scaling to mitigate the burst.
Edge locations, regional edge caches, cache policies, cache keys, TTL settings, invalidation, and the difference between CloudFront Functions and Lambda@Edge.
Use cases for serving degraded content to bots: cached full content, simplified HTML, and hybrid approaches that balance freshness with cost savings.
Concrete AWS implementation: Lambda@Edge architecture for User-Agent inspection, forwarding bot traffic to a secondary CloudFront distribution backed by S3.
Terraform packaging, IAM role, function association, and the Python logic for User-Agent detection and origin swap.
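A minimal sketch of what User-Agent detection and an origin swap can look like in a Python Lambda@Edge handler. It is not the course's code: the bot markers and the `degraded.example.com` domain are hypothetical, and it assumes an origin-request trigger with the User-Agent header forwarded to the function.

```python
# Illustrative substrings only; a real list would come from traffic analysis.
BOT_MARKERS = ("petalbot", "ahrefsbot", "gptbot")

# Hypothetical domain of the secondary distribution that serves degraded content.
DEGRADED_ORIGIN_DOMAIN = "degraded.example.com"


def is_bot(user_agent: str) -> bool:
    """Case-insensitive substring match against the marker list."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def handler(event, context):
    """Origin-request handler: reroute bot requests, pass everything else through."""
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    ua_values = headers.get("user-agent", [])
    user_agent = ua_values[0]["value"] if ua_values else ""

    if is_bot(user_agent):
        # Replace the origin with a custom origin pointing at the degraded site.
        request["origin"] = {
            "custom": {
                "domainName": DEGRADED_ORIGIN_DOMAIN,
                "port": 443,
                "protocol": "https",
                "path": "",
                "sslProtocols": ["TLSv1.2"],
                "readTimeout": 30,
                "keepaliveTimeout": 5,
                "customHeaders": {},
            }
        }
        # The Host header has to match the new origin.
        headers["host"] = [{"key": "Host", "value": DEGRADED_ORIGIN_DOMAIN}]

    return request
```

Substring matching on User-Agent is easy to spoof, so a check like this is best treated as a cost-saving routing hint rather than a security control.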
Deployment, testing with a custom User-Agent, verifying S3 content delivery, and observing cache collision between bot and human responses.
Demonstrates how bot and human cache mixing occurs when both share the same cache key, and why this produces incorrect responses for real users.
JavaScript CloudFront Function that adds an x-bot header, cache policy update to include the new header, and verification of separate cache behavior.
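The effect of adding the x-bot header to the cache key can be illustrated with a toy in-memory cache. This Python sketch is a simplified model and not how CloudFront is implemented; the `EdgeCache` class and the `origin` function are hypothetical names.

```python
class EdgeCache:
    """Toy cache whose key is the URI plus the values of selected headers."""

    def __init__(self, key_headers=()):
        self.key_headers = tuple(key_headers)
        self.store = {}

    def fetch(self, uri, headers, origin):
        # Build the cache key from the URI and any headers the policy includes.
        key = (uri,) + tuple(headers.get(h, "") for h in self.key_headers)
        if key not in self.store:
            self.store[key] = origin(headers)  # cache miss: ask the origin
        return self.store[key]


def origin(headers):
    """Stand-in origin: degraded page for bots, full page for everyone else."""
    return "degraded" if headers.get("x-bot") == "true" else "full"


if __name__ == "__main__":
    bot = {"x-bot": "true"}
    human = {"x-bot": "false"}

    shared = EdgeCache()                     # key is the URI only
    shared.fetch("/", bot, origin)           # bot request fills the cache first
    print(shared.fetch("/", human, origin))  # degraded: the human gets the bot's page

    split = EdgeCache(key_headers=["x-bot"])  # key includes the x-bot header
    split.fetch("/", bot, origin)
    print(split.fetch("/", human, origin))    # full: separate cache entries
```

Whichever request arrives first decides what the shared cache holds, so the collision can just as easily serve full content to bots.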
Explains how cached HTML can reference assets that no longer exist after deployment, and why crawlers behave differently from browsers.
Reproduces the 404 error in the lab, demonstrates broken page rendering, and explains why per-POP caching makes detection difficult.
Enabling Origin Shield in Terraform, understanding its centralized caching benefit, and proving that it does not fully solve the missing assets problem.
Versioned S3 bucket for CSS and JS files, origin change in CloudFront behaviors, retention policy, and verification that old assets remain available.
Creating self-contained HTML with inline CSS and JS for degraded content, eliminating all external asset dependencies.
Summary of degraded content routing, cache separation with CloudFront Functions, and immutable asset deployments as the production pattern.
WAF as a reverse proxy, AWS managed rules versus custom rules, the ACL structure, COUNT mode, and how rules are evaluated by priority.
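Priority-ordered evaluation and the non-terminating nature of COUNT can be shown with a small model. The Python below is a simplified sketch and not the AWS WAF engine; the `Rule` class, the rule names, and the sample request are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    priority: int                    # lower numbers are evaluated first
    action: str                      # "ALLOW", "BLOCK", or "COUNT"
    matches: Callable[[dict], bool]  # predicate over a request dict


def evaluate(rules, request, default_action="ALLOW"):
    """Return (final_action, names_of_matching_count_rules)."""
    counted = []
    for rule in sorted(rules, key=lambda r: r.priority):
        if not rule.matches(request):
            continue
        if rule.action == "COUNT":
            counted.append(rule.name)  # record the match and keep evaluating
            continue
        return rule.action, counted    # ALLOW and BLOCK end evaluation
    return default_action, counted


if __name__ == "__main__":
    blocked_ips = {"203.0.113.7"}  # documentation-range address, illustrative
    rules = [
        Rule("block-listed-ips", 1, "BLOCK", lambda r: r["ip"] in blocked_ips),
        Rule("count-bot-agents", 0, "COUNT", lambda r: "bot" in r["ua"].lower()),
    ]
    request = {"ip": "203.0.113.7", "ua": "ExampleBot/1.0"}
    print(evaluate(rules, request))  # ('BLOCK', ['count-bot-agents'])
```

Running a new rule in COUNT mode first shows how often it would match in the logs before it is switched to BLOCK.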
IP sets for allow and block rules, combining IP conditions with User-Agent checks, and security risks of whitelisting.
Geo match rule with country codes, NOT statement for IP allow-list exceptions, and the strategy of blocking high-risk countries while preserving bot access.
Athena configuration, S3 results bucket, WAF log table creation with partitioning, and geographic bot distribution queries.
JA4 fingerprint overview, enabling it in CloudFront and WAF, Terraform rule with custom key aggregation, and fallback behavior.
Athena percentile query to calculate thresholds, real traffic pattern analysis, scope-down statements, and production data examples.
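The same threshold arithmetic can be tried offline before writing the Athena query. This Python sketch uses a nearest-rank percentile over made-up per-IP request counts; the numbers are invented for illustration and are not data from the course.

```python
def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p percent of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100  # integer ceiling of p * n / 100
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical requests per IP in one five-minute window.
    requests_per_ip = [3, 5, 8, 12, 15, 20, 22, 30, 45, 400]

    print(percentile(requests_per_ip, 90))  # 45
    print(percentile(requests_per_ip, 99))  # 400
```

In this made-up sample a limit near the 90th percentile leaves ordinary clients untouched while the single outlier at 400 requests stands out clearly.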
Per-URI rate limits for API endpoints and category pages using single-match and OR-condition rules in Terraform.
Modern web applications are increasingly exposed to a surge of automated traffic driven by AI crawlers, LLM scrapers, and malicious bots. These automated requests can consume bandwidth, distort analytics, increase infrastructure costs, and degrade application performance. Traditional defense mechanisms are no longer sufficient to handle these evolving threats. This course provides a comprehensive, hands-on approach to building a robust, multi-layered defense system against AI-driven bot traffic using AWS services.
In this course, you will learn how to design and deploy a production-grade infrastructure using Terraform, Amazon CloudFront, AWS WAF, Lambda@Edge, and other essential tools. Starting with a simple Flask application, you will progressively build a complete AWS environment, including networking, load balancing, auto-scaling, and edge delivery. You will then enhance this architecture with intelligent traffic routing, bot-aware caching strategies, and degraded content delivery techniques to manage bot traffic efficiently without impacting real users.
The course also emphasizes real-world problem-solving, such as handling sudden bot traffic spikes, preventing cache collisions, and resolving missing asset issues. Additionally, you will analyze traffic data using Amazon Athena to generate actionable insights and implement a strategic bot management policy based on real data.
By the end of this course, you will have the skills to design, deploy, and manage a scalable, secure, and cost-efficient AWS-based system that effectively defends against modern AI bot threats using a data-driven and infrastructure-as-code approach.