
Explore common Kinesis Data Streams issues, including hot shards, throughput limits, data duplication, and ordering across shards; learn remedies such as random partition key prefixes, unique producer IDs, and shard scaling.
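The random-prefix remedy for hot shards can be sketched as follows. This is a minimal illustration, not the course's code: Kinesis maps a partition key to a shard by taking its MD5 hash as a 128-bit integer, but the evenly split hash ranges, the helper names (`shard_for_key`, `salted_key`, `distribution`), and the `device-42` key are assumptions made for the example.

```python
import hashlib
import random
from collections import Counter


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index.

    Kinesis hashes the key with MD5 into a 128-bit integer; here we ASSUME
    the shards split that hash space into equal ranges.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = 2**128 // shard_count
    return min(hash_value // range_size, shard_count - 1)


def salted_key(base_key: str, salt_buckets: int, rng: random.Random) -> str:
    """Prepend a random bucket number so one busy key spreads over many hash values."""
    return f"{rng.randrange(salt_buckets)}-{base_key}"


def distribution(keys, shard_count: int) -> Counter:
    """Count how many records land on each shard."""
    return Counter(shard_for_key(k, shard_count) for k in keys)


if __name__ == "__main__":
    rng = random.Random(7)
    hot = ["device-42"] * 1000
    salted = [salted_key("device-42", 16, rng) for _ in range(1000)]
    print("unsalted:", dict(distribution(hot, 4)))
    print("salted:  ", dict(distribution(salted, 4)))
```

The trade-off is that records for the same original key no longer share a shard, so per-key ordering is lost and consumers must strip the prefix before grouping.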
Work through a hands-on lab with Kinesis Data Streams: create a stream, select a capacity mode, and send and consume records using the SDK and KPL, with monitoring via CloudWatch.
The AWS Snow Family uses physical storage devices to transfer terabytes of data from on-premises environments to AWS and to export data from S3, which makes it ideal for network-restricted environments.
Explore how to mask CSV data using an S3 Object Lambda Access Point and a Lambda function, configure permissions, and test with a sample dataset.
Explore AWS database services, compare relational databases like Oracle and MySQL with NoSQL options such as DynamoDB and DocumentDB, and learn about managed patching, automatic failover, read replicas, and security features.
Understand Amazon RDS backups, including automatic backups, manual snapshots, and point-in-time recovery, which replays transaction logs stored in S3 to create a new DB instance and cannot restore to an existing one.
Explore Aurora backups, offering more extensive backup capabilities than RDS, including continuous automatic backups, point-in-time recovery, cloning, and backtrack up to 72 hours.
Create and monitor an Aurora database from the RDS console, selecting MySQL compatibility and configuring cluster settings, backups, Multi-AZ deployment, endpoints, monitoring, and Performance Insights for optimization.
Discover how Amazon Redshift distributes data across nodes using the KEY, EVEN, and ALL distribution styles to optimize joins, local processing, and replication for small lookup tables.
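The three Redshift distribution styles can be illustrated with a toy model. This is a sketch under stated assumptions: the slices are plain Python lists, `zlib.crc32` stands in for Redshift's internal hash (which is not exposed), and the function names and sample rows are hypothetical.

```python
import zlib


def distribute_key(rows, key, slices):
    """KEY style: rows with the same key value always land on the same slice."""
    placement = [[] for _ in range(slices)]
    for row in rows:
        # crc32 is only a stand-in hash for illustration.
        idx = zlib.crc32(str(row[key]).encode("utf-8")) % slices
        placement[idx].append(row)
    return placement


def distribute_even(rows, slices):
    """EVEN style: round-robin placement, ignoring row contents."""
    placement = [[] for _ in range(slices)]
    for i, row in enumerate(rows):
        placement[i % slices].append(row)
    return placement


def distribute_all(rows, slices):
    """ALL style: every slice holds a full copy, suited to small lookup tables."""
    return [list(rows) for _ in range(slices)]


if __name__ == "__main__":
    orders = [{"customer_id": i % 5, "order_id": i} for i in range(20)]
    for name, placement in [
        ("KEY", distribute_key(orders, "customer_id", 4)),
        ("EVEN", distribute_even(orders, 4)),
        ("ALL", distribute_all(orders, 4)),
    ]:
        print(name, [len(s) for s in placement])
```

In this model, KEY distribution keeps matching join keys on the same slice so joins can run locally, EVEN balances row counts, and ALL trades storage for join locality.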
Explore data control language (DCL) in Redshift, managing permissions via IAM and Lake Formation for specific users, roles, or tables, plus commands for cluster management and system information.
Learn how Amazon Redshift supports batch data analysis and streaming ingestion, use Amazon Managed Service for Apache Flink for real-time analytics, and query data on S3 with Redshift Spectrum and Athena.
Explore DAX, a fully managed in-memory cache for DynamoDB that delivers microsecond read latency and uses write-through caching, so each write goes to both the cache and DynamoDB, with a default time-to-live of five minutes.
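The write-through pattern with a time-to-live can be sketched in a few lines. This is not the DAX client API: the `WriteThroughCache` class, the plain dict standing in for DynamoDB, and the injectable clock are all assumptions made for illustration; only the five-minute default comes from the description above.

```python
import time


class WriteThroughCache:
    """Toy write-through cache with a per-item TTL (default 300 s = five minutes)."""

    def __init__(self, backing_store: dict, ttl_seconds: float = 300.0, clock=time.monotonic):
        self._store = backing_store   # stands in for the DynamoDB table
        self._ttl = ttl_seconds
        self._clock = clock           # injectable so expiry can be tested
        self._cache = {}              # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def put(self, key, value):
        # Write-through: update the backing store and the cache together.
        self._store[key] = value
        self._cache[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._cache.get(key)
        if entry is not None and entry[1] > self._clock():
            self.hits += 1
            return entry[0]
        # Expired or absent: read from the backing store and repopulate.
        self.misses += 1
        value = self._store.get(key)
        if value is not None:
            self._cache[key] = (value, self._clock() + self._ttl)
        return value


if __name__ == "__main__":
    table = {}
    cache = WriteThroughCache(table)
    cache.put("sku-1", {"price": 10})
    print(cache.get("sku-1"), "hits:", cache.hits, "misses:", cache.misses)
```

Because every write updates the cache immediately, a read that follows a write through the same cache is served from memory until the TTL expires.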
Perform a hands-on DynamoDB session focusing on scanning and querying a test product three table, including batch write, projection, and key condition examples.
Demonstrate creating DynamoDB tables with local secondary indexes (LSIs) and global secondary indexes (GSIs), using price as the LSI sort key and sku and price as the GSI keys, then query via the GSI.
Explore Amazon MemoryDB for Redis, a fully managed in-memory database delivering high performance, Multi-AZ durability, primary and replica consistency, backups up to 35 days, and up to 500 nodes.
Explore Amazon DocumentDB, a MongoDB-compatible, schemaless document database with a JSON-like structure, scalable multi-AZ deployment, and MongoDB API migration within a VPC.
Explore ETL processing concepts, comparing batch ETL with ELT in cloud data warehouses, and review AWS services for batch and streaming ETL, including AWS Glue, EMR, Lambda, and Flink.
Learn how AWS Glue provides an ETL tool with a crawler, data catalog, and schema registry to run ETL jobs, apply data quality rules, and automate workflows in Glue Studio.
Explore AWS Glue crawler capabilities, populating the AWS Glue Data Catalog with schemas and storage details from databases, streaming data, and S3, and leverage the schema registry for real-time streaming.
Configure asynchronous invocation for an AWS Lambda function, deliberately trigger a failure, and route failed events to an SQS dead letter queue while validating retries and monitoring.
Explore Amazon EMR, a large-scale distributed platform using Spark, Hive, and Presto to handle petabytes of data, with Elastic MapReduce and S3-based data persistence.
Explore how AWS Step Functions coordinates multiple services into scalable workflows using state machines, parallel processing, and conditional branches, in contrast with SQS for decoupled, linear tasks.
Perform a hands-on setup of API Gateway with Lambda aliases to separate development and production. Implement canary routing and weighted traffic for a new feature.
Explore Amazon MWAA, the managed Apache Airflow service, to orchestrate pipelines with DAGs, S3 docs, EC2 execution, and CloudWatch monitoring, and learn how directed acyclic graphs (DAGs) differ from undirected graphs when modeling workflow dependencies.
Improve OpenSearch performance by keeping shard size between 10 and 50 GB and shard count between 20 and 50, use bulk indexing, and apply search_after alongside phrase, proximity, fuzzy, and language-based search.
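The shard-size guideline above turns into simple arithmetic. A minimal sketch, assuming you know the expected index size and pick a target shard size inside the 10 to 50 GB range; the function name and the 30 GB default are hypothetical choices, not values from the course.

```python
import math

MIN_SHARD_GB = 10
MAX_SHARD_GB = 50


def primary_shard_count(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Return how many primary shards keep each shard near the target size."""
    if not MIN_SHARD_GB <= target_shard_gb <= MAX_SHARD_GB:
        raise ValueError("target shard size should stay within 10 to 50 GB")
    if index_size_gb <= 0:
        raise ValueError("index size must be positive")
    # Round up so no shard exceeds the target; always keep at least one shard.
    return max(1, math.ceil(index_size_gb / target_shard_gb))


if __name__ == "__main__":
    for size_gb in (5, 100, 600):
        print(size_gb, "GB ->", primary_shard_count(size_gb), "primary shards")
```

For example, a 600 GB index at a 30 GB target works out to 20 primary shards.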
Explore Amazon QuickSight, a managed BI tool that scales to hundreds of thousands of users and uses an in-memory engine for high performance without loading data; compare the Standard and Enterprise editions.
Demonstrate a hands-on permission boundary in AWS IAM, showing how a two-part boundary allows all IAM actions yet denies attaching admin access to other users.
Learn how AWS STS issues temporary credentials via AssumeRole for cross-account access, with expiration and dynamic generation, enabling secure use from the CLI and environment variables.
Learn Amazon CloudWatch, the integrated operational monitoring service for AWS resources, collecting logs and metrics, triggering alerts via SNS and Lambda, and routing events with EventBridge.
Explore AWS CloudTrail, which records user activity and API calls as management and data events, logs timestamps and resources, and enables insights for anomaly detection; data events require explicit enablement.
Set up a VPC flow log workflow by creating an S3 bucket and EC2 instance, configuring IAM roles, saving logs to S3, and querying them with Amazon Athena.
Explore VPC endpoints to access AWS services privately from your VPC using interface and gateway endpoints, via PrivateLink, with S3 or DynamoDB, and DNS or ENI considerations.
Set up and test a VPC endpoint for S3 by creating gateway and interface endpoints, attaching an IAM role with S3 read access, and validating access from an EC2 instance.
Explore AWS Direct Connect basics, including dedicated and hosted connections, private/public/transit virtual interfaces, and using Direct Connect Gateway and Transit Gateway for multi-region connectivity, with VPN backup and resilience options.
Configure CloudFront origin access control (OAC) to block direct access to an S3 bucket and deliver content only via CloudFront, by creating a private bucket, a distribution, and bucket policy.
Set up a CloudFront distribution with an S3 origin and insert a custom header using a CloudFront function. Then rewrite a page URL with Lambda@Edge to serve S3 origin content.
*Notice: The Data Analytics - Specialty exam was retired in April 2024. This course has been transitioned to align with the Data Engineer - Associate certification. The content reflects the latest Data Engineer Associate services and guidelines. If you are interested in a dedicated course specifically for the Data Engineer Associate exam, please feel free to contact me for a coupon.
Let's dive into the world of AWS Data Analytics. This comprehensive course provides you with everything you need to understand and utilize AWS data analytics services effectively.
Inside this course, you will find:
Over 600 pages of detailed presentation slides.
More than 10 hours of in-depth video content.
Curriculum aligned with the latest exam guidelines and updated AWS services.
Hands-on demonstrations through actual console operations.
Practice tests to solidify your knowledge.
The course is organized into the following domains:
Domain 1: Collection:
Covers Amazon Kinesis, Amazon Data Firehose, Amazon Managed Service for Apache Kafka, Amazon SQS, Amazon MQ, AWS Snow Family, AWS Transfer Family, AWS Database Migration Service, Amazon AppFlow, and AWS Data Exchange.
Domain 2: Storage and Data Management:
Focuses on Amazon S3, AWS Lake Formation, Amazon EFS, Amazon EBS, Amazon RDS, Amazon Aurora, Amazon Redshift, Amazon DynamoDB, Amazon MemoryDB for Redis, Amazon DocumentDB, and Amazon Neptune.
Domain 3: Processing:
Explores AWS Glue, AWS Lambda, Amazon EMR, AWS Step Functions, Amazon API Gateway, and Amazon MWAA.
Domain 4: Analysis and Visualization:
Includes Amazon Athena, Amazon OpenSearch, Amazon SageMaker, and Amazon QuickSight.
Domain 5: Security:
Details AWS IAM, AWS STS, AWS KMS, AWS Secrets Manager, Amazon Macie, Amazon GuardDuty, Amazon Detective, Amazon CloudWatch, AWS CloudTrail, and Network Security.
Additional Resources:
Covers Developer Tools and Cost Management.
About Me (Maruchin Tech)
Hi, I'm the instructor behind Maruchin Tech! I've developed over 40 courses and practice tests here on Udemy, including a deep dive into AWS with more than 20 specialized courses.
I'm proud to have taught over 80,000 students and to have earned an average instructor rating of 4.5+ stars.
My professional focus is in the EdTech industry, where I'm passionate about creating high-quality educational content on cloud technology and programming.
My background is in computer science, and after graduating, I worked for a major Japanese automotive company. I then transitioned into IT consulting, specializing in projects for the manufacturing and logistics sectors.
I hold all active AWS certifications (as of 2024).
I'm excited to help you achieve your learning goals!