Hello, my name is Chandra Lingam, and I am your instructor for Data Lake in AWS.
In this course, we will start by understanding when a data lake is the right solution as opposed to a data warehouse.
Throughout the next two hours, you will learn all the components of a data lake.
One advantage of a data lake is the flexibility to query files directly using SQL.
You will start by building a Glue Data Catalog and using Athena to query it.
Then, we will work on Glue ETL, a powerful Apache Spark-based solution for data transformation.
You will learn the finer points of Glue Catalog management and schema evolution.
To demonstrate the scalability of Athena, we will query the Amazon Customer Reviews data set with over 130 million reviews.
Finally, we will build a serverless application using Kinesis Firehose, Lambda, Comprehend AI, Glue, Athena, and S3 that can process unlimited customer reviews, perform sentiment analysis, and store the results in the data lake for querying.
I look forward to meeting you soon!