In the video above, the reporter discusses the top AWS data engineering and analytics services to learn. The motivation behind the video is that AWS offers more than 200 services, which makes it hard for anyone starting out in data engineering to know which ones are essential. The reporter walks through a hypothetical example of a data engineer tasked with integrating two different data sources into a central repository that the data analytics team can use for advanced analytics.
Data Ingestion is highlighted as the first step, with a focus on batch and streaming ingestion pipelines.
The AWS services discussed include AWS Lambda, AWS Glue, and Amazon EMR, each suited to a different scenario depending on data volume. The second part, Storage – S3, emphasizes using Amazon S3 as the data lake for ingested data, and highlights the importance of landing raw data in the data lake and optimizing storage formats.
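As a rough illustration of what landing raw data in the data lake can look like, the sketch below builds a date-partitioned S3 object key for a raw file. The video summary does not prescribe a layout, so the `raw/` prefix, the `year=/month=/day=` partition scheme, the function name `raw_landing_key`, and the sample source and dataset names are all assumptions made for this example.

```python
from datetime import date


def raw_landing_key(source: str, dataset: str, ingest_date: date, filename: str) -> str:
    """Build a Hive-style partitioned object key for the raw zone of a data lake.

    The prefix and partition names are illustrative assumptions, not a layout
    taken from the video.
    """
    return (
        f"raw/{source}/{dataset}/"
        f"year={ingest_date:%Y}/month={ingest_date:%m}/day={ingest_date:%d}/"
        f"{filename}"
    )


# Hypothetical source system and dataset names
key = raw_landing_key("crm", "orders", date(2024, 1, 15), "orders.json")
print(key)  # raw/crm/orders/year=2024/month=01/day=15/orders.json
```

Keeping raw files under date-based prefixes like this is one common way to make later cataloging and querying easier, since services such as AWS Glue and Amazon Athena recognize `key=value` path segments as partitions.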
The subsequent sections, Transformation, Data Catalog, Data Warehouse, Data Analytics, Application Integration, Orchestration, and Monitoring, cover AWS services such as AWS Glue, Amazon Redshift, Amazon Athena, Amazon QuickSight, Amazon EventBridge, AWS Step Functions, and Amazon CloudWatch. The reporter concludes by expressing the hope that the overview gives data engineers a good starting point for delving deeper into the relevant AWS services.