AWS Glue Job Knowledge Latest
Questions and Correct Answers
What is AWS Glue?
Ans: A serverless data integration service
What is AWS Glue crawler?
Ans: - A data discovery tool that connects to a data store
- Scans and analyzes the data.
- Infers the data schema and creates tables in the AWS Glue
Data Catalog, making the data queryable
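To make the "infers data schema" step concrete, here is a minimal sketch of type inference over a small CSV sample. It uses only the Python standard library and is not the Glue crawler's actual algorithm; the function names (`infer_type`, `infer_schema`), the sample data, and the choice of type labels are assumptions made for illustration.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest type label that fits every non-empty value."""
    def fits(cast):
        for v in values:
            if v == "":
                continue  # ignore missing values when inferring
            try:
                cast(v)
            except ValueError:
                return False
        return True

    if fits(int):
        return "bigint"
    if fits(float):
        return "double"
    return "string"


def infer_schema(csv_text):
    """Map each column of a CSV document to an inferred type label."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = rows[0].keys() if rows else []
    return {col: infer_type([r[col] for r in rows]) for col in columns}


# Hypothetical sample standing in for a file the crawler would scan.
sample = "id,price,city\n1,9.99,Lisbon\n2,12,Porto\n"
schema = infer_schema(sample)
print(schema)  # {'id': 'bigint', 'price': 'double', 'city': 'string'}
```

A real crawler also handles other formats, partitions, and classifiers; the sketch only shows the idea of deriving column types from sampled values.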
What is the purpose of AWS Glue's Data Catalog?
Ans: A metadata repository for all your data across
various data sources. It stores table definitions (schema,
location, format) rather than the data itself, so services
such as Amazon Athena can query the underlying data
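As an illustration of what "metadata repository" means, the sketch below models one catalog entry as a plain Python dictionary. The layout is loosely based on the shape of a Glue table definition (`Name`, `StorageDescriptor`, `Columns`, `Location`); the database name, table name, and bucket path are hypothetical, and no AWS call is made.

```python
# Hypothetical catalog entry: it records where the data lives and what it
# looks like, but contains none of the rows themselves.
table_entry = {
    "DatabaseName": "sales_db",  # assumed name, for illustration only
    "Name": "orders",
    "StorageDescriptor": {
        "Location": "s3://example-bucket/raw/orders/",  # assumed path
        "Columns": [
            {"Name": "id", "Type": "bigint"},
            {"Name": "price", "Type": "double"},
            {"Name": "city", "Type": "string"},
        ],
    },
}


def column_names(entry):
    """List the column names recorded in a catalog entry."""
    return [col["Name"] for col in entry["StorageDescriptor"]["Columns"]]


names = column_names(table_entry)
print(names)  # ['id', 'price', 'city']
```

A query engine would read an entry like this to learn the schema and location, then fetch the actual rows from the location itself.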
What is the process of AWS Glue Data Discovery?
Ans: An AWS Glue crawler connects to your
data store (like S3 or DynamoDB), infers the schema, and
creates a table in the Glue Data Catalog by writing metadata
What does ETL mean?
Ans: - Extract: Read data from the data source (like raw S3 or
DynamoDB)
- Transform: Reshape the data using your Glue script (like CSV to
Parquet)
- Load: Write the result to the data target (like trusted S3 or DynamoDB)
What is extraction of data?
Ans: The process of retrieving data so it can be
processed, transformed, and loaded into another data
store
What is the transformation of data?
Ans: The process of applying operations such as
cleansing, re-formatting, aggregating, and mapping
to the extracted data, typically in a Glue job script. The objective is to prepare
and optimize the data so that it can be easily consumed
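The cleansing and aggregating steps named above can be sketched in plain Python. This is not a Glue script (a real job would typically use Spark-based transforms); the sample rows and the helper names `cleanse` and `aggregate` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical extracted rows: inconsistent casing, stray spaces, a missing value.
raw_rows = [
    {"city": " lisbon ", "amount": "10.50"},
    {"city": "Porto", "amount": "4"},
    {"city": "LISBON", "amount": "2.25"},
    {"city": "Porto", "amount": ""},  # missing amount, dropped during cleansing
]


def cleanse(rows):
    """Drop rows with a missing amount and normalise city names and types."""
    return [
        {"city": r["city"].strip().title(), "amount": float(r["amount"])}
        for r in rows
        if r["amount"].strip()
    ]


def aggregate(rows):
    """Sum the amount per city."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["city"]] += r["amount"]
    return dict(totals)


totals = aggregate(cleanse(raw_rows))
print(totals)  # {'Lisbon': 12.75, 'Porto': 4.0}
```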
What is loading data?
Ans: The process of inserting the transformed data into
the target database, data warehouse, or data store, from
where users and various processes can consume it
What is it called when the extraction, transformation, and
load processes work together?
Ans: An ETL job
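A minimal sketch of the three stages working together as one job, using only the Python standard library. An in-memory CSV string stands in for the data source and an in-memory SQLite database stands in for the data target; both, along with the table name `products` and the function names, are assumptions for illustration and not part of AWS Glue.

```python
import csv
import io
import sqlite3

# Hypothetical source data; the second row has no price.
SOURCE_CSV = "id,name,price\n1,widget,2.50\n2,gadget,\n3,gizmo,7.00\n"


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without a price, cast types, upper-case names."""
    return [
        (int(r["id"]), r["name"].upper(), float(r["price"]))
        for r in rows
        if r["price"]
    ]


def load(records, conn):
    """Load: insert the transformed records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products (id INTEGER, name TEXT, price REAL)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", records)
    conn.commit()


def run_job(text, conn):
    """Run extract, transform, and load in order as a single job."""
    load(transform(extract(text)), conn)


conn = sqlite3.connect(":memory:")
run_job(SOURCE_CSV, conn)
loaded = conn.execute("SELECT id, name, price FROM products ORDER BY id").fetchall()
print(loaded)  # [(1, 'WIDGET', 2.5), (3, 'GIZMO', 7.0)]
```

Keeping each stage as its own function makes it possible to test or replace one stage without touching the others.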
Benefits of using AWS and Terraform for cloud data?
Ans: - AWS: Using Amazon Web Services (AWS) for cloud
data offers benefits like scalability, reliability, and a
wide range of cloud services and tools. AWS provides
managed data storage, processing, and analytics
services, making it easier to build and manage data
solutions at scale.
- Terraform: Terraform is an Infrastructure as Code (IaC)
tool that offers benefits such as automation, version
control, and consistency. When used in conjunction with
AWS, Terraform allows you to define and provision cloud
data resources and configurations in a structured and
repeatable manner.
Together, AWS and Terraform streamline cloud data
management by providing cloud infrastructure and
automation capabilities.
What is the Glue Console?
Ans: The Glue Console is an interface that allows you to
interact with and manage various aspects of AWS Glue,
such as creating and managing ETL jobs, defining data
catalogs, setting up crawlers, and monitoring and
orchestrating data workflows. The Glue Console is a