Answers
A data analyst is designing a solution to interactively query datasets with SQL using a JDBC
connection. Users will join data stored in Amazon S3 in Apache ORC format with data stored in
Amazon Elasticsearch Service (Amazon ES) and Amazon Aurora MySQL. Which solution will
provide the MOST up-to-date results?
A. Use AWS Glue jobs to ETL data from Amazon ES and Aurora MySQL to Amazon S3. Query
the data with Amazon Athena.
B. Use Amazon DMS to stream data from Amazon ES and Aurora MySQL to Amazon Redshift.
Query the data with Amazon Redshift.
C. Query all the datasets in place with Apache Spark SQL running on an AWS Glue developer
endpoint.
D. Query all the datasets in place with Apache Presto running on Amazon EMR.
Answer: D. Query all the datasets in place with Apache Presto running on Amazon EMR.
A company developed a new elections reporting website that uses Amazon Kinesis Data Firehose
to deliver full logs from AWS WAF to an Amazon S3 bucket. The company is now seeking a
low-cost option to perform this infrequent data analysis with visualizations of logs in a way that
requires minimal development effort. Which solution meets these requirements?
A. Use an AWS Glue crawler to create and update a table in the AWS Glue Data Catalog from
the logs. Use Athena to perform ad-hoc analyses and use Amazon QuickSight to develop data
visualizations.
B. Create a second Kinesis Data Firehose delivery stream to deliver the log files to Amazon
Elasticsearch Service (Amazon ES). Use Amazon ES to perform text- based searches of the logs
for ad-hoc analyses and use Kibana for data visualizations.
C. Create an AWS Lambda function to convert the logs into .csv format. Then add the function to
the Kinesis Data Firehose transformation configuration.
Answer: A. Use an AWS Glue crawler to create and update a table in the AWS Glue Data Catalog
from the logs. Use Athena to perform ad-hoc analyses and use Amazon QuickSight to develop
data visualizations.
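For context on option C's approach: a Kinesis Data Firehose transformation Lambda receives base64-encoded records and must return each one with a result status. A minimal sketch of such a handler, assuming a simplified log shape (the selected fields are illustrative, not the real AWS WAF log schema):

```python
import base64
import csv
import io
import json

def lambda_handler(event, context):
    """Firehose transformation: convert JSON log records to CSV lines.

    The event/response shape (recordId, data, result) follows the standard
    Kinesis Data Firehose Lambda transformation contract; the field names
    pulled from the log below are illustrative only.
    """
    output = []
    for record in event["records"]:
        log = json.loads(base64.b64decode(record["data"]))
        buf = io.StringIO()
        csv.writer(buf).writerow(
            [log.get("timestamp"), log.get("action"), log.get("httpSourceIp")]
        )
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(buf.getvalue().encode()).decode(),
        })
    return {"records": output}
```

Note this is exactly the "development effort" the question is trying to avoid, which is why the crawler-plus-Athena answer wins: it reads the JSON logs in place with no custom transformation code.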
A large company has a central data lake to run analytics across different departments. Each
department uses a separate AWS account and stores its data in an Amazon S3 bucket in that
account. Each AWS account uses the AWS Glue Data Catalog as its data catalog. There are
different data lake access requirements based on roles. Associate analysts should only have read
access to their departmental data. Senior data analysts can have access in multiple departments,
including their own, but for a subset of columns only. Which solution achieves these required
access patterns while minimizing costs and administrative tasks?
A. Consolidate all AWS accounts into one account. Create different S3 buckets for each
department and move all the data from every account to the central data lake account. Migrate
the individual data catalogs into a central data catalog and apply fine-grained permissions to give
each user the required access.
Answer: C. Set up an individual AWS account for the central data lake. Use AWS Lake
Formation to catalog the cross-account locations. On each individual S3 bucket, modify the
bucket policy to grant S3 permissions to the Lake Formation service-linked role. Use Lake
Formation permissions to add fine-grained access controls to allow senior analysts to view
specific tables and columns.
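Lake Formation enforces the column-level grants at query time, so each role only ever sees its granted columns. The effect can be modeled with a small sketch; the role names, table names, and column lists here are hypothetical, chosen to mirror the question's two analyst roles:

```python
# Conceptual model of column-level access control, as Lake Formation
# applies it when a query runs. All roles/tables/columns are hypothetical.
PERMISSIONS = {
    # Associate analysts: read-only, own department's table only.
    "associate_analyst": {"sales": {"order_id", "amount"}},
    # Senior analysts: multiple departments, but a column subset in each.
    "senior_analyst": {
        "sales": {"order_id", "amount", "region"},
        "marketing": {"campaign_id", "spend"},
    },
}

def visible_columns(role: str, table: str, row: dict) -> dict:
    """Return only the columns the role is granted on the table."""
    allowed = PERMISSIONS.get(role, {}).get(table)
    if allowed is None:
        raise PermissionError(f"{role} has no grant on {table}")
    return {k: v for k, v in row.items() if k in allowed}
```

The key point of answer C is that this filtering is defined once, centrally, in Lake Formation, instead of being re-implemented per account with bucket policies and IAM.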
A company wants to improve user satisfaction for its smart home system by adding more
features to its recommendation engine. Each sensor asynchronously pushes its nested JSON data
into Amazon Kinesis Data Streams using the Kinesis Producer Library (KPL) in Java. Statistics
from a set of failed sensors showed that, when a sensor is malfunctioning, its recorded data is not
always sent to the cloud. The company needs a solution that offers near-real-time analytics on the
data from the most updated sensors. Which solution enables the company to meet these
requirements?
A. Set the RecordMaxBufferedTime property of the KPL to "-1" to disable the buffering on the
sensor side. Use Kinesis Data Analytics to enrich the data based on a company-developed
anomaly detection SQL script. Push the enriched data to a fleet of Kinesis data streams and
enable the data transformation feature to flatten the JSON file.
Answer: B. Update the sensors' code to use the PutRecord/PutRecords call from the Kinesis Data
Streams API with the AWS SDK for Java. Use Kinesis Data Analytics to enrich the data based on
a company-developed anomaly detection SQL script. Direct the output of the Kinesis Data
Analytics application to a Kinesis Data Firehose delivery stream, enable the data transformation
feature to flatten the JSON file, and set the Kinesis Data Firehose destination to an Amazon
Elasticsearch Service cluster.
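The "flatten the JSON" step in answer B can be sketched as a recursive key-path flattener, the usual shape of a Firehose transformation before records are indexed for search. The sensor payload below is made up for illustration:

```python
import json

def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested JSON objects into dotted key paths.

    Mirrors the kind of reshaping a Firehose data-transformation step
    performs before delivery; the payload shape is illustrative only.
    """
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

reading = json.loads('{"sensor": {"id": "s1", "temp": {"c": 21.5}}, "ok": true}')
# flatten(reading) → {"sensor.id": "s1", "sensor.temp.c": 21.5, "ok": True}
```

Flattening matters here because nested objects map poorly onto the flat field structure that downstream search and analytics destinations expect.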
A global company has different sub-organizations, and each sub-organization sells its products
and services in various countries. The company's senior leadership wants to quickly identify
which sub-organization is the strongest performer in each country. All sales data is stored in
Amazon S3 in Parquet format. Which approach can provide the visuals that senior leadership
requested with the least amount of effort?
A. Use Amazon QuickSight with Amazon Athena as the data source. Use heat maps as the visual
type.
B. Use Amazon QuickSight with Amazon S3 as the data source. Use heat maps as the visual
type.
C. Use Amazon QuickSight with Amazon Athena as the data source. Use pivot tables as the
visual type.
D. Use Amazon QuickSight with Amazon S3 as the data source. Use pivot tables as the visual
type.
Answer: A. Use Amazon QuickSight with Amazon Athena as the data source. Use heat maps as
the visual type.
A company has 1 million scanned documents stored as image files in Amazon S3. The
documents contain typewritten application forms with information including the applicant first
name, applicant last name, application date, application type, and application text. The company