DP-900 QUESTIONS with Correct Verified
Solutions
What three main types of workload can be found in a typical modern data warehouse?
- Streaming Data
- Batch Data
- Relational Data
A ____________________ is a continuous flow of information, where continuous does not
necessarily mean regular or constant.
data stream
__________________________ focuses on moving and transforming data at rest.
Batch processing
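As a minimal illustration of the two terms above, the sketch below contrasts processing data at rest with processing values as they arrive. The function names and the sample readings are hypothetical and chosen only for this example; no Azure service is involved.

```python
from typing import Iterable, Iterator


def batch_total(readings: list[float]) -> float:
    """Batch: the full data set is at rest, so it is processed in one pass."""
    return sum(readings)


def streaming_totals(readings: Iterable[float]) -> Iterator[float]:
    """Streaming: emit an updated running total as each reading arrives."""
    total = 0.0
    for value in readings:
        total += value
        yield total


# Hypothetical sample data used for both styles.
stored = [3.0, 1.5, 4.0]

batch_result = batch_total(stored)                      # one answer at the end
stream_results = list(streaming_totals(iter(stored)))   # one answer per event
```

The batch function needs the whole list before it can answer, while the generator produces a result after every element, which is the property that makes near real-time responses possible.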
This data is usually well organized and easy to understand. Data stored in relational
databases is an example, where table rows and columns represent entities and their
attributes.
Structured Data
This data usually does not come from relational stores. It may have some sort of internal
organization, but a fixed schema is not mandatory. Good examples are XML and JSON files.
Semi-structured Data
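The difference between the two categories above can be sketched with a small example, assuming made-up records: the structured rows all share the same columns, while the JSON documents carry their own field names and may omit or nest fields.

```python
import json

# Structured: every row has the same columns (id, name, city).
rows = [(1, "Ada", "London"), (2, "Linus", "Helsinki")]

# Semi-structured: each record describes itself, and fields are optional.
raw_json = """
[
  {"id": 1, "name": "Ada", "phones": ["555-0100"]},
  {"id": 2, "name": "Linus"}
]
"""
docs = json.loads(raw_json)

# Code that reads semi-structured data has to tolerate missing fields.
phone_counts = [len(doc.get("phones", [])) for doc in docs]
```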
Data with no explicit data model falls in this category. Good examples include binary file
formats (such as PDF, Word, MP3, and MP4), emails, and tweets.
Unstructured Data
What type of analysis answers the question "What happened?"
Descriptive Analysis
What type of analysis answers the question "Why did it happen?"
Diagnostic Analysis
What type of analysis answers the question "What will happen?"
Predictive Analysis
What type of analysis answers the question "How can we make it happen?"
Prescriptive Analysis
The two main kinds of workloads are ______________ and _________________.
extract-transform-load (ETL)
extract-load-transform (ELT)
______ is a traditional approach and has established best practices. It is more commonly
found in on-premises environments since it was around before cloud platforms. It is a
process that involves a lot of data movement, which is something you want to avoid on the
cloud if possible due to its resource-intensive nature.
ETL
________ seems similar to ETL at first glance but is better suited to big data scenarios since
it leverages the scalability and flexibility of MPP engines like Azure Synapse Analytics,
Azure Databricks, or Azure HDInsight.
ELT
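The ordering difference between ETL and ELT can be sketched as follows. This is only an illustration: an in-memory SQLite database stands in for the target engine (it is not an MPP engine), and the table names and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical source rows: names in lower case, amounts stored as text.
raw = [("alice", "10"), ("bob", "32")]


def etl(conn: sqlite3.Connection, rows: list[tuple[str, str]]) -> None:
    # Transform in the pipeline, before the data reaches the target.
    cleaned = [(name.title(), int(amount)) for name, amount in rows]
    conn.execute("CREATE TABLE sales_etl (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)


def elt(conn: sqlite3.Connection, rows: list[tuple[str, str]]) -> None:
    # Load the raw data first, then transform inside the target with SQL.
    conn.execute("CREATE TABLE staging (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT upper(substr(name, 1, 1)) || substr(name, 2) AS name, "
        "CAST(amount AS INTEGER) AS amount FROM staging"
    )


conn = sqlite3.connect(":memory:")
etl(conn, raw)
elt(conn, raw)

query = "SELECT name, amount FROM {} ORDER BY name"
etl_rows = conn.execute(query.format("sales_etl")).fetchall()
elt_rows = conn.execute(query.format("sales_elt")).fetchall()
```

Both paths end with the same cleaned table. In the ELT path the transformation runs where the data already lives, which is why it benefits from a scalable target engine.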
_______________ is a cloud service that lets you implement, manage, and monitor a cluster
for Hadoop, Spark, HBase, Kafka, Storm, Hive LLAP, and ML Services in an easy and
effective way.
Azure HDInsight
_____________ is a cloud service from the creators of Apache Spark, combined with
close integration with the Azure platform.
Azure Databricks
____________ is the new name for Azure SQL Data Warehouse, but it extends it in many
ways. It aims to be the comprehensive analytics platform, from data ingestion to
presentation, bringing together one-click data exploration, robust pipelines, an
enterprise-grade database service, and report authoring.
Azure Synapse Analytics
A ___________ displays attribute members on rows and measures on columns. A simple
____________ is generally easy for users to understand, but it can quickly become difficult
to read as the number of rows and columns increases.
table
A _____________ is a more sophisticated table. It allows for attributes also on columns and
can auto-calculate subtotals.
matrix
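A rough sketch of the table and matrix layouts described above, using plain Python dictionaries rather than a reporting tool. The sales facts and the "Total" key are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, year, amount).
facts = [
    ("East", 2023, 10),
    ("East", 2024, 15),
    ("West", 2023, 7),
    ("West", 2024, 9),
]

# Table: attribute members (region) on rows, one measure (amount) as a column.
table = defaultdict(int)
for region, _year, amount in facts:
    table[region] += amount

# Matrix: a second attribute (year) goes on columns, with a subtotal per row.
matrix = defaultdict(lambda: defaultdict(int))
for region, year, amount in facts:
    matrix[region][year] += amount
    matrix[region]["Total"] += amount
```

The matrix holds the same totals as the table in its subtotal column, and adds the breakdown by year across the columns.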
Objects in which things about data should be captured and stored are called:
____________.
A. tables
B. entities
C. rows
D. columns
B. entities
You need to process data that is generated continuously and near real-time responses are
required. You should use _________.
A. batch processing
B. scheduled data processing
C. buffering and processing
D. streaming data processing
D. streaming data processing
A. Extract, Transform, Load (ETL)
B. Extract, Load, Transform (ELT)
1. Optimize data privacy.
2. Provide support for Azure Data Lake.