
DATABRICKS EXAM 2024/2025 WITH 100% ACCURATE SOLUTIONS

Rating: -
Sold: -
Pages: 42
Grade: A+
Uploaded on: 03-09-2024
Written in: 2024/2025

Institution: DATABRICKS ENGINEER ASSOCIATE
Course: DATABRICKS ENGINEER ASSOCIATE

Content preview

DATABRICKS - DATA ENGINEER
ASSOCIATE EXAM 1 2024/2025

You were asked to create a table to store the data below. <orderTime> is a timestamp, but the finance
team usually prefers to query it in date format. You would like to create a calculated column that
converts the <orderTime> timestamp to a date and stores it. Fill in the blank to complete the DDL.



CREATE TABLE orders (
orderId int,
orderTime timestamp,
orderdate date _____________________________________________ ,
units int)



A. AS DEFAULT (CAST(orderTime as DATE))

B. GENERATED ALWAYS AS (CAST(orderTime as DATE))

C. GENERATED DEFAULT AS (CAST(orderTime as DATE))

D. AS (CAST(orderTime as DATE))

E. Delta Lake does not support calculated columns; values should be inserted into the table as part of
the ingestion process

Precise Answer ✔✔ B. GENERATED ALWAYS AS (CAST(orderTime as DATE))



Explanation

The answer is GENERATED ALWAYS AS (CAST(orderTime as DATE)).



https://docs.microsoft.com/en-us/azure/databricks/delta/delta-batch#--use-generated-columns



Delta Lake supports generated columns, a special type of column whose values are automatically
generated based on a user-specified function over other columns in the Delta table. When you write to a
table with generated columns and you do not explicitly provide values for them, Delta Lake
automatically computes the values.

Note: Databricks also supports partitioning by a generated column.
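
To make the completed DDL concrete, here is a minimal Python sketch. It assumes an active Databricks SparkSession is passed in as `spark`; the helper names `create_orders_table` and `expected_orderdate` are illustrative and do not come from the exam material. The second helper only mimics, in plain Python, the value the generated column would hold.

```python
from datetime import date, datetime

# Completed DDL from the question (option B). The column's declared type
# must match the type of the generation expression.
ORDERS_DDL = """
CREATE TABLE orders (
  orderId INT,
  orderTime TIMESTAMP,
  orderdate DATE GENERATED ALWAYS AS (CAST(orderTime AS DATE)),
  units INT
)
"""


def create_orders_table(spark) -> None:
    """Run the DDL; `spark` is assumed to be an active Databricks SparkSession."""
    spark.sql(ORDERS_DDL)


def expected_orderdate(order_time: datetime) -> date:
    """Plain-Python stand-in for CAST(orderTime AS DATE): drop the time part."""
    return order_time.date()


if __name__ == "__main__":
    print(expected_orderdate(datetime(2024, 9, 3, 14, 30)))  # 2024-09-03
```

An insert that supplies only orderId, orderTime and units would leave orderdate to be computed by Delta Lake.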



The data engineering team noticed that one of the jobs fails randomly as a result of using spot
instances. What feature in Jobs/Tasks can be used to make the job more stable when using spot
instances?



A. Use the Databricks REST API to monitor and restart the job

B. Use the Jobs runs, active runs UI section to monitor and restart the job

C. Add a second task and a check condition to rerun the first task if it fails

D. Restart the job cluster; the job automatically restarts

E. Add a retry policy to the task

Precise Answer ✔✔ E. Add a retry policy to the task



The answer is: Add a retry policy to the task.



Tasks in Jobs support a retry policy, which can be used to retry a failed task. This is especially useful
with spot instances, where it is common to lose executors or the driver.
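
As a rough illustration of what a retry policy looks like outside the UI, the sketch below builds a single task entry of the kind accepted in a Jobs API job definition. The field names `max_retries`, `min_retry_interval_millis` and `retry_on_timeout` are the task-level retry settings as I understand the Jobs API, so check them against the current reference before relying on them. The task key and notebook path are made-up values.

```python
import json


def task_with_retry(task_key: str, notebook_path: str, max_retries: int = 3,
                    min_retry_interval_ms: int = 60_000) -> dict:
    """Build one task entry for a job definition, including a retry policy."""
    return {
        "task_key": task_key,
        "notebook_task": {"notebook_path": notebook_path},
        # Retry a failed run up to `max_retries` times ...
        "max_retries": max_retries,
        # ... waiting at least this long between attempts.
        "min_retry_interval_millis": min_retry_interval_ms,
        # Do not treat a timeout as a retryable failure in this sketch.
        "retry_on_timeout": False,
    }


if __name__ == "__main__":
    # Hypothetical task key and notebook path, for illustration only.
    task = task_with_retry("ingest_orders", "/Repos/etl/ingest_orders")
    print(json.dumps(task, indent=2))
```

With a policy like this, a task that dies because a spot node was reclaimed is simply attempted again instead of failing the whole job run.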



What is the main difference between AUTO LOADER and COPY INTO?



A. COPY INTO supports schema evolution.

B. AUTO LOADER supports schema evolution.

C. COPY INTO supports file notification when performing incremental loads.

D. AUTO LOADER supports reading data from Apache Kafka

E. AUTO LOADER supports file notification when performing incremental loads.

Precise Answer ✔✔ E. AUTO LOADER supports file notification when performing incremental loads.



Explanation

Auto Loader supports both directory listing and file notification, but COPY INTO only supports directory
listing.

In file notification mode, Auto Loader automatically sets up a notification service and a queue service
that subscribe to file events from the input directory in cloud object storage such as Azure Blob Storage
or S3. File notification mode is more performant and scalable for large input directories or a high
volume of files.



Auto Loader and Cloud Storage Integration



Auto Loader supports two ways to ingest data incrementally:



Directory listing - Lists the input directory and maintains state in RocksDB; supports incremental file
listing.

File notification - Uses a trigger and a queue to store file notifications, which are later used to retrieve
the files. Unlike directory listing, file notification can scale to millions of files per day.
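
The choice between the two modes comes down to a single Auto Loader option. The sketch below is an illustration under assumptions: `spark` is an active Databricks SparkSession supplied by the caller, and the helper names are invented for this example. The option keys `cloudFiles.format` and `cloudFiles.useNotifications` are the documented Auto Loader options.

```python
def autoloader_mode_options(file_format: str, use_notifications: bool) -> dict:
    """Return cloudFiles options selecting directory listing or file notification."""
    return {
        "cloudFiles.format": file_format,
        # "false" (the default) means directory listing; "true" asks Auto Loader
        # to set up the notification and queue services for the input path.
        "cloudFiles.useNotifications": "true" if use_notifications else "false",
    }


def read_with_autoloader(spark, input_path: str, options: dict):
    """Create the streaming DataFrame; `spark` is an active Databricks SparkSession."""
    return spark.readStream.format("cloudFiles").options(**options).load(input_path)


if __name__ == "__main__":
    print(autoloader_mode_options("json", use_notifications=True))
```

A real stream would also need either an explicit schema or a schema location, and file notification mode needs cloud permissions that let Auto Loader create the event subscription.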




[OPTIONAL]

Auto Loader vs COPY INTO?



Auto Loader

Auto Loader incrementally and efficiently processes new data files as they arrive in cloud storage
without any additional setup. Auto Loader provides a new Structured Streaming source called cloudFiles.
Given an input directory path on the cloud file storage, the cloudFiles source automatically processes
new files as they arrive, with the option of also processing existing files in that directory.
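
A minimal end-to-end sketch of the cloudFiles source follows. It assumes `spark` is an active Databricks SparkSession; the function name, the JSON file format and the layout under `state_path` are illustrative choices, not part of the exam material. `cloudFiles.includeExistingFiles` is the option that controls whether files already in the directory are processed.

```python
def start_ingest(spark, input_path: str, target_table: str, state_path: str,
                 include_existing: bool = True):
    """Read new files with Auto Loader and append them to a table."""
    stream = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        # Where Auto Loader keeps the inferred schema between runs.
        .option("cloudFiles.schemaLocation", f"{state_path}/schema")
        # Whether files already present in the directory are processed too.
        .option("cloudFiles.includeExistingFiles", str(include_existing).lower())
        .load(input_path)
    )
    return (
        stream.writeStream
        # The checkpoint records which files have already been ingested.
        .option("checkpointLocation", f"{state_path}/checkpoint")
        # Process everything currently available, then stop.
        .trigger(availableNow=True)
        .toTable(target_table)
    )
```

Running the same function again later picks up only files that arrived since the previous run, because progress is tracked in the checkpoint.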

When to use Auto Loader instead of COPY INTO?



You want to load data from a file location that contains files in the order of millions or higher. Auto
Loader can discover files more efficiently than the COPY INTO SQL command and can split file processing
into multiple batches.

You do not plan to load subsets of previously uploaded files. With Auto Loader, it can be more difficult
to reprocess subsets of files. However, you can use the COPY INTO SQL



Why does AUTO LOADER require schema location?

A. Schema location is used to store user provided schema



B. Schema location is used to identify the schema of target table



C. AUTO LOADER does not require schema location, because it supports schema evolution



D. Schema location is used to store schema inferred by AUTO LOADER



E. Schema location is used to identify the schema of target table and source table

Precise Answer ✔✔ D. Schema location is used to store schema inferred by AUTO LOADER



Explanation

The answer is: Schema location is used to store the schema inferred by AUTO LOADER. On later runs
AUTO LOADER starts faster because it reuses the last known schema instead of inferring the schema
every single time.



Auto Loader samples the first 50 GB or 1000 files that it discovers, whichever limit is crossed first. To
avoid incurring this inference cost at every stream start up, and to be able to provide a stable schema
across stream restarts, you must set the option cloudFiles.schemaLocation. Auto Loader creates a
hidden directory _schemas at this location to track schema changes to the input data over time.
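
To get a feel for the two sampling limits, here is a small plain-Python model. It is only a simplified reading of "50 GB or 1000 files, whichever limit is crossed first" in which sampling stops once either limit has been reached. It is not Auto Loader's actual implementation, and the function name is invented for this example.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def files_sampled(file_sizes_bytes, max_bytes=50 * GB, max_files=1000):
    """Count the files read before either sampling limit has been reached."""
    total_bytes = 0
    count = 0
    for size in file_sizes_bytes:
        if count >= max_files or total_bytes >= max_bytes:
            break
        total_bytes += size
        count += 1
    return count


if __name__ == "__main__":
    # Many small files: the file-count limit is reached first.
    print(files_sampled([1 * MB] * 5000))  # 1000
    # Fewer large files: the byte limit is reached first.
    print(files_sampled([1 * GB] * 200))   # 50
```

Either way the sample can be large, which is why paying that inference cost once and reusing the stored schema matters at every restart.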



The documentation page below describes the available options in detail:



Auto Loader options | Databricks on AWS



Which of the following statements are incorrect about the lakehouse?



A. Supports end-to-end streaming and batch workloads



B. Supports ACID
