Professional Exam and Answers |Latest Update|
An upstream system has been configured to pass the date for a given batch of data to the
Databricks Jobs API as a parameter. The notebook to be scheduled will use this parameter to
load data with the following code:
df = spark.read.format("parquet").load(f"/mnt/source/{date}")
Which code block should be used to create the date Python variable used in the above code
block?
A. date = spark.conf.get("date")
B. input_dict = input()
   date = input_dict["date"]
C. import sys
   date = sys.argv[1]
D. date = dbutils.notebooks.getParam("date")
E. dbutils.widgets.text("date", "null")
   date = dbutils.widgets.get("date") - ✔✔E.
dbutils.widgets.text("date", "null")
date = dbutils.widgets.get("date")
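A rough illustration of why E works: a widget declared with dbutils.widgets.text takes the value passed to the job under the same name and falls back to its default otherwise. The sketch below is written to run outside Databricks, so it uses a hypothetical stand-in (FakeDbutils) that is not part of any Databricks library. The helper names read_date_param and source_path are also invented for illustration. In a real notebook the built-in dbutils object would be passed instead.

```python
class _FakeWidgets:
    """Hypothetical stand-in for dbutils.widgets, for local runs only."""

    def __init__(self, job_params=None):
        self._job_params = dict(job_params or {})
        self._values = {}

    def text(self, name, default_value, label=None):
        # A parameter passed by the job overrides the widget default.
        self._values[name] = self._job_params.get(name, default_value)

    def get(self, name):
        return self._values[name]


class FakeDbutils:
    """Hypothetical stand-in for the notebook-provided dbutils object."""

    def __init__(self, job_params=None):
        self.widgets = _FakeWidgets(job_params)


def read_date_param(dbutils_obj):
    # Same two calls as answer E: declare the widget, then read it.
    dbutils_obj.widgets.text("date", "null")
    return dbutils_obj.widgets.get("date")


def source_path(date):
    # Builds the path used in the question's load() call.
    return f"/mnt/source/{date}"
```

For example, source_path(read_date_param(FakeDbutils({"date": "2024-01-31"}))) gives the path for that batch, while a run with no parameter falls back to the "null" default.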
The Databricks workspace administrator has configured interactive clusters for each of the data
engineering groups. To control costs, clusters are set to terminate after 30 minutes of inactivity.
Each user should be able to execute workloads against their assigned clusters at any time of the
day.
Assuming users have been added to a workspace but not granted any permissions, which of the
following describes the minimal permissions a user would need to start and attach to an
already configured cluster?
A. "Can Manage" privileges on the required cluster
B. Workspace Admin privileges, cluster creation allowed, "Can Attach To" privileges on the
required cluster
C. Cluster creation allowed, "Can Attach To" privileges on the required cluster
D. "Can Restart" privileges on the required cluster
E. Cluster creation allowed, "Can Restart" privileges on the required cluster - ✔✔D. "Can
Restart" privileges on the required cluster
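A minimal sketch of how such a grant might be expressed, assuming the workspace's Permissions API for clusters is used (a PATCH to /api/2.0/permissions/clusters/{cluster_id}). The helper name can_restart_acl and the example user are hypothetical. The block only builds the request body and sends nothing.

```python
import json


def can_restart_acl(user_name):
    # CAN_RESTART covers attaching to the cluster as well as starting
    # or restarting it, so no broader permission is required.
    return {
        "access_control_list": [
            {"user_name": user_name, "permission_level": "CAN_RESTART"}
        ]
    }


# Hypothetical user; replace with a real workspace user name.
payload = can_restart_acl("engineer@example.com")
print(json.dumps(payload, indent=2))
```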
When scheduling Structured Streaming jobs for production, which configuration automatically
recovers from query failures and keeps costs low? - ✔✔D.
Cluster: New Job Cluster;
Retries: Unlimited;
Maximum Concurrent Runs: 1
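The reasoning: a job cluster is billed at the lower jobs rate and terminates with the run, unlimited retries restart the query after a failure, and a single concurrent run prevents a second copy of the stream from starting. A sketch of matching job settings, assuming the Jobs API 2.1 field names; the job name, task key, worker count, and the placeholder cluster values are hypothetical.

```python
def streaming_job_settings(notebook_path):
    return {
        "name": "structured-streaming-job",  # hypothetical job name
        "max_concurrent_runs": 1,  # never run two copies of the stream
        "tasks": [
            {
                "task_key": "stream",  # hypothetical task key
                "notebook_task": {"notebook_path": notebook_path},
                # New job cluster: created for the run, terminated after it.
                "new_cluster": {
                    "spark_version": "<runtime-version>",  # placeholder
                    "node_type_id": "<node-type>",  # placeholder
                    "num_workers": 2,  # hypothetical size
                },
                "max_retries": -1,  # -1 means retry indefinitely
            }
        ],
    }
```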
The data engineering team has configured a Databricks SQL query and alert to monitor the
values in a Delta Lake table. The recent_sensor_recordings table contains an identifying
sensor_id alongside the timestamp and temperature for the most recent 5 minutes of
recordings. The query below is used to create the alert:
SELECT MEAN(temperature), MAX(temperature), MIN(temperature)
FROM recent_sensor_recordings
GROUP BY sensor_id
The query is set to refresh each minute and always completes in less than 10 seconds. The alert
is set to trigger when mean(temperature) > 120. Notifications are set to be sent at most
once every minute.