Exam (elaborations)

Databricks Lakehouse Platform Questions and Answers with complete solution

Pages
10
Grade
A+
Uploaded on
15-12-2023
Written in
2023/2024

What is a data lakehouse? - A data lakehouse combines the ACID transactions and data governance of a data warehouse with the flexibility and cost-efficiency of a data lake.

How does the Databricks data lakehouse differ from a traditional data warehouse? - It enables both batch and streaming analytics.

What storage system is Databricks built on top of? - DBFS (Databricks File System). DBFS is an abstraction layer over cloud object storage that is available in the workspace and on its clusters. The actual data is stored in cloud object storage.

What are the 2 components of a Delta table? - The data files and the transaction log.

What are the benefits of Delta tables?
- ACID transactions on object storage
- An audit trail of all changes
- Scalable metadata handling
- Built on standard formats (Parquet and JSON)

What 2 methods can you use for time travel?
- Specify the version number (VERSION AS OF)
- Specify the timestamp (TIMESTAMP AS OF)

What does the OPTIMIZE command do? - It compacts small files into larger files to improve query speed and table performance.

What is Z-ORDER indexing? - You specify one or more columns to order by, and OPTIMIZE compacts the data into files in which rows with similar values of those columns are stored together. When a query filters on a Z-ordered column, Databricks can tell from file statistics which files can contain matching records (for example, only file 1) and reads only those files, which saves a lot of time.

What does the VACUUM command do? - VACUUM deletes data files that the table no longer references and that are older than a retention period you specify. Note: after a VACUUM you can only time travel back to the beginning of the retention period, because older files no longer exist.

What is the Hive metastore? - The Hive metastore is a metadata repository that stores the metadata (such as schemas and storage locations) of all tables and databases registered in it.

What's the difference between managed and external tables? - For managed tables, Databricks manages both the metadata and the data, and the data is stored in the database's managed storage location. For external tables, Databricks manages only the metadata, and the data lives in an external location that you specify. If an external table is dropped, the underlying data is not deleted; dropping a managed table deletes its data as well.

What's the difference between the CREATE TABLE command and the CTAS command? - CTAS (CREATE TABLE AS SELECT) creates a table and populates it with the result of a SELECT query. CTAS infers the schema from the query result, whereas a plain CREATE TABLE requires you to declare the schema manually.

What's the difference between deep and shallow cloning? - A deep clone copies both the metadata and the data files to the target. A shallow clone copies only the metadata and keeps referencing the source table's data files.

What is a view? - A view is a saved query that presents data virtually without storing any data itself.

What are the 3 types of views?
- Standard (stored) view - Persisted in a database in the metastore and accessed through that database
- Global temporary view - Registered in the global_temp database and accessible from any Spark session on the same cluster
- Temporary view - Exists only for the current Spark session

Which of the following developer operations in a CI/CD flow can be implemented in Databricks Repos?
- Merge when code is committed
- Pull request and review process
- Trigger Databricks Repos API to pull the latest version of code into production folder
- Resolve merge conflicts
- Delete a branch
Correct answer: Trigger Databricks Repos API to pull the latest version of code into production folder

If you run the command VACUUM transactions RETAIN 0 HOURS, what is the outcome of this command?
- Command will be successful, but no data is removed
- Command will fail if you have an active transaction running
- Command will fail, you cannot run the command with retentionDurationCheck enabled
- Command will be successful, but historical data will be removed
- Command runs successfully and compacts all of the data in the table
Correct answer: Command will fail, you cannot run the command with retentionDurationCheck enabled
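Several of the commands named in the answers above can be written out as Databricks SQL. The following is a minimal sketch, not an official reference: the table, view, and column names (`events`, `events_summary`, `user_id`, and so on) are hypothetical, and the statements are held as plain strings so the file loads without a cluster. The helper `run` assumes it is handed a SparkSession-like object with a `.sql()` method on a Delta-enabled runtime.

```python
# Illustrative Databricks SQL for the commands discussed above.
# All table, view, and column names are hypothetical placeholders.
STATEMENTS = {
    # CTAS: schema is inferred from the SELECT result.
    "ctas": (
        "CREATE TABLE events_summary AS "
        "SELECT user_id, COUNT(*) AS n FROM events GROUP BY user_id"
    ),
    # Time travel by version number and by timestamp.
    "time_travel_version": "SELECT * FROM events VERSION AS OF 0",
    "time_travel_timestamp": "SELECT * FROM events TIMESTAMP AS OF '2023-12-01'",
    # Compact small files and colocate rows by user_id.
    "optimize_zorder": "OPTIMIZE events ZORDER BY (user_id)",
    # Delete unreferenced files older than 168 hours (7 days).
    "vacuum": "VACUUM events RETAIN 168 HOURS",
    # Deep clone copies data files; shallow clone only references them.
    "deep_clone": "CREATE TABLE events_backup DEEP CLONE events",
    "shallow_clone": "CREATE TABLE events_dev SHALLOW CLONE events",
    # Session-scoped view and cluster-scoped view.
    "temp_view": (
        "CREATE OR REPLACE TEMP VIEW known_users AS "
        "SELECT * FROM events WHERE user_id IS NOT NULL"
    ),
    "global_temp_view": (
        "CREATE OR REPLACE GLOBAL TEMP VIEW all_events AS SELECT * FROM events"
    ),
}


def run(spark, name):
    """Execute one named statement.

    `spark` is assumed to be a SparkSession (or anything with a .sql method).
    """
    return spark.sql(STATEMENTS[name])
```

In a notebook this might be called as `run(spark, "optimize_zorder")`. The time travel statements only succeed if the requested version or timestamp exists in the table history, and a global temporary view is queried through the `global_temp` database (for example `SELECT * FROM global_temp.all_events`).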

Institution
Databricks Lakehouse Platform
Course
Databricks Lakehouse Platform










Contains
Questions & answers

Brainarium Delaware State University