Databricks Lakehouse Questions With 100% Correct Answers.
What is the focal point of Databricks - the Lakehouse will not work without this? Why?
- Delta. Delta Lake is the core component that makes the Lakehouse work.
- It enables the low-latency queries required in the BI space.

What is Delta?
- Data warehouse capabilities (ACID transactions, time travel, schema enforcement) on top of cheap, scalable, open cloud storage.

What problem does Delta solve?
- Data lake reliability:
  - Corrupted tables from failed jobs
  - Unable to fix/modify tables (append only)
  - Data inconsistency (no guarantees in query results)
- Data lake performance:
  - Slow queries due to small files and small folders

Why do customers care about Delta?
1. Reliability and performance - Always guarantees a consistent view of the data, along with responsive queries at scale.
2. Open format - No vendor lock-in.
3. Cost savings - Same abilities as a data warehouse, only cheaper and open.

Common Delta misconceptions?
- OSS vs. Databricks Delta: "Delta Lake's enhanced capabilities are only within Databricks and not in open source Delta." - NOT TRUE anymore; we open sourced all of Delta.
- "Contributors to the Delta OSS project are only from Databricks." - This is quickly changing as more and more contributors are added from outside of Databricks.

Selling Delta: What should I be on the lookout for?
1. Data warehouse users - Data warehouses have similar capabilities to Delta Lake, so talk about the openness of Delta and cost savings. Additionally, Delta is a lot more flexible than a warehouse (unstructured data, support for different data models, etc.).
2. Large datasets - The larger the data volume, the more Delta Lake shines.

What is the cost advantage of Delta?
- No extra costs outside of compute/usage ($DBU)
- Customers pay for storage through the cloud vendor

What is Databricks SQL?
- Provides the capabilities customers have historically consumed from data warehouses, but built on top of the unified Lakehouse platform.

What is "Serverless"?
- Serverless SQL provides near-instant compute for SQL queries.
- The compute runs within the Databricks cloud environment, whereas it normally runs within the customer's cloud environment.

What problem does DB SQL solve?
- Opens up the Lakehouse to millions of technologists who know SQL but not Python or Scala and who work in SQL editors instead of notebooks.

What problem does Serverless solve?
- Improves the customer experience when using DB SQL by reducing start-up time from minutes down to seconds, and lowers overall cost by avoiding long-running idle SQL endpoints.
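
How can the Delta capabilities named above (ACID transactions, time travel, schema enforcement) be pictured in code?
- The sketch below is a toy, in-memory model and not the Delta Lake API. The class `ToyDeltaTable` and its methods `commit` and `read` are hypothetical names invented for illustration. It assumes a table is a list of row dictionaries and that every successful commit publishes a new numbered version. Real Delta Lake gets these guarantees from a transaction log stored alongside Parquet files in cloud storage; this sketch only mimics the observable behavior.

```python
class ToyDeltaTable:
    """Toy in-memory model of a versioned table. NOT the Delta Lake API."""

    def __init__(self, schema):
        # schema maps column name -> expected Python type (an assumption of this sketch)
        self.schema = dict(schema)
        # versions[i] holds the full snapshot of the table at version i;
        # version 0 is the empty table
        self.versions = [[]]

    def _validate(self, row):
        # Schema enforcement: reject rows with unexpected columns or wrong types
        if set(row) != set(self.schema):
            raise ValueError(f"columns {sorted(row)} do not match schema {sorted(self.schema)}")
        for name, expected in self.schema.items():
            if not isinstance(row[name], expected):
                raise ValueError(f"column {name!r} expects {expected.__name__}")

    def commit(self, rows):
        # Atomicity: validate the whole batch first, then publish one new version.
        # If any row fails, no new version is created, so readers never see a
        # half-written batch.
        rows = [dict(r) for r in rows]
        for row in rows:
            self._validate(row)
        self.versions.append(self.versions[-1] + rows)
        return len(self.versions) - 1

    def read(self, version=None):
        # Time travel: read any committed version; the default is the latest
        if version is None:
            version = len(self.versions) - 1
        return [dict(r) for r in self.versions[version]]


table = ToyDeltaTable({"id": int, "city": str})
v1 = table.commit([{"id": 1, "city": "Oslo"}])

# A batch with one bad row ("3" is a string, not an int) is rejected as a whole
try:
    table.commit([{"id": 2, "city": "Lima"}, {"id": "3", "city": "Rome"}])
    failed = False
except ValueError:
    failed = True

v2 = table.commit([{"id": 2, "city": "Lima"}])
latest = table.read()               # both valid rows
as_of_v1 = table.read(version=1)    # the table as it was after the first commit
```

- The failed batch leaves no trace: the next successful commit becomes version 2, and reading version 1 still returns only the first row. This mirrors the "corrupted tables from failed jobs" problem that Delta is described as solving.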
Document information
- Uploaded on: December 15, 2023
- Number of pages: 5
- Written in: 2023/2024
- Type: Exam
- Contains: Questions and answers