100% de satisfacción garantizada Inmediatamente disponible después del pago Tanto en línea como en PDF No estas atado a nada 4,6 TrustPilot
logo-home
Resumen

Summary Data Warehousing: Part 1

Puntuación
-
Vendido
-
Páginas
7
Subido en
19-02-2019
Escrito en
2018/2019

Business Information Management - Data Management. First part of Module 2 Data Warehousing. This document contains all the slides, notes from 2 students and extra information collected on Internet. I had a 15 out 20 for this subject.

Mostrar más Leer menos
Institución
Grado










Ups! No podemos cargar tu documento ahora. Inténtalo de nuevo o contacta con soporte.

Escuela, estudio y materia

Institución
Estudio
Grado

Información del documento

Subido en
19 de febrero de 2019
Número de páginas
7
Escrito en
2018/2019
Tipo
Resumen

Temas

Vista previa del contenido

Data Warehousing 1
Module 2: Data Warehousing

1 Definition
A data warehouse (DW) is simply a single, complete, and consistent store of data obtained
from a variety of sources and made available to end users in a way they can understand and
use it in a business context

Notes:
We have seen operational databasesNow we will see data warehousing
Operational databases - the structure was built to avoid redundancy in order to easily
manage day to day transactions that you support all activities that need to be done in real
time. These can be complex.

Data warehousing has a diferent perspective- goal is to have a structure that is as simple as
possible. Goal is to make queries in order to retrieve info out of all of the data that we have.
We are building the data warehouse for bi. We are not interested in supporting day to day
activities, we want data about production, sales, etc. ,n order to retrieve the data, we have a
special kind of structure that is called the star diagram or the snowfake diagram. ,t has been
known for years. ,t is implemented into a db using a relational diagram. The structure is
much simpler though. A lot of profs like to use the star schema which is the simplest
structure. You have at most 2 links between two tables.
The idea is to centralize all of the data we need for decision making. THis is why a dw is built
out of diferent data sources. Our main concern is to be sure that we are having everything
we need to make decisions just form the dw. ,t is a single point that must be complete and
consistent. The data should be consistent, up to date,e tc.
You define the subject that are important for your analysis for ex. Sales or products
You need to select the right transaction
,t’s not real-time data, but make analysis about events that happenfor ex. ,f you have
data from last 6 months and adding the info from yesterday not will change everything.
You update DW specific time, once a week, once a month, etc.
The star schema gets its name from the physical model's[3] resemblance to a star shape
with a fact table at its center and the dimension tables surrounding it representing the star's
points.

,n computing, a snowflae scheal is a logical arrangement of tables in a multidimensional
database such that the entity relationship diagram resembles a snowfake shape. The
snowfake schema is represented by centralized fact tables which are connected to multiple
dimensions.[citation needed]. "Snowfaking" is a method of normalizing the dimension
tables in a star schema. When it is completely normalized along all the dimension tables, the
resultant structure resembles a snowfake with the fact table in the middle. The principle
behind snowfaking is normalization of the dimension tables by removing low cardinality
attributes and forming separate tables.

The snowfake schema is similar to the star schema. However, in the snowfake schema,
dimensions are normalized into multiple related tables, whereas the star schema's

1

,dimensions are denormalized with each dimension represented by a single table. A complex
snowfake shape emerges when the dimensions of a snowfake schema are elaborate,
having multiple levels of relationships, and the child tables have multiple parent tables
("forks in the road").

,n computing, the stlr scheal is the simplest style of data mart schema and is the approach
most widely used to develop data warehouses and dimensional data marts.[1] The star
schema consists of one or more fact tables referencing any number of dimension tables. The
star schema is an important special case of the snowfake schema, and is more efective for
handling simpler queries.

Whlt is l Dltl Wlrehouse? A Prlcttoners Viewpoint
A DW is a
- Subject -oriented
- ,ntegrated
- Time-varying the db in the state that it is right now, it is not exactly the same state
that it will be in 5 minutes, because of the operational usage, in the dw, it is diferent,
you build it into defined moments in time. ,t isn’t real-time data. You make analysis
about business events, sales, etc., but moment to moment data will not change your
analysis.
- Nonvolatile
Collection of data that is used primarily in organizational decision making.

Subject-oriented:
- A data warehouse is organized around subjects such as sales, products, customers
- Focuses on modeling and analysis of data for decision makers
- Excludes data not useful in decision support process.
,t centralized all data for decision making and built out diferent sources
This tool is only relevant for (middle) managers to make decisions




ETL tool
Integrlton




2

, - Data Warehouse is constructed by integrating multiple heterogeneous sources.
- Data Preprocessing is applied to ensure consistency.

Notes:
- The more we integrate the data management, easier to build DWgood ERP system
- Have ETL tools which take the data and converts it to the right place in the dw.

Integrlted


Everything is
put together
on the same
place
The
DW




integrates all the data necessary for analysis


Tiae-vlrying:
- Provides information from historical perspective e.g. past 5-10 years
- Data is collected at diferent times (e.g. every month to update the data warehouse)
- Every key structure contains either implicitly or explicitly an element of time

Note:
Within the dw, every transaction has a reference to a date of time within it.
,f you are collecting data about sales, it is important to know when the sale has been made
because this gives you relevant info about your sales.

Nonvolltle
- Once data is recorded it will not be updated anymore
- Historicll dltl will not be deleted
- Only new data is added
- A data warehouse requires two
operations in data accessing
o ,nitial loading of data (ETL)
o Access of data

Note:



3
$4.92
Accede al documento completo:

100% de satisfacción garantizada
Inmediatamente disponible después del pago
Tanto en línea como en PDF
No estas atado a nada

Conoce al vendedor

Seller avatar
Los indicadores de reputación están sujetos a la cantidad de artículos vendidos por una tarifa y las reseñas que ha recibido por esos documentos. Hay tres niveles: Bronce, Plata y Oro. Cuanto mayor reputación, más podrás confiar en la calidad del trabajo del vendedor.
shafaqsara Katholieke Universiteit Leuven
Seguir Necesitas iniciar sesión para seguir a otros usuarios o asignaturas
Vendido
14
Miembro desde
6 año
Número de seguidores
7
Documentos
12
Última venta
3 meses hace

4.0

2 reseñas

5
0
4
2
3
0
2
0
1
0

Recientemente visto por ti

Por qué los estudiantes eligen Stuvia

Creado por compañeros estudiantes, verificado por reseñas

Calidad en la que puedes confiar: escrito por estudiantes que aprobaron y evaluado por otros que han usado estos resúmenes.

¿No estás satisfecho? Elige otro documento

¡No te preocupes! Puedes elegir directamente otro documento que se ajuste mejor a lo que buscas.

Paga como quieras, empieza a estudiar al instante

Sin suscripción, sin compromisos. Paga como estés acostumbrado con tarjeta de crédito y descarga tu documento PDF inmediatamente.

Student with book image

“Comprado, descargado y aprobado. Así de fácil puede ser.”

Alisha Student

Preguntas frecuentes