Garantie de satisfaction à 100% Disponible immédiatement après paiement En ligne et en PDF Tu n'es attaché à rien 4,6 TrustPilot
logo-home
Resume

Summary Data Warehousing: Part 1

Note
-
Vendu
-
Pages
7
Publié le
19-02-2019
Écrit en
2018/2019

Business Information Management - Data Management. First part of Module 2 Data Warehousing. This document contains all the slides, notes from 2 students and extra information collected on Internet. I had a 15 out 20 for this subject.

Montrer plus Lire moins










Oups ! Impossible de charger votre document. Réessayez ou contactez le support.

Infos sur le Document

Publié le
19 février 2019
Nombre de pages
7
Écrit en
2018/2019
Type
Resume

Sujets

Aperçu du contenu

Data Warehousing 1
Module 2: Data Warehousing

1 Definition
A data warehouse (DW) is simply a single, complete, and consistent store of data obtained
from a variety of sources and made available to end users in a way they can understand and
use it in a business context

Notes:
We have seen operational databasesNow we will see data warehousing
Operational databases - the structure was built to avoid redundancy in order to easily
manage day to day transactions that you support all activities that need to be done in real
time. These can be complex.

Data warehousing has a diferent perspective- goal is to have a structure that is as simple as
possible. Goal is to make queries in order to retrieve info out of all of the data that we have.
We are building the data warehouse for bi. We are not interested in supporting day to day
activities, we want data about production, sales, etc. ,n order to retrieve the data, we have a
special kind of structure that is called the star diagram or the snowfake diagram. ,t has been
known for years. ,t is implemented into a db using a relational diagram. The structure is
much simpler though. A lot of profs like to use the star schema which is the simplest
structure. You have at most 2 links between two tables.
The idea is to centralize all of the data we need for decision making. THis is why a dw is built
out of diferent data sources. Our main concern is to be sure that we are having everything
we need to make decisions just form the dw. ,t is a single point that must be complete and
consistent. The data should be consistent, up to date,e tc.
You define the subject that are important for your analysis for ex. Sales or products
You need to select the right transaction
,t’s not real-time data, but make analysis about events that happenfor ex. ,f you have
data from last 6 months and adding the info from yesterday not will change everything.
You update DW specific time, once a week, once a month, etc.
The star schema gets its name from the physical model's[3] resemblance to a star shape
with a fact table at its center and the dimension tables surrounding it representing the star's
points.

,n computing, a snowflae scheal is a logical arrangement of tables in a multidimensional
database such that the entity relationship diagram resembles a snowfake shape. The
snowfake schema is represented by centralized fact tables which are connected to multiple
dimensions.[citation needed]. "Snowfaking" is a method of normalizing the dimension
tables in a star schema. When it is completely normalized along all the dimension tables, the
resultant structure resembles a snowfake with the fact table in the middle. The principle
behind snowfaking is normalization of the dimension tables by removing low cardinality
attributes and forming separate tables.

The snowfake schema is similar to the star schema. However, in the snowfake schema,
dimensions are normalized into multiple related tables, whereas the star schema's

1

,dimensions are denormalized with each dimension represented by a single table. A complex
snowfake shape emerges when the dimensions of a snowfake schema are elaborate,
having multiple levels of relationships, and the child tables have multiple parent tables
("forks in the road").

,n computing, the stlr scheal is the simplest style of data mart schema and is the approach
most widely used to develop data warehouses and dimensional data marts.[1] The star
schema consists of one or more fact tables referencing any number of dimension tables. The
star schema is an important special case of the snowfake schema, and is more efective for
handling simpler queries.

Whlt is l Dltl Wlrehouse? A Prlcttoners Viewpoint
A DW is a
- Subject -oriented
- ,ntegrated
- Time-varying the db in the state that it is right now, it is not exactly the same state
that it will be in 5 minutes, because of the operational usage, in the dw, it is diferent,
you build it into defined moments in time. ,t isn’t real-time data. You make analysis
about business events, sales, etc., but moment to moment data will not change your
analysis.
- Nonvolatile
Collection of data that is used primarily in organizational decision making.

Subject-oriented:
- A data warehouse is organized around subjects such as sales, products, customers
- Focuses on modeling and analysis of data for decision makers
- Excludes data not useful in decision support process.
,t centralized all data for decision making and built out diferent sources
This tool is only relevant for (middle) managers to make decisions




ETL tool
Integrlton




2

, - Data Warehouse is constructed by integrating multiple heterogeneous sources.
- Data Preprocessing is applied to ensure consistency.

Notes:
- The more we integrate the data management, easier to build DWgood ERP system
- Have ETL tools which take the data and converts it to the right place in the dw.

Integrlted


Everything is
put together
on the same
place
The
DW




integrates all the data necessary for analysis


Tiae-vlrying:
- Provides information from historical perspective e.g. past 5-10 years
- Data is collected at diferent times (e.g. every month to update the data warehouse)
- Every key structure contains either implicitly or explicitly an element of time

Note:
Within the dw, every transaction has a reference to a date of time within it.
,f you are collecting data about sales, it is important to know when the sale has been made
because this gives you relevant info about your sales.

Nonvolltle
- Once data is recorded it will not be updated anymore
- Historicll dltl will not be deleted
- Only new data is added
- A data warehouse requires two
operations in data accessing
o ,nitial loading of data (ETL)
o Access of data

Note:



3
€3,99
Accéder à l'intégralité du document:

Garantie de satisfaction à 100%
Disponible immédiatement après paiement
En ligne et en PDF
Tu n'es attaché à rien

Faites connaissance avec le vendeur

Seller avatar
Les scores de réputation sont basés sur le nombre de documents qu'un vendeur a vendus contre paiement ainsi que sur les avis qu'il a reçu pour ces documents. Il y a trois niveaux: Bronze, Argent et Or. Plus la réputation est bonne, plus vous pouvez faire confiance sur la qualité du travail des vendeurs.
shafaqsara Katholieke Universiteit Leuven
Voir profil
S'abonner Vous devez être connecté afin de suivre les étudiants ou les cours
Vendu
14
Membre depuis
6 année
Nombre de followers
7
Documents
12
Dernière vente
3 mois de cela

4,0

2 revues

5
0
4
2
3
0
2
0
1
0

Récemment consulté par vous

Pourquoi les étudiants choisissent Stuvia

Créé par d'autres étudiants, vérifié par les avis

Une qualité sur laquelle compter : rédigé par des étudiants qui ont réussi et évalué par d'autres qui ont utilisé ce document.

Le document ne convient pas ? Choisis un autre document

Aucun souci ! Tu peux sélectionner directement un autre document qui correspond mieux à ce que tu cherches.

Paye comme tu veux, apprends aussitôt

Aucun abonnement, aucun engagement. Paye selon tes habitudes par carte de crédit et télécharge ton document PDF instantanément.

Student with book image

“Acheté, téléchargé et réussi. C'est aussi simple que ça.”

Alisha Student

Foire aux questions