DATA MANAGEMENT
Database Management
Components
- Data:
o Structured (follows an underlying data model)
o Unstructured (videos, sound)
o Internal / External
- Information: organized data that has meaning and value
- Knowledge: processed data; information that is applicable to business decision problems
Taxonomy
- Descriptive analytics: use data to understand past and present (reporting, SQL, data warehouse)
- Predictive analytics: predict future behavior based on past performance (time series)
- Prescriptive analytics: make decisions to achieve best performance (operations research, AI)
Process & Product
- Process: BI tools + applications + techniques » BI solution
o Warehousing, SQL, reporting, digital dashboards, statistics, visualizations, etc.
- Product: BI process » information + knowledge (enables decision making)
o Understanding customer preferences, growth opportunities, internal efficiency, etc.
Relational databases
- A database is an environment with multiple tables that can be connected to each other
- Allows data to be grouped into tables and relationships to be set between tables
- The relational model enables you to view data logically rather than physically
- Keys: one or more attributes that determine other attributes: A » B, C, D (B, C, D are functionally dependent on A)
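A functional dependency A » B, C, D can be checked on sample data: it holds if no two rows share the same A value while differing in B, C or D. The sketch below is only an illustration; the helper name `holds` and the sample rows are hypothetical and not part of the notes.

```python
def holds(rows, determinant, dependents):
    """Return True if determinant -> dependents holds for every row.

    rows: list of dicts; determinant / dependents: lists of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependents)
        # setdefault stores the first value seen for a key and returns it;
        # a later row with the same key but a different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (assumed for illustration only).
employees = [
    {"EMP": 1, "NAME": "Ann", "JOB": "Clerk"},
    {"EMP": 2, "NAME": "Bob", "JOB": "Clerk"},
    {"EMP": 3, "NAME": "Ann", "JOB": "Analyst"},
]

emp_determines_rest = holds(employees, ["EMP"], ["NAME", "JOB"])
name_determines_job = holds(employees, ["NAME"], ["JOB"])
print(emp_determines_rest, name_determines_job)
```

Here EMP determines the other attributes, so it could serve as a key, while NAME does not determine JOB because two employees share a name.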
Types of keys
- Primary key: uniquely identifies each record in a table (can never be null, chosen from the candidate keys)
- Super key: one or more attributes that uniquely identify each row in a table
- Candidate key: super key without unnecessary attributes (a minimal super key)
- Foreign key: attributes whose values match the primary key values in the related table
o Employee: EMP (PK), JOB (FK)
o Job: JOB (PK)
o Benefit: EMP+PLAN (PK); EMP, PLAN (FK)
o Plan: PLAN (PK)
- Table relationships:
o One-to-one: separate tables (driver : car)
o One-to-many: common (course : classes)
o Many-to-many: not supported directly; resolved with a linking table (as Benefit links Employee and Plan)
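The Employee / Job / Benefit / Plan example above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not a reference schema: the column types and the sample rows are assumptions, and SQLite only enforces foreign keys after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Job (JOB TEXT PRIMARY KEY);
CREATE TABLE "Plan" ("PLAN" TEXT PRIMARY KEY);
CREATE TABLE Employee (
    EMP INTEGER PRIMARY KEY,
    JOB TEXT NOT NULL REFERENCES Job(JOB)             -- foreign key
);
CREATE TABLE Benefit (                                -- links Employee and Plan
    EMP INTEGER NOT NULL REFERENCES Employee(EMP),
    "PLAN" TEXT NOT NULL REFERENCES "Plan"("PLAN"),
    PRIMARY KEY (EMP, "PLAN")                         -- composite primary key
);
-- Sample rows (assumed for illustration only)
INSERT INTO Job VALUES ('Clerk');
INSERT INTO "Plan" VALUES ('Dental'), ('Life');
INSERT INTO Employee VALUES (1, 'Clerk');
INSERT INTO Benefit VALUES (1, 'Dental'), (1, 'Life');
""")


def rejected(sql):
    """Return True if the database refuses the statement on a key constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


fk_violation = rejected("INSERT INTO Employee VALUES (2, 'Pilot')")  # job does not exist
pk_violation = rejected("INSERT INTO Benefit VALUES (1, 'Dental')")  # duplicate EMP+PLAN
plans_for_emp1 = [r[0] for r in conn.execute(
    'SELECT "PLAN" FROM Benefit WHERE EMP = 1 ORDER BY "PLAN"')]
print(fk_violation, pk_violation, plans_for_emp1)
```

Benefit acts as the linking table: one employee can have many plans and one plan can cover many employees, which is how a many-to-many relationship is usually represented.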
Structured Query Language [SQL]
Basic form
SELECT *                  -- get data out of the table; select the fields/columns you want (* = all)
FROM Purchase             -- fill in the name of the table
WHERE Product = 'Bagel'   -- condition
To eliminate duplicates, use SELECT DISTINCT instead of SELECT
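To try the basic form, the query can be run against a small in-memory table. A minimal sketch using Python's sqlite3 module; the Purchase rows and column types are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Purchase (Product TEXT, Price REAL, Quantity INTEGER);
-- Sample rows (assumed for illustration only)
INSERT INTO Purchase VALUES
    ('Bagel', 2.0, 3),
    ('Bagel', 2.0, 1),
    ('Banana', 0.5, 4);
""")

# SELECT * returns every column of every row.
all_rows = conn.execute("SELECT * FROM Purchase").fetchall()

# WHERE keeps only the rows that satisfy the condition.
bagels = conn.execute(
    "SELECT * FROM Purchase WHERE Product = 'Bagel'").fetchall()

# DISTINCT removes duplicate values from the result.
products = conn.execute(
    "SELECT DISTINCT Product FROM Purchase ORDER BY Product").fetchall()

print(len(all_rows), len(bagels), products)
```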
Aggregate functions / Grouping
SELECT Sum(Price * Quantity) AS TotalSales
FROM Purchase
GROUP BY Product
Condition on grouping
SELECT Product, Sum(Price * Quantity) AS TotalSales
FROM Purchase
GROUP BY Product
HAVING Sum(Price * Quantity) > 20
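The difference between grouping and a condition on the groups is easiest to see on data. The sketch below runs both queries with Python's sqlite3 module; the Purchase rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Purchase (Product TEXT, Price REAL, Quantity INTEGER);
-- Sample rows (assumed for illustration only)
INSERT INTO Purchase VALUES
    ('Bagel', 3.0, 5),
    ('Bagel', 3.0, 4),
    ('Banana', 0.5, 10),
    ('Coffee', 2.5, 10);
""")

# GROUP BY collapses the rows of each product into a single total.
totals = conn.execute("""
    SELECT Product, SUM(Price * Quantity) AS TotalSales
    FROM Purchase
    GROUP BY Product
    ORDER BY Product
""").fetchall()

# HAVING filters the groups after aggregation (WHERE filters rows before it).
big_sellers = conn.execute("""
    SELECT Product, SUM(Price * Quantity) AS TotalSales
    FROM Purchase
    GROUP BY Product
    HAVING SUM(Price * Quantity) > 20
    ORDER BY Product
""").fetchall()

print(totals)
print(big_sellers)
```

With these rows, Banana totals 5 and is dropped by the HAVING clause, while Bagel (27) and Coffee (25) remain.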
Joining database tables
SELECT Name, LastName
FROM Player, Team
WHERE PlayDay = 'Monday' AND Player.TeamCode = Team.TeamCode
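The query above joins the two tables implicitly through the WHERE clause. The same result can be written with an explicit INNER JOIN ... ON. The sketch below compares both forms with Python's sqlite3 module; which columns belong to which table and the sample rows are assumptions, since the notes do not give the table definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed layout: PlayDay is stored per team, names per player.
CREATE TABLE Team (TeamCode TEXT PRIMARY KEY, PlayDay TEXT);
CREATE TABLE Player (Name TEXT, LastName TEXT,
                     TeamCode TEXT REFERENCES Team(TeamCode));
INSERT INTO Team VALUES ('A', 'Monday'), ('B', 'Friday');
INSERT INTO Player VALUES ('Ann', 'Lee', 'A'), ('Bob', 'Ng', 'B'), ('Cy', 'Ito', 'A');
""")

# Implicit join: both tables in FROM, join condition in WHERE.
implicit = conn.execute("""
    SELECT Name, LastName
    FROM Player, Team
    WHERE PlayDay = 'Monday' AND Player.TeamCode = Team.TeamCode
    ORDER BY Name
""").fetchall()

# Explicit join: join condition in ON, row filter in WHERE.
explicit = conn.execute("""
    SELECT Name, LastName
    FROM Player
    INNER JOIN Team ON Player.TeamCode = Team.TeamCode
    WHERE PlayDay = 'Monday'
    ORDER BY Name
""").fetchall()

print(implicit)
print(explicit)
```

Keeping the join condition in ON and the filter in WHERE makes it harder to forget the join condition, which would otherwise produce every combination of player and team.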
Types of joins
- Inner join » Retain only rows with a match in both sets
- Left join » Keep all rows of a; add matching rows from b
- Right join » Keep all rows of b; add matching rows from a
- Full outer join » Retain all values, all rows
- Minus » All rows in a that don't have a match in b
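The join types can be compared on two tiny tables a and b. This is a sketch with made-up data, using Python's sqlite3 module. Older SQLite versions lack RIGHT and FULL OUTER JOIN, so those two are emulated here with LEFT JOIN and UNION; in other database systems they can usually be written directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Sample rows (assumed for illustration only)
CREATE TABLE a (id INTEGER, x TEXT);
CREATE TABLE b (id INTEGER, y TEXT);
INSERT INTO a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
INSERT INTO b VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
""")


def q(sql):
    return conn.execute(sql).fetchall()


# Inner join: only ids present in both tables.
inner = q("SELECT a.id, x, y FROM a JOIN b ON a.id = b.id ORDER BY 1")

# Left join: every row of a; y is NULL (None) where b has no match.
left = q("SELECT a.id, x, y FROM a LEFT JOIN b ON a.id = b.id ORDER BY 1")

# Right join, emulated by swapping the tables in a left join.
right = q("SELECT b.id, x, y FROM b LEFT JOIN a ON a.id = b.id ORDER BY 1")

# Full outer join, emulated as the union of the two left joins.
full = q("""
    SELECT a.id, x, y FROM a LEFT JOIN b ON a.id = b.id
    UNION
    SELECT b.id, x, y FROM b LEFT JOIN a ON a.id = b.id
    ORDER BY 1
""")

# Minus (anti-join): rows of a without a match in b.
minus = q("""
    SELECT a.id, x FROM a LEFT JOIN b ON a.id = b.id
    WHERE b.id IS NULL ORDER BY 1
""")

for name, rows in [("inner", inner), ("left", left), ("right", right),
                   ("full", full), ("minus", minus)]:
    print(name, rows)
```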
Data Warehouse
- Database that is set up separately from the organization's operational databases for decision making
- Consolidation: connect all the different data sources to select the most important variables for decision making
- Quality: different sources » inconsistent representations / codes / formats » these must be cleaned and harmonized
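As an illustration of the quality point, the sketch below harmonizes two sources that encode the same facts differently before loading them into one table. Everything in it is hypothetical: the source names, the code mapping, and the accepted date formats are assumptions chosen for the example.

```python
from datetime import datetime

# Hypothetical mapping from source-specific codes to one warehouse code.
GENDER_CODES = {"m": "M", "male": "M", "1": "M",
                "f": "F", "female": "F", "2": "F"}
# Date formats assumed to occur in the sources.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def clean_gender(raw):
    """Map a source code to 'M' or 'F'; return None if the code is unknown."""
    return GENDER_CODES.get(str(raw).strip().lower())


def clean_date(raw):
    """Return the date as ISO text (YYYY-MM-DD), or None if no format fits."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


# Two hypothetical sources with inconsistent representations.
crm = [{"id": 1, "gender": "male", "signup": "03/02/2021"}]
webshop = [{"id": 2, "gender": 2, "signup": "2021-02-03"}]

warehouse = [
    {"id": r["id"],
     "gender": clean_gender(r["gender"]),
     "signup": clean_date(r["signup"])}
    for r in crm + webshop
]
print(warehouse)
```

Unknown codes and unparseable dates come back as None rather than being guessed, so they can be reviewed separately before loading.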