Lecture notes and some additional info from the suggested book chapter
Suggested book chapter: C.S. Jensen, T.B. Pedersen, C. Thomsen, "Fundamental
Concepts". Chapter 2 in "Multidimensional Databases and Data Warehousing".
2010. Morgan & Claypool. ISBN 978-1608455379.
How to get insights from data
1. Formulate “questions to data”
2. Imagine visualizations/reports
3. Design star schema(s) for cube(s) by analyzing (1) and (2) to identify the fact(s) and dimensions
4. Create (empty) database with schema
5. Fill database by transforming sources
6. Use: Analytics or (Predictive) modeling by connecting to the database
How to go from sources to senses: Transform (re-shape) sources into cubes in a database
(DBMS, safe space) and access them (connect) through a visualization environment and
analytical applications
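Steps 3 to 6 above can be sketched in a few lines of Python. This is a minimal illustration, not the method from the book chapter: it assumes an in-memory SQLite database as the DBMS, and the source rows, the table names (dim_date, dim_product, fact_sales) and the function names are all hypothetical, chosen only for this example.

```python
import sqlite3

# Hypothetical source rows (day, product, category, quantity, amount),
# standing in for an export from an information system.
source_rows = [
    ("2024-01-05", "Laptop", "Hardware", 2, 1800.0),
    ("2024-01-05", "Mouse", "Hardware", 5, 100.0),
    ("2024-02-10", "Laptop", "Hardware", 1, 900.0),
    ("2024-02-11", "Editor license", "Software", 3, 150.0),
]


def build_cube(rows):
    """Create an empty star schema (step 4) and fill it from the sources (step 5)."""
    conn = sqlite3.connect(":memory:")
    # Step 3/4: one fact table surrounded by two dimension tables.
    conn.executescript("""
        CREATE TABLE dim_date (
            date_id INTEGER PRIMARY KEY, day TEXT UNIQUE, month TEXT);
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY, name TEXT UNIQUE, category TEXT);
        CREATE TABLE fact_sales (
            date_id INTEGER REFERENCES dim_date(date_id),
            product_id INTEGER REFERENCES dim_product(product_id),
            quantity INTEGER,
            amount REAL);
    """)
    # Step 5: re-shape each flat source row into dimension rows plus one fact row.
    for day, name, category, quantity, amount in rows:
        conn.execute(
            "INSERT OR IGNORE INTO dim_date (day, month) VALUES (?, ?)",
            (day, day[:7]))
        conn.execute(
            "INSERT OR IGNORE INTO dim_product (name, category) VALUES (?, ?)",
            (name, category))
        date_id = conn.execute(
            "SELECT date_id FROM dim_date WHERE day = ?", (day,)).fetchone()[0]
        product_id = conn.execute(
            "SELECT product_id FROM dim_product WHERE name = ?", (name,)).fetchone()[0]
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
            (date_id, product_id, quantity, amount))
    conn.commit()
    return conn


def revenue_by_month_and_category(conn):
    """Step 6: a 'question to data', answered by aggregating the fact over two dimensions."""
    return conn.execute("""
        SELECT d.month, p.category, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_date d ON d.date_id = f.date_id
        JOIN dim_product p ON p.product_id = f.product_id
        GROUP BY d.month, p.category
        ORDER BY d.month, p.category
    """).fetchall()


conn = build_cube(source_rows)
result = revenue_by_month_and_category(conn)
```

In this sketch the question "what is the revenue per month and product category?" (step 1) determines the measure (amount) and the dimensions (date, product) of the schema. A visualization tool would connect to the same database and run a similar aggregate query.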
Data science process
1. Sources
2. Prepare
3. Analyze
4. Use
Sources
1. Information systems
2. Sensors
3. Internet
4. Social media
Prepare
1. Search
2. Harvest
3. Combine
4. Transform
5. Clean
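The combine, transform and clean steps above might look as follows in plain Python. This is only a sketch under stated assumptions: the two record lists, their field names, the date formats and the cleaning rules (drop rows with a missing amount, drop exact duplicates) are hypothetical and would differ for real sources.

```python
from datetime import datetime

# Hypothetical harvested sources that describe the same kind of event
# but use different date formats and inconsistent spelling.
system_records = [
    {"customer": "Alice", "date": "2024-01-05", "amount": "120.50"},
    {"customer": "bob ", "date": "2024-01-06", "amount": "80"},
]
web_records = [
    {"customer": "Alice", "date": "05/01/2024", "amount": "120.50"},  # duplicate
    {"customer": "Carol", "date": "07/01/2024", "amount": ""},  # missing amount
]


def transform(record, date_format):
    """Transform: bring one record into a common shape (ISO dates, numeric amounts)."""
    return {
        "customer": record["customer"].strip().title(),
        "date": datetime.strptime(record["date"], date_format).date().isoformat(),
        "amount": float(record["amount"]) if record["amount"] else None,
    }


def prepare(system, web):
    """Combine both sources, then clean out incomplete and duplicate rows."""
    # Combine: one list in a common shape, regardless of origin.
    combined = [transform(r, "%Y-%m-%d") for r in system]
    combined += [transform(r, "%d/%m/%Y") for r in web]

    # Clean: skip rows without an amount and rows already seen.
    cleaned, seen = [], set()
    for row in combined:
        key = (row["customer"], row["date"], row["amount"])
        if row["amount"] is None or key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


prepared = prepare(system_records, web_records)
```

Dropping incomplete rows is only one possible cleaning rule; depending on the analysis, missing values could instead be kept and flagged.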
Analyze
1. Machine learning
2. Mining
3. Visualize
Use
1. Interpret
2. Deploy
3. Decide
4. Monitor
* Data preparation and cleaning take up 80% of a data scientist's time