Summary

Summary Business Intelligence | UGent | 2025/26

Pages
117
Uploaded on
03-06-2026
Written in
2025/2026

Summary for Business Intelligence in the bridging programme (Schakelprogramma) for the Master of Science in Business Administration (Handelswetenschappen) at Ghent University. The notes cover the fundamental concepts: why data science matters for businesses, big data, data warehouses and data lakes, and data-analytical thinking. Ideal for exam preparation and for understanding the core concepts of data science in a business context.

Content preview

BUSINESS INTELLIGENCE
H0 INTRODUCTION
WHY IS DATA SCIENCE IMPORTANT FOR BUSINESSES?
Law of massive digital storage
Big data
Maslow's hierarchy of big data
Data warehouses & data marts
Data lakes
Data warehouses vs. data lakes
Data in businesses
Data value trap
H1.1 DATA-ANALYTICAL THINKING
INTRODUCTION
WHY DATA-ANALYTICAL THINKING AND DATA SCIENCE?
Data opportunities
Compliance with regulations
Possible applications
EXAMPLES
Hurricane Frances - WalMart
Pregnancy prediction - Target
Churn prediction - Megatrends
WHAT IS DATA-ANALYTICAL THINKING?
Data science capability as a strategic asset
Signet Bank vs. Capital One
Amazon
Harrah's Casinos
Valuation of Facebook and Twitter
WHAT IS DATA SCIENCE?
SUMMARY
H1.2 BUSINESS PROBLEMS & DATA SCIENCE SOLUTIONS
DIFFERENT DATA MINING TASKS
Classification & class probability estimation
Regression
Similarity matching
Clustering
Co-occurrence grouping
Profiling
Link prediction
Data reduction
Causal modeling
Conclusion
Two high-level primary goals: prediction and description
SUPERVISED VS. UNSUPERVISED METHODS
Example
THE DATA MINING PROCESS
Important distinction
Knowledge discovery in databases
OTHER ANALYSIS TECHNIQUES AND TECHNOLOGIES
Statistics
Database querying
OLAP tools
Data warehousing
Regression analysis
Machine learning (AI) and data mining (KDD)
H2.1 INTRODUCTION TO PREDICTIVE MODELING
TERMINOLOGY
Model
In data science
Two high-level primary goals: prediction & description
Instance
Induction & deduction
SUPERVISED SEGMENTATION
Complications
SELECTING INFORMATIVE ATTRIBUTES
Entropy
Information gain
Example: calculating IG
Numeric values
Regression problems
SUPERVISED SEGMENTATION WITH TREE-STRUCTURED MODELS
Example
Body shape
Summary
OTHER REPRESENTATIONS
Visualization of segments
Decision lines & hyperplanes
Trees as sets of rules
PROBABILITY ESTIMATION
Example
H2.2 FITTING A MODEL TO DATA
CONTENTS
Decision trees vs. parametric modeling
Three assumptions
LINEAR DISCRIMINANT FUNCTIONS
Instance space
Linear discriminant function
Optimizing the objective function
Example of linear discrimination
CLASSIFICATION: SCORING & RANKING
LINEAR MODEL FOR CLASSIFICATION
SUPPORT VECTOR MACHINES (SVM)
Logistic regression
Linear regression
WHAT IF THE DATA IS NON-LINEAR?
H3.1 OVERFITTING & ITS AVOIDANCE
OVERFITTING
Definition
What now?
Holdout data & fitting graphs
PREDICTION TECHNIQUES & OVERFITTING
WHY IS OVERFITTING BAD?
AVOIDING OVERFITTING
CROSS-VALIDATION
LEARNING CURVES
AVOIDING OVERFITTING & COMPLEXITY CONTROL
H3.2 SIMILARITY, NEIGHBORS & CLUSTERS
CALCULATE SIMILARITY
USE OF SIMILARITY
DISTANCE
NEAREST-NEIGHBOUR REASONING (NN)
Trigonometric interpretation, overfitting & complexity control
3 problems with k-NN
Technical details on NN: heterogeneous attributes
Technical details on NN: other distance functions
CLUSTERING AS SIMILARITY-BASED SEGMENTATION
Supervised vs. unsupervised
Clustering = unsupervised segmentation
2 types of clustering
Hierarchical clustering vs. centroid clustering (k-means)
Clustering results
H4.1 DECISION ANALYTICAL THINKING 1: WHAT IS A GOOD MODEL?
INTRODUCTION
EVALUATING CLASSIFIERS
Plain accuracy
Problem with unbalanced classes
Confusion matrix
Problems with unequal costs and benefits
GENERALIZING BEYOND CLASSIFIERS
General principle
EXPECTED VALUE FRAMEWORK
Using expected value to frame classifier use
Using expected value to evaluate the classifier
Costs & benefits within the expected value framework
BASELINE PERFORMANCE (& CONSEQUENCES)
Baseline model
General principles
Other
H4.2 VISUALISING MODEL PERFORMANCE
RANKING INSTEAD OF CLASSIFYING
PROFIT CURVES
ROC curves & AUC (area under the curve)
CUMULATIVE RESPONSE & LIFT CURVES
EXAMPLE: CHURN PREDICTION
H5.1 EVIDENCE AND PROBABILITIES
EXAMPLE
COMBINING EVIDENCE PROBABILISTICALLY
JOINT PROBABILITY & INDEPENDENCE
BAYES' RULE
Applying Bayes' rule to data science
Conditional independence & naive Bayes
Advantages & disadvantages of naive Bayes
A MODEL OF EVIDENCE "LIFT"
Example: evidence lifts from Facebook likes
SUMMARY
H5.2 REPRESENTING AND MINING TEXT
DATA PREPARATION
WHY IS TEXT IMPORTANT?
WHY IS TEXT DIFFICULT?
REPRESENTATION
Bag of words
Term frequency
Normalization and stemming
Measuring sparseness: inverse document frequency
Combining TF & IDF: TFIDF
EXAMPLE
THE RELATIONSHIP OF IDF TO ENTROPY
BEYOND BAG OF WORDS
N-gram sequence
Named entity extraction
Topic models
EXAMPLE: DATA MINING TO PREDICT STOCK PRICE MOVEMENT

Document information

Uploaded on
June 3, 2026
File latest updated on
June 3, 2026
Number of pages
117
Written in
2025/2026
Type
SUMMARY

Seller
sahinselin03 (Katholieke Hogeschool VIVES)