Summary: Business Intelligence | UGent | 2025/26

Pages: 117
Uploaded on: 3 June 2026
Written in: 2025/2026

Summary for Business Intelligence in the Schakelprogramma (bridging programme) for the Master of Science in de Handelswetenschappen at Universiteit Gent. The notes cover fundamental concepts: why data science matters to businesses, big data, data warehouses and data lakes, and data-analytical thinking. Suited to exam preparation and to understanding the core concepts of data science in a business context.

Preview of the content

BUSINESS INTELLIGENCE
H0 INTRODUCTION
WHY IS DATA SCIENCE IMPORTANT FOR BUSINESSES?
Law of mass digital storage
Big data
Maslow's hierarchy of big data
Data warehouses & data marts
Data lakes
Data warehouses vs. data lakes
Data in businesses
Data value trap
H1.1 DATA-ANALYTICAL THINKING
INTRODUCTION
WHY DATA-ANALYTICAL THINKING AND DATA SCIENCE?
Data opportunities
Compliance with regulations
Possible applications
EXAMPLES
Hurricane Frances - WalMart
Pregnancy prediction - Target
Churn prediction - Megatrends
WHAT IS DATA-ANALYTICAL THINKING?
Data science capability as a strategic asset
Signet Bank vs. Capital One
Amazon
Harrah's Casinos
Valuation of Facebook and Twitter
WHAT IS DATA SCIENCE?
SUMMARY
H1.2 BUSINESS PROBLEMS & DATA SCIENCE SOLUTIONS
DIFFERENT DATA MINING TASKS
Classification & class probability estimation
Regression
Similarity matching
Clustering
Co-occurrence grouping
Profiling
Link prediction
Data reduction
Causal modeling
Conclusion
Two high-level primary goals: prediction and description
SUPERVISED VS. UNSUPERVISED METHODS
Example
THE DATA MINING PROCESS
Important distinction
Knowledge discovery in databases
OTHER ANALYTICAL TECHNIQUES AND TECHNOLOGIES
Statistics
Database querying
OLAP tools
Data warehousing
Regression analysis
Machine learning (AI) and data mining (KDD)
H2.1 INTRODUCTION TO PREDICTIVE MODELING
TERMINOLOGY
Model
In data science
Two high-level primary goals: prediction & description
Instance
Induction & deduction
SUPERVISED SEGMENTATION
Complications
SELECTING INFORMATIVE ATTRIBUTES
Entropy
Information gain
Example: calculating IG
Numeric values
Regression problems
SUPERVISED SEGMENTATION WITH TREE-STRUCTURED MODELS
Example
Body shape
Summary
OTHER REPRESENTATIONS
Visualization of segments
Decision lines & hyperplanes
Trees as sequences of rules
PROBABILITY ESTIMATION
Example
H2.2 FITTING A MODEL TO DATA
CONTENTS
Decision trees vs. parametric modeling
Three assumptions
LINEAR DISCRIMINANT FUNCTIONS
Instance space
Linear discriminant function
Optimizing the objective function
Example of linear discrimination
CLASSIFICATION: SCORING & RANKING
LINEAR MODEL FOR CLASSIFICATION
SUPPORT VECTOR MACHINES (SVM)
Logistic regression
Linear regression
WHAT IF THE DATA IS NON-LINEAR?
H3.1 OVERFITTING & ITS AVOIDANCE
OVERFITTING
Definition
What now?
Holdout data & fitting graphs
PREDICTION TECHNIQUES & OVERFITTING
WHY IS OVERFITTING BAD?
AVOIDING OVERFITTING
CROSS-VALIDATION
LEARNING CURVES
AVOIDING OVERFITTING & COMPLEXITY CONTROL
H3.2 SIMILARITY, NEIGHBORS & CLUSTERS
CALCULATE SIMILARITY
USE OF SIMILARITY
DISTANCE
NEAREST-NEIGHBOUR REASONING (NN)
Geometric interpretation, overfitting & complexity control
3 problems with k-NN
Technical details on NN: heterogeneous attributes
Technical details on NN: other distance functions
CLUSTERING AS SIMILARITY-BASED SEGMENTATION
Supervised vs. unsupervised
Clustering = unsupervised segmentation
2 types of clustering
Hierarchical clustering vs. centroid clustering (k-means)
Clustering results
H4.1 DECISION ANALYTICAL THINKING 1: WHAT IS A GOOD MODEL?
INTRODUCTION
EVALUATING CLASSIFIERS
Plain accuracy
Problem with unbalanced classes
Confusion matrix
Problems with unequal costs and benefits
GENERALIZING BEYOND CLASSIFIERS
General principle
EXPECTED VALUE FRAMEWORK
Using expected value to frame classifier use
Using expected value to evaluate the classifier
Costs & benefits within the expected value framework
BASELINE PERFORMANCE (& CONSEQUENCES)
Baseline model
General principles
Other
H4.2 VISUALISING MODEL PERFORMANCE
RANKING INSTEAD OF CLASSIFYING
PROFIT CURVES
ROC curves & AUC (area under the curve)
CUMULATIVE RESPONSE & LIFT CURVES
EXAMPLE: CHURN PREDICTION
H5.1 EVIDENCE AND PROBABILITIES
EXAMPLE
COMBINING EVIDENCE PROBABILISTICALLY
JOINT PROBABILITY & INDEPENDENCE
BAYES' RULE
Applying Bayes' rule to data science
Conditional independence & naive Bayes
Advantages & disadvantages of naive Bayes
A MODEL OF EVIDENCE "LIFT"
Example: evidence lifts from Facebook likes
SUMMARY
H5.2 REPRESENTING AND MINING TEXT
DATA PREPARATION
WHY IS TEXT IMPORTANT?
WHY IS TEXT DIFFICULT?
REPRESENTATION
Bag of words
Term frequency
Normalization and stemming
Measuring sparseness: inverse document frequency
Combining TF & IDF: TFIDF
EXAMPLE
THE RELATIONSHIP OF IDF TO ENTROPY
BEYOND BAG OF WORDS
N-gram sequence
Named entity extraction
Topic models
EXAMPLE: DATA MINING TO PREDICT STOCK PRICE MOVEMENT

Document information

File last updated on: 3 June 2026
Type: Summary
Meet the seller
sahinselin03, Katholieke Hogeschool VIVES