Applied Statistics: Theory and Exercises
Table of contents
General information
  Aim of the course
  Assessment
Summarising and describing
  Why statistics?
  Empirical scientific research
  Rules/principles for research
  Exploratory and inferential research
  The hypothetico-deductive model
  Research units
  Variables
  Measurement level of variables
  Descriptive and inferential statistics
  Frequency distribution
  Bar chart
  Histogram (grouping into classes)
  Frequency distribution versus probability distribution
  Stem-and-leaf plot
  Measures of central tendency
  Measures of spread
Summary and warnings
  Cancer Survival (Case study)
  Population and sample
  Cancer mortality near the Hanford Reactor (Case study)
  Correlation or causation?
  Measures of central tendency and outliers
  Skewness
  Unequal class widths: frequency density
  Transformation and z-score
    Linear transformation
    Z-score
  Probability distributions
    Normal distribution: two parameters (μ, σ)
  Calculating probabilities
Statistics part 1, Pauline Chaumet
Sexology 2020-2021
The systematics of chance
  Chance
  Inferential statistics
  Sample means
  Central limit theorem
  Hypothesis testing
  z-score
  t-test
Principles of statistical testing I
  Normal human body temperature (Case study)
  Confidence interval
  t-distribution
  Testing with a confidence interval
  t-test
  t-test (two populations)
  Paired t-test
  Women in the labor force (Case study)
Principles of statistical testing II
  Independent samples (Stress)
  Theory applied to the example
  Standard deviations are equal: σ_S = σ_C = σ_p
  Standard deviations are not equal: σ_S ≠ σ_C
  Normal human body temperature (Case study)
  Cancer Survival (Case study)
  Parametric tests
  Non-parametric (distribution-free) tests
  Wilcoxon rank-sum test (Mann-Whitney test)
  Evaluation of a weight-loss product (Case study)
  Height and ethnicity study
Association between variables I
  Kalama study (Case study)
  Covariance
  Pearson correlation coefficient
  Testing Pearson’s correlation coefficient
  Cancer mortality near the Hanford Reactor (Case study)
  Smoking and cancer (Case study)
  Linear regression
  Least-squares method
  Sources of variation
  The sum of squares
  Coefficient of determination
Association between variables II
  Kalama study (Case study)
  Danger of extrapolation
  Slope and correlation coefficient
  Normal human body temperature (Case study)
Choice of analysis technique
  Muscle mass study (Case study)
  Clinical trial in Schizophrenia (Case study)
  EU public debt crisis (Case study)
  Choice of analysis technique
Exercises
General information
Aim of the course
- Focus: applications in medical and paramedical research
o Research into the causes of disease
o Comparing different treatments
- Carrying out sound scientific research
- Interpreting research results correctly
- Reading scientific articles more effectively
- Materials and Methods
- Critically assessing the quality of articles
Two modules
- Methodological module: statistics theory, EOG65a
- Practical module: statistics exercises (SPSS), EOG6a
Assessment
Multiple-choice exam (15/20) and a practical assignment (5/20)
- Multiple-choice exam: 20 multiple-choice questions with four possible answers each and a
correction for guessing (-0.33)
- Practical assignment: analysing a dataset in a group of students
- Second exam chance (resit): the marks for the practical assignment are carried over
- Advice: study to learn!
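The correction for guessing mentioned above can be illustrated with a short sketch. It rests on assumptions that the notes do not state: the -0.33 is taken to be a rounded 1/3 deducted per wrong answer, a correct answer is worth 1 point, and a blank answer is worth 0. The function names are illustrative only.

```python
from fractions import Fraction

# Assumptions (not stated in the course notes): the penalty applies per wrong
# answer, a blank answer scores 0, and a correct answer scores 1 point.
N_OPTIONS = 4
PENALTY = Fraction(1, N_OPTIONS - 1)  # 1/3, shown rounded as 0.33 in the notes


def corrected_score(n_correct: int, n_wrong: int) -> Fraction:
    """Raw exam score after the correction for guessing."""
    return n_correct - PENALTY * n_wrong


def expected_gain_per_blind_guess() -> Fraction:
    """Expected points from picking one of the four options at random."""
    p_correct = Fraction(1, N_OPTIONS)
    return p_correct * 1 - (1 - p_correct) * PENALTY


print(corrected_score(14, 6))           # 14 correct, 6 wrong -> 12
print(expected_gain_per_blind_guess())  # 0
```

Under these assumptions a blind guess neither gains nor loses points on average, which is the purpose of such a correction.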
Summarising and describing
Why statistics?
8 theory lectures, 5 practical sessions. Buying the book is not necessary. The exam is partly
multiple choice (15/20); the rest is a project.
Why learn statistics?
The ability to take data, to be able to understand it, to process it, to extract value from it, to visualize
it, to communicate it, that’s going to be a hugely important skill in the next decades. Because now we
really do have essentially free and ubiquitous data. So the complementary scarce factor is the ability to
understand that data and extract value from it.
➔ Every day we are bombarded with data
1. General information
2. Summarising and describing
3. Summary and warnings