Summary: Statistics 4
This summary contains all the theory you need to know! It is mainly based on the PDF file, but I have also supplemented it with information from the practicals.
Introduction
If manipulation is impossible → observing the relations between variables

BUT 3 complications …

- Uncertainty: human behavior is influenced by many factors, a lot of which are unknown
o Solution: better theories, more knowledge, improved control
- (Measurement) noise: measurement instruments are far from perfect
o Solution: better measurement
- Variation: effects and relations in the behavioral sciences vary → variation over
situations, times and a huge variation across people
o Solution: better understanding of variation, knowing how and why effects vary

BUT humans fall prey to fallacies when reasoning (e.g. confirmation bias), especially when there is a lot of noise and variation

Basic statistical models: analysis of variance (ANOVA), linear regression and logistic regression
→ more advanced models: multilevel models and structural equation models

XY
- X: predictor, independent variable (usually multiple)
- Y: outcome, criterion, dependent variable (explained by one or more X’s)


read.table("./Data/Chapter1/Ch1DataExample.csv", header=TRUE, sep=";")   Reading a csv file
install.packages("A")   Installing package A
library("A")   Loading package A
cat(version$version.string)   Check which version of R you have
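A minimal sketch of loading and inspecting the example file (the inspection calls are generic; the file's columns are not described in the text):

d <- read.table("./Data/Chapter1/Ch1DataExample.csv", header = TRUE, sep = ";")
str(d)    # variable names and types
head(d)   # first few rows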





H1: The good old one-way ANOVA
ANOVA (ANalysis Of VAriance) = statistical methodology to compare the means of 2 or more
groups (~ generalization of the independent groups t-test)

Data pass the interocular trauma test when the conclusion hits you between the eyes: you immediately see what the data mean, and no further statistical analysis is needed

𝒚𝒊𝒋 = score of person 𝑖 in condition 𝑗 ➔ 𝑖 and 𝑗 are running indices

- 𝑖 = 1, …, 𝑚𝑗 (𝑚𝑗 persons in condition 𝑗)
- 𝑗 = 1, …, 𝑎 (𝑎 conditions/groups)
o 𝑎 = levels of a factor

Balanced design = all 𝑚ⱼ's are equal ↔ unbalanced design = not all 𝑚ⱼ's are equal


𝑛 = total number of participants

𝑦̅𝑗 = sample average in condition 𝑗

𝑦̅ = grand sample average



Step-by-step

1. Models and hypotheses
2. Choice of the test statistic
3. The sampling distribution of F under H0 and what to conclude
4. Determine the size of your effect


STEP 1: MODELS AND HYPOTHESES

ANOVA: comparison of 2 statistical models
→ generative models: they specify how the scores on the criterion variable are generated

- The full model: 𝒚𝒊𝒋 = 𝝁𝒋 + 𝜺𝒊𝒋
o Observation = systematic (structural or signal) part + random deviation
(stochastic 𝜀𝑖𝑗 or noise)
o Population mean has index 𝑗 → can differ across conditions
- The reduced model: 𝒚𝒊𝒋 = 𝝁 + 𝜺𝒊𝒋
o Special case of the full model, nested in the full model
o Assumption that 𝑎 means are all equal
o 𝐻0: 𝜇1 = 𝜇2 = ⋯ = 𝜇𝑎


Parameter = has certain value in the population, but is unknown to us (e.g. population mean in
full and reduced model) → draw a sample, make observations and estimate it

- Estimated parameter is indicated with a hat (e.g. 𝜇̂ )
- Fitted value = based on the estimated parameters, this is the best guess for an
observation based on the model ➔ model-based approximation to the observed score


Least squares estimation = look for the values of the parameters that minimize the sum of squared differences between what is observed and what the model says it should be → standard method of estimation in ANOVA

𝐐𝐫𝐞𝐝𝐮𝐜𝐞𝐝(𝛍) = Σᵢⱼ (𝑦ᵢⱼ − 𝜇)² = sum of squared differences → function of the unknown parameter μ → find the value of μ which minimizes Qreduced(μ)

- 𝑦𝑖𝑗 − 𝜇: difference between an observation and what the model tells us = residual 𝒆𝒊𝒋
o Large residual: model does a bad job in explaining that observation
o Small residual: model does a good job
- Squared so that positive and negative residuals don’t cancel each other out
- Reduced model: 𝝁̂ = 𝑦̂ᵢⱼ(reduced) = 𝑦̅
o Estimated parameter = fitted value = grand sample average

𝐐𝐟𝐮𝐥𝐥(𝝁𝟏, …, 𝝁𝒂) = Σᵢⱼ (𝑦ᵢⱼ − 𝜇ⱼ)²

- Full model: 𝝁̂ⱼ = 𝑦̂ᵢⱼ(full) = 𝑦̅ⱼ
o Estimated population mean of condition 𝑗 = sample average of condition 𝑗


Sum of squares = error sum of squares = residual sum of squares = summary measure of the
size of the residuals




! Reduced model is NOT always a model with a single mean for all groups !


SSTot (total sum of squares) = measures total variation in the data

- One-way ANOVA: SSTot = SSEreduced
- SSEreduced ≥ SSEfull
- SSEff = effect sum of squares = difference between the error sums of squares: SSEff = SSEreduced − SSEfull
o Expresses how much we can decrease the error by taking the different groups/conditions into account
o Shorter way of computing SSEff in a one-way design (see the sketch below): SSEff = Σⱼ 𝑚ⱼ (𝑦̅ⱼ − 𝑦̅)²
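A minimal sketch with made-up scores (𝑎 = 2 conditions, 𝑚ⱼ = 3 persons each), showing that the shortcut gives the same SSEff:

y <- c(4, 5, 6, 8, 9, 10)
j <- rep(1:2, each = 3)
ybar  <- mean(y)                     # grand sample average (7)
ybarj <- tapply(y, j, mean)          # condition averages (5 and 9)
SSEreduced <- sum((y - ybar)^2)      # error sum of squares, reduced model (28)
SSEfull    <- sum((y - ybarj[j])^2)  # error sum of squares, full model (4)
SSEreduced - SSEfull                 # SSEff = 24
sum(3 * (ybarj - ybar)^2)            # shortcut: sum_j m_j * (ybar_j - ybar)^2, also 24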


Problems with interpreting the magnitude of SSE and SSEff

- Problem of scaling: SS can't be interpreted meaningfully in an absolute way, but only relative to each other (e.g. multiply the scores by 100 → the SS increase by a factor 100²)
- The SSE of the reduced model is always larger than or equal to that of the full model (because the full model is more complex and flexible and thus has smaller residuals)
o If H0 is true (the reduced model is the most true), the difference between the SS is small, BUT what is small? → solution: take the degrees of freedom into account (~ complexity of the models)





Degrees of freedom ~ complexity of the model

- Reduced model: there are only n – 1 independent numbers, not n, because if you know n – 1 residuals of the reduced model, then you also know the last one (their sum is 0)
- Full model: summing the degrees of freedom per condition gives n – a (because there are a condition-specific population means)
- Larger degrees of freedom ↔ simpler models (with a smaller number of parameters)
- df = number of observations – number of freely estimated parameters




Mean squares = sum of squares divided by its degrees of freedom

- MSEfull = SSEfull / (n – a) and MSEff = SSEff / (a – 1)
- Degrees of freedom of SSEff = a – 1, because it is the difference between the degrees of freedom of the reduced and the full model → (n – 1) – (n – a) = a – 1



Effect parameter 𝛼ⱼ

- 𝛼ⱼ = 𝜇ⱼ − 𝜇 = effect or deviation of condition j compared to the grand mean 𝜇 → how much the mean of group j differs from the overall mean
- The effect parameters sum to 0: Σⱼ 𝛼ⱼ = 0
- E.g. height male/female: population mean 170 cm, female 160 cm, male 180 cm → 𝛼male = 10 cm, 𝛼female = –10 cm → sum is 0 (see the check below)
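The height example as a minimal R check:

mu  <- 170                           # grand mean
muj <- c(female = 160, male = 180)   # condition means
alpha <- muj - mu                    # effect parameters: -10 and +10
sum(alpha)                           # 0, as required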



STEP 2: CHOICE OF THE TEST STATISTIC

Find out if we can collect evidence against the reduced model (and H0) in favor of the full model

- Looking at lack of fit of the model to the data
- Evaluate complexity of the model
➔ Fit & complexity are opposing quantities: if one goes up, the other goes down
➔ Is the decrease in SSE (full < reduced) of the full model large enough to justify its
increase in complexity (full > reduced)?



F-statistic

F = MSEff / MSEfull = [SSEff / (a – 1)] / [SSEfull / (n – a)]

- Perspective 1: F is a fraction consisting of a numerator (top) and a denominator (bottom)
o Numerator: variability between the sample averages of the conditions/groups
▪ Sampling variability: randomness
▪ Systematic variability: effect of manipulation
o Denominator: variability within conditions
▪ Only sampling variability: randomness
- Perspective 2: F is clarified by taking the expected values of MSEff and MSEfull: E(MSEff) = 𝜎² + 𝑚 · Σⱼ 𝛼ⱼ²/(a – 1) and E(MSEfull) = 𝜎² (balanced design)
o m = group sample size → the larger m, the larger the F value
o The fraction Σⱼ 𝛼ⱼ²/(a – 1) reflects the effect size
o Under the reduced model (H0): all 𝛼ⱼ = 0 → E(MSEff) = 𝜎²




STEP 3: THE SAMPLING DISTRIBUTION OF F UNDER H0 AND WHAT TO CONCLUDE

p value = the probability, given H0, of finding an equally or more extreme value of the F statistic ➔ Pr(F ≥ Fobs | H0 is true)

- p value = significance probability = observed significance level = probability level
- Conditional probability: given that H0 (the reduced model) is true


Interpretation p value
- Fisher: interpreted in a continuous way as evidence against H0
o Smaller p-value: more evidence against H0 (no effect of conditions)




- Neyman & Pearson: binary decision of rejecting H0 or not
o Comparing p value with nominal significance level 𝛼 (= 0.05, 0.01, 0.001, …)
▪ p < 𝛼: reject H0 → significant result
▪ p ≥ 𝛼: don’t reject H0
o Can also use F values directly
▪ Fobs ≥ F(1 – 𝛼; a – 1, n – a): reject H0 (otherwise don't reject)
▪ F(1 – 𝛼; a – 1, n – a) = the 100·(1 – 𝛼) percentile of the F distribution with a – 1 and n – a degrees of freedom
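A minimal end-to-end sketch in R, using simulated data (the means, group size and seed are made up), checking the hand computation against R's own ANOVA table:

set.seed(1)
g <- factor(rep(1:3, each = 10))                 # a = 3 conditions, m_j = 10, n = 30
y <- rnorm(30, mean = c(10, 12, 11)[as.integer(g)], sd = 2)
fit <- lm(y ~ g)
anova(fit)                                       # F value and Pr(>F)
# the same F test by hand:
SSTot   <- sum((y - mean(y))^2)                  # = SSEreduced in one-way ANOVA
SSEfull <- sum(resid(fit)^2)
SSEff   <- SSTot - SSEfull
Fobs <- (SSEff / (3 - 1)) / (SSEfull / (30 - 3)) # MSEff / MSEfull
1 - pf(Fobs, 2, 27)                              # p value, matches the anova() table
qf(0.95, 2, 27)                                  # critical value at alpha = .05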



STEP 4: DETERMINE THE SIZE OF YOUR EFFECT

Important to judge whether a result is (besides statistically significant) also practically or
clinically significant → effect size

- Association measure: proportion of variance explained → how strongly is the variation in the outcome associated with the variation in the conditions in the population?
- 𝜼² = population proportion of variance explained
o Estimated using sample statistics: 𝜂̂², R² or 𝜔̂²

SSTot: total sum of squares



- Measures the deviation of the observations from the grand sample average → index of total variability in the sample
- Variance to be explained via the ANOVA model
- In one-way ANOVA: SSTot = Σᵢⱼ (𝑦ᵢⱼ − 𝑦̅)² = SSEff + SSEfull



SSEfull: error sum of squares

- How much variability is left unexplained under the full model (with conditions)
- Variability within conditions or groups


SSEff: effect sum of squares

- Difference between the variability to be explained and the unexplained variability
- Explained variability → 𝜂̂² = SSEff / SSTot



Disadvantage of 𝜂̂²: biased estimator of the true proportion of variance explained 𝜂²

- If 𝜂² = 0 (the true effect is 0, the factor is not associated with the outcome), then E(𝜂̂²) > 0: positive bias
- E(𝜂̂²) = 0 would require that …
o positive and negative values of 𝜂̂² cancel each other out → CAN'T happen because 𝜂̂² ≥ 0
o 𝜂̂² = 0 in each sample → very unlikely, because the condition sample averages will never be exactly equal to each other due to sampling variability → small positive values of 𝜂̂²



Unbiased proportion of variance explained: 𝝎̂²

- Unbiased estimator of the proportion of variance explained in the population
- If 𝜂² = 0 ➔ then E(𝜔̂²) = 0
- 𝜔̂² is usually smaller than 𝜂̂² and a better estimator
- Downside: 𝜔̂² can become negative (needed to attain an average value of 0 when there is no effect)
o Usually 𝜔̂² is set to 0 if it is smaller than 0!
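A sketch of both estimators; the 𝜔̂² formula below is the common one-way ANOVA version, which I assume is the one intended here:

set.seed(1)
g <- factor(rep(1:3, each = 10))
y <- rnorm(30, mean = c(10, 12, 11)[as.integer(g)], sd = 2)
tab <- anova(lm(y ~ g))
SSEff   <- tab$"Sum Sq"[1]
SSEfull <- tab$"Sum Sq"[2]
SSTot   <- SSEff + SSEfull
MSEfull <- tab$"Mean Sq"[2]
SSEff / SSTot                                    # eta-hat squared (positively biased)
(SSEff - (3 - 1) * MSEfull) / (SSTot + MSEfull)  # omega-hat squared; set to 0 if negative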


Rule of thumb (but also depends on domain of research)
- Proportion variance 1%: small effect
- 6%: medium effect
- 14%: large effect


We don't use the F statistic or the p value as a measure of effect size, because they depend on both the size of the effect and the sample size

➔ you can have a very small effect, but a very large F value (or small p value) because of a huge
sample



Uncertainty of effect size estimates (𝜂̂² and 𝜔̂²) → confidence intervals
→ only for quantities of primary interest ((differences between) means, effect sizes, …)



Practical session 1

pnorm(x)         Probability that a normally distributed variable is smaller than or equal to x
pt(t, df)        Probability that the t statistic is smaller than or equal to t, with df degrees of freedom ➔ computes the cumulative probability (area under the curve) for a t distribution given a t score and degrees of freedom
qt(p, df)        Quantile p with df degrees of freedom → which value?
pf(F, df1, df2)  Cumulative probability of a smaller or equal F value under an F distribution with df1 and df2 degrees of freedom ➔ the p value of the F test is 1 – pf(Fobs, df1, df2)
qf(p, df1, df2)  F value (quantile p) of an F distribution with df1 and df2 degrees of freedom
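A few example calls (the numbers are arbitrary; the df match a design with a = 3 and n = 30):

pnorm(1.96)         # ≈ 0.975
pt(2.05, df = 27)   # ≈ 0.975: P(T ≤ 2.05) with 27 degrees of freedom
qt(0.975, df = 27)  # ≈ 2.05: the multiplier for a 95% CI
1 - pf(4.2, 2, 27)  # p value for Fobs = 4.2 with a - 1 = 2 and n - a = 27 df
qf(0.95, 2, 27)     # ≈ 3.35: critical F value at alpha = .05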




Pr(H0 is true) ➔ prior probability (Bayesian): subjective (different for different people) and unknown
Pr(H0 is true | Fobs) ➔ posterior probability (Bayesian): based on the data


Larger n → smaller critical value
Smaller α → larger critical value
Larger a → larger critical value




See “test yourself” !! (8?)





H2: Contrasts, or how to be more specific

2.1 DATA EXAMPLE: THE TREATMENT OF DEPRESSION REVISITED

Preregistration = the research questions, hypotheses, design and plan of analysis are specified
before the data have been collected → open science
- Written in a time-stamped and publicly accessible document
- The researcher can't change his hypotheses afterwards, e.g. by presenting exploratory or post-hoc hypotheses as confirmatory or planned ones


2.2 GOAL OF THIS CHAPTER

F-test checks if there are differences between the conditions
- But it could be that condition 1 differs from 2 but not from 3, or that they all differ, or …

Analysis of contrasts checks which conditions differ from each other and how much they differ


2.3 SOME TERMINOLOGY

Contrast / comparison = difference in the averages of 2 or more conditions (e.g. placebo vs
treatment)

Pairwise contrast = simple difference between the averages of 2 conditions (e.g. 𝑦̅1 − 𝑦̅2 )

Complex contrast = more complicated difference between 2 elements, where one or both of these elements are averages of several conditions (e.g. between the placebo condition and the average of the 2 (or more) treatment groups: 𝑦̅₁ − ½(𝑦̅₂ + 𝑦̅₃))



Contrast = linear combination of sample averages: g = Σⱼ cⱼ 𝑦̅ⱼ

- The coefficients cⱼ are known and sum to 0: Σⱼ cⱼ = 0



Population contrast:

- Population value of the contrast: 𝛾 = Σⱼ cⱼ 𝜇ⱼ
- Sample estimate: g = Σⱼ cⱼ 𝑦̅ⱼ



Planned contrast = specified before the data have been collected or seen

Post-hoc contrast = inspired by looking at the data (e.g. difference between group 1 and 3 looks
large, let’s test it)

Multiple contrasts, multiple comparisons = more than one planned contrast

Multiple post-hoc contrasts




2.4 A SINGLE PLANNED CONTRAST

2.4.1 DERIVATION OF THE SAMPLING DISTRIBUTION OF G

Sampling distribution of g under a statistical model (e.g. full model) quantifies the uncertainty
around the sample contrast value g


1. FORM OF THE DISTRIBUTION OF G

If 𝑦𝑖𝑗 is normally distributed → sample averages of these observations + every linear
combination of the sample averages (= contrasts) are also normally distributed

Normal distribution is completely determined by its mean and variance


2. EXPECTED VALUE OF G

Expected value of a linear combination = linear combination of the expected values:
E(g) = Σⱼ cⱼ E(𝑦̅ⱼ) = Σⱼ cⱼ 𝜇ⱼ = 𝛾

g = unbiased estimator of 𝛾




3. VARIANCE OF G

Variance of the sum = sum of the variances (because the observations are iid)

Variance of a sample average = variance of a single observation (𝜎²) divided by the number of observations in that average (mⱼ)

→ Var(g) = Σⱼ cⱼ² Var(𝑦̅ⱼ) = 𝜎² Σⱼ cⱼ²/mⱼ




4. SUMMARY

Standard error of g = uncertainty in g based on sample measures: SE(g) = √(MSEfull · Σⱼ cⱼ²/mⱼ)

Replace the unknown 𝜎² by an estimate based on the data: MSEfull (see the sketch below)
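A sketch for a single planned contrast, reusing the simulated data from before (the coefficients compare condition 1 with the average of conditions 2 and 3):

set.seed(1)
g <- factor(rep(1:3, each = 10))
y <- rnorm(30, mean = c(10, 12, 11)[as.integer(g)], sd = 2)
cj <- c(1, -0.5, -0.5)                         # contrast coefficients, sum to 0
mj <- tapply(y, g, length)                     # group sizes m_j
ybarj <- tapply(y, g, mean)                    # condition averages
gval <- sum(cj * ybarj)                        # sample contrast value g
MSEfull <- sum(resid(lm(y ~ g))^2) / (30 - 3)  # estimate of sigma^2
SEg <- sqrt(MSEfull * sum(cj^2 / mj))          # SE(g)
gval + c(-1, 1) * qt(0.975, df = 27) * SEg     # 95% CI for gamma
tobs <- gval / SEg                             # t statistic for H0: gamma = 0
2 * (1 - pt(abs(tobs), df = 27))               # two-sided p value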


2.4.2 STATISTICAL INFERENCE FOR A SINGLE PLANNED CONTRAST

Statistical inference for 𝛾 → confidence interval or hypothesis testing


1. CONFIDENCE INTERVAL (CI)

100·(1 – 𝜶)% confidence interval for a single planned contrast: g ± t(1 – 𝛼/2; n – a) · SE(g)

- 95% CI (𝛼 = 0.05) → 97.5% quantile; 99% CI (𝛼 = 0.01) → 99.5% quantile
- Half-width of the confidence interval: t(1 – 𝛼/2; n – a) · SE(g)



2. HYPOTHESIS TEST

𝐻0: 𝛾 = 𝐶, with C a hypothesized value → usually C = 0

t = (g – C) / SE(g): the difference between what we observe (g) and what is hypothesized (C), divided by the uncertainty in g (SE(g)) ➔ if H0 is true, then t follows a t distribution with n – a degrees of freedom

The square of the t statistic = the F statistic (when comparing the full and reduced model in an F test)



Effect size

- If there is a well-defined measurement scale (e.g. meter, euro, °C) → contrast value g (with CI)
- Standardized effect size measure (without measurement units) → Cohen's d:
o Difference of 2 means divided by the estimate of the within-group standard deviation (common to both groups)
o Cohen's d = estimate of the population value 𝜹 (delta)
o For pairwise contrasts (or for complex contrasts: using the sample value of its numerator)
- Interpretation cohen’s d
o Around 0.2 = small effect
o Around 0.5 = medium effect
o Around 0.8 = large effect
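In R, continuing the simulated example; I use √MSEfull as the pooled within-group standard deviation estimate (an assumption, since the text only says "within-group standard deviation"):

set.seed(1)
g <- factor(rep(1:3, each = 10))
y <- rnorm(30, mean = c(10, 12, 11)[as.integer(g)], sd = 2)
ybarj <- tapply(y, g, mean)
sdw <- sqrt(sum(resid(lm(y ~ g))^2) / (30 - 3))  # sqrt(MSEfull), pooled within-group SD
(ybarj[1] - ybarj[2]) / sdw                      # Cohen's d for the pairwise contrast 1 vs 2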



Streetwise statistics: if the sample sizes are large enough (e.g. df(full) > 30) → the t distribution looks like the standard normal distribution → we can use the standard normal distribution for CIs and testing

With large sample sizes, a 95% CI can be calculated by using 2 as a multiplier for SE(g) instead of 𝑧₀.₉₇₅ = 1.96 → rough hypothesis test: compare the absolute value of the t statistic with 2 to evaluate significance



2.5 MULTIPLE TESTING: MANY PLANNED CONTRASTS

Illustration: if a test gives 5% false alarms and doctor 1 tests for disease A while doctor 2 tests for diseases A, B, C, D, E → if nobody has any disease, then per 1000 patients doctor 1 raises 50 false alarms, but doctor 2 raises 226 (at least one of the 5 tests fires with probability 1 – 0.95⁵ ≈ 0.226)
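The doctor-2 number as a quick R check (per 1000 healthy patients, five independent tests at 5% each):

1 - 0.95^5           # ≈ 0.226: probability of at least one false alarm per patient
1000 * (1 - 0.95^5)  # ≈ 226 expected false alarms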


