ARMS - Summary of All Exam Material

Pages: 39
Uploaded on: 14-12-2025
Written in: 2025/2026

I scored a 9.1 on this exam by studying from my own document. This document contains: - All lectures - All seminars - All Grasple lessons (worked in per topic) - An overview table of all assumptions - Handy tips (in Dutch) Good luck studying! :)

Advanced Research Methods & Statistics – ARMS

Lecture 1: Frequentist vs Bayesian Statistics and MLR

The frequentist framework is still mainstream:
• Estimate a parameter of a model.
• Test how well the data fit H0 (null hypothesis significance testing, NHST).
• Tools: p-values, confidence intervals, effect sizes, power analysis.

The Bayesian framework is increasingly popular:
• Estimate a parameter of a model.
• Determine the probability of the hypothesis given the data, taking prior information into account.
• Tools: Bayes factors (BFs), priors, posteriors, credible intervals.

Difference between the frameworks when estimating a parameter:
• Frequentist: we draw a sample from the population and use it to estimate a parameter for the whole population. The information in the data is collected in the likelihood function, and all information relevant for inference is contained in that likelihood function.
• Bayesian: in addition to the data, we may also have prior information about µ (the mean).
o Central idea/mechanism: prior knowledge is updated with the information in the data, and together they provide the posterior distribution for µ.
o Advantage: knowledge accumulates (‘today’s posterior is tomorrow’s prior’).
o Disadvantage: results depend on the choice of prior. This is not always something to worry about, but it is important to keep in mind.

The prior distribution is specified before the data are collected, and it influences the posterior. If there are no prior expectations (example 1), the result resembles that of the frequentist approach. If we do have certain expectations (example 2), the posterior distribution is affected by them. In example 2 the data confirm the prior, which results in a more peaked (more certain) posterior distribution.




[Figure: Example 1 (no prior expectations) and Example 2 (prior expectations confirmed by the data)]

The posterior distribution of the parameter(s) of interest provides all desired estimates:
• Posterior mean or mode: the mean or mode of the posterior distribution.
• Posterior SD: the SD of the posterior distribution (comparable to the frequentist standard error, SE).
• Posterior 95% credible interval: the bounds of the part of the posterior that contains 95% of the posterior mass. In example 2: to find the central 95% of the probability, you cut off 2.5% at both ends of the posterior distribution.
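
The updating mechanism and the posterior summaries above can be sketched in a few lines of Python. This is only an illustration under strong assumptions that are not in the lecture: the prior for µ is normal, the SD of the data (sigma) is treated as known, and the scores are made up. The function names normal_posterior and credible_interval are hypothetical helpers.

```python
from math import sqrt
from statistics import NormalDist, fmean


def normal_posterior(data, sigma, prior_mean, prior_sd):
    """Posterior for the mean mu, assuming a normal prior and a known data SD (sigma)."""
    n = len(data)
    prior_precision = 1.0 / prior_sd**2   # how much the prior "knows"
    data_precision = n / sigma**2         # how much the data "know"
    post_var = 1.0 / (prior_precision + data_precision)
    # Precision-weighted average of the prior mean and the sample mean.
    post_mean = post_var * (prior_precision * prior_mean + data_precision * fmean(data))
    return NormalDist(post_mean, sqrt(post_var))


def credible_interval(posterior, mass=0.95):
    """Central interval: cut off (1 - mass) / 2 at both ends of the posterior."""
    tail = (1.0 - mass) / 2.0
    return posterior.inv_cdf(tail), posterior.inv_cdf(1.0 - tail)


scores = [101.0, 98.0, 105.0, 110.0, 96.0, 104.0]  # made-up data
posterior = normal_posterior(scores, sigma=15.0, prior_mean=100.0, prior_sd=10.0)
low, high = credible_interval(posterior)

print(f"posterior mean = {posterior.mean:.2f}, posterior SD = {posterior.stdev:.2f}")
print(f"95% credible interval = [{low:.2f}, {high:.2f}]")
```

In this sketch the posterior mean always lies between the prior mean and the sample mean, and the posterior SD is smaller than the prior SD, which is the "knowledge accumulates" idea in numbers.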

A problem with the frequentist approach is that the results (and conclusions) depend on things that were not observed (for example, more extreme data) and on the sampling plan → the same data can give different results.

Bayesian testing conditions on the observed data, whereas frequentist testing conditions on H0:
• Bayesian, Pr(Ha | data): the probability of hypothesis Ha given the data.
o What is the probability of the hypothesis, given the data?
• Frequentist, Pr(data | H0): the p-value is the probability of observing the same or more extreme data, given that the null hypothesis is true.
o What is the probability of the data, given the hypothesis?

When testing hypotheses, Bayesians can calculate the probability of the hypothesis given the data:
PMP = Posterior Model Probability
→ the (Bayesian) probability of the hypothesis after observing the data. This is always influenced by the prior.

Bayesian probability of a hypothesis being true depends on two criteria:
1. How sensible it is, based on prior knowledge (the prior).
2. How well it fits the new evidence (the data).

Bayesian testing is comparative: hypotheses are tested against one another, not in isolation.
This is also seen in the Bayes factor. (Note: the subscript in BF10 means "1 versus 0", i.e. H1 compared with H0, and not ten.)

BF10 = 10: Support for H1 is 10 times stronger than for H0.
BF10 = 1: Support for H1 is as strong as support for H0.
BF10 < 1: Support for H1 is smaller than support for H0.

Posterior probabilities of hypotheses (PMP) are also relative probabilities. PMPs are updates
of prior probabilities (for hypotheses) with the BF.
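
How a BF updates prior probabilities into PMPs can be sketched as follows. The function name posterior_model_probabilities is a hypothetical helper, and the default of equal prior probabilities (0.5 for each hypothesis) is an assumption of this sketch, not something prescribed by the lecture.

```python
def posterior_model_probabilities(bf10, prior_h1=0.5):
    """Return (PMP of H1, PMP of H0) from BF10 and the prior probability of H1."""
    if bf10 <= 0 or not 0.0 < prior_h1 < 1.0:
        raise ValueError("bf10 must be positive and prior_h1 strictly between 0 and 1")
    prior_odds = prior_h1 / (1.0 - prior_h1)
    posterior_odds = bf10 * prior_odds          # the BF updates the prior odds
    pmp_h1 = posterior_odds / (1.0 + posterior_odds)
    return pmp_h1, 1.0 - pmp_h1


pmp_h1, pmp_h0 = posterior_model_probabilities(bf10=10.0)
print(f"PMP(H1) = {pmp_h1:.3f}, PMP(H0) = {pmp_h0:.3f}")
```

With equal prior probabilities this reduces to PMP(H1) = BF10 / (BF10 + 1). The two PMPs always add up to 1, which is why they are relative probabilities: they only compare the hypotheses that are in the set.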

Definition of probability
Both frameworks use probability theory, but:
• Frequentist: probability is the relative frequency of events (more formal?)
• Bayesian: probability is the degree of belief (more intuitive?)
This leads to debate (the same word is used for different things) and to differences in the correct interpretation of statistical results, e.g. p-value vs PMP, but also confidence vs credible intervals:
• Frequentists use a 95% confidence interval: if we were to repeat the experiment many times and calculate a CI each time, 95% of the intervals would include the true parameter value (and 5% would not).
• Bayesians use a 95% credible interval: there is a 95% probability that the true value is in the credible interval.
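
The "repeat the experiment many times" reading of a confidence interval can be checked with a small simulation. This is a sketch under assumptions of my own choosing: normally distributed data, a known population SD (so a z-interval is used), and made-up values for the true mean, sample size and number of repetitions. The function name ci_coverage is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def ci_coverage(true_mean=50.0, sigma=10.0, n=25, repetitions=2000, seed=1):
    """Share of 95% confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)       # about 1.96
    half_width = z * sigma / sqrt(n)      # known-sigma z-interval
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = fmean(sample)
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / repetitions


coverage = ci_coverage()
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The share should come out close to 0.95. Note that the 95% describes the procedure over many repetitions, not any single interval, which is exactly the difference with the credible interval.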


Grasple – Difference between Bayesian & Frequentist Statistics
In Bayesian statistics we try to update our knowledge about unknown parameters (such as a mean or variance) on the basis of new data. The three main components are:
• Prior distribution: what you already know or assume about the parameter before you have new data.
• Likelihood (the distribution function of the data): how probable the observed data are, given a particular value of the parameter.
• Posterior distribution: what you believe about the parameter after seeing the data; an update of your prior.
This works with conditional probabilities: P(A | B), read as "P of A given B". What is the probability that A happens or is true, given that B has happened or is true?

Frequentist statistics uses confidence intervals. A confidence interval gives a range that would contain the parameter in most cases if you repeated the experiment many times.
If you repeat the experiment many times and calculate an interval each time, 95% of the intervals will contain the true parameter. You may not say ‘there is a 95% probability that the true value lies in this interval’.
Bayesian statistics uses a credible interval. Here the parameter itself has a probability distribution, so you can say: there is a …% probability that the parameter lies in the interval, given the data and the prior.
Suppose you have BF12 = 3 (support for H1 is 3 times stronger than for H2). Assuming equal prior probabilities for both hypotheses, the corresponding PMP of H1 is 3/(3+1) = 0.75, and the PMP of H2 is 1 - 0.75 = 0.25.

Bayesian Statistics: Given the data and the prior, there is a 95% probability that the real parameter (true value) is in the credible interval.
Frequentist Statistics: If we repeat the experiment many times, 95% of the intervals will include the real parameter (true value), and 5% won’t.


Multiple Linear Regression

We use scatterplots to show the scores on the variables x and y and the linear association (regression) between them.
The model takes the form ŷ = b0 + b1x1, in which the hat on y means "estimated".
We also have y = b0 + b1x1 + e. The error term e is the residual: the difference between the observed value y and the estimate ŷ.


A multiple linear regression with two predictors has the following model: y = b0 + b1x1 + b2x2 + e.
• Intercept: b0
• Slope of x1: b1; slope of x2: b2.
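
To make the model concrete, the coefficients b0, b1 and b2 can be estimated by ordinary least squares. The sketch below solves the normal equations in plain Python; it is not how SPSS or JASP compute the result, only a minimal illustration. The function name fit_linear_regression is hypothetical, and the data are made up and noise-free (generated from y = 1 + 2x1 + 3x2) so that the outcome can be checked by eye.

```python
def fit_linear_regression(rows, y):
    """Least-squares coefficients [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + e."""
    design = [[1.0, *row] for row in rows]   # leading 1.0 gives the intercept b0
    k = len(design[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(design, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Made-up, noise-free data generated from y = 1 + 2*x1 + 3*x2.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0), (6.0, 5.0)]
y = [1.0 + 2.0 * x1 + 3.0 * x2 for x1, x2 in rows]

b0, b1, b2 = fit_linear_regression(rows, y)
residuals = [yi - (b0 + b1 * x1 + b2 * x2) for (x1, x2), yi in zip(rows, y)]
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
```

Because the data contain no error term, the residuals are (numerically) zero here. With real data they would not be, and they are what the assumption checks are based on.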

All results are only reliable if assumptions made by the model and approach roughly hold.
Serious violations lead to incorrect results. Sometimes there are easy solutions (e.g. deleting a
severe outlier; or adding a quadratic term), sometimes not.



Assumptions of Multiple Linear Regression (MLR)
1. Measurement levels: all variables (outcome and predictors) are measured at interval/ratio level, for example a grade on a scale or age.

2. Absence of outliers: check with a scatterplot, distribution plot or boxplot.
o Standardized residuals: detect outliers in the Y-space. The rule of thumb is that they should lie between -3.3 and 3.3. A case with a standardized residual outside this range is an outlier.
o Cook’s distance: detects outliers in the XY-space (multivariate outliers). An outlier in this space is an extreme combination of X-values and Y-scores. The rule of thumb is that Cook’s distance has to be smaller than 1.
• A value above 1 indicates an influential respondent.

3. Absence of multicollinearity: multicollinearity means that the association between two IVs is too strong. When r is larger than .8 or .9, there is multicollinearity. Consequences are:
o The regression coefficients (B) are unreliable.
o It limits R (the correlation between y and ŷ).
o The importance of the individual IVs is hard to determine.
To see whether multicollinearity is a problem, look at the Tolerance or the Variance Inflation Factor (VIF). The rules are:
o Tolerance values smaller than .2 possibly indicate a problem.
o Tolerance values smaller than .1 definitely indicate a problem.
o The VIF equals 1/Tolerance. So a VIF larger than 5 possibly indicates a problem, and a VIF larger than 10 definitely indicates a problem.
Identify which variables cause the problem (those that correlate highly) and either combine them or remove them.




4. Homoscedasticity: the variance of the residuals is the same for all values of the predictors (IVs). This means that the residuals are spread equally around the regression line along its whole length. Check this in a residual plot.




5. Normally distributed residuals: the residuals are normally distributed. Check this with a distribution plot or a Q-Q plot. In the Q-Q plot the points have to lie close to the line; if the residuals are not normally distributed, the points lie far away from the line.




Seller: daphnebleize, Universiteit Utrecht