repetition
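Pearson r -> strength of the linear relation between two variables
Y = aX + b
a = slope = B1 = change in Y per one-unit change in X (Δy/Δx)
b = intercept = B0 = constant (value of Y when X = 0)
Residual = difference between the observed value of Y and the predicted value of Y
Residual = Y - Ŷ
Least squares method -> choose a and b so that the sum of squared residuals is as small as possible
R^2 = proportion of explained variance, goodness of fit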
R= multiple correlation coefficient
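The regression recap above can be sketched in a few lines of Python. This is only an illustration under assumptions: the data points and the helper names `least_squares` and `r_squared` are made up for the example, and a single predictor is assumed.

```python
from statistics import mean

def least_squares(x, y):
    """Return (a, b): the slope and intercept that minimize the sum of squared residuals."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    a = sxy / sxx        # slope (B1)
    b = my - a * mx      # intercept (B0)
    return a, b

def r_squared(x, y, a, b):
    """Proportion of the variance in y explained by the fitted line."""
    my = mean(y)
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical data, made up for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares(x, y)
residuals = [yi - (a * xi + b) for xi, yi in zip(x, y)]  # observed Y minus predicted Y
r2 = r_squared(x, y, a, b)
```

For these made-up points the fit gives a = 0.6, b = 2.2 and R^2 = 0.6. With one predictor, R^2 equals the squared Pearson r.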
Publication bias
Sloppy science-> questionable research practices
The Bayesian way
Bayesian hypothesis testing
Bayes factor
The fit of the hypothesis and the specificity of the hypothesis
Reliability -> The extent to which a measurement is free from random measurement errors. This
means that the scores are independent of time, place, and environment.
Construct validity: the extent to which you measure the construct you aim to measure
Internal validity: the extent to which you can rule out third variables and claim that the relation is causal
External validity: the extent to which you can generalize the results to a larger population
Random sample
Randomization
Statistical validity: the extent to which the way of analyzing the results is relevant, suitable
(assumptions met), and accurate.
Conditions for causality: temporal precedence, internal validity, covariance
Use the unstandardized B for the regression formula (Y = aX + b)
Use the standardized beta to compare the influence of the predictors on the dependent variable
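A small sketch of how the two kinds of coefficient relate, assuming the usual conversion beta = B * SD(X) / SD(Y). The data, the slope value and the function name `standardized_beta` are made up for illustration.

```python
from statistics import stdev

def standardized_beta(b_unstd, x, y):
    """Convert an unstandardized B to a standardized beta: B * SD(x) / SD(y)."""
    return b_unstd * stdev(x) / stdev(y)

# Hypothetical data, made up for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b_unstd = 0.6  # least-squares slope for these points

beta = standardized_beta(b_unstd, x, y)
print(round(beta, 3))  # 0.775
```

The standardized value no longer depends on the units of X and Y, which is why it can be compared across predictors. With a single predictor it equals Pearson r.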
R^2 measures the goodness of fit without adjusting for the number of predictors, while adjusted
R^2 considers the number of predictors and penalizes models with too many variables that don't
add meaningful information
Adjusted R^2 estimates the explained variance in the population; it takes the sample size and the
number of predictors into account
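As a sketch, the standard adjustment formula is adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1), with n the sample size and p the number of predictors. The function name and the numbers below are made up for illustration.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (needs n > p + 1)."""
    if n <= p + 1:
        raise ValueError("need more observations than predictors + 1")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Same sample R^2 and number of predictors, different sample sizes (hypothetical numbers)
small_sample = adjusted_r_squared(r2=0.5, n=21, p=4)   # 0.375
large_sample = adjusted_r_squared(r2=0.5, n=101, p=4)  # about 0.479
```

The penalty is larger in the small sample, so the adjusted value drops further below the sample R^2 of 0.5.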
The table with the F test tests H0: R^2 = 0, so if significant -> reject the null hypothesis
The coefficients are 'unique effects' (they take the other predictors into account), unlike bivariate
correlations (which do not take the other predictors into account)
Lecture 1, 13 November 2023
1. Frequentist vs Bayesian statistics
Frequentist framework: test how well the data fit H0 (NHST)
p-values, confidence intervals, effect sizes, power analysis
Data are captured in the likelihood function (normal distribution)
Empirical research uses collected data to learn from
μ = mean
All relevant information for inference is contained in the likelihood function
Bayesian framework
Estimation:
Probability of the hypothesis given the data, taking prior information into account
In addition to the data we may also have prior information about μ
Prior knowledge is updated with information in the data and together provides the posterior
distribution for μ
Priors -> how you think μ is distributed before seeing the data
The prior influences the posterior
If the data support the prior, your answer will be more certain
Posterior distribution
o Posterior mean/mode (the two coincide only when the mean lies at the peak of the distribution)
o Posterior standard deviation
o Posterior 95% credible interval
Advantage: Accumulating knowledge
Disadvantage: results depend on choice of prior
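The updating step can be sketched for the simplest case: a normal prior for μ and normally distributed data with a known standard deviation. That known-sigma model, the function name `normal_posterior` and all numbers are assumptions made for this illustration.

```python
import math

def normal_posterior(prior_mean, prior_sd, sample_mean, sigma, n):
    """Posterior mean and SD of μ for a normal prior and normal data with known sigma."""
    prior_prec = 1 / prior_sd ** 2   # precision = 1 / variance
    data_prec = n / sigma ** 2       # precision contributed by the sample mean
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * sample_mean) / post_prec
    post_sd = math.sqrt(1 / post_prec)
    return post_mean, post_sd

# Hypothetical prior and data
post_mean, post_sd = normal_posterior(
    prior_mean=100, prior_sd=10, sample_mean=106, sigma=15, n=9
)

# 95% credible interval for a normal posterior
lower = post_mean - 1.96 * post_sd
upper = post_mean + 1.96 * post_sd
```

With these numbers the posterior mean is 104.8, between the prior mean (100) and the sample mean (106), and the posterior SD (about 4.47) is smaller than the prior SD (10). Changing the prior changes the result, which is the disadvantage noted above.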
Hypothesis testing
Which hypothesis is more likely
Bayes conditions on the observed data
Pr(Hj | data): probability of hypothesis Hj given the observed data
Frequentist: Pr(data | H0): p-value = probability of observing the same or more extreme
data given that the null is true
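To make the frequentist side concrete, here is a sketch of a two-sided p-value for a sample mean. It assumes a z-test with known sigma, which is a simplification chosen for this example. The function name and the numbers are made up.

```python
from statistics import NormalDist

def two_sided_p_value(sample_mean, mu0, sigma, n):
    """Probability of a sample mean at least this far from mu0, given that H0 is true."""
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical data: H0 says mu = 100
p = two_sided_p_value(sample_mean=106, mu0=100, sigma=15, n=9)
```

Here z = 1.2 and p is about 0.23. This is a statement about the data given H0, not about the probability that H0 is true.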
Bayesian probability
o Posterior model probability (PMP)
o How sensible it is, based on prior knowledge
o How well it fits
Bayesian is comparative: hypotheses are tested against one another
BF: BF10 = P(data | H1) / P(data | H0)
PMPs are relative probabilities
PMPs are updates of prior probabilities with the BF
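A sketch of that update for two hypotheses: posterior odds = BF10 * prior odds, rescaled so that the two PMPs sum to 1. The function name and the example Bayes factor are made up for illustration.

```python
def posterior_model_probs(bf10, prior_h1=0.5):
    """Return (PMP of H1, PMP of H0) from BF10 and the prior probability of H1."""
    prior_h0 = 1 - prior_h1
    pmp_h1 = bf10 * prior_h1 / (bf10 * prior_h1 + prior_h0)
    return pmp_h1, 1 - pmp_h1

# Hypothetical Bayes factor: the data are 3 times more likely under H1 than under H0
equal_priors = posterior_model_probs(bf10=3)                    # (0.75, 0.25)
skeptical_prior = posterior_model_probs(bf10=3, prior_h1=0.25)  # (0.5, 0.5)
```

The same BF gives different PMPs under different prior probabilities. The PMPs are relative: they only compare the hypotheses in the set.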
Definition of probability
Frequentist: probability is the relative frequency of events (more formal)
Bayesian: probability is the degree of belief (more intuitive)
CI (confidence interval, frequentist): if the experiment is repeated many times, 95% of the
confidence intervals computed from the data contain the true value
CI (credible interval, Bayesian): there is a 95% probability that the true value is in the credible
interval
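The repeated-experiment reading of the confidence interval can be checked with a small simulation. This is a sketch under assumptions: normally distributed data, n = 30 per experiment, and a hard-coded t critical value of 2.045 (two-sided 95%, 29 degrees of freedom). All names and settings are made up.

```python
import random
from statistics import fmean, stdev

def ci_coverage(true_mean=50.0, sd=10.0, repeats=2000, seed=1):
    """Share of repeated experiments whose 95% CI contains the true mean."""
    n = 30          # sample size per experiment
    t_crit = 2.045  # two-sided 95% t critical value for n - 1 = 29 df
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        m = fmean(sample)
        se = stdev(sample) / n ** 0.5
        if m - t_crit * se <= true_mean <= m + t_crit * se:
            hits += 1
    return hits / repeats

coverage = ci_coverage()
```

The returned share should be close to 0.95. Each single interval either contains the true value or does not; the 95% describes the procedure over many repetitions.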