Background information
T-test
Consists of:
Name of test
Variable: y = what you want to measure (so, optimism about environment for example)
Parameter of interest: difference between two population means. Therefore, you have to
measure both means
Null hypothesis: difference in means = 0
Alternative hypothesis: difference in means > 0 (one-sided; a two-sided test uses "not equal to 0")
Test statistic: t-value: difference between means / standard error of difference
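P-value: probability of finding a test statistic at least this extreme if the null hypothesis is true
Confidence interval: range of plausible values for the difference in means. If the study was conducted
over and over again, 95% of the 95% confidence intervals would contain the true difference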
Measure of effect size: Cohen’s d:
o 0.2 is small
o 0.5 is medium
o 0.8 is large
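The test statistic and effect size listed above can be sketched in a few lines of Python. This is a minimal sketch, assuming two independent groups and a pooled variance (the notes do not say which variant of the t-test is meant). The function name two_sample_t and the scores are hypothetical.

```python
import math
from statistics import mean, variance

def two_sample_t(group_a, group_b):
    """Return (t-value, Cohen's d, degrees of freedom) for two independent groups.

    Assumption: equal variances in both groups, so a pooled variance is used.
    """
    n_a, n_b = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)  # difference between the two means
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))  # standard error of the difference
    t_value = diff / se_diff
    cohens_d = diff / math.sqrt(pooled_var)  # difference expressed in standard deviations
    return t_value, cohens_d, n_a + n_b - 2

# Hypothetical optimism scores for two groups.
treated = [6, 7, 8, 7, 7]
control = [5, 6, 5, 6, 6]
t_value, cohens_d, df = two_sample_t(treated, control)
print(f"t = {t_value:.2f}, d = {cohens_d:.2f}, df = {df}")
```

The p-value would then be looked up in a t-distribution with the returned degrees of freedom.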
(Inter)dependent techniques
Dependent techniques: measuring the effect of predictors on the outcome.
Interval: simple/multiple regression
o Interval: equally spaced units, without a zero point. Date of birth.
o Predictors
One: simple regression
Multiple: multiple regression
Categorical/nominal: (factorial) ANOVA / t-test
o Categorical/nominal = Values are categories, without ranking. Alive or dead, ill or
well, vaccinated or unvaccinated
o Predictors:
One: ANOVA/t-test
Multiple: Factorial ANOVA
Simple regression
Predict outcome Y from predictor variable X.
Model
Simple model: a straight line fitted with the method of least squares, so that it is as close as possible
to all of the points. The value of a point is the prediction of the model plus the error:
outcome = intercept + regression coefficient * predictor + error.
The line is defined by the regression coefficient (slope) and the intercept.
Variance: how far the points deviate from the predictions of the model.
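The least squares line described above can be sketched as follows. This is a minimal illustration using the standard closed-form solution for one predictor. The function name fit_simple_regression and the data are hypothetical.

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least squares line through the points."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products of deviations and sum of squared deviations of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx                   # regression coefficient
    intercept = y_bar - slope * x_bar   # the line passes through (x_bar, y_bar)
    return intercept, slope

# Hypothetical data: predictor x and outcome y.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope = fit_simple_regression(x, y)

# Each observed point equals the model's prediction plus an error term.
predicted = [intercept + slope * xi for xi in x]
errors = [yi - pi for yi, pi in zip(y, predicted)]
print(intercept, slope, errors)
```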
Overall statistics
Effect size: how well the model predicts the observed scores (how small the differences between predicted and observed scores are). Expressed as R or R-squared.
R: multiple correlation coefficient.
R-squared: coefficient of determination: the proportion of variance in outcome Y that can be explained
by the model.
Degrees of freedom in simple regression: N - 2, with N being the number of observations
Statistical significance: H0: R = 0, Ha: R > or < 0
Difference between statistical significance and effect size:
Effect size: how large the effect is (how closely the predicted scores match the observed scores)
Statistical significance: whether the findings could be due to chance alone
Important terms
Sum of squares: sum of squared deviations, for example (observed score - mean score) squared, added over all observations
Mean square: sum of squares/degrees of freedom
F-value: mean squares regression / mean squares residuals
Mean squares residuals = estimate of variance of error terms
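The terms above fit together as in the following sketch for a simple regression with one predictor (so the regression has 1 degree of freedom and the residuals have N - 2). The function name regression_anova and the data are hypothetical.

```python
from statistics import mean

def regression_anova(x, y):
    """Sums of squares, mean squares, F-value and R-squared for a simple regression."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    predicted = [intercept + slope * xi for xi in x]

    ss_total = sum((yi - y_bar) ** 2 for yi in y)                    # observed vs mean
    ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # observed vs model
    ss_regression = ss_total - ss_residual                           # improvement due to model

    df_regression, df_residual = 1, n - 2
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual  # estimate of the error variance
    return {
        "ss_total": ss_total,
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f_value": ms_regression / ms_residual,
        "r_squared": ss_regression / ss_total,
    }

# Hypothetical data.
result = regression_anova([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```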
Detailed statistics
Standardized and unstandardized coefficients
Standardized: obtained after running a regression model measured on standardized variables
Unstandardized: obtained after running a regression model on variables measured in their
original scales
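A small sketch of the difference between the two kinds of coefficients, assuming a simple regression with one predictor. The helper names and the data are hypothetical. Standardizing means converting each variable to z-scores (mean 0, standard deviation 1) before fitting.

```python
from statistics import mean, stdev

def slope(x, y):
    """Least squares regression coefficient of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    return (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))

def standardize(values):
    """Convert values to z-scores."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical data in their original scales.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b_unstandardized = slope(x, y)                           # change in y per unit of x
beta_standardized = slope(standardize(x), standardize(y))  # change in SDs of y per SD of x

# The two are linked by the ratio of the standard deviations.
print(b_unstandardized, beta_standardized, b_unstandardized * stdev(x) / stdev(y))
```

With a single predictor the standardized coefficient equals the Pearson correlation between x and y.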
Interdependent techniques: investigate interrelations. No distinction between outcome and
predictors
Interval: the only measurement level needed
o Variables:
Two: correlation
More than 2: Exploratory factor analysis
Correlation
Covariance: measures the extent to which deviations from the mean of variable 1 go together with
deviations from the mean of variable 2.
Formula: (sum across all observations of (deviation from mean 1 * deviation from mean 2)) /
(number of observations - 1)
Problem: covariance depends on the units of measurement, and so do its maximum and minimum
values
Solution: Pearson correlation: covariance / (standard deviation of variable 1 * standard deviation of
variable 2). Gives a correlation which:
Measures the strength of a linear relationship
Ranges from -1 to 1
Is standardized (does not depend on the units of measurement)
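The unit problem and its solution can be illustrated with a short sketch. The function names and the data are hypothetical. The same predictor is expressed once in hours and once in minutes.

```python
import math
from statistics import mean

def covariance(x, y):
    """Sample covariance: summed products of deviations divided by N - 1."""
    x_bar, y_bar = mean(x), mean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)

def pearson_r(x, y):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = math.sqrt(covariance(x, x))  # covariance of a variable with itself is its variance
    sd_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)

# Hypothetical data: study time and a test score.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
minutes = [h * 60 for h in hours]  # same information, different unit

print(covariance(hours, score), covariance(minutes, score))  # changes with the unit
print(pearson_r(hours, score), pearson_r(minutes, score))    # stays the same
```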
Measurement of linear relation in correlation
All points on straight positive line (close to one)/negative line (close to -1): strong positive/negative
correlation.
Not straight but close to each other > spread out but straight line
Statistical significance
T-test: testing if r = 0
Z-test: testing if r equals a hypothesized value (r = r_hypothesized)
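Both test statistics can be sketched as below. The notes do not say which z-test is meant. This sketch assumes the common approach based on Fisher's z-transformation, and the function names are hypothetical.

```python
import math

def t_for_correlation(r, n):
    """t statistic for H0: correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def z_for_correlation(r, n, r_hypothesized):
    """z statistic for H0: correlation = r_hypothesized.

    Assumption: Fisher's z-transformation (atanh), whose standard error
    is 1 / sqrt(n - 3).
    """
    return (math.atanh(r) - math.atanh(r_hypothesized)) * math.sqrt(n - 3)

# Hypothetical sample: r = 0.5 observed in 28 observations.
print(t_for_correlation(0.5, 28))
print(z_for_correlation(0.5, 28, 0.3))
```

The t statistic is compared with a t-distribution and the z statistic with the standard normal distribution.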
Core assumptions (later)
Independent
Normally distributed
Obtained by simple random sampling
Size of effect (Cohen)
r = 0.1 (r-squared = 0.01): small
r = 0.3 (r-squared = 0.09): medium
r = 0.5 (r-squared = 0.25): large
Multiple regression
Difference with simple regression: with multiple regression, there is more than one predictor
Research question:
Does this and this and this have an effect on …
Or: which of these three factors has the greatest influence?
Model:
Looking for the linear contributions of the predictors. Quite the same as simple, only with more
dimensions.
Equation (to find a point): intercept + contribution of predictor 1 + ... + contribution of predictor n + error term.
The method of least squares is used again to fit the model.
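A minimal sketch of least squares with more than one predictor, using only the standard library. It solves the normal equations with Gaussian elimination, which is one of several possible ways to do this. The function names are hypothetical, and the data are constructed so that y = 1 + 2 * x1 + 3 * x2 exactly (all error terms are zero).

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_multiple_regression(predictor_rows, y):
    """Return [intercept, b1, ..., bn] minimizing the sum of squared errors."""
    rows = [[1.0] + list(row) for row in predictor_rows]  # leading 1 for the intercept
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical data with two predictors per observation.
predictor_rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
outcome = [1, 3, 4, 6, 8]
coefficients = fit_multiple_regression(predictor_rows, outcome)
print(coefficients)  # intercept, contribution per unit of predictor 1, of predictor 2
```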
R (-squared)
R = Multiple correlation coefficient. Here, also the effect size.
R-squared = coefficient of determination = proportion of variance in the outcome that is
explained by the predictors. Other name: VAF (variance accounted for)
Cohen’s values again.
Adjusted R-squared: estimate of R-squared in the population, instead of in the sample.
Always smaller than R-squared
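The adjustment can be sketched with the commonly used formula below. The notes do not state which formula is meant, so this is an assumption, and the function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Shrink R-squared based on sample size and number of predictors.

    Assumption: the common formula 1 - (1 - R2) * (N - 1) / (N - k - 1).
    """
    return 1 - (1 - r_squared) * (n_observations - 1) / (n_observations - n_predictors - 1)

# Hypothetical example: R-squared of 0.5 with 4 predictors.
print(adjusted_r_squared(0.5, 30, 4))   # small sample: clear shrinkage
print(adjusted_r_squared(0.5, 300, 4))  # large sample: almost no shrinkage
```

The shrinkage grows with more predictors and fewer observations.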