W1: One-way ANOVA
OV/DV Outcome Variable = quantitative (salary)
- Quantitative (discrete, interval, ratio): measured on NUMERICAL SCALES, with equal distances between
values
- Depends on the other variable(s), i.e. the predictors
PV/IV Predictor variable = categorical (education: low/medium/high)
- Categorical (nominal, ordinal): subgroups are indicated by numbers, called LEVELS
- Does NOT depend on other variables
- Variables can have different measurement scales; in the social sciences, ordinal scales (e.g. Likert
scales) are sometimes treated as interval scales
Research question/ Hypothesis:
- To check whether the groups differ
H0: There is no difference in mean salary between the education levels
H1: There is a difference in mean salary between the education levels
- The p-value is the probability of obtaining a result (or test statistic value) equal to (or
‘more extreme’ than) what was actually observed, assuming that the null hypothesis is true.
- A LOW p-value means the observed result would be UNLIKELY if the null hypothesis were true, so we
reject H0 and conclude that there is a significant difference
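The p-value definition above can be made concrete with a small permutation sketch. Everything here is an illustrative assumption, not course data: the salary numbers, the function names, and the choice of test statistic (largest group mean minus smallest group mean). If H0 is true, the education labels are interchangeable, so we shuffle them many times and count how often the shuffled result is at least as extreme as the observed one.

```python
import random
from statistics import mean

# Hypothetical salaries (in thousands) per education level.
groups = {
    "low": [30, 32, 28, 31, 29],
    "medium": [35, 37, 33, 36, 34],
    "high": [42, 40, 44, 41, 43],
}


def spread_of_means(samples):
    """Test statistic: largest group mean minus smallest group mean."""
    means = [mean(s) for s in samples]
    return max(means) - min(means)


def permutation_p_value(samples, n_permutations=5000, seed=0):
    """Estimate P(statistic >= observed) when group labels are random (H0)."""
    rng = random.Random(seed)
    observed = spread_of_means(samples)
    pooled = [x for s in samples for x in s]
    sizes = [len(s) for s in samples]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Cut the shuffled values back into groups of the original sizes.
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if spread_of_means(shuffled) >= observed:
            hits += 1
    # Count the observed arrangement itself as one permutation,
    # so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


p_value = permutation_p_value(list(groups.values()))
print(f"p = {p_value:.4f}")
```

With groups this clearly separated, almost no shuffle reproduces the observed spread, so the estimated p-value is low and H0 would be rejected. An ANOVA reaches the same kind of conclusion using the F ratio instead of this simplified statistic.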
ANOVA = analysis of variance
We use a one-way between-subjects ANOVA when:
OV = Quantitative (e.g. salary)
PV = Categorical with more than 2 groups
- Two measures of variability (how much the values in your data differ)
o Variance = the average of the squared differences from the mean. Group variances can be
homogeneous or not homogeneous; in the test for homogeneity of variances we want the p-value
to be as high as possible (not significant)
o Sum of Squares = the sum of the squared differences from the mean.
- We need an ANOVA to gain statistical insight into whether there is a statistically significant
difference. Comparing the means alone is not enough.
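A minimal sketch of the two variability measures, using made-up salary figures for a single group (the numbers and variable names are assumptions for illustration). Note that "average" can mean dividing by n or, for a sample estimate, by n - 1; both are shown.

```python
from statistics import mean

# Hypothetical salaries (in thousands) for one education group.
salaries = [30, 32, 28, 31, 29]

group_mean = mean(salaries)

# Sum of Squares: the sum of the squared differences from the mean.
sum_of_squares = sum((x - group_mean) ** 2 for x in salaries)

# Variance: the Sum of Squares averaged over the observations.
variance_population = sum_of_squares / len(salaries)      # divide by n
variance_sample = sum_of_squares / (len(salaries) - 1)    # divide by n - 1

print(sum_of_squares, variance_population, variance_sample)
```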
NB! ANOVA: WITHIN-group variability should preferably be LOW, BETWEEN-group variability should
preferably be HIGH, to show a statistical difference.
- Compares the variability between the groups against the variability within the groups: “Does it
matter which group you are in (which teaching method you receive) with regard to your exam
score?”
- Conceptual models are visual representations of relations between constructs (variables) of interest
- ANOVA statistically examines how much of the variability in our outcome variable can be explained
by our predictor variable.
- It breaks down different measures of variability through calculating sums of squares.
- Via these calculations, the ANOVA helps us test if the mean scores of the groups are statistically
different
NB! ANOVA + FOLLOW UP: to investigate, with a certain level of (statistical) confidence, which differences
there might be between the groups
- Following a significant F ratio for the general ANOVA, multiple comparisons or planned contrasts
can identify which group means are significantly different.
- The ANOVA itself compares the variability between the groups against the variability within the groups
- We want to see how much of the variability in our outcome variable can be explained by our predictor
variable
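As a rough illustration of the follow-up step, the sketch below compares every pair of groups and corrects for multiple comparisons. It is not the procedure a statistics package would run by default: it swaps in an exact permutation test on the difference between two means, with a Bonferroni correction (multiply each p-value by the number of comparisons). The data and function names are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical salaries (in thousands) per education level.
groups = {
    "low": [30, 32, 28, 31, 29],
    "medium": [35, 37, 33, 36, 34],
    "high": [42, 40, 44, 41, 43],
}


def exact_pairwise_p(a, b):
    """Two-sided exact permutation p-value for the difference in two means."""
    observed = abs(mean(a) - mean(b))
    pooled = a + b
    indices = range(len(pooled))
    extreme = total = 0
    # Enumerate every way of splitting the pooled values into two groups.
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        first = [pooled[i] for i in indices if i in chosen_set]
        second = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(first) - mean(second)) >= observed:
            extreme += 1
    return extreme / total


pairs = list(combinations(groups, 2))
results = {}
for name_a, name_b in pairs:
    p = exact_pairwise_p(groups[name_a], groups[name_b])
    # Bonferroni: multiply by the number of comparisons, cap at 1.
    results[(name_a, name_b)] = min(1.0, p * len(pairs))

for pair, p_adjusted in results.items():
    print(pair, f"adjusted p = {p_adjusted:.4f}")
```

Pairs whose adjusted p-value falls below 0.05 would be reported as significantly different. This step is only meaningful after a significant overall F ratio.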
Sum of Squares and R2
- We have now decomposed the variability in our data into a part that can be explained by our model
(between-group SS) and a residual part (within-group SS).
- Model Sum of Squares = BETWEEN SS
- Residual Sum of Squares = WITHIN SS
- We can now calculate the proportion of the total variability in our data that is “explained” by our
model. This ratio is called R squared (R2): R2 = BETWEEN SS / TOTAL SS
- Example: R2 = 0.95 means that 95 % of the variability in the customer score can be explained by the
type of store
- R2: the higher the better!
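The decomposition can be checked by hand on a small data set. The sketch below uses invented salary figures (an assumption, not course data) to compute the between, within and total Sums of Squares and then R2 as the between share of the total.

```python
from statistics import mean

# Hypothetical salaries (in thousands) per education level.
groups = {
    "low": [30, 32, 28, 31, 29],
    "medium": [35, 37, 33, 36, 34],
    "high": [42, 40, 44, 41, 43],
}

all_values = [x for g in groups.values() for x in g]
grand_mean = mean(all_values)

# BETWEEN (model) SS: distance of each group mean from the grand mean,
# weighted by group size.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())

# WITHIN (residual) SS: distance of each value from its own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

# TOTAL SS: distance of each value from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

r_squared = ss_between / ss_total

print(f"between={ss_between:.2f} within={ss_within:.2f} total={ss_total:.2f}")
print(f"R2 = {r_squared:.3f}")
```

The between and within parts add up to the total, which is what makes R2 interpretable as a proportion.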
F-test and Mean Squares (Post hoc)
- The F test tells us whether the group means differ at all; post hoc tests tell us which groups are
different
- The F ratio is the main statistic
- To investigate whether the group means differ with an ANOVA, we do an F test
- This is a statistical test that checks the ratio of explained variability to unexplained variability
(F = Mean Square BETWEEN / Mean Square WITHIN, where a Mean Square is a Sum of Squares divided by
its degrees of freedom).
- We want the F ratio to be as high as possible
- To find out whether that number is high enough, compare it with the critical value in a standard F
table for your degrees of freedom. To conclude that the groups differ, the observed F value should
be HIGHER than the critical value
- P-Value: if the p-value is < 0.05 the same conclusion can be drawn – there is a difference between
the groups
- As with any test statistic, the F ratio has a null hypothesis and an alternative hypothesis:
o H0 = no difference between the groups
o H1 = there is a difference between the groups
- As with any test statistic, it has an accompanying p-value.
o The probability of obtaining a result (or test statistic value) equal to (or ‘more extreme’ than)
what was actually observed, assuming that the null hypothesis is true.
o Based on the p-value you can either reject or not reject the null hypothesis
- The F ratio, together with its degrees of freedom, gives us the p-value
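A sketch of the F test on invented salary data (the figures and names are assumptions for illustration). Each Sum of Squares is divided by its degrees of freedom to give a Mean Square, and the F ratio is compared with the critical value for alpha = 0.05 and df = (2, 12), taken here as 3.89 from a standard F table.

```python
from statistics import mean

# Hypothetical salaries (in thousands) per education level.
groups = {
    "low": [30, 32, 28, 31, 29],
    "medium": [35, 37, 33, 36, 34],
    "high": [42, 40, 44, 41, 43],
}

all_values = [x for g in groups.values() for x in g]
grand_mean = mean(all_values)

ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

# Degrees of freedom: groups minus 1, and observations minus groups.
df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# Mean Square = Sum of Squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F ratio: explained variability relative to unexplained variability.
f_ratio = ms_between / ms_within

# Critical value for alpha = 0.05, df = (2, 12), read from a standard F table.
F_CRITICAL = 3.89
reject_h0 = f_ratio > F_CRITICAL

print(f"F({df_between}, {df_within}) = {f_ratio:.2f}, reject H0: {reject_h0}")
```

The critical value depends on both degrees of freedom, so it has to be looked up again whenever the number of groups or observations changes.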