MULTIVARIATE DATA ANALYSIS
Multiple Regression Analysis (MRA)
Check the assumptions: linearity, homoscedasticity and normality of the residuals.
Linearity - a horizontal straight line can be drawn through the points in the data plot
(e.g. residuals against predicted values); there is no curved pattern
Homoscedasticity - variance of residuals is constant across predicted values; the
spread of points around the line is the same across the whole plot
Normality - the residuals are approximately normally distributed; their distribution
shows no large deviations from the normal curve
Is the regression model a reasonable approximation of the data?
Yes, if all assumptions are satisfied.
No, if not all assumptions are satisfied.
Is there evidence of multicollinearity in the data?
No, if VIF_j (variance inflation factor) = 1/T_j = 1/(1 - R_j^2) < 10 and
tolerance T_j = 1 - R_j^2 > 0.10
Multicollinearity - predictors explain the same variance of the dependent variable;
independent variables are correlated
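A minimal sketch of the VIF and tolerance rule, assuming only two predictors (so that R_j^2 is simply the squared correlation between X1 and X2). The data and the function names pearson_r and tolerance_and_vif are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def tolerance_and_vif(r2_j):
    """Tolerance T_j = 1 - R_j^2 and VIF_j = 1 / T_j."""
    tol = 1.0 - r2_j
    return tol, 1.0 / tol

# Invented predictor scores, for illustration only.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]

# With two predictors, R_j^2 is the squared correlation r_12^2.
r2_j = pearson_r(x1, x2) ** 2
tol, vif = tolerance_and_vif(r2_j)

# Rule from the notes: no evidence of multicollinearity if both hold.
no_multicollinearity = vif < 10 and tol > 0.10
print(f"tolerance = {tol:.3f}, VIF = {vif:.2f}, ok = {no_multicollinearity}")
```

With more than two predictors, R_j^2 has to come from regressing predictor j on all other predictors; the two helper lines for tolerance and VIF stay the same.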
Are there outliers, influential points, or outliers on the predictors?
No, if three aspects are satisfied:
Distance - all standardized residuals lie between -3 and 3 (-3 < standardized residual < 3)
Leverage - Centered Leverage Value (CLV) < Border Value (BV = 3(p+1)/N, where p = k
= number of predictors)
Influence - maximum Cook’s distance < 1
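The three criteria can be sketched as a small check, assuming the minimum and maximum diagnostics have already been read from the regression output. The function names and all input values are hypothetical.

```python
def leverage_border(p, n):
    """Border value BV = 3(p + 1) / N for the centered leverage value."""
    return 3 * (p + 1) / n

def check_outliers(min_std_resid, max_std_resid, max_clv, max_cook, p, n):
    """Return one verdict per criterion; True means 'no problem found'."""
    return {
        "distance_ok": -3 < min_std_resid and max_std_resid < 3,
        "leverage_ok": max_clv < leverage_border(p, n),
        "influence_ok": max_cook < 1,
    }

# Invented diagnostics for N = 60 cases and p = 2 predictors.
result = check_outliers(
    min_std_resid=-2.1, max_std_resid=3.4,
    max_clv=0.12, max_cook=0.45,
    p=2, n=60,
)
print(leverage_border(2, 60), result)
```

In this invented case only the distance criterion fails, because the largest standardized residual exceeds 3.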
What are the null and the alternative hypothesis to test the MRA regression model?
H0: b*_1 = b*_2 = 0
Ha: b*_j ≠ 0 for at least one j
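These hypotheses are commonly tested with the overall F test. The notes do not spell out the test statistic, so the sketch below assumes that this is the test meant; the function name and the input values are invented.

```python
def overall_f(r2, k, n):
    """F statistic for H0 'all regression weights are 0'.

    df1 = k (number of predictors), df2 = n - k - 1.
    """
    df1, df2 = k, n - k - 1
    f = (r2 / df1) / ((1 - r2) / df2)
    return f, df1, df2

# Invented example: R^2 = .40 with k = 2 predictors and N = 53 cases.
f, df1, df2 = overall_f(r2=0.40, k=2, n=53)
print(f"F({df1}, {df2}) = {f:.2f}")
```

H0 is rejected when the p-value belonging to this F value is below the chosen significance level.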
Interpret the unstandardized and standardized coefficients.
Fill in the unstandardized and standardized regression equations and discuss the
different effect sizes or reverse effects.
Standardized regression equation: y = b_1 x_1 + b_2 x_2 (y, x_1 and x_2 as z-scores;
standardized weights, no intercept)
Unstandardized regression equation: y = b_0 + b_1 x_1 + b_2 x_2
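The link between the two equations can be sketched as follows: each unstandardized weight is the standardized weight rescaled by sd(y) / sd(x_j), and the intercept follows from the means. The function name unstandardize and all numbers are invented for illustration.

```python
def unstandardize(std_weights, sd_x, sd_y, mean_x, mean_y):
    """Convert standardized weights to unstandardized weights plus intercept."""
    # b_j = standardized weight_j * sd(y) / sd(x_j)
    b = [w * sd_y / s for w, s in zip(std_weights, sd_x)]
    # b_0 = mean(y) - sum of b_j * mean(x_j)
    b0 = mean_y - sum(bj * m for bj, m in zip(b, mean_x))
    return b0, b

# Invented summary statistics for two predictors.
b0, b = unstandardize(
    std_weights=[0.5, -0.2],
    sd_x=[2.0, 4.0], sd_y=10.0,
    mean_x=[3.0, 8.0], mean_y=50.0,
)
print(f"y = {b0:.1f} + {b[0]:.1f} x1 + {b[1]:.1f} x2")
```

Standardized weights can be compared with each other as effect sizes because they share one scale; unstandardized weights depend on the measurement units of each predictor.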
How much variance of Y in total is explained by X1 and X2?
R^2 = VAF = variance accounted for = total variance explained by the predictors
R^2 = (zero-order correlation of X1)^2 + (part correlation of X2)^2
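A small numeric sketch of this decomposition for two predictors, starting from the three correlations. The correlation values are invented; the part correlation of X2 is computed with X1 held constant in X2 only.

```python
from math import sqrt

def r2_two_predictors(r_y1, r_y2, r_12):
    """R^2 = squared zero-order r of X1 + squared part correlation of X2."""
    # Part (semipartial) correlation of X2: unique contribution beyond X1.
    part_2 = (r_y2 - r_y1 * r_12) / sqrt(1 - r_12 ** 2)
    return r_y1 ** 2 + part_2 ** 2, part_2

# Invented correlations: r(Y,X1) = .5, r(Y,X2) = .4, r(X1,X2) = .3.
r2, part_2 = r2_two_predictors(r_y1=0.5, r_y2=0.4, r_12=0.3)
print(f"part correlation X2 = {part_2:.3f}, R^2 = {r2:.3f}")
```

Both terms are squared because R^2 is a proportion of variance, while the correlations themselves are not.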