V506 FINAL EXAM STUDY GUIDE
QUESTIONS AND ANSWERS
Question 7:
Rationale for using variances for testing the difference between means in ANOVA -
Answer-Treat the category means as deviations from the grand mean. The alternative
hypothesis is always non-directional (two-sided): at least one mean differs.
ANOVA compares two kinds of calculated variance: the variation between the categories
and the variation within them. The ratio of the two tells us whether the means are
similar or different.
If the population means are in fact equal, then the average variation within any
subsample or category should not be markedly different from the variation between the
categories or subsamples, so the ratio should be close to 1.
In short, the relationship between the means is tested by measuring the variance of the
subsample means around the overall mean (grand mean).
Sum of Squares terms - Answer-BSS - Between Sum of Squares (explained variation)
- for each column, sum the values, square that column sum, and divide by the number of
values in the column (n)
- add these results across all columns
- subtract the sum of ALL values, squared, divided by the total number of values (N)
- BSS = sum((column sum)^2 / n) - (sum of all values)^2 / N
WSS - Within Sum of Squares (unexplained variation)
- square each value
- sum the squared values within each column
- add the column totals to get the sum of all squared values
- subtract the sum, over all columns, of (column sum)^2 divided by the number of
values in that column (n)
- WSS = sum(all values^2) - sum((column sum)^2 / n)
TSS - Total Sum of Squares
- sum the squared values within each column
- add the column totals to get the sum of all squared values
- subtract from that total the sum of all values, squared and then divided by N
(the total number of values)
- TSS = sum(all values^2) - (sum of all values)^2 / N = BSS + WSS
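The three computational formulas above can be sketched in a few lines of Python. This is only an illustration: the data values and the function name sums_of_squares are made up for the example, and each inner list is assumed to be one column (category).

```python
# Hypothetical data: three columns (categories) of observations.
columns = [
    [4, 5, 6],
    [7, 8, 9],
    [1, 2, 3],
]

def sums_of_squares(columns):
    """Return (BSS, WSS, TSS) using the computational formulas."""
    all_values = [x for col in columns for x in col]
    N = len(all_values)
    # Correction term: (sum of all values)^2 / N
    correction = sum(all_values) ** 2 / N
    # Sum of every squared value
    sum_sq = sum(x * x for x in all_values)
    # Sum over columns of (column sum)^2 / n
    col_term = sum(sum(col) ** 2 / len(col) for col in columns)
    bss = col_term - correction
    wss = sum_sq - col_term
    tss = sum_sq - correction
    return bss, wss, tss

bss, wss, tss = sums_of_squares(columns)
print(bss, wss, tss)
```

Because TSS = BSS + WSS, computing any two of the three terms is enough to check the third.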
F-stat is calculated by:
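- Divide BSS by its degrees of freedom (k-1) to get the between mean square
- Divide WSS by its degrees of freedom (N-k) to get the within mean square, where N
is the total number of values
- F = BSS mean square / WSS mean square
F-critical is looked up in the F table using (k-1) and (N-k) degrees of freedom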
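The F-stat steps can be sketched as a small Python function. The input numbers are hypothetical, and the function name f_statistic is chosen only for this example; k is the number of categories and n_total is the total number of observations.

```python
def f_statistic(bss, wss, k, n_total):
    """Return (between mean square, within mean square, F)."""
    df_between = k - 1          # numerator degrees of freedom
    df_within = n_total - k     # denominator degrees of freedom
    ms_between = bss / df_between
    ms_within = wss / df_within
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical inputs: BSS = 54, WSS = 6, k = 3 categories, 9 observations.
ms_between, ms_within, f_stat = f_statistic(54.0, 6.0, k=3, n_total=9)
print(ms_between, ms_within, f_stat)
```

The computed F is then compared with the critical value for the same two degrees of freedom. If SciPy happens to be installed, scipy.stats.f.ppf(0.95, k - 1, n_total - k) should give the 5% critical value in place of a printed table.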
Question 8:
Concept represented by the standard error of the regression equation - Answer-root
MSE (the square root of the mean squared error)
- the average error made in predicting the dependent variable using the regression
equation, in the units of the dependent variable
Coefficient of Variation
= (root MSE / dependent mean) x 100
- average percent error, or difference between the predicted value of the dependent
variable and the actual value, relative to the mean of the dependent variable
- the lower the number, the more trust you can place in the predicted values of the
dependent variable
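As a rough illustration, root MSE and the coefficient of variation can be computed by hand for a one-predictor regression. The data and the function name root_mse_and_cv are hypothetical, and the sketch assumes the mean squared error divides by n - 2 (observations minus the two estimated coefficients).

```python
import math

# Hypothetical data for a simple (one-predictor) regression.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def root_mse_and_cv(x, y):
    """Fit y = intercept + slope * x by least squares; return (root MSE, CV in %)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared prediction errors (unexplained variation)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)          # assumes two estimated coefficients
    root_mse = math.sqrt(mse)
    cv = 100 * root_mse / mean_y  # root MSE as a percent of the dependent mean
    return root_mse, cv

root_mse, cv = root_mse_and_cv(x, y)
print(round(root_mse, 4), round(cv, 2))
```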
Question 9:
Concepts represented by different sum of squares components in regression - Answer-
BSS - Between Sum of Squares (explained variation)
(BcSS - between columns; can also be found by TSS - WSS, using the original WSS)
- for each column, sum the values, square that column sum, and divide by the number of
values in the column (n)
- add these results across all columns
- subtract the sum of ALL values, squared, divided by the total number of values (N)
- BcSS = sum((column sum)^2 / n) - (sum of all values)^2 / N
BrSS - Between Rows (Block) Sum of Squares
- b = # of rows
- k = # of columns
- for each row: k * ((row sum / k) - (sum of all values / N))^2
- sum those values from all rows together
- equivalently, BrSS = sum((row sum)^2 / k) - (sum of all values)^2 / N
WSS - Within Sum of Squares (unexplained variation)
(WgSS - calculate the original WSS, then calculate BcSS (TSS - WSS), then calculate
WgSS = TSS - BcSS - BrSS)
- square each value
- sum the squared values within each column
- add the column totals to get the sum of all squared values
- subtract the sum, over all columns, of (column sum)^2 divided by the number of
values in that column (n)
TSS - Total Sum of Squares
- sum the squared values within each column
- add the column totals to get the sum of all squared values
- subtract from that total the sum of all values, squared and then divided by N
(the total number of values)
- TSS = sum(all values^2) - (sum of all values)^2 / N
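The column, row (block), and within terms can be sketched together in Python. The table values and the function name block_sums_of_squares are hypothetical. The sketch assumes one value per row-column cell and uses the standard computational form sum((row sum)^2 / k) minus the correction term for BrSS.

```python
# Hypothetical table: each inner list is a row (block), each position a column.
table = [
    [4, 7, 1],
    [5, 8, 2],
    [6, 9, 6],
]

def block_sums_of_squares(table):
    """Return (TSS, BcSS, BrSS, WgSS) for a table with one value per cell."""
    b = len(table)       # number of rows (blocks)
    k = len(table[0])    # number of columns
    N = b * k
    all_values = [x for row in table for x in row]
    correction = sum(all_values) ** 2 / N
    tss = sum(x * x for x in all_values) - correction
    # Each column holds b values, each row holds k values.
    col_sums = [sum(row[j] for row in table) for j in range(k)]
    bcss = sum(s ** 2 / b for s in col_sums) - correction
    brss = sum(sum(row) ** 2 / k for row in table) - correction
    # What is left after removing the column and row variation
    wgss = tss - bcss - brss
    return tss, bcss, brss, wgss

tss, bcss, brss, wgss = block_sums_of_squares(table)
print(tss, bcss, brss, wgss)
```

Here the original one-way WSS is TSS - BcSS, and removing the block variation from it leaves WgSS.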