BUAL 2650 FINAL EXAM – LEE QUESTIONS
Simple Regression - Answer --only 1 predictor variable
-y-hat=b0 + b1*x
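Not part of the original cards: a minimal Python sketch (made-up x/y values) showing the simple regression fit y-hat = b0 + b1*x with numpy:

import numpy as np

# hypothetical data for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# np.polyfit returns the slope first, then the intercept, for degree 1
b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x            # predicted values from y-hat = b0 + b1*x
print(b0, b1)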
Multiple Regression - Answer --more than 1 predictor variable
-y-hat=b0 + b1*x1 + b2*x2......
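Again just an illustrative sketch (simulated data, statsmodels assumed available) fitting a two-predictor model y-hat = b0 + b1*x1 + b2*x2:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                  # columns x1 and x2 (simulated)
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.5, size=50)

X_const = sm.add_constant(X)                  # adds the column of 1s for b0
model = sm.OLS(y, X_const).fit()
print(model.params)                           # estimates of b0, b1, b2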
Residual - Answer -the difference between the actual data and the value we predict for
it
=observed-predicted
=y-y-hat
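A self-contained sketch (same made-up numbers as the simple-regression sketch above) computing residuals as observed minus predicted:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x

residuals = y - y_hat          # observed - predicted
print(residuals)               # negative = overestimate, positive = underestimate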
Interpreting Residuals - Answer --Negative residual: the regression equation provided
an overestimate of the data.
-Positive residual: the regression equation provided an underestimate of the data.
Linear regression only works for... - Answer -Linear models
What do we want to see from a residual plot? - Answer --No pattern
-No plot thickening
-Random scatter
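Continuing the residual sketch above, a quick matplotlib residual plot; a healthy plot is a patternless scatter around zero with roughly constant spread:

import matplotlib.pyplot as plt

plt.scatter(y_hat, residuals)          # residuals vs. predicted values
plt.axhline(0, linestyle="--")         # reference line at zero
plt.xlabel("Predicted values (y-hat)")
plt.ylabel("Residuals")
plt.show()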
Extrapolation - Answer --venturing into new x territory
-used to estimate values that go beyond a set of given data or observations
-very dangerous
Dangers of Extrapolation - Answer --assumes there is a linear relationship beyond the
range of the data
-assumes that nothing about the relationship between x and y changes at extreme
values of x
Interpreting the Intercept of an MRM - Answer -is it meaningful or not? the intercept is
the predicted y when all predictor variables equal 0, so we decide whether it is
meaningful by asking if x = 0 makes sense for every predictor
Is this multiple regression model any good at all? - Answer -Test hypotheses: H0: all
beta values = 0 vs. HA: at least 1 beta value does not = 0
-then, use the overall F-test
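Illustration only (using the statsmodels fit from the multiple-regression sketch above): the overall test of H0 that all betas are 0 is reported as the model F-statistic and its p-value:

print(model.fvalue)      # F-statistic for H0: beta1 = beta2 = ... = 0
print(model.f_pvalue)    # small p-value => at least one beta is not 0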
Rules for interpreting multiple regression coefficients - Answer --express in terms of the
units of the dependent variable
-always say "all else being equal"
-always mention the other variables by saying "after (variable #1) and (variable #2) are
accounted for," and interpret the coefficient
How do we determine if a multiple regression model is significant? - Answer -p-value
(needs to be small) and F-statistic (needs to be big - this means that at least one of the
predictors accounts for the variation in predicting the dependent variable.)
R-squared - Answer --"Goodness of fit"
-a statistical measure of how close the data are to the fitted regression line (how well
observed outcomes are replicated by the model)
Dangers of R-squared - Answer -a high R-squared does not by itself mean the model is
appropriate or that x causes y (see Causality Warning); it never decreases when more
predictors are added, even useless ones
Interpreting R-Square - Answer -R-square = .80 indicates that the model explains 80%
of variability of the response (y) data OR R-square = 0.41 indicates that 41% of the
variability of height can be explained by the model.
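A sketch of R-squared computed by hand from the simple-regression residuals above (statsmodels reports the same quantity as model.rsquared for its fit):

import numpy as np

sse = np.sum(residuals ** 2)              # sum of squared residuals
sst = np.sum((y - y.mean()) ** 2)         # total variability of y
r_squared = 1 - sse / sst
print(r_squared)                          # e.g. 0.80 => 80% of the variability in y is explained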
Outliers - Answer -points with y-values far from the regression model; points far from
the body of the data
Leverage - Answer -A data point can also be unusual if its x-value is far from the mean
of the x-values. Such points are said to have high leverage.
Influential Point - Answer -We say that a point is influential if omitting it from the
analysis gives a very different slope for the model
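A sketch (statsmodels assumed, using the multiple-regression fit above) of the usual numeric checks for leverage and influence:

from statsmodels.stats.outliers_influence import OLSInfluence

influence = OLSInfluence(model)
leverage = influence.hat_matrix_diag        # large values = x far from the other x-values
cooks_d = influence.cooks_distance[0]       # large values = omitting the point changes the fit a lot
print(leverage.max(), cooks_d.max())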
Causality Warning - Answer -no matter how strong the association, no matter how large
the r-squared value, there is no way to conclude from a regression alone that one
variable caused the other
Autocorrelation - Answer -When values at time t are correlated with values at time t-1,
we say the values are autocorrelated in the first order. If values are correlated with
values two time periods back, we say second-order autocorrelation is present, and so
on.
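Illustrative only (assuming the residuals are in time order): first-order autocorrelation is just the correlation between the residual series and itself shifted by one period:

import numpy as np

resid = np.asarray(model.resid)                       # residuals, in time order
lag1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]       # correlation of e_t with e_(t-1)
print(lag1)                                           # near 0 suggests little first-order autocorrelation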
Autoregression and P-values - Answer -large p-values (e.g., .870 and .699) mean that
those terms are not significant
Why is autocorrelation a problem? - Answer -When data are highly correlated over
time, each data point is similar to those around it, so each data point provides less
additional information than if the points had been independent. All regression inference
is based on independent errors.
Durbin-Watson statistic - Answer --can detect first-order autocorrelation from the
residuals of a regression analysis
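statsmodels ships a Durbin-Watson helper; a sketch using the fit above:

from statsmodels.stats.stattools import durbin_watson

dw = durbin_watson(model.resid)
print(dw)    # about 2 = no first-order autocorrelation; well below 2 = positive, well above 2 = negative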