ISYE 6414 FINAL EXAM QUESTIONS
WITH CORRECT DETAILED ANSWERS
True - Answer-In k-fold cross-validation, the larger K is, the higher the variability in
the estimate of the classification error.
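To make the K-fold procedure concrete, here is a minimal NumPy sketch; the data and the trivial threshold "classifier" are hypothetical, chosen only to illustrate the fold bookkeeping and how the error estimate is averaged over folds:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: one feature, binary label (purely illustrative).
n = 100
x = rng.normal(size=n)
y = (x + rng.normal(scale=0.5, size=n) > 0).astype(int)

def kfold_error(x, y, K):
    """Estimate classification error with K-fold cross-validation."""
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, K)          # K roughly equal folds
    errs = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        # "Train": pick a threshold as the training-set mean of x.
        thr = x[train].mean()
        pred = (x[test] > thr).astype(int)
        errs.append(np.mean(pred != y[test]))
    return np.mean(errs)                    # average error over folds

print(kfold_error(x, y, 5), kfold_error(x, y, 10))
```

Larger K means each held-out fold is smaller, which is one intuition for the higher variability of the error estimate.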
1) to model count data
2) to model rate response data
3) to model response data with a Poisson distribution - Answer-What can Poisson
regression be used for?
True - Answer-The link function for the Poisson regression is the log function.
False - constant variance will be violated. - Answer-If we apply a standard regression to
response data with a Poisson distribution, constant variance assumption will hold.
True - Answer-In Poisson regression, we model the log of the expected response
variable, not the expected log response variable.
False - Poisson regression is fit by maximum likelihood (typically via iteratively
reweighted least squares), not by ordinary least squares. - Answer-In Poisson
regression, we use ordinary least squares to fit the model.
True - Answer-In Poisson regression, we interpret the coefficients in terms of the ratio of
the response rates.
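Putting the last few cards together, here is a minimal NumPy sketch (on simulated data, with assumed coefficient values) of fitting a Poisson regression by maximum likelihood with a Newton-Raphson loop, using the log link, and then reading an exponentiated coefficient as a rate ratio:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate count data with a log link: E[y | x] = exp(b0 + b1 * x).
n = 2000
x = rng.uniform(0, 2, size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 0.8])          # assumed true values
y = rng.poisson(np.exp(X @ beta_true))

# Maximum-likelihood fit via Newton-Raphson (not least squares):
# score = X'(y - mu), information = X' diag(mu) X, with mu = exp(X beta).
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)
    step = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
    beta = beta + step

# exp(b1) is the multiplicative change in the expected count
# (the rate ratio) per one-unit increase in x.
rate_ratio = np.exp(beta[1])
print(beta, rate_ratio)
```

The fitted coefficients land near the assumed truth, and `rate_ratio` is interpreted as "each unit increase in x multiplies the expected count by about exp(0.8)".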
False - we use z-tests - Answer-In Poisson regression, we make inference using the t-
intervals for the coefficients.
False - inference in Poisson regression relies on the approximate (large-sample
normal) distribution of the estimated coefficients, not an exact one. - Answer-In
Poisson regression, inference relies on the exact sampling distribution of the
regression coefficients.
True - the test for regression coefficients in Poisson regression follows a chi-square
distribution with q degrees of freedom. - Answer-We use a chi-square testing procedure
to test whether a subset of regression coefficients are zero in Poisson regression.
False - residual analysis in Poisson regression is used to assess the goodness of fit
of the model (e.g., the log-link and Poisson assumptions), not whether the errors are
uncorrelated. - Answer-We can use residual analysis in Poisson regression to
evaluate whether errors are uncorrelated.
1) to address multicollinearity in multiple regression
2) To select among a large number of predicting variables
3) To fit a model when there are more predicting variables than observations - Answer-
What are some common use cases for variable selection?
True - Answer-When selecting variables, it is important to first establish which variables
are used for controlling bias in the sample and which are explanatory.
True - Variable selection balances bias against variance to select the model. - Answer-
Variable selection methods are performed by balancing the bias-variance tradeoff.
True - Answer-The penalty constant Lambda in regularized regression has the role of
controlling the trade-off between lack of fit and model complexity.
True - we can find closed form solutions for the ridge coefficients - Answer-The ridge
regression coefficients are obtained using an exact or closed form expression.
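As a sanity check on the closed-form claim, here is a minimal NumPy sketch (simulated data, arbitrary penalty value) that computes the ridge solution (X'X + lambda*I)^{-1} X'y directly and compares it with OLS:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 5
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + rng.normal(size=n)

lam = 10.0
# Closed-form ridge solution: (X'X + lam * I)^{-1} X'y
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
# OLS for comparison (the lam = 0 special case).
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

print(np.linalg.norm(beta_ridge), np.linalg.norm(beta_ols))
```

The penalty shrinks the coefficient vector toward zero, so the ridge solution has a smaller norm than OLS, and it recovers OLS as lam goes to 0.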
True - in lasso there is no closed-form solution; the coefficient estimates are
computed with a numerical algorithm. - Answer-The estimated coefficients in lasso
regression are obtained using a numerical algorithm.
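One common numerical algorithm for lasso is coordinate descent; the sketch below is a bare-bones version on simulated data (the objective scaling and stopping rule are simplified assumptions, not a production implementation):

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward 0 by t, exact zeros inside [-t, t]."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y - X @ beta                      # current residual
    for _ in range(n_iter):
        for j in range(p):
            r = r + X[:, j] * beta[j]     # partial residual excluding feature j
            rho = X[:, j] @ r / n
            beta[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j] / n)
            r = r - X[:, j] * beta[j]
    return beta

rng = np.random.default_rng(3)
n, p = 100, 10
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:2] = [3.0, -2.0]               # only 2 truly nonzero coefficients
y = X @ beta_true + rng.normal(size=n)

beta_hat = lasso_cd(X, y, lam=0.5)
print(beta_hat)
```

The soft-thresholding step produces exact zeros for the noise coefficients (sparsity), and it shrinks the nonzero estimates toward zero, which is the source of the bias noted on the next card.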
True - the lasso penalty shrinks the estimates and introduces bias, so the lasso
coefficients are less efficient than those from the ordinary least squares estimation
approach. - Answer-The regression coefficients in lasso are less efficient than
those from the ordinary least squares estimation approach.
True - this is true for explanatory purposes but NOT for prediction. - Answer-When
selecting variables for explanatory purposes, one might consider including predicting
variables which are correlated if it would help answer your research hypothesis.
False - Variable selection has come a long way but is far from a solved problem,
especially with many predictors. - Answer-Variable selection is a simple and solved
statistical problem since we can implement it using software.
False - it is not good practice to perform variable selection based on the statistical
significance of the coefficients, since a coefficient's significance depends on which
other predictors are in the model. - Answer-It is good practice to perform variable
selection based on the statistical significance of the regression coefficients.
True - since the same data are used both to fit the model and to evaluate it, the
training risk is generally too optimistic as an estimate of the true prediction risk. -
Answer-The training risk is a biased estimator for prediction risk.
True - AIC is a measure for prediction risk that adds a penalty term to correct for the
bias in the training risk. - Answer-AIC is an estimate for the prediction risk.
True - BIC generally penalizes model complexity more than the other prediction risk
estimates, and is especially useful when the objective is prediction, since it selects
simpler models. - Answer-BIC penalizes for complexity of the model more than both
VC and Mallow's Cp statistic.
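The last two cards can be made concrete with a small NumPy sketch of AIC and BIC for a Gaussian linear model (formulas written up to additive constants; the data and models are simulated assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
# True model uses only the first two columns; the other two are noise.
y = X_full[:, :2] @ np.array([1.0, 2.0]) + rng.normal(size=n)

def gaussian_ic(X, y):
    """Training risk (via RSS) plus complexity penalties for a linear model."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    n_obs, p = X.shape
    aic = n_obs * np.log(rss / n_obs) + 2 * p              # lighter penalty
    bic = n_obs * np.log(rss / n_obs) + np.log(n_obs) * p  # heavier for n > 7
    return aic, bic

aic_small, bic_small = gaussian_ic(X_full[:, :2], y)  # true 2-term model
aic_big, bic_big = gaussian_ic(X_full, y)             # plus 2 noise predictors
print(aic_small, aic_big, bic_small, bic_big)
```

Both criteria correct the training risk with a penalty on the number of parameters; because log(n) > 2 for n > 7, BIC charges the larger model more than AIC does, which is why it tends to pick simpler models.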