By Christopher R. Bilder; Thomas M. Loughin (ISBN 9781439855676), all chapters.
What are the key reasons to develop a model for your data analysis? Select three answers.
A. Determine the relationships between variables.
B. Understand how the data were generated.
C. Identify any special structures that may exist in the data.
D. Determine the accuracy of your data. - ANSWER: A. Determine the relationships between variables.
B. Understand how the data were generated.
C. Identify any special structures that may exist in the data.
There are four assumptions associated with a linear regression model. What is the definition of the
assumption homoscedasticity?
A. The relationship between X and the mean of Y is linear.
B. Observations are independent of each other.
C. For any fixed value of X, Y is normally distributed.
D. The variance of the residuals is the same for any value of X. - ANSWER: D. The variance of the
residuals is the same for any value of X.
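For illustration, a minimal R sketch with simulated data (all names hypothetical) showing how a residuals-vs-fitted plot can be used to eyeball homoscedasticity:

    # Hypothetical data with constant error variance
    set.seed(42)
    x <- runif(100, 0, 10)
    y <- 2 + 3 * x + rnorm(100, sd = 2)
    fit <- lm(y ~ x)

    # Residuals vs. fitted values: a roughly even band around zero
    # (no funnel shape) is consistent with homoscedasticity
    plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
    abline(h = 0, lty = 2)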
What step must you take before you can obtain a prediction based on a fitted simple linear regression
model?
A. Use or create a data frame containing previously unseen data.
B. Do nothing. Once you have a fitted simple linear regression model, you have all you need to make
predictions.
C. Use or create a data frame containing known target variables.
D. Use or create a data frame containing known predictor variables. - ANSWER: A. Use or create a
data frame containing previously unseen data.
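A minimal R sketch (simulated data, hypothetical names): the data frame passed to predict() must contain a column named after the model's predictor.

    set.seed(1)
    dat <- data.frame(x = runif(50, 0, 10))
    dat$y <- 1 + 2 * dat$x + rnorm(50)
    fit <- lm(y ~ x, data = dat)

    # predict() needs a data frame whose column name matches the predictor
    new_obs <- data.frame(x = c(2.5, 7.0))
    predict(fit, newdata = new_obs)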
Assume you have a dataset called "new_dataset", two predictor variables called X and Y, and a target
variable called Z, and you want to fit a multiple linear regression model. Which command should you
use?
A. linear_model <- lm(Z ~ X + Y, data = new_dataset)
B. linear_model <- lm(Z ~ X ~ Y, data = new_dataset)
C. linear_model <- lm(X + Y + Z, data = new_dataset)
D. linear_model <- lm(X + Y ~ Z, data = new_dataset) - ANSWER: A. linear_model <- lm(Z ~ X + Y, data
= new_dataset)
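A sketch using the question's hypothetical names (new_dataset, X, Y, Z) with simulated data:

    set.seed(2)
    new_dataset <- data.frame(X = rnorm(100), Y = rnorm(100))
    new_dataset$Z <- 5 + 1.5 * new_dataset$X - 2 * new_dataset$Y + rnorm(100)

    # Multiple linear regression: target on the left of ~, predictors joined by +
    linear_model <- lm(Z ~ X + Y, data = new_dataset)
    summary(linear_model)   # intercept plus coefficients for X and Y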
Which plot types help you validate assumptions about linearity? Select two answers.
A. Scale-location plot
B. Residual plot
C. Regression plot
D. Q-Q plot - ANSWER: B. Residual plot
C. Regression plot
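For illustration, a minimal R sketch (simulated data) producing both plot types used to judge linearity:

    set.seed(3)
    x <- runif(80, 0, 10)
    y <- 4 + 2 * x + rnorm(80)
    fit <- lm(y ~ x)

    plot(x, y); abline(fit)   # regression plot: data with the fitted line
    plot(fit, which = 1)      # residuals vs. fitted values (built-in lm plot)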
True or False: When using the poly() function to fit a polynomial regression model, you must specify
"raw = FALSE" so you can get the expected coefficients.
A. True.
B. False. - ANSWER: B. False.
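A sketch with simulated data showing the raw argument: poly() defaults to raw = FALSE (orthogonal polynomials); raw = TRUE returns coefficients on x, x^2, x^3 directly, while the fitted values are identical either way.

    set.seed(4)
    x <- runif(60, 0, 5)
    y <- 1 + x - 0.5 * x^2 + 0.1 * x^3 + rnorm(60, sd = 0.5)

    fit_orth <- lm(y ~ poly(x, 3))               # default: raw = FALSE
    fit_raw  <- lm(y ~ poly(x, 3, raw = TRUE))   # coefficients on x, x^2, x^3
    all.equal(fitted(fit_orth), fitted(fit_raw)) # TRUE: same fit, different basis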
Which performance metric for regression is the mean of the square of the residuals (error)?
A. Mean squared error (MSE)
B. Mean absolute error (MAE)
C. Root mean squared error (RMSE)
D. R-squared (R2) - ANSWER: A. Mean squared error (MSE)
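A minimal R sketch (simulated data): MSE is the mean of the squared residuals, mean((y - yhat)^2).

    set.seed(5)
    x <- runif(100)
    y <- 3 * x + rnorm(100, sd = 0.3)
    fit <- lm(y ~ x)

    mse <- mean(residuals(fit)^2)   # mean of the squared residuals
    mse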
When comparing the MSE of different models, do you want the highest or lowest value of MSE?
A. Lowest value of MSE
B. Highest value of MSE - ANSWER: A. Lowest value of MSE
In model development, you can develop more accurate models when you have which of the
following?
A. Relevant data.
B. Larger quantities of data.
C. Fewer independent variables.
D. More dependent variables. - ANSWER: A. Relevant data.
Assume you have a dataset called "new_dataset", a predictor variable called X, and a target called Y,
and you want to fit a simple linear regression model. Which command should you use?
A. linear_model <- lm(X ~ Y, data = new_dataset)
B. linear_model <- predict(X ~ Y, data = new_dataset)
C. linear_model <- predict(Y ~ Z, data = new_dataset)
D. linear_model <- lm(Y ~ X, data = new_dataset) - ANSWER: D. linear_model <- lm(Y ~ X, data =
new_dataset)
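A sketch using the question's hypothetical names (new_dataset, X, Y) with simulated data:

    set.seed(6)
    new_dataset <- data.frame(X = runif(100, 0, 10))
    new_dataset$Y <- 2 + 0.8 * new_dataset$X + rnorm(100)

    # Simple linear regression: the response Y goes on the left of ~
    linear_model <- lm(Y ~ X, data = new_dataset)
    coef(linear_model)   # estimated intercept and slope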
When using the predict() function in R, what is the default confidence level?
A. 90%
B. 95%
C. 100%
D. 85% - ANSWER: B. 95%
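A minimal R sketch (simulated data): the level argument of predict() defaults to 0.95, i.e. a 95% interval, and can be changed explicitly.

    set.seed(7)
    dat <- data.frame(x = runif(50, 0, 10))
    dat$y <- 1 + 2 * dat$x + rnorm(50)
    fit <- lm(y ~ x, data = dat)

    new_obs <- data.frame(x = 5)
    predict(fit, new_obs, interval = "confidence")                # level = 0.95 by default
    predict(fit, new_obs, interval = "confidence", level = 0.90)  # explicit 90% interval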
Which plot type helps you validate assumptions about normality?
A. Scale-location plot
B. Residual plot
C. Q-Q plot
D. Regression plot - ANSWER: C. Q-Q plot
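For illustration, a minimal R sketch (simulated data): points lying close to the reference line in a Q-Q plot of the residuals are consistent with normally distributed errors.

    set.seed(8)
    x <- runif(80)
    y <- 1 + 2 * x + rnorm(80)
    fit <- lm(y ~ x)

    plot(fit, which = 2)                        # built-in normal Q-Q plot of residuals
    qqnorm(residuals(fit)); qqline(residuals(fit))   # equivalent base-R version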
A third-order polynomial regression model is described as which of the following?
A. Squared, meaning that the predictor variable in the model is squared.
B. Quadratic, meaning that the predictor variable in the model is squared.
C. Cubic, meaning that the predictor variable in the model is cubed.
D. Simple linear regression. - ANSWER: C. Cubic, meaning that the predictor variable in the model is
cubed.
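A sketch with simulated data: a cubic model includes the predictor up to its third power, written either with I() terms or with poly(x, 3, raw = TRUE).

    set.seed(9)
    x <- runif(60, -2, 2)
    y <- 1 + x - x^2 + 0.5 * x^3 + rnorm(60, sd = 0.3)

    fit_cubic <- lm(y ~ x + I(x^2) + I(x^3))   # same fit as poly(x, 3, raw = TRUE)
    coef(fit_cubic)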
How should you interpret an R-squared result of 0.89?
A. 89% of the response variable variation is explained by a polynomial model.
B. 89% of the response variable variation is explained by a linear model.
C. The X variable causes the Y variable to positively change 89% of the time.
D. There is a strong negative correlation between the variables. - ANSWER: B. 89% of the response
variable variation is explained by a linear model.
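A minimal R sketch (simulated data) showing where R-squared is reported; a value of 0.89 would mean 89% of the variation in the response is explained by the fitted linear model.

    set.seed(10)
    x <- runif(100)
    y <- 2 + 3 * x + rnorm(100, sd = 0.4)
    fit <- lm(y ~ x)

    summary(fit)$r.squared   # proportion of response variation explained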
When comparing linear regression models, when will the mean squared error (MSE) be smaller?
A. When using a multiple linear regression (MLR) model.
B. When using a polynomial regression model.
C. This depends on your data. The model that fits the data better has the smaller MSE.
D. When using a simple linear regression (SLR) model. - ANSWER: C. This depends on your data. The
model that fits the data better has the smaller MSE.
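For illustration, a sketch comparing MSE for a straight-line fit and a quadratic fit on the same simulated data; whichever model fits these data better has the smaller MSE.

    set.seed(11)
    x <- runif(100, 0, 10)
    y <- 1 + 0.5 * x + 0.3 * x^2 + rnorm(100, sd = 2)   # truly curved relationship

    fit_slr  <- lm(y ~ x)
    fit_poly <- lm(y ~ poly(x, 2))

    mean(residuals(fit_slr)^2)    # larger: the straight line misses the curvature
    mean(residuals(fit_poly)^2)   # smaller: the quadratic follows the curve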
When evaluating models, what is the term used to describe a situation where a model fits the training
data very well but performs poorly when predicting new data?
A. Overfit