Questions and All Correct Answers.
In logistic regression, the relationship between the probability of success and the predicting
variables is nonlinear. - Answer TRUE: The equation that links the predictors to the probability
is:
$$p(x_1,\dots,x_p) = \frac{\exp(\beta_0+\beta_1 x_1+\dots+\beta_p x_p)}{1+\exp(\beta_0+\beta_1 x_1+\dots+\beta_p x_p)}$$
This relationship is not linear.
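A minimal sketch of the nonlinearity, assuming a model with a single predictor. The function name and the coefficient values are hypothetical and chosen only for illustration:

```python
import math

def success_probability(beta0, beta1, x):
    """P(Y = 1 | x) for a one-predictor logistic model."""
    eta = beta0 + beta1 * x  # linear predictor
    return math.exp(eta) / (1.0 + math.exp(eta))

# Hypothetical coefficients, not estimated from any data.
beta0, beta1 = -1.0, 0.8

probs = [success_probability(beta0, beta1, x) for x in (0, 1, 2, 3)]
# Change in probability for each one-unit increase in x.
steps = [b - a for a, b in zip(probs, probs[1:])]

print(probs)
print(steps)  # the steps differ, so the probability is not linear in x
```

Equal increases in x give unequal changes in probability, which would not happen if the relationship were linear.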
In logistic regression, the error terms are assumed to follow a normal distribution. - Answer
FALSE: There are no error terms in logistic regression. The response is modeled directly as a Bernoulli (or binomial) random variable.
The logit function is the log of the ratio of the probability of success to the probability of failure.
It is also known as the log odds function. - Answer TRUE: $g(p) = \ln\left(\frac{p}{1-p}\right)$.
The logit link function is also known as the log odds function.
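As a small illustration, the logit and its inverse might be written as follows. The function names are assumptions made for this sketch:

```python
import math

def logit(p):
    """Log odds: log of P(success) / P(failure)."""
    return math.log(p / (1.0 - p))

def inverse_logit(eta):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

p = 0.8
odds = p / (1.0 - p)   # probability of success over probability of failure
log_odds = logit(p)    # the same quantity on the log scale

print(odds, log_odds, inverse_logit(log_odds))
```

A probability of 0.5 corresponds to log odds of 0, and applying the inverse to the logit recovers the original probability.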
The number of parameters that need to be estimated in a logistic regression model with 6
predicting variables and an intercept is the same as the number of parameters that need to be
estimated in a standard linear regression model with an intercept and same predicting
variables. - Answer FALSE: As there is no error term in a logistic regression model, there is no
additional parameter for the variance of the error terms. As a result, the number of parameters
that need to be estimated in a logistic regression model with 6 predicting variables and an
intercept is 7 (6 slope coefficients plus the intercept). The number of parameters that need to be
estimated in a standard linear regression model with an intercept and the same predicting variables
is 8 (the same 7 coefficients plus the error variance).
The log-likelihood function is a linear function with a closed-form solution. - Answer FALSE:
The log-likelihood function is a non-linear function. A numerical algorithm is needed in order to
maximize it.
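One common numerical approach is Newton-Raphson. The sketch below is an illustration under assumptions, not a prescribed method: it handles only an intercept and one predictor, uses a fixed number of iterations, and runs on a small made-up data set.

```python
import math

def fit_logistic(xs, ys, iterations=25):
    """Newton-Raphson for a logistic model with an intercept and one slope."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Score vector (g0, g1) and information matrix [[h00, h01], [h01, h11]].
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^(-1) times the score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up, non-separable data used only to exercise the routine.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(xs, ys)
print(b0, b1)
```

Each pass re-evaluates the fitted probabilities at the current estimates, which is why no single closed-form expression gives the answer.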
In logistic regression, the estimated value for a regression coefficient 𝛽𝑖 represents the
estimated expected change in the response variable associated with one unit increase in the
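corresponding predicting variable, 𝑥𝑖, holding all else in the model fixed. - Answer FALSE: We
interpret logistic regression coefficients with respect to the odds of success: a one-unit increase
in 𝑥𝑖 changes the log odds by 𝛽𝑖, that is, multiplies the odds by exp(𝛽𝑖), holding all else fixed.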
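A quick numerical check of the odds interpretation, assuming a one-predictor model with hypothetical coefficients:

```python
import math

def odds(beta0, beta1, x):
    """Odds of success, p / (1 - p), at predictor value x."""
    p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
    return p / (1.0 - p)

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -2.0, 0.5

# Ratio of the odds at x = 3 to the odds at x = 2 (a one-unit increase).
ratio = odds(beta0, beta1, 3.0) / odds(beta0, beta1, 2.0)

print(ratio, math.exp(beta1))  # the two values agree up to rounding
```

The ratio does not depend on the starting value of x, whereas the change in the probability itself does.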
Under logistic regression, the sampling distribution used for a coefficient estimator is a Chi-
When testing a subset of coefficients, deviance follows a chi-square distribution with q degrees
of freedom, where q is the number of regression coefficients in the reduced model. - Answer
FALSE: When testing a subset of coefficients, the test statistic (the difference in deviance between
the reduced and the full model) approximately follows a chi-square distribution with q degrees of
freedom, where q is the number of regression coefficients discarded from the full model to get the
reduced model.
Logistic regression deals with the case where the dependent variable is binary, and the
conditional distribution 𝑌𝑖|𝑿𝑖,1,⋯,𝑿𝑖,𝑝 is Binomial. - Answer TRUE: Logistic regression is the
generalization of the standard regression model that is used when the response variable y is
binary or binomial.
In logistic regression, if the p-value of the deviance test for goodness-of-fit is smaller than the
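significance level 𝛼, then it is plausible that the model is a good fit. - Answer FALSE: For logistic
regression, the null hypothesis of the deviance test for goodness-of-fit is that the model fits well.
A p-value smaller than 𝛼 leads us to reject that hypothesis and conclude that the model is not a good
fit; a large p-value is an indication that the model is a good fit.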
If a logistic regression model provides accurate classification, then we can conclude that it is a
good fit for the data. - Answer FALSE: "Goodness of fit doesn't guarantee good prediction."
And conversely, good prediction doesn't guarantee that the model is a good fit.
To evaluate whether the model is a good fit, or equivalently whether the assumptions hold, we
can check whether the Pearson or deviance residuals are approximately normally distributed,
using a histogram and a normal probability (Q-Q) plot. If they are approximately normally
distributed, then we conclude that the model is a good fit.
Another approach to evaluating goodness of fit is through hypothesis testing. In the goodness of
fit test, the null hypothesis is that the model fits well, and the alternative is that the model does
not fit well. The test statistic for the goodness of fit test is the sum of squared deviance
residuals. Under the null hypothesis of good fit, the test statistic has an approximate Chi-Square
distribution with n-p-1 degrees of freedom, where n is the number of observations and p is the
number of predicting variables. It is important to remember that if the p-value is small, we reject
the null hypothesis of good fit, and thus we conclude that the model is not a good fit.
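Written out, the test described above takes roughly the following form. The symbols D (the test statistic) and d_i (the i-th deviance residual) are notation assumed here for illustration:

```latex
D = \sum_{i=1}^{n} d_i^{2}
\;\overset{\text{approx.}}{\sim}\; \chi^{2}_{\,n-p-1}
\quad \text{under } H_0 \text{ (the model fits well)},
\qquad
\text{p-value} = P\!\left(\chi^{2}_{\,n-p-1} > D\right)
```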
For both logistic and Poisson regression, the deviance residuals should approximately follow the
standard normal distribution if the model is a good fit for the data. - Answer TRUE: The
deviance residuals are approximately N(0,1) if the model is a good fit.
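For the binomial case, a deviance residual can be computed along these lines. This sketch covers only binomial responses (not Poisson), and the counts and fitted probabilities are hypothetical:

```python
import math

def deviance_residual(y, n, p_hat):
    """Deviance residual for y successes in n trials with fitted probability p_hat."""
    fitted = n * p_hat

    def term(observed, expected):
        # By convention, 0 * log(0) is treated as 0.
        return observed * math.log(observed / expected) if observed > 0 else 0.0

    d2 = 2.0 * (term(y, fitted) + term(n - y, n - fitted))
    sign = 1.0 if y >= fitted else -1.0
    return sign * math.sqrt(max(d2, 0.0))

# Hypothetical observed counts, trial counts, and fitted probabilities.
ys = [2, 5, 9]
ns = [10, 10, 10]
p_hats = [0.25, 0.5, 0.85]

residuals = [deviance_residual(y, n, p) for y, n, p in zip(ys, ns, p_hats)]
deviance = sum(d * d for d in residuals)  # sum of squared deviance residuals
print(residuals, deviance)
```

The residual is negative when the observed count falls below the fitted count and zero when the two match exactly.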
The logit link function is the best link function to model binary response data because it always
fits the data better than other link functions. - Answer FALSE: "The logit function is not the
only function that yields the s-shaped kind of curve. There are other s-shaped functions that are