high flexibility correlates with - correct answer ✔✔high variance
MLR:
When an outlier has been identified, the following approaches are accepted... - correct answer
✔✔remove the outlier only after confirming that it resulted from error
retain the outlier in the analysis and thoroughly document its influence
perform the regression twice: once with outlier and once without
include the observation but comment on its effects
delete the observation from the dataset
create a binary variable to indicate the presence of an outlier
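One of the accepted approaches above, performing the regression twice, is easy to demonstrate. A minimal sketch, assuming scikit-learn and made-up data (the 60.0 outlier value and all variable names are hypothetical):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: a clean linear trend plus one extreme outlier.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(30, 1))
y = 2.0 * X[:, 0] + rng.normal(0, 1, size=30)
X = np.vstack([X, [[9.5]]])     # append one more x value
y = np.append(y, 60.0)          # its y is far from the 2*x trend

# "Perform the regression twice": once with and once without the outlier.
with_outlier = LinearRegression().fit(X, y)
without_outlier = LinearRegression().fit(X[:-1], y[:-1])
print("slope with outlier:   ", with_outlier.coef_[0])
print("slope without outlier:", without_outlier.coef_[0])
```

Comparing the two slopes documents the outlier's influence, which is exactly the point of fitting twice.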
Prediction:
There is no guarantee that any two models will produce the same prediction - correct answer ✔✔T - it is
more likely that they will produce different predictions
Prediction:
It is assumed that the new observation follows the same model as the one used in the sample - correct
answer ✔✔T - if it doesn't follow the same model, we shouldn't be using it to make predictions
Prediction:
Is a point prediction more reliable than an interval prediction? - correct answer ✔✔F - neither is more
reliable than the other, and there is no easy way to compare the reliability of the two
Prediction:
A wider prediction interval means that the standard error is lower / higher - correct answer ✔✔higher
Prediction:
Which type of interval is more informative? wide/narrow - correct answer ✔✔narrow - it gives us a
better idea of the true value of an observation
Should a prediction interval contain the single point prediction? - correct answer ✔✔Yes - the interval
contains the most likely values with the point prediction being the single most likely point
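These prediction-interval facts can be checked directly. A minimal sketch, assuming statsmodels and simulated data; the model 3 + 2x and the query point x = 5 are made up for illustration:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)
y = 3.0 + 2.0 * x + rng.normal(0, 2, size=50)

fit = sm.OLS(y, sm.add_constant(x)).fit()

# Prediction interval for a new observation at x = 5.
new = sm.add_constant(np.array([5.0]), has_constant="add")
frame = fit.get_prediction(new).summary_frame(alpha=0.05)
print(frame[["mean", "obs_ci_lower", "obs_ci_upper"]])
# "mean" (the point prediction) always lies inside [obs_ci_lower,
# obs_ci_upper], and a larger standard error widens that interval.
```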
Recursive binary splitting process:
the predictor and cutpoint for each split are chosen to minimize the overall impurity of the tree - correct
answer ✔✔T - impurity can be measured by RSS (regression), or by the Gini index, entropy, or classification error (classification)
Recursive binary splitting process:
Does each split have to use a different predictor to ensure diversity in that tree? - correct answer ✔✔No
- the same predictor can be used for multiple splits if it continues to provide the best reduction in
impurity
Recursive binary splitting process:
is a top-down approach, starting from the root and expanding downwards - correct answer ✔✔T
Recursive binary splitting process:
works with both quantitative and qualitative predictors - correct answer ✔✔T
Recursive binary splitting process:
Does it stop as soon as a single split has been made? - correct answer ✔✔No - it continues until a
predefined stopping criterion is met (typically a minimum node size, maximum tree depth, or minimum
reduction in impurity)
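The splitting behaviour described in these cards can be inspected with scikit-learn's rule printer. A minimal sketch on simulated data; the depth and leaf-size limits are arbitrary stopping criteria chosen for illustration:

```python
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor, export_text

X, y = make_regression(n_samples=200, n_features=3, random_state=0)

# Splitting stops at a predefined criterion, not after one split:
# here a maximum depth and a minimum node (leaf) size.
tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=10,
                             random_state=0).fit(X, y)

# The printed rules read top-down from the root; note that the same
# feature may reappear at several depths if it keeps reducing impurity.
print(export_text(tree, feature_names=["x0", "x1", "x2"]))
```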
Bagging:
For a sufficiently large number of bootstrap samples, out-of-bag error is virtually equivalent to ____ -
correct answer ✔✔the leave-one-out cross-validation (LOOCV) error
Bagging:
Out-of-bag error estimation uses only the trees for which the specific observation was not in the
bootstrap sample - correct answer ✔✔T - it does not require each observation to be predicted by all
trees in the ensemble
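A minimal sketch of out-of-bag estimation, assuming a recent scikit-learn (the `estimator` keyword; older versions call it `base_estimator`) and simulated data:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=10,
                       random_state=0)

bag = BaggingRegressor(
    estimator=DecisionTreeRegressor(),
    n_estimators=500,   # large B: each point is OOB for roughly 37% of trees
    oob_score=True,     # score each point using only trees that never saw it
    random_state=0,
)
bag.fit(X, y)
print("OOB R^2:", bag.oob_score_)   # approaches a LOOCV-style estimate as B grows
```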
Bagging:
Increasing the number of trees does not lead to overfitting due to the aggregation of predictions -
correct answer ✔✔T - a very high number of bootstrap samples will NOT lead to overfitting in bagged
models
Bagging:
Bagging is useful for improving prediction accuracy in classification settings - correct answer ✔✔T
Bagging:
Each bootstrapped dataset likely contains repeated observations due to sampling with replacement -
correct answer ✔✔T - some original observations appear more than once while others are left out entirely
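The repeated-observation claim is easy to verify by drawing one bootstrap sample and counting distinct rows (a minimal numpy sketch; n = 1000 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
sample = rng.integers(0, n, size=n)      # draw n indices with replacement
unique = np.unique(sample).size
print(f"distinct observations: {unique}/{n} ({unique / n:.1%})")
# Roughly 63.2% (= 1 - 1/e) of observations appear at least once,
# so many rows repeat and about 36.8% are left out-of-bag.
```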
Regression trees:
A smaller tree with fewer splits might lead to lower variance and better interpretation at the cost of a
little bias - correct answer ✔✔T
Regression trees:
A decision tree considers all predictors X1-Xp and all the possible values of the cutpoint for each of the
predictors, and then chooses the predictor and cutpoint such that the resulting tree has the _____ sum
of squares - correct answer ✔✔lowest
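A minimal brute-force sketch of that search, in plain numpy (the function name and data are made up): for every predictor and candidate cutpoint it computes the two-region RSS and keeps the minimum.

```python
import numpy as np

def best_split(X, y):
    """Return (feature, cutpoint, RSS) minimizing the two-region RSS."""
    best = (None, None, np.inf)
    for j in range(X.shape[1]):                 # all predictors X1..Xp
        for s in np.unique(X[:, j]):            # all candidate cutpoints
            left, right = y[X[:, j] <= s], y[X[:, j] > s]
            if left.size == 0 or right.size == 0:
                continue
            rss = ((left - left.mean()) ** 2).sum() \
                + ((right - right.mean()) ** 2).sum()
            if rss < best[2]:
                best = (j, s, rss)
    return best

rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(100, 2))
y = np.where(X[:, 0] > 5, 10.0, 0.0) + rng.normal(0, 1, size=100)
print(best_split(X, y))   # picks feature 0 with a cutpoint near 5
```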
Regression trees:
Estimating the cross-validation error for every possible subtree would be too cumbersome, which leads
to cost complexity pruning, which considers... - correct answer ✔✔a sequence of trees indexed by a non-
negative tuning parameter, alpha
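scikit-learn exposes exactly this alpha-indexed sequence. A minimal sketch on simulated data; the dataset and the three alphas printed are arbitrary:

```python
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=4, noise=5, random_state=0)

# Instead of scoring every possible subtree, get the short sequence of
# candidate alphas; each alpha indexes one subtree in the pruning sequence.
path = DecisionTreeRegressor(random_state=0).cost_complexity_pruning_path(X, y)
print("number of candidate alphas:", len(path.ccp_alphas))

# Cross-validation then only needs to compare this handful of subtrees.
for alpha in path.ccp_alphas[:3]:
    tree = DecisionTreeRegressor(ccp_alpha=alpha, random_state=0).fit(X, y)
    print(f"alpha={alpha:.3f}  leaves={tree.get_n_leaves()}")
```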