Which component of a decision tree provides the predictions? - Answers leaf nodes
A decision tree can have only two-way splits. - Answers false
Each instance in the data set follows a single path in the decision tree. - Answers true
Decision trees cannot handle numeric attributes when conducting attribute tests in interior
nodes. - Answers false
Suppose you build a decision tree to predict whether a customer makes a purchase on the
internet (yes or no). A leaf has perfect purity when it contains which of the following? - Answers
Either only customers who have made purchases or only customers who have not made
purchases.
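One common way to quantify leaf purity is entropy; a minimal Python sketch (the counts are made up for illustration):

import math

def entropy(counts):
    # Entropy of a class distribution; 0 means a perfectly pure leaf.
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

print(entropy([10, 0]))   # pure leaf (only purchasers)  -> 0.0
print(entropy([5, 5]))    # maximally impure leaf        -> 1.0
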
Each attribute can appear only in a single split in the decision tree. - Answers false
Decision trees can only handle categorical variables as attributes in interior nodes. - Answers
false
In order to estimate the generalization performance of a decision tree model, I estimate the
performance using: - Answers the test set
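A minimal scikit-learn sketch of this workflow (the dataset and split parameters are illustrative, not prescribed by the question):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
# Hold out a test set that the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
# Generalization performance is estimated on the held-out test set,
# never on the training set (which would be optimistically biased).
print("test accuracy:", tree.score(X_test, y_test))
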
When using a similarity-moderated kNN to make predictions for two different data points, even
if I always use all the instances in the data to generate predictions, I will not always generate the
same prediction. - Answers true
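A sketch of the idea, assuming similarity moderation means distance-weighted voting as in scikit-learn's weights="distance" (dataset and query points are illustrative):

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
# Use every instance as a neighbor, but weight each vote by similarity
# (inverse distance): two different query points assign different
# weights to the same instances, so their predictions can differ.
knn = KNeighborsClassifier(n_neighbors=len(X), weights="distance").fit(X, y)
queries = X[[0, 100]] + 0.05   # two different (slightly offset) query points
print(knn.predict(queries))    # e.g. [0 2]: different predictions
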
The accuracy metric is able to discriminate between the different types of correct
classifications a classifier makes. - Answers false
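A confusion matrix, by contrast, does discriminate between the types of correct and incorrect classifications; a small sketch (the labels are made up):

from sklearn.metrics import accuracy_score, confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0]
# Accuracy collapses everything into a single number ...
print(accuracy_score(y_true, y_pred))     # 0.75
# ... while the confusion matrix separates true/false positives
# and negatives by class.
print(confusion_matrix(y_true, y_pred))   # [[5 0]
                                          #  [2 1]]
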
Laplace correction helps us avoid/prevent overfitting. - Answers True
Logistic regression is a supervised data science method. - Answers True
Logistic regression can be used for numeric prediction tasks. - Answers False
In the context of a decision tree, Laplace correction will typically not impact my class probability
estimates when there are many instances in the leaf segment. - Answers True
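For a binary leaf, the Laplace-corrected estimate replaces the raw frequency n/N with (n + 1)/(N + 2); a quick arithmetic sketch (the counts are illustrative):

def laplace_estimate(n_pos, n_total):
    # Laplace-corrected probability estimate for a binary leaf:
    # (n + 1) / (N + 2) instead of the raw frequency n / N.
    return (n_pos + 1) / (n_total + 2)

# Tiny leaf: raw estimate 2/2 = 1.0 is overconfident; Laplace tempers it.
print(laplace_estimate(2, 2))        # 0.75
# Large leaf: the correction barely changes the estimate.
print(laplace_estimate(900, 1000))   # ~0.8992 vs raw 0.9000
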
It is required to normalize the features before building a logistic regression model. - Answers
False
Decision trees can be used for class probability estimation tasks. - Answers True
Logistic regression can typically generate fast predictions. - Answers True
It is possible to capture non-linear patterns in the data when I use linear models, such as the
logistic regression model. - Answers True
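One standard way to do this is to engineer non-linear features and fit the linear model on top of them; a minimal scikit-learn sketch (dataset and polynomial degree are illustrative):

from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Concentric circles: not linearly separable in the raw feature space.
X, y = make_circles(noise=0.1, factor=0.4, random_state=0)

linear = LogisticRegression().fit(X, y)
print("raw features:", linear.score(X, y))        # near chance

# Adding squared/interaction terms lets the *linear* model draw a
# boundary that is non-linear in the original feature space.
poly = make_pipeline(PolynomialFeatures(degree=2), LogisticRegression()).fit(X, y)
print("poly features:", poly.score(X, y))         # close to 1.0
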
When a logistic regression model is applied to a binary classification task, it generates a single
decision boundary in the instance space. - Answers True
Fitting graphs can be used to assess the possibility of overfitting. - Answers True
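A fitting graph tracks training versus holdout performance as model complexity grows; a widening gap signals overfitting. A minimal scikit-learn sketch that prints the raw numbers instead of plotting (dataset and depths are illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for depth in [1, 2, 4, 8, 16]:
    t = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    # depth, training accuracy, holdout accuracy
    print(depth, round(t.score(X_tr, y_tr), 3), round(t.score(X_te, y_te), 3))
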
Increasing the number of nodes in a decision tree decreases the complexity of the model. -
Answers False
Suppose your model is overfitting. Which of the following is NOT a valid way to try and reduce
the overfitting? - Answers Increase the model complexity.
When we don't limit the depth of the decision tree, it is less likely to overfit. - Answers False
Logistic Regression is likely to outperform Decision Trees when using a small training data set. -
Answers True
Which of the following strategies is likely to increase the chances of overfitting when using a
kNN model? - Answers Decreasing the k value
What strategies can help reduce overfitting in decision trees? - Answers Pruning;
Enforce a maximum depth for the decision tree;
Enforce a minimum number of instances in leaf nodes
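A sketch of how these three strategies map onto scikit-learn's DecisionTreeClassifier (the parameter values are illustrative, not recommendations):

from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
tree = DecisionTreeClassifier(
    max_depth=5,           # enforce a maximum depth
    min_samples_leaf=20,   # enforce a minimum number of instances per leaf
    ccp_alpha=0.01,        # cost-complexity (post-)pruning
).fit(X, y)
print("leaves:", tree.get_n_leaves())
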
Create new complex attributes (from existing attributes) which can be used to conduct attribute
tests in interior nodes: increases or decreases overfitting? - Answers increases
You have built a kNN model and you believe your model is overfitting the data. Which of the
following options would be appropriate in this case? - Answers Increase the value of k
In order to avoid the curse of dimensionality when building a kNN model, which of the following
options would be appropriate? - Answers Decrease the number of features we use
When building a kNN model, what happens to the boundaries when you increase the value of k?
- Answers The boundaries are smoothed out
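A rough way to see the effect of k without plotting boundaries is to compare training accuracy against cross-validated accuracy at small and large k (dataset and k values are illustrative): at k = 1 the training score is perfect but the holdout score lags (jagged, overfit boundaries), while at larger k the two converge as the boundaries smooth out.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)

for k in [1, 5, 25]:
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    cv = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    # k, training accuracy, cross-validated accuracy
    print(k, round(knn.score(X, y), 3), round(cv, 3))
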
Which penalty function can be used for automatic feature selection? - Answers L1-norm penalty
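A minimal scikit-learn sketch of L1-driven feature selection (dataset and regularization strength C are illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)

# The L1-norm penalty drives many coefficients exactly to zero,
# effectively selecting a subset of the features.
l1 = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print("features kept:", (l1.coef_ != 0).sum(), "of", X.shape[1])
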
In a 10-fold cross-validation, how many different estimates do we get of a specific performance
metric (accuracy for instance)? - Answers 10
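A minimal scikit-learn sketch: cross_val_score with cv=10 fits ten models, each tested on a different fold, and returns exactly ten fold-level estimates (the dataset and model are illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression())
scores = cross_val_score(model, X, y, cv=10)
print(len(scores), scores)   # 10 separate accuracy estimates
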
AUC larger than 0.5 indicates classifier performance better than the random guessing strategy. -
Answers true
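A quick sanity check with synthetic scores (sample size and noise level are made up):

import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
# Scores unrelated to the labels: AUC ~ 0.5, the random-guessing baseline.
print(roc_auc_score(y_true, rng.random(1000)))
# Scores correlated with the true label: AUC clearly above 0.5.
print(roc_auc_score(y_true, y_true + rng.normal(0, 1, 1000)))
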
High classification accuracy in the test set always indicates a good classifier. - Answers false
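A classic illustration with an imbalanced synthetic label set (the 99/1 split is made up): a useless classifier that always predicts the majority class still scores 99% accuracy.

import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([0] * 990 + [1] * 10)
y_pred = np.zeros_like(y_true)   # always predict the majority class
print(accuracy_score(y_true, y_pred))   # 0.99, yet the classifier is useless
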