What statement is INCORRECT about the k-nearest neighbor (k-NN) method?
A) Different k values can change the performance of the classifier
B) Too small a value for k may lead to over-fitting
C) When k=1 (the closest record), the classifier's performance is maximum
D) k is an arbitrary number that can be selected by trial-and-error - Answers C)
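A minimal sketch of this trial-and-error selection of k, using cross-validation in scikit-learn; the dataset and the candidate k values are illustrative assumptions, not from the original question:

```python
# Trial-and-error selection of k via 5-fold cross-validation.
# The dataset (iris) and candidate k values are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

for k in [1, 3, 5, 7, 15]:
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    print(f"k={k:2d}  mean CV accuracy={scores.mean():.3f}")
# k=1 is not automatically best; comparing cross-validated accuracy
# across k values is the usual way to pick one.
```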
The main difference between k-NN classifiers and k-NN regression models is that the former
does not need a distance function, while the latter uses the Euclidean distance function.
A) True
B) False - Answers B)
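Both model types in fact rely on a distance function (Euclidean by default in scikit-learn); they differ only in how the neighbors' values are combined. A small sketch with toy data assumed purely for illustration:

```python
# k-NN classification vs. k-NN regression: same distance computation,
# different aggregation of the k neighbors. Toy data is illustrative.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y_class = np.array([0, 0, 1, 1, 1])            # class labels
y_value = np.array([0.1, 0.9, 2.1, 2.9, 4.2])  # numeric targets

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y_class)  # majority vote
reg = KNeighborsRegressor(n_neighbors=3).fit(X, y_value)   # neighbor average

print(clf.predict([[2.5]]))  # class chosen by vote among the 3 nearest points
print(reg.predict([[2.5]]))  # mean target of the same 3 nearest points
```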
With the k-NN model for classification, after we have determined the k nearest neighbors of a new
data record, how is the class predicted?
A) Average of the neighbors
B) Majority vote determines the predicted class
C) Through a linear combination of neighbors
D) Through a logistic regression between the neighbors - Answers B)
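A from-scratch sketch of that majority-vote step, assuming Euclidean distance and a tiny hand-made training set (all values illustrative):

```python
# Majority-vote prediction for a new record, assuming Euclidean distance.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    # Euclidean distance from the new record to every training record
    dists = np.linalg.norm(X_train - x_new, axis=1)
    # indices of the k closest training records
    nearest = np.argsort(dists)[:k]
    # majority vote among the k neighbors' class labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[1, 1], [2, 2], [8, 8], [9, 9], [1, 2]])
y_train = np.array(["A", "A", "B", "B", "A"])
print(knn_predict(X_train, y_train, np.array([1.5, 1.5])))  # -> "A"
```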
What can cause the over-fitting problem in k-NN classifiers?
A) splitting the data set
B) incorrect distance function
C) too small values of k
D) too large values of k - Answers C)
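An illustration of the over-fitting effect: with k=1 the classifier essentially memorizes the training data (near-perfect training accuracy) but generalizes worse on noisy data. The synthetic dataset below is an assumption for demonstration:

```python
# Too small a k over-fits: compare training vs. test accuracy.
# flip_y=0.2 injects label noise so memorization hurts generalization.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=400, flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for k in (1, 15):
    m = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    print(f"k={k:2d}  train={m.score(X_tr, y_tr):.2f}  test={m.score(X_te, y_te):.2f}")
```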
Evaluation of classification models depends on our choice of performance metric, which in turn
depends on the problem that we are trying to solve.
A) True
B) False - Answers A)
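For instance, the same predictions can look strong on accuracy yet weak on recall; the label vectors below are made up to illustrate the point:

```python
# The "right" metric depends on the problem: with a rare positive class,
# accuracy can look good while recall reveals missed positives.
from sklearn.metrics import accuracy_score, recall_score

y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]   # rare positive class
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]   # misses one of two positives

print(accuracy_score(y_true, y_pred))  # 0.9 - looks strong
print(recall_score(y_true, y_pred))    # 0.5 - half the positives missed
```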
What is a propensity score?
A) an indicator of the correct cut-off value
B) an arbitrary number assigned to each record
C) a measure that shows accuracy of the model
D) predicted probability of class membership - Answers D)
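A sketch of propensity scores as predicted class-membership probabilities, using scikit-learn's predict_proba; the dataset and the 0.5 cut-off are assumptions:

```python
# Propensity scores: predicted probability of class membership per record.
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=5).fit(X, y)

propensities = model.predict_proba(X[:5])[:, 1]  # P(class = 1) per record
print(propensities)
print(propensities >= 0.5)  # a cut-off turns scores into class predictions
```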
What is the error rate of the following confusion matrix? (rounded to 2 decimal places)
A) 0.58
B) 0.46
C) 0.59
D) 0.41 - Answers D)
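The confusion matrix image is not reproduced above, so the counts in this sketch are hypothetical, chosen only so the arithmetic matches the keyed answer; the error-rate formula itself is standard:

```python
# Error rate = (FP + FN) / total. Counts below are hypothetical,
# not recovered from the missing confusion matrix image.
tp, fn = 40, 19
fp, tn = 22, 19

error_rate = (fp + fn) / (tp + fn + fp + tn)
print(round(error_rate, 2))  # 0.41 with these illustrative counts
```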
The cost of misclassification is always the same for false negative and false positive cases.
A) True
B) False - Answers B)
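A sketch of why asymmetric costs matter: when false negatives are costlier than false positives, a different cut-off can lower the total expected cost. All cost values and counts below are illustrative:

```python
# Asymmetric misclassification costs: FN penalized 5x more than FP here.
def expected_cost(fp, fn, cost_fp=1.0, cost_fn=5.0):
    # total misclassification cost with asymmetric penalties
    return fp * cost_fp + fn * cost_fn

# two hypothetical cut-offs evaluated on the same validation set
print(expected_cost(fp=10, fn=8))   # higher cut-off: fewer FPs, more FNs -> 50.0
print(expected_cost(fp=25, fn=2))   # lower cut-off: more FPs, fewer FNs -> 35.0
```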
What is the fall-out score of the following confusion matrix given that "1" is positive? (rounded
to 2 decimal places)
A) 0.45
B) 0.53
C) 0.47
D) 0.36 - Answers C)
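Fall-out is the false positive rate, FP / (FP + TN). As above, the counts in this sketch are hypothetical, chosen to match the keyed answer rather than recovered from the missing matrix:

```python
# Fall-out (false positive rate) = FP / (FP + TN). Hypothetical counts.
fp, tn = 47, 53
fallout = fp / (fp + tn)
print(round(fallout, 2))  # 0.47
```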