COMP 682 Data Mining Final Exam
Questions and Answers 2026
1
Precision (a or positive)
or
Positive Predictive Value (PPV)
TP/(TP+FP)
Positive Predictive Value (PPV) = TP/(TP+FP)
2
Precision (b or negative)
or
Negative Predictive Value (NPV)
TN/(TN+FN)
Negative Predictive Value (NPV) = TN/(TN+FN)
3
Recall (a or positive) = True Positive Rate (a) or TPR(a)
Sensitivity = TP/(TP+FN)
4
Recall (b or negative) = True Negative Rate (b)
Specificity = TN/(TN+FP)
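As a quick check on cards 1 to 4, the four formulas can be worked through in R. This is only a sketch: the counts below are made up for illustration, and class "a" is assumed to be the positive class.

```r
# Hypothetical counts for a two-class problem (class "a" = positive).
TP <- 40  # actual a, predicted a
FP <- 10  # actual b, predicted a
FN <- 5   # actual a, predicted b
TN <- 45  # actual b, predicted b

precision_a <- TP / (TP + FP)  # Positive Predictive Value (PPV)
precision_b <- TN / (TN + FN)  # Negative Predictive Value (NPV)
recall_a    <- TP / (TP + FN)  # Sensitivity, True Positive Rate
recall_b    <- TN / (TN + FP)  # Specificity, True Negative Rate

print(c(PPV = precision_a, NPV = precision_b,
        Sensitivity = recall_a, Specificity = recall_b))
```

Precision divides by a column of predictions (everything predicted as that class), while recall divides by a row of actuals (everything that truly belongs to that class).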
5
F-MEASURE OR F-SCORE
3 items (2 bullet points and the formula)
1) The harmonic mean of precision and recall.
2) It can be used as a single measure of class performance.
3) F-measure (a) = (2 x Precision x Recall) / (Precision + Recall)
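A small R sketch of the F-measure formula, using made-up precision and recall values, shows why the harmonic mean is preferred over a plain average: it is pulled toward the weaker of the two numbers.

```r
# Hypothetical precision and recall for class "a".
precision <- 0.8
recall    <- 0.5

# F-measure: harmonic mean of precision and recall.
f_measure <- (2 * precision * recall) / (precision + recall)

# Ordinary (arithmetic) mean, for comparison only.
arithmetic_mean <- (precision + recall) / 2

print(c(F_measure = f_measure, Arithmetic_mean = arithmetic_mean))
```

With these values the F-measure comes out below the arithmetic mean of 0.65, so a model cannot score well on it by being strong on precision alone or recall alone.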
6
Accuracy
How close a measurement is to the true value.
7
Accuracy is the overall ___________________ of the model and is calculated as the sum of
__________________________ divided by the total number of ______________________________.
Accuracy is the overall [correctness] of the model and is calculated as the sum of [correct
classifications] divided by the total number of [classifications].
8
Accuracy Formula
Accuracy = (TP+TN) / Total Classifications = (TP+TN) / (TP+TN+FP+FN)
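The accuracy formula can be checked in R in two equivalent ways, sketched here with made-up values: from the four counts, or directly from vectors of actual and predicted labels.

```r
# Hypothetical counts (class "a" = positive).
TP <- 40; TN <- 45; FP <- 10; FN <- 5

# Correct classifications divided by all classifications.
accuracy_counts <- (TP + TN) / (TP + TN + FP + FN)

# The same idea from label vectors: the share of predictions that match.
actual    <- c("a", "a", "b", "b", "a")
predicted <- c("a", "b", "b", "b", "a")
accuracy_labels <- mean(predicted == actual)

print(c(from_counts = accuracy_counts, from_labels = accuracy_labels))
```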
9
True Positive Rate or True Positive Rate (a)
TP / (TP + FN)
10
True Negative Rate or True Negative Rate (b)
TN / (TN + FP)
11
R formula for splitting data
(On Practice Exam 1)
library(caret)
inTrain <- createDataPartition(y = df$variable, p=%, list=FALSE)
train_df <- df[inTrain,]
test_df <- df[-inTrain,]
OR
train_target <- df[inTrain,8]
test_target <- df[-inTrain,8]
train_input <- df[inTrain,-8]
test_input <- df[-inTrain,-8]
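A filled-in version of the split may help when practicing. The sketch below assumes the caret package is installed and uses the built-in iris data in place of df, with Species (column 5) standing in for the target variable and 0.7 as an arbitrary choice for p; none of these values come from the exam.

```r
library(caret)

set.seed(682)  # arbitrary seed so the split is repeatable

# iris stands in for df; Species (column 5) stands in for the target variable.
inTrain <- createDataPartition(y = iris$Species, p = 0.7, list = FALSE)

# Option 1: keep inputs and target together.
train_df <- iris[inTrain, ]
test_df  <- iris[-inTrain, ]

# Option 2: same split, with the target column separated from the inputs.
train_target <- iris[inTrain, 5]
test_target  <- iris[-inTrain, 5]
train_input  <- iris[inTrain, -5]
test_input   <- iris[-inTrain, -5]

# Every row lands in exactly one of the two sets.
print(c(train = nrow(train_df), test = nrow(test_df)))
```

The negative index (-inTrain) selects every row that was not sampled for training, which is why the same inTrain object must be used for both halves.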
12
R formula for C5.0
C5.0(target_variable ~ ., data = df, control = C5.0Control(CF = .###))
13
C5.0 Confidence Factor (CF)
A value between 0 and 1 that sets the confidence level used when pruning the tree (C5.0Control defaults to CF = 0.25).
Decreasing the confidence factor prunes more aggressively and so decreases the tree size, specifically the number of nodes.
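To see the effect of CF in practice, two trees can be fitted with different values and their sizes compared. This sketch assumes the C50 package is installed and uses the built-in iris data as a stand-in for the course data; the CF values 0.25 and 0.01 are chosen only for illustration.

```r
library(C50)

# iris stands in for the course data frame; Species is the target.
fit_default <- C5.0(Species ~ ., data = iris,
                    control = C5.0Control(CF = 0.25))
fit_pruned  <- C5.0(Species ~ ., data = iris,
                    control = C5.0Control(CF = 0.01))

# The size element of a fitted C5.0 model reports the tree size.
print(c(CF_0.25 = fit_default$size, CF_0.01 = fit_pruned$size))
```

On a small, clean data set such as iris the two sizes may well be the same; the difference tends to show up on larger or noisier data, where the lower CF prunes more branches.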
14
Draw a CONFUSION MATRIX (CONTINGENCY TABLE)
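One way to practice this card is to build the matrix in R with table(). The sketch below uses made-up labels and assumes class "a" is the positive class, with actual classes on the rows and predicted classes on the columns; some textbooks and packages put them the other way around.

```r
# Layout assumed here (rows = actual, columns = predicted):
#
#              Predicted a   Predicted b
#   Actual a       TP            FN
#   Actual b       FP            TN

# Hypothetical actual and predicted labels for eight cases.
actual    <- factor(c("a", "a", "a", "b", "b", "b", "a", "b"),
                    levels = c("a", "b"))
predicted <- factor(c("a", "a", "b", "b", "b", "a", "a", "b"),
                    levels = c("a", "b"))

# Cross-tabulate actual against predicted.
cm <- table(Actual = actual, Predicted = predicted)
print(cm)

# Pull the four cells out by name.
TP <- cm["a", "a"]
FN <- cm["a", "b"]
FP <- cm["b", "a"]
TN <- cm["b", "b"]
```

The diagonal cells (TP and TN) are the correct classifications; the off-diagonal cells (FN and FP) are the errors.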