ISYE 6501 FINAL EXAM QUESTIONS
AND ANSWERS. VERIFIED 2025/2026.
Support Vector Machine - ANS A supervised learning classification model. It identifies the
extreme points in the data (the support vectors) and places the margin boundaries against them.
The hyperplane midway between these margins is the classifier.
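As a rough illustration of how a hyperplane acts as the classifier, the sketch below assumes a linear SVM has already been trained and has produced a weight vector w and offset b. The values of w and b here are made up for the example, and the margin width of 2/||w|| assumes the usual convention that the margin boundaries are w.x + b = +1 and w.x + b = -1.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign says which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def margin_width(w):
    """Distance between the two margin boundaries, assuming they are w.x + b = +/-1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical trained hyperplane: x1 + x2 - 3 = 0 (assumed values, not fitted here)
w, b = [1.0, 1.0], -3.0

print(classify(w, b, [3.0, 2.0]))  # point on the positive side
print(classify(w, b, [0.5, 1.0]))  # point on the negative side
print(margin_width(w))
```

Only the points that end up on the margin boundaries determine w and b, which is why the fitted model needs just a subset of the training data.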
SVM Pros/Cons - ANS Pros: It works well when there is a clear margin of separation between classes.
It is effective in high dimensional spaces.
It is effective in cases where the number of dimensions is greater than the number of samples.
It uses a subset of training points in the decision function (called support vectors), so it is also
memory efficient.
Cons: Not well suited to very large data sets, because training time grows quickly with the number of samples.
Performs poorly when the data set is noisy, i.e. when the target classes overlap.
Doesn't directly provide probability estimates.
K-nearest neighbor (K-NN) - ANS A supervised classification algorithm. Looks at the k
closest points to the new one and classifies it as whichever class is most common among them.
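The nearest-neighbor vote can be sketched in a few lines. This is a minimal example that assumes Euclidean distance and a simple majority vote; the function name knn_classify and the toy data are illustrative only.

```python
import math
from collections import Counter

def knn_classify(train, new_point, k=3):
    """train is a list of (features, label) pairs.

    Returns the most common label among the k training points
    closest to new_point (Euclidean distance).
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], new_point))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy labelled data (made up for illustration)
train = [
    ((1.0, 1.0), "red"), ((1.5, 2.0), "red"), ((2.0, 1.0), "red"),
    ((6.0, 6.0), "blue"), ((7.0, 5.5), "blue"), ((6.5, 7.0), "blue"),
]

print(knn_classify(train, (1.2, 1.5), k=3))
print(knn_classify(train, (6.2, 6.1), k=3))
```

Because every prediction sorts the whole training set by distance, the cost grows with the amount of stored data, and unscaled features with large ranges dominate the distance.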
K-nearest neighbor (K-NN) Pros/Cons - ANS Pros: No assumptions about data
Easy to understand/interpret
Versatile
1 @COPYRIGHT 2025/2026 ALLRIGHTS RESERVED.
Cons: Computationally expensive because the algorithm stores all training data and computes distances to it at prediction time
Sensitive to irrelevant features and to the scale of the data
k-fold cross validation - ANS Validation technique where the data is divided into k
subsets (folds). Each subset is used once for testing while the rest are used for training. The
algorithm rotates through each subset and averages the results.
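The rotation through the folds can be sketched as follows. This is a minimal version that splits indices in order without shuffling (an assumption made to keep the example deterministic); k_fold_indices, cross_validate, and score_fn are hypothetical names, and score_fn stands in for "train on these rows, test on those rows, return a score".

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each fold serves as the test set exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def cross_validate(n_samples, k, score_fn):
    """Average score_fn(train_idx, test_idx) over all k folds."""
    scores = [score_fn(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

for train_idx, test_idx in k_fold_indices(10, 3):
    print("train:", train_idx, "test:", test_idx)
```

Note that the final number is an average over folds, so a single unusually bad fold can be hidden in it.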
K Fold cross Validation Pros/Cons - ANS Pros: Validates performance of the model
Can create balance across the classes of the predicted feature
Cons: Doesn't work well with time series data
The aggregate score of your model could miss some important extreme values, or average
them out so they're harder to pick up on
k-means clustering - ANS Unsupervised learning heuristic that starts by assigning k
cluster centers, then assigns each data point to the nearest center based on distance.
The center point of each cluster is then recalculated and all data points are clustered again.
Repeat the process until no data points change clusters. The ideal number of clusters can be
identified via an elbow diagram.
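The assign-then-recompute loop can be sketched as below. This is a minimal version assuming Euclidean distance and caller-supplied starting centers; the function name k_means, the max_iter cap, and the choice to keep the old center for an empty cluster are assumptions made for the example.

```python
import math

def k_means(points, initial_centers, max_iter=100):
    """Assign points to the nearest center, recompute centers, repeat
    until no point changes cluster (or max_iter is reached)."""
    centers = list(initial_centers)  # copy so the caller's list is not modified
    assignments = None
    for _ in range(max_iter):
        new_assignments = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster
        assignments = new_assignments
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # assumption: keep the old center if a cluster is empty
                centers[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centers, assignments

# Toy data with two obvious groups (made up for illustration)
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, assignments = k_means(points, [(0, 0), (10, 10)])
print(centers)
print(assignments)
```

Different starting centers can lead to different final clusters, so in practice the loop is often run from several starting points and the best result kept.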
k-means pros and cons - ANS Pros: Simple to implement
Scales well to large data sets
Easily adaptable
Cons: k must be chosen manually
Results depend on the initial cluster centers
Sensitive to outliers