complete solutions 2024/2025
Identify whether the task required is supervised or unsupervised learning:
Predicting whether a company will go bankrupt based on comparing its financial
data to those of similar bankrupt and non-bankrupt firms. - ANSWER- Supervised
learning; the outcome (bankrupt or not) is known for the firms used for comparison
Identify whether the task required is supervised or unsupervised learning:
Printing of custom discount coupons at the conclusion of a grocery store
checkout based on what you just bought and what others have bought
previously. - ANSWER- Unsupervised learning; outcomes are unknown
True or false: The test data are used to build models, or to further tweak the
model or improve its fit. - ANSWER- False
_____________ is used for assessing the performance of the final chosen model
on new data - ANSWER- The test data partition
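As a rough illustration of the partitioning idea in the two cards above, the sketch below randomly splits a set of records into training, validation, and test partitions. The function name partition, the 60/20/20 proportions, and the seed are assumptions made for this example only; they are not taken from the course material or from JMP.

```python
import random

def partition(records, fractions=(0.6, 0.2, 0.2), seed=1):
    """Randomly split records into training, validation, and test partitions.

    The 60/20/20 default is an assumed, illustrative choice.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n = len(shuffled)
    n_train = round(fractions[0] * n)
    n_valid = round(fractions[1] * n)
    train = shuffled[:n_train]                   # used to build models
    valid = shuffled[n_train:n_train + n_valid]  # used to compare and tune models
    test = shuffled[n_train + n_valid:]          # used only to assess the final model
    return train, valid, test

train, valid, test = partition(range(100))
print(len(train), len(valid), len(test))
```

The test partition is set aside and touched only once, after the final model has been chosen.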
When a model is fit to training data, zero error with those data is not necessarily
good. This special case is called ______. - ANSWER- Overfitting
True or false: Bar charts are useful for comparing a single statistic (e.g. average,
count, percentage) across groups. The height of the bar represents the value of
the statistic, and different bars correspond to different groups. - ANSWER- True
Which of the following are the most popular visualization tools in JMP Pro? -
ANSWER- Graph Builder, Fit Y by X, Distribution
Scatter plots play an important role in prediction. The next step can be developing a
model. Scatter plots provide information about relationships (linear or non-linear)
between variables. The variables in a scatter plot are ________. - ANSWER- Numerical
In a box plot, the box includes 50% of the data, the horizontal line represents
(i)____________, and the top and bottom of the box represent (ii)________,
respectively. - ANSWER- (i) the Median (50th percentile); (ii) the 75th and 25th
percentiles
In JMP a diamond is displayed in the box, where the center of the diamond is
_________. - ANSWER- The mean
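The quantities named in the two box plot cards above can be computed directly. This is a minimal sketch using Python's statistics module; the helper name box_summary and the data values are hypothetical, and the default quantile method in statistics.quantiles may differ slightly from the one JMP uses.

```python
import statistics

def box_summary(values):
    """Return the quartiles and the mean that a box plot displays."""
    # quantiles(n=4) returns the three cut points: 25th, 50th, 75th percentiles
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return {
        "q1": q1,        # bottom of the box (25th percentile)
        "median": q2,    # horizontal line in the box (50th percentile)
        "q3": q3,        # top of the box (75th percentile)
        "mean": statistics.mean(values),  # center of the JMP diamond
    }

data = [12, 15, 17, 19, 22, 24, 28, 31, 45]  # hypothetical values
summary = box_summary(data)
print(summary)
```

With a right-skewed sample like this one, the mean sits above the median, which is why the diamond and the median line need not coincide.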
The density ellipsoid in a scatterplot matrix is a good graphical indicator of the
correlation between two variables. The ellipsoid collapses diagonally as the
correlation between the two variables approaches either 1 or -1.
The ellipsoid is more circular if the two variables are more correlated. (TRUE or
FALSE?) - ANSWER- False; The ellipsoid is more circular (less diagonally
oriented) if the two variables are less correlated
True or False: Sensitivity and Specificity are plotted on an ROC Curve. -
ANSWER- True
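To make the ROC card above concrete: an ROC curve is traced by computing sensitivity and specificity at many cutoff values and plotting sensitivity against 1 - specificity. The sketch below computes one such point. The function name, the labels, and the predicted probabilities are all made up for illustration.

```python
def sensitivity_specificity(actual, predicted_prob, cutoff):
    """Classify as 1 when probability >= cutoff, then return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for y, p in zip(actual, predicted_prob):
        predicted = 1 if p >= cutoff else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1 and predicted == 0:
            fn += 1
        elif y == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn)  # share of actual 1s correctly classified
    specificity = tn / (tn + fp)  # share of actual 0s correctly classified
    return sensitivity, specificity

# Hypothetical labels and predicted probabilities
actual = [1, 1, 1, 0, 0, 0, 1, 0]
probs = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]

for cutoff in (0.0, 0.5, 1.0):
    sens, spec = sensitivity_specificity(actual, probs, cutoff)
    print(cutoff, sens, 1 - spec)  # one (cutoff, y, x) point on the ROC curve
```

Lowering the cutoff raises sensitivity at the cost of specificity, which is the trade-off the curve displays.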
To obtain an honest estimate of future classification error, we use the
classification matrix that is computed from ________. - ANSWER- Validation data
True or False: The classification matrix, also called confusion matrix, gives
estimates of the true classification and misclassification rates. - ANSWER- True
How do you calculate the error rate on a classification matrix (Confusion Chart)? -
ANSWER- Total incorrect predictions / total predictions
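The error rate formula above can be sketched as follows. The layout (rows as actual classes, columns as predicted classes) and the counts are assumptions for this example; the function name error_rate is hypothetical.

```python
def error_rate(confusion):
    """Error rate = total incorrect predictions / total predictions.

    confusion[i][j] is the number of records whose actual class is i
    and whose predicted class is j (an assumed layout).
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))  # diagonal
    return (total - correct) / total

# Hypothetical counts: rows = actual, columns = predicted
matrix = [[50, 10],
          [5, 35]]
print(error_rate(matrix))  # (10 + 5) / 100
```

The off-diagonal cells are the misclassifications, so the same function works for more than two classes.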
The 'portion' of a lift curve represents what percent of the data, and how is this
portion sorted? - ANSWER- A portion of p (e.g., p = 0.2) represents the top 100p%
(20%) of the data, as sorted by their predicted probability of predictor