Dimension - Answers ________ refers to the number of input variables (degrees of freedom).
Over-fitting - Answers ________ is caused by an overly complex model that is too flexible; can also mean
a situation in which there are too many variables in your model.
"Which split is best?" - Answers ___________ refers to the Splitting Criteria method.
Splitting Criteria methods - Answers The ___________ that are most widely used are based on the Pearson
Chi-Square test, the Gini index, and entropy; all three measure the difference in class distributions across
the child nodes and usually give similar results.
Pure child node - Answers If a split ends in a __________, the split is undisputedly best.
Parent node - Answers If the expected target is the same in the child nodes as in the ____________, then
the split is WORTHLESS.
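
Worked example (splitting criteria): a minimal Python sketch of how the Gini index and entropy score a split, showing why a pure child node gives the largest possible gain and why a split whose children match the parent gains nothing. The labels and the helper names (gini, entropy, impurity_reduction) are hypothetical and chosen only for illustration; the Chi-Square criterion is not shown.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def impurity_reduction(parent, children, impurity):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted

# Hypothetical binary target: four events (1) and four non-events (0).
parent = [1, 1, 1, 1, 0, 0, 0, 0]
pure_split = [[1, 1, 1, 1], [0, 0, 0, 0]]       # every child node is pure
worthless_split = [[1, 1, 0, 0], [1, 1, 0, 0]]  # children mirror the parent

gini_pure = impurity_reduction(parent, pure_split, gini)
gini_worthless = impurity_reduction(parent, worthless_split, gini)
entropy_pure = impurity_reduction(parent, pure_split, entropy)
entropy_worthless = impurity_reduction(parent, worthless_split, entropy)

print(gini_pure, gini_worthless, entropy_pure, entropy_worthless)
```

Under both criteria the pure split removes all of the parent's impurity, while the worthless split removes none, which is the sense in which the criteria usually agree.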
Pruning - Answers _______ creates a sequence of trees of increasing complexity.
Assessment criterion - Answers _______ is needed for deciding the best (sub) tree.
Benefits of trees - Answers _______ include interpretability, support for mixed measurement scales, and
handling of missing values.
Interpretability - Answers _______ is a tree-structured presentation.
Binary outcome - Answers We CANNOT predict a __________.
Binary - Answers When a response variable is ______, the data set (from past information) consists of zeroes and ones.
Simple Linear Regression - Answers A model with one predictor variable is called __________.
Residual - Answers The _______ is the unexplained part of Y in the Simple Linear Regression formula.
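
Worked example (residual): a minimal Python sketch, assuming an ordinary least squares fit, of how the residual is the part of Y left over after the fitted line. The data points and the function name fit_simple_linear_regression are hypothetical.

```python
def fit_simple_linear_regression(x, y):
    """Closed-form least squares estimates for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: one predictor x and one response y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_linear_regression(x, y)
fitted = [intercept + slope * xi for xi in x]

# Residual = observed Y minus fitted Y (the unexplained part).
residuals = [yi - fi for yi, fi in zip(y, fitted)]
print(residuals)
```

With an intercept in the model, the least squares residuals sum to zero up to rounding error.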
Dummy - Answers Indicator variables are also known as _________ variables.
Indicator variables - Answers _________ allow for the inclusion of qualitative variables in the model.
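
Worked example (dummy variables): a minimal Python sketch of coding a qualitative variable as 0/1 indicators, assuming one level is held out as the reference category. The function name dummy_encode, the is_ prefix, and the color values are hypothetical.

```python
def dummy_encode(values, reference):
    """Return one dict of 0/1 indicators per value, omitting the reference level."""
    levels = sorted(set(values) - {reference})
    return [{f"is_{level}": int(v == level) for level in levels} for v in values]

# Hypothetical qualitative input with three levels.
colors = ["red", "green", "blue", "green"]
encoded = dummy_encode(colors, reference="blue")

for color, row in zip(colors, encoded):
    print(color, row)
```

A variable with k levels needs only k - 1 indicators here, because the reference level is represented by all indicators being zero.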
Model - Answers In Logistic Regression, the _______ is trying to predict the probability of an event.
Logistic Regression assumption - Answers The idea that "the logit transformation of the probabilities of
the target value results in a linear relationship with the input variables" is called __________.
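
Worked example (logit): a minimal Python sketch of the logit transformation and its inverse, illustrating that the logit of the predicted probability is linear in the input. The coefficients are hypothetical values, not estimates from any data.

```python
import math

def logit(p):
    """Log-odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x, intercept, slope):
    """Probability of the event when logit(p) = intercept + slope * x."""
    return inverse_logit(intercept + slope * x)

# Hypothetical coefficients, chosen only for illustration.
intercept, slope = -3.0, 1.5

for x in [0.0, 2.0, 4.0]:
    p = predict_probability(x, intercept, slope)
    # logit(p) recovers the linear predictor intercept + slope * x.
    print(x, p, logit(p))
```

The probabilities themselves follow an S-shaped curve in x; only their logits fall on a straight line.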
Logistic Regression - Answers In __________, the target is a discrete (binary or ordinal) variable.
Input variables - Answers In both Logistic and Linear Regression, the _________ can have any measurement
level.