X - correct answer-attribute, predictor, independent variable, input
y - correct answer-class, response, dependent variable, output
Classification - correct answer-predicts categorical labels
Prediction (regression) - correct answer-predicts continuous numeric values
Decision Tree - correct answer-a non-parametric supervised learning
algorithm, which is utilized for both classification and regression tasks. It has a
hierarchical, tree structure, which consists of a root node, branches, internal
nodes and leaf nodes.
K-Nearest Neighbors - correct answer-A data mining method that predicts
(classifies or estimates) an observation i's outcome value based on the k
observations most similar to observation i with respect to the input variables.
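The K-Nearest Neighbors card above can be illustrated with a minimal sketch. It assumes Euclidean distance and a majority vote for classification; the function name `knn_predict` and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training observation
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # Keep the k observations most similar (closest) to the query
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the labels of those k neighbors
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (made up): two well-separated groups
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
train_y = ["a", "a", "a", "b", "b", "b"]
prediction = knn_predict(train_X, train_y, (2.0, 2.0), k=3)
```

Every prediction scans the whole training set, which is why the cost at prediction time grows with sample size.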
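Naive Bayes Classifier - correct answer-a probabilistic classifier that applies
Bayes' theorem, assuming the input variables are conditionally independent
given the class, to estimate the probability of each class from how often
related events occurred in the training data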
Support Vector Machine - correct answer-Supervised learning tool that seeks a
dividing hyperplane in any number of dimensions; it can be used for
classification or regression
Neural Networks - correct answer-a method in artificial intelligence that
teaches computers to process data in a way that is inspired by the human
brain.
Decision Tree Hyperparameters - correct answer-Many, including
min_samples_leaf, min_samples_split, max_leaf_nodes, and
min_impurity_decrease
K-Nearest Neighbor Hyperparameters - correct answer-K-value and distance
function
Decision tree disadvantages - correct answer--Prone to overfitting and
sensitive to noise and outliers in the training data
-tree can grow to be very complex when trained on complex datasets
K-Nearest Neighbor disadvantages - correct answer--K has to be chosen
carefully
-High computation cost at prediction time if the sample size is large
What are two variable selection criteria? - correct answer--Entropy and
Information Gain
-Gini Index
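As a rough illustration of the Gini criterion named above, the sketch below computes the Gini index as 1 minus the sum of squared class proportions. The function name `gini_index` is a hypothetical helper, not a library call.

```python
from collections import Counter

def gini_index(labels):
    """Gini index of a non-empty list of class labels: 1 - sum(p_i ** 2)."""
    n = len(labels)
    # p_i is the proportion of examples that belong to class i
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

pure = gini_index(["a", "a", "a", "a"])    # a single class
mixed = gini_index(["a", "a", "b", "b"])   # two equally common classes
```

A node holding a single class scores 0, and an even two-class mix scores 0.5, so a split is preferred when it lowers this value in the child nodes.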
Pure when Entropy = - correct answer-0
Impure when Entropy = - correct answer-1 (the maximum for two equally likely classes, using log base 2)
Entropy - correct answer-a measure of the disorder (impurity) in a set of
examples: Entropy = -sum(p_i * log2(p_i)), where p_i is the proportion of
examples that belong to class i.
Why the minus in the Entropy formula - correct answer-Probabilities are
always between 0 and 1.
log(x) is negative when 0 < x < 1 (and zero when x = 1).
Each term p*log(p) in the sum is therefore negative or zero, so the sum itself
is never positive; the leading minus makes the result non-negative.
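The sign argument can be checked numerically. This is a minimal sketch assuming log base 2 and class proportions estimated from label counts; `entropy_terms` and `entropy` are hypothetical helper names.

```python
import math
from collections import Counter

def entropy_terms(labels):
    """Return the individual p * log2(p) terms, one per class."""
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]
    return [p * math.log2(p) for p in probs]

def entropy(labels):
    """Entropy = -sum(p * log2(p)); the minus flips the non-positive sum."""
    return -sum(entropy_terms(labels))

terms = entropy_terms(["a", "a", "b", "b"])   # every term is <= 0
mixed = entropy(["a", "a", "b", "b"])         # two equally likely classes
pure = entropy(["a", "a", "a", "a"])          # a single class
```

Here each term is -0.5, so the sum is -1 and the entropy is 1; the single-class list has entropy 0.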
Information Gain - correct answer-the reduction in entropy achieved by
splitting the data on an attribute: the entropy before the split minus the
weighted average entropy of the subsets after the split
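A small worked sketch of information gain, assuming entropy with log base 2 and child subsets weighted by their share of the parent's examples. The helper names and the toy labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy (log base 2) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted average entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]   # each child is pure
useless_split = [["yes", "no"], ["yes", "no"]]   # children as mixed as the parent

gain_perfect = information_gain(parent, perfect_split)
gain_useless = information_gain(parent, useless_split)
```

The perfect split removes all disorder (gain of 1), while the useless split removes none (gain of 0), so a tree builder using this criterion would choose the first.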
Random forests - correct answer--for supervised machine learning, where
there is a labeled target variable
-used for solving regression (numeric target variable) and classification
(categorical target variable) problems
-an ensemble method, meaning they combine predictions from other models
-Each of the smaller models in the random forest ensemble is a decision tree
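The ensemble idea can be sketched in plain Python. This is a simplified illustration, not a full random forest: each "tree" is only a depth-1 stump, and the bootstrap sampling, the random feature subset of size sqrt(number of features), and all function names are assumptions made for the example.

```python
import random
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y, features):
    """Fit a depth-1 tree: choose the (feature, threshold) with the fewest errors."""
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive observed values
        for threshold in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            left = [label for row, label in zip(X, y) if row[f] <= threshold]
            right = [label for row, label in zip(X, y) if row[f] > threshold]
            left_label, right_label = majority(left), majority(right)
            errors = (sum(l != left_label for l in left)
                      + sum(r != right_label for r in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:
        # No split possible (all values identical): always predict the majority class
        label = majority(y)
        return (features[0], float("inf"), label, label)
    return best[1:]

def predict_stump(stump, row):
    f, threshold, left_label, right_label = stump
    return left_label if row[f] <= threshold else right_label

def fit_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw n rows with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        # Each tree sees only a random subset of the features
        features = rng.sample(range(n_features), k=max(1, int(n_features ** 0.5)))
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return forest

def predict_forest(forest, row):
    # Classification: combine the trees' predictions by majority vote
    return majority([predict_stump(stump, row) for stump in forest])

# Toy data (made up): two well-separated classes
X = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 8), (9, 8), (8, 9), (9, 9)]
y = ["a", "a", "a", "a", "b", "b", "b", "b"]
forest = fit_forest(X, y)
```

For a numeric target, the same structure would average the trees' predictions instead of taking a vote.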
What is the best hyperplane? - correct answer-The one that maximizes the
margin, i.e. the distance from the hyperplane to the nearest data points of
each class
Margin - correct answer-the distance between the hyperplane and the data points closest to it
What is the name for the points closest to the hyperplane? - correct
answer-Support Vectors
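The margin and support vectors can be made concrete with a short sketch. It does not train an SVM: the hyperplane w·x + b = 0 and the points are hypothetical values chosen for illustration, and the code only measures distances to that fixed hyperplane using |w·x + b| / ||w||.

```python
import math

def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return abs(dot + b) / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical hyperplane x1 + x2 = 5 and made-up points on both sides of it
w, b = (1.0, 1.0), -5.0
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]

distances = [distance_to_hyperplane(w, b, p) for p in points]
# Margin: distance from the hyperplane to the closest points
margin = min(distances)
# Support vectors: the points that lie at exactly that distance
support_vectors = [p for p, d in zip(points, distances) if math.isclose(d, margin)]
```

With these values the two middle points are the support vectors; an SVM would search over w and b for the hyperplane that makes this minimum distance as large as possible.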