Two reasons to estimate f:
- Prediction
- Inference
Parametric methods
- Easy to estimate parameters in a linear function
- Model will usually not match the true unknown form of f
Non-parametric methods
- Avoids the (possibly wrong) assumption of a functional form for f
- A large number of observations is required to obtain an accurate estimate of f
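A minimal sketch of the contrast, assuming a made-up true f and numpy only; the linear fit and the tiny KNN regressor below are illustrations, not ISLR's code:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical true f (unknown in practice); the sine shape is made up
    def f(x):
        return np.sin(2 * x)

    x = rng.uniform(0, 3, 100)
    y = f(x) + rng.normal(0, 0.3, 100)

    # Parametric: assume f is linear and estimate just two parameters
    beta1, beta0 = np.polyfit(x, y, deg=1)

    # Non-parametric: KNN regression, no functional form assumed
    def knn_predict(x0, k=5):
        nearest = np.argsort(np.abs(x - x0))[:k]  # indices of k closest points
        return y[nearest].mean()

    x0 = 1.5
    print("truth:     ", f(x0))
    print("linear fit:", beta0 + beta1 * x0)  # biased: the true f is not linear
    print("KNN fit:   ", knn_predict(x0))     # flexible, but data-hungry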
Variance refers to the amount by which f̂ would change if we estimated it using a different training data set.
Bias refers to the error that is introduced by approximating a real-life problem, which may be
extremely complicated, by a much simpler model.
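A minimal simulation of that variance, assuming a made-up linear population (numpy only):

    import numpy as np

    rng = np.random.default_rng(1)

    # Refit the same linear model on many fresh training sets drawn from
    # one (hypothetical) population; the spread of the estimates is the variance
    slopes = []
    for _ in range(1000):
        x = rng.uniform(0, 1, 20)
        y = 3 * x + rng.normal(0, 1, 20)  # true slope is 3
        slopes.append(np.polyfit(x, y, 1)[0])

    print("variance of fitted slope across training sets:", np.var(slopes))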
K-nearest neighbours (KNN)
When K = 1, the decision boundary is overly flexible and finds patterns in the data that don’t
correspond to the Bayes decision boundary. This corresponds to a classifier that has low bias but very
high variance.
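A toy illustration of the K trade-off (made-up data; the small KNN classifier below is a sketch, not a library implementation):

    import numpy as np

    rng = np.random.default_rng(2)

    # Toy 2-class training data with a deliberately noisy boundary
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] + rng.normal(0, 0.8, 200) > 0).astype(int)

    def knn_classify(x0, k):
        d = np.linalg.norm(X - x0, axis=1)  # distance to every training point
        votes = y[np.argsort(d)[:k]]        # labels of the k nearest neighbours
        return int(votes.mean() > 0.5)      # majority vote

    x0 = np.array([0.05, 0.0])
    print("K=1: ", knn_classify(x0, 1))   # chases noise: low bias, high variance
    print("K=50:", knn_classify(x0, 50))  # smoother: higher bias, lower variance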
The Bayes classifier produces the lowest possible test error rate, called the Bayes error rate.
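The corresponding ISLR formula: the Bayes classifier assigns each observation to the class with the highest posterior probability, so

    \text{Bayes error rate} = 1 - E\!\left[\max_{j} \Pr(Y = j \mid X)\right]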
Chapter 3
Curse of dimensionality: As the number of features/dimensions grows, the amount of data we need
to generalize accurately grows exponentially
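A quick numeric illustration with made-up uniform data: the fraction of points that count as "near" a query shrinks exponentially with dimension.

    import numpy as np

    rng = np.random.default_rng(3)

    # Fraction of uniform points that are "local" to a query point, i.e.
    # within 0.1 of it along every axis, shrinks like 0.2**p with dimension p
    n = 100_000
    for p in (1, 2, 5, 10):
        X = rng.uniform(size=(n, p))
        near = np.all(np.abs(X - 0.5) < 0.1, axis=1).mean()
        print(f"p={p:2d}: fraction of local points = {near:.5f}")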
Chapter 4
LDA to classify more than 2 classes
Why do we need another method, when we have logistic regression?
There are several reasons:
- When the classes are well-separated, the parameter estimates for the logistic regression
model are surprisingly unstable. Linear discriminant analysis does not suffer from this
problem.
- If n is small and the distribution of the predictors X is approximately normal in each of the
classes, the linear discriminant model is again more stable than the logistic regression model.
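A minimal multi-class sketch using scikit-learn's LinearDiscriminantAnalysis; iris is just a convenient 3-class dataset, not one used in these notes:

    from sklearn.datasets import load_iris
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    # Iris has 3 classes, so this exercises the multi-class case directly
    X, y = load_iris(return_X_y=True)
    lda = LinearDiscriminantAnalysis().fit(X, y)

    print(lda.predict(X[:5]))        # predicted class labels (0, 1, 2)
    print(lda.predict_proba(X[:5]))  # posterior probability per class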
Check videos on LDA/QDA
Sensitivity is the percentage of true defaulters that are identified.
Specificity is the percentage of non-defaulters that are correctly identified.
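A small sketch of both rates from hypothetical, made-up predictions (1 = defaulter):

    import numpy as np

    # Hypothetical labels: 1 = defaulter, 0 = non-defaulter
    y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
    y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1])

    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))

    print("sensitivity:", tp / (tp + fn))   # true defaulters identified
    print("specificity:", tn / (tn + fp))   # non-defaulters correctly identified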
LDA is a much less flexible classifier than QDA, and so has substantially lower variance.
LDA tends to be a better bet than QDA if there are relatively few training observations and so
reducing variance is crucial.
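One way to see the flexibility gap: with p predictors and K classes, LDA estimates a single shared covariance matrix while QDA estimates one per class, so the covariance parameter counts are

    \underbrace{\tfrac{p(p+1)}{2}}_{\text{LDA}} \quad \text{vs.} \quad \underbrace{K \cdot \tfrac{p(p+1)}{2}}_{\text{QDA}}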