Summary

Summary of Statistical Learning (Machine Learning) Course for Econometrics Students at UvA taught by Yi HE

Published on: 03-04-2024

Written in: 2023/2024

Dive into the world of Statistical Learning with this comprehensive guide, meticulously crafted to bridge the gap between theoretical concepts and practical applications in data science and machine learning. Whether you're a student eager to master the fundamentals, a practitioner aiming to sharpen your analytics skills, or a researcher seeking advanced methodologies, this document is your gateway to understanding the intricate balance of bias-variance trade-off, the nuances of regression, classification, and non-linear models, and the power of regularization and dimension reduction techniques. With detailed explorations of cross-validation, tree-based methods, and shrinkage methods like ridge regression and lasso, it equips you with the knowledge to tackle high-dimensional data challenges, optimize model performance, and unlock insightful predictions. Enhance your statistical learning arsenal and stay ahead in the rapidly evolving field of data science with this indispensable resource.

Content preview

Statistical Learning
Made by Maxmillian Forman, please do not sell this document. Good Luck!

1. Bias-Variance Trade-off
1.1. Regression Function
Predictor Function Notation: $f$ is a predictor function (a prediction rule). It is an element of the set of all possible predictor functions, denoted $\mathcal{F}$. We can restrict $\mathcal{F}$ to a subset $\mathcal{H}$, hence $\mathcal{H} \subseteq \mathcal{F}$.

Squared Loss Function: Given an independent test observation $(X, Y)$ and an independent training sample $(X_1, Y_1), \dots, (X_n, Y_n)$, the squared loss function for a candidate prediction rule $f$ is given by:
$$L(Y, f(X)) = (Y - f(X))^2$$

Population Risk: Also known as the expected loss. For a predictor function $f \in \mathcal{F}$, where $\mathcal{F}$ is the set of all possible predictor functions:
$$R(f) = E\left[(Y - f(X))^2\right]$$

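Regression Function: By minimizing the population risk $R(f) = E[(Y - f(X))^2]$ over all predictor functions we obtain the regression function, which is unknown in practice:
$$f^*(x) = E[Y \mid X = x], \qquad f^* = \arg\min_{f \in \mathcal{F}} R(f)$$
Statistical learning aims to estimate the fixed but unknown $f^*$.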

Prediction Model: The regression function $f^*(x) = E[Y \mid X = x]$ captures the systematic information $X$ provides about $Y$:
$$Y = f^*(X) + \varepsilon, \qquad E[\varepsilon \mid X] = 0$$
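The claim that the regression function minimizes the population risk can be checked with a short derivation. The sketch below assumes the notation $f^*(x) = E[Y \mid X = x]$ for the regression function and $R(f) = E[(Y - f(X))^2]$ for the population risk; these symbols are a reconstruction and may differ from the ones used in the course.

```latex
\begin{align*}
R(f) &= E\big[(Y - f^*(X) + f^*(X) - f(X))^2\big] \\
     &= E\big[(Y - f^*(X))^2\big] + E\big[(f^*(X) - f(X))^2\big]
        + 2\,E\big[(Y - f^*(X))(f^*(X) - f(X))\big] \\
     &= E\big[(Y - f^*(X))^2\big] + E\big[(f^*(X) - f(X))^2\big]
\end{align*}
```

The cross term vanishes by the law of iterated expectations, because $E[Y - f^*(X) \mid X] = 0$. The first remaining term does not depend on $f$ and the second is non-negative, so $R(f)$ is smallest when $f = f^*$.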

1.2. Empirical Risk
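Empirical Risk Function: For a random sample $(X_1, Y_1), \dots, (X_n, Y_n)$ and sample prediction model $Y_i = f^*(X_i) + \varepsilon_i$, where $f^*$ is the regression function, the empirical risk function for a candidate prediction rule $f$ is:
$$\hat{R}_n(f) = \frac{1}{n} \sum_{i=1}^{n} (Y_i - f(X_i))^2$$
The empirical risk function $\hat{R}_n(f)$ is an estimator of the population risk $R(f) = E[(Y - f(X))^2]$.
An estimator for $f^*$ would be the minimizer of the empirical risk, $\hat{f} = \arg\min_{f \in \mathcal{F}} \hat{R}_n(f)$.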
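As a minimal sketch of how the empirical risk is computed, the snippet below averages the squared loss of a candidate rule over a simulated sample. The data-generating model (a true rule $2x$ with Gaussian noise) and the names `empirical_risk`, `true_rule` and `zero_rule` are assumptions made for illustration only.

```python
import random

def empirical_risk(f, xs, ys):
    """Average squared loss of the prediction rule f over the sample."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical sample: Y = 2*X + noise, with noise standard deviation 0.5.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]

def true_rule(x):
    """The rule that generated the data (unknown in practice)."""
    return 2.0 * x

def zero_rule(x):
    """A deliberately poor candidate that ignores x."""
    return 0.0

risk_true = empirical_risk(true_rule, xs, ys)
risk_zero = empirical_risk(zero_rule, xs, ys)
print(f"empirical risk of true rule: {risk_true:.3f}")
print(f"empirical risk of zero rule: {risk_zero:.3f}")
```

The candidate closer to the data-generating rule should show the lower empirical risk, which is what makes minimizing it a natural way to pick an estimator.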

Approximation Error: The approximation error of the risk function is the gap between the empirical risk and the population risk:
$$\hat{R}_n(f) - R(f)$$

For $\hat{f}$ we want the approximation error for $R(\hat{f})$ to be as small as possible, so we look at the greatest lower bound of the approximation error.
A flexible model has a large set $\mathcal{F}$.
A large set $\mathcal{F}$ corresponds to a high variance of $\hat{f}$.
A high variance of $\hat{f}$ results in a high approximation error.

Overfitting: Refers to fitting $\hat{f}$ to the training set too closely, so that $\hat{f}$ tracks the empirical risk and not the population risk. It occurs when the set of candidate predictor functions is large.
Overfitting shows up as a low training MSE combined with a high test MSE.
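The low-training-MSE, high-test-MSE pattern can be reproduced in a small simulation. The sketch below uses k-nearest-neighbour averaging purely as a convenient family whose flexibility is controlled by one number (small `k` means a more flexible rule); this technique, the sine-shaped data-generating model and all function names are assumptions and are not taken from the notes.

```python
import math
import random

def simulate(n, rng):
    """Draw n observations from a hypothetical model Y = sin(2*pi*X) + noise."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [math.sin(2.0 * math.pi * x) + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys

def knn_predict(x0, xs_train, ys_train, k):
    """Average the responses of the k training points closest to x0."""
    order = sorted(range(len(xs_train)), key=lambda i: abs(xs_train[i] - x0))
    return sum(ys_train[i] for i in order[:k]) / k

def mse(xs_eval, ys_eval, xs_train, ys_train, k):
    """Mean squared error of the k-NN rule on the evaluation points."""
    errors = [(y - knn_predict(x, xs_train, ys_train, k)) ** 2
              for x, y in zip(xs_eval, ys_eval)]
    return sum(errors) / len(errors)

rng = random.Random(0)
x_train, y_train = simulate(100, rng)
x_test, y_test = simulate(1000, rng)

train_mse = {}
test_mse = {}
for k in (1, 10):  # k = 1 is the most flexible rule
    train_mse[k] = mse(x_train, y_train, x_train, y_train, k)
    test_mse[k] = mse(x_test, y_test, x_train, y_train, k)
    print(f"k={k:2d}  train MSE={train_mse[k]:.3f}  test MSE={test_mse[k]:.3f}")
```

With `k = 1` every training point is its own nearest neighbour, so the training MSE is zero by construction while the test MSE is not, which is the overfitting signature described above.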

1.3. Regularization
Regularization: Refers to restricting the set of all predictor functions $\mathcal{F}$ to a smaller set $\mathcal{H} \subseteq \mathcal{F}$, to control the bias-variance trade-off.
A large sample size $n$ corresponds to a low variance.
For $\mathcal{H}$:
As we increase $\mathcal{H}$:
    Flexible model, risk of overfitting.
    Flexibility increases.
    The variance of $\hat{f}$ increases.
    The bias of $\hat{f}$ decreases.
As we decrease $\mathcal{H}$:
    Inflexible model, risk of underfitting.
    Flexibility decreases.
    The variance of $\hat{f}$ decreases.
    The bias of $\hat{f}$ increases.

Bias: Bias refers to how far the estimator $\hat{f}$ is, on average, from the true regression function $f^*$.
Variance: Variance refers to the extent of fluctuations in $\hat{f}$ across training samples.
Regularized Estimators: There are two estimators when we minimize on a smaller domain $\mathcal{H}$: the regularized estimator and the oracle estimator.
The regularized estimator estimates the oracle estimator; we are looking for the best prediction rule in the subdomain $\mathcal{H}$.
1. Regularized Estimator: $\hat{f}_{\mathcal{H}} = \arg\min_{f \in \mathcal{H}} \hat{R}_n(f)$
Depends on the empirical risk function $\hat{R}_n$.

2. Oracle Estimator: $f^*_{\mathcal{H}} = \arg\min_{f \in \mathcal{H}} R(f)$
Depends on the population risk function $R$.
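To make the distinction concrete, the sketch below restricts attention to straight lines while the true regression function is quadratic. Everything specific here is an assumption for illustration: the model $Y = X^2 + \varepsilon$ with $X$ uniform on $[-1, 1]$, the subdomain of lines, and the helper `fit_line`. Under that model the oracle line can be worked out by hand: the slope is $E[X^3]/\mathrm{Var}(X) = 0$ and the intercept is $E[X^2] = 1/3$.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept and slope: the empirical risk minimizer over lines."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical model: Y = X^2 + noise, X uniform on [-1, 1].
rng = random.Random(0)
n = 2000
xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
ys = [x ** 2 + rng.gauss(0.0, 0.1) for x in xs]

# Oracle estimator within the class of lines (minimizes the population risk).
oracle_intercept, oracle_slope = 1.0 / 3.0, 0.0

# Regularized estimator within the same class (minimizes the empirical risk).
reg_intercept, reg_slope = fit_line(xs, ys)

print(f"oracle:      intercept={oracle_intercept:.3f}  slope={oracle_slope:.3f}")
print(f"regularized: intercept={reg_intercept:.3f}  slope={reg_slope:.3f}")
```

The fitted line should land close to the oracle line but not on the true quadratic: the gap between the oracle and the truth is bias from restricting the class, and the gap between the fitted line and the oracle is sampling variability.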

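Population Mean Squared Error: for an estimator $\hat{f}$ and a test observation $(X, Y)$,
$$\mathrm{MSE}(\hat{f}) = E\left[(Y - \hat{f}(X))^2\right]$$

Sample Mean Squared Error: for observations $(X_1, Y_1), \dots, (X_n, Y_n)$,
$$\widehat{\mathrm{MSE}}(\hat{f}) = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{f}(X_i))^2$$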

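Reducible and Irreducible Errors: Suppose $\hat{f}$ is a possibly regularized estimate of the regression function $f^*$, fitted using a training set. For $X$ and $Y$ from the test set, with noise $\varepsilon = Y - f^*(X)$, we can rewrite the population MSE as the sum of a reducible error and an irreducible error:
$$E\left[(Y - \hat{f}(X))^2\right] = \underbrace{E\left[(f^*(X) - \hat{f}(X))^2\right]}_{\text{reducible}} + \underbrace{\mathrm{Var}(\varepsilon)}_{\text{irreducible}}$$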

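Bias-Variance Trade-off: We use the law of iterated expectations to decompose the reducible error into the bias and variance of $\hat{f}$, where $f^*$ is the true regression function:
$$E\left[(f^*(X) - \hat{f}(X))^2\right] = E\left[\left(E[\hat{f}(X) \mid X] - f^*(X)\right)^2\right] + E\left[\mathrm{Var}(\hat{f}(X) \mid X)\right]$$

Bias Variance for an Estimator: For an estimator $\hat{\theta}$ of some parameter $\theta$ we can decompose the mean squared error as:
$$E\left[(\hat{\theta} - \theta)^2\right] = \left(E[\hat{\theta}] - \theta\right)^2 + \mathrm{Var}(\hat{\theta})$$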
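The decomposition of an estimator's mean squared error into squared bias plus variance can be checked numerically. The sketch below is a simulation under assumed settings: a hypothetical parameter `theta = 2.0`, samples of size 20 from a normal distribution with unit variance, and a deliberately biased estimator that shrinks the sample mean by a factor of 0.8. In theory this estimator has bias $(0.8 - 1) \cdot 2 = -0.4$ and variance $0.8^2 / 20 = 0.032$.

```python
import random
import statistics

rng = random.Random(1)
theta = 2.0    # hypothetical true parameter
n = 20         # size of each training sample
reps = 5000    # number of simulated training samples
shrink = 0.8   # hypothetical shrinkage factor, makes the estimator biased

# One estimate per simulated sample: the shrunken sample mean.
estimates = []
for _ in range(reps):
    sample = [rng.gauss(theta, 1.0) for _ in range(n)]
    estimates.append(shrink * statistics.fmean(sample))

mse = statistics.fmean([(est - theta) ** 2 for est in estimates])
bias = statistics.fmean(estimates) - theta
variance = statistics.pvariance(estimates)  # divisor reps, matches the identity

print(f"MSE             = {mse:.4f}")
print(f"bias^2 + var    = {bias ** 2 + variance:.4f}")
print(f"bias = {bias:.4f}, variance = {variance:.4f}")
```

Across the simulated replications the two printed quantities agree up to floating-point error, because the identity holds exactly for the empirical distribution of the estimates when the variance uses the divisor `reps`.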

1.4. Cross Validation

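Hyper Parameter Optimization: We want to estimate the reducible error to find the optimal subdomain $\mathcal{H}$ for the estimator $\hat{f}$. So we introduce a hyperparameter $\lambda$ that indexes a family of candidate subdomains $\mathcal{H}_\lambda \subseteq \mathcal{F}$,

which assists in finding the appropriate $\mathcal{H}$.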

Validation Set Approach: A method to estimate the population MSE of $\hat{f}$. Outputs the validation set error rate, which is an estimate of the test set error rate.
Advantages:
Relatively simple to understand.
Disadvantages:
The validation set error rate (MSE) is variable because the split is random.
The validation set error may overestimate the test set error.
We use only a subset of the initial dataset to train $\hat{f}$, so we lose some information.

1. Randomly split the dataset into a training and a validation dataset.
2. Train $\hat{f}$ on the training dataset.
3. The fitted estimator then predicts responses for the validation dataset.
4. We calculate the validation set error (MSE), which serves as an approximation of the test error rate.
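The four steps above can be sketched as follows. This is an illustration under assumptions, not the course's implementation: the learner is a simple least-squares line (`fit_line`), the data come from a hypothetical linear model, and the split is 50/50.

```python
import random

def fit_line(xs, ys):
    """Least-squares line y = a + b*x, returned as a prediction rule."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x

def validation_set_mse(xs, ys, fit, seed):
    """Validation set approach with a random 50/50 split."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)                                   # step 1: random split
    half = len(idx) // 2
    train, valid = idx[:half], idx[half:]
    f_hat = fit([xs[i] for i in train],                # step 2: train
                [ys[i] for i in train])
    # steps 3-4: predict on the validation set and average the squared errors
    return sum((ys[i] - f_hat(xs[i])) ** 2 for i in valid) / len(valid)

# Hypothetical data: Y = 1 + 2*X + noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(0.0, 2.0) for _ in range(100)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0.0, 0.5) for x in xs]

# Repeating the procedure with different random splits shows its variability.
estimates = [validation_set_mse(xs, ys, fit_line, seed) for seed in range(5)]
print([round(e, 3) for e in estimates])
```

Running the same procedure with several seeds yields a different estimate each time, which is the variability listed among the disadvantages.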

K-Fold Cross Validation: A method used to estimate the population MSE of $\hat{f}$. Outputs the validation set error rate, which is an estimate of the test set error rate.
Advantages:
MSE variability is smaller than in the validation set approach and LOOCV.
Disadvantages:
Has higher bias than LOOCV.

1. We divide the initial dataset into $K$ equally sized partitions.
2. We select $K - 1$ partitions to be the training set and the remaining $k$-th partition will be the validation set.
3. We fit $\hat{f}$ using the training set.
4. The fitted estimator then predicts responses for the $k$-th validation dataset.
5. We calculate the validation set error and call it $\mathrm{MSE}_k$.
6. We repeat this for each of the $K$ partitions in turn and compute the average:
$$\mathrm{CV}_{(K)} = \frac{1}{K} \sum_{k=1}^{K} \mathrm{MSE}_k$$
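The K-fold steps can be sketched as below. The learner (`fit_line`), the simulated data and the way the folds are formed (shuffle, then take every K-th index) are assumptions chosen to keep the example short; any learner with the same interface could be passed as `fit`.

```python
import random

def fit_line(xs, ys):
    """Least-squares line y = a + b*x, returned as a prediction rule."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x

def k_fold_cv(xs, ys, fit, K=5, seed=0):
    """Return the K-fold CV estimate and the list of per-fold MSEs."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    folds = [idx[k::K] for k in range(K)]          # step 1: K near-equal partitions
    fold_mses = []
    for k in range(K):
        valid = folds[k]                           # step 2: k-th fold is held out
        train = [i for j in range(K) if j != k for i in folds[j]]
        f_hat = fit([xs[i] for i in train],        # step 3: fit on K-1 folds
                    [ys[i] for i in train])
        # steps 4-5: predict on the held-out fold and record its MSE
        fold_mses.append(
            sum((ys[i] - f_hat(xs[i])) ** 2 for i in valid) / len(valid))
    return sum(fold_mses) / K, fold_mses           # step 6: average over folds

# Hypothetical data: Y = 1 + 2*X + noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(0.0, 2.0) for _ in range(100)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0.0, 0.5) for x in xs]

cv_estimate, fold_mses = k_fold_cv(xs, ys, fit_line, K=5)
print(f"5-fold CV estimate: {cv_estimate:.3f}")
```

Each observation is used for validation exactly once and for training K - 1 times, which is why the averaged estimate tends to fluctuate less than a single random split.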

Leave One Out Cross Validation (LOOCV): A method to estimate the population MSE of $\hat{f}$. Outputs the validation set error rate, which is an estimate of the test set error rate.
Advantages:
Each fit uses almost all the observations in the dataset ($n - 1$ of $n$), so LOOCV is less biased and tends not to overestimate the test error rate as much as the validation set approach.
Disadvantages:
Computationally intensive.
Has higher variance than $K$-fold, but a lower bias.

1. We divide the dataset into a training and validation set where the validation set is of size $1$.
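LOOCV can be viewed as K-fold cross validation with as many folds as observations. The brute-force sketch below refits the learner once per left-out observation; the two learners (`fit_mean` and `fit_line`) and the simulated data are assumptions for illustration.

```python
import random

def fit_mean(xs, ys):
    """Constant prediction rule: always predict the training mean of y."""
    y_bar = sum(ys) / len(ys)
    return lambda x: y_bar

def fit_line(xs, ys):
    """Least-squares line y = a + b*x, returned as a prediction rule."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x

def loocv(xs, ys, fit):
    """Leave one observation out, fit on the rest, average the squared errors."""
    n = len(xs)
    squared_errors = []
    for i in range(n):
        x_train = xs[:i] + xs[i + 1:]      # training set of size n - 1
        y_train = ys[:i] + ys[i + 1:]
        f_hat = fit(x_train, y_train)
        squared_errors.append((ys[i] - f_hat(xs[i])) ** 2)  # validation set of size 1
    return sum(squared_errors) / n

# Hypothetical data: Y = 1 + 2*X + noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(0.0, 2.0) for _ in range(50)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0.0, 0.5) for x in xs]

loocv_mean = loocv(xs, ys, fit_mean)
loocv_line = loocv(xs, ys, fit_line)
print(f"LOOCV, constant rule: {loocv_mean:.3f}")
print(f"LOOCV, linear rule:   {loocv_line:.3f}")
```

The loop makes the computational cost visible: the learner is refitted once per observation, which is the reason LOOCV is listed as computationally intensive.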

Connected book

Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, Jonathan Taylor: An Introduction to Statistical Learning

Edition: 2023
ISBN: 9783031387470

School, study and subject

Institution: Universiteit van Amsterdam (UvA)
Study: Econometrics
Course: 6013B0357Y

Document information

Whole book?: No
Which chapters are summarized?: Chapter {2,5,3,6,7,8}
Published on: 3 April 2024
Number of pages: 19
Written in: 2023/2024
Type: Summary

Topics

regression
classification
bias variance tradeoff
crossvalidation
regularized estimators
overfitting
underfitting
ridge regression
lasso regression
random forests
validation set approach
higher dimensi
