Summary: Introduction to Statistical Learning

Summary of Introduction to Statistical Learning, including graphs and examples from the book. Covers Chapters 2, 3, 4, 5, and 7. 15 pages, written in 2018/2019, uploaded on 05-04-2019.



ISL

Chapter 2
2.2 Assessing model accuracy
No one method dominates all others over all possible data sets.

2.2.1 Measuring the quality of fit
We need a way to measure how well a model’s predictions actually match the observed
data. The most commonly used measure in the regression setting is the mean squared
error (MSE).
MSE = (1/n) SUM_{i=1..n} (yi − f^(xi))²
The MSE will be small if the predicted responses are very close to the true responses, and
will be large if for some of the observations, the predicted and true responses differ
substantially.
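The training MSE above can be computed directly; a minimal sketch in Python (the function name `mse` and the toy numbers are just illustrative choices, not from the book):

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error: the average squared difference between
    observed responses and the model's predictions."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean((y - y_hat) ** 2)

# Every prediction is off by 0.5, so the MSE is 0.5 squared = 0.25.
y_obs = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 6.5, 9.5]
print(mse(y_obs, y_pred))  # 0.25
```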

But we do not really care how well the method predicts Y on the training sample itself.
For example, we don’t care whether the method accurately predicts diabetes risk for the
patients used to train the model, since we already know whether they have diabetes. We
want to apply it to new patients in the future.
Thus we are interested in knowing whether f^(x0) is approximately equal to y0, where (x0,
y0) is a previously unseen test observation not used to train the statistical learning method.
We want to choose the method that gives the lowest test MSE, as opposed to the lowest
training MSE. If we have a large number of test observations, we could compute:
Ave(y0 – f^(x0))²
This is the average squared prediction error for these test observations (x0,y0). We want to
select the model for which the average of this quantity is as small as possible.

When no test observations are available, one might imagine simply selecting a statistical
learning method that minimizes the training MSE.

The degrees of freedom is a quantity that summarizes the flexibility of a curve.

When a given method yields a small training MSE but a large test MSE, we are said to be
overfitting the data. This happens because our statistical learning procedure is working too
hard to find patterns in the training data, and may be picking up some patterns that are just
caused by random chance rather than by true properties of the unknown function f.
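Overfitting can be illustrated with a small simulation (a sketch, not from the book: the quadratic true function, noise level, and sample sizes are made-up assumptions). Polynomials of increasing degree are fit by least squares; the most flexible fit drives the training MSE down while its test MSE stays high:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):                                   # true (normally unknown) function
    return 1.0 + 2.0 * x - 1.5 * x ** 2

# One training set and one large test set, both with irreducible noise.
x_train = rng.uniform(0, 1, 30)
y_train = f(x_train) + rng.normal(0, 0.2, 30)
x_test = rng.uniform(0, 1, 500)
y_test = f(x_test) + rng.normal(0, 0.2, 500)

train_mse, test_mse = {}, {}
for degree in (1, 2, 10):
    coefs = np.polyfit(x_train, y_train, degree)   # least-squares polynomial fit
    train_mse[degree] = np.mean((y_train - np.polyval(coefs, x_train)) ** 2)
    test_mse[degree] = np.mean((y_test - np.polyval(coefs, x_test)) ** 2)
    print(degree, train_mse[degree], test_mse[degree])
```

The degree-10 fit always has the smallest training MSE (it contains the simpler models as special cases), yet its test MSE exceeds its training MSE: it is chasing the noise.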

The bias-variance trade-off
When we plot test MSE curves, a U-shape often shows up. This turns out to be the
result of two competing properties of statistical learning methods. The expected test MSE,
for a given value x0, can be decomposed into the sum of three fundamental quantities: the
variance of f^(x0), the squared bias of f^(x0), and the variance of the error term ε.
In order to minimize the expected test error, we need to select a statistical learning method
that achieves low variance and low bias. Since variance and squared bias are both
non-negative, the expected test MSE can never lie below Var(ε), the irreducible error.

Here variance means the amount by which f^ would change if we estimated it using a
different training data set. Ideally the estimate f^ should not vary too much between training sets. If a
method has a high variance, then small changes in the training data can result in large
changes in f^. In general, more flexible statistical methods have higher variance.

Bias refers to the error that is introduced by approximating a real-life problem by a much
simpler model. E.g. it is unlikely that any real-life problem truly has an exactly linear
relationship, so performing linear regression will undoubtedly result in some bias in the
estimate of f. Generally, more flexible methods result in less bias.

As we increase the flexibility of our methods, the bias tends to initially decrease faster than
the variance increases. Consequently, the expected test MSE declines. However, at some
point increasing flexibility has little impact on the bias, but starts to significantly increase
the variance. Then the test MSE increases.

The relationship between bias, variance, and test set MSE is referred to as the bias-variance
trade-off.
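The decomposition can be checked numerically (a sketch under made-up assumptions: a sine-shaped true function, a deliberately rigid straight-line method, and the point x0 = 0.25). Fitting the line to many independent training sets lets us estimate the variance and squared bias of f^(x0) directly:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.3        # standard deviation of the irreducible error
x0 = 0.25          # point at which we evaluate the prediction

def f(x):          # true (normally unknown) function
    return np.sin(2 * np.pi * x)

# Fit a straight line (a high-bias, low-variance method) to many
# independent training sets and record its prediction at x0.
preds = []
for _ in range(2000):
    x = rng.uniform(0, 1, 25)
    y = f(x) + rng.normal(0, sigma, 25)
    b1, b0 = np.polyfit(x, y, 1)       # returns slope, intercept
    preds.append(b0 + b1 * x0)
preds = np.array(preds)

variance = preds.var()                  # Var(f^(x0)) across training sets
bias_sq = (preds.mean() - f(x0)) ** 2   # squared bias at x0
expected_test_mse = variance + bias_sq + sigma ** 2
print(variance, bias_sq, expected_test_mse)
```

Because a line cannot bend to follow the sine curve, the squared bias dominates the variance here; the expected test MSE is their sum plus the irreducible σ².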

Chapter 3

When answering statistical problems:
1. Find out if there is evidence of an association between the variables (e.g. advertising
expenditure and sales).
2. Check whether the evidence is weak or strong.
3. Try to separate the individual effects of each variable (e.g. TV, radio or newspaper
advertising).
4. Try to find the accuracy of each effect.
5. Try to predict future values (e.g. how many future sales do we predict).
6. Check whether the relationship is linear.
7. Check for an interaction effect (e.g. does spending $50,000 on both television and radio
lead to more sales than $100,000 on only one).

3.1 Simple linear regression
A straightforward approach for predicting a quantitative response Y on the basis of a single
predictor variable X. We are regressing Y on X.
Y ≈ β0 + β1X; the estimated line is y^ = β0^ + β1^x.
ß0 represents the intercept and is unknown. ß1 represents the slope and is unknown.
Together they are known as the coefficients/parameters.

3.1.1 Estimating the Coefficients
The goal is to find ß0^ and ß1^ so that the linear model fits the data well, i.e. so that the
line is as close as possible to the n observations. Closeness can be measured in several
ways; the most common approach involves minimizing the least squares criterion (the
residual sum of squares).
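The least squares criterion has a closed-form solution in the simple linear case: ß1^ = SUM((xi − x̄)(yi − ȳ)) / SUM((xi − x̄)²) and ß0^ = ȳ − ß1^·x̄. A minimal sketch (the function name `least_squares` and the toy data are illustrative):

```python
import numpy as np

def least_squares(x, y):
    """Closed-form simple linear regression: minimizes the residual
    sum of squares SUM (yi - b0 - b1*xi)^2."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

# Data lying exactly on y = 1 + 2x recovers those coefficients.
b0, b1 = least_squares([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)  # 1.0 2.0
```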

3.1.2 Assessing the Accuracy of the Coefficient Estimates
Y = β0 + β1X + ε
The error is the catch-all for what we miss (e.g. other variables that cause variation in Y).
We assume that the error term is independent of X. The above formula defines the
population regression line, which is the best linear approximation to the true relationship
between X and Y.

We mostly use the sample mean μ^ to estimate the population mean μ. On average we
expect these to be equal; the estimate is unbiased. This holds for ß0^ and ß1^ as well: the
estimates from one particular data set won’t exactly equal the true values, but if we could
average the estimates obtained over a huge number of data sets, the average would be
spot on.
To see how far off a single estimate typically is, we compute the standard error (SE):
Var(μ^) = SE(μ^)² = σ²/n, so SE(μ^) = σ/√n
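A quick simulation (a sketch with made-up values μ = 10, σ = 2, n = 100) confirms both claims: the sample means average out to μ, and their spread matches σ/√n:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n = 10.0, 2.0, 100

# Draw many independent samples of size n; the sample means should
# average out to mu (unbiasedness) and have spread sigma / sqrt(n).
means = rng.normal(mu, sigma, size=(5000, n)).mean(axis=1)
print(means.mean())          # close to mu = 10
print(means.std())           # close to sigma/sqrt(n) = 0.2
```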
