RM | Unit 155 - Standard errors, t-distributions, and confidence intervals for linear models
Book: Analysing Data Using Linear Models
Chapter 5: Introduction, 5.1, 5.2, 5.3, 5.4, 5.5
Chapter 5.2: Random sampling and standard error
Remember from Chapter 2 that the standard deviation of the sampling distribution is called the standard
error. The standard error of the sampling distribution of the sample slope represents the uncertainty about
the population slope. If the standard error is large, it means that if we were to draw many different random
samples from the same population data, we would get very different sample slopes. If the standard error is
small, it means that if we were to draw many different random samples from the same population data, we
would get sample slopes that are very close to one another, and very close to the population slope.
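The idea of repeated sampling can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the population intercept, slope, error standard deviation, and the range of X are all made up, and the function names are hypothetical. It draws many random samples, computes the sample slope for each, and uses the standard deviation of those slopes as an estimate of the standard error.

```python
import random
import statistics


def sample_slope(xs, ys):
    """Least-squares slope b1 for a simple regression of ys on xs."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ssx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / ssx


def simulate_slopes(n, n_samples=500, beta0=1.0, beta1=0.5, sigma=2.0, seed=1):
    """Draw n_samples random samples of size n and return their slopes.

    beta0, beta1 and sigma are assumed population values chosen
    purely for illustration.
    """
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_samples):
        xs = [rng.uniform(0, 10) for _ in range(n)]
        ys = [beta0 + beta1 * x + rng.gauss(0, sigma) for x in xs]
        slopes.append(sample_slope(xs, ys))
    return slopes


# The spread of the sample slopes approximates the standard error.
sd_small = statistics.stdev(simulate_slopes(n=20))
sd_large = statistics.stdev(simulate_slopes(n=200))
print(f"SD of slopes, n = 20:  {sd_small:.3f}")
print(f"SD of slopes, n = 200: {sd_large:.3f}")
```

With the larger sample size the slopes cluster more tightly around the assumed population slope of 0.5, which is what a smaller standard error means.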
As we have seen, the standard error depends very much on sample size. Apart from sample
size, the standard error for a slope also depends on the variance of the independent variable, the
variance of the dependent variable, and the correlations between the independent variable and other
independent variables in the equation. We will not bore you with the complicated formula for the
standard error for regression coefficients in the case of multiple regression. But here is the formula
for the standard error for the slope coefficient if you have only one predictor variable X:

$$
\hat{\sigma}_{b_1}
= \frac{\sqrt{\dfrac{SSR}{n-2}}}{\sqrt{SSX}}
= \frac{\sqrt{\dfrac{\sum_i (Y_i - \hat{Y}_i)^2}{n-2}}}{\sqrt{\sum_i (X_i - \bar{X})^2}}
= \sqrt{\frac{\sum_i (Y_i - \hat{Y}_i)^2}{(n-2)\,\sum_i (X_i - \bar{X})^2}}
\tag{5.1}
$$

where $b_1$ is the slope coefficient in the sample, $n$ is sample size, $SSR$ is the sum of the squared
residuals, and $SSX$ the sum of squares for independent variable X. From the formula, you can see that
the standard error $\hat{\sigma}_{b_1}$ becomes smaller when sample size $n$ becomes larger.
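Formula 5.1 can be turned into a few lines of code. The following is a minimal sketch, not the book's own implementation: the function name `slope_standard_error` and the five data points are hypothetical, chosen only to show each term of the formula (SSR, n − 2, SSX) being computed.

```python
import math
import statistics


def slope_standard_error(xs, ys):
    """Return (b1, se_b1) for a simple regression with one predictor.

    se_b1 follows formula 5.1: sqrt(SSR / (n - 2)) / sqrt(SSX).
    """
    n = len(xs)
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)

    # SSX: sum of squares for the independent variable X
    ssx = sum((x - x_bar) ** 2 for x in xs)

    # Least-squares estimates of slope and intercept
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ssx
    b0 = y_bar - b1 * x_bar

    # SSR: sum of the squared residuals (observed minus predicted Y)
    ssr = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

    se_b1 = math.sqrt(ssr / (n - 2)) / math.sqrt(ssx)
    return b1, se_b1


# Made-up example data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b1, se_b1 = slope_standard_error(xs, ys)
print(f"slope b1 = {b1:.3f}, standard error = {se_b1:.3f}")
```

For these five points SSX = 10 and SSR = 2.4, so the standard error is the square root of 0.8 / 10.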
Chapter 5.3: t-distribution for model coefficients
When discussing t-statistics, we assumed we knew the population slope β1, that is, the slope of the linear
equation based on all 80,000 bottles. In reality, we never know the population slope: the whole reason to
look at the sample slope is to have an idea about the population slope. Let’s look at the confidence
interval for slopes.
Chapter 5.4: Confidence intervals for the slope
Since we don’t know the actual value of the population slope β1, we could ask the personnel in the
beer factory what they think is a likely value for the slope. Suppose Mark says he believes that a slope