RM | Unit 240 - Assumptions part I
Book: Analysing data using linear models
Chapter 7: 7.1, 7.2, 7.3, 7.4, 7.5
Chapter 7.1: Introduction
For a linear model to be a good model, there are four conditions that need to be fulfilled.
1. linearity: the relationship between the variables can be described by a linear equation (also called
additivity)
2. independence: the residuals are independent of each other
3. equal variance: the residuals have equal variance (also called homoscedasticity)
4. normality: the distribution of the residuals is normal
If these conditions (often called assumptions) are not met, inference based on the computed
standard error is invalid. That is, if the assumptions are not met, the standard error should not be trusted,
or should be computed using alternative methods.
Chapter 7.2: Independence
A systematic order in the residuals is a violation of independence: the residuals should be random.
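One rough way to look for a systematic order is to check whether each residual resembles the one that follows it. The sketch below is only an illustration, not the book's method: the function name lag1_autocorrelation and the two residual sequences are made up for the example. It computes the lag-1 autocorrelation of a residual sequence. Values near 0 are what randomly ordered residuals tend to give; values far from 0 suggest a pattern.

```python
import numpy as np

def lag1_autocorrelation(residuals):
    """Lag-1 autocorrelation: how strongly each residual resembles the next one."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()  # centre the residuals around zero
    return float(np.sum(r[:-1] * r[1:]) / np.sum(r * r))

# Hypothetical residuals that drift steadily upwards (systematic order)
trending = [-3, -2, -1, 0, 1, 2, 3]

# Hypothetical residuals that flip sign at every observation (also systematic)
alternating = [1, -1, 1, -1, 1, -1]

print(lag1_autocorrelation(trending))     # clearly positive
print(lag1_autocorrelation(alternating))  # clearly negative
```

This only picks up dependence between neighbouring observations in the order the data are stored. Other forms of dependence, such as clustering within groups, need a different check.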
Chapter 7.3: Linearity
The assumption of linearity is often also referred to as the assumption of additivity. Contrary to intuition,
the assumption is not that the relationship between variables should be linear. The assumption is that there
is linearity or additivity in the parameters. That is, the effects of the variables in the model should add up.
The relationship between two variables need not be linear in order for a linear model to be
appropriate. A transformation of an independent variable, such as taking a square, can result in normally
distributed, randomly scattered residuals. The linearity assumption is that the effects of a number of variables
(transformed or untransformed) add up and lead to a model with normally distributed, independent, randomly
scattered residuals.
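The point about transformations can be illustrated with a small sketch. The data are invented and noise-free, chosen only to make the pattern obvious: y depends on x through a curve, so a straight-line model leaves a systematic pattern in the residuals, while a model that adds the squared variable as a second predictor is still linear in its parameters and fits the data.

```python
import numpy as np

# Hypothetical data: y is a curved (quadratic) function of x, without noise
x = np.linspace(-3, 3, 13)
y = 2.0 + 0.5 * x + 1.5 * x**2

# Model 1: intercept + x (straight line)
X_line = np.column_stack([np.ones_like(x), x])
coef_line, *_ = np.linalg.lstsq(X_line, y, rcond=None)
resid_line = y - X_line @ coef_line

# Model 2: intercept + x + x squared; the effects still simply add up
X_quad = np.column_stack([np.ones_like(x), x, x**2])
coef_quad, *_ = np.linalg.lstsq(X_quad, y, rcond=None)
resid_quad = y - X_quad @ coef_quad

print(np.round(resid_line, 2))  # U-shaped pattern: positive at both ends, negative in the middle
print(np.round(resid_quad, 2))  # essentially zero everywhere
```

With real data the residuals of the second model would not be exactly zero. The thing to look for is that they scatter randomly around zero instead of following a curve.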
Chapter 7.4: Equal variances
The equal variance assumption is an important one: if the data show that the variance differs between
subgroups of individuals in the data set, then the standard errors of the regression
coefficients cannot be trusted. The equal variance assumption is often referred to as the
homogeneity of variance assumption or homoscedasticity. It is the assumption that variance is