CHAPTER 7.6-7.6 Multivariate Models
One of the assumptions of the classical linear regression model (CLRM) is that the
explanatory variables are non-stochastic, or fixed in repeated samples. Equivalently, all of
the variables contained in the X matrix are assumed to be exogenous (determined outside the
equation). Another way to state this is that the model is conditioned on the variables in X:
the X matrix is assumed not to have a probability distribution. Causality in this model runs
from X to y, and not vice versa. Changes in the values of the explanatory variables cause
changes in the values of y, but changes in the value of y do not affect the explanatory
variables. The dependent variable y is endogenous: its value is determined by the regression
equation.
A further CLRM assumption is that X and u are independent. Together with the assumption
that E(u) = 0, this implies E(X’u) = 0, so the errors are uncorrelated with the explanatory
variables. When X is instead related to the errors of the regression, it is said to be
stochastic. When this assumption is violated, the OLS estimator β^ gives biased coefficient
estimates. In a simultaneous system this is known as simultaneity bias or simultaneous
equations bias. In this case the OLS estimator is also inconsistent: even if a very large
sample were taken, the coefficient estimates would not converge to the true values.
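The inconsistency can be illustrated with a small simulation. The sketch below is not taken
from the text: it assumes a hypothetical two-equation system y1 = βy2 + u1 and
y2 = γy1 + δx + u2, with β = 0.5, γ = -1, δ = 1 and independent standard normal x, u1 and u2.
It generates data from the reduced form and regresses y1 on y2 by OLS. Under these assumed
values the OLS slope converges to β + γ(1 - γβ)/(δ² + γ² + 1) = 0 rather than to the true
value of 0.5, however large the sample.

```python
import random


def ols_slope_under_simultaneity(n, beta=0.5, gamma=-1.0, delta=1.0, seed=0):
    """OLS slope from regressing y1 on y2 in an assumed two-equation system.

    Structural equations (illustrative assumption, not from the text):
        y1 = beta * y2 + u1
        y2 = gamma * y1 + delta * x + u2
    """
    rng = random.Random(seed)
    y1_values, y2_values = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        u1 = rng.gauss(0, 1)
        u2 = rng.gauss(0, 1)
        # Reduced form for y2: substitute the first equation into the second.
        y2 = (delta * x + gamma * u1 + u2) / (1 - gamma * beta)
        # y2 depends on u1, so the regressor is correlated with the error.
        y1 = beta * y2 + u1
        y1_values.append(y1)
        y2_values.append(y2)

    mean1 = sum(y1_values) / n
    mean2 = sum(y2_values) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(y1_values, y2_values))
    var = sum((b - mean2) ** 2 for b in y2_values)
    return cov / var


if __name__ == "__main__":
    # The estimate does not approach the true beta = 0.5 as n grows.
    for n in (100, 10_000, 100_000):
        print(n, round(ols_slope_under_simultaneity(n), 3))
```

Increasing n narrows the sampling variation around the wrong value instead of moving the
estimate towards β, which is what inconsistency means in practice.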
Identification is the issue of whether there is enough information in the reduced form
equations to enable the structural form coefficients to be calculated. There are two
conditions that could be examined to determine whether a given equation from a system is
identified:
1. The order condition is a necessary but not sufficient condition for an equation to be
identified. That is, even if the order condition is satisfied, the equation might not be
identified. Let G denote the number of structural equations, and count as “excluded” all
exogenous and endogenous variables in the system that do not appear in the equation. An
equation is just identified if exactly G-1 variables are excluded, over-identified if more
than G-1 are excluded, and not identified if fewer than G-1 are excluded.
2. The rank condition is a necessary and sufficient condition for identification. The
structural equations are written in matrix form, and the rank of the coefficient matrix
of all of the variables excluded from a particular equation is examined.
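The counting rule in the order condition can be sketched in a few lines. The system below is
a hypothetical illustration (the variable names Y1, Y2, Y3, X1, X2 and the choice of which
variables appear in which equation are assumptions, not taken from the text). Because the
order condition is only necessary, a "just identified" or "over-identified" result here
would still need to be confirmed with the rank condition.

```python
def order_condition(equation_variables, system_variables, n_equations):
    """Classify one equation by the order condition.

    equation_variables: variables appearing in this equation
    system_variables:   all endogenous and exogenous variables in the system
    n_equations:        G, the number of structural equations
    """
    excluded = set(system_variables) - set(equation_variables)
    required = n_equations - 1
    if len(excluded) == required:
        return "just identified"
    if len(excluded) > required:
        return "over-identified"
    return "not identified"


# Hypothetical three-equation system (G = 3, so G - 1 = 2).
system_variables = ["Y1", "Y2", "Y3", "X1", "X2"]
equations = {
    "equation 1": ["Y1", "Y2", "Y3", "X1", "X2"],  # nothing excluded
    "equation 2": ["Y2", "Y3", "X1"],              # Y1 and X2 excluded
    "equation 3": ["Y3", "Y2"],                    # Y1, X1 and X2 excluded
}

results = {
    name: order_condition(variables, system_variables, n_equations=3)
    for name, variables in equations.items()
}

if __name__ == "__main__":
    for name, status in results.items():
        print(name, "->", status)
```

In this assumed system the first equation excludes no variables, the second excludes two and
the third excludes three, so the three cases of the rule each appear once.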
A variable is defined as exogenous if the conditional distribution of y given x does not change
with modifications of the process generating x. It is possible to classify two forms of
exogeneity:
• A predetermined variable is one that is independent of the contemporaneous and future
errors in that equation.
• A strictly exogenous variable is one that is independent of all contemporaneous, future,
and past errors in that equation.
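The difference between the two forms can be seen with a lagged dependent variable. The
sketch below is an illustration under assumed values, not an example from the text: it
simulates a hypothetical AR(1) process y_t = φy_{t-1} + u_t with φ = 0.5 and standard normal
errors, then measures how the regressor y_{t-1} correlates with the contemporaneous error
u_t and with the past error u_{t-1}. The regressor is predetermined (unrelated to u_t) but
not strictly exogenous (it is built from u_{t-1}).

```python
import random


def correlation(a, b):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def lagged_y_error_correlations(n=50_000, phi=0.5, seed=0):
    """Correlation of y_{t-1} with u_t and with u_{t-1} in an assumed AR(1)."""
    rng = random.Random(seed)
    errors = [rng.gauss(0, 1) for _ in range(n)]  # errors[t-1] plays the role of u_t
    y = [0.0]                                     # assumed starting value y_0
    for u in errors:
        y.append(phi * y[-1] + u)                 # y_t = phi * y_{t-1} + u_t

    lagged_y = y[1:n]             # y_{t-1} for t = 2, ..., n
    current_errors = errors[1:n]  # u_t for the same t
    past_errors = errors[0:n - 1] # u_{t-1} for the same t
    return (correlation(lagged_y, current_errors),
            correlation(lagged_y, past_errors))


if __name__ == "__main__":
    with_current, with_past = lagged_y_error_correlations()
    print("corr(y_{t-1}, u_t)     =", round(with_current, 3))
    print("corr(y_{t-1}, u_{t-1}) =", round(with_past, 3))
```

The first correlation should be close to zero and the second clearly positive, which is the
pattern that separates a predetermined variable from a strictly exogenous one.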