Econometrics
Summary of all Lectures
(7 Lectures)
Radboud University Nijmegen
Yoël Guijt
Week 1: Introduction
Econometrics = Theories, Models, Data + Statistics
Approach to Econometrics:
> Theory > Mathematical Model > Data > Estimation > Hypothesis testing > Predict > Policies
Econometrics:
> Statistics to estimate relationships, test theories, and evaluate policies
> Connects economic theory (utility maximization, supply and demand, etc.) to the real world
> In economics it is often difficult to run experiments, so statistics on observed data are used instead (return to education, etc.)
Theory to Empirics:
> Observations of real-world relations are translated into econometric models, with an error term absorbing all unobserved effects
> Then, hypotheses about the relations are formulated and tested with data
Types of Data:
1 Cross-Sectional: One moment in time, multiple economic entities (individuals, firms, etc.)
2 Time Series: Over time, a single or a few economic entities
3 Panel Data: Two or more moments in time, multiple economic entities (the same ones over time)
Non-experimental vs experimental data:
> Non-experimental = Researcher is passive collector of data; observations
> Experimental = Collected in lab or field experiments
Causality and Ceteris Paribus:
> Difficult to establish a causal relation; often only an association or co-movement is observed
> Regression analysis aims at detecting causal factors (difficult; the model here is linear only)
> Ceteris Paribus = All else being equal (if enough relevant variables are held fixed in the model, the ceteris paribus effect can be read as causal)
Why is data analysis important?
> Theories need data: data can be used to criticize existing practices and to support theories, but
never to prove them. Economic measurement can, however, reject economic theories. Understanding
econometrics is therefore important for economic practice
Basic Maths and Stats:
> Summation: x1 + x2 + x3 + … + xn = SUM of xi (i = 1, …, n)
> For a constant c: SUM of c (n times) = n*c
> Linear Function: y = B0 + B1*x (B0 is the intercept, B1 is the slope coefficient) (DeltaY = B1 * DeltaX)
> Average: xbar = 1/n * SUM of xi … All observations summed up and divided by n
> Deviation from the Average: (xi - xbar) … The difference between a single observation and the average
> Proportionate change: (x1 - x0)/x0 = DeltaX / x0 (Percentage change: 100 * DeltaX / x0)
> Expected Value: E(X) = MU = x1*f(x1) + x2*f(x2) + … + xk*f(xk) … Each value of X times its probability/weight, summed
> Variance: Var(X) = E[(X - MU)^2] = SIGMA^2
> Std. Deviation: SIGMA = sd(X) = square root of the variance
> Covariance: Cov(X, Y) = E[(X - MUx)(Y - MUy)] … Measures the linear relation between two random variables
> Correlation: Corr(X, Y) = Cov(X, Y) / (SIGMAx * SIGMAy) … The covariance divided by the st.dev. of X and of Y
> Conditional Expectation: E(Wage|Education) = 1.05 + 0.45*Education (Education in years)
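The summary statistics in this list can be computed by hand in a few lines. The sketch below is illustrative only: the data are made up, and it uses the sample versions of variance and covariance (dividing by n - 1), which is an assumption, since the notes do not say which divisor the lecture uses.

```python
from math import sqrt

# Hypothetical sample: years of education (x) and hourly wage (y).
x = [8.0, 10.0, 12.0, 14.0, 16.0]
y = [4.5, 5.5, 6.5, 7.5, 8.5]


def mean(values):
    """Average: all observations summed up and divided by n."""
    return sum(values) / len(values)


def sample_variance(values):
    """Average squared deviation from the mean (n - 1 divisor, an assumption)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def sample_covariance(a, b):
    """Average product of deviations from the two means (n - 1 divisor)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


def correlation(a, b):
    """Covariance divided by the product of the two standard deviations."""
    return sample_covariance(a, b) / (
        sqrt(sample_variance(a)) * sqrt(sample_variance(b))
    )


x_bar = mean(x)
var_x = sample_variance(x)
sd_x = sqrt(var_x)
cov_xy = sample_covariance(x, y)
corr_xy = correlation(x, y)
```

Because the made-up y is an exact linear function of x, the correlation here comes out at (numerically) 1; with real data it lies between -1 and 1.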
The Simple Regression Model:
> Example: To what extent is the price of a house determined by its size? Effect of X on Y …
Deterministic Model: Specifies the relation between variables exactly, without room for deviation
Probabilistic Model: Deterministic component + Random error, stochastic, or chance component
> Example: Y = 1.5X + E
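The difference between the two model types can be made concrete with a small simulation. This is a sketch under assumptions: the x values are made up, and the error E is drawn from a normal distribution with mean 0 and standard deviation 1, which the notes do not specify.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical values of X.
x = [1.0, 2.0, 3.0, 4.0, 5.0]

# Deterministic model: Y is fixed exactly by X.
y_deterministic = [1.5 * xi for xi in x]

# Probabilistic model: the same deterministic part plus a random error E.
# Normal(0, 1) errors are an assumption made for illustration.
errors = [random.gauss(0.0, 1.0) for _ in x]
y_probabilistic = [1.5 * xi + ei for xi, ei in zip(x, errors)]
```

Each probabilistic observation deviates from the line Y = 1.5X by exactly its error draw, while the deterministic values lie on the line.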
The Linear Model:
> Y = B0 + B1x + … + e
Y = Dependent /// X = Independent
B0 + B1x = Deterministic part /// e = Stochastic part
B0 = Intercept /// B1 = Increase or decrease of Y for each one-unit increase in X
Y = Dependent Variable = Left-hand side, Explained, Regressand, Outcome Variable
X = Independent Variable = Right-hand side, Explanatory, Regressor, Covariate, Control Variable
Notation:
> Yi = B0 + B1 Xi + Ei (Cross-sectional, e.g. consumption of individuals at the same point in time)
Multivariate Linear Models:
> Yi = B0 + B1 X1i + B2 X2i + B3 X3i … + Ei (i = 1,2,3, …, N)
Time Series:
> Yt = B0 + B1 Xt … + Et (t = 1,2,3, …, T)
>>> Has one entity and T different time periods
Estimation with Least Squares Method (LSM):
> Ordinary Least Squares (OLS) regression chooses the line that minimizes the squared deviations of the observations from the line
> There is only one line for which the “sum of squares” of the deviations is minimal
Theoretical versus Estimated Regression:
> Theoretical => Yi = B0 + B1 Xi + Ei
> Estimated => Y^i = 103.40 + 6.38 Xi (fitted value, no error term)
Deriving the Ordinary Least Squares Estimates (OLS):
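> OLS is fitting a line through the sample points such that the “sum of squared residuals” is minimal
> We can set up a formal minimization problem: choose B0^ and B1^ to minimize SUM of (Yi - B0^ - B1^ Xi)^2
> The residual, Êi = Yi - B0^ - B1^ Xi, is an estimate of the error term, Ei, and is the
difference between the sample point and the fitted line
> The first-order condition for B0^ leads to Ybar = B0^ + B1^ Xbar, so B0^ = Ybar - B1^ Xbar (Xbar, Ybar = sample means)
> The first-order condition for B1^ brings in the variation of X, SUM of (Xi - Xbar)^2
> and the co-variation of X and Y, SUM of (Xi - Xbar)(Yi - Ybar)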
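The minimization problem in this section can be written out as a short derivation. This is the standard textbook result for the simple regression model, given as a sketch; the hat and bar notation for estimates and sample means is an assumption about how the lecture writes them.

```latex
% Objective: choose the estimates to minimize the sum of squared residuals
\min_{\hat\beta_0,\,\hat\beta_1} \; \sum_{i=1}^{n} \bigl(y_i - \hat\beta_0 - \hat\beta_1 x_i\bigr)^2

% First-order condition with respect to \hat\beta_0
-2 \sum_{i=1}^{n} \bigl(y_i - \hat\beta_0 - \hat\beta_1 x_i\bigr) = 0
\quad\Longrightarrow\quad
\bar{y} = \hat\beta_0 + \hat\beta_1 \bar{x}

% First-order condition with respect to \hat\beta_1
-2 \sum_{i=1}^{n} x_i \bigl(y_i - \hat\beta_0 - \hat\beta_1 x_i\bigr) = 0

% Substitute \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} and use
% \sum x_i (y_i - \bar{y}) = \sum (x_i - \bar{x})(y_i - \bar{y}), \quad
% \sum x_i (x_i - \bar{x}) = \sum (x_i - \bar{x})^2
\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}
```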
The slope estimate B1^ is now:
> B1^ = SUM of (Xi - Xbar)(Yi - Ybar) / SUM of (Xi - Xbar)^2 (Xbar, Ybar = sample means)
> In words: the slope coefficient B1^ is the “sample covariance” between X and Y
divided by the sample variance of X (one requirement: X has to vary in the sample)
> If X and Y are positively correlated in the sample, B1^ will be positive; if negatively correlated, negative
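The covariance-over-variance rule for the slope can be sketched in code. The data below (house size and price) are made up for illustration, and the function name `ols_simple` is hypothetical, not something from the lecture.

```python
# Hypothetical data: house size (x) and house price (y).
x = [50.0, 70.0, 90.0, 110.0, 130.0]
y = [420.0, 560.0, 670.0, 810.0, 930.0]


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    # Their ratio equals sample covariance / sample variance, because the
    # common divisor (n - 1) cancels.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation; the slope is not defined")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0_hat, b1_hat = ols_simple(x, y)
```

For this made-up sample the slope is 6.35 and the intercept 106.5. The explicit check on `sxx` mirrors the requirement that X has to vary.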
Algebraic Properties of Ordinary Least Squares (OLS):
> The sum of the OLS residuals is zero: SUM of Êi = 0
> The chosen estimates B0^ and B1^ make the residuals add up to 0
> The sample covariance between the regressors and the OLS residuals is zero
> The OLS regression line always goes through the mean of the sample: Ybar = B0^ + B1^ Xbar
> We can think of each observation as being made up of an explained and unexplained part
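These algebraic properties can be checked numerically on any sample. The sketch below uses made-up data and recomputes the OLS estimates inline so that it runs on its own; all variable names are illustrative.

```python
# Hypothetical sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates for the simple regression of y on x.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

# Explained part (fitted values) and unexplained part (residuals).
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Property 1: the residuals sum to zero (up to floating-point error).
sum_resid = sum(residuals)

# Property 2: the sample covariance between x and the residuals is zero.
cov_x_resid = sum((xi - x_bar) * ei for xi, ei in zip(x, residuals)) / (n - 1)

# Property 3: the fitted line passes through the point (x_bar, y_bar).
line_at_mean = b0 + b1 * x_bar
```

Up to rounding error, `sum_resid` and `cov_x_resid` are zero and `line_at_mean` equals `y_bar`; each `y[i]` equals `fitted[i] + residuals[i]`.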
Regression Through The Origin:
> Regression without an intercept term is possible, but it violates a Classical Assumption unless the
true intercept is zero, which is almost never the case
> Omitting the intercept forces the impact of the intercept into the other coefficients (bias!)
> Don’t use it: the sum of the residuals is generally not zero & no meaningful R2 can be computed
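The bias from dropping the intercept can be illustrated with a small example. The data are made up so that the true relation is y = 5 + x; the through-origin slope formula SUM(xi*yi) / SUM(xi^2) is the standard least-squares result for a model without an intercept and is not stated in the notes.

```python
# Hypothetical data generated from y = 5 + 1*x (non-zero intercept).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [6.0, 7.0, 8.0, 9.0, 10.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Regression with an intercept: recovers slope 1 and intercept 5.
b1_with_intercept = sum(
    (xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)
) / sum((xi - x_bar) ** 2 for xi in x)
b0_with_intercept = y_bar - b1_with_intercept * x_bar

# Regression through the origin: the slope has to absorb the intercept.
b1_origin = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

# Residuals of the through-origin fit no longer sum to zero.
resid_origin = [yi - b1_origin * xi for xi, yi in zip(x, y)]
sum_resid_origin = sum(resid_origin)
```

Here the through-origin slope is well above the true slope of 1, and the residuals sum to a clearly non-zero number.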