Research Methods: Applied Empirical Economics
Experimental data → selection bias is eliminated ex ante (beforehand)
- IV → field experiments with non-compliance
Observational data → two ways to eliminate selection bias ex post (afterwards)
- Instrumental Variables / Regression Discontinuity → Recognize events that are
as-good-as-random
- Differences in differences (and matching techniques) → Make sure that you control
for all variables that may be correlated with both the treatment and the outcome
variables → no omitted variables left
Exogenous: a variable whose value is determined outside the model and is imposed on the
model. An exogenous change is a change in an exogenous variable.
Endogenous: a variable whose value is determined by the model. An endogenous
change is a change in an endogenous variable in response to an exogenous change that is
imposed upon the model.
Omitted → left out; not a problem per se, only when conditions (1) and (2) both hold
Omitted variable bias when:
• Yi = α + βQi + γAi + εi
• Yi: dependent variable
• Qi: treatment variable
• Ai: control variable (the variable that may be omitted)
o (1) Omitted variable is correlated with the treatment variable, and
o (2) Omitted variable has a direct effect on the dependent variable
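A minimal simulation sketch of the two conditions above. The parameter values (α = 1, β = 2, γ = 3, and the 0.5 link between Ai and Qi) are hypothetical and chosen only for illustration, as is the helper ols_slope. When Ai is left out, the slope on Qi drifts from β towards β + γ·δ, where δ is the slope from regressing Ai on Qi.

```python
import random

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (hypothetical helper)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

random.seed(0)
n = 20000
alpha, beta, gamma = 1.0, 2.0, 3.0   # hypothetical true parameters

# Condition (1): the control A is correlated with the treatment Q.
A = [random.gauss(0, 1) for _ in range(n)]
Q = [0.5 * a + random.gauss(0, 1) for a in A]
# Condition (2): A has a direct effect on Y (gamma != 0).
Y = [alpha + beta * q + gamma * a + random.gauss(0, 1) for q, a in zip(Q, A)]

short = ols_slope(Q, Y)   # regression of Y on Q that omits A
delta = ols_slope(Q, A)   # regression of the omitted A on Q

# The short-regression slope is close to beta + gamma * delta, not to beta.
print(round(short, 2), round(beta + gamma * delta, 2))
```

Setting gamma to 0, or removing the 0.5 * a term from Q, switches off one of the two conditions, and the short slope then stays close to β.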
In statistics and optimization, errors and residuals are two closely related and easily
confused measures of the deviation of an observed value of an element of a statistical
sample from its "theoretical value". The error (or disturbance) of an observed value is the
deviation of the observed value from the (unobservable) true value of a quantity of interest
(for example, a population mean), and the residual of an observed value is the difference
between the observed value and the estimated value of the quantity of interest (for
example, a sample mean). The distinction is most important in regression analysis, where
the concepts are sometimes called the regression errors and regression residuals and where
they lead to the concept of studentized residuals.
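A small sketch of the difference, assuming a made-up population with true mean 10 (in real data the true mean is unknown, so the errors cannot be computed). Residuals are measured against the sample mean and therefore sum to zero; errors are measured against the true mean and generally do not.

```python
import random

random.seed(1)
mu = 10.0   # hypothetical true population mean (unobservable in practice)
sample = [random.gauss(mu, 2.0) for _ in range(50)]
xbar = sum(sample) / len(sample)   # estimated value: the sample mean

errors = [x - mu for x in sample]         # deviation from the true value
residuals = [x - xbar for x in sample]    # deviation from the estimated value

print(sum(residuals))   # zero up to floating-point rounding
print(sum(errors))      # equals n * (xbar - mu), generally not zero
```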
IV / RD
Non-compliance:
o Treatment migration: control group gets treated
o Treatment dilution: assigned to treatment but not treated
- LATE = “Local” Average Treatment Effect
ρ → ITT (intention-to-treat effect)
First stage: φ
Second stage: λ (the LATE; λ = ρ / φ)
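A simulation sketch of how the three quantities relate, assuming the standard result that the LATE equals the ITT divided by the first stage (λ = ρ / φ). All numbers are hypothetical: 60% of people follow their assignment, 10% get treated regardless (treatment migration), 30% are never treated (treatment dilution), and treatment raises the outcome by 4.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

random.seed(2)
n = 50000
effect = 4.0   # hypothetical causal effect of actually receiving treatment

y_by_z = {0: [], 1: []}   # outcomes grouped by random assignment
d_by_z = {0: [], 1: []}   # treatment actually received, grouped by assignment
for _ in range(n):
    z = 1 if random.random() < 0.5 else 0   # random assignment (the instrument)
    u = random.random()
    if u < 0.6:      # follows the assignment
        d, base = z, 0.0
    elif u < 0.7:    # treated even in the control group (migration)
        d, base = 1, 2.0
    else:            # untreated even when assigned (dilution)
        d, base = 0, -1.0
    y = base + effect * d + random.gauss(0, 1)
    y_by_z[z].append(y)
    d_by_z[z].append(d)

rho = mean(y_by_z[1]) - mean(y_by_z[0])   # ITT: effect of being assigned
phi = mean(d_by_z[1]) - mean(d_by_z[0])   # first stage: effect of assignment on treatment
lam = rho / phi                           # LATE: effect for those who follow assignment
print(round(rho, 2), round(phi, 2), round(lam, 2))
```

The ITT is smaller than the effect of treatment itself because assignment only changes treatment status for part of the sample; dividing by the first stage rescales it.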
Randomized controlled trials → RCT
- Measuring the causal effect of treatment
- Get rid of selection bias
- Randomization:
o Similar before treatment → apples and apples
o Does the treatment have an effect?
- Causal effect of insurance Y1i – Y0i
o Suppose there are two potential outcomes Y for individual i:
o Y1i health status with insurance
o Y0i health status without insurance
- Difference in group means = Avgn[Y1i|Di=1] – Avgn’[Y0i|Di=0]
o Avg: average
o Y1i: with insurance
o Y0i: without insurance
o Di=1: individual insured
o Di=0: individual uninsured
- Treatment has same effect for everybody: Y1i – Y0i = κ
- Difference in group means
o = Avgn[Y1i|Di=1] – Avgn’[Y0i|Di=0]
o = Avgn[Y0i|Di=1] + κ – Avgn’[Y0i|Di=0]
o = κ + Avgn[Y0i|Di=1] – Avgn’[Y0i|Di=0]
o = average causal effect + selection bias
- Difference in group means captures the causal effect if:
o Avgn[Y0i|Di=1] = Avgn’[Y0i|Di=0]
o This is exactly what randomization does
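The decomposition above can be checked in a simulation sketch. Everything here is hypothetical: a constant effect κ = 1, standard-normal baseline health Y0i, and a self-selection rule in which healthier people are more likely to insure. Both potential outcomes are visible only because the data are simulated.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

random.seed(3)
n = 50000
kappa = 1.0   # hypothetical constant treatment effect

y0 = [random.gauss(0, 1) for _ in range(n)]   # health without insurance
y1 = [v + kappa for v in y0]                  # health with insurance

def diff_in_means(d):
    """Observed comparison: Y1i for the insured minus Y0i for the uninsured."""
    treated = [y1[i] for i in range(n) if d[i]]
    control = [y0[i] for i in range(n) if not d[i]]
    return mean(treated) - mean(control)

def selection_bias(d):
    """Difference in Y0i between the insured and the uninsured."""
    return (mean([y0[i] for i in range(n) if d[i]])
            - mean([y0[i] for i in range(n) if not d[i]]))

# Self-selection: better baseline health makes insurance more likely.
d_self = [v + random.gauss(0, 1) > 0 for v in y0]
# Randomization: a coin flip decides who is insured.
d_rand = [random.random() < 0.5 for _ in range(n)]

print(round(diff_in_means(d_self), 2), round(selection_bias(d_self), 2))
print(round(diff_in_means(d_rand), 2), round(selection_bias(d_rand), 2))
```

Under self-selection the difference in means equals κ plus a clearly positive selection bias; under the coin flip the bias is close to zero and the difference in means is close to κ.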
- Useful statistics:
o Estimated treatment coefficient
o Estimated standard error
o T-value = Estimated treatment coefficient / Estimated standard error
▪ T-value < –2 or > 2:
▪ Reject null hypothesis of no treatment effect (at roughly the 5% level)
o 95% confidence interval: [coefficient – 2*SE, coefficient + 2*SE]
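A sketch of these statistics for a simple comparison of two group means. The outcome data are simulated and hypothetical, and the standard error formula used here (square root of the sum of each group's sample variance divided by its size) is one common choice for a difference in means, not necessarily the one used in the course.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

random.seed(4)
treated = [random.gauss(1.0, 2.0) for _ in range(400)]   # hypothetical outcomes
control = [random.gauss(0.0, 2.0) for _ in range(400)]

coef = mean(treated) - mean(control)   # estimated treatment coefficient
se = math.sqrt(sample_var(treated) / len(treated)
               + sample_var(control) / len(control))   # estimated standard error
t_value = coef / se
ci = (coef - 2 * se, coef + 2 * se)    # approximate 95% confidence interval
reject = abs(t_value) > 2              # reject "no treatment effect"?

print(round(coef, 2), round(se, 2), round(t_value, 2), reject)
print(tuple(round(b, 2) for b in ci))
```

The rule |t| > 2 and the interval of ±2 standard errors are the same test seen from two sides: the null is rejected when zero lies outside the interval.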