ECO5185 Midterm Answers; Fall 2025.
Question 1
a. Briefly explain (in words) the idea/concept of each of the following, and briefly explain their
implication for econometric analyses
(a) random sample
Answer: A random sample is a sample whose draws are independent and identically
distributed. It means that, for the simple linear regression model, assumption A3, i.e.
E(ϵi | X) = 0 (1)
simplifies to
E(ϵi | xi2) = 0. (2)
As such, for the OLS estimator to be unbiased we only need to worry about the explanatory
variable being uncorrelated with the error term within each draw. Random sampling also has
implications for the structure of the variance of our estimator.
(b) random variable
Answer: A random variable is a variable whose outcome is uncertain/unknown.
Given that yi and xik (i = 1, . . . , n and k = 1, . . . , K) are random variables, the OLS
estimator, bk (k = 1, . . . , K), is also a random variable. It has a mean and a variance,
and as such has statistical properties, which is what allows us to carry out hypothesis testing.
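The point that the OLS estimator is itself a random variable can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the parameter values (β1 = 1, β2 = 2), the normal distributions, and the sample sizes are hypothetical choices, not part of the question.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from regressing y on x with a constant."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(n_samples=500, n=50, beta1=1.0, beta2=2.0, seed=0):
    """Draw many random samples from the same population and estimate b2 in each."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_samples):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta1 + beta2 * xi + rng.gauss(0, 1) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


slopes = simulate_slopes()
# b2 differs from sample to sample, so it has its own mean and variance.
print("mean of b2 across samples:", statistics.fmean(slopes))
print("std. dev. of b2 across samples:", statistics.stdev(slopes))
```

Each sample gives a different b2, and the spread of these estimates is what a standard error tries to measure.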
b. Discuss how the addition of a control variable has implications for
(a) the variation used to estimate a parameter of interest
Answer: By the FWL theorem, the addition of a control variable means that we are not
using some of the variation in x2 when estimating its parameter (i.e. β2). More precisely,
we are not using the variation in x2 that is correlated with the new control.
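The FWL argument can be checked numerically. The following is a minimal sketch, assuming a hypothetical data-generating process with one control x3 that is correlated with x2; all coefficient values are invented for illustration. It compares the coefficient on x2 from the full regression with the coefficient obtained after first partialling x3 out of x2.

```python
import random


def demean(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


rng = random.Random(1)
n = 200
x3 = [rng.gauss(0, 1) for _ in range(n)]                 # the new control
x2 = [0.8 * c + rng.gauss(0, 1) for c in x3]             # correlated with the control
y = [1.0 + 2.0 * a + 3.0 * c + rng.gauss(0, 1) for a, c in zip(x2, x3)]

# Demeaning removes the constant from every variable.
x2_d, x3_d, y_d = demean(x2), demean(x3), demean(y)

# (1) Full regression of y on a constant, x2 and x3:
#     solve the 2x2 normal equations for the demeaned data.
s22, s33, s23 = dot(x2_d, x2_d), dot(x3_d, x3_d), dot(x2_d, x3_d)
s2y, s3y = dot(x2_d, y_d), dot(x3_d, y_d)
b2_full = (s2y * s33 - s3y * s23) / (s22 * s33 - s23 ** 2)

# (2) FWL: regress x2 on x3, keep the residual, then regress y on that residual.
gamma = s23 / s33
x2_resid = [a - gamma * c for a, c in zip(x2_d, x3_d)]
b2_fwl = dot(x2_resid, y_d) / dot(x2_resid, x2_resid)

print("b2 from full regression:", b2_full)
print("b2 from FWL two-step:   ", b2_fwl)
print("variation in x2 before / after partialling out x3:",
      s22, dot(x2_resid, x2_resid))
```

The two estimates coincide, and the residual variation in x2 is smaller than its total variation, which is the sense in which the control "uses up" part of the variation in x2.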
(b) the validity of a test
Answer: Adding a control may result in assumption A3 holding, if the additional variable
belongs in the model and is correlated with the explanatory variable. Recall that if A3
does not hold the t-test and F-test are not valid.
c. Briefly evaluate the following statement
“The R² measures the proportion of the variation in the dependent variable that is explained by
the regression line, and as such, a high R² implies the estimate of the parameter of interest is
probably close to the true parameter.”
Answer: The first part of the statement is true. The R² measures how much of the variation in
the dependent variable is explained by the OLS regression line. The second part of the
statement, however, is not true. A high R² has no bearing on whether assumption A3 holds,
and therefore no bearing on whether the estimator is unbiased or consistent. As such, it has
no bearing on whether the estimate is close (or probably close) to the true parameter.
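A small simulation can make this concrete. The sketch below assumes a hypothetical population in which an omitted variable x3 is strongly correlated with x2, so A3 fails when y is regressed on x2 alone; every number in it is an illustrative assumption.

```python
import random


def simple_ols(x, y):
    """Return (intercept, slope, R-squared) from regressing y on x with a constant."""
    n_obs = len(x)
    x_bar = sum(x) / n_obs
    y_bar = sum(y) / n_obs
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return intercept, slope, 1.0 - ssr / sst


rng = random.Random(2)
n = 1000
beta2_true = 2.0                                       # hypothetical true parameter
x2 = [rng.gauss(0, 1) for _ in range(n)]
x3 = [a + rng.gauss(0, 0.5) for a in x2]               # omitted, correlated with x2
y = [1.0 + beta2_true * a + 3.0 * c + rng.gauss(0, 0.5) for a, c in zip(x2, x3)]

# Regress y on x2 only, leaving x3 in the error term.
b1, b2, r2 = simple_ols(x2, y)
print("R-squared:", r2)
print("estimate b2:", b2, "true beta2:", beta2_true)
```

In this setup the fit is very good, yet the slope estimate sits far from the true β2 because it also picks up the effect of the omitted variable.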
d. You are interested in exploring whether men, on average, have a higher hourly wage than
women in the Canadian labour market. Your co-author suggests the following estimator
\[ \frac{\sum_{i \in \text{male}} \text{wage}_i}{n_m} - \frac{\sum_{i \in \text{female}} \text{wage}_i}{n_f} \]
where wagei is the hourly wage of individual i, and nm and nf are the number of males and
females in the sample, respectively.
Would this estimator, when applied to data, generate a guess that is close to the true parameter
of interest? Justify your answer.
Answer: This is a method of moments estimator (i.e. the sample analog of the population
moments). It is made up of sample means, and we know that sample means converge in
probability to population means if we have a random sample. Now, if the estimator is
consistent (i.e. we have a random sample) and we have a large sample, one can conclude that
the estimate is probably close to the true population parameter. We cannot, however, say so
with certainty.
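The estimator above is simple to compute. The sketch below uses a hypothetical helper named wage_gap and made-up wage data, purely to show the calculation.

```python
def wage_gap(wages, is_male):
    """Difference between the male and female sample mean wage."""
    male = [w for w, m in zip(wages, is_male) if m]
    female = [w for w, m in zip(wages, is_male) if not m]
    if not male or not female:
        raise ValueError("need at least one observation in each group")
    return sum(male) / len(male) - sum(female) / len(female)


# Made-up hourly wages for five individuals (illustration only).
wages = [30.0, 25.0, 28.0, 22.0, 35.0]
is_male = [True, False, True, False, True]

gap = wage_gap(wages, is_male)
print(gap)  # 31.0 - 23.5 = 7.5
```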
Question 2
Assume the true population model takes the form
yi = β1 + β2xi2 + ϵi
a. Set up the OLS minimization problem where one regresses y on x2 without a constant, derive
the F.O.C.(s), and show that the OLS estimator for the slope parameter will be the following
\[ b_2 = \frac{\sum_{i=1}^{n} x_{i2} y_i}{\sum_{i=1}^{n} x_{i2}^2} \qquad (3) \]
Answer:
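\begin{gather*}
\min_{b_2} \sum_{i=1}^{n} e_i^2 \\
\min_{b_2} \sum_{i=1}^{n} (y_i - b_2 x_{i2})^2 \\
-2 \sum_{i=1}^{n} (y_i - b_2 x_{i2}) x_{i2} = 0 \\
\sum_{i=1}^{n} (y_i - b_2 x_{i2}) x_{i2} = 0 \\
\sum_{i=1}^{n} [y_i x_{i2} - b_2 x_{i2}^2] = 0 \\
\sum_{i=1}^{n} [y_i x_{i2}] - b_2 \sum_{i=1}^{n} [x_{i2}^2] = 0 \\
\sum_{i=1}^{n} [y_i x_{i2}] = b_2 \sum_{i=1}^{n} [x_{i2}^2] \\
\frac{\sum_{i=1}^{n} [y_i x_{i2}]}{\sum_{i=1}^{n} [x_{i2}^2]} = b_2
\end{gather*}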
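As a quick numerical check of this derivation, the sketch below computes b2 from the closed form on simulated data, then confirms that the first-order condition holds and that nearby values of b give a larger sum of squared residuals. The data-generating values are arbitrary assumptions.

```python
import random

rng = random.Random(3)
n = 100
x = [rng.gauss(1, 1) for _ in range(n)]
y = [0.5 + 2.0 * xi + rng.gauss(0, 1) for xi in x]


def ssr(b):
    """Sum of squared residuals when regressing y on x without a constant."""
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))


# Closed-form estimator: sum of x*y over sum of x squared.
b2 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

# First-order condition: the residuals are orthogonal to x at b2.
foc = sum((yi - b2 * xi) * xi for xi, yi in zip(x, y))

print("b2:", b2)
print("F.O.C. evaluated at b2 (should be ~0):", foc)
print("SSR at b2 - 0.01, b2, b2 + 0.01:", ssr(b2 - 0.01), ssr(b2), ssr(b2 + 0.01))
```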
b. Show that the OLS estimator of the slope parameter (as shown in equation (3)) can be
rewritten as
\[ b_2 = \beta_2 + \frac{\sum_{i=1}^{n} x_{i2} \epsilon_i}{\sum_{i=1}^{n} x_{i2}^2} \]
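The rewriting can be checked numerically by substituting the population model into the estimator. The sketch below is an illustration only, with assumed parameter values and a hypothetical helper named decompose. It keeps track of the intercept separately: substituting yi = β1 + β2xi2 + ϵi gives an extra term β1 Σxi2 / Σxi2², which is zero when β1 = 0 (an assumption made here for the check; the text above does not state it).

```python
import random


def decompose(beta1, beta2=2.0, n=100, seed=4):
    """Return (b2, beta2 + noise term, intercept term) for the no-constant regression."""
    rng = random.Random(seed)
    x = [rng.gauss(1, 1) for _ in range(n)]
    eps = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta1 + beta2 * xi + ei for xi, ei in zip(x, eps)]
    sxx = sum(xi ** 2 for xi in x)
    b2 = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    noise_term = sum(xi * ei for xi, ei in zip(x, eps)) / sxx
    intercept_term = beta1 * sum(x) / sxx
    return b2, beta2 + noise_term, intercept_term


# With beta1 = 0 the estimator equals beta2 plus the noise term.
b2_zero, rhs_zero, extra_zero = decompose(beta1=0.0)
print(b2_zero, rhs_zero, extra_zero)

# With a nonzero intercept the extra term has to be added to match b2.
b2_one, rhs_one, extra_one = decompose(beta1=1.0)
print(b2_one, rhs_one + extra_one)
```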