SOLUTIONS MANUAL
Chapter 1
Exercise 1.1 To verify first that the representation holds, compute the second partial derivative of $\ln p(x,\theta)$ with respect to $\theta$. It is
\[
\frac{\partial^2 \ln p(x,\theta)}{\partial\theta^2}
= -\,\frac{1}{\big[p(x,\theta)\big]^2}\Big(\frac{\partial p(x,\theta)}{\partial\theta}\Big)^2
+ \frac{1}{p(x,\theta)}\,\frac{\partial^2 p(x,\theta)}{\partial\theta^2}
= -\Big(\frac{\partial \ln p(x,\theta)}{\partial\theta}\Big)^2
+ \frac{1}{p(x,\theta)}\,\frac{\partial^2 p(x,\theta)}{\partial\theta^2}.
\]
Multiplying by $p(x,\theta)$ and rearranging the terms produces the result,
\[
\Big(\frac{\partial \ln p(x,\theta)}{\partial\theta}\Big)^2 p(x,\theta)
= \frac{\partial^2 p(x,\theta)}{\partial\theta^2}
- \Big(\frac{\partial^2 \ln p(x,\theta)}{\partial\theta^2}\Big)\, p(x,\theta).
\]
Now, integrating both sides of this equality with respect to $x$, we obtain
\[
I_n(\theta) = n\,\mathbb{E}_\theta\Big[\Big(\frac{\partial \ln p(X,\theta)}{\partial\theta}\Big)^2\Big]
= n \int_{\mathbb{R}} \Big(\frac{\partial \ln p(x,\theta)}{\partial\theta}\Big)^2 p(x,\theta)\,dx
\]
\[
= n \int_{\mathbb{R}} \frac{\partial^2 p(x,\theta)}{\partial\theta^2}\,dx
\;-\; n \int_{\mathbb{R}} \Big(\frac{\partial^2 \ln p(x,\theta)}{\partial\theta^2}\Big) p(x,\theta)\,dx
\]
\[
= n\,\underbrace{\frac{\partial^2}{\partial\theta^2}\int_{\mathbb{R}} p(x,\theta)\,dx}_{=\,0}
\;-\; n \int_{\mathbb{R}} \Big(\frac{\partial^2 \ln p(x,\theta)}{\partial\theta^2}\Big) p(x,\theta)\,dx
= -\,n\,\mathbb{E}_\theta\Big[\frac{\partial^2 \ln p(X,\theta)}{\partial\theta^2}\Big].
\]
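As a quick numerical sanity check (not part of the original solution), the two representations of the Fisher information can be compared for a concrete model. The sketch below assumes the exponential density $p(x,\theta) = \theta e^{-\theta x}$ and uses numerical integration; both sides should equal $1/\theta^2$.

```python
import numpy as np
from scipy.integrate import quad

theta = 2.0
p = lambda x: theta * np.exp(-theta * x)        # density p(x, theta) = theta * exp(-theta x)
score_sq = lambda x: (1.0 / theta - x) ** 2     # (d ln p / d theta)^2
d2_logp = -1.0 / theta ** 2                     # d^2 ln p / d theta^2 (constant in x here)

lhs, _ = quad(lambda x: score_sq(x) * p(x), 0, np.inf)   # E[(d ln p / d theta)^2]
rhs = -d2_logp                                           # -E[d^2 ln p / d theta^2]

print(lhs, rhs, 1.0 / theta ** 2)   # all three values are (close to) 0.25
```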
Exercise 1.2 The first step is to notice that $\theta_n^*$ is an unbiased estimator of $\theta$. Indeed,
\[
\mathbb{E}_\theta[\theta_n^*] = \mathbb{E}_\theta\Big[\frac{1}{n}\sum_{i=1}^n (X_i-\mu)^2\Big] = \mathbb{E}_\theta\big[(X_1-\mu)^2\big] = \theta.
\]
Further, the log-likelihood function for the $\mathcal{N}(\mu,\theta)$ distribution has the form
\[
\ln p(x,\theta) = -\,\frac{1}{2}\ln(2\pi\theta) - \frac{(x-\mu)^2}{2\theta}.
\]
Therefore,
\[
\frac{\partial \ln p(x,\theta)}{\partial\theta} = -\,\frac{1}{2\theta} + \frac{(x-\mu)^2}{2\theta^2},
\quad\text{and}\quad
\frac{\partial^2 \ln p(x,\theta)}{\partial\theta^2} = \frac{1}{2\theta^2} - \frac{(x-\mu)^2}{\theta^3}.
\]
Applying the result of Exercise 1.1, we get
\[
I_n(\theta) = -\,n\,\mathbb{E}_\theta\Big[\frac{\partial^2 \ln p(X,\theta)}{\partial\theta^2}\Big]
= -\,n\,\mathbb{E}_\theta\Big[\frac{1}{2\theta^2} - \frac{(X-\mu)^2}{\theta^3}\Big]
= -\,n\Big[\frac{1}{2\theta^2} - \frac{\theta}{\theta^3}\Big] = \frac{n}{2\theta^2}.
\]
Next, using the fact that $\sum_{i=1}^n (X_i-\mu)^2/\theta$ has a chi-squared distribution with $n$ degrees of freedom, and hence its variance equals $2n$, we arrive at
\[
\mathrm{Var}_\theta[\theta_n^*] = \mathrm{Var}_\theta\Big[\frac{1}{n}\sum_{i=1}^n (X_i-\mu)^2\Big]
= \frac{2n\theta^2}{n^2} = \frac{2\theta^2}{n} = \frac{1}{I_n(\theta)}.
\]
Thus, we have shown that $\theta_n^*$ is an unbiased estimator of $\theta$ and that its variance attains the Cramér-Rao lower bound, that is, $\theta_n^*$ is an efficient estimator of $\theta$.
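A short Monte Carlo sketch (illustrative only; the parameter values below are assumptions) confirms that $\theta_n^*$ is unbiased and that its variance is close to $2\theta^2/n = 1/I_n(\theta)$.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, theta, n, reps = 1.0, 4.0, 50, 200_000      # theta plays the role of the variance of N(mu, theta)

X = rng.normal(mu, np.sqrt(theta), size=(reps, n))
theta_star = ((X - mu) ** 2).mean(axis=1)       # theta*_n = (1/n) sum (X_i - mu)^2

print(theta_star.mean(), theta)                 # unbiasedness: both approx 4.0
print(theta_star.var(), 2 * theta**2 / n)       # variance approx 2 theta^2 / n = 0.64
```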
Exercise 1.3 For the Bernoulli($\theta$) distribution,
\[
\ln p(x,\theta) = x\ln\theta + (1-x)\ln(1-\theta),
\]
thus,
\[
\frac{\partial \ln p(x,\theta)}{\partial\theta} = \frac{x}{\theta} - \frac{1-x}{1-\theta}
\quad\text{and}\quad
\frac{\partial^2 \ln p(x,\theta)}{\partial\theta^2} = -\,\frac{x}{\theta^2} - \frac{1-x}{(1-\theta)^2}.
\]
From here,
\[
I_n(\theta) = -\,n\,\mathbb{E}_\theta\Big[-\frac{X}{\theta^2} - \frac{1-X}{(1-\theta)^2}\Big]
= n\Big(\frac{\theta}{\theta^2} + \frac{1-\theta}{(1-\theta)^2}\Big) = \frac{n}{\theta(1-\theta)}\,.
\]
On the other hand, $\mathbb{E}_\theta[\bar{X}_n] = \mathbb{E}_\theta[X] = \theta$ and $\mathrm{Var}_\theta[\bar{X}_n] = \mathrm{Var}_\theta[X]/n = \theta(1-\theta)/n = 1/I_n(\theta)$. Therefore $\theta_n^* = \bar{X}_n$ is efficient.
Exercise 1.4 In the Poisson($\theta$) model,
\[
\ln p(x,\theta) = x\ln\theta - \theta - \ln x!\,,
\]
hence,
\[
\frac{\partial \ln p(x,\theta)}{\partial\theta} = \frac{x}{\theta} - 1
\quad\text{and}\quad
\frac{\partial^2 \ln p(x,\theta)}{\partial\theta^2} = -\,\frac{x}{\theta^2}.
\]
Thus,
\[
I_n(\theta) = -\,n\,\mathbb{E}_\theta\Big[-\frac{X}{\theta^2}\Big] = \frac{n}{\theta}\,.
\]
The estimate $\bar{X}_n$ is unbiased with the variance $\mathrm{Var}_\theta[\bar{X}_n] = \theta/n = 1/I_n(\theta)$, and therefore efficient.
Exercise 1.5 For the given exponential density,
\[
\ln p(x,\theta) = -\ln\theta - x/\theta\,,
\]
whence,
\[
\frac{\partial \ln p(x,\theta)}{\partial\theta} = -\,\frac{1}{\theta} + \frac{x}{\theta^2}
\quad\text{and}\quad
\frac{\partial^2 \ln p(x,\theta)}{\partial\theta^2} = \frac{1}{\theta^2} - \frac{2x}{\theta^3}.
\]
Therefore,
\[
I_n(\theta) = -\,n\,\mathbb{E}_\theta\Big[\frac{1}{\theta^2} - \frac{2X}{\theta^3}\Big]
= -\,n\Big[\frac{1}{\theta^2} - \frac{2\theta}{\theta^3}\Big] = \frac{n}{\theta^2}\,.
\]
Also, $\mathbb{E}_\theta[\bar{X}_n] = \theta$ and $\mathrm{Var}_\theta[\bar{X}_n] = \theta^2/n = 1/I_n(\theta)$. Hence efficiency holds.
Exercise 1.6 If $X_1,\dots,X_n$ are independent exponential random variables with the mean $1/\theta$, their sum $Y = \sum_{i=1}^n X_i$ has a gamma distribution with the density
\[
f_Y(y) = \frac{y^{n-1}\theta^n e^{-y\theta}}{\Gamma(n)}\,, \quad y > 0.
\]
Consequently,
\[
\mathbb{E}_\theta\Big[\frac{1}{\bar{X}_n}\Big] = \mathbb{E}_\theta\Big[\frac{n}{Y}\Big]
= n\int_0^\infty \frac{1}{y}\,\frac{y^{n-1}\theta^n e^{-y\theta}}{\Gamma(n)}\,dy
= \frac{n\theta}{\Gamma(n)}\int_0^\infty y^{n-2}\theta^{n-1} e^{-y\theta}\,dy
= \frac{n\theta\,\Gamma(n-1)}{\Gamma(n)}
= \frac{n\theta\,(n-2)!}{(n-1)!} = \frac{n\theta}{n-1}\,.
\]
Also,
\[
\mathrm{Var}_\theta\big[1/\bar{X}_n\big] = \mathrm{Var}_\theta\big[n/Y\big]
= n^2\Big(\mathbb{E}_\theta\big[1/Y^2\big] - \big(\mathbb{E}_\theta[1/Y]\big)^2\Big)
= n^2\Big[\frac{\theta^2\,\Gamma(n-2)}{\Gamma(n)} - \frac{\theta^2}{(n-1)^2}\Big]
\]
\[
= n^2\theta^2\Big[\frac{1}{(n-1)(n-2)} - \frac{1}{(n-1)^2}\Big]
= \frac{n^2\theta^2}{(n-1)^2(n-2)}\,.
\]
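Both formulas are easy to check by simulation. The following sketch (parameter values are assumptions) compares the empirical mean and variance of $1/\bar{X}_n$ with $n\theta/(n-1)$ and $n^2\theta^2/\big((n-1)^2(n-2)\big)$.

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 3.0, 10, 500_000               # X_i ~ exponential with mean 1/theta

Xbar = rng.exponential(1.0 / theta, size=(reps, n)).mean(axis=1)
inv_Xbar = 1.0 / Xbar

print(inv_Xbar.mean(), n * theta / (n - 1))                           # both approx 3.33
print(inv_Xbar.var(), n**2 * theta**2 / ((n - 1)**2 * (n - 2)))       # both approx 1.39
```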
Exercise 1.7 The trick here is to notice the relation
\[
\frac{\partial \ln p_0(x-\theta)}{\partial\theta}
= \frac{1}{p_0(x-\theta)}\,\frac{\partial p_0(x-\theta)}{\partial\theta}
= -\,\frac{1}{p_0(x-\theta)}\,\frac{\partial p_0(x-\theta)}{\partial x}
= -\,\frac{p_0'(x-\theta)}{p_0(x-\theta)}\,.
\]
Thus we can write
\[
I_n(\theta) = n\,\mathbb{E}_\theta\Big[\Big(-\frac{p_0'(X-\theta)}{p_0(X-\theta)}\Big)^2\Big]
= n\int_{\mathbb{R}} \frac{\big(p_0'(y)\big)^2}{p_0(y)}\,dy\,,
\]
which is a constant independent of $\theta$.
Exercise 1.8 Using the expression for the Fisher information derived in the previous exercise, we write
\[
I_n(\theta) = n\int_{\mathbb{R}} \frac{\big(p_0'(y)\big)^2}{p_0(y)}\,dy
= n\int_{-\pi/2}^{\pi/2} \frac{\big(-C\alpha\cos^{\alpha-1}y\,\sin y\big)^2}{C\cos^{\alpha}y}\,dy
= n\,C\alpha^2\int_{-\pi/2}^{\pi/2} \sin^2 y\,\cos^{\alpha-2}y\,dy
\]
\[
= n\,C\alpha^2\int_{-\pi/2}^{\pi/2} (1-\cos^2 y)\cos^{\alpha-2}y\,dy
= n\,C\alpha^2\int_{-\pi/2}^{\pi/2} \big(\cos^{\alpha-2}y - \cos^{\alpha}y\big)\,dy\,.
\]
Here the first term is integrable if $\alpha - 2 > -1$ (equivalently, $\alpha > 1$), while the second one is integrable if $\alpha > -1$. Therefore, the Fisher information exists when $\alpha > 1$.
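A numerical sketch (illustrative only; the constant $C$ and the factor $n\alpha^2$ are suppressed since only finiteness matters) shows how the integral $\int_{-\pi/2}^{\pi/2}\big(\cos^{\alpha-2}y - \cos^{\alpha}y\big)\,dy$ behaves: it stays moderate for $\alpha$ well above 1 and grows without bound as $\alpha$ approaches 1 from above.

```python
import numpy as np
from scipy.integrate import quad

def info_integral(alpha):
    # integral of cos^(alpha-2) y - cos^alpha y over (-pi/2, pi/2), up to the factor n C alpha^2
    f = lambda y: np.cos(y) ** (alpha - 2) - np.cos(y) ** alpha
    value, _ = quad(f, -np.pi / 2, np.pi / 2)
    return value

for alpha in [3.0, 2.0, 1.5, 1.1, 1.01]:
    print(alpha, info_integral(alpha))   # blows up as alpha approaches 1 from above
```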
Chapter 2
Exercise 2.9 By Exercise 1.4, the Fisher information of the Poisson($\theta$) sample is $I_n(\theta) = n/\theta$. The joint distribution of the sample is
\[
p(X_1,\dots,X_n,\theta) = C_n\,\theta^{\sum X_i} e^{-n\theta}
\]
where $C_n = C_n(X_1,\dots,X_n)$ is the normalizing constant independent of $\theta$. As a function of $\theta$, this joint probability has the algebraic form of a gamma distribution. Thus, if we select the prior density to be a gamma density, $\pi(\theta) = C(\alpha,\beta)\,\theta^{\alpha-1}e^{-\beta\theta}$, $\theta > 0$, for some positive $\alpha$ and $\beta$, then the weighted posterior density is also a gamma density,
\[
\tilde{f}(\theta\,|\,X_1,\dots,X_n) = I_n(\theta)\,C_n\,\theta^{\sum X_i} e^{-n\theta}\,C(\alpha,\beta)\,\theta^{\alpha-1}e^{-\beta\theta}
= \tilde{C}_n\,\theta^{\sum X_i+\alpha-2}\,e^{-(n+\beta)\theta}\,, \quad \theta > 0,
\]
where $\tilde{C}_n = n\,C_n(X_1,\dots,X_n)\,C(\alpha,\beta)$ is the normalizing constant. The expected value of the weighted posterior gamma distribution is equal to
\[
\int_0^\infty \theta\,\tilde{f}(\theta\,|\,X_1,\dots,X_n)\,d\theta = \frac{\sum X_i+\alpha-1}{n+\beta}\,.
\]
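The closed-form answer can be compared with a direct grid computation of the weighted posterior mean, $\int \theta\, I_n(\theta)\,p(X\,|\,\theta)\,\pi(\theta)\,d\theta \big/ \int I_n(\theta)\,p(X\,|\,\theta)\,\pi(\theta)\,d\theta$. The sketch below is illustrative; the sample values, prior parameters, and grid are assumptions.

```python
import numpy as np
from scipy.stats import poisson, gamma

rng = np.random.default_rng(2)
alpha, beta, n, theta_true = 2.0, 1.0, 25, 3.0
X = rng.poisson(theta_true, size=n)

theta = np.linspace(1e-4, 15, 20_000)                       # uniform grid over theta
likelihood = np.exp(poisson.logpmf(X[:, None], theta).sum(axis=0))
weighted_post = (n / theta) * likelihood * gamma.pdf(theta, a=alpha, scale=1.0 / beta)

grid_mean = (theta * weighted_post).sum() / weighted_post.sum()
print(grid_mean, (X.sum() + alpha - 1) / (n + beta))        # the two values should agree
```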
Exercise 2.10 As shown in Example 1.10, the Fisher information is $I_n(\theta) = n/\sigma^2$. Thus, the weighted posterior distribution of $\theta$ can be found as follows:
\[
\tilde{f}\big(\theta\,|\,X_1,\dots,X_n\big)
= C\,I_n(\theta)\exp\Big\{-\frac{\sum(X_i-\theta)^2}{2\sigma^2} - \frac{(\theta-\mu)^2}{2\sigma_\theta^2}\Big\}
\]
\[
= C\,\frac{n}{\sigma^2}\exp\Big\{-\Big(\frac{\sum X_i^2}{2\sigma^2} - \frac{2\theta\sum X_i}{2\sigma^2} + \frac{n\theta^2}{2\sigma^2}
+ \frac{\theta^2}{2\sigma_\theta^2} - \frac{2\theta\mu}{2\sigma_\theta^2} + \frac{\mu^2}{2\sigma_\theta^2}\Big)\Big\}
\]
\[
= C_1\exp\Big\{-\frac{1}{2}\Big[\theta^2\Big(\frac{n}{\sigma^2}+\frac{1}{\sigma_\theta^2}\Big)
- 2\theta\Big(\frac{n\bar{X}_n}{\sigma^2}+\frac{\mu}{\sigma_\theta^2}\Big)\Big]\Big\}
\]
\[
= C_2\exp\Big\{-\frac{1}{2}\Big(\frac{n}{\sigma^2}+\frac{1}{\sigma_\theta^2}\Big)
\Big(\theta - \big(n\sigma_\theta^2\bar{X}_n+\mu\sigma^2\big)\big/\big(n\sigma_\theta^2+\sigma^2\big)\Big)^2\Big\}.
\]
Here $C$, $C_1$, and $C_2$ are the appropriate normalizing constants. Thus, the weighted posterior mean is $\big(n\sigma_\theta^2\bar{X}_n+\mu\sigma^2\big)\big/\big(n\sigma_\theta^2+\sigma^2\big)$ and the variance is $\big(n/\sigma^2+1/\sigma_\theta^2\big)^{-1} = \sigma^2\sigma_\theta^2\big/\big(n\sigma_\theta^2+\sigma^2\big)$.
Exercise 2.11 First, we derive the Fisher information for the exponential model. We have
\[
\ln p(x,\theta) = \ln\theta - \theta x\,, \qquad
\frac{\partial \ln p(x,\theta)}{\partial\theta} = \frac{1}{\theta} - x\,,
\]
and
\[
\frac{\partial^2 \ln p(x,\theta)}{\partial\theta^2} = -\,\frac{1}{\theta^2}\,.
\]
Consequently,
\[
I_n(\theta) = -\,n\,\mathbb{E}_\theta\Big[-\frac{1}{\theta^2}\Big] = \frac{n}{\theta^2}\,.
\]
Further, the joint distribution of the sample is
\[
p(X_1,\dots,X_n,\theta) = C_n\,\theta^{\,n} e^{-\theta\sum X_i}
\]
with the normalizing constant $C_n = C_n(X_1,\dots,X_n)$ independent of $\theta$. As a function of $\theta$, this joint probability belongs to the family of gamma distributions, hence, if we choose the conjugate prior to be a gamma distribution, $\pi(\theta) = C(\alpha,\beta)\,\theta^{\alpha-1}e^{-\beta\theta}$, $\theta > 0$, with some $\alpha > 0$ and $\beta > 0$, then the weighted posterior is also a gamma,
\[
\tilde{f}(\theta\,|\,X_1,\dots,X_n) = I_n(\theta)\,C_n\,\theta^{\,n} e^{-\theta\sum X_i}\,C(\alpha,\beta)\,\theta^{\alpha-1}e^{-\beta\theta}
= \tilde{C}_n\,\theta^{\,n+\alpha-3}\,e^{-(\sum X_i+\beta)\theta}
\]
where $\tilde{C}_n = n\,C_n(X_1,\dots,X_n)\,C(\alpha,\beta)$ is the normalizing constant. The corresponding weighted posterior mean of the gamma distribution is equal to
\[
\int_0^\infty \theta\,\tilde{f}(\theta\,|\,X_1,\dots,X_n)\,d\theta = \frac{n+\alpha-2}{\sum X_i+\beta}\,.
\]
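As in Exercise 2.9, a grid computation (an illustrative sketch with assumed sample values and prior parameters) can be used to confirm the weighted posterior mean $(n+\alpha-2)\big/\big(\sum X_i+\beta\big)$.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(3)
alpha, beta, n, theta_true = 3.0, 2.0, 30, 1.5
X = rng.exponential(1.0 / theta_true, size=n)          # density theta * exp(-theta x)

theta = np.linspace(1e-4, 10, 20_000)                  # uniform grid over theta
log_lik = n * np.log(theta) - theta * X.sum()          # log of theta^n exp(-theta sum X_i)
weighted_post = (n / theta**2) * np.exp(log_lik) * gamma.pdf(theta, a=alpha, scale=1.0 / beta)

grid_mean = (theta * weighted_post).sum() / weighted_post.sum()
print(grid_mean, (n + alpha - 2) / (X.sum() + beta))   # the two values should agree
```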
Exercise 2.12 (i) The joint density of $n$ independent Bernoulli($\theta$) observations $X_1,\dots,X_n$ is
\[
p(X_1,\dots,X_n,\theta) = \theta^{\sum X_i}(1-\theta)^{\,n-\sum X_i}.
\]
Using the conjugate prior $\pi(\theta) = C\,\big[\theta(1-\theta)\big]^{\sqrt{n}/2-1}$, we obtain the non-weighted posterior density
\[
f(\theta\,|\,X_1,\dots,X_n) = C\,\theta^{\sum X_i+\sqrt{n}/2-1}(1-\theta)^{\,n-\sum X_i+\sqrt{n}/2-1},
\]
which is a beta density with the mean
\[
\theta_n^* = \frac{\sum X_i+\sqrt{n}/2}{\big(\sum X_i+\sqrt{n}/2\big)+\big(n-\sum X_i+\sqrt{n}/2\big)}
= \frac{\sum X_i+\sqrt{n}/2}{n+\sqrt{n}}\,.
\]
(ii) The variance of $\theta_n^*$ is
\[
\mathrm{Var}_\theta[\theta_n^*] = \frac{n\,\mathrm{Var}_\theta(X_1)}{(n+\sqrt{n})^2} = \frac{n\theta(1-\theta)}{(n+\sqrt{n})^2}\,,
\]
and the bias equals
\[
b_n(\theta,\theta_n^*) = \mathbb{E}_\theta[\theta_n^*] - \theta = \frac{n\theta+\sqrt{n}/2}{n+\sqrt{n}} - \theta
= \frac{\sqrt{n}/2-\sqrt{n}\,\theta}{n+\sqrt{n}}\,.
\]
Consequently, the non-normalized quadratic risk of $\theta_n^*$ is
\[
\mathbb{E}_\theta\big[(\theta_n^*-\theta)^2\big] = \mathrm{Var}_\theta[\theta_n^*] + b_n^2(\theta,\theta_n^*)
= \frac{n\theta(1-\theta)+\big(\sqrt{n}/2-\sqrt{n}\,\theta\big)^2}{(n+\sqrt{n})^2}
= \frac{n/4}{(n+\sqrt{n})^2} = \frac{1}{4(1+\sqrt{n})^2}\,.
\]
(iii) Let $t_n = t_n(X_1,\dots,X_n)$ be the Bayes estimator with respect to a non-normalized risk function
\[
R_n(\theta,\hat{\theta}_n,w) = \mathbb{E}_\theta\big[w(\hat{\theta}_n-\theta)\big].
\]
The statement and the proof of Theorem 2.5 remain exactly the same if the non-normalized risk and the corresponding Bayes estimator are used. Since $\theta_n^*$ is the Bayes estimator for a constant non-normalized risk, it is minimax.
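The fact that this quadratic risk is the same for every $\theta$, which is what makes the constant-risk Bayes argument work, is easy to confirm numerically; the values below are assumptions used only for illustration.

```python
import numpy as np

n = 100
constant_risk = 1.0 / (4 * (1 + np.sqrt(n)) ** 2)

for theta in [0.1, 0.3, 0.5, 0.9]:
    var = n * theta * (1 - theta) / (n + np.sqrt(n)) ** 2
    bias = (np.sqrt(n) / 2 - np.sqrt(n) * theta) / (n + np.sqrt(n))
    print(theta, var + bias ** 2, constant_risk)   # the risk does not depend on theta
```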
Exercise 2.13 In Example 2.4, let $\alpha = \beta = 1 + 1/b$. Then the Bayes estimator assumes the form
\[
t_n(b) = \frac{\sum X_i+1/b}{n+2/b}
\]
where the $X_i$'s are independent Bernoulli($\theta$) random variables. The normalized quadratic risk of $t_n(b)$ is equal to
\[
R_n\big(\theta,t_n(b),w\big) = \mathbb{E}_\theta\Big[\Big(\sqrt{I_n(\theta)}\,\big(t_n(b)-\theta\big)\Big)^2\Big]
= I_n(\theta)\Big[\mathrm{Var}_\theta\big[t_n(b)\big] + b_n^2\big(\theta,t_n(b)\big)\Big]
\]
\[
= I_n(\theta)\Big[\frac{n\,\mathrm{Var}_\theta[X_1]}{(n+2/b)^2}
+ \Big(\frac{n\,\mathbb{E}_\theta[X_1]+1/b}{n+2/b}-\theta\Big)^2\Big]
= \frac{n}{\theta(1-\theta)}\Big[\frac{n\theta(1-\theta)}{(n+2/b)^2}
+ \Big(\frac{n\theta+1/b}{n+2/b}-\theta\Big)^2\Big]
\]
\[
= \frac{n}{\theta(1-\theta)}\Big[\frac{n\theta(1-\theta)}{(n+2/b)^2}
+ \underbrace{\frac{(1-2\theta)^2}{b^2(n+2/b)^2}}_{\to\,0}\Big]
\;\to\; \frac{n}{\theta(1-\theta)}\,\frac{n\theta(1-\theta)}{n^2} = 1 \quad\text{as } b\to\infty.
\]
Thus, by Theorem 2.8, the minimax lower bound is equal to 1. The normalized quadratic risk of $\bar{X}_n = \lim_{b\to\infty}t_n(b)$ is derived as
\[
R_n\big(\theta,\bar{X}_n,w\big) = \mathbb{E}_\theta\Big[\Big(\sqrt{I_n(\theta)}\,\big(\bar{X}_n-\theta\big)\Big)^2\Big]
= I_n(\theta)\,\mathrm{Var}_\theta\big[\bar{X}_n\big]
= \frac{n}{\theta(1-\theta)}\,\frac{\theta(1-\theta)}{n} = 1.
\]
That is, it attains the minimax lower bound, and hence $\bar{X}_n$ is minimax.
Chapter 3
Exercise 3.14 Let $X \sim \text{Binomial}(n,\theta^2)$. Then
\[
\mathbb{E}_\theta\Big[\big|\sqrt{X/n}-\theta\big|\Big]
= \mathbb{E}_\theta\Big[\frac{\big|X/n-\theta^2\big|}{\sqrt{X/n}+\theta}\Big]
\le \frac{1}{\theta}\,\mathbb{E}_\theta\Big[\big|X/n-\theta^2\big|\Big]
\le \frac{1}{\theta}\sqrt{\mathbb{E}_\theta\Big[\big(X/n-\theta^2\big)^2\Big]}
\]
(by the Cauchy-Schwarz inequality)
\[
= \frac{1}{\theta}\sqrt{\frac{\theta^2(1-\theta^2)}{n}}
= \sqrt{\frac{1-\theta^2}{n}} \;\to\; 0 \quad\text{as } n\to\infty\,.
\]
Exercise 3.15 First we show that the Hodges estimator $\hat{\theta}_n$ is asymptotically unbiased. To this end write
\[
\mathbb{E}_\theta\big[\hat{\theta}_n-\theta\big]
= \mathbb{E}_\theta\big[\hat{\theta}_n-\bar{X}_n+\bar{X}_n-\theta\big]
= \mathbb{E}_\theta\big[\hat{\theta}_n-\bar{X}_n\big]
= \mathbb{E}_\theta\big[-\bar{X}_n\,\mathbb{I}\big(|\bar{X}_n|<n^{-1/4}\big)\big],
\]
which does not exceed $n^{-1/4}$ in absolute value and hence tends to $0$ as $n\to\infty$.
Next consider the case $\theta \ne 0$. We will check that
\[
\lim_{n\to\infty}\mathbb{E}_\theta\big[n\,(\hat{\theta}_n-\theta)^2\big] = 1.
\]
Firstly, we show that
\[
\mathbb{E}_\theta\big[n\,(\hat{\theta}_n-\bar{X}_n)^2\big] \to 0 \quad\text{as } n\to\infty.
\]
Indeed,
\[
\mathbb{E}_\theta\big[n\,(\hat{\theta}_n-\bar{X}_n)^2\big]
= n\,\mathbb{E}_\theta\big[(-\bar{X}_n)^2\,\mathbb{I}\big(|\bar{X}_n|<n^{-1/4}\big)\big]
\le n^{1/2}\,\mathbb{P}_\theta\big(|\bar{X}_n|<n^{-1/4}\big)
= n^{1/2}\int_{-n^{1/4}-\theta n^{1/2}}^{\,n^{1/4}-\theta n^{1/2}}\frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}\,dz
= n^{1/2}\int_{-n^{1/4}}^{\,n^{1/4}}\frac{1}{\sqrt{2\pi}}\,e^{-(u-\theta n^{1/2})^2/2}\,du.
\]
Here we made the substitution $u = z + \theta n^{1/2}$. Now, since $|u| \le n^{1/4}$, the exponent can be bounded from above as follows:
\[
-\big(u-\theta n^{1/2}\big)^2/2 = -\,u^2/2 + u\theta n^{1/2} - \theta^2 n/2
\le -\,u^2/2 + |\theta|\,n^{3/4} - \theta^2 n/2,
\]
and, thus, for all sufficiently large $n$, the above integral admits the upper bound
\[
n^{1/2}\int_{-n^{1/4}}^{\,n^{1/4}}\frac{1}{\sqrt{2\pi}}\,e^{-(u-\theta n^{1/2})^2/2}\,du
\le n^{1/2}\int_{-n^{1/4}}^{\,n^{1/4}}\frac{1}{\sqrt{2\pi}}\,e^{-u^2/2+|\theta|\, n^{3/4}-\theta^2 n/2}\,du
\le e^{-\theta^2 n/4}\int_{-n^{1/4}}^{\,n^{1/4}}\frac{1}{\sqrt{2\pi}}\,e^{-u^2/2}\,du \;\to\; 0 \quad\text{as } n\to\infty.
\]
Further, we use the Cauchy-Schwarz inequality to write
\[
\mathbb{E}_\theta\big[n\,(\hat{\theta}_n-\theta)^2\big]
= \mathbb{E}_\theta\big[n\,(\hat{\theta}_n-\bar{X}_n+\bar{X}_n-\theta)^2\big]
= \mathbb{E}_\theta\big[n\,(\hat{\theta}_n-\bar{X}_n)^2\big]
+ 2\,\mathbb{E}_\theta\big[n\,(\hat{\theta}_n-\bar{X}_n)(\bar{X}_n-\theta)\big]
+ \mathbb{E}_\theta\big[n\,(\bar{X}_n-\theta)^2\big]
\]
\[
\le \underbrace{\mathbb{E}_\theta\big[n\,(\hat{\theta}_n-\bar{X}_n)^2\big]}_{\to\,0}
+ 2\,\underbrace{\Big\{\mathbb{E}_\theta\big[n\,(\hat{\theta}_n-\bar{X}_n)^2\big]\Big\}^{1/2}}_{\to\,0}
\,\underbrace{\Big\{\mathbb{E}_\theta\big[n\,(\bar{X}_n-\theta)^2\big]\Big\}^{1/2}}_{=\,1}
+ \underbrace{\mathbb{E}_\theta\big[n\,(\bar{X}_n-\theta)^2\big]}_{=\,1}
\;\to\; 1 \quad\text{as } n\to\infty\,.
\]
The matching lower bound follows in the same way, since the cross term is bounded from below by the negative of the same quantity, so the limit is indeed equal to 1.
Consider now the case $\theta = 0$. We will verify that
\[
\lim_{n\to\infty}\mathbb{E}_\theta\big[n\,\hat{\theta}_n^2\big] = 0\,.
\]
We have
\[
\mathbb{E}_\theta\big[n\,\hat{\theta}_n^2\big]
= \mathbb{E}_\theta\big[n\,\bar{X}_n^2\,\mathbb{I}\big(|\bar{X}_n|\ge n^{-1/4}\big)\big]
= \mathbb{E}_\theta\big[\big(\sqrt{n}\,\bar{X}_n\big)^2\,\mathbb{I}\big(|\sqrt{n}\,\bar{X}_n|\ge n^{1/4}\big)\big]
= 2\int_{n^{1/4}}^{\infty} z^2\,\frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}\,dz
\le 2\int_{n^{1/4}}^{\infty} e^{-z}\,dz = 2\,e^{-n^{1/4}} \;\to\; 0 \quad\text{as } n\to\infty,
\]
where the inequality $z^2 e^{-z^2/2}/\sqrt{2\pi} \le e^{-z}$ holds for all $z \ge n^{1/4}$ once $n$ is sufficiently large.
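A Monte Carlo sketch illustrates both limits (the sample size and the parameter values are assumptions; since $X_i \sim \mathcal{N}(\theta,1)$, the mean $\bar{X}_n \sim \mathcal{N}(\theta,1/n)$ is simulated directly): for $\theta \ne 0$ the normalized risk $n\,\mathbb{E}_\theta[(\hat{\theta}_n-\theta)^2]$ is close to 1, while at $\theta = 0$ it is close to 0.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 10_000, 200_000

def hodges_normalized_risk(theta):
    Xbar = rng.normal(theta, 1.0 / np.sqrt(n), size=reps)              # Xbar_n ~ N(theta, 1/n)
    theta_hat = np.where(np.abs(Xbar) >= n ** (-0.25), Xbar, 0.0)      # Hodges estimator
    return n * ((theta_hat - theta) ** 2).mean()

print(hodges_normalized_risk(0.5))   # close to 1, the risk of Xbar_n itself
print(hodges_normalized_risk(0.0))   # close to 0: "superefficiency" at theta = 0
```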
Exercise 3.16 The following lower bound holds:
\[
\sup_{\theta\in\mathbb{R}} \mathbb{E}_\theta\big[I_n(\theta)\,(\hat{\theta}_n-\theta)^2\big]
\ge n\,I_*\max_{\theta\in\{\theta_0,\theta_1\}} \mathbb{E}_\theta\big[(\hat{\theta}_n-\theta)^2\big]
\ge \frac{n\,I_*}{2}\Big\{\mathbb{E}_{\theta_0}\big[(\hat{\theta}_n-\theta_0)^2\big]
+ \mathbb{E}_{\theta_1}\big[(\hat{\theta}_n-\theta_1)^2\big]\Big\}
\]
\[
= \frac{n\,I_*}{2}\,\mathbb{E}_{\theta_0}\Big[(\hat{\theta}_n-\theta_0)^2
+ (\hat{\theta}_n-\theta_1)^2\exp\big\{\Delta L_n(\theta_0,\theta_1)\big\}\Big]
\quad\text{(by (3.8))}
\]