Edition
Solutions Manual
Dimitri P. Bertsekas
Massachusetts Institute of Technology
Athena Scientific, Belmont, Massachusetts
NOTE
This solutions manual is continuously updated and improved. Portions of the manual, involving
primarily theoretical exercises, have been posted on the internet at the book’s www page
http://www.athenasc.com/nonlinbook.html
Many thanks are due to several people who have contributed solutions, and particularly to Angelia
Nedic, Asuman Ozdaglar, and Cynara Wu.
Last Updated: May 2005
, Solutions Chapter 1
SECTION 1.1
1.1.9 www
For any x, y ∈ Rn, from the second order expansion (see Appendix A, Proposition A.23) we have
1
ƒ (y) − ƒ (x) = (y − x)′ ∇ ƒ (x) + (y − x)′ ∇ 2 ƒ (z)(y − x), (1)
2
where z is some point of the line segment joining x and y. Setting x = 0 in (1) and using the
given property of ƒ , it can be seen that ƒ is coercive. Therefore, there exists x∗ ∈ Rn such that ƒ
(x∗) = infx∈ Rn ƒ (x) (see Proposition A.8 in Appendix A). The condition
m||y||2 ≤ y′ ∇ 2 ƒ (x)y, ∀ x, y ∈ Rn,
is equivalent to strong convexity of ƒ . Strong convexity guarantees that there is a unique global
minimum x∗. By using the given property of ƒ and the expansion (1), we obtain
m M
(y − x)′ ∇ ƒ (x) + ||y − x||2 ≤ ƒ (y) − ƒ (x) ≤ (y − x)′ ∇ ƒ (x) + ||y − x||2.
2 2
Taking the minimum over y ∈ Rn in the expression above gives
³ ´ µ ¶
m M
min (y − ) (x) + ||y − x|| ≤ ƒ (x ) − ƒ (x) ≤ min (y − )
2 ∗ (x) + ||y − x||2 .
y∈ Rn 2 y∈ Rn 2
Note that for any a > 0
³ a ´ 1
min (y − x)′ ∇ ƒ (x) + ||y − x||2 = ƒ (x)||2,
∇ f (x)
and the minimum is attained for y = x − . Using this relation for a = m and a = M , we
obtain
1 1
− ||∇ ƒ (x)||2.
||∇ ƒ (x)||2 ≤ ƒ (x∗) − ƒ (x) ≤ −
2 2
The first chain of inequalities follows from here. To show the second relation, use the expansion
(1) at the point x = x∗, and note that ∇ ƒ (x∗) = 0, so that
1
ƒ (y) − ƒ (x∗) = (y − x∗)′ ∇ 2 ƒ (z)(y − x∗).
2
The rest follows immediately from here and the given property of the function ƒ .
,1.1.11 www
Since x∗ is a nonsingular strict local minimum, we have that ∇ 2ƒ (x∗) > 0. The function ƒ is
twice continuously differentiable over n, so that there exists a scalar δ > 0 such that
∇ 2ƒ (x) > 0, ∀ x, with ||x − x∗|| ≤ δ.
This means that the function ƒ is strictly convex over the open sphere B(x∗, δ) centered at x∗
with radius δ. Then according to Proposition 1.1.2, x∗ is the only stationary point of ƒ in the
sphere B(x∗, δ).
If ƒ is not twice continuously differentiable, then x∗ need not be an isolated stationary
point. The example function ƒ does not have the second derivative at x = 0. Note that ƒ (x) > 0
for x /= 0, and by definition ƒ (0) = 0. Hence, x∗ = 0 is the unique (singular) global minimum.
The first derivative of ƒ (x) for x /= 0 can be calculated as follows:
µ µ ¶ µ ¶¶
√ 5π √ √ 5π √
ƒ ′ (x) = 2x
2 − sin − + 3 cos −3 ln(x2) 3 ln(x2)
µ µ ¶ µ ¶¶
√ π 5π √ π 5π √
= 2x 2 − 2 cos sin − 3 ln(x ) + 2 sin cos
2 − 3 ln(x )
2
µ µ ¶¶
√ π 5π √
= 2x 2 + 2 sin − + 3 ln(x2)
√ √
= 2x 2 − 2 cos(2 3 ln x) .
(1−8k)π −(1+8k)π
√ √
Solving ƒ ′ (x) = 0, gives xk = e 8 3 and yk = e 8 3 for k integer. The second derivative
of ƒ (x), for x /= 0, is given by
³√ √ √ √ ´
ƒ ′ ′ (x) = 2 2 − 2 cos(2 3 ln x) + 4 3 sin(2 3 ln x) .
Thus: ³√
π √ π´
ƒ ′ ′ (xk) = 2
2 − 2 cos + 4 3 sin
à √ 4 √ ! 4
√ 2 √ 2
=2 2− 2 +4 3
2 2
√
= 4 6.
Similarly µ µ ¶ µ ¶¶
√ −π √ −π
ƒ (y ) = = 2 2 − 2 cos
′ ′ k + 4 3 sin
à √ 4 √ ! 4
√ 2 √ 2
=2 2− 2 −4 3
2 2
√
= − 4 6.
Hence, {xk | k ≥ 0} is a sequence of nonsingular local minima, which evidently converges to x∗,
while {yk | k ≥ 0} is a sequence of nonsingular local maxima converging to x∗.
, 1.1.12 www
(a) Let x∗ be a strict local minimum of ƒ . Then there is δ such that ƒ (x∗) < ƒ(x) for all x in
the closed sphere centered at x∗ with radius δ. Take any local sequence {xk} that minimizes ƒ ,
i.e. ||xk − x∗|| ≤ δ and limk→∞ ƒ (xk) = ƒ (x∗). Then there is a subsequence {xki } and the point
x such that xki → x and ||x − x∗|| ≤ δ. By continuity of ƒ , we have
ƒ (x) = lim ƒ (xki ) = ƒ (x∗).
i→∞
Since x∗ is a strict local minimum, it follows that x = x∗. This is true for any convergent
subsequence of {xk}, therefore {xk} converges to x∗, which means that x∗ is locally stable. Next we
will show that for a continuous function ƒ every locally stable local minimum must be strict.
Assume that this is not true, i.e., there is a local minimum x∗ which is locally stable but is not strict.
Then for any θ > 0 there is a point xθ /= x∗ such that
0 < ||xθ − x∗|| < θ and ƒ (xθ ) = ƒ (x∗). (1)
Since x∗ is a stable local minimum, there is a δ > 0 such that xk → x∗ for all {xk} with
lim ƒ (xk) = ƒ (x∗) and ||xk − x∗|| < δ. (2)
For θ = δ in (1), we can find a point x0 /= x∗ for which 0 < ||x0 − x∗|| < δ and ƒ (x0) = ƒ (x∗).
1 1
Then, for θ = ||x0 − x∗|| in (1), we can find a point x1 such that 0 < ||x1 − x∗|| < ||x0 − x∗||
1
and ƒ (x 1 ) = ƒ (x∗). Then, again, for θ = ||x1 − x∗|| in (1), we can find a point x2 such that 0
1
< ||x2 − x∗|| < ||x1 − x∗|| and ƒ (x2) = ƒ (x∗), and so on. In this way, we have constructed a
sequence {xk} of distinct points such that 0 < ||xk − x∗|| < δ, ƒ (x k ) = ƒ (x∗) for all k, and
limk→∞ xk = x∗. Now, consider the sequence {yk} defined by
y2m = xm, y2m+1 = x0, ∀ m ≥ 0.
Evidently, the sequence {yk} is contained in the sphere centered at x∗ with the radius δ. Also
we have that ƒ (yk) = ƒ (x∗), but {yk} does not converge to x∗. This contradicts the assumption
that x∗ is locally stable. Hence, x∗ must be strict local minimum.
(b) Since x∗ is a strict local minimum, we can find δ > 0, such that ƒ (x) > ƒ(x∗) for all x /= x∗
with ||x − x∗|| ≤ δ. Then min ∗ ƒ (x) = ƒ δ > ƒ(x∗). Let Gδ = max ∗ |g(x)|. Now,
||x− x ||=δ ||x− x ||≤δ
we have
ƒ (x) − ϵGδ ≤ ƒ (x) + ϵg(x) ≤ ƒ (x) + ϵGδ, ∀ ϵ > 0, ∀ x ||x − x∗|| < δ.