CS229T/STATS231 (Fall 2018–2019)
Note: please do not copy or distribute.
Due date: 10/03/2018, 11pm
This is a diagnostic homework and will not count towards your grade (but the bonus points do count).
It should give you an idea of the types of concepts and skills required for the course, and also give you an
opportunity to practice some things in case you’re rusty. It also will allow you to see how we grade.
1. Linear algebra (0 points)
a. (dual norm of L1 norm) The L1 norm $\|\cdot\|_1$ of a vector $v \in \mathbb{R}^n$ is defined as
\[ \|v\|_1 = \sum_{i=1}^{n} |v_i|. \tag{1} \]
The dual norm $\|\cdot\|_*$ of a norm $\|\cdot\|$ is defined as
\[ \|v\|_* = \sup_{\|w\| \le 1} (v \cdot w). \tag{2} \]
Compute the dual norm of the L1 norm. (Here $v \cdot w$ denotes the inner product between $v$ and $w$: $v \cdot w \triangleq \sum_{i=1}^{n} v_i w_i$.)
Solution:
We will prove that
\[ \sup_{\|w\|_1 \le 1} (v \cdot w) = \max_{i \in [n]} |v_i| = \|v\|_\infty, \tag{3} \]
which implies that the dual norm of the L1 norm is the L∞ norm.
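Towards proving (3), we first observe that
\begin{align}
v \cdot w = \sum_{i=1}^{n} v_i w_i &\le \sum_{i=1}^{n} |v_i| \cdot |w_i| \tag{4} \\
&\le \|v\|_\infty \cdot \sum_{i=1}^{n} |w_i| \tag{5} \\
&= \|v\|_\infty \|w\|_1. \tag{6}
\end{align}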
Therefore,
\[ \sup_{\|w\|_1 \le 1} (v \cdot w) \le \|v\|_\infty. \tag{8} \]
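We argue that equality can be attained: let $i^\star = \arg\max_i |v_i|$, so that $|v_{i^\star}| = \|v\|_\infty$. Setting $w = \operatorname{sign}(v_{i^\star})\, e_{i^\star}$ (where $e_i$ denotes the vector with 1 on the i-th coordinate and 0 elsewhere) gives $\|w\|_1 \le 1$ and $v \cdot w = |v_{i^\star}| = \|v\|_\infty$. The sign factor is needed because the largest-magnitude entry of $v$ may be negative. This completes the proof of equation (3).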
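As a numerical sanity check of equation (3), the following minimal sketch builds a maximizing vector for a fixed v and compares it against random feasible vectors. The helper names `l1_dual_norm_witness` and `random_l1_unit_vector`, and the example vector, are illustrative choices and not part of the assignment. The witness carries a sign factor so that it also covers the case where the largest-magnitude entry of v is negative.

```python
import random


def l1_dual_norm_witness(v):
    """Return (v . w, w) for w = sign(v[i]) * e_i with i = argmax_i |v[i]|."""
    i_star = max(range(len(v)), key=lambda i: abs(v[i]))
    w = [0.0] * len(v)
    w[i_star] = 1.0 if v[i_star] >= 0 else -1.0
    value = sum(vi * wi for vi, wi in zip(v, w))
    return value, w


def random_l1_unit_vector(n, rng):
    """Draw a random vector and rescale it so that its L1 norm equals 1."""
    w = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    scale = sum(abs(x) for x in w)
    return [x / scale for x in w]


rng = random.Random(0)
v = [0.5, -3.0, 2.0, 1.25]          # largest magnitude sits on a negative entry
linf = max(abs(x) for x in v)       # ||v||_inf
value, w = l1_dual_norm_witness(v)  # inner product attained by the witness

# No random vector with ||w||_1 = 1 should exceed the witness value.
best_random = max(
    sum(vi * wi for vi, wi in zip(v, random_l1_unit_vector(len(v), rng)))
    for _ in range(1000)
)
print(value, linf, best_random)
```

Random sampling only illustrates the upper bound (8); it is the explicit witness that shows the supremum is attained.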
Remarks: dual norms are useful for bounding inner products: $u \cdot v \le \|u\| \|v\|_*$, which follows directly from the definition of the dual norm. This is a generalization of the Cauchy-Schwarz inequality (which is the case of the L2 norm, whose dual is itself).
In general, the Lp norm and the Lq norm are dual when $1/p + 1/q = 1$ with $p, q \in [1, \infty]$; the resulting bound on the inner product is Hölder's inequality.
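The bound $u \cdot v \le \|u\|_p \|v\|_q$ for conjugate exponents can be spot-checked numerically. The sketch below is only an illustration on random Gaussian vectors; the helper names `lp_norm` and `conjugate_exponent`, the chosen exponents, and the tolerance are assumptions made for this example.

```python
import random


def lp_norm(x, p):
    """L_p norm for 1 <= p < inf; pass float('inf') for the max norm."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def conjugate_exponent(p):
    """Return q such that 1/p + 1/q = 1, with the convention 1/inf = 0."""
    if p == 1:
        return float("inf")
    if p == float("inf"):
        return 1.0
    return p / (p - 1.0)


rng = random.Random(1)
violations = 0
for p in (1, 1.5, 2, 3, float("inf")):
    q = conjugate_exponent(p)
    for _ in range(200):
        u = [rng.gauss(0, 1) for _ in range(5)]
        v = [rng.gauss(0, 1) for _ in range(5)]
        inner = sum(a * b for a, b in zip(u, v))
        # Small slack absorbs floating-point rounding.
        if inner > lp_norm(u, p) * lp_norm(v, q) + 1e-9:
            violations += 1
print("violations:", violations)
```

With p = 2 the check reduces to the Cauchy-Schwarz inequality, and with p = 1 it reduces to the bound (4)-(6) used in part (a).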
b. (trace is sum of singular values) The nuclear norm of a matrix $A \in \mathbb{R}^{n \times n}$ is defined as $\sum_{i=1}^{n} |\sigma_i(A)|$, where $\sigma_1(A), \dots, \sigma_n(A)$ are the singular values of $A$. Show that the nuclear norm of a symmetric positive semi-definite matrix $A$ is equal to its trace ($\operatorname{tr}(A) = \sum_{i=1}^{n} A_{ii}$). (For this reason, the nuclear norm is sometimes called the trace norm.) (Hint: use the fact that $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.)
Solution:
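As $A$ is symmetric PSD, its SVD has the form $A = U S U^\top$, where $U$ is orthogonal and $S$ is diagonal with the singular values $\sigma_i(A)$ on its diagonal. Using the trace rotation trick $\operatorname{tr}(AB) = \operatorname{tr}(BA)$,
\[ \operatorname{tr}(A) = \operatorname{tr}(U S U^\top) = \operatorname{tr}(U^\top U S) = \operatorname{tr}(I S) = \sum_i \sigma_i(A) = \sum_i |\sigma_i(A)|. \tag{9} \]
The last equality uses the fact that singular values are non-negative.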
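The identity in (9) can be checked numerically. The following is a minimal sketch assuming NumPy is available; the PSD matrix is generated as $B B^\top$ from a random $B$, which is an illustrative construction and not part of the assignment.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 3))
A = B @ B.T  # symmetric PSD by construction, rank at most 3

# Nuclear norm as the sum of singular values.
singular_values = np.linalg.svd(A, compute_uv=False)
nuclear_norm = singular_values.sum()
trace = np.trace(A)

print(nuclear_norm, trace)
```

Because this A is rank-deficient, some singular values are (numerically) zero, and the identity still holds.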
c. (3 bonus points) (trace is bounded by nuclear norm) Show that the trace of a square matrix $A \in \mathbb{R}^{n \times n}$ is always less than or equal to its nuclear norm.
Solution:
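Suppose the SVD of $A$ is $A = U \Sigma V^\top$. Using the trace rotation trick,
\[ \operatorname{tr}(A) = \operatorname{tr}(U \Sigma V^\top) = \operatorname{tr}(V^\top U \Sigma). \tag{10} \]
Let $R = V^\top U$. Let $U_i$ and $V_i$ denote the i-th columns of $U$ and $V$ respectively, so that $R_{ii} = \langle V_i, U_i \rangle$. Since $U_i$ and $V_i$ are unit vectors by the property of the SVD, the Cauchy-Schwarz inequality gives $|R_{ii}| = |\langle U_i, V_i \rangle| \le 1$. Therefore,
\begin{align*}
\operatorname{tr}(A) = \operatorname{tr}(R \Sigma) &= \sum_{i=1}^{n} R_{ii} \Sigma_{ii} && \text{(because $\Sigma$ is a diagonal matrix)} \\
&\le \sum_{i=1}^{n} |\Sigma_{ii}| && \text{(because $|R_{ii}| \le 1$)} \\
&= \|A\|_*.
\end{align*}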
Remark: The equality is achieved when the left and right singular subspaces are aligned ($U_i = V_i$), which is exactly the case in part (b).
SVD is generally a very powerful tool to deal with various linear algebraic quantities.
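The argument in part (c) can be traced numerically. This is a minimal sketch assuming NumPy; the random Gaussian test matrices and the variable names (`gaps`, `trace_via_R`) are illustrative choices. It checks the inequality on many matrices and, for the last one, forms $R = V^\top U$ explicitly to confirm the identity $\operatorname{tr}(A) = \sum_i R_{ii} \Sigma_{ii}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Gap between nuclear norm and trace over random square matrices.
gaps = []
for _ in range(100):
    A = rng.standard_normal((4, 4))
    nuclear_norm = np.linalg.svd(A, compute_uv=False).sum()
    gaps.append(nuclear_norm - np.trace(A))

# For the last matrix, reproduce the steps of the proof.
U, s, Vt = np.linalg.svd(A)           # A = U @ diag(s) @ Vt
R = Vt @ U                            # R = V^T U
trace_via_R = np.sum(np.diag(R) * s)  # sum_i R_ii * Sigma_ii

print(min(gaps), trace_via_R, np.trace(A))
```

For generic non-symmetric matrices the gap is strictly positive, since the diagonal entries of R are typically well below 1 in magnitude.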