Linear Algebra and Optimization for Machine Learning
1st Edition by Charu Aggarwal. Chapters 1–11
Contents

1 Linear Algebra and Optimization: An Introduction
2 Linear Transformations and Linear Systems
3 Diagonalizable Matrices and Eigenvectors
4 Optimization Basics: A Machine Learning View
5 Optimization Challenges and Advanced Solutions
6 Lagrangian Relaxation and Duality
7 Singular Value Decomposition
8 Matrix Factorization
9 The Linear Algebra of Similarity
10 The Linear Algebra of Graphs
11 Optimization in Computational Graphs
Chapter 1

Linear Algebra and Optimization: An Introduction
1. For any two vectors x and y, which are each of length a, show that (i) x − y is orthogonal to x + y, and (ii) the dot product of x − 3y and x + 3y is negative.

(i) The dot product (x − y) · (x + y) simplifies to x · x − y · y using the distributive property of the dot product. The dot product of a vector with itself is its squared length. Since both vectors are of the same length, the result is a^2 − a^2 = 0, and the two vectors are orthogonal. (ii) In the second case, one can use a similar argument to show that the result is a^2 − 9a^2 = −8a^2, which is negative.
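Both identities are easy to verify numerically. Below is a small NumPy sketch; the particular vectors and the length a = 5 are arbitrary choices made for illustration:

```python
import numpy as np

# Two arbitrary vectors of the same length a = 5 (an assumed example).
x = np.array([3.0, 4.0, 0.0])   # length 5
y = np.array([0.0, 0.0, 5.0])   # length 5

# (i) (x - y) . (x + y) = |x|^2 - |y|^2, which is 0 when the lengths match.
print(np.dot(x - y, x + y))      # ~0

# (ii) (x - 3y) . (x + 3y) = a^2 - 9a^2 = -8a^2 < 0.
print(np.dot(x - 3 * y, x + 3 * y))
```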
2. Consider a situation in which you have three matrices A, B, and C, of sizes 10 × 2, 2 × 10, and 10 × 10, respectively.

(a) Suppose you had to compute the matrix product ABC. From an efficiency perspective, would it computationally make more sense to compute (AB)C or would it make more sense to compute A(BC)?
(b) If you had to compute the matrix product CAB, would it make more sense to compute (CA)B or C(AB)?

The main point is to keep the size of the intermediate matrix as small as possible in order to reduce both computational and space requirements. In the case of ABC, it makes sense to compute BC first, since BC is a small 2 × 10 matrix, whereas AB is 10 × 10. In the case of CAB, it makes sense to compute CA first, since CA is a small 10 × 2 matrix, whereas AB is 10 × 10. This type of associativity property is used frequently in machine learning in order to reduce computational requirements.
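The cost comparison can be made concrete by counting scalar multiplications, using the standard fact that a (p × q) times (q × r) product costs p·q·r multiplications. The sketch below tallies both parenthesizations for each case and also checks that associativity leaves the result unchanged:

```python
import numpy as np

# Shapes from the exercise: A is 10x2, B is 2x10, C is 10x10.
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 2))
B = rng.standard_normal((2, 10))
C = rng.standard_normal((10, 10))

def matmul_cost(p, q, r):
    """Scalar multiplications for a (p x q) times (q x r) product."""
    return p * q * r

# ABC: (AB)C versus A(BC)
cost_AB_C = matmul_cost(10, 2, 10) + matmul_cost(10, 10, 10)  # 200 + 1000
cost_A_BC = matmul_cost(2, 10, 10) + matmul_cost(10, 2, 10)   # 200 + 200

# CAB: (CA)B versus C(AB)
cost_CA_B = matmul_cost(10, 10, 2) + matmul_cost(10, 2, 10)   # 200 + 200
cost_C_AB = matmul_cost(2, 10, 10) + matmul_cost(10, 10, 10)  # 200 + 1000

print(cost_AB_C, cost_A_BC)   # 1200 vs 400: compute BC first
print(cost_CA_B, cost_C_AB)   # 400 vs 1200: compute CA first

# Both parenthesizations give the same product, by associativity.
assert np.allclose((A @ B) @ C, A @ (B @ C))
```

Either ordering produces the same matrix; only the work differs, by a factor of three in this example.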
3. Show that if a matrix A satisfies A = −A^T, then all the diagonal elements of the matrix are 0.

Note that A + A^T = 0. However, the diagonal of A + A^T contains twice the diagonal elements of A. Therefore, the diagonal elements of A must be 0.
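A quick numerical illustration: any matrix of the form M − M^T is skew-symmetric, and its diagonal is identically zero. The matrix M below is an arbitrary example:

```python
import numpy as np

# Build a skew-symmetric matrix (A = -A^T) from an arbitrary matrix M.
M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
A = M - M.T                    # M - M^T is always skew-symmetric

assert np.allclose(A, -A.T)    # A = -A^T holds
print(np.diag(A))              # the diagonal entries are all 0
```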
4. Show that if we have a matrix satisfying A = −A^T, then for any column vector x, we have x^T A x = 0.

Note that the transpose of the scalar x^T A x leaves it unchanged. Therefore, we have x^T A x = (x^T A x)^T = x^T A^T x = −x^T A x. Therefore, we have 2 x^T A x = 0, and hence x^T A x = 0.
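The vanishing of the quadratic form can also be checked numerically for a random skew-symmetric matrix and a random vector (both arbitrary choices in this sketch):

```python
import numpy as np

# For a skew-symmetric A (A = -A^T), the quadratic form x^T A x is always 0.
rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M - M.T                    # skew-symmetric by construction

x = rng.standard_normal(4)
q = x @ A @ x                  # the scalar x^T A x
print(q)                       # ~0, up to floating-point rounding
```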