Linear Algebra and Optimization for Machine Learning
1st Edition by Charu Aggarwal. Chapters 1–11
Contents

1  Linear Algebra and Optimization: An Introduction
2  Linear Transformations and Linear Systems
3  Diagonalizable Matrices and Eigenvectors
4  Optimization Basics: A Machine Learning View
5  Optimization Challenges and Advanced Solutions
6  Lagrangian Relaxation and Duality
7  Singular Value Decomposition
8  Matrix Factorization
9  The Linear Algebra of Similarity
10 The Linear Algebra of Graphs
11 Optimization in Computational Graphs
Chapter 1

Linear Algebra and Optimization: An Introduction
1. For any two vectors x and y, which are each of length a, show that (i) x − y is orthogonal to x + y, and (ii) the dot product of x − 3y and x + 3y is negative.

(i) Using the distributive property of the dot product, (x − y) · (x + y) = x · x − y · y. The dot product of a vector with itself is its squared length. Since both vectors are of the same length a, the result is a^2 − a^2 = 0, so x − y is orthogonal to x + y. (ii) In the second case, one can use a similar argument to show that (x − 3y) · (x + 3y) = x · x − 9 y · y = a^2 − 9a^2 = −8a^2, which is negative whenever a > 0.
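Both identities are easy to verify numerically. Below is a minimal sketch using NumPy; the dimension 5 and the common length a = 3 are arbitrary illustrative choices, not values from the text.

import numpy as np

rng = np.random.default_rng(0)

# Draw two random vectors and rescale both to the same length a = 3.
a = 3.0
x = rng.standard_normal(5)
y = rng.standard_normal(5)
x *= a / np.linalg.norm(x)
y *= a / np.linalg.norm(y)

# (i) (x - y) . (x + y) = a^2 - a^2 = 0 (up to floating-point error).
print(np.dot(x - y, x + y))       # ~0.0

# (ii) (x - 3y) . (x + 3y) = a^2 - 9a^2 = -8a^2 = -72 here.
print(np.dot(x - 3*y, x + 3*y))   # ~-72.0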
2. Consider a situation in which you have three matrices A, B, and C, of sizes 10 × 2, 2 × 10, and 10 × 10, respectively.

(a) Suppose you had to compute the matrix product ABC. From an efficiency perspective, would it computationally make more sense to compute (AB)C or would it make more sense to compute A(BC)?

(b) If you had to compute the matrix product CAB, would it make more sense to compute (CA)B or C(AB)?

The main point is to keep the size of the intermediate matrix as small as possible in order to reduce both computational and space requirements. In the case of ABC, it makes sense to compute BC first: the intermediate matrix BC is only 2 × 10, and A(BC) requires 200 + 200 = 400 scalar multiplications, whereas (AB)C creates a 10 × 10 intermediate and requires 200 + 1000 = 1200. In the case of CAB, it makes sense to compute CA first, since CA is a small 10 × 2 intermediate; the counts are again 400 for (CA)B versus 1200 for C(AB). This type of associativity property is used frequently in machine learning in order to reduce computational requirements.
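The counts above follow from the standard cost model in which an m × k times k × n product costs m · k · n scalar multiplications. A minimal sketch tallying the cost of each parenthesization (the helper matmul_cost is a hypothetical name introduced here for illustration):

def matmul_cost(m, k, n):
    """Scalar multiplications needed for an (m x k) @ (k x n) product."""
    return m * k * n

# A: 10 x 2, B: 2 x 10, C: 10 x 10
cost_AB_first = matmul_cost(10, 2, 10) + matmul_cost(10, 10, 10)  # 200 + 1000 = 1200
cost_BC_first = matmul_cost(2, 10, 10) + matmul_cost(10, 2, 10)   # 200 + 200  = 400

cost_CA_first = matmul_cost(10, 10, 2) + matmul_cost(10, 2, 10)   # 200 + 200  = 400
cost_AB_last  = matmul_cost(10, 2, 10) + matmul_cost(10, 10, 10)  # 200 + 1000 = 1200

print(cost_AB_first, cost_BC_first)  # 1200 400 -> prefer A(BC)
print(cost_CA_first, cost_AB_last)   # 400 1200 -> prefer (CA)B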
3. Show that if a matrix A satisfies A = −A^T, then all the diagonal elements of the matrix are 0.

Note that A + A^T = 0. However, this matrix contains twice the diagonal elements of A on its diagonal. Therefore, the diagonal elements of A must be 0.
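This is easy to confirm numerically, since subtracting a matrix's transpose from itself always produces a matrix satisfying A = −A^T. A minimal sketch (the random 4 × 4 matrix M is an arbitrary choice):

import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))

A = M - M.T                  # satisfies A = -A^T by construction
print(np.allclose(A, -A.T))  # True
print(np.diag(A))            # [0. 0. 0. 0.]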
4. Show that if we have a matrix satisfying A = −A^T, then for any column vector x, we have x^T A x = 0.

Note that the transpose of the scalar x^T A x remains unchanged. Therefore, we have x^T A x = (x^T A x)^T = x^T A^T x = −x^T A x. Therefore, we have 2 x^T A x = 0, and hence x^T A x = 0.
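The identity can be spot-checked numerically in the same way as the previous exercise. A minimal sketch (the matrix size and test vector are arbitrary):

import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((5, 5))
A = M - M.T                 # A = -A^T

x = rng.standard_normal(5)
print(x @ A @ x)            # ~0.0 up to floating-point error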