Chapter 2

Linear Algebra

Linear algebra is a branch of mathematics that is widely used throughout science and engineering. However, because linear algebra is a form of continuous rather than discrete mathematics, many computer scientists have little experience with it. A good understanding of linear algebra is essential for understanding and working with many machine learning algorithms, especially deep learning algorithms. We therefore precede our introduction to deep learning with a focused presentation of the key linear algebra prerequisites.

If you are already familiar with linear algebra, feel free to skip this chapter. If you have previous experience with these concepts but need a detailed reference sheet to review key formulas, we recommend The Matrix Cookbook (Petersen and Pedersen, 2006). If you have no exposure at all to linear algebra, this chapter will teach you enough to read this book, but we highly recommend that you also consult another resource focused exclusively on teaching linear algebra, such as Shilov (1977). This chapter will completely omit many important linear algebra topics that are not essential for understanding deep learning.

2.1 Scalars, Vectors, Matrices and Tensors

The study of linear algebra involves several types of mathematical objects:

• Scalars: A scalar is just a single number, in contrast to most of the other objects studied in linear algebra, which are usually arrays of multiple numbers. We write scalars in italics. We usually give scalars lower-case variable names. When we introduce them, we specify what kind of number they are. For example, we might say "Let $s \in \mathbb{R}$ be the slope of the line," while defining a real-valued scalar, or "Let $n \in \mathbb{N}$ be the number of units," while defining a natural number scalar.

• Vectors: A vector is an array of numbers. The numbers are arranged in order. We can identify each individual number by its index in that ordering. Typically we give vectors lower-case names written in bold typeface, such as $\boldsymbol{x}$. The elements of the vector are identified by writing its name in italic typeface, with a subscript. The first element of $\boldsymbol{x}$ is $x_1$, the second element is $x_2$ and so on. We also need to say what kind of numbers are stored in the vector. If each element is in $\mathbb{R}$, and the vector has $n$ elements, then the vector lies in the set formed by taking the Cartesian product of $\mathbb{R}$ $n$ times, denoted as $\mathbb{R}^n$. When we need to explicitly identify the elements of a vector, we write them as a column enclosed in square brackets:

$$\boldsymbol{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}. \tag{2.1}$$

We can think of vectors as identifying points in space, with each element giving the coordinate along a different axis.

Sometimes we need to index a set of elements of a vector. In this case, we define a set containing the indices and write the set as a subscript. For example, to access $x_1$, $x_3$ and $x_6$, we define the set $S = \{1, 3, 6\}$ and write $\boldsymbol{x}_S$. We use the $-$ sign to index the complement of a set. For example $\boldsymbol{x}_{-1}$ is the vector containing all elements of $\boldsymbol{x}$ except for $x_1$, and $\boldsymbol{x}_{-S}$ is the vector containing all of the elements of $\boldsymbol{x}$ except for $x_1$, $x_3$ and $x_6$.

• Matrices: A matrix is a 2-D array of numbers, so each element is identified by two indices instead of just one. We usually give matrices upper-case variable names with bold typeface, such as $\boldsymbol{A}$. If a real-valued matrix $\boldsymbol{A}$ has a height of $m$ and a width of $n$, then we say that $\boldsymbol{A} \in \mathbb{R}^{m \times n}$. We usually identify the elements of a matrix using its name in italic but not bold font, and the indices are listed with separating commas. For example, $A_{1,1}$ is the upper left entry of $\boldsymbol{A}$ and $A_{m,n}$ is the bottom right entry. We can identify all of the numbers with vertical coordinate $i$ by writing a ":" for the horizontal coordinate. For example, $\boldsymbol{A}_{i,:}$ denotes the horizontal cross section of $\boldsymbol{A}$ with vertical coordinate $i$. This is known as the $i$-th row of $\boldsymbol{A}$. Likewise, $\boldsymbol{A}_{:,i}$ is the $i$-th column of $\boldsymbol{A}$. When we need to explicitly identify the elements of a matrix, we write them as an array enclosed in square brackets:

$$\begin{bmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{bmatrix}. \tag{2.2}$$

Sometimes we may need to index matrix-valued expressions that are not just a single letter. In this case, we use subscripts after the expression, but do not convert anything to lower case. For example, $f(\boldsymbol{A})_{i,j}$ gives element $(i, j)$ of the matrix computed by applying the function $f$ to $\boldsymbol{A}$.

• Tensors: In some cases we will need an array with more than two axes. In the general case, an array of numbers arranged on a regular grid with a variable number of axes is known as a tensor. We denote a tensor named "A" with this typeface: $\mathsf{A}$. We identify the element of $\mathsf{A}$ at coordinates $(i, j, k)$ by writing $\mathsf{A}_{i,j,k}$.

One important operation on matrices is the transpose. The transpose of a matrix is the mirror image of the matrix across a diagonal line, called the main diagonal, running down and to the right, starting from its upper left corner. See Fig. 2.1 for a graphical depiction of this operation. We denote the transpose of a matrix $\boldsymbol{A}$ as $\boldsymbol{A}^\top$, and it is defined such that

$$(\boldsymbol{A}^\top)_{i,j} = A_{j,i}. \tag{2.3}$$

$$\boldsymbol{A} = \begin{bmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \\ A_{3,1} & A_{3,2} \end{bmatrix} \;\Rightarrow\; \boldsymbol{A}^\top = \begin{bmatrix} A_{1,1} & A_{2,1} & A_{3,1} \\ A_{1,2} & A_{2,2} & A_{3,2} \end{bmatrix}$$

Figure 2.1: The transpose of the matrix can be thought of as a mirror image across the main diagonal.

Vectors can be thought of as matrices that contain only one column. The transpose of a vector is therefore a matrix with only one row. Sometimes we define a vector by writing out its elements in the text inline as a row matrix, then using the transpose operator to turn it into a standard column vector, e.g., $\boldsymbol{x} = [x_1, x_2, x_3]^\top$.

A scalar can be thought of as a matrix with only a single entry. From this, we can see that a scalar is its own transpose: $a = a^\top$.

We can add matrices to each other, as long as they have the same shape, just by adding their corresponding elements: $\boldsymbol{C} = \boldsymbol{A} + \boldsymbol{B}$ where $C_{i,j} = A_{i,j} + B_{i,j}$.

We can also add a scalar to a matrix or multiply a matrix by a scalar, just by performing that operation on each element of a matrix: $\boldsymbol{D} = a \cdot \boldsymbol{B} + c$ where $D_{i,j} = a \cdot B_{i,j} + c$.

In the context of deep learning, we also use some less conven