The Elements of Statistical Learning: Chapter 2 Exercise Solutions


oxstar@SJTU
January 3, 2011

Ex. 2.1  Suppose each of $K$ classes has an associated target $t_k$, which is a vector of all zeros, except a one in the $k$th position. Show that classifying to the largest element of $\hat y$ amounts to choosing the closest target, $\min_k \|t_k - \hat y\|$, if the elements of $\hat y$ sum to one.

Proof  Our goal is to prove $\arg\max_k (\hat y_k) = \arg\min_k \|t_k - \hat y\|$. In fact,

  $\arg\min_k \|t_k - \hat y\|$
  $= \arg\min_k \|t_k - \hat y\|^2$    // $\|t_k - \hat y\|$ is non-negative
  $= \arg\min_k \sum_{i=1}^K \big((t_k)_i - \hat y_i\big)^2$    // expanding
  $= \arg\min_k \sum_{i=1}^K \big((t_k)_i^2 - 2(t_k)_i \hat y_i + \hat y_i^2\big)$
  $= \arg\min_k \sum_{i=1}^K \big((t_k)_i^2 - 2(t_k)_i \hat y_i\big)$    // the term $\hat y_i^2$ does not depend on $k$
  $= \arg\min_k \Big(\sum_{i=1}^K (t_k)_i^2 - 2\sum_{i=1}^K (t_k)_i \hat y_i\Big)$
  $= \arg\min_k \Big(1 - 2\sum_{i=1}^K (t_k)_i \hat y_i\Big)$    // definition of $t_k$: $\sum_i (t_k)_i^2 = 1$
  $= \arg\max_k \sum_{i=1}^K (t_k)_i \hat y_i$
  $= \arg\max_k (\hat y_k)$    // $(t_k)_i = 1$ when $i = k$, else $(t_k)_i = 0$

Ex. 2.2  Show how to compute the Bayes decision boundary for the simulation example in Figure 2.5.

Answer  The book shows how to generate the training data. The model contains two classes, labeled BLUE and ORANGE respectively. Let $k \in \{\mathrm{BLUE}, \mathrm{ORANGE}\}$, and write $G_k$ for class $k$. Each class is centered via dummy coding:

  $\mu_{\mathrm{BLUE}} = (1, 0)^T$    (1)
  $\mu_{\mathrm{ORANGE}} = (0, 1)^T$    (2)

First, 10 means $(m_k)_1, \dots, (m_k)_{10}$ are generated independently from a bivariate Gaussian distribution $N(\mu_k, I)$, so

  $P(m_k \mid G_k) = \prod_{i=1}^{10} f\big((m_k)_i; \mu_k, I\big)$    (3)

where $f$ denotes the bivariate Gaussian density

  $f(x; \mu, \Sigma) = \frac{1}{2\pi |\Sigma|^{1/2}} \exp\!\big(-\tfrac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\big)$    (4)

Then for each class 100 observations are generated as follows: for each observation, one of the $(m_k)_i$ is picked at random with probability $1/10$, and the observation is then drawn from $N((m_k)_i, I/5)$, leading to a mixture of Gaussian clusters for each class. So

  $P(X = x \mid (m_k)_i, G_k) = f(x; (m_k)_i, I/5)$    (5)
  $P(X = x \mid m_k, G_k) = \sum_{i=1}^{10} \tfrac{1}{10}\, f(x; (m_k)_i, I/5)$    (6)

The values of $m_k$ are unknown, so we marginalize them out:

  $P(X = x \mid G_k) = \int P(X = x \mid m_k, G_k)\, P(m_k \mid G_k)\, dm_k$    (7)

From equation (2.23) in the textbook, the Bayes-optimal decision boundary consists of the values of $x$ that satisfy

  $P(G_{\mathrm{BLUE}} \mid X = x) = P(G_{\mathrm{ORANGE}} \mid X = x)$    (8)

Because both classes contain the same amount of training data,

  $P(G_{\mathrm{BLUE}}) = P(G_{\mathrm{ORANGE}})$    (9)

Bayes' formula tells us

  $P(G_k \mid X = x) = \dfrac{P(X = x \mid G_k)\, P(G_k)}{\sum_{k'} P(X = x \mid G_{k'})\, P(G_{k'})}$    (10)

Combining (8)-(10), the boundary is the set of $x$ with

  $P(X = x \mid G_{\mathrm{BLUE}}) = P(X = x \mid G_{\mathrm{ORANGE}})$    (11)

Thus the Bayes-optimal decision boundary can be computed from equations (1)-(11).

Ex. 2.4  The edge effect problem discussed on page 23 is not peculiar to uniform sampling from bounded domains. Consider inputs drawn from a spherical multinormal distribution $X \sim N(0, I_p)$. The squared distance from any sample point to the origin has a $\chi^2_p$ distribution with mean $p$. Consider a prediction point $x_0$ drawn from this distribution, and let $a = x_0 / \|x_0\|$ be an associated unit vector. Let $z_i = a^T x_i$ be the projection of each of the training points on this direction. Show that the $z_i$ are distributed $N(0, 1)$ with expected squared distance from the origin 1, while the target point has expected squared distance $p$ from the origin. Hence for $p = 10$, a randomly drawn test point is about 3.1 standard deviations from the origin, while all the training points are on average one standard deviation along direction $a$. So most prediction points see themselves as lying on the edge of the training set.

Proof  Since $x_i \sim N(0, I_p)$, each coordinate satisfies $(x_i)_j \sim N(0, 1)$ for $j \in \{1, 2, \dots, p\}$, hence $a_j (x_i)_j \sim N(0, a_j^2)$ and

  $z_i = \sum_{j=1}^p a_j (x_i)_j \sim N\Big(0, \sum_{j=1}^p a_j^2\Big) = N(0, 1)$

because $a$ is a unit vector. The expected squared distance of $z_i$ from the origin is therefore $E(z_i^2) = \mathrm{Var}(z_i) = 1$.

The squared distance $d$ from any sample point to the origin has a $\chi^2_p$ distribution with mean $p$, so the target point has expected squared distance $E(d) = p$ from the origin. For $p = 10$ this corresponds to a distance of $\sqrt{10} \approx 3.1$ from the origin, i.e. about 3.1 standard deviations along $a$, while the training points lie on average one standard deviation ($\mathrm{Std}(z_i) = 1$) along that direction.

Ex. 2.6  Consider a regression problem with inputs $x_i$ and outputs $y_i$, and a parameterized model $f_\theta(x)$ to be fit by least squares. Show that if there are observations with tied or identical values of $x$, then the fit can be obtained from a reduced weighted least squares problem.

Proof  "If there are multiple observation pairs $x_i, y_{i\ell}$, $\ell = 1, \dots, N_i$ at each value of $x_i$, the risk is limited." (Page 32.) We estimate the parameters $\theta$ by minimizing the residual sum of squares. Write $\bar y_i = \frac{1}{N_i} \sum_{\ell=1}^{N_i} y_{i\ell}$ for the average response at $x_i$. Then

  $\arg\min_\theta \sum_i \sum_{\ell=1}^{N_i} \big(y_{i\ell} - f_\theta(x_i)\big)^2$
  $= \arg\min_\theta \sum_i \sum_{\ell=1}^{N_i} \big(y_{i\ell}^2 - 2 y_{i\ell} f_\theta(x_i) + f_\theta(x_i)^2\big)$    // expanding
  $= \arg\min_\theta \sum_i \Big(\sum_{\ell=1}^{N_i} y_{i\ell}^2 - 2 N_i \bar y_i f_\theta(x_i) + N_i f_\theta(x_i)^2\Big)$
  $= \arg\min_\theta \sum_i N_i \big(\bar y_i^2 - 2 \bar y_i f_\theta(x_i) + f_\theta(x_i)^2\big)$    // $\sum_\ell y_{i\ell}^2 - N_i \bar y_i^2$ does not depend on $\theta$
  $= \arg\min_\theta \sum_i N_i \big(\bar y_i - f_\theta(x_i)\big)^2$

So the fit can be obtained from a reduced weighted least squares problem over the distinct values $x_i$, with responses $\bar y_i$ and weights $N_i$. In this case the solution fits the average values of the $y_{i\ell}$ at each $x_i$.

Ex. Additional  Derive equation (2.47):

  $\mathrm{EPE}_k(x_0) = \sigma^2 + \Big[f(x_0) - \tfrac{1}{k} \sum_{\ell=1}^k f(x_{(\ell)})\Big]^2 + \tfrac{\sigma^2}{k}$

Answer  Suppose the data arise from a model $Y = f(X) + \varepsilon$ with $\varepsilon$ independent of $X$, $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$. The expectation below is over the test noise and the training responses, with the training inputs, and hence the $k$ nearest neighbors $x_{(\ell)}$ of $x_0$, held fixed. The $k$-nearest-neighbor estimate averages the responses at those neighbors:

  $\hat f_k(x_0) = \frac{1}{k} \sum_{\ell=1}^k y_{(\ell)} = \frac{1}{k} \sum_{\ell=1}^k \big(f(x_{(\ell)}) + \varepsilon_\ell\big)$    (12)

so

  $E\big(\hat f_k(x_0)\big) = \frac{1}{k} \sum_{\ell=1}^k f(x_{(\ell)})$    (13)
  $\mathrm{Var}\big(\hat f_k(x_0)\big) = \frac{1}{k^2} \sum_{\ell=1}^k \mathrm{Var}(\varepsilon_\ell) = \frac{\sigma^2}{k}$    (14)

Now we can derive equation (2.47):

  $\mathrm{EPE}_k(x_0) = E\big[(Y - \hat f_k(x_0))^2 \mid X = x_0\big]$
  $= E\big[(f(x_0) + \varepsilon - \hat f_k(x_0))^2\big]$
  $= E(\varepsilon^2) + E\big[(f(x_0) - \hat f_k(x_0))^2\big]$    // $\varepsilon$ is independent of $\hat f_k(x_0)$ and $E(\varepsilon) = 0$
  $= \sigma^2 + \big(f(x_0) - E(\hat f_k(x_0))\big)^2 + \mathrm{Var}\big(\hat f_k(x_0)\big)$    // bias-variance decomposition
  $= \sigma^2 + \Big[f(x_0) - \tfrac{1}{k} \sum_{\ell=1}^k f(x_{(\ell)})\Big]^2 + \tfrac{\sigma^2}{k}$    // by (13) and (14)
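The equivalence in Ex. 2.1 is easy to check numerically. A minimal sketch, where the class count K = 5 and the Dirichlet draws (which guarantee the elements of $\hat y$ sum to one) are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 5
targets = np.eye(K)                        # t_k: one-hot target vectors

mismatches = 0
for _ in range(1000):
    y_hat = rng.dirichlet(np.ones(K))      # elements of y_hat sum to one
    by_argmax = int(np.argmax(y_hat))      # classify to the largest element
    dists = np.linalg.norm(targets - y_hat, axis=1)
    by_closest = int(np.argmin(dists))     # classify to the closest target
    mismatches += (by_argmax != by_closest)

# the two rules agree on every draw: mismatches == 0
```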
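In practice the boundary in Ex. 2.2's equation (11) is found numerically. A sketch of the book's generating recipe, conditioning on one fixed draw of the 10 means per class rather than carrying out the marginalization in (7); the seed and the evaluation point are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
I2 = np.eye(2)

# 10 component means per class, drawn as in the book's recipe
m_blue   = rng.multivariate_normal([1.0, 0.0], I2, size=10)
m_orange = rng.multivariate_normal([0.0, 1.0], I2, size=10)

def mixture_density(x, means, cov_scale=1 / 5):
    """p(x | m_k, G_k): equal-weight mixture of N((m_k)_i, I/5), eq. (6)."""
    sq = np.sum((means - x) ** 2, axis=1)          # squared distances to each mean
    return np.mean(np.exp(-sq / (2 * cov_scale)) / (2 * np.pi * cov_scale))

def bayes_classify(x):
    """Bayes rule with equal priors: pick the class with the larger density."""
    return ("BLUE" if mixture_density(x, m_blue) >
                      mixture_density(x, m_orange) else "ORANGE")

label = bayes_classify(np.array([1.0, 0.0]))
```

Evaluating `bayes_classify` on a grid and tracing where the two densities cross reproduces the Bayes boundary of Figure 2.5 for this draw of the means.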
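Both claims in Ex. 2.4 can be verified by simulation. A sketch, assuming p = 10 and a sample size chosen large enough for the empirical moments to settle:

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 10, 200_000

x0 = rng.standard_normal(p)           # prediction point from N(0, I_p)
a = x0 / np.linalg.norm(x0)           # associated unit vector

xi = rng.standard_normal((n, p))      # training points from N(0, I_p)
z = xi @ a                            # projections z_i = a^T x_i

var_z = z.var()                                    # ~1, since z_i ~ N(0, 1)
mean_sq_dist = np.mean(np.sum(xi ** 2, axis=1))    # ~p, the chi^2_p mean
```

Here `var_z` comes out near 1 while `mean_sq_dist` comes out near 10, i.e. the test point sits about sqrt(10) ≈ 3.1 standard deviations out along `a`.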
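The reduction in Ex. 2.6 can be checked for a linear model $f_\theta(x) = \theta_0 + \theta_1 x$: ordinary least squares on the full replicated data and weighted least squares on the per-$x$ averages (weights $N_i$) give the same coefficients. The data below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# replicated observations: N_i responses at each distinct x_i
x_unique = np.array([0.0, 1.0, 2.0, 3.0])
counts = np.array([3, 1, 4, 2])                     # the N_i
x_full = np.repeat(x_unique, counts)
y_full = 2.0 * x_full - 1.0 + rng.standard_normal(x_full.size)

# ordinary least squares on the full data
A_full = np.column_stack([np.ones_like(x_full), x_full])
beta_full, *_ = np.linalg.lstsq(A_full, y_full, rcond=None)

# reduced weighted least squares: responses y_bar_i, weights N_i
y_bar = np.array([y_full[x_full == x].mean() for x in x_unique])
A = np.column_stack([np.ones_like(x_unique), x_unique])
W = np.diag(counts.astype(float))
beta_red = np.linalg.solve(A.T @ W @ A, A.T @ W @ y_bar)

# the two fits coincide: np.allclose(beta_full, beta_red) is True
```

The agreement follows because both problems share the same normal equations, as the derivation shows.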
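Equation (2.47) can be checked by Monte Carlo, holding the k nearest inputs fixed and redrawing the training and test noise on each round. The function f, the neighbor locations, sigma, and the seed below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)
sigma, k, n_sim = 1.0, 3, 200_000

f = lambda x: x ** 2                     # a fixed true regression function
x0 = 0.5                                 # the prediction point
x_nn = np.array([0.4, 0.55, 0.7])        # the k nearest training inputs (held fixed)

# redraw training noise and test noise on every simulation round
eps_train = rng.standard_normal((n_sim, k)) * sigma
eps_test = rng.standard_normal(n_sim) * sigma
f_hat = (f(x_nn) + eps_train).mean(axis=1)          # k-NN average at x0
epe_mc = np.mean((f(x0) + eps_test - f_hat) ** 2)   # Monte Carlo EPE

# theoretical value from equation (2.47)
bias_sq = (f(x0) - f(x_nn).mean()) ** 2
epe_theory = sigma ** 2 + bias_sq + sigma ** 2 / k
```

With these settings `epe_mc` agrees with `epe_theory` to within Monte Carlo error.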
