Machine Learning Exam: finals14


CS 189 Spring 2014 Introduction to Machine Learning Final

• You have 3 hours for the exam.
• The exam is closed book, closed notes except your one-page crib sheet.
• Please use non-programmable calculators only.
• Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.
• For true/false questions, fill in the True/False bubble.
• For multiple-choice questions, fill in the bubbles for ALL CORRECT CHOICES (in some cases, there may be more than one). We have introduced a negative penalty for false positives for the multiple-choice questions such that the expected value of randomly guessing is 0. Don't worry: for this section, your score will be the maximum of your score and 0, thus you cannot incur a negative score for this section.

First name ____ Last name ____ SID ____
First and last name of student to your left ____
First and last name of student to your right ____

For staff use only:
Q1. True or False /9
Q2. Multiple Choice /24
Q3. Softmax regression /11
Q4. PCA and least squares /10
Q5. Mixture of linear regressions /10
Q6. Training set augmentation /10
Q7. Kernel PCA /12
Q8. Autoencoder /14
Total /100

Q1. [9 pts] True or False

(a) [1 pt] The singular value decomposition of a real matrix is unique. (True / False)

(b) [1 pt] A multiple-layer neural network with linear activation functions is equivalent to one single-layer perceptron that uses the same error function on the output layer and has the same number of inputs. (True / False)

(c) [1 pt] The maximum likelihood estimator for the parameter θ of a uniform distribution over [0, θ] is unbiased. (True / False)

(d) [1 pt] The k-means algorithm for clustering is guaranteed to converge to a local optimum. (True / False)

(e) [1 pt] Increasing the depth of a decision tree cannot increase its training error. (True / False)

(f) [1 pt] There exists a one-to-one feature mapping for every valid kernel k. (True / False)

(g) [1 pt] For high-dimensional data, k-d trees can be slower than brute-force nearest neighbor search. (True / False)

(h) [1 pt] If we had infinite data and infinitely fast computers, kNN would be the only algorithm we would study in CS 189. (True / False)

(i) [1 pt] For datasets with high label noise (many data points with incorrect labels), random forests would generally perform better than boosted decision trees. (True / False)

Q2. [24 pts] Multiple Choice

(a) [2 pts] In Homework 4, you fit a logistic regression model on spam and ham data for a Kaggle competition. Assume you had a very good score on the public test set, but when the GSIs ran your model on a private test set, your score dropped a lot. This is likely because you overfitted by submitting multiple times and changing the following between submissions:
○ your penalty term
○ your step size
○ your convergence criterion
○ fixing a random bug

(b) [2 pts] Given d-dimensional data {x_i}, i = 1…N, you run principal component analysis and pick P principal components. Can you always reconstruct any data point x_i for i ∈ {1…N} from the P principal components with zero reconstruction error?
○ Yes, if P < d
○ Yes, if P < n
○ Yes, if P = d
○ No, always

(c) [2 pts] Putting a standard Gaussian prior on the weights for linear regression (w ∼ N(0, I)) will result in what type of posterior distribution on the weights?
○ Laplace
○ Poisson
○ Uniform
○ None of the above

(d) [2 pts] Suppose we have N instances of d-dimensional data. Let h be the amount of data storage necessary for a histogram with a fixed number of ticks per axis, and let k be the amount of data storage necessary for kernel density estimation. Which of the following is true about h and k?
○ h and k grow linearly with N
○ h and k grow exponentially with d
○ h grows exponentially with d, and k grows linearly with N
○ h grows linearly with N, and k grows exponentially with d

(e) [2 pts] Which of these classifiers could have generated this decision boundary? (decision boundary figure omitted)
○ Linear SVM
○ Logistic regression
○ 1-NN
○ None of the above

(f) [2 pts] Which of these classifiers could have generated this decision boundary? (decision boundary figure omitted)
○ Linear SVM
○ Logistic regression
○ 1-NN
○ None of the above

(g) [2 pts] Which of these classifiers could have generated this decision boundary? (decision boundary figure omitted)
○ Linear SVM
○ Logistic regression
○ 1-NN
○ None of the above

(h) [2 pts] You want to cluster this data into 2 clusters. Which of these algorithms would work well? (data scatterplot omitted)
○ K-means
○ GMM clustering
○ Mean shift clustering

(i) [2 pts] You want to cluster this data into 2 clusters. Which of these algorithms would work well? (data scatterplot omitted)
○ K-means
○ GMM clustering
○ Mean shift clustering

(j) [2 pts] You want to cluster this data into 2 clusters. Which of these algorithms would work well? (data scatterplot omitted)
○ K-means
○ GMM clustering
○ Mean shift clustering

The following questions are about how to help CS 189 TA Jonathan Snow to solve the homework.

(k) [2 pts] Jonathan just trained a decision tree for digit recognition. He notices an extremely low training error, but an abnormally large test error. He also notices that an SVM with a linear kernel performs much better than his tree. What could be the cause of his problem?
○ Decision tree is too deep
○ Learning rate too high
○ Decision tree is overfitting
○ There is too much training data

(l) [2 pts] Jonathan has now switched to multilayer neural networks and notices that the training error is going down and converges to a local minimum. Then when he tests on the new data, the test error is abnormally high. What is probably going wrong, and what do you recommend he do?
○ The training data size is not large enough. Collect a larger training data set and retrain it.
○ Use a different initialization and train the network several times. Use the average of predictions from all nets to predict test data.
○ Play with the learning rate and add a regularization term to the objective function.
○ Use the same training data but add two more hidden layers.

Q3. [11 pts] Softmax regression

Recall the setup of logistic regression: we assume that the posterior probability is of the form

p(Y = 1 | x) = 1 / (1 + e^(−wᵀx))

This assumes that Y | X is a Bernoulli random variable. We now turn to the case where Y | X is a multinomial random variable over K out…
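The softmax generalization that Q3 builds toward can be sketched numerically. The snippet below is a minimal illustration, not the exam's solution: the weight-matrix shape and variable names (`W`, `softmax_posterior`) are assumptions for the example, and the K-class posterior shown is the standard softmax form.

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability;
    # this leaves the resulting probabilities unchanged.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def softmax_posterior(W, x):
    # p(Y = k | x) for K classes, with one weight row per class
    # (W has shape (K, d); this layout is illustrative).
    return softmax(W @ x)

# With K = 2 and the second class's weights fixed at zero, the class-0
# posterior reduces to the logistic sigmoid 1 / (1 + e^(-w.x)) from Q3.
w = np.array([1.0, -2.0])
x = np.array([0.5, 0.25])
W = np.vstack([w, np.zeros_like(w)])
p = softmax_posterior(W, x)
```

This makes concrete why the Bernoulli setup is a special case: pinning one class's weight vector at zero removes the redundant degree of freedom, and the two-class softmax collapses to the logistic form.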
