Machine Learning Exam: finals14


CS 189 Spring 2014 Introduction to Machine Learning Final

• You have 3 hours for the exam.
• The exam is closed book, closed notes except your one-page crib sheet.
• Please use non-programmable calculators only.
• Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.
• For true/false questions, fill in the True/False bubble.
• For multiple-choice questions, fill in the bubbles for ALL CORRECT CHOICES (in some cases, there may be more than one). We have introduced a negative penalty for false positives for the multiple-choice questions such that the expected value of randomly guessing is 0. Don't worry: for this section, your score will be the maximum of your score and 0, thus you cannot incur a negative score for this section.

First name ____ Last name ____ SID ____
First and last name of student to your left ____
First and last name of student to your right ____

For staff use only:
Q1. True or False /9
Q2. Multiple Choice /24
Q3. Softmax regression /11
Q4. PCA and least squares /10
Q5. Mixture of linear regressions /10
Q6. Training set augmentation /10
Q7. Kernel PCA /12
Q8. Autoencoder /14
Total /100

Q1. [9 pts] True or False

(a) [1 pt] The singular value decomposition of a real matrix is unique. (True / False)

(b) [1 pt] A multiple-layer neural network with linear activation functions is equivalent to one single-layer perceptron that uses the same error function on the output layer and has the same number of inputs. (True / False)

(c) [1 pt] The maximum likelihood estimator for the parameter θ of a uniform distribution over [0, θ] is unbiased. (True / False)

(d) [1 pt] The k-means algorithm for clustering is guaranteed to converge to a local optimum. (True / False)

(e) [1 pt] Increasing the depth of a decision tree cannot increase its training error. (True / False)

(f) [1 pt] There exists a one-to-one feature mapping for every valid kernel k. (True / False)

(g) [1 pt] For high-dimensional data, k-d trees can be slower than brute-force nearest neighbor search. (True / False)

(h) [1 pt] If we had infinite data and infinitely fast computers, kNN would be the only algorithm we would study in CS 189. (True / False)

(i) [1 pt] For datasets with high label noise (many data points with incorrect labels), random forests would generally perform better than boosted decision trees. (True / False)

Q2. [24 pts] Multiple Choice

(a) [2 pts] In Homework 4, you fit a logistic regression model on spam and ham data for a Kaggle competition. Assume you had a very good score on the public test set, but when the GSIs ran your model on a private test set, your score dropped a lot. This is likely because you overfitted by submitting multiple times and changing the following between submissions:
○ your penalty term
○ your step size
○ your convergence criterion
○ fixing a random bug

(b) [2 pts] Given d-dimensional data {x_i}, i = 1…N, you run principal component analysis and pick P principal components. Can you always reconstruct any data point x_i for i ∈ {1…N} from the P principal components with zero reconstruction error?
○ Yes, if P < d
○ Yes, if P < n
○ Yes, if P = d
○ No, always

(c) [2 pts] Putting a standard Gaussian prior on the weights for linear regression (w ∼ N(0, I)) will result in what type of posterior distribution on the weights?
○ Laplace
○ Poisson
○ Uniform
○ None of the above

(d) [2 pts] Suppose we have N instances of d-dimensional data. Let h be the amount of data storage necessary for a histogram with a fixed number of ticks per axis, and let k be the amount of data storage necessary for kernel density estimation. Which of the following is true about h and k?
○ h and k grow linearly with N
○ h and k grow exponentially with d
○ h grows exponentially with d, and k grows linearly with N
○ h grows linearly with N, and k grows exponentially with d

(e) [2 pts] Which of these classifiers could have generated this decision boundary? (decision boundary figure omitted)
○ Linear SVM
○ Logistic regression
○ 1-NN
○ None of the above

(f) [2 pts] Which of these classifiers could have generated this decision boundary? (decision boundary figure omitted)
○ Linear SVM
○ Logistic regression
○ 1-NN
○ None of the above

(g) [2 pts] Which of these classifiers could have generated this decision boundary? (decision boundary figure omitted)
○ Linear SVM
○ Logistic regression
○ 1-NN
○ None of the above

(h) [2 pts] You want to cluster this data into 2 clusters. Which of these algorithms would work well? (data scatterplot omitted)
○ K-means
○ GMM clustering
○ Mean shift clustering

(i) [2 pts] You want to cluster this data into 2 clusters. Which of these algorithms would work well? (data scatterplot omitted)
○ K-means
○ GMM clustering
○ Mean shift clustering

(j) [2 pts] You want to cluster this data into 2 clusters. Which of these algorithms would work well? (data scatterplot omitted)
○ K-means
○ GMM clustering
○ Mean shift clustering

The following questions are about how to help CS 189 TA Jonathan Snow to solve the homework.

(k) [2 pts] Jonathan just trained a decision tree for digit recognition. He notices an extremely low training error, but an abnormally large test error. He also notices that an SVM with a linear kernel performs much better than his tree. What could be the cause of his problem?
○ Decision tree is too deep
○ Learning rate too high
○ Decision tree is overfitting
○ There is too much training data

(l) [2 pts] Jonathan has now switched to multilayer neural networks and notices that the training error is going down and converges to a local minimum. Then when he tests on the new data, the test error is abnormally high. What is probably going wrong, and what do you recommend he do?
○ The training data size is not large enough. Collect a larger training data set and retrain it.
○ Use a different initialization and train the network several times. Use the average of predictions from all nets to predict test data.
○ Play with the learning rate and add a regularization term to the objective function.
○ Use the same training data but add two more hidden layers.

Q3. [11 pts] Softmax regression

Recall the setup of logistic regression: we assume that the posterior probability is of the form

p(Y = 1 | x) = 1 / (1 + e^(−wᵀx))

This assumes that Y | X is a Bernoulli random variable. We now turn to the case where Y | X is a multinomial random variable over K out…
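The softmax generalization that Q3 builds toward can be sketched numerically. The snippet below is a minimal illustration, not the exam's solution: the weight-matrix shape and variable names (`W`, `softmax_posterior`) are assumptions for the example, and the K-class posterior shown is the standard softmax form.

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability;
    # this leaves the resulting probabilities unchanged.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def softmax_posterior(W, x):
    # p(Y = k | x) for K classes, with one weight row per class
    # (W has shape (K, d); this layout is illustrative).
    return softmax(W @ x)

# With K = 2 and the second class's weights fixed at zero, the class-0
# posterior reduces to the logistic sigmoid 1 / (1 + e^(-w.x)) from Q3.
w = np.array([1.0, -2.0])
x = np.array([0.5, 0.25])
W = np.vstack([w, np.zeros_like(w)])
p = softmax_posterior(W, x)
```

This makes concrete why the Bernoulli setup is a special case: pinning one class's weight vector at zero removes the redundant degree of freedom, and the two-class softmax collapses to the logistic form.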
