CS 189 Spring 2015 Introduction to Machine Learning Final

• You have 2 hours 50 minutes for the exam.
• The exam is closed book, closed notes except your one-page (two-sided) cheat sheet.
• No calculators or electronic items.
• Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation and state your assumptions.
• For true/false questions, fill in the True/False bubble.
• For multiple-choice questions, fill in the bubble for EXACTLY ONE choice that represents the best answer to the question.

First name ____________  Last name ____________  SID ____________
First and last name of student to your left ____________
First and last name of student to your right ____________

For staff use only:
Q1. True or False                          /44
Q2. Multiple Choice                        /33
Q3. Decision Theory                         /9
Q4. Parameter Estimation                    /8
Q5. Locally Weighted Logistic Regression   /14
Q6. Decision Trees                          /7
Q7. Convolutional Neural Nets              /11
Q8. Streaming k-means                       /9
Q9. Low Dimensional Decompositions         /15
Total                                     /150

Q1. [44 pts] True or False

(a) [2 pts] A neural network with multiple hidden layers and sigmoid nodes can form non-linear decision boundaries.
◯ True  ◯ False

(b) [2 pts] All neural networks compute non-convex functions of their parameters.
◯ True  ◯ False

(c) [2 pts] For logistic regression, with parameters optimized using a stochastic gradient method, setting parameters to 0 is an acceptable initialization.
◯ True  ◯ False

(d) [2 pts] For arbitrary neural networks, with weights optimized using a stochastic gradient method, setting weights to 0 is an acceptable initialization.
◯ True  ◯ False

(e) [2 pts] Given a design matrix X ∈ R^{n×d}, where d ≤ n, if we project our data onto a k-dimensional subspace using PCA where k equals the rank of X, we recreate a perfect representation of our data with no loss.
◯ True  ◯ False

(f) [2 pts] Hierarchical clustering methods require a predefined number of clusters, much like k-means.
◯ True  ◯ False

(g) [2 pts] Given a predefined number of clusters k, globally minimizing the k-means objective function is NP-hard.
◯ True  ◯ False

(h) [2 pts] Using cross validation to select hyperparameters will guarantee that our model does not overfit.
◯ True  ◯ False

(i) [2 pts] A random forest is an ensemble learning method that attempts to lower the bias error of decision trees.
◯ True  ◯ False

(j) [2 pts] Bagging algorithms attach weights w_1 ... w_n to a set of N weak learners. They re-weight the learners and convert them into strong ones. Boosting algorithms draw N sample distributions (usually with replacement) from an original data set for learners to train on.
◯ True  ◯ False

(k) [2 pts] Given any matrix X, its singular values are the eigenvalues of XX^T and X^T X.
◯ True  ◯ False

(l) [2 pts] Given any matrix X, (X^T X + λI)^{-1} for λ ≠ 0 always exists.
◯ True  ◯ False

(m) [2 pts] Backpropagation is motivated by utilizing the Chain Rule and Dynamic Programming to conserve mathematical calculations.
◯ True  ◯ False

(n) [2 pts] An infinite depth binary Decision Tree can always achieve 100% training accuracy, provided that no point is mislabeled in the training set.
◯ True  ◯ False

(o) [2 pts] In One vs All Multi-Class Classification in SVM, we are trying to classify an input data point X as one of the N classes (C_1 ... C_n), each of which has a parameter vector w_1 ... w_n. We classify point X as the class C_i which maximizes the inner product of X and w_i.
◯ True  ◯ False

(p) [2 pts] The number of parameters in a parametric model is fixed, while the number of parameters in a non-parametric model grows with the amount of training data.
◯ True  ◯ False

(q) [2 pts] As model complexity increases, bias will decrease while variance will increase.
◯ True  ◯ False

(r) [2 pts] Consider a cancer diagnosis classification problem where almost all of the people being diagnosed don't have cancer. The probability of correct classification is the most important metric to optimize.
◯ True  ◯ False

(s) [2 pts] For the 1-Nearest Neighbors algorithm, as the number of data points increases to infinity in our data set, the error of our algorithm is guaranteed to be bounded by twice the Bayes Risk.
◯ True  ◯ False

(t) [2 pts] Increasing the dimensionality of our data always decreases our misclassification rate.
◯ True  ◯ False

(u) [2 pts] It is possible to represent a XOR function with a neural network without a hidden layer.
◯ True  ◯ False

(v) [2 pts] At high dimensionality, the KD tree speedup to the nearest neighbor can be slower than the naive nearest neighbor implementation.
◯ True  ◯ False

Q2. [33 pts] Multiple Choice

(a) [3 pts] Given a Neural Net with N input nodes, no hidden layers, one output node, with Entropy Loss and Sigmoid Activation Functions, which of the following algorithms (with the proper hyper-parameters and initialization) can be used to find the global optimum?
◯ Simulated Annealing (Gradient Descent with restarts)
◯ Stochastic Gradient Descent
◯ Mini-Batch Gradient Descent
◯ Batch Gradient Descent
◯ All of the above
◯ None of the above

(b) [3 pts] Given the function f(x) = |x^2 + 3| - 1 defined on R:
◯ Newton's Method on minimizing gradients will always converge to the global optimum in one iteration from any starting location
◯ Stochastic Gradient Descent will always converge to the global optimum in one iteration
◯ The problem is nonconvex, so it is not feasible to find a solution.
◯ All of the above
◯ None of the above

(c) [3 pts] Daniel wants to minimize a convex loss function f(x) using stochastic gradient descent. Given a random starting point, mark the condition that would guarantee that stochastic gradient descent will converge to the global optimum. Let η_t > 0 be the step size at iteration t.
◯ Constant step size η_t
◯ Decreasing step size η_t = 1/√t
◯ Decreasing step size η_t = 1/t^2
◯ All of the above
◯ None of the above

(d) [3 pts] Which of the following is true of logistic regression?
◯ It can be motivated by log odds
◯ The optimal weight vector can be found using MLE.
◯ It can be used with L1 regularization
◯ All of the above
◯ None of the above

(e) [3 pts] You've just finished training a decision tree for spam classification, and it is getting abnormally bad performance on both your training and test sets. You know that y