10-701 Midterm Exam, Spring 2005

1. Write your name and your email address below.
   Name:
   Email address:
2. There should be 15 numbered pages in this exam (including this cover sheet).
3. Write your name at the top of EVERY page in the exam.
4. You may use any and all books, papers, and notes that you brought to the exam, but not materials brought by nearby students. No laptops, PDAs, or Internet access.
5. If you need more room to work out your answer to a question, use the back of the page and clearly mark on the front of the page if we are to look at what's on the back.
6. Work efficiently. Some questions are easier, some more difficult. Be sure to give yourself time to answer all of the easy ones, and avoid getting bogged down in the more difficult ones before you have answered the easier ones.
7. Note there is one extra-credit question. The grade curve will be made without considering students' extra credit points. The extra credit will then be used to try to bump your grade up without affecting anyone else's grade.
8. You have 80 minutes.
9. Good luck!

Question                      Number of points   Score
1. Big Picture                10
2. Short Questions            15
3. Learning Algorithms        16
4. Decision Trees             16
5. Loss Functions and SVMs    23
6. Learning Theory            20
Total                         100
Extra credit:
7. Bias-Variance Trade-off    18

1  [10 points] Big Picture

Following the example given, add 10 edges to Figure 1 relating pairs of algorithms. Each edge should be labeled with one characteristic the methods share, and one difference. These labels should be short and address basic concepts, such as types of learning problems, loss functions, and hypothesis spaces.

[Figure 1: Big picture. Nodes: Naïve Bayes, Logistic regression, Neural nets, Boosting, SVMs, Instance-based learning, SVM regression, kernel regression, linear regression, Decision trees. The example edge connects Logistic regression and SVMs, labeled "H-space: both linear" (shared) and "Log-loss v. Hinge-loss" (difference).]

Solution: One solution is shown below; there are many others.

[Figure 2: Big picture solutions.]

2  [15 points] Short Questions

(a) [3 points] Briefly describe the difference between a maximum likelihood hypothesis and a maximum a posteriori hypothesis.

Solution:
ML: maximize the likelihood of the data given the model, i.e., argmax_W P(Data | W).
MAP: argmax_W P(W | Data).

(b) [4 points] Consider a naive Bayes classifier with 3 boolean input variables, X1, X2 and X3, and one boolean output, Y.

• How many parameters must be estimated to train such a naive Bayes classifier? (You need not list them unless you wish to; just give the total.)

Solution: For a naive Bayes classifier, we need to estimate P(Y=1), P(X1=1 | Y=0), P(X2=1 | Y=0), P(X3=1 | Y=0), P(X1=1 | Y=1), P(X2=1 | Y=1), and P(X3=1 | Y=1). The other probabilities can be obtained from the constraint that probabilities sum to 1. So we need to estimate 7 parameters.

• How many parameters would have to be estimated to learn the above classifier if we do not make the naive Bayes conditional independence assumption?

Solution: Without the conditional independence assumption, we still need to estimate P(Y=1). For each value of Y, we need the probability of every configuration of (X1, X2, X3), i.e., 2^3 possible configurations. Given the constraint that the probabilities sum to 1, we need to estimate 2^3 - 1 = 7 parameters per class. Therefore the total number of parameters is 1 + 2(2^3 - 1) = 15.
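As a concrete check of the counting argument above, here is a minimal Python sketch (not part of the original exam) that computes both totals for n boolean input variables and a boolean output:

```python
# Parameter counts for a boolean-output classifier with n boolean inputs.

def naive_bayes_param_count(n: int) -> int:
    # 1 parameter for P(Y=1), plus P(X_i=1 | Y=y) for each of the
    # n features and each of the 2 classes.
    return 1 + 2 * n

def full_joint_param_count(n: int) -> int:
    # 1 parameter for P(Y=1), plus, for each class, a full table over
    # the 2^n feature configurations (minus 1 per class for the
    # sum-to-one constraint).
    return 1 + 2 * (2 ** n - 1)

assert naive_bayes_param_count(3) == 7    # matches the solution above
assert full_joint_param_count(3) == 15    # matches the solution above
```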
(c) [8 points] True or False? If true, explain why in at most two sentences. If false, explain why or give a brief counterexample in at most two sentences.

• (True or False?) The error of a hypothesis measured over its training set provides a pessimistically biased estimate of the true error of the hypothesis.

Solution: False. The training error is optimistically biased, since it is usually smaller than the true error.

• (True or False?) If you are given m data points, and use half for training and half for testing, the difference between training error and test error decreases as m increases.

Solution: True. As we get more and more data, the training error increases and the test error decreases, and both converge to the true error.

• (True or False?) Overfitting is more likely when the set of training data is small.

Solution: True. With a small training data set, it is easier to find a hypothesis that fits the training data exactly, i.e., to overfit.

• (True or False?) Overfitting is more likely when the hypothesis space is small.

Solution: False. We can see this from the bias-variance trade-off: when the hypothesis space is small, the learner is more biased with less variance. So with a small hypothesis space, it is less likely that we find a hypothesis that fits the data very closely, i.e., overfits.

3  [16 points] Learning Algorithms

Consider learning a target function of the form f: R^2 -> {A, B, C}, that is, a function with 3 discrete values defined over the 2-dimensional plane. Consider the following learning algorithms:

• Decision trees
• Logistic regression
• Support Vector Machines
• 1-nearest neighbor

Note each of these algorithms can be used to learn our target function f, though doing so might require a common extension (e.g., in the case of decision trees, we need to utilize the usual method for handling real-valued input attributes). For each of these algorithms,

A. Describe any assumptions you are making about the variant of the algorithm you would use.
B. Draw in the decision surface that would be learned given this training data (and describe any ambiguities in your decision surface).
C. Circle any examples that would be misclassified in a leave-one-out evaluation of this algorithm with this data. That is, if you were to repeatedly train on n-1 of these examples, and use the learned classifier to label the left-out example, would it be misclassified? (A small sketch of this procedure appears after the solution below.)

Solution: The assumptions are as follows.
Decision trees: handle real-valued attributes by discretizing.
Logistic regression: handle non-binary classification.
SVM: use a one-against-all approach and a linear kernel.
1-NN: the x-axis and y-axis features are unweighted.
Please see the figures on the right for the decision surfaces and the examples misclassified under leave-one-out evaluation.

[Figure: decision surfaces and misclassified examples for each algorithm; first panel shows the Decision trees surface. The training points themselves are not recoverable from the text.]
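For reference, here is a minimal sketch of the leave-one-out evaluation described in part C, using 1-nearest neighbor. The points and labels below are hypothetical placeholders (the exam's actual training data lives in the figure, which is not reproduced here), and scikit-learn is assumed to be available; the same loop works for any of the four classifiers.

```python
# Leave-one-out evaluation of a 1-NN classifier on 2-D points with
# labels in {A, B, C}. Each iteration trains on n-1 points and tests
# on the single held-out point, as the exam question describes.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import LeaveOneOut

# Hypothetical placeholder data, NOT the exam's figure data.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [2.0, 2.0], [2.0, 3.0], [3.0, 2.0]])
y = np.array(["A", "A", "A", "B", "B", "C"])

misclassified = []
for train_idx, test_idx in LeaveOneOut().split(X):
    clf = KNeighborsClassifier(n_neighbors=1)
    clf.fit(X[train_idx], y[train_idx])
    if clf.predict(X[test_idx])[0] != y[test_idx][0]:
        misclassified.append(int(test_idx[0]))

# These are the examples one would "circle" in part C.
print("Misclassified under leave-one-out:", misclassified)
```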