ANDREWID(CAPITALS):NAME(CAPITALS):10-701/15-781Final,Fall2003•Youhave3hours.•Thereare10questions.Ifyougetstuckononequestion,moveontoothersandcomebacktothedifficultquestionlater.•Themaximumpossibletotalscoreis100.•Unlessotherwisestatedthereisnoneedtoshowyourworking.•Goodluck!11ShortQuestions(16points)(a)Traditionally,whenwehaveareal-valuedinputattributeduringdecision-treelearningweconsiderabinarysplitaccordingtowhethertheattributeisaboveorbelowsomethreshold.Patsuggeststhatinsteadweshouldjusthaveamultiwaysplitwithonebranchforeachofthedistinctvaluesoftheattribute.FromthelistbelowchoosethesinglebiggestproblemwithPat’ssuggestion:(i)Itistoocomputationallyexpensive.(ii)Itwouldprobablyresultinadecisiontreethatscoresbadlyonthetrainingsetandatestset.(iii)Itwouldprobablyresultinadecisiontreethatscoreswellonthetrainingsetbutbadlyonatestset.(iv)Itwouldprobablyresultinadecisiontreethatscoreswellonatestsetbutbadlyonatrainingset.(b)YouhaveadatasetwiththreecategoricalinputattributesA,BandC.ThereisonecategoricaloutputattributeY.YouaretryingtolearnaNaiveBayesClassifierforpredictingY.WhichoftheseBayesNetdiagramsrepresentsthenaivebayesclassifierassumption?ACBACBYYACBYACBY(i)(iii)(ii)(iv)(iii)(c)Foraneuralnetwork,whichoneofthesestructuralassumptionsistheonethatmostaffectsthetrade-offbetweenunderfitting(i.e.ahighbiasmodel)andoverfitting(i.e.ahighvariancemodel):(i)Thenumberofhiddennodes(ii)Thelearningrate(iii)Theinitialchoiceofweights(iv)Theuseofaconstant-termunitinput2(d)Forpolynomialregression,whichoneofthesestructuralassumptionsistheonethatmostaffectsthetrade-offbetweenunderfittingandoverfitting:(i)Thepolynomialdegree(ii)Whetherwelearntheweightsbymatrixinversionorgradientdescent(iii)TheassumedvarianceoftheGaussiannoise(iv)Theuseofaconstant-termunitinput(e)ForaGaussianBayesclassifier,whichoneofthesestructuralassumptionsistheonethatmostaffectsthetrade-offbetweenunderfittingandoverfitting:(i)WhetherwelearntheclasscentersbyMaximumLikelihoodorGradientDescent(ii)Whetherweassumefullclasscovariancematricesordiagonalclasscovariancematrices(iii)Whetherwehaveequalclasspriorsorpriorsestimatedfromthedata.(iv)Whetherweallowclassestohavedifferentmeanvectorsorweforcethemtosharethesamemeanvector(f)ForKernelRegression,whichoneofthesestructuralassumptionsistheonethatmostaffectsthetrade-offbetweenunderfittingandoverfitting:(i)WhetherkernelfunctionisGaussianversustriangularversusbox-shaped(ii)WhetherweuseEuclidianversusL1versusL∞metrics(iii)Thekernelwidth(iv)Themaximumheightofthekernelfunction(g)(TrueorFalse)GiventwoclassifiersAandB,ifAhasalowerVC-dimensionthanBthenAalmostcertainlywillperformbetteronatestset.(h)P(GoodMovie|IncludesTomCruise)=0.01P(GoodMovie|TomCruiseabsent)=0.1P(TomCruiseinarandomlychosenmovie)=0.01WhatisP(TomCruiseisinthemovie|NotaGoodMovie)?T∼TomCruiseisinthemovieG∼GoodMovieP(T|˜G)=P(T,˜G)P(˜G)=P(˜G|T)P(T)P(˜G|T)P(T)+P(˜G|˜T)P(˜T)=0.01×(1−0.01)0.01×(1−0.01)+(1−0.1)×(1−0.01)=1/91≈0.0109932MarkovDecisionProcesses(13points)Forthisquestionitmightbehelpfultorecallthefollowinggeometricidentities,whichassume0≤α1.kXi=0αi=1−αk+11−α∞Xi=0αi=11−αThefollowingfigureshowsanMDPwithNstates.Allstateshavetwoactions(NorthandRight)exceptSn,whichcanonlyself-loop.UnlikemostMDPs,allstatetransitionsaredeterministic.Assumediscountfactorγ.s1r=1p=1s2r=1p=1s3r=1p=1snr=10p=1p=1p=1p=1...sn-1r=1p=1p=1Forquestions(a)–(e),expressyouranswerasafiniteexpression(nosummationsignsor...’s)intermsofnand/orγ.(a)WhatisJ∗(Sn)?J∗(Sn)=10+γ·J∗(Sn)=⇒J∗(Sn)=101−γ(b)Thereisauniqueoptimalpolicy.Whatisit?Ai=Right(i=1,...,n)(c)WhatisJ∗(S1)?J∗(S1)=1+γ+···+γn−2+J∗(Sn)·γn−1=1+9γn−11−γ(d)SupposeyoutrytosolvethisMDPusingvalueiteration.WhatisJ1(S1)?J1(S1)=14(e)SupposeyoutrytosolvethisMDPusingvalueiteration.WhatisJ2(S1)?J2(S1)=1+γ(f)Supposeyourcomputerhasexactarithmetic(noroundingerrors).Howmanyitera-tionsofvalueiterationwillbeneededbeforeallstatesrecordtheirexact(correcttoinfinitedecimalplaces)J∗value?Pickone:(i)Lessthan2n(ii)Between2nandn2(iii)Betweenn2+1and2n(iv)ItwillneverhappenIt’salimitingprocess.(g)Supposeyourunpolicyiteration.Duringonestepofpolicyiterationyoucomputethevalueofthecurrentpolicybycomputingtheexactsolutiontotheappropriatesystemofnequationsinnunknowns.Supposetoothatwhenchoosingtheactionduringthepolicyimprovementstep,tiesarebrokenbychoosingNorth.SupposepolicyiterationbeginswithallstateschoosingNorth.Howmanystepsofpolicyiterationwillbeneededbeforeallstatesrecordtheirexact(correcttoinfinitedecimalplaces)J∗value?Pickone:(i)Lessthan2n(ii)Between2nandn2(iii)Betweenn2+1and2n(iv)ItwillneverhappenAfteripolicyiterations,wehaveAction(Sj)=Rightifn−ijnNorthotherwise.53ReinforcementLearning(10points)ThisquestionusesthesameMDPasthepreviousquestion,repeatedhereforyourconve-nience.Again,assumeγ=12.s1r=1p=1s2r=1p=1s3r=1p=1snr=10p=1p=1p=1p=1...sn-1r=1p=1p=1SupposewearediscoveringtheoptimalpolicyviaQ-learning.WebeginwithaQ-tableinitializedwith0’severywhere:Q(Si,North)=0foralliQ(Si,Right)=0foralliBecausetheMDPisdetermistic,werunQ-learningwithalearningrateα=1.AssumewestartQ-learningatstateS1.(a)Supposeourexploratio