Structural Extension to Logistic Regression: Discriminative Parameter Learning of Belief Net Classifiers

Russell Greiner (greiner@cs.ualberta.ca)†
Dept of Computing Science, University of Alberta, Edmonton, AB T6G 2H1, Canada

Xiaoyuan Su (xsu1@umsis.miami.edu)
Electrical & Computer Engineering, University of Miami, Coral Gables, FL 33124, USA

Bin Shen (bshen@cs.ualberta.ca)
Dept of Computing Science, University of Alberta, Edmonton, AB T6G 2H1, Canada

Wei Zhou (w2zhou@math.uwaterloo.ca)
School of Computer Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada

Abstract. Bayesian belief nets (BNs) are often used for classification tasks—typically to return the most likely class label for each specified instance. Many BN-learners, however, attempt to find the BN that maximizes a different objective function—viz., likelihood, rather than classification accuracy—typically by first learning an appropriate graphical structure, then finding the parameters for that structure that maximize the likelihood of the data. As these parameters may not maximize the classification accuracy, "discriminative parameter learners" follow the alternative approach of seeking the parameters that maximize conditional likelihood (CL), over the distribution of instances the BN will have to classify. This paper first formally specifies this task, shows how it extends standard logistic regression, and analyzes its inherent sample and computational complexity. We then present a general algorithm for this task, ELR, that applies to arbitrary BN structures and that works effectively even when given incomplete training data. Unfortunately, ELR is not guaranteed to find the parameters that optimize conditional likelihood; moreover, even the optimal-CL parameters need not have minimal classification error. This paper therefore presents empirical evidence that ELR produces effective classifiers, often superior to the ones produced by the standard "generative" algorithms, especially in common situations where the given BN-structure is incorrect.

Keywords: (Bayesian) belief nets, Logistic regression, Classification, PAC-learning, Computational/sample complexity

1. Introduction

Many tasks—including fault diagnosis, pattern recognition and forecasting—can be viewed as classification, as each requires assigning the class ("label") to a given instance, which is specified by a set of attributes. An increasing number of projects are using "(Bayesian) belief nets" (BN) to represent the underlying distribution, and hence the stochastic mapping from evidence to response.

This paper extends the earlier results that appear in [27] and [49].
† This e-mail address is available for all problems and questions.

© 2005 Kluwer Academic Publishers. Printed in the Netherlands.

When this distribution is not known a priori, we can try to learn the model. Our goal is an accurate BN—i.e., one that returns the correct answer as often as possible. While a perfect model of the distribution will perform optimally for any possible query, learners with limited training data are unlikely to produce such a model; moreover, optimality may be impossible for learners constrained to a restricted range of possible distributions that excludes the correct one (e.g., when only considering parameterizations of a given BN-structure). Here, it makes sense to find the parameters that do well with respect to the queries posed.

This "discriminative learning" task differs from the "generative learning" that is used to learn an overall model of the distribution [47]. Following standard practice, our discriminative learner will seek the parameters that maximize the log conditional likelihood (LCL) over the data, rather than simple likelihood—that is, given the data $S = \{\langle c_i, e_i \rangle\}$ (each class label $C = c_i$ associated with evidence $E = e_i$), a discriminative learner will try to find parameters $\Theta$ that maximize

\[
\mathrm{LCL}^{(S)}(\Theta) \;=\; \frac{1}{|S|} \sum_{\langle c_i, e_i \rangle \in S} \log P_\Theta(c_i \mid e_i) \tag{1}
\]

rather than the ones that maximize $\sum_{\langle c_i, e_i \rangle \in S} \log P_\Theta(c_i, e_i)$ [47].

Optimizing the LCL of the root node (given the other attributes) of a naïve-bayes structure can be formulated as a standard logistic regression problem [39, 32].
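As a concrete illustration of this reduction (a standard derivation, sketched here for a binary class rather than quoted from the paper), the naïve-bayes class posterior already has the logistic form:

\[
P_\Theta(c{=}1 \mid e)
  = \frac{P(c{=}1)\prod_j P(e_j \mid c{=}1)}
         {P(c{=}1)\prod_j P(e_j \mid c{=}1) + P(c{=}0)\prod_j P(e_j \mid c{=}0)}
  = \sigma\Big(\log\frac{P(c{=}1)}{P(c{=}0)}
      + \sum_j \log\frac{P(e_j \mid c{=}1)}{P(e_j \mid c{=}0)}\Big),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}.
\]

For binary attributes each term $\log\frac{P(e_j \mid c{=}1)}{P(e_j \mid c{=}0)}$ is affine in $e_j$, so $P_\Theta(c{=}1 \mid e) = \sigma(w_0 + \sum_j w_j e_j)$ for weights determined by the conditional-probability-table entries; maximizing $\mathrm{LCL}^{(S)}(\Theta)$ over those entries therefore amounts to fitting a logistic regression of the class on the attributes, which is the sense of the reduction cited above [39, 32].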
General belief nets extend naïve-bayes-structures by permitting additional dependencies among the attributes. This paper provides a general discriminative learning tool ELR that can learn the parameters for an arbitrary structure, completing the analogy

\[
\text{Naïve-bayes} : \text{General Belief Net} \;::\; \text{Logistic Regression} : \text{ELR}. \tag{2}
\]

Moreover, while most algorithms for learning logistic regression functions require complete training data, the ELR algorithm can accept incomplete data. We also present empirical evidence, from a large number of datasets, to demonstrate that ELR works effectively.

Section 2 provides the foundations, overviewing belief nets then defining our task: discriminatively learning the parameters (for a fixed belief net structure, G) that maximize LCL. Section 3 formally analyses this task, providing both sample and computational complexity, and noting how these results compare with corresponding results for generative learning. Seeing that our task is NP-hard in general, Section 4 presents a gradient-descent discriminative parameter learning algorithm for general BNs, ELR. Section 5 reports empirical results that demonstrate that our ELR produces a classifier that is often superior to ones produced by standard learning algorithms (which maximize likelihood), over a variety of situations, involving both complete and incomplete data. Section 6 provides a brief survey of the relevant literature. The web page [25] provides the proofs of all of our theoretic claims (extending the proof sketch in the Appendix), as well as more information about …
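At this point the paper has specified ELR only as a gradient-based procedure that tunes the parameters of a fixed structure to maximize the LCL of Eq. (1); Section 4 develops the actual algorithm. As a purely illustrative sketch, not the authors' implementation, the following Python snippet performs fixed-step gradient ascent on the LCL of a naïve-bayes structure with a binary class, binary attributes, and complete data. The softmax reparameterization of the conditional-probability-table entries, the Laplace-smoothed starting point, the step size, and all identifiers are assumptions made for this example.

# Illustrative sketch only: gradient ascent on the log conditional likelihood
# LCL(S)(Theta) of Eq. (1) for a naive-bayes structure (binary class, binary
# attributes, complete data). CP-table entries are kept normalized by
# parameterizing them as softmaxes of unconstrained logits.
import numpy as np

rng = np.random.default_rng(0)


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    ez = np.exp(z)
    return ez / ez.sum(axis=axis, keepdims=True)


def class_posteriors(b_class, b_attr, E):
    """P(C = c | e) for every row e of E, with CP-tables given by softmax logits."""
    n = E.shape[1]
    log_theta_c = np.log(softmax(b_class))            # (2,)       log P(C = c)
    log_theta_a = np.log(softmax(b_attr, axis=-1))    # (n, 2, 2)  log P(E_j = v | C = c)
    # picked[i, j, c] = log P(E_j = E[i, j] | C = c)
    picked = log_theta_a.transpose(0, 2, 1)[np.arange(n)[None, :], E, :]
    log_joint = log_theta_c[None, :] + picked.sum(axis=1)          # (m, 2)
    return softmax(log_joint, axis=1)


def lcl_and_grads(b_class, b_attr, E, C):
    """Mean log conditional likelihood (Eq. 1) and its gradient w.r.t. the logits."""
    m, n = E.shape
    post = class_posteriors(b_class, b_attr, E)       # P(c | e_i)
    lcl = np.mean(np.log(post[np.arange(m), C]))
    resid = np.eye(2)[C] - post                       # 1[c = c_i] - P(c | e_i)
    theta_a = softmax(b_attr, axis=-1)
    ind = np.eye(2)[E]                                # (m, n, 2) one-hot attribute values
    g_class = resid.sum(axis=0) / m
    g_attr = np.einsum('ic,ijcv->jcv', resid, ind[:, :, None, :] - theta_a[None]) / m
    return lcl, g_class, g_attr


# Tiny synthetic complete-data set: binary class, four binary attributes.
m, n = 500, 4
C = rng.integers(0, 2, size=m)
E = (rng.random((m, n)) < (0.25 + 0.5 * C[:, None])).astype(int)

# One natural starting point: logits of the Laplace-smoothed observed frequencies,
# i.e. the "generative" (maximum-likelihood) parameters for this structure.
b_class = np.log(np.bincount(C, minlength=2) + 1.0)
b_attr = np.log(np.stack([np.stack([np.bincount(E[C == c, j], minlength=2) + 1.0
                                    for c in (0, 1)]) for j in range(n)]))

for step in range(300):                               # plain fixed-step gradient ascent
    lcl, g_c, g_a = lcl_and_grads(b_class, b_attr, E, C)
    b_class += 0.2 * g_c
    b_attr += 0.2 * g_a

print(f"LCL after discriminative tuning: {lcl_and_grads(b_class, b_attr, E, C)[0]:.4f}")

Starting from the observed-frequency ("generative") parameters and then following the LCL gradient makes the contrast in Eq. (1) concrete: each update moves a parameter in proportion to how badly the current class posterior matches the observed labels, rather than in proportion to raw co-occurrence counts.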
