Machine Learning: Proceedings of the Thirteenth International Conference, 1996.

Experiments with a New Boosting Algorithm

Yoav Freund    Robert E. Schapire
AT&T Laboratories
600 Mountain Avenue
Murray Hill, NJ 07974-0636
{yoav, schapire}@research.att.com

Abstract. In an earlier paper, we introduced a new "boosting" algorithm called AdaBoost which, theoretically, can be used to significantly reduce the error of any learning algorithm that consistently generates classifiers whose performance is a little better than random guessing. We also introduced the related notion of a "pseudo-loss," which is a method for forcing a learning algorithm of multi-label concepts to concentrate on the labels that are hardest to discriminate. In this paper, we describe experiments we carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems.

We performed two sets of experiments. The first set compared boosting to Breiman's "bagging" method when used to aggregate various classifiers (including decision trees and single attribute-value tests). We compared the performance of the two methods on a collection of machine-learning benchmarks. In the second set of experiments, we studied in more detail the performance of boosting using a nearest-neighbor classifier on an OCR problem.

1 INTRODUCTION

"Boosting" is a general method for improving the performance of any learning algorithm. In theory, boosting can be used to significantly reduce the error of any "weak" learning algorithm that consistently generates classifiers which need only be a little bit better than random guessing. Despite the potential benefits of boosting promised by the theoretical results, the true practical value of boosting can only be assessed by testing the method on real machine learning problems. In this paper, we present such an experimental assessment of a new boosting algorithm called AdaBoost.

Boosting works by repeatedly running a given weak¹ learning algorithm on various distributions over the training data, and then combining the classifiers produced by the weak learner into a single composite classifier. The first provably effective boosting algorithms were presented by Schapire [20] and Freund [9]. More recently, we described and analyzed AdaBoost, and we argued that this new boosting algorithm has certain properties which make it more practical and easier to implement than its predecessors [10]. This algorithm, which we used in all our experiments, is described in detail in Section 2.

Home page: "". Expected to change to "~uid" sometime in the near future (for uid ∈ {yoav, schapire}).
¹We use the term "weak" learning algorithm, even though, in practice, boosting might be combined with a quite strong learning algorithm such as C4.5.
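To make the loop just described concrete, the following is a minimal sketch, not the paper's own pseudocode, assuming binary labels in {-1, +1} and an invented weak_learner callable that accepts a weight for each training example. It uses the now-standard exponential form of the weight update, which for two classes is equivalent to the multiplicative update of the algorithm given in Section 2.

    import math

    def adaboost(examples, labels, weak_learner, rounds):
        # labels[i] is -1 or +1; weak_learner(examples, labels, weights)
        # must return a classifier h with h(x) in {-1, +1}.
        n = len(examples)
        weights = [1.0 / n] * n                # uniform initial distribution
        hypotheses, alphas = [], []
        for _ in range(rounds):
            h = weak_learner(examples, labels, weights)
            # weighted error of this round's weak hypothesis
            err = sum(w for w, x, y in zip(weights, examples, labels)
                      if h(x) != y)
            if err >= 0.5:                     # no better than random guessing
                break
            err = max(err, 1e-12)              # guard against a perfect hypothesis
            alpha = 0.5 * math.log((1.0 - err) / err)
            hypotheses.append(h)
            alphas.append(alpha)
            # raise the weight of misclassified examples, lower the rest,
            # then renormalize so the weights again form a distribution
            weights = [w * math.exp(-alpha * y * h(x))
                       for w, x, y in zip(weights, examples, labels)]
            total = sum(weights)
            weights = [w / total for w in weights]
        def classify(x):                       # weighted majority vote
            vote = sum(a * h(x) for a, h in zip(alphas, hypotheses))
            return 1 if vote >= 0 else -1
        return classify

Any rule that beats random guessing on the weighted sample, for instance the single-attribute tests used in Section 3, could serve as weak_learner here.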
This paper describes two distinct sets of experiments. In the first set of experiments, described in Section 3, we compared boosting to "bagging," a method described by Breiman [1] which works in the same general fashion (i.e., by repeatedly rerunning a given weak learning algorithm, and combining the computed classifiers), but which constructs each distribution in a simpler manner. (Details are given below; a code sketch in the same style as the one above appears at the end of this section.) We compared boosting with bagging because both methods work by combining many classifiers. This comparison allows us to separate out the effect of modifying the distribution on each round (which is done differently by each algorithm) from the effect of voting multiple classifiers (which is done the same way by each).

In our experiments, we compared boosting to bagging using a number of different weak learning algorithms of varying levels of sophistication. These include: (1) an algorithm that searches for very simple prediction rules which test on a single attribute (similar to Holte's very simple classification rules [14]); (2) an algorithm that searches for a single good decision rule that tests on a conjunction of attribute tests (similar in flavor to the rule-formation part of Cohen's RIPPER algorithm [3] and Fürnkranz and Widmer's IREP algorithm [11]); and (3) Quinlan's C4.5 decision-tree algorithm [18]. We tested these algorithms on a collection of 27 benchmark learning problems taken from the UCI repository.

The main conclusion of our experiments is that boosting performs significantly and uniformly better than bagging when the weak learning algorithm generates fairly simple classifiers (algorithms (1) and (2) above). When combined with C4.5, boosting still seems to outperform bagging slightly, but the results are less compelling.

We also found that boosting can be used with very simple rules (algorithm (1)) to construct classifiers that are quite good relative, say, to C4.5. Kearns and Mansour [16] argue that C4.5 can itself be viewed as a kind of boosting algorithm, so a comparison of AdaBoost and C4.5 can be seen as a comparison of two competing boosting algorithms. See the paper by Dietterich, Kearns and Mansour [4] for more detail on this point.

In the second set of experiments, we tested the performance of boosting on a nearest-neighbor classifier for handwritten digit recognition. In this case the weak learning algorithm is very simple, which lets us gain some insight into the interaction between the boosting algorithm and the nearest-neighbor classifier. We show that the boosting algorithm is an effective way of finding a small subset of prototypes that performs almost as well as the complete set. We also show that it compares favorably to the standard method of Condensed Nearest Neighbor [13] in terms of its test error.

There seem to be two separate reasons for the improvement in performance that is achieved by boosting. The first and better understood effect of boosting is that it generates a hypothesis whose error on the training set is small by combining many hypotheses whose error may be large (but still better than random guessing).
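The bagging procedure referred to at the start of this section can be sketched in the same style. This again is our own illustrative rendering, not Breiman's pseudocode: each round trains the (here unweighted) weak_learner on a bootstrap sample, that is, a distribution built simply by resampling the training set uniformly with replacement, and the composite classifier is an unweighted plurality vote.

    import random
    from collections import Counter

    def bagging(examples, labels, weak_learner, rounds):
        # weak_learner(examples, labels) returns a classifier h; unlike in
        # boosting, it never sees per-example weights.
        n = len(examples)
        hypotheses = []
        for _ in range(rounds):
            # bootstrap sample: n draws, uniform, with replacement
            idx = [random.randrange(n) for _ in range(n)]
            hypotheses.append(weak_learner([examples[i] for i in idx],
                                           [labels[i] for i in idx]))
        def classify(x):                       # unweighted plurality vote
            return Counter(h(x) for h in hypotheses).most_common(1)[0][0]
        return classify

The only difference from the boosting sketch above is how each round's distribution over the training data is formed, which is exactly the effect the comparison in Section 3 is designed to isolate.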