Active Learning with Statistical Models

Journal of Artificial Intelligence Research 4 (1996) 129-145. Submitted 11/95; published 3/96.

David A. Cohn (cohn@harlequin.com)
Zoubin Ghahramani (zoubin@cs.toronto.edu)
Michael I. Jordan (jordan@psyche.mit.edu)
Center for Biological and Computational Learning
Dept. of Brain and Cognitive Sciences
Massachusetts Institute of Technology
Cambridge, MA 02139 USA

© 1996 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.

Abstract

For many types of machine learning algorithms, one can compute the statistically "optimal" way to select training data. In this paper, we review how optimal data selection techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are computationally expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate. Empirically, we observe that the optimality criterion sharply decreases the number of training examples the learner needs in order to achieve good performance.

1. Introduction

The goal of machine learning is to create systems that can improve their performance at some task as they acquire experience or data. In many natural learning tasks, this experience or data is gained interactively, by taking actions, making queries, or doing experiments. Most machine learning research, however, treats the learner as a passive recipient of data to be processed. This "passive" approach ignores the fact that, in many situations, the learner’s most powerful tool is its ability to act, to gather data, and to influence the world it is trying to understand. Active learning is the study of how to use this ability effectively.

Formally, active learning studies the closed-loop phenomenon of a learner selecting actions or making queries that influence what data are added to its training set. Examples include selecting joint angles or torques to learn the kinematics or dynamics of a robot arm, selecting locations for sensor measurements to identify and locate buried hazardous wastes, or querying a human expert to classify an unknown word in a natural language understanding problem.

When actions/queries are selected properly, the data requirements for some problems decrease drastically, and some NP-complete learning problems become polynomial in computation time (Angluin, 1988; Baum & Lang, 1991). In practice, active learning offers its greatest rewards in situations where data are expensive or difficult to obtain, or when the environment is complex or dangerous. In industrial settings each training point may take days to gather and cost thousands of dollars; a method for optimally selecting these points could offer enormous savings in time and money.

There are a number of different goals which one may wish to achieve using active learning. One is optimization, where the learner performs experiments to find a set of inputs that maximize some response variable. An example of the optimization problem would be finding the operating parameters that maximize the output of a steel mill or candy factory. There is an extensive literature on optimization, examining both cases where the learner has some prior knowledge of the parameterized functional form and cases where the learner has no such knowledge; the latter case is generally of greater interest to machine learning practitioners. The favored technique for this kind of optimization is usually a form of response surface methodology (Box & Draper, 1987), which performs experiments that guide hill-climbing through the input space.

A related problem exists in the field of adaptive control, where one must learn a control policy by taking actions. In control problems, one faces the complication that the value of a specific action may not be known until many time steps after it is taken. Also, in control (as in optimization), one is usually concerned with performing well during the learning task and must trade off exploitation of the current policy for exploration which may improve it. The subfield of dual control (Fe’ldbaum, 1965) is specifically concerned with finding an optimal balance of exploration and control while learning.

In this paper, we will restrict ourselves to examining the problem of supervised learning: based on a set of potentially noisy training examples $D = \{(x_i, y_i)\}_{i=1}^{m}$, where $x_i \in X$ and $y_i \in Y$, we wish to learn a general mapping $X \to Y$. In robot control, the mapping may be state × action → new state; in hazard location it may be sensor reading → target position. In contrast to the goals of optimization and control, the goal of supervised learning is to be able to efficiently and accurately predict $y$ for a given $x$.

In active learning situations, the learner itself is responsible for acquiring the training set. Here, we assume it can iteratively select a new input $\tilde{x}$ (possibly from a constrained set), observe the resulting output $\tilde{y}$, and incorporate the new example $(\tilde{x}, \tilde{y})$ into its training set. This contrasts with related work by Plutowski and White (1993), which is concerned with filtering an existing data set. In our case, $\tilde{x}$ may be thought of as a query, experiment, or action, depending on the research field and problem domain. The question we will be concerned with is how to choose which $\tilde{x}$ to try next.

There are many heuristics for choosing $\tilde{x}$, including choosing places where we don’t have data (Whitehead, 1991), where we perform poorly (Linden & Weber, 1993), where we have low confidence (Thrun & Möller, 1992), where we expect it to change our model (Cohn, Atlas, & Ladner, 1990, 1994), and where we previously found data that resulted in learning (Schmidhuber & Storck, 1993). In this paper we w