

LEIBNIZ CENTER FOR RESEARCH IN COMPUTER SCIENCE, TECHNICAL REPORT 2003-43

Computing Gaussian Mixture Models with EM using Equivalence Constraints

Noam Shental, Aharon Bar-Hillel, Tomer Hertz and Daphna Weinshall
School of Computer Science and Engineering and the Center for Neural Computation
The Hebrew University of Jerusalem, Jerusalem, Israel 91904
Email: {fenoam, aharonbh, tomboy, daphna}@cs.huji.ac.il

Abstract

Gaussian mixture models for density estimation are usually estimated in an unsupervised manner, using an Expectation Maximization (EM) procedure. In this paper we show how equivalence constraints can be incorporated into this procedure, leading to improved model estimation and improved clustering results. Equivalence constraints provide additional information on pairs of data points, indicating whether the points arise from the same source (positive constraint) or from different sources (negative constraint). Such constraints can be gathered automatically in some learning problems, and are a natural form of supervision in others. We present a closed-form EM procedure for handling positive constraints, and a Generalized EM procedure using a Markov network for the incorporation of negative constraints. Using publicly available data sets, we demonstrate that incorporating equivalence constraints leads to considerable improvement in clustering performance, and that our algorithm outperforms all available competitors.

Keywords: semi-supervised learning, equivalence constraints, clustering, EM, Gaussian mixture models

1 Introduction

Gaussian Mixture Models (GMM) for density estimation are popular for two main reasons: they can be reliably computed by the efficient Expectation Maximization (EM) algorithm [1], and they provide a generative model for the way the data may have been created. The second property in particular makes them a common choice for unsupervised clustering, where typically the Gaussian components of the GMM are taken to represent different sources. This use is common because most other clustering algorithms are not generative, and therefore cannot provide predictions regarding previously unseen points.

When used for clustering in this way, the underlying assumption, i.e. that the density is comprised of a mixture of different Gaussian sources, is often hard to justify. It is therefore important to have additional information which can steer the GMM estimation in the "right" direction. For example, we may have access to the labels of part of the data set. The estimation problem then belongs to the semi-supervised learning domain, since the estimation relies on both labeled and unlabeled points.

In this paper we focus on another type of side-information, in which equivalence constraints between a few of the data points are provided. More specifically, we use an unlabeled data set augmented by equivalence constraints between pairs of data points, where the constraints determine whether each pair was generated by the same source or by different sources. We denote the former case as 'positive' constraints and the latter case as 'negative' constraints, and present a method to incorporate them into an EM procedure.

What do we expect to gain from the semi-supervised approach to GMM estimation? We may hope that introducing side-information into the EM algorithm will result in faster convergence to a solution of higher likelihood. But much more importantly, our equivalence constraints should change the GMM likelihood function itself. As a result, the estimation procedure may choose different solutions, which would otherwise have been rejected due to their low relative likelihood under the unconstrained GMM density model. Ideally the solution obtained with side information will be more faithful to the desired results. A simple example demonstrating this point is shown in Fig. 1.

Figure 1: Illustrative examples demonstrating the added value of equivalence constraints. (a) The data set consists of two vertically aligned classes. Left: given no additional information, the EM algorithm identifies two horizontal classes, and this can be shown to be the maximum likelihood solution (with log likelihood of 3500 vs. log likelihood of 2800 for the solution shown on the right). Right: additional side information in the form of equivalence constraints changes the probability function, and we obtain a vertical partition as the most likely solution. (b) The data set consists of two classes with partial overlap. Left: without constraints the most likely solution includes two non-overlapping sources. Right: with constraints the correct model, with overlapping classes, was retrieved as the most likely solution. In all plots only the class assignment of novel unconstrained points is shown.

Why do we use equivalence constraints, rather than partial labels as in prior work (summarized below)? Our basic observation is that unlike labels, in many unsupervised learning tasks equivalence constraints may be extracted with minimal effort or even automatically. One example is when the data is inherently sequential and can be modelled by a Markovian process. Consider for example a security camera application, where the objective is to find all the frames in which the same intruder appears. Due to the continuous nature of the data, intruders extracted from successive frames in the same clip can be assumed to come from the same person, thus forming positive constraints. In addition, two intruders who appear simultaneously in front of two cameras cannot be the same person, hence a negative constraint is automatically established. Another analogous example is speaker segmentation and recognition, in which the conversation between several speakers needs to be segmented and clustered according to speaker identity. Here, it ma
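As a point of reference for the constrained variants discussed above, the unconstrained EM baseline for a Gaussian mixture can be sketched as follows. This is a minimal 1-D illustration, not the paper's implementation; the initialisation and the fixed iteration count are simplifications.

```python
import numpy as np

def em_gmm_1d(x, k=2, n_iter=50):
    """Unconstrained EM for a k-component 1-D Gaussian mixture (a sketch)."""
    # Initialise deterministically: uniform priors, spread-out means,
    # and the global variance for every component.
    pi = np.full(k, 1.0 / k)
    mu = np.percentile(x, np.linspace(25, 75, k))
    var = np.full(k, np.var(x))
    for _ in range(n_iter):
        # E-step: responsibility of source j for point i,
        # p(j | x_i) proportional to pi_j * N(x_i; mu_j, var_j).
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = pi * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: maximum-likelihood re-estimation from the responsibilities.
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

# Two well-separated 1-D sources.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-5, 1, 200), rng.normal(5, 1, 200)])
pi, mu, var = em_gmm_1d(x)
```

In the constrained procedures outlined in the abstract it is essentially the E-step that changes; the M-step retains the same weighted-average form.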

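The closed-form handling of positive constraints mentioned in the abstract treats each group of positively-constrained points (a "chunklet") as generated by a single source. One way to sketch the resulting E-step, assuming chunklet members are sampled independently from their common source: each chunklet receives one shared posterior, proportional to the source prior times the product of the members' likelihoods. The function below is an illustrative 1-D sketch with an assumed interface, not the authors' code.

```python
import numpy as np

def e_step_with_chunklets(x, chunklets, pi, mu, var):
    """E-step under positive constraints: chunklets (lists of point indices
    known to share a source) get one shared posterior over the k sources."""
    # Per-point log-likelihoods under each 1-D Gaussian source: shape (n, k).
    log_dens = (-0.5 * (x[:, None] - mu) ** 2 / var
                - 0.5 * np.log(2 * np.pi * var))
    resp = np.zeros((len(x), len(mu)))
    for idx in chunklets:
        # Summing log-likelihoods over the chunklet is the log of the
        # product implied by the shared-source assumption.
        log_post = np.log(pi) + log_dens[idx].sum(axis=0)
        log_post -= log_post.max()  # subtract max for numerical stability
        post = np.exp(log_post)
        post /= post.sum()
        resp[idx] = post            # every chunklet member shares the posterior
    return resp

x = np.array([-5.0, 0.5, 5.0])
pi = np.array([0.5, 0.5])
mu = np.array([-5.0, 5.0])
var = np.array([1.0, 1.0])
# Alone, x[1] is slightly closer to the right source; constraining it to
# share a source with x[0] pulls its posterior firmly to the left source.
resp = e_step_with_chunklets(x, [[0, 1], [2]], pi, mu, var)
```

This illustrates the effect described in Fig. 1: the constraints reshape the posterior, so solutions that looked unlikely under the unconstrained model can become the most likely ones.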
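The security-camera scenario can be made concrete: if each detection is tagged with a clip id, a camera id, and a timestamp, positive and negative constraints fall out mechanically. The sketch below is hypothetical (the tuple format and function name are not from the paper), and it simplifies "successive frames in the same clip" to "any two detections sharing a clip id".

```python
from itertools import combinations

def constraints_from_detections(detections):
    """detections: list of (point_id, clip_id, camera_id, timestamp) tuples
    (an assumed format). Returns (positive, negative) constraint pair lists."""
    positive, negative = [], []
    for (i, clip_i, cam_i, t_i), (j, clip_j, cam_j, t_j) in combinations(detections, 2):
        if clip_i == clip_j:
            # Tracked within one clip => same person => positive constraint.
            positive.append((i, j))
        elif t_i == t_j and cam_i != cam_j:
            # Seen simultaneously on two different cameras => different people.
            negative.append((i, j))
    return positive, negative

detections = [(0, "clip-a", 1, 0), (1, "clip-a", 1, 1), (2, "clip-b", 2, 1)]
positives, negatives = constraints_from_detections(detections)
```

The point of the example is that no human labeling is needed: the temporal structure of the data alone yields both kinds of equivalence constraints.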