A Spectral Algorithm for Learning Mixtures of Distributions

Santosh Vempala*    Grant Wang†

* Department of Mathematics and Laboratory for Computer Science, MIT, vempala@math.mit.edu
† Laboratory for Computer Science, MIT, gjw@theory.lcs.mit.edu
Both authors are supported in part by NSF Career award CCR-987024.

Abstract

We show that a simple spectral algorithm for learning a mixture of $k$ spherical Gaussians in $\mathbb{R}^n$ works remarkably well: it succeeds in identifying the Gaussians assuming essentially the minimum possible separation between their centers that keeps them unique (solving an open problem of [1]). The sample complexity and running time are polynomial in both $n$ and $k$. The algorithm also works for the more general problem of learning a mixture of "weakly isotropic" distributions (e.g. a mixture of uniform distributions on cubes). The algorithm is robust in that it can tolerate small amounts of noise and thus can also be used for the more general problem of finding the best mixture model that fits a given data set, provided there is a good fit.

1 Introduction

Learning a mixture of distributions is a classical problem in statistics and learning theory (see [10, 14]); more recently, it has also been proposed as a model for clustering. In the basic version of the problem we are given random samples from a mixture of $k$ distributions, $F_1, \ldots, F_k$. Each sample is drawn independently with probability $w_i$ from the $i$'th distribution. The numbers $w_1, \ldots, w_k$ are called the mixing weights. The problem is to classify the random samples according to which distribution they come from (and thereby infer the mixing weights, means and other properties of the underlying distributions).

An important case of this problem is when each underlying distribution is a Gaussian. In this case, the goal is to find the mean and covariances of each Gaussian (along with the mixing weights). This problem seems to be of great practical interest and many heuristics have been used to solve it. The most famous among them is the EM algorithm [5]. Unfortunately, EM is a local search heuristic that can fail.

A special case of the problem is when the Gaussians are assumed to be spherical, i.e. the variance is the same in any direction. In recent years, there has been substantial progress in developing polynomial-time algorithms for this special case, by making assumptions on the separation between the means of the Gaussians. This separation condition is crucial, so we proceed to make it explicit.

Let $F_1, \ldots, F_k$ be spherical Gaussians in $\mathbb{R}^n$ with mean vectors $\mu_1, \ldots, \mu_k$ and variances $\sigma_1^2, \ldots, \sigma_k^2$. We will refer to $\sigma_i \sqrt{n}$ as the radius of $F_i$ and $\|\mu_i - \mu_j\|$ as the separation between $F_i$ and $F_j$. If the pairwise separation is larger than the radii, then points from different Gaussians are isolated in space and easy to classify. On the other hand, if the separation is very small, then the classification problem need not have a unique solution.

1.1 Previous work

Dasgupta [3] used random projection to learn a mixture of spherical Gaussians provided they are essentially non-overlapping, i.e. the overlap in probability mass is exponentially small in $n$. His algorithm is polynomial-time provided the smallest mixing weight $w_{\min}$ is $\Omega(1/k)$ and the separation is

$\|\mu_i - \mu_j\| \ge C \max\{\sigma_i, \sigma_j\}\, n^{1/2}$

for a constant $C$. In other words, the separation is proportional to the larger radius (the algorithm also required all the variances to be within a bounded range). Shortly thereafter, it was shown by Dasgupta and Schulman [6] that a variant of EM works with a smaller separation (along with some technical conditions on the variances):

$\|\mu_i - \mu_j\| \ge C \max\{\sigma_i, \sigma_j\}\, n^{1/4} \log^{1/4}(n/w_{\min})$    (1)

This separation is the minimum at which random points from the same Gaussian can be distinguished from random points from two different Gaussians based on pairwise distances. So points from the Gaussian with (approximately) the smallest variance have the smallest pairwise distances. They can be identified and removed, and this can be repeated on the remaining points.

Arora and Kannan [1] generalized this to non-spherical Gaussians. They used isoperimetric theorems to obtain distance concentration results for the non-spherical case. At this separation, their algorithm simply identifies all points at roughly the minimum distance from each other as coming from a single Gaussian, removes them and repeats on the remaining data. They also give a version that uses random projection.

Learning a mixture of spherical Gaussians at a smaller separation (when distance concentration results are no longer valid) has been an open problem.

1.2 Our results

In order for the solution to the classification problem to be well-defined (i.e. unique with reasonable probability) we need a separation of at least

$\|\mu_i - \mu_j\| \ge C \max\{\sigma_i, \sigma_j\}.$

At this separation the overlap in the probability mass is a constant fraction. So in particular, distance concentration results are no longer applicable. In this paper, we show that a simple spectral algorithm can learn a mixture of $k$ Gaussians at this minimum separation in time $k^{O(k)}\,\mathrm{poly}(n)$. Our main result is that with a slightly larger separation of

$\|\mu_i - \mu_j\| \ge C \max\{\sigma_i, \sigma_j\}\, k^{1/4} \log^{1/4}(n/w_{\min})$    (2)

the algorithm is polynomial in both $k$ and $n$. Note that this condition is almost independent of $n$ and is much weaker than (1) as the dimension ($n$) gets larger than the number of Gaussians ($k$).

More generally, the algorithm works for a mixture of weakly isotropic distributions, i.e. distributions with the property that the variance is the same along any direction. Examples of this include the uniform distribution over cubes, balls, and generally weakly isotropic convex sets (defined in section 4), a distribution proportional to $e^{-\alpha |x|}$ for a constant $\alpha$, etc.

The main step of the algorithm is to project to the subspace spanned by the top $k$ right singular vectors of the sample matrix (i.e. its $k$ principal components). This is the rank $k$ subspace which maximizes the squared projections of the samples. The key observation is that with high probability this subspace lies very close to the span of the mean vectors.
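
The projection step described above can be illustrated with a minimal NumPy sketch. It is not the paper's full algorithm: it covers only the projection onto the top $k$ right singular vectors of the (uncentered) sample matrix and a numerical check of how close that subspace is to the span of the means. All function names, the synthetic means, and the parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def sample_spherical_mixture(means, sigmas, weights, m, rng):
    """Draw m samples from a mixture of spherical Gaussians.

    means: (k, n) array of mean vectors; sigmas: k standard deviations;
    weights: k mixing weights summing to 1. Returns (samples, labels).
    """
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    k, n = means.shape
    labels = rng.choice(k, size=m, p=weights)
    noise = rng.standard_normal((m, n)) * sigmas[labels, None]
    return means[labels] + noise, labels


def top_k_right_singular_subspace(A, k):
    """Return an (n, k) orthonormal basis of the span of the top k
    right singular vectors of the m x n sample matrix A."""
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    return vt[:k].T


def project(A, V):
    """Coordinates of each row of A in the subspace with basis V."""
    return A @ V


def relative_residual(means, V):
    """Fraction of the squared norm of the means lying outside span(V)."""
    means = np.asarray(means, dtype=float)
    resid = means - (means @ V) @ V.T
    return np.linalg.norm(resid) ** 2 / np.linalg.norm(means) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k, n, m = 3, 50, 2000
    means = 6.0 * np.eye(k, n)  # assumed, well-separated synthetic means
    X, _ = sample_spherical_mixture(means, [1.0] * k, [0.3, 0.3, 0.4], m, rng)
    V = top_k_right_singular_subspace(X, k)
    print("projected shape:", project(X, V).shape)
    print("relative residual of means:", relative_residual(means, V))
```

A small residual indicates that the means lie almost entirely inside the computed subspace, so the samples can be classified in $k$ dimensions rather than $n$. How the projected points are then clustered is not covered by this sketch.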