

MAXIMUM LIKELIHOOD AND BAYESIAN METHODS FOR MIXTURES OF NORMAL DISTRIBUTIONS

Peter M. Saama
UCLA Office of Academic Computing
November, 1997

A. ABSTRACT

The data were waiting times between eruptions of the Old Faithful geyser in Yellowstone National Park, Wyoming, USA. The sample histogram showed evidence of bimodality. Accordingly, a two-component normal mixture model was fitted to the data. The Gauss-Newton algorithm was used to obtain the maximum likelihood estimates of the nuisance parameters in the mixture model. An alternative method, which uses the Gibbs sampler to obtain parameter estimates as well as 100(1-α)% credible bands for the parameters, was implemented and is presented.

B. INTRODUCTION

Because of overdispersion and heterogeneity in the population, a mixture of distributions is often used to model the quantitative response. Such distributions are often considered appropriate models for populations thought to consist of a number of relatively distinct sub-populations (c). In situations where the number of components is unknown, mixture densities of the form

$$\sum_{j=1}^{k} \pi_j \, N(\theta_j, \sigma_j^2)$$

have found their widest application as a model-based clustering procedure; $\pi_j$ is the probability that observation $y_i$ comes from the $j$-th component of the mixture. Herein $\theta = (\lambda, \tau, \pi)$ will denote the set of all unknown parameters and $p(\cdot \mid \cdot)$ is used to denote a generic conditional probability density function.

A mixture of two normal densities was first considered by Pearson in 1894, with parameter estimates obtained by the method of moments, which involved the solution of a ninth-degree polynomial. The seminal paper on the EM algorithm (Dempster, Laird and Rubin, 1977) has greatly stimulated work on finite mixtures of distributions. Applications of mixture models reported by Titterington, Smith and Makov (1985) and McLachlan and Basford (1988) use the Expectation Maximization (EM) algorithm. Its disadvantages include:

• extreme slowness of convergence when the proportion of missing data is high;
• absence of standard errors from the information matrix at convergence.

Competitors of EM are Gauss-Newton (Lois, 1982; Aitkin et al, 1994), Fisher Scoring (Rao, 1948), and Differential Evolution (Price and Storn, 1997). The Gauss-Newton (GN) algorithm is not guaranteed to converge when the log-likelihood is not concave, but when it does converge its rate of convergence is usually quadratic, compared with the linear rate of EM. A hybrid EM-GN algorithm was proposed and implemented by Aitkin et al (1994).

A Bayesian analysis of mixture models presents certain advantages over the classical approaches. In theory, quantities of interest are written down as integrals of the form

$$\mathrm{E}\left[ F(\theta) \mid y^n \right] = \int_{\Theta} F(\theta) \, p(\theta \mid y^n) \, d\theta,$$

but in practice these integrals cannot be evaluated by traditional numerical methods. When the number of groups is assumed known, Markov Chain Monte Carlo methods such as the Gibbs sampler can be used to perform the integration.

It is a well-known problem in finite mixture models that the parameters are fundamentally not identifiable, in that the likelihood is unchanged by permutations of the component labels $1, \ldots, k$ attached to the parameters of the $k$ components. In a Bayesian analysis, this typically leads to a joint density of the parameters that is highly multimodal, which causes label-switching in the Gibbs sampler output and makes inferences for individual components of the mixture meaningless. A common practice is to impose identifiability constraints on the model parameters, such as $\sigma_1 < \cdots < \sigma_k$, but this is often not a satisfactory solution (Diebolt and Robert, 1994). Stephens (1997) suggests a general solution which involves permuting samples from the parameter posterior density so as to remove as much multimodality as possible, and which allows interpretations for groups to be discovered rather than imposed.

The purpose of this paper is to present EM, Gauss-Newton, and MCMC algorithms for fitting a two-component mixture model in order to provide the reader with tools for practical mixture estimation.

C. MATERIALS AND METHODS

The data (Appendix: Table A.), taken from Venables, W. N. and B. D. Ripley (1995), are waiting times between eruptions of the Old Faithful geyser in Yellowstone National Park, Wyoming, USA. A histogram of the waiting times is shown in Figure 1.

Figure 1. Histogram for the waiting times between successive eruptions for the Old Faithful geyser, with non-parametric and parametric estimated densities superimposed.

From inspection of this figure, a mixture of two normal distributions would seem to be a reasonable descriptive model for the marginal distribution of waiting times.

EM/Gauss-Newton algorithm

If $y_i$, $i = 1, 2, \ldots, n$, is a sample of waiting times, the log-likelihood function for a mixture of two normal components is:

$$L(\pi, \mu_1, \sigma_1, \mu_2, \sigma_2) = \sum_{i=1}^{n} \log \left[ \frac{\pi}{\sigma_1} \, \phi\!\left( \frac{y_i - \mu_1}{\sigma_1} \right) + \frac{1 - \pi}{\sigma_2} \, \phi\!\left( \frac{y_i - \mu_2}{\sigma_2} \right) \right]$$

The parameter estimates and their variances were obtained by minimizing $-L$ using an implementation of a hybrid EM/GN algorithm in the S-PLUS software. Initial values for the parameters can be obtained by the method of moments described in Everitt and Hand (1985), but for well separated components initial values can be specified by reference to the sample histogram and were:

Gibbs-Sampler

Assume that each observation for the waiting time, $y_i$, $i = 1, 2, \ldots, n$, is drawn from one of two groups with common variance. Let $T_i = 1, 2$ be the true group of the $i$-th observation, where group $j$ has a normal distribution with mean $\lambda_j$ and precision $\tau$. Furthermore, assume that an unknown fraction $\pi$ of observations are in group 2 and $1 - \pi$ in group 1. The model is thus

$$y_i \sim \mathrm{Normal}(\lambda_{T_i}, \tau)$$

$$T_j : \pi \sim \mathrm{Dirichlet}(\alpha_j)$$

A reparameterisation of this model, given by Robert (1994), which ensures that the data do not go into one component of the mixture, is:

$$\lambda_2 = \lambda_1 + \omega, \quad \omega > 0$$

The directed acyclic graph for this model is shown below in Figure 2. The conjugate prior for $\theta = (\lambda, \tau, \pi)$ is of the form (Ro
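Returning to the two-component log-likelihood of the EM/Gauss-Newton section, the following is a minimal sketch of the plain EM updates only; it is not the hybrid EM/GN S-PLUS implementation used in the paper and it does not produce standard errors. The function names (`normal_pdf`, `log_likelihood`, `em_two_normals`), the stopping rule, the starting values, and the synthetic data are all assumptions made for illustration; the synthetic sample merely stands in for the Old Faithful waiting times.

```python
import math
import random


def normal_pdf(y, mu, sigma):
    """Density of Normal(mu, sigma^2) at y, i.e. phi((y - mu) / sigma) / sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(data, pi, mu1, sigma1, mu2, sigma2):
    """L(pi, mu1, sigma1, mu2, sigma2) for the two-component normal mixture."""
    return sum(
        math.log(pi * normal_pdf(y, mu1, sigma1)
                 + (1.0 - pi) * normal_pdf(y, mu2, sigma2))
        for y in data
    )


def em_two_normals(data, pi, mu1, sigma1, mu2, sigma2, max_iter=500, tol=1e-8):
    """Plain EM iterations; returns the estimates and the final log-likelihood."""
    n = len(data)
    old = log_likelihood(data, pi, mu1, sigma1, mu2, sigma2)
    for _ in range(max_iter):
        # E-step: posterior probability that each y_i came from component 1.
        w = []
        for y in data:
            a = pi * normal_pdf(y, mu1, sigma1)
            b = (1.0 - pi) * normal_pdf(y, mu2, sigma2)
            w.append(a / (a + b))
        # M-step: weighted proportion, means and standard deviations.
        n1 = sum(w)
        n2 = n - n1
        pi = n1 / n
        mu1 = sum(wi * y for wi, y in zip(w, data)) / n1
        mu2 = sum((1.0 - wi) * y for wi, y in zip(w, data)) / n2
        sigma1 = math.sqrt(
            sum(wi * (y - mu1) ** 2 for wi, y in zip(w, data)) / n1)
        sigma2 = math.sqrt(
            sum((1.0 - wi) * (y - mu2) ** 2 for wi, y in zip(w, data)) / n2)
        new = log_likelihood(data, pi, mu1, sigma1, mu2, sigma2)
        converged = new - old < tol
        old = new
        if converged:
            break
    return pi, mu1, sigma1, mu2, sigma2, old


# Synthetic stand-in data (hypothetical values, NOT the Old Faithful sample).
rng = random.Random(0)
data = ([rng.gauss(55.0, 6.0) for _ in range(100)]
        + [rng.gauss(80.0, 6.0) for _ in range(170)])

# Starting values chosen by eye, as one might read them off a histogram.
est = em_two_normals(data, 0.5, 50.0, 5.0, 85.0, 5.0)
print(est)
```

Each EM iteration cannot decrease the log-likelihood, which is why the sketch stops once the increase falls below `tol`. A Gauss-Newton phase started from these estimates would be the natural next step for faster final convergence and for variance estimates, as in the hybrid approach the paper describes.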
