10352010121671)1815(2010)35-8752-08ScienceTechnologyandEngineeringVol110No135Dec12010Z2010Sci1Tech1Engng1增量聚类算法综述李桃迎陈燕秦胜君李楠(,116026)摘要给出了增量聚类的概念分析了增量聚类方法可以用于解决数据的变化和大量存储空间的需求问题增量聚类算法选择恰当时,可以保证数据在变化时有效地提高聚类的精度和效率从传统聚类生物智能聚类和数据流聚类三个角度归纳了增量聚类问题,分析了增量聚类问题的研究进展:包括发展的过程及特点,阐述了研究增量聚类问题的关键技术,最后给出了未来的发展趋势关键词聚类分析增量聚类生物智能数据流中图法分类号TP301.6;文献标志码A2010108(2009BAG13A03)(7094000870801007)(200801510001)(209030)(2009QN085):(1983)),,,E-mai:lytaoli@126.com:,[1],,,Web,,,,[2],:,,,,;,,,,,,,,1增量聚类方法综述[3],:[1],CLIQUE[3]ENCLUSMAFIA[4],PROCLUS[5]OR-CLUS[6],,BIRCH[2]CURE[7]k-meansk-modes,k-means[8,9]k,k,,[10]Hart-igan[11][12],D.Fisher[13]COB-WEB[14,15],[16)18],,,[19][10,17],,,,,,,2基于传统聚类方法及其变形的增量聚类算法,,,2.1,BRICHCOBWEB,[20],DBSCAN,DBSCANDBSCAN,,,,[21],DBSCANHuangZouXuXie[22,23],,,[24],,Chen[25],GRIN,,,(dendrogram),,GRIN,,GRIN,,,GRIN,Widyantoro[26])))IHC,:(homogeneity)(monotonicity),,Charikar[27],,,,875335,:,,,,,HsuHuang[28],,,(),,,2.2:,,;,,,;,,2.2.1簇的初始化:[29],,kk,,,,,,,,,,,,[30][31)37][31],,[32],kk-means[33]k-means[34]kk-means[35],c,,[36],K-[37]C2.2.2增量过程中簇的调整,,,,EdwinLughofer[38],,,,,[39],,,,,,,,[24],,,,,,,,875410;,;,,,,,,,,,,,,;,,;,,,,,,2.2.3聚类的有效性度量3[40]:;,,,XieBeni[41],(1)Vxie(U,V,c)=1nEci=1Enj=1umij+vi-vj+2miniXj+vi-vj+2(1)c-n,0,c[42,43][44],,,[45]F-,F-(2)Fl=1l-1Elj=1rjlEmt=1(x(jl)t-xt)21n-1Elj=1Erjlk=1Emt=1(x(jl)kt-x(jl)t)2(2):f(Ml)=1-min{A|P(Fl\FA(l-1,n-l))=A},FA(l-1,n-l)Al-1,n-lF3基于生物智能的增量聚类算法,,(ArtificialNeuralNe-twork-ANN)(ArtificialImmuneSys-tem-AIS)(GeneticAlgorithm-GA)(AntColonyAlgorithm)(ParticleSwarmOptimization-PSO),,,,,[46],k-meansEM[47],EM,,,,,,[48]PSOk-means,[49]PSOk-harmonicmeans[50],YangPSOKao[51],PSONM-PsoriasisK-PSO,BoLiu[52]ChenZhuo[53],Agent,,875535,:,AgentAgent,,,Agent1991Deneubour[54],E.LumerB.Faieta[55];1),[56];2);3);4),[57],[58,59]Web,,,,,,,,,,,(AIS)Timmis(Re-sourcelimitedAIS,RLAIS)[60]deCastro(aiNet)[61],,AISAIS,[62]AIS,Logistic,,,,,,,4面向数据流的增量聚类算法,,,,,[63],[64],,,[65)68],[66,69)73],[65,66,74],[66,68,75],[66,76],[63,66],[66,77],,XML[78,79],Web[80],WebService[81],Web[82]5结束语,,,,,,,,,,,875610,,,web1JingL,NgMK,HuangJZ.Anentropyweightingk-meansalgorithmforsubspaceclusteringofhigh-dimensionalsparsedata1IEEETrans-actionsonKnowledgeandDataEngineering,2007;19(8):1026)10412ZhangT,RamakrishnanR,LivnyM.BIRCH:Anefficientdataclusteringmethodforverylargedatabases.ACMSIGMODInterna-tionalConferenceonManagementofData.Montrea:lACMPress,1996:103)1143AgrawalR,GehrkeJ,GunopulosD,etal.Automaticsubspaceclus-teringofhighdimensionaldatafordataminingapplications.ACMSIGMODInternationalConferenceonManagementofData.Seattle:ACMPress,1998:94)1054ChengCH,FuAW,ZhangY.Entropy-basedsubspaceclusteringforminingnumericaldata.TheFifthACMSIGKDDInternationalConferenceonKnowledgeDiscoveryandDataMining.SanDiego:ACMPress,1999:84)935AggarwalC,ProcopiucC,WolfJL,etal.Fastalgorithmsforprojec-tedclustering.ThefifthACMSIGKDDInternationalConferenceonKnowledgeDiscoveryandDataMining.SanDiego:ACMPress,1999:61)726AggarwalCC,YuPS.Findinggeneralizedprojectedclustersinhighdimensionalspaces.ACMSIGMODinternationalconferenceonman-agementofdata.Dallas:ACMPress,2000:70)817GuhaS,RastogiR,ShimK.CURE:anefficientclusteringalgorithmforclusteringlargedatabases.ACMSIGMODInternationalConfer-enceonManagementofData.Seattle:ACMPress,1998:73)848HsuCC,HuangY.Incrementalclusteringofmixeddatabasedondistancehierarchy.ExpertSystemswithApplications,2008;35(3):1177)11859JainA,DubesR.Algorithmsforclusteringdata.EnglewoodCliffs,NJ:PrenticeHallCollegeDiv,198810SomloGL,HoweAE.Incrementalclusteringforprofilemainte-nanceininformationgatheringwebagents.TheFifthInternationalConferenceonAutonomousAgents,NewYork:ACMPress,2001:262)26911HartiganJA.Clusteringalgorithms.NewYork:JohnWiley&Sons,Inc,197512CarpenterG,GrossbergS.Art3:hierarchicalsearchusingchemicaltransmittersinsel-forganizingpatternrecognitionarchitectures.NeuralNetworks,1990;3(2):129)15213FisherD.Knowledgeacquisitionviaincrementalconceptualcluste-ring.MachineLearning,1987;2:139)17214CanF.Incrementalclusteringfordynamicinformationprocessing.ACMTransactionforInformationSystems,1993;11:143)16415CanF,FoxEA,SnavelyCD,etal.Incrementalclusteringforver-ylargedocumentdatabases:initialMARIANexperience.Informa-tionSystems,1995;84:101)11416LinJ,VlachosM,KeoghEJ,etal.Iterativeincrementalclusteringoftimeseries.LectureNotesinComputerScience,2004:106)12217CharikarM,ChekuriC,FederT,etal.Incrementalclusteringanddynamicinformationretrieva.lTheTwenty-ninthAnnualACMSym-posiumonTheoryofComputing.ElPaso:ACMPress,1997:626)63518EsterM,KriegelHP,SanderJ,etal.Incrementalclusteringformininginadatawarehousingenvironment.The24rdInternationalConferenceonVeryLargeDataBases.NY:MorganKaufmann,1998:323)33319SimoviciD,SinglaN,KuperbergM.Metricincrementalclusteringofnominaldata.The4thIEEEInternationalConferenceonDataMining.Brighton:IEEEComputerSocietyPress,2004:523)52620EsterM,KriegelHP,SanderJ,etal.Incrementalclusteringformininginadatawarehousingenvironment.The24rdInternationalConferenceonVeryLargeDataBases.NY:MorganKaufmann,1998:323)33321,,..,2002;13(1):1)722,鹍.数据仓库中