I.J. Modern Education and Computer Science, 2016, 5, 12-18
Published Online May 2016 in MECS
DOI: 10.5815/ijmecs.2016.05.02
Copyright © 2016 MECS

An Ensemble of Adaptive Neuro-Fuzzy Kohonen Networks for Online Data Stream Fuzzy Clustering

Zhengbing Hu
School of Educational Information Technology, Central China Normal University, Wuhan, China
Email: hzb@mail.ccnu.edu.cn

Yevgeniy V. Bodyanskiy
Kharkiv National University of Radio Electronics, Kharkiv, Ukraine
Email: yevgeniy.bodyanskiy@nure.ua

Oleksii K. Tyshchenko and Olena O. Boiko
Kharkiv National University of Radio Electronics, Kharkiv, Ukraine
Email: lehatish@gmail.com, olena.boiko@ukr.net

Abstract: A new approach to data stream clustering with the help of an ensemble of adaptive neuro-fuzzy systems is proposed. The proposed ensemble is formed with adaptive neuro-fuzzy self-organizing Kohonen maps in a parallel processing mode. Their learning procedure is carried out with different parameters that define the degree of blurriness of the cluster borders. The quality of the clusters is estimated in an online mode with the help of a modified partition coefficient which is calculated in a recurrent form. The final result is taken from the best neuro-fuzzy self-organizing Kohonen map.

Index Terms: Computational Intelligence, Data Stream Processing, Neuro-Fuzzy System, Fuzzy Clustering, Machine Learning.

I. INTRODUCTION

Multidimensional data clustering is common in Data Mining tasks. Application areas such as Text Mining and Web Mining have become widespread lately. The traditional approach to solving tasks of this sort assumes that each vector of a processed sequence may belong to only a single class. However, it is more natural for each specific observation to be attributed to several classes at the same time with different membership levels. This situation is the subject of fuzzy cluster analysis [1, 2]. In this approach, the most effective and simplest methods are probabilistic fuzzy clustering procedures based on optimization of some objective function.

Initial data for a fuzzy clustering problem is a sample of observations which consists of $N$ $(n \times 1)$-dimensional feature vectors $X = \{x(1), x(2), \ldots, x(k), \ldots, x(N)\} \subset R^n$, and the result of the clustering procedure is a partition of the initial dataset into
$m$ overlapping classes with some membership levels $0 \le u_j(k) \le 1$ of the $k$-th feature vector to the $j$-th cluster, $j = 1, 2, \ldots, m$. The overwhelming majority of the well-known fuzzy clustering algorithms are designed for batch-mode processing, which means that the sample size $N$ cannot change while the data are processed.

There is a wide class of tasks that can be solved only with the Data Stream Mining approach [3-16], in which data are fed and processed in an online mode. This task is rather typical for Web Mining, where information is fed in real time directly from the Internet.

Self-organizing maps (SOMs) by Kohonen have proved their efficiency in clustering tasks. Their efficiency is defined by their computational simplicity and their ability to work in real time for sequential data processing. These neural networks are trained with the help of self-learning procedures based on the principles "Winner takes all" (WTA) and "Winner takes more" (WTM). It is assumed in advance that the structure of the processed data implies that the formed clusters do not mutually intersect, which means that it is possible to build a separating hyper-surface which clearly distinguishes different classes during the learning procedure of a neural network.

Recurrent modifications of the fuzzy clustering algorithms (which make it possible to solve a task in an online mode) were introduced for sequential data processing in [17, 18]. It should be noted that the introduced procedures are structurally close to the Kohonen self-learning rule according to the principle "Winner takes more". This allows introducing a so-called "fuzzy clustering Kohonen network" [19] which possesses a number of advantages compared to a conventional self-organizing map.

The well-known and most commonly used fuzzy clustering algorithms cannot be called fuzzy in the full sense, because their results are significantly defined by the value of a special parameter (also known as a fuzzifier $\beta$, which is chosen empirically). The case when $\beta$ belongs to the interval from 1 to $\infty$ corresponds to a transition from crisp borders ($\beta = 1$), which are obtained with the help of the K-means procedure, to their complete blurriness, when all observations belong to all clusters with the same membership level. We should note that $\beta = 2$ in most cases, which corresponds to the fuzzy C-means procedure (FCM) by Bezdek [20].

When processing real-world data, there may be a situation in which one object belongs to different classes at the same time and these classes mutually intersect (overlap). Conventional SOMs do not take this situation into consideration, but the problem can be handled with the help of fuzzy clustering techniques.

The remainder of this paper is organized as follows: Section 2 describes fuzzy clustering techniques with a variable fuzzifier. Section 3 describes the architecture of an ensemble of adaptive neuro-fuzzy Kohonen networks. Section 4 gives some details on possibilistic fuzzy clustering with a variable fuzzifier. Section 5 presents a real-world application to be solved with the help of the proposed fuzzy clustering approach. Conclusions and future work are given in the final section.

II. FUZZY CLUSTERING WITH A VARIABLE FUZZIFIER

Algorithms based on goal functions are considered to be strict from a mathematical point of view among all clustering proc