I.J. Engineering and Manufacturing 2011, 5, 59-65
Published Online October 2011 in MECS
DOI: 10.5815/ijem.2011.05.08
Available online at

... web services technology into the distributed data mining field, and made some tentative efforts toward solving the problem of designing a suitable architecture for distributed data mining systems and the corresponding distributed mining algorithms.

Index Terms: Data Mining; Distributed Computing; Component Technology; Web Service

© 2011 Published by MECS Publisher. Selection and/or peer review under responsibility of the Research Association of Modern Education and Computer Science.

1. Introduction

With the rapid development of information technology, we can easily access and store various kinds of data. However, we now face the problem of having a wealth of information resources but a lack of knowledge. People are no longer satisfied with surface-level treatment of data, such as statistics and queries, so digging out the hidden information and intrinsic relationships in the data has naturally become an important task. The Internet, as the largest data source, holds many different types of information resources and contains knowledge of great potential value. Likewise, there are many other data sources rich in information, and people are eager to obtain valuable resources and knowledge from them.

Data mining is a technology that intelligently and automatically converts data into useful information and knowledge, and it has become a focus of information technology. It is a new interdisciplinary field based on statistics, pattern recognition, artificial intelligence, machine learning, database technology, high-performance parallel computing and other fields, and it has been applied successfully in economics, finance, astronomy and other domains. The original data set can be structured, semi-structured, or even heterogeneous data distributed over the network. Knowledge can be mined with mathematical or non-mathematical, deductive or inductive methods.

Research of Distributed Data Mining System Based on Web Services

2. Overview of Distributed Data Mining

2.1. What is distributed data mining

Distributed data mining has two meanings. First, it is a procedure that uses distributed algorithms to discover knowledge from logically or physically distributed data sources [1]; here the emphasis is on distribution. Second, the users, data, mining software and other software components related to a given data mining task are geographically dispersed; here the emphasis is on the dispersion of the software components.

2.2. The problem to be solved by distributed data mining

1) Global centralized control: to facilitate distributed data mining, a site used for centralized control is necessary. Without a global control site, the communication overhead of the whole system is very large: to obtain the global knowledge, every site has to broadcast extensively, and the cost and difficulty are undoubtedly much greater than with a global control site.

2) Parallel and distributed data mining algorithms: this is essentially a performance issue. Running data mining algorithms on a large data set takes a long time because the time complexity of data mining algorithms is high. A better approach is to use parallel data mining algorithms: partition the data set into several subsets, mine each subset in parallel, and combine the results of the subsets afterwards [2].

3) Knowledge sharing: when sites carry out distributed mining together, a commonly understood form of knowledge must be selected. One reason is that distributed data mining generally consists of mining for knowledge, so a unified, commonly understood knowledge representation is needed to achieve cooperative mining among the sites. The other is that users may need access to knowledge on other sites, which also requires a general knowledge representation.

4) Distributed software design: a software component is a distributed object that is not bound to any specific program or programming language. It can work across heterogeneous platforms and is encapsulated. It interacts with the outside world through a pre-defined application program interface. Its biggest advantage is support for software reuse: system designers can use existing software components, which optimizes the division of labor and greatly reduces the coding workload.

3. Web Services Technology

3.1. Web services architecture

A typical web services architecture is shown in Figure 1. There are three roles in the web services architecture: service provider, service registry and service requestor. The service provider is the supplier of the final web services; it implements an application written for a particular demand and places it on an online server for others to use. From a business perspective, the service provider is the owner of the web services and is in charge of publishing, updating and recycling the services, service