Massive log analysis and computation based on Hadoop

xwy803
2020-11-16

Resource description

Analysis and Calculation of Massive Logs Based on Hadoop

Abstract

With the development of science and technology, transistor circuits have gradually approached their physical performance limits. Moore's Law has ceased to hold since 2005: the computing power of a single CPU can no longer double every 18 months. Meanwhile, the number of people online has exploded, and companies that provide network services have to analyze massive record logs every day in order to adjust their products to customers' requirements in time. Some critical product data must therefore be processed within a given time. Traditional database technology cannot provide enough computing power and storage to meet these data-processing needs. Cloud computing was proposed to solve this problem, and it is becoming the direction of the near future. IT industry giants such as Google, IBM, Facebook, Yahoo and Microsoft have each adopted their own cloud computing platform to process massive data and provide computing power.

In this paper, the Hadoop cloud computing platform, an open-source implementation of the distributed file system and MapReduce model published by Google, was selected to increase the capacity for processing large volumes of logs. Hadoop is an open-source distributed computing framework. It offers good scalability, lower operating costs, higher efficiency and better stability. Furthermore, the MapReduce programming model is well suited to text-processing applications. In addition, Hadoop handles all the low-level messaging for programmers during parallel computing. Programmers only need to deal with the data-processing logic and do not need to consider the messages exchanged between the parallel machines of the Hadoop platform. They can therefore focus on the critical issues and develop programs faster. As a result, the Hadoop platform was widely adopted after its release.

This paper studies Hadoop's HDFS and MapReduce model in depth. Following Hadoop's data-processing model, we design a data-processing model that fits our business requirements. This model is applied in practice to handle massive log processing and to cut down data-processing time. Most importantly, the Hadoop cloud platform removes the data-processing bottleneck of a single server.

In this paper, a Hadoop cloud computing platform was designed and implemented. On this platform, a data-processing model was designed and implemented to compute log statistics and to speed up massive log processing. Statistics programs for several products were written on our own Hadoop cloud platform, and performance tests were carried out. By analyzing the relationship between computing power and the number of worker nodes, and by comparing the computing power of multiple nodes with that of a single database, the experimental data show that Hadoop has a strong advantage in processing massive data.

Keywords: Hadoop; HDFS; MapReduce; cloud computing; massive data processing and analysis

Contents

Abstract
1 …
1.1 …
1.2 …
1.3 …
2 …
2.1 HDFS …
2.2 HDFS …
2.3 …
2.4 …
3 Hadoop MapReduce …
3.1 MapReduce …
3.2 MapReduce …
3.3 …
3.4 …
4 Hadoop …