Analysis and Computation of Massive Logs Based on Hadoop


Abstract

With the development of science and technology, transistor circuits have been gradually approaching their physical performance limits. Moore's Law has ceased to hold since 2005: doubling the computing power of a single CPU every 18 months is no longer possible. Meanwhile, the number of people online has exploded, and companies that provide network services have to analyze massive record logs every day in order to adjust their products to customers' requirements in time. Some critical product data must therefore be processed within a given time. Traditional database technology cannot provide enough computing power and storage for all of this data to meet customers' processing needs. Cloud computing was proposed to solve this problem, and it is becoming the direction of the near future. Today, IT industry giants such as Google, IBM, Facebook, Yahoo and Microsoft have adopted their own cloud computing platforms to process massive data and provide computing power.

In this paper, the Hadoop cloud computing platform was selected to strengthen the processing of large volumes of logs. Hadoop is an open-source distributed computing framework. First, the framework offers good scalability, lower operating costs, higher efficiency and better stability; furthermore, the MapReduce programming model is well suited to text-processing applications. Secondly, Hadoop handles all the low-level message passing for programmers during parallel computation. Programmers only need to deal with the logic of the data and do not need to consider the messages exchanged between the parallel machines on the Hadoop platform, so they can focus on the critical issues and speed up program development. As a result, the Hadoop platform has been widely used since its release.

This paper studies Hadoop's HDFS and MapReduce model in depth. Based on Hadoop's data-processing model, we design a data-processing model that fits our business requirements. The model is applied in practice to process massive logs and cut down data-processing time. Most importantly, the Hadoop cloud platform removes the data-processing bottleneck of a single server.

In this paper, a Hadoop cloud computing platform was designed and implemented. On this platform, a data-processing model was designed and implemented to compute log statistics and improve the speed of massive log processing. Programs that compute statistics for several products were written and run on our own Hadoop cloud platform, and performance tests were carried out. By analyzing the relationship between computing power and the number of worker nodes, and by comparing the computing power of multiple nodes with that of a single database, the experimental data show that Hadoop has a strong advantage in processing massive data.

Keywords: Hadoop, HDFS, MapReduce, cloud computing, massive data processing and analysis

Contents

Abstract
1 ...
1.1 ...
1.2 ...
1.3 ...
2 ...
2.1 HDFS ...
2.2 HDFS ...
2.3 ...
2.4 ...
3 Hadoop MapReduce ...
3.1 MapReduce ...
3.2 MapReduce ...
3.3 ...
3.4 ...
4 Hadoop ...
