并行计算介绍

整理文档很辛苦,赏杯茶钱您下走!

免费阅读已结束,点击下载阅读编辑剩下 ...

阅读已结束,您可以下载文档离线阅读编辑

资源描述

IntroductiontoParallelProgrammingMapReduceExceptwhereotherwisenotedallportionsofthisworkareCopyright(c)2007GoogleandarelicensedundertheCreativeCommonsAttribution3.0License•Intheearlydaysofcomputing,programswereserial,thatis,aprogramconsistedofasequenceofinstructions,whereeachinstructionexecutedoneaftertheother.Itranfromstarttofinishonasingleprocessor.Serialvs.ParallelProgramming•Parallelprogrammingdevelopedasameansofimprovingperformanceandefficiency.•Inaparallelprogram,theprocessingisbrokenupintoparts,eachofwhichcanbeexecutedconcurrently.•TheinstructionsfromeachpartrunsimultaneouslyondifferentCPUs.•TheseCPUscanexistonasinglemachine,ortheycanbeCPUsinasetofcomputersconnectedviaanetwork.Whydoit?•Notonlyareparallelprogramsfaster,theycanalsobeusedtosolveproblemsonlargedatasetsusingnon-localresources.•Whenyouhaveasetofcomputersconnectedonanetwork,youhaveavastpoolofCPUs,andyouoftenhavetheabilitytoreadandwriteverylargefiles(assumingadistributedfilesystemisalsoinplace)Howtodoit?•Thefirststepinbuildingaparallelprogramisidentifyingsetsoftasksthatcanrunconcurrentlyand/orpartitionsofdatathatcanbeprocessedconcurrently.Howtodoit?•Sometimesit'sjustnotpossible.ConsideraFibonaccifunction:Fk+2=Fk+Fk+1•Afunctiontocomputethisbasedontheformabove,cannotbeparallelizedbecauseeachcomputedvalueisdependentonpreviouslycomputedvalues.Howtodoit?•Acommonsituationishavingalargeamountofconsistentdatawhichmustbeprocessed.•Ifthedatacanbedecomposedintoequal-sizepartitions,wecandeviseaparallelsolution.•Considerahugearraywhichcanbebrokenupintosub-arrays….Howtodoit?Ifthesameprocessingisrequiredforeacharrayelement,withnodependenciesinthecomputations,andnocommunicationrequiredbetweentasks,wehaveanidealparallelcomputingopportunity.ImplementationStrategy:Master/Worker•TheMASTER:–initializesthearrayandsplitsitupaccordingtothenumberofavailableWORKERS–sendseachWORKERitssubarray–receivestheresultsfromeachWORKER•TheWORKER:–receivesthesubarrayfromtheMASTER–performsprocessingonthesubarray–returnsresultstoMASTERLoadBalancing•TheMaster/Workermodelimplementsstaticloadbalancingwhichiscommonlyusedifalltasksareperformingthesameamountofworkonidenticalmachines.•Loadbalancingreferstotechniqueswhichtrytospreadtasksamongtheprocessorsinaparallelsystemtoavoidsomeprocessorsbeingidlewhileothershavetasksqueueingupforexecution.LoadBalancing•Astaticloadbalancerallocatesprocessestoprocessorsatruntimewhiletakingnoaccountofcurrentnetworkload.•Dynamicalgorithmsaremoreflexible,thoughmorecomputationallyexpensive,andgivesomeconsiderationtothenetworkloadbeforeallocatingthenewprocesstoaprocessor.ClassicExample1•Consideroneofthemethodsforapproximatingπ.Thefirststepistoinscribeacircleinsideasquare:SomeCalculations•Areaofthesquare:As=(2r)2or4r2•Areaofthecircle:Ac=π*r2•So:pi=Ac/r2As=4r2r2=As/4pi=4*Ac/As“Parallelizing…”1.Randomlygeneratepointsinthesquare2.Countthenumberofgeneratedpointsthatarebothinthecircleandinthesquare3.r=thenumberofpointsinthecircledividedbythenumberofpointsinthesquare4.π=4*r“Parallelizing…”NUMPOINTS=100000;//somelargenumber-thebigger,theclosertheapproximationp=numberofWORKERS;numPerWorker=NUMPOINTS/p;countCircle=0;//oneoftheseforeachWORKER//eachWORKERdoesthefollowing:for(i=0;inumPerWorker;i++){generate2randomnumbersthatlieinsidethesquare;xcoord=firstrandomnumber;ycoord=secondrandomnumber;if(xcoord,ycoord)liesinsidethecirclecountCircle++;}MASTER:receivesfromWORKERStheircountCirclevaluescomputesPIfromthesevalues:PI=4.0*countCircle/NUMPOINTS;ClassicExample2:MapReduce•TheMapReduceprogrammingmodelderivesfromthemapandreducecombinatorsfromafunctionallanguagelikeLisp.•InLisp,amaptakesasinputafunctionandasequenceofvalues.Itthenappliesthefunctiontoeachvalueinthesequence.•Areducecombinesalltheelementsofasequenceusingabinaryoperation.Forexample,itcanuse+toaddupalltheelementsinthesequence.ClassicExample2:MapReduce•MapReduceisinspiredbytheseconcepts.•ItdevelopedwithinGoogleasamechanismforprocessinglargeamountsofrawdata,forexample,crawleddocumentsorwebrequestlogs.•Thisdataissolarge,itmustbedistributedacrossthousandsofmachinesinordertobeprocessedinareasonabletime.ClassicExample2:MapReduce•ThisdistributionimpliesparallelcomputingsincethesamecomputationsareperformedoneachCPU,butwithadifferentdataset.•MapReduceisanabstractionthatallowsGoogleengineerstoperformsimplecomputationswhilehidingthedetailsofparallelization,datadistribution,loadbalancingandfaulttolerance.Map•Map,writtenbyauseroftheMapReducelibrary,takesaninputpairandproducesasetofintermediatekey/valuepairs.•TheMapReducelibrarygroupstogetherallintermediatevaluesassociatedwiththesameintermediatekeyIandpassesthemtothereducefunction.Reduce•Thereducefunction,alsowrittenbytheuser,acceptsanintermediatekeyIandasetofvaluesforthatkey.Itmergestogetherthesevaluestoformapossiblysmallersetofvalues.Example•Considertheproblemofcountingthenumberofoccurrencesofeachwordinalargecollectionofdocuments:map(Stringkey,Stringvalue)://key:documentname//value:documentcontentsforeachwor

1 / 27
下载文档,编辑使用

©2015-2020 m.777doc.com 三七文档.

备案号:鲁ICP备2024069028号-1 客服联系 QQ:2149211541

×
保存成功