

OMPC++: A Portable High-Performance Implementation of DSM using OpenC++ Reflection

Yukihiko Sohda, Hirotaka Ogawa, and Satoshi Matsuoka
Department of Mathematical and Computing Sciences
Tokyo Institute of Technology
2-12-1 Oo-okayama, Meguro-ku, Tokyo 152-8552
TEL: (03) 5734-3876 FAX: (03) 5734-3876
email: {sohda,ogawa,matsu}@is.titech.ac.jp

Abstract. Platform portability is one of the most demanded properties of a system today, due to the diversity of runtime execution environments on wide-area networks, and parallel programs are no exception. However, parallel execution environments are very diverse and can change dynamically, while performance must be portable as well. As a result, techniques for achieving platform portability are sometimes not appropriate, or could restrict the programming model, e.g., to simple message passing. Instead, we propose the use of reflection for achieving platform portability of parallel programs. As a prototype experiment, a software DSM system called OMPC++ was created, which utilizes the compile-time metaprogramming features of OpenC++ 2.5 to generate message-passing MPC++ code from an SPMD-style, shared-memory C++ program. The translation creates memory management objects on each node to manage the consistency protocols for objects and arrays residing on different nodes. Read- and write-barriers are automatically inserted on references to shared objects. The resulting system turned out to be quite easy to construct compared to traditional DSM construction methodologies. We evaluated this system on a PC cluster linked by the Myrinet gigabit network, and obtained reasonable performance compared to a high-performance SMP.

1 Introduction

Due to rapid commoditization of advanced hardware, parallel machines, which had been specialized and of limited use, are being commoditized in the form of workstation and PC clusters. On the other hand, commodity software technologies such as standard libraries, object-orientation, and components are not sufficient for guaranteeing that the same code will work across all parallel platforms not only with the same set of features but also with similar performance characteristics, similar fault guarantees, etc. Such performance portability is fundamentally difficult because platforms differ in processors, number of nodes, communication hardware, operating systems, libraries, etc., despite commoditization.

Traditionally, portability amongst diverse parallel computers has been achieved either by standard libraries such as MPI, or by parallel programming languages and compilers such as HPF and OpenMP [Ope97]. However, such efforts could require programming under a fixed programming model. Moreover, portable implementations of such systems are themselves quite difficult and require substantial effort and cost. Instead, reflection and open compilers could be alternative methodologies and techniques for performance-portable high-performance programs.

Based on such a belief, we have embarked on the OpenJIT [MOS+98] project. OpenJIT is a (reflective) Just-In-Time open compiler written almost entirely in Java, and plugs into the standard JDK 1.1.x and 1.2 JVM. At the same time, OpenJIT is designed to be a compiler framework similar to Stanford SUIF [Suif], in that it facilitates user-customizable high-level and low-level program analysis and transformation frameworks. With OpenJIT, parallel programs of various parallel programming models in Java, compiled into Java bytecode, will be downloaded and executed on diverse platforms over the network, from single-node computers to large-scale parallel clusters and MPPs, along with customization classes for the respective platforms and programming models using compiler metaclasses^1.

^1 OpenJIT is in active development, and is currently readying the first release as of Feb. 1, 1999.

The question is, will such a scheme be feasible, especially given the strong performance requirements of high-performance parallel programs? Moreover, how much metaprogramming effort would such an approach take? So, as precursor work using OpenC++, we have employed reflection to implement DSM (distributed shared memory) in a portable way, to support Java's chief model of parallel programming, i.e., multithreaded execution over shared memory. More specifically, we have designed a set of compiler metaclasses and the supporting template classes and runtimes for OpenC++ 2.5 [Chi95,Chi97] that implement the necessary program transformations with its compile-time MOP for efficient and portable implementation of software-based DSM for programs written in (shared-memory) SPMD style, called OMPC++. A multithreaded C++ program is transformed into a message-passing program in MPC++ [Ish96] level 0 MTTL (multithread template library), and executed on our RWC (Real-World Computing Partnership)-spec PC cluster, whose nodes are standard PCs but are interconnected with the Myrinet [Myr] gigabit network and run the RWC's SCore parallel operating system based on Linux.

The resulting OMPC++ is quite small, requiring approximately 700 lines of metaclass, template class, and runtime programming. Also, OMPC++ proved to be competitive with traditional software-based DSM implementations as well as hardware-based SMP machines. Early benchmark results using numerical core programs written in shared-memory SPMD style (a fast parallel CG kernel, and parallel FFT from SPLASH2) show that our reflective DSM implementation scales well, and achieves performance competitive with that of high-performance SMPs (SparcServer 4000, which has dedicated and expensive hardware for maintaining hardware memory consistency). Not only does this result serve as solid groundwork for OpenJIT, but OMPC++ itself serves as a high-performance and portable DSM in
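
The abstract's description of read- and write-barriers on references to shared objects can be pictured with a small sketch in plain C++. This is not OpenC++ output and uses no MPC++ API: the names `SharedArray`, `NodeView`, `Ref`, and `demo`, the block-based write-through/invalidate protocol, and the single-process simulation of "nodes" are all assumptions made for illustration. It only shows how a proxy object returned by `operator[]` lets every read and write of a shared element pass through a barrier.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-process model of a software DSM array (illustration only).
// home_ stands in for the owning node's memory; cache_[k] stands in for node k's
// local copy. Real message passing is replaced by direct copies.
template <typename T>
class SharedArray {
public:
    SharedArray(std::size_t n, std::size_t block, std::size_t nodes)
        : home_(n), block_(block),
          cache_(nodes, std::vector<T>(n)),
          valid_(nodes, std::vector<bool>((n + block - 1) / block, false)) {
        assert(block > 0);
    }

    // Proxy for one element as seen from one node.
    class Ref {
    public:
        Ref(SharedArray& a, std::size_t node, std::size_t i) : a_(a), node_(node), i_(i) {}
        // Reading the element goes through the read barrier.
        operator T() const { return a_.read(node_, i_); }
        // Assigning to the element goes through the write barrier.
        Ref& operator=(const T& v) { a_.write(node_, i_, v); return *this; }
    private:
        SharedArray& a_;
        std::size_t node_, i_;
    };

    // Handle through which a given node indexes the shared array.
    class NodeView {
    public:
        NodeView(SharedArray& a, std::size_t node) : a_(a), node_(node) {}
        Ref operator[](std::size_t i) { return Ref(a_, node_, i); }
    private:
        SharedArray& a_;
        std::size_t node_;
    };

    NodeView on(std::size_t node) {
        assert(node < cache_.size());
        return NodeView(*this, node);
    }

    // Number of block fetches from home memory so far.
    std::size_t fetches() const { return fetches_; }

private:
    // Read barrier: on a miss, copy the whole block from home, then read locally.
    T read(std::size_t node, std::size_t i) {
        assert(i < home_.size());
        const std::size_t b = i / block_;
        if (!valid_[node][b]) {
            const std::size_t lo = b * block_;
            const std::size_t hi = std::min(lo + block_, home_.size());
            for (std::size_t j = lo; j < hi; ++j) cache_[node][j] = home_[j];
            valid_[node][b] = true;
            ++fetches_;
        }
        return cache_[node][i];
    }

    // Write barrier: write through to home and invalidate other nodes' copies.
    void write(std::size_t node, std::size_t i, const T& v) {
        assert(i < home_.size());
        const std::size_t b = i / block_;
        home_[i] = v;
        for (std::size_t k = 0; k < valid_.size(); ++k)
            if (k != node) valid_[k][b] = false;
        if (valid_[node][b]) cache_[node][i] = v;  // keep the writer's own copy current
    }

    std::vector<T> home_;
    std::size_t block_;
    std::vector<std::vector<T>> cache_;
    std::vector<std::vector<bool>> valid_;
    std::size_t fetches_ = 0;
};

// Small driver: node 0 writes an element, node 1 reads it back.
int demo() {
    SharedArray<int> a(8, 4, 2);   // 8 elements, blocks of 4, 2 simulated nodes
    auto n0 = a.on(0);
    auto n1 = a.on(1);
    n0[5] = 42;                    // write barrier
    int first = n1[5];             // read barrier: miss, block fetched from home
    int again = n1[4];             // same block, served from node 1's local copy
    return first + again;
}
```

The user-visible code (`n0[5] = 42; int x = n1[5];`) looks like ordinary array access, which is the property the paper relies on when it inserts barriers automatically at compile time; here the same effect is obtained by hand with a proxy class rather than by a metaclass.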
