
Topology and Routing in Clusters: From Theory to Practice

Yoav Etsion, Mickael Raizman, Dror G. Feitelson
School of Computer Science and Engineering
The Hebrew University of Jerusalem
91904 Jerusalem, Israel
Contact: feit@cs.huji.ac.il, tel: +97226584115, fax: +97226511912

Abstract

Designers of communication subsystems for clusters often present performance data by measuring bandwidth and latency for single point-to-point connections. Such data is not sensitive to routing algorithms and to network topology, and little experimental evidence relating to the impact of these factors on performance has been collected. On the other hand, there is abundant theoretical evidence for the use of topologies with high bisection bandwidth and multiple and even randomized routes. We show that indeed these ideas can be applied in practice to achieve significant benefits. In the case of topology, we show that the characteristics of commercially available Myrinet switches allow for the construction of related topologies with relatively few added links that provide much better support for intensive communication patterns. In the case of routing, we show simple mechanisms for implementing multiple paths in FM and randomization by randomized mapping of logical nodes. These mechanisms alleviate congestion due to uneven traffic patterns.

1 Introduction

Clusters of commodity PCs connected by fast LANs are becoming a common architecture for high-performance computing [25]. An interesting design issue for such clusters is how to configure the LAN for best performance. This includes the choice of topology, the routing algorithm, and the interplay between them. While these issues have received much attention in the professional literature, there is little experimental data regarding the match between theory and practice.

Two recurring ideas in interconnection network research are the use of topologies with a high bisection bandwidth, and the use of dynamic routing rather than a single predefined route between each pair of nodes. We investigated these ideas using the ParPar cluster as our experimental testbed. This is a cluster of 16 Pentium-Pro 200 PCs connected by a Myrinet LAN. Communication software is based on Fast Messages.

In our experiments we checked a set of related topologies with increasing bisection bandwidth. We found that indeed having a very sparse topology may hurt the achievable peak bandwidth. However, the bisection should not be measured in units of "number of links", but rather in units of bandwidth. In our case, the network is faster than the nodes, so fewer links than the "ideal" bisection are sufficient. We postulate that the opposite is also true: a relatively slow network can be compensated for by using more links, provided the routing mechanism knows how to use them.

In another set of experiments, we compared the use of a single route between each pair of nodes, using alternate routes, and using a randomized route. The results in this case were that using alternate routes improved performance, but randomization improved it even more. The implication is that a cluster management system need not take pains to map nodes in an orderly manner; on the contrary, mapping the logical nodes randomly to physical nodes may promote more efficient use of the network.

It should be noted that in all cases our test programs were not optimized for the topology and routing algorithm being used. This reflects common usage on clusters, which are perceived as off-the-shelf general-purpose platforms for high performance computing. For example, parallel programs on many clusters are based on using the MPICH implementation of the standard MPI interface [16]. This implementation is popular because it is portable: the same core algorithms are always used, and only a thin abstraction layer has to be re-written for each new platform. Our research is aimed at finding mechanisms that may be expected to work fairly well for a broad range of applications, rather than at attempting to achieve optimal performance at the cost of significant specialization.

1.1 Myrinet

Our experimental platform is based on the Myrinet product from Myricom. Myrinet is a gigabit-per-second switched LAN [6], which is a leading candidate for high-performance communication in cluster environments [2]. Creating a Myrinet requires three components: a network interface card with memory and a communications co-processor (called the LANai) in each node, one or more switches, and cables to connect the nodes to the switches and the switches to each other.

In principle, any topology can be used. In practice, the usable topologies are constrained by the types of switches that are available. Until recently, the most common switch was the dual 8-way switch, which is a box containing two switches with 8 ports each. The switches are denoted A and B. The box itself has 8 two-link ports, each serving a pair of switch ports, one from each switch (Fig. 1). Normally both links can be accommodated in a single cable, but a splitter can be used to separate the two links and channel them to different cables.

Figure 1: Dual 8-way Myrinet switch: schematic and picture of switch with cables.

The best way to connect up to 8 nodes is to use only one of the two switches in the box. When using both, up to 14 nodes can be connected (because at least one port on each switch is needed in order to connect the switches to each other). Therefore at least two dual switches are needed in order to create a 16-node cluster. In such a configuration there are a total of 32 ports: 16 for the nodes, and another 16 for 8 inter-switch links, leading to a design such as that shown on the top of Fig. 2. The connectivity can be increased by using 3 dual switches, as shown at the bottom of the figure (actually this design can support 18 nodes). For larger clusters, more switches are needed. Also, an oct switch can be used as the basic building block. This is a box containing eight 8
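
The port arithmetic for dual 8-way switches described above can be checked with a short sketch. The Python snippet below is illustrative only and is not part of the original paper: the function names `port_budget` and `max_nodes_in_one_box` are hypothetical, and the code assumes, as the text states, that each box holds two 8-port switches and that every inter-switch link consumes one port at each end. It reports only an upper bound on the number of inter-switch links; whether a particular wiring is feasible depends on how the nodes are spread over the switches.

```python
# Illustrative port-budget arithmetic for dual 8-way switch boxes.
# All names here are hypothetical helpers, not taken from the paper.

PORTS_PER_SWITCH = 8   # each switch has 8 ports (from the text)
SWITCHES_PER_BOX = 2   # a dual box contains switches A and B (from the text)


def port_budget(boxes: int, nodes: int) -> dict:
    """Split the ports of `boxes` dual switches between nodes and links."""
    total_ports = boxes * SWITCHES_PER_BOX * PORTS_PER_SWITCH
    if nodes < 0 or nodes > total_ports:
        raise ValueError("node count does not fit the available ports")
    free_ports = total_ports - nodes
    return {
        "total_ports": total_ports,
        "node_ports": nodes,
        "inter_switch_ports": free_ports,
        # Each link uses two ports, one on each switch it connects.
        "max_inter_switch_links": free_ports // 2,
    }


def max_nodes_in_one_box(links_between_a_and_b: int = 1) -> int:
    """Nodes one dual box can host when A and B are joined by some links."""
    if not 1 <= links_between_a_and_b <= PORTS_PER_SWITCH:
        raise ValueError("need between 1 and 8 links between A and B")
    # Every A-B link removes one node port from each of the two switches.
    return SWITCHES_PER_BOX * (PORTS_PER_SWITCH - links_between_a_and_b)


if __name__ == "__main__":
    print(port_budget(boxes=2, nodes=16))
    print(max_nodes_in_one_box())
```

For two boxes and 16 nodes this gives 32 ports, of which 16 remain for inter-switch wiring, that is at most 8 links, which matches the configuration described in the text. With a single link between switches A and B, one box can host at most 14 nodes.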
