The Peregrine high-performance RPC system

David B. Johnson¹
Willy Zwaenepoel

Department of Computer Science
Rice University
P.O. Box 1892
Houston, Texas 77251-1892

dbj@cs.cmu.edu, willy@cs.rice.edu

A version of this paper appeared in Software: Practice & Experience, 23(2):201–221, February 1993.

This work was supported in part by the National Science Foundation under Grants CDA-8619893 and CCR-9116343, and by the Texas Advanced Technology Program under Grant No. 003604014.

¹ Author's current address: School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213-3891.

Summary

The Peregrine RPC system provides performance very close to the optimum allowed by the hardware limits, while still supporting the complete RPC model. Implemented on an Ethernet network of Sun-3/60 workstations, a null RPC between two user-level threads executing on separate machines requires 573 microseconds. This time compares well with the fastest network RPC times reported in the literature, ranging from about 1100 to 2600 microseconds, and is only 309 microseconds above the measured hardware latency for transmitting the call and result packets in our environment. For large multi-packet RPC calls, the Peregrine user-level data transfer rate reaches 8.9 megabits per second, approaching the Ethernet's 10 megabit per second network transmission rate. Between two user-level threads on the same machine, a null RPC requires 149 microseconds. This paper identifies some of the key performance optimizations used in Peregrine, and quantitatively assesses their benefits.

Keywords: Peregrine, remote procedure call, interprocess communication, performance, distributed systems, operating systems

1. Introduction

The Peregrine remote procedure call (RPC) system is heavily optimized for providing high-performance interprocess communication, while still supporting the full generality and functionality of the RPC model [3, 10], including arguments and result values of arbitrary data types. The semantics of the RPC model provides ample opportunities for optimizing the performance of interprocess communication, some of which are not available in message-passing systems that do not use RPC. This paper describes how Peregrine exploits these and other opportunities for performance improvement, and presents Peregrine's implementation and measured performance. We concentrate primarily on optimizing the performance of network RPC, between two user-level threads executing on separate machines, but we also support efficient local RPC, between two user-level threads executing on the same machine. High-performance network RPC is important for shared servers and for parallel computations executing on networks of workstations.

Peregrine provides RPC performance that is very close to the hardware latency. For network RPCs, the hardware latency is the sum of the network penalty [6] for sending the call and the result message over the network. The network penalty is the time required for transmitting a message of a given size over the network from one machine to another, and is measured without operating system overhead or interrupt latency. The network penalty is greater than the network transmission time for packets of the same size because the network penalty includes additional network, device, and processor latencies involved in sending and receiving packets. Latency for local RPCs is determined by the processor and memory architecture, and includes the expense of the required local procedure call, kernel trap handling, and context switching overhead [2].

We have implemented Peregrine on a network of Sun-3/60 workstations, connected by a 10 megabit per second Ethernet. These workstations each use a 20-megahertz Motorola MC68020 processor and an AMD Am7990 LANCE Ethernet network controller. The implementation uses an RPC packet protocol similar to Cedar RPC [3], except that a blast protocol [20] is used for multi-packet messages. The RPC protocol is layered directly on top of the IP Internet datagram protocol [13].

In this implementation, the measured latency for a null RPC with no arguments or return values between two user-level threads executing on separate Sun-3/60 workstations on the Ethernet is 573 microseconds. This time compares well with the fastest null network RPC times reported in the literature, ranging from about 1100 to 2600 microseconds [3, 12, 8, 15, 17, 19], and is only 309 microseconds above the measured hardware latency defined by the network penalty for the call and result packets in our environment. A null RPC with a single 1-kilobyte argument requires 1397 microseconds, showing an increase over the time for null RPC with no arguments of just the network transmission time for the additional bytes of the call packet. This time is 338 microseconds above the network penalty, and is equivalent to a user-level data transfer rate of 5.9 megabits per second. For large multi-packet RPC calls, the network user-level data transfer rate reaches 8.9 megabits per second, achieving 89 percent of the hardware network bandwidth and 95 percent of the maximum achievable transmission bandwidth based on the network penalty. Between two user-level threads executing on the same machine, a null RPC with no arguments or return values requires 149 microseconds.

In Section 2 of this paper, we present an overview of the Peregrine RPC system. Section 3 discusses some of the key performance optimizations used in Peregrine. In Section 4, we describe the Peregrine implementation, including single-packet network RPCs, multi-packet network RPCs, and local RPCs. The measured performance of Peregrine RPC is presented in Section 5. In Section 6, we quantify the effectiveness of the optimizations mentioned in Section 3. Section 7 compares our work to other RPC systems, and Section 8 presents our conclusions.

2. Overview