High Performance Storage System Scalability: Architecture, Implementation and Experience

Richard W. Watson
Lawrence Livermore National Laboratory
dwatson@llnl.gov

Abstract

The High Performance Storage System (HPSS) provides scalable hierarchical storage management (HSM), archive, and file system services. Its design, implementation and current dominant use are focused on HSM and archive services. It is also a general-purpose, global, shared, parallel file system, potentially useful in other application domains. When HPSS design and implementation began over a decade ago, scientific computing power and storage capabilities at a site, such as a DOE national laboratory, were measured in a few 10s of gigaops, data archived in HSMs in a few 10s of terabytes at most, data throughput rates to an HSM in a few megabytes/s, and daily throughput with the HSM in a few gigabytes/day. At that time, the DOE national laboratories and IBM HPSS design team recognized that we were headed for a data storage explosion driven by computing power rising to teraops/petaops, requiring data stored in HSMs to rise to petabytes and beyond, data transfer rates with the HSM to rise to gigabytes/s and higher, and daily throughput with an HSM to rise to 10s of terabytes/day. This paper discusses HPSS architectural, implementation and deployment experiences that contributed to its success in meeting the above orders-of-magnitude scaling targets. We also discuss areas that need additional attention as we continue significant scaling into the future.

1. Introduction

The High Performance Storage System (HPSS) provides scalable hierarchical storage management (HSM), archive, and file system services. Its design, implementation and current dominant use are focused on HSM and archive services. It is also a general-purpose, global, shared, parallel file system, potentially useful in other application domains. When HPSS design and implementation began over a decade ago, scientific computing power and storage capabilities at a site, such as a DOE national laboratory, were measured in a few 10s of gigaops, data archived in HSMs in a few 10s of terabytes at most, data throughput rates to an HSM in a few megabytes/s, and daily throughput with the HSM in a few gigabytes/day. At that time, the DOE national laboratories¹ and IBM HPSS design team recognized that we were headed for a data storage explosion driven by computing power rising to teraops/petaops, requiring data stored in HSMs to rise to petabytes and beyond, data transfer rates with the HSM to rise to gigabytes/s and higher, and daily throughput with an HSM to rise to 10s of terabytes/day. Therefore, we set out to design and deploy a system that would scale and evolve from the base above toward these expected targets. These targets have been successfully met.

While rapid increases in computational power and memory, storage device capacity, and networking bandwidth have made these increases in storage system capacity and performance possible, this hardware potential cannot be fully realized or exploited without proper attention to software architecture, implementation and deployment. Even assuming new, faster hardware and a properly designed and implemented storage system, successful scaling, particularly for data transfer, is not just a matter of plugging in the new hardware, changing a few configuration settings and running the system. It requires careful attention to all phases of the end-to-end process.

There are many dimensions of scalability to which a storage system architecture and implementation must pay attention. This paper discusses those dimensions and illustrates the architectural approach and some of the implementation choices and deployment experiences that have facilitated achieving scalability in these dimensions. It also discusses some areas where further work is required as the system continues to scale across these dimensions in the future.

¹ Lawrence Livermore (LLNL), Los Alamos (LANL), Lawrence Berkeley-National Energy Research Supercomputer Center (NERSC), Oak Ridge (ORNL), and Sandia (SNL) National Laboratories.

Proceedings of the 22nd IEEE/13th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST 2005)
0-7695-2318-8/05 $20.00 © 2005 IEEE

Scalable data throughput: This dimension focuses on end-to-end I/O throughput, both for single files and for the aggregate throughput of many simultaneous file transfers or I/O operations.

Scalable storage capacity and storage space management: This dimension includes scaling storage capacity, numbers and types of storage devices, and numbers of files and file sizes. It also includes scalable space management for migration and purge of disk cache.

Scalable
robustness: This dimension includes the ability of the system to (1) tolerate or recover from hardware failures without loss of user data or system metadata and (2) maintain the consistency of both user data and system metadata in the face of concurrent accesses during normal operation.

Scalable name service: This dimension for HPSS involves a scalable hierarchical directory service with virtually unlimited numbers of directories and directory entries, and a global name space spanning multiple distributed HPSS systems. It also includes scaling the number of simultaneous directory accesses and access performance.

Scalable numbers of clients: This dimension includes increasing numbers of both end users and internal clients, along with their associated concurrent operations.

Scalable deployment across geographical distances and multiple cooperating institutions: This dimension involves distribution of data storage devices and metadata for performance and robustness, and integration of multiple storage systems into a global name space and secure environment.

Scalable storage system management: This dimension enables system administrators to manage and config