The Google File System (Final)

The Google File System
By Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
(Presented at SOSP 2003)

Introduction
- Google – search engine.
- Applications process lots of data.
- Need good file system.
- Solution: Google File System (GFS).

Motivational Facts
- More than 15,000 commodity-class PCs.
- Multiple clusters distributed worldwide.
- Thousands of queries served per second.
- One query reads hundreds of MB of data.
- One query consumes tens of billions of CPU cycles.
- Google stores dozens of copies of the entire Web!
Conclusion: Need large, distributed, highly fault-tolerant file system.

Topics
- Design Motivations
- Architecture
- Read/Write/Record Append
- Fault-Tolerance
- Performance Results

Design Motivations
1. Fault-tolerance and auto-recovery need to be built into the system.
2. Standard I/O assumptions (e.g. block size) have to be re-examined.
3. Record appends are the prevalent form of writing.
4. Google applications and GFS should be co-designed.

GFS Architecture (Analogy)
- On a single-machine FS:
  - An upper layer maintains the metadata.
  - A lower layer (i.e. disk) stores the data in units called “blocks”.
  - Upper layer store
- In the GFS:
  - A master process maintains the metadata.
  - A lower layer (i.e. a set of chunkservers) stores the data in units called “chunks”.

GFS Architecture
[Diagram: a Client, a Master holding the metadata, and Chunkservers each running on a Linux FS. Client and Master exchange (request for metadata) and (metadata response). Client and Chunkserver exchange (read/write request) and (read/write response).]

GFS Architecture: What is a chunk?
- Analogous to block, except larger.
- Size: 64 MB!
- Stored on chunkserver as file.
- Chunk handle (~ chunk file name) used to reference chunk.
- Chunk replicated across multiple chunkservers.
- Note: There are hundreds of chunkservers in a GFS cluster distributed over multiple racks.

GFS Architecture: What is a master?
- A single process running on a separate machine.
- Stores all metadata:
  - File namespace
  - File to chunk mappings
  - Chunk location information
  - Access control information
  - Chunk version numbers
  - Etc.

GFS Architecture: Master-Chunkserver Communication
- Master and chunkserver communicate regularly to obtain state:
  - Is chunkserver down?
  - Are there disk failures on chunkserver?
  - Are any replicas corrupted?
  - Which chunk replicas does chunkserver store?
- Master sends instructions to chunkserver:
  - Delete existing chunk.
  - Create new chunk.

GFS Architecture: Serving Requests
- Client retrieves metadata for operation from master.
- Read/Write data flows between client and chunkserver.
- Single master is not a bottleneck, because its involvement with read/write operations is minimized.

Overview
- Design Motivations
- Architecture
  - Master
  - Chunkservers
  - Clients
- Read/Write/Record Append
- Fault-Tolerance
- Performance Results

And now for the Meat…

Read Algorithm
[Diagram, steps 1-3: Application -> GFS Client: (file name, byte range). GFS Client -> Master: (file name, chunk index). Master -> GFS Client: (chunk handle, replica locations).]
[Diagram, steps 4-6: GFS Client -> one of three Chunk Servers: (chunk handle, byte range). Chunk Server -> GFS Client: (data from file). GFS Client -> Application: (data from file).]

Read Algorithm
1. Application originates the read request.
2. GFS client translates the request from (filename, byte range) -> (filename, chunk index), and sends it to master.
3. Master responds with chunk handle and replica locations (i.e. chunkservers where the replicas are stored).
4. Client picks a location and sends the (chunk handle, byte range) request to that location.
5. Chunkserver sends requested data to the client.
6. Client forwards the data to the application.

Read Algorithm (Example)
[Diagram, steps 1-3: Indexer -> GFS Client: (crawl_99, 2048 bytes). GFS Client -> Master: (crawl_99, index: 3). Master -> GFS Client: (ch_1003, {chunkservers: 4, 7, 9}). Master's table for crawl_99: Ch_1001 {3, 8, 12}; Ch_1002 {1, 8, 14}; Ch_1003 {4, 7, 9}.]

Read Algorithm (Example)
Calculating chunk index from byte range:
(Assumption: File position is 201,359,161 bytes)
- Chunk size = 64 MB.
- 64 MB = 1024 * 1024 * 64 bytes = 67,108,864 bytes.
- 201,359,161 bytes = 67,108,864 * 3 + 32,569 bytes.
- So, client translates 2048 byte range -> chunk index 3.

Read Algorithm (Example)
[Diagram, steps 4-6: Application, GFS Client, Chunk Server #4, Chunk Server #7, Chunk Server #9. Labels: (ch_1003, {chunkservers: 4, 7, 9}) on the request from the GFS Client; (2048 bytes of data) returned to the GFS Client; (2048 bytes of data) forwarded to the Application.]

Write Algorithm
[Diagram, steps 1-3: Application -> GFS Client: (file name, data). GFS Client -> Master: (file name, chunk index). Master -> GFS Client: (chunk handle, primary and secondary replica locations).]
[Diagram, step 4: GFS Client sends (Data) to the Primary and to both Secondaries. Each has a Buffer and a Chunk.]
[Diagram, steps 5-7: the buffers of the Primary and both Secondaries hold D1 | D2 | D3 | D4. GFS Client -> Primary: (Write command). Primary -> Secondaries: (write command, serial order).]
[Diagram, steps 8-9: all buffers are (empty). Secondaries -> Primary: (response). Primary -> GFS Client: (response).]

Write Algorithm
1. Application originates write request.
2. GFS client translates request from (filename, data) -> (filename, chunk index), and sends it to master.
3. Master responds with chunk handle and (primary + secondary) replica locations.
4. Client pushes write data to all locations. Data is stored in chunkservers’ internal buffers.
5. Client sends write command to primary.

Write Algorithm
6. Primary determines serial order for data instances stored in its buffer and writes the instances in that order to the chunk.
7. Primary sends serial order to the secondaries and tells them to perform the write.
8. Secondaries respond to the primary.
9. Primary responds back to client.
Note: If write fails at one of the chunkservers, client is informed and retries the write.

Record Append Algorithm
Important operation at Google:
- Merging results from multiple machines in one file.
- Using file as producer-consumer queue.
1. Application originates record append request.
2. GFS client translates request and sends it to master.
3. Master responds with chunk handle and (primary + secondary) replica locations
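
The translation step in the Read Algorithm example above (file position to chunk index, then a master lookup) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary standing in for the master's metadata, and the zero-based chunk index obtained by integer division are all hypothetical and not taken from GFS itself. Only the 64 MB chunk size, the file position, and the crawl_99 / ch_1003 / {4, 7, 9} values come from the slides.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB = 67,108,864 bytes, as in the slides

# Hypothetical stand-in for the master's metadata:
# file name -> {chunk index: (chunk handle, replica chunkserver IDs)}.
# Only the entry for index 3 is filled in, using the values from the example.
MASTER_TABLE = {
    "crawl_99": {3: ("ch_1003", [4, 7, 9])},
}


def chunk_index(position):
    """Index of the chunk containing this byte position (assumed zero-based)."""
    return position // CHUNK_SIZE


def chunk_offset(position):
    """Byte offset of this position inside its chunk."""
    return position % CHUNK_SIZE


def master_lookup(filename, index):
    """Steps 2-3: ask the (simulated) master for handle and replica locations."""
    return MASTER_TABLE[filename][index]


def plan_read(filename, position, length):
    """Steps 2-4: translate the request and pick a replica to read from.

    Assumes the byte range does not cross a chunk boundary.
    """
    index = chunk_index(position)
    handle, replicas = master_lookup(filename, index)
    return {
        "handle": handle,
        "replica": replicas[0],  # assumption: simply pick the first replica
        "offset": chunk_offset(position),
        "length": length,
    }


if __name__ == "__main__":
    print(chunk_index(201_359_161), chunk_offset(201_359_161))  # 3 32569
    print(plan_read("crawl_99", 201_359_161, 2048))
```

Picking the first replica is a placeholder choice; the slides only say that the client picks a location.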
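
Steps 4 to 9 of the Write Algorithm above can be mimicked with a small in-memory simulation. It is a sketch, not GFS code: the class and function names are hypothetical, there is no networking or failure handling, and the primary's serial order is assumed here to be a simple sort by data ID, whereas the slides only say that the primary determines some order.

```python
class Chunkserver:
    """A replica with an internal buffer and an ordered chunk (both in memory)."""

    def __init__(self, name):
        self.name = name
        self.buffer = {}  # data ID -> bytes pushed by the client (step 4)
        self.chunk = []   # data written to the chunk, in applied order

    def push(self, data_id, data):
        """Step 4: store pushed data in the internal buffer."""
        self.buffer[data_id] = data

    def apply(self, order):
        """Write buffered data to the chunk in the given serial order."""
        for data_id in order:
            self.chunk.append(self.buffer.pop(data_id))
        return True  # acknowledgement (step 8 for secondaries)


class Primary(Chunkserver):
    def __init__(self, name, secondaries):
        super().__init__(name)
        self.secondaries = secondaries

    def write(self):
        """Steps 6-9: choose an order, apply it locally, forward it, reply."""
        order = sorted(self.buffer)  # assumption: order by data ID
        self.apply(order)                                   # step 6
        acks = [s.apply(order) for s in self.secondaries]   # steps 7-8
        return all(acks)                                    # step 9


def client_write(primary, items):
    """Steps 4-5: push data to every replica, then send the write command."""
    for data_id, data in items:
        for server in [primary, *primary.secondaries]:
            server.push(data_id, data)
    return primary.write()


if __name__ == "__main__":
    secondaries = [Chunkserver("secondary-1"), Chunkserver("secondary-2")]
    primary = Primary("primary", secondaries)
    client_write(primary, [("D2", b"b"), ("D1", b"a")])
    print(primary.chunk, [s.chunk for s in secondaries])
```

Because every replica applies the same order, all chunks end up with identical contents and the buffers are empty afterwards, matching the last write diagram.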
