NCStore: A P2P Backup System Based on Network Coding

School of Computer Science and Technology
Li Jun (李钧)
Advisor: Chen Xueqing (陈学青)

Abstract: As the capacity of storage devices keeps growing, the demand for effective backup and protection of important data is also increasing rapidly. The traditional C/S model suffers from high hardware requirements, high maintenance requirements, and low bandwidth utilization, whereas the peer-to-peer (P2P) storage model, being distributed, can use the storage and bandwidth resources in the network and thus provide cheap, fast, and reliable backup storage. Using network coding during data transmission, where intermediate nodes encode the data, can improve network throughput and robustness. In a storage system, network coding can therefore provide high data availability and also reduce the bandwidth cost of regeneration when data stored in the system is lost. In this paper, we first analyze how to choose the parameters of a network coding scheme for a storage system so as to avoid the problem of linear dependence. Based on this analysis, we design and implement NCStore, a data backup system that uses network coding. Experimental results in a LAN show that, compared with replication and erasure codes, NCStore takes less time to regenerate data. In addition, network coding maintains data reliability while reducing bandwidth usage.

Keywords: network coding, P2P, backup system

1. Introduction

In this paper, we focus on the question of how to store important data in the network and keep it durable, or reliable, at low cost. The basic principle behind all solutions is redundancy. With the rapid increase in hard disk capacity, a disk commonly has free space in which users can back up important data at an acceptable cost. However, a hard disk is vulnerable, and the data saved on it may be lost because of disk failure. It is therefore natural to store data on other hard disks as well. One example is RAID [9]. RAID is reliable and cheap, but the data can never be recovered if the computer is physically destroyed. Data is safer when its copies are geographically separated, so network storage is necessary.

The traditional architecture of network storage is the client/server architecture. All the data that clients intend to back up is transmitted to the server. The client does nothing except send operation requests to the server and exchange data with it. The server stores all the data of every client, so it needs a large amount of free disk space. To guarantee data reliability, the server must also be well managed. These two characteristics result in a high cost at the server, even though it can serve many clients, and they leave the free disk space of users wasted. In addition, as the number of clients scales up, each client's share of the server's bandwidth is further limited.

Peer-to-peer (P2P) technology demonstrates excellent scalability and has been widely used in network applications such as content distribution [6]. In a P2P network, a client connects to other clients and uses their resources; that is, it gets service from other clients. Considering that almost all clients in the C/S architecture have free disk space, it is natural to exploit the benefits of a P2P network storage system. Moreover, because transmission takes place among clients, the P2P network exploits the bandwidth of the network and balances the load.

In a P2P network, there is no management of the states of the nodes. In other words, at any time a node may be active or inactive, and it can change its state at any moment. In addition, clients do not coordinate with one another: the states and actions of the nodes in the P2P network are unrelated to each other. So when a node is inactive, the data saved on that node becomes temporarily unavailable. This does not mean that the data has been lost, because the data is still stored correctly on the node, and the node may become active some time later, making the data available again.

In a P2P storage network, data access failures can be classified as transient failures and permanent failures. As stated above, a transient failure occurs when the relevant node leaves the network temporarily while the data on it remains safe. A permanent failure happens when the data is corrupted for some reason: for example, the computer catches fire, the hard disk crashes, or the data is deleted by mistake. How to detect permanent failures is critical to the performance of P2P network storage. The Carbonite algorithm was proposed in [2] to maintain replicas at low cost. In our system, we propose a modified Carbonite algorithm that suits our coding scheme.

2. Background and Related Work

2.1 Network Coding

Network coding is a recent field proposed by Ahlswede et al. [1] that can improve network utilization. The central idea of network coding is that intermediate nodes can encode the data they receive. Network coding can use the resources in the network optimally. Moreover, network coding is suited to decision making when only partial information about the network is available. The benefits of network coding have been proved theoretically. First, network coding can achieve throughput gains in static environments. Second, network coding can improve the robustness and adaptability of the network. Applications of network coding include P2P file distribution [6], wireless networks, etc. Most previous work on network coding is based on theoretical calculation. However, little effort has been made to put network coding into practice, especially in storage systems.

2.