

Learn More, Sample Less: Control of Volume and Variance in Network Measurement

Nick Duffield, Carsten Lund, Mikkel Thorup
AT&T Labs–Research, 180 Park Avenue, Florham Park, NJ 07932, USA
E-mail: {duffield,lund,mthorup}@research.att.com

Abstract

Consider a set of objects, each endowed with a size and a color. We wish to estimate the sums $X(c) = \sum_{i:\, c_i = c} x_i$ of the sizes of objects of a given color, from a sampled subset of objects. How should the sampling distribution be chosen in order to jointly control both the variance of the estimators $\hat{X}(c)$ of $X(c)$ and the number of samples taken? This problem is motivated from network measurement, in which the $x_i$ are the byte sizes of traffic flows reported by routers, and the $c_i$ are the common properties of the packets of the flow, e.g., source and destination IP address. In this paper we propose a sampling scheme that optimally controls the volume of the measurements, and the variance of unbiased usage estimates $\hat{X}(c)$, while retaining usage detail down to the finest level of granularity in the colors. We provide algorithms for dynamic control of sample volumes and evaluate them on flow data gathered from a commercial IP network. The algorithms are simple to implement and robust to variation in network conditions. The work reported here has been applied in the measurement infrastructure of the commercial IP network. To not have employed sampling would have entailed an order of magnitude greater capital expenditure to accommodate the measurement traffic and its processing.

Keywords: Internet Measurement, Flows, Sampling, Estimation, Variance Reduction

1 Introduction

Consider the following estimation problem. We are given a set of objects $i = 1, 2, \ldots, n$, each endowed with a size $x_i \ge 0$ and a color $c_i$. We wish to estimate the sums $X(c) = \sum_{i:\, c_i = c} x_i$ of the sizes of objects of a given color $c$, from a sampled subset of objects. How should the sampling distribution be chosen in order to jointly control both the variance of the estimators $\hat{X}(c)$ of $X(c)$ and the number of samples taken?

This is an abstract version of a practical problem that arises in estimating usage of network resources due to different users and applications. The usage is determined from network measurements, and sampling is employed to control the resources consumed by the measurements themselves. We start this paper by explaining the motivation behind the stated sampling problem, and showing how the constraints imposed by the intended use of the measurements lead us to employ flow sampling.

1.1 Motivation

The need for detailed network usage data. The collection of network usage data is essential for the engineering and management of communications networks. Until recently, the usage data provided by network elements (e.g. routers) has been coarse-grained, typically comprising aggregate byte and packet counts in each direction at a given interface, aggregated over time windows of a few minutes. However, these data are no longer sufficient to engineer and manage networks that are moving beyond the undifferentiated service model of the best-effort Internet. Network operators need more finely differentiated information on the use of their network. Examples of such information include (i) the relative volumes of traffic that use different protocols or applications; (ii) traffic matrices, i.e., the volumes of traffic originating from and/or destined to given ranges of Internet Protocol (IP) addresses or Autonomous Systems (AS's); (iii) the packet and byte volumes and durations of user sessions, and of the individual flows of which they are composed. Such information can be used to support network management, in particular: traffic engineering, network planning, peering policy, customer acquisition, usage-based pricing, and network security; some applications are presented in detail in [2, 11, 12]. An important application of traffic matrix estimation is to efficiently redirect traffic from overloaded links. Using this to tune OSPF/IS-IS routing one can typically accommodate 50% more demand; see [15].

Satisfying the data needs of these applications requires gathering usage data differentiated by IP header fields (e.g. source and destination IP address and Type of Service), transport protocol header fields (e.g. source and destination TCP/UDP ports), router information specific to a packet (e.g. input/output interfaces used by a packet), information derived from these and routing state (e.g. source and destination AS numbers), or combinations of these. Collecting the packet headers themselves as raw data is infeasible due to volume: we expect that a single direction of an OC48 link could produce as much as 100 GB of packet headers per hour, an estimate based on statistics collected for the experiments reported later in this paper.

Flow level statistics. In this paper we focus on a measurement paradigm that is widely deployed in the current Internet and that offers some reduction in the volume of gathered data. This is the collection, by routers or dedicated traffic monitors, of IP flow statistics, which are then exported to a remote collection and analysis system. (Some alternative paradigms are reviewed in Section 8.) Most generally, an IP flow is a set of packets that are observed in the network within some time period, and that share some common property. Particularly interesting for us are "raw" flows: a set of packets observed at a given network element, whose common property is the set of values of those IP header fields that are invariant along a packet's path. (For example, IP addresses are included, Time To Live is not.) The common property may include joins with state information at the observation point, e.g., next hop IP address as determined by destination IP address, and routing policy. More generally, the common property may aggregate ranges of IP header fields (e.g. over so
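
As a minimal sketch of the estimation problem stated in the Introduction, the Python fragment below computes the exact per-color sums $X(c)$ and an unbiased estimate $\hat{X}(c)$ from an independently sampled subset. It assumes the standard inverse-probability weighting (each sampled size is divided by its sampling probability). The flow records, the function names, and the uniform sampling probability `uniform_half` are all hypothetical placeholders chosen for illustration; in particular, the sampling distribution used here is not the scheme proposed in this paper.

```python
import random
from collections import defaultdict


def exact_sums(flows):
    """Return X(c): the total size of all flows with color c."""
    totals = defaultdict(float)
    for color, size in flows:
        totals[color] += size
    return dict(totals)


def sample_and_estimate(flows, prob, rng):
    """Sample each flow independently with probability prob(size).

    Each sampled size is scaled by 1 / prob(size), so every per-color
    estimate is unbiased as long as prob(size) > 0 for all flows.
    Returns (estimates by color, number of flows sampled).
    """
    estimates = defaultdict(float)
    kept = 0
    for color, size in flows:
        p = prob(size)
        if rng.random() < p:
            estimates[color] += size / p
            kept += 1
    return dict(estimates), kept


def mean_estimates(flows, prob, runs, seed=0):
    """Average the estimates over many independent sampling runs."""
    rng = random.Random(seed)
    mean = defaultdict(float)
    for _ in range(runs):
        estimates, _kept = sample_and_estimate(flows, prob, rng)
        for color, value in estimates.items():
            mean[color] += value / runs
    return dict(mean)


def uniform_half(size):
    """Placeholder sampling distribution (an assumption): ignore the size."""
    return 0.5


# Hypothetical flow records: (color, size in bytes).
flows = [("web", 1500), ("web", 40), ("dns", 80), ("web", 900_000), ("dns", 120)]

if __name__ == "__main__":
    print("exact:", exact_sums(flows))
    print("mean of estimates:", mean_estimates(flows, uniform_half, runs=20000))
```

Averaged over many runs the estimates approach the exact sums, but a single run with a size-independent probability can be far off whenever one large flow dominates a color. That sensitivity is one way to see why the choice of sampling distribution matters for the variance.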
