ReverseHashingforHigh-speedNetworkMonitoring:Algorithms,Evaluation,andApplicationsRobertSchweller1,ZhichunLi1,YanChen1,YanGao1,AshishGupta1,YinZhang2,PeterDinda1,Ming-YangKao1,GokhanMemik11LabforInternetandSecurityTechnology(LIST),NorthwesternUniv.2UniversityofTexasatAustinTheSpreadofSapphire/SlammerWormsMotivation(onlinechangedetection)•Onlinenetworkanomaly/intrusiondetectionoverhighspeedlinks–Smallmemoryusage–Small#ofmemoryaccessperpacket–Scalabletolargekeyspacesize•Primitivesforonlineanomalydetection–Heavyhitters(lotsofpriorwork)–Heavychanges:enablerforaggregatequeriesovermultipledatastreams•Asymmetricroutingdemandsspatialaggregation•TimeSeriesAnalysis(TSA)needtemporalaggregationOutline•Backgroundonk-arysketch•Reversiblesketchproblem•Modularhashing•IPmangling•Reversehashing•Evaluation•Conclusion[Krishnamurthy,Sen,Zhang,Chen,2003]Firsttodetectflow-levelheavychangesinmassivedatastreamsatnetworktrafficspeedsK-arysketch1jH01K-1………k-arysketch1jH01K-1………hj(k)hH(k)h1(k)Update(k,u):Tj[hj(k)]+=u(forallj)Estimatev(S,k):sumofupdatesforkeykKKsumkhTjjj/11/)]([median[Krishnamurthy,Sen,Zhang,Chen,2003]APIs:+=abS=COMBINE(a,S1,b,S2):??•Mainproblem–CannotefficientlyreportkeyswithheavychangeINFERENCE(S,t)–Importantfunctionforanomalydetection!•OurContribution–Determinesetofkeysthathave“large”estimatesinasketchReverseSketchProblemReversiblesketchframeworkStreamingdatarecordingreversiblek-arysketchvaluestoredvalueModularhashingIPmanglingkeyHeavychangedetectionreversiblek-arysketchReversehashingReverseIPmanglingheavychangekeyschangethresholdOutline•Backgroundonk-arysketch•Reversiblesketchproblem•Modularhashing•IPmangling•Reversehashing•Evaluation•Conclusion•IntersectA1,A2,A3,A4,A5TakingIntersectionsH=5K=212#keys=232(IPaddresses)E[falsepositives]1Theproblemwithsimpleintersection•EachsetAicanbeverylarge!H=5K=212#keys=232(IPaddresses)|A1|=232/212=220Theproblemwithsimpleintersection•EachsetAicanbeverylarge!•Solution:ModularhashingModularhashingreducesthesetsize32bits8bits10010100101010111001010110100011010110001101h()12bitsModularhashingreducesthesetsize32bits8bits10010100101010111001010110100011h1()h2()h3()h4()010110001101010110001101GreatlyreducessizeofreversemappedsetsModularhashingreducesthesetsize12354b1b2b4b5b3A1:25*25*25*25Intersection:Only32elementsperwordset12354b1b2b4b5b3A1:25*25*25*25A2:25*25*25*25Intersection:ModularhashingreducesthesetsizeProblem:Toomanycollisions129.105.56.23129.105.56.28129.105.56.109129.105.56.35129.105.56.98...7.4.0.*32bits12bitsProblem:Toomanycollisions129.105.56.23129.105.56.28129.105.56.109129.105.56.35129.105.56.98...7.4.0.*32bits12bitsIPManglingwithGF(GaloisExtensionField)Solution:IPMangling:abijectivemappingfunctionforbreakingthekeyspacecontinuityOutline•Backgroundonk-arysketch•Reversiblesketchproblem•Modularhashing•IPmangling•Reversehashing•Evaluation•ConclusionHandlingMultipleIntersections…12354b1b2b4b5b3b3b1b2b4b52HdifferentintersectionsMuchmoredifficult–Solution:ReverseHashingalgorithms•Step1:Reversehashingforeachmodule•Step2:InferthewholekeythroughbucketindexmatchingamongcandidatesfromeachmoduleReverseHashingforEachModule12354H=5,r=1,K=212rtolerancelevel41231221211212AAAAA32wijA}5,2{}3,2{11211111AAGiirGI11candidatesetofthefirstwordinHashtableiAllpossiblevaluesofthefirstwordinthesketch1iGTakethefirstwordasanexample}3,2{}3,0{13213113AAG}10,9{}6,2{12212112AAG}8,2{}10,3{14214114AAG}9,6{}7,3{15215115AAG{2,3,5}{2,6,9,10}{0,2,3}{2,3,8,10}{3,6,7,9}{2}{2,3}BucketIndexMatrixofCandidatesH=5,r=1,K=212ForeachxinI1,wecangetB1(x),avectoroftheheavybucketsetswhichxhashesto.192.168.0.112354b11b21b42b51b32b31b12b22b41b5212354b11b21b42b51b32b31b12b22b41b52192.123.47.6212354b11b21b42b51b32b31b12b22b41b52192.*.*.*hashtotheredheavybuckets52514241322112111,,,)192(bbbbbbbbBPrefixExtensionAlgorithmI1I2B1B21504723636,3,19,4,115,241153315,27,3,22721048,7,359,45,12,19,3126,22,1+=150.72}8,7,3{}3{}5{}6,3,1{}9,4{}9,4,1{}5,1{}1{}2,1{}5,2{3*9,41247.72***5**morethanr=1Ignore!236.10431222Ignore!Pathdiscoveryalgorithm150.723*9,412236.10431222+=150.72.1823*412236.104.4931222150.72.323*9121823249314,312,137,19123126,22I3B3+=759,5,3142,12I4B43*412150.72.182.7531*22236.104.49.75PrefixExtensionAlgorithmRecap:Streamingdatarecordingreversiblek-arysketchvaluestoredvalueModularhashingIPmanglingkeyHeavychangedetectionreversiblek-arysketchReversehashingReverseIPmanglingheavychangekeyschangethreshold)(loglog/1nn)logloglog(nnnisthesizeofkeyspaceOutline•Backgroundonk-arysketch•Reversiblesketchproblem•Modularhashing•IPmangling•Reversehashing•Evaluation•ConclusionEvaluation•Dataset–AlargeUSISP(330MNetflowrecords)–NU(19MNetflowrecords)•Ef