基于时间衰减模型的数据流频繁模式挖掘

整理文档很辛苦,赏杯茶钱您下走!

免费阅读已结束,点击下载阅读编辑剩下 ...

阅读已结束,您可以下载文档离线阅读编辑

资源描述

XXXVol.XX,No.X200XXACTAAUTOMATICASINICAMonth,200X.,.,,;,,,..,MSW,(90%),,,.,,,DOI10.3724/SP.J.1004.2008.xxxxxMiningFrequentPatternsoverDataStreamUndertheTimeDecayingModelWUFengZHONGYanWUQuan-yuanAbstractFrequentpatternminingisanimportanttaskinminingdatastreams.Consideringthetimelinessofdatastreamandtheshiftofthestreamcenter,weproposeanalgorithmofdatastreamfrequentpatternmining,namedDFPMiner,combiningwithboththelandmarkwindowandthetimedecayingmodel.Bydynamicallyconstructuringtheglobalpatterntree,themethodusestimeexponentialdecayfunctiontoaccountthesupportofeachpattern,withinthelandmarkwindow,whosefrequentdegreeisdescribed.Moreover,toreducethespacecoste®ectively,theconstructedpruningthresholdfunctionisusedtocutintimethepatternthatisnotabletoturnintothefrequentpatternfromtheglobaltree.Thispaperdeeplyanalyzestheimportantparametersandthresholdsinthealgorithm.Theexperimentalresultsshowthat,comparingwiththealgorithmMSW,DFPMinerhashigherminingprecision(over90%inaverage)andlowerstoragecost,anditcanreachtherequestofprocessinghigh-speeddatastreamsintheexecutionspeed.Ourmethodcanbeappliedtothestreamswithdi®erentnumberoftransactions,averagetransactionsizesandmaximalpotentiallyfrequentpatternsizes.KeywordsDatastream,frequentpatternmining,datamining,timedecayingmodel,,,,[1].,&,[1;2].,.,,,XXXX-XX-XXXXXX-XX-XXManuscriptreceivedMonthDate,Year;acceptedMonthDate,Year(863)(2006AA01Z451,2007AA01Z474)SupportedbyNationalHigh-TechResearchandDevelopmentPlanofChina(2006AA01Z451,2007AA01Z474)410073SchoolofComputers,NationalUniversityofDefenseTechnol-ogy,Changsha410073DOI:10.3724/SP.J.1004.2009.01063.,[2]:,;,;,..,.,,;,,...1:2XX1.[3].,,,.[4]Sticky.[5]Sticky,,.Sticky,[6;7],,.2.[3].,,.[4]Lossy,,,,,.,[8,9],,,.,[10;11].[11]FP-stream,FP-Growth,.[12],MSW.,,.,[13]SS-BESS-MB,,,.2,:I=fi1;i2;¢¢¢;iMg,Á();DS=fTA1;TA2;¢¢¢;TAj;¢¢¢g,TAjµI;sup(P)P(PµI),jsup(P)j=N¸s[3],,N,s.[3],,.1[12]..TA=fa1;a2;¢¢¢,akgÁTA=fa01;a02;¢¢¢;a0kgTA.,.2[12].TA=fa1;a2;¢¢¢;akgP=fb1;b2;¢¢¢;bmg(m·k),8i,1·i·mai=bi,PTAm-.,,.,,.,[12].tf(t)=2¡¸t,,¸0.,.3..P,Time(P).tc,:F(P;tc)=PTj2Time(P)2¡¸(tc¡Tj).,,.,,.,,.4..tc,PTimeo(P);P:D(P;tc)=XTj2Timeo(P)2¡¸(tc¡Tj):5..tc,PTimew(P);tc,P:Da(P;tc)=XTj2Timew(P)2¡¸(tc¡Tj):5,3:6.,.PDm,(FPattern);Dl,(SPattern);(PPattern);PPatternFPattern(fSPattern).33.1[12]SW-tree,Global-tree.:1.Global-tree6:item;D(P;t);tP;tl;status;nodelink,,X:3tPPGlobal-tree;tlP;status4,(0)(1)(-1)(-2).SW-tree,Global-treeSW-tree4tP;status.2.Global-treeItemHead-Table:item,headofnodelink,SW-tree4,tidcount.,.,Global-tree,SW-tree,[12].1.1Global-treeFig.1ExampleofGlobal-tree1.Global-tree;..Global-tree,,P,P1,PµP1,P;P1,P1P,.,P,P1;,P1,P.,,Global-tree,.,(),,(,).,.Global-tree,;,(:Global-tree,,,6,).,,,,,().¤3.2,Global-Tree,,.,,;,,.,.,,:1)(OPattern).,,SPattern.,t,OPatternDa(P;t)Dl.POPattern,P1,PµP1,P1OPattern.2)(TPattern).fSPattern,SPattern,.,tc,t0(0tc¡t0T¾),TDl,TPattern:Da(P;t)Dl(tc·ttc+TDl),Da(P;t0)¸Dl,Da(P;tc+TDl)¸Dl.PTPattern,P1,PµP1,P1SPattern.3)(YPattern).,,.,tc,tf(0tc¡tfT,T),TDl,YPattern:Da(P;t)Dl(tf·ttc+TDl),Da(P;tc+TDl)¸Dl.PYPattern,P1,PµP1,4XXP1YPatternOPattern.4)(APattern).fSPattern,,SPattern;T(TTDl),,.tc,t0(tc¡t0T¾),APattern:Da(P;t)Dl(tc¡T¾·ttc+T),Da(P;t0)¸Dl,Da(P;tc+T)¸Dl.PAPattern,P1,PµP1,P1OPatternAPattern.,,,,,.,()TDl,.,,TDl,(2).±1,±2,:TDl2[minf±2;±1g;maxf±2;±1g].2.P,TD1.tc,Da(P;tc)Dl,[tc;tc+TD1]Da(P;t)Dl.[tc;tc+TDl],P,tc+TDl.3.P,fSPatternSPa-ttern:±0=¹1¸log2(DlDl¡1)º;FPatternSPattern:±2=¹1¸log2(DmDl)º;SPatternFPattern:±1=¹1¸log2((1¡2¡¸)Dl¡1(1¡2¡¸)Dm¡1)º;fSPattern:±3=¹1¸log2(11¡(1¡2¡¸)Dl)º:,(5).4.Ptm,tP.tc,PSPattern,±0+±3tc¡tP0,PfSPatternt0:tc¡t0¸±0+2TD1.TD1,±0+2TD1,,T¾·±0+2TD1.P,4,tcP±0+±3tc¡tP0,PTPattern;,,tcPtc¡tP¸T,YPattern.,:5.PtP.tc,PSPattern,±0+±3tc¡tP¸T,POPatternAPattern.2-5,.,³,tDl.7.PtP,tc,³:³(tP;tc)=(1¡2¡¸(tc¡tP+1))Dl.6.Ptm,tP.tc,,Da(P;tc)Dl.(1)D(P;tc)³(tP;tc);(2)tc=NTD1(N),0tc¡tPTD1,³(tP;tc)·D(P;tc)·(1¡2¡¸TDl)Dl;(3)t1=NTD1(N),0tP¡t1TD1,D(P;tc)·(1¡2¡¸(tc¡tl))Dl;(4)P,P1(P1µP),:Da(P1;tc)Dl.3.3DFPMiner,;,Global-tree;,,;,DFPMiner,:DFPMiner(2),TA.DFPMiner(3-12),,ItemHeadTableitem,Global-treeitemnode,:1)node(6),4,,01-1-2;2)node:(1)-22,,(7-9);(2)-15,,(10-12).DFPMiner(13-17),,Global-treefSPatterns,Global-tree,DFPMiner.X:5AlgAlgAlgorithm:DFPMinerAlgorithmorithm:DFPMinerAlgorithmorithm:DFPMinerAlgorithm1GetthearrivingtransactionTattimetc;2Merge(T,Root,tc);3IfIfIf(tc%TD1)==0thenthenthen4ForForForeachiteminItemHeadTable5ForForForeachnodeinGlobal-treeonthenodelinkofitemwherewherewherenode.item=item6Update(node.(item;D(P;t);tp;tl;status;nodelink),tc);7IfIfIfnode:status==-2thenthenthen8IfIfIftc¡node:tlTD1thenthenthen9DeletenodeinGlobal-tree;10ElseIfElseIfElseIfnode:status==-1thenthenthen11IfIfIf±0+±3tc¡node:tP¸Tthenthenthen12DeletenodeinGlobal-tree;13IfIfIftherequestforresultarrivesthenthenthen14IfIfIf(tc%TD1)==0thenthenthen15UseGlobal-treetogenerateresult;16ElseElseElse17KeepthecurrentGlobal-treeunchanged,andthenusethemto¯ndthefrequentpatternsuptodatetogenerateresult.44.1Global-tree,C.[14],TD1,l=TD1.T,N;ni(1·i·N)CN¡i+1;N¡i+1Nmi;,,.:jXi=1mini·jl(j=1;2;¢¢

1 / 11
下载文档,编辑使用

©2015-2020 m.777doc.com 三七文档.

备案号:鲁ICP备2024069028号-1 客服联系 QQ:2149211541

×
保存成功