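Proceedings of the CSEE, Vol. 36 No. 2, Jan. 20, 2016, p. 379
©2016 Chin. Soc. for Elec. Eng.
DOI: 10.13334/j.0258-8013.pcsee.2016.02.008
Article ID: 0258-8013(2016)02-0379-09
CLC number: TM 715

Anomaly Detection for Power Consumption Patterns Based on Unsupervised Learning

ZHUANG Chijie 1, ZHANG Bin 2, HU Jun 1, LI Qiushuo 3, ZENG Rong 1

(1. State Key Lab of Control and Simulation of Power Systems and Generation Equipments (Dept. of Electrical Engineering, Tsinghua University), Haidian District, Beijing 100084, China; 2. Northwest Branch of State Grid Corporation China, Xi'an 710048, Shaanxi Province, China; 3. CSG Electric Power Research Institute, Guangzhou 510080, Guangdong Province, China)

ABSTRACT: The primary purpose of anomaly detection for power consumption patterns is to lower non-technical losses (NTL) and thereby reduce the operating costs of power utilities. A model based on unsupervised learning is proposed to detect anomalous consumption patterns. The model suits load datasets for which no labeled training set is available. It comprises modules for feature extraction, principal component analysis (PCA), grid processing, and calculation of the local outlier factor (LOF). First, various features are extracted from the load profiles to characterize the consumption patterns of the customers. PCA is then used to map the customers onto a two-dimensional plane, which benefits both data visualization and LOF calculation. The grid processing step screens the data for low-density regions and thus improves computational efficiency. The output of the model is the degree of abnormality of every customer's consumption pattern. The results indicate that, using this abnormality ranking, inspecting the customers with the highest LOF ranks finds most of the abnormal consumption patterns.

KEY WORDS: power consumption patterns; power big data; anomaly detection; unsupervised learning; local outlier factor; anti-stealing of power energy

0 Introduction

Non-technical losses (NTL) and the existing approaches to detecting them are discussed in [1-12]. The model in this paper is built on the local outlier factor (LOF).

1 Local outlier factor

1.1 LOF

Fig. 1 Schematic diagram for local outlier factor

The following definitions are taken from [13].

Definition 1 (k-distance of p). For a dataset D and an object o in D, the distance d(p, o) is the k-distance of p, written k_dist(p), if
1) at least k objects o' in D\{p} satisfy d(p, o') ≤ d(p, o);
2) at most k-1 objects o' in D\{p} satisfy d(p, o') < d(p, o).

Definition 2 (k-distance neighborhood of p).

$$N_k(p) = \{\, q \in D \setminus \{p\} \mid d(p, q) \le k\_dist(p) \,\} \tag{1}$$

Definition 3 (reachability distance of p with respect to o).

$$reach\_dist_k(p, o) = \max\{\, k\_dist(o),\ d(p, o) \,\} \tag{2}$$

Fig. 2 Schematic diagram for reachability distance (k = 3; points A, B, C, D)

Definition 4 (local reachability density of p).

$$lrd_{MinPts}(p) = 1 \Big/ \left[ \frac{\sum_{o \in N_{MinPts}(p)} reach\_dist_{MinPts}(p, o)}{|N_{MinPts}(p)|} \right] \tag{3}$$

The local reachability density of p is the inverse of the mean reachability distance from p to the objects in its MinPts-distance neighborhood. When p lies outside the k-distance neighborhood of o, reach_dist_k(p, o) = d(p, o); when p lies inside it, reach_dist_k(p, o) = k_dist(o).

Definition 5 (local outlier factor of p).

$$LOF_{MinPts}(p) = \frac{\sum_{o \in N_{MinPts}(p)} \dfrac{lrd_{MinPts}(o)}{lrd_{MinPts}(p)}}{|N_{MinPts}(p)|} \tag{4}$$

The LOF of p is the mean ratio of the lrd of its neighbors to its own lrd. The lower the lrd of p relative to that of its neighbors, the higher its LOF. For an object inside a cluster the LOF is close to 1.

1.2 GridLOF

Computing the LOF of every object costs O(n^2). GridLOF (Grid-based LOF) [14] reduces this cost by dividing the data space into a grid with w subdivisions and computing the LOF only for the points that the grid does not screen out; screened-out points are given a fixed LOF value q (q = 0).

Fig. 3 Example illustrating the idea of the GridLOF algorithm

2 Anomaly detection model

2.1 Feature extraction

The dataset holds the load profiles of N customers, each with H points:

$$x_n = \{\, x_n^{(h)},\ h = 1, 2, \ldots, H \,\}, \qquad X = \{\, x_n,\ n = 1, 2, \ldots, N \,\}$$

2.1.1 Trend features

The trend features are based on the n-point moving average [15]:

$$F_t = (A_{t-1} + A_{t-2} + \cdots + A_{t-n}) / n$$

1) Obtain the series A from X.
2) Compute the moving average F from A and n.
3) Compare A with F. This yields two sequences, a_1, a_2, …, a_u and b_1, b_2, …, b_v.
4) Compute the trend indicators tra and trb:

$$tra = \Big( \sum_{t=1}^{u} a_t^2 \Big) \Big/ u \tag{5}$$

$$trb = \Big( \sum_{t=1}^{v} b_t^2 \Big) \Big/ v \tag{6}$$

2.1.2 Features davg_r and dfou_r

1) davg_r is a difference between r-point averages of the load profile x_n.
2) dfou_r is computed from the difference between two series y_n1 and y_n2.

2.1.3 Features sd, bsd_r and resd_r

1) sd is computed over all H points.
2) bsd_r.
3) resd_r.

2.2 Dimension reduction

Two methods are considered: principal component analysis (PCA) and factor analysis (FA).

2.2.1 Principal component analysis

PCA [16] replaces the original variables X_1, X_2, …, X_s with m components F_1, F_2, …, F_m that are linear combinations of them:

$$\begin{aligned}
F_1 &= a_{11} X_1 + a_{12} X_2 + \cdots + a_{1s} X_s \\
F_2 &= a_{21} X_1 + a_{22} X_2 + \cdots + a_{2s} X_s \\
&\ \ \vdots \\
F_m &= a_{m1} X_1 + a_{m2} X_2 + \cdots + a_{ms} X_s
\end{aligned} \tag{7}$$

2.2.2 Factor analysis

FA [17] expresses each of the m observed variables in terms of k common factors and a specific term:

$$\begin{aligned}
x_1 &= a_{11} f_1 + a_{12} f_2 + \cdots + a_{1k} f_k + \varepsilon_1 \\
x_2 &= a_{21} f_1 + a_{22} f_2 + \cdots + a_{2k} f_k + \varepsilon_2 \\
x_3 &= a_{31} f_1 + a_{32} f_2 + \cdots + a_{3k} f_k + \varepsilon_3 \\
&\ \ \vdots \\
x_m &= a_{m1} f_1 + a_{m2} f_2 + \cdots + a_{mk} f_k + \varepsilon_m
\end{aligned} \tag{8}$$

or, in matrix form,

$$X = AF + \varepsilon \tag{9}$$

where X is the vector of observed variables, F the vector of common factors, A the loading matrix and ε the vector of specific factors.

2.3 Evaluation

2.3.1 Confusion matrix

Detection results are summarized in a confusion matrix [18] according to whether a sample is positive or negative and whether the prediction is true or false.

Tab. 1 Confusion matrix

                    Predicted positive     Predicted negative
Actual positive     True Positive (TP)     False Negative (FN)
Actual negative     False Positive (FP)    True Negative (TN)

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
TPR = TP / (TP + FN)
FPR = FP / (FP + TN)

Accuracy alone is misleading on imbalanced data: with a 99:1 class ratio, labeling every sample as the majority class already gives 99% accuracy while the recall is 0.

2.3.2 ROC curve and AUC

The ROC (receiver operating characteristic) curve [19] plots TPR against FPR. The closer the curve is to the point (0, 1), the better the detector. AUC (area under curve) is the area under the ROC curve; AUC = 1 corresponds to a perfect detector.

2.3.3 Cumulative recall

The cumulative recall (CR) curve [20] plots the recall reached as a growing fraction of the customers, ranked by abnormality, is inspected.

2.4 Model

Fig. 4 Anomaly detection model for power consumption patterns based on unsupervised learning

3 Case study

3.1 Feature set and mapping

Fourteen features, V1 to V14, are extracted from each load profile.

Fig. 5 Correlogram for feature set

Fig. 6 Scatter diagram for all customers after mapping (panels (a) and (b))

Fig. 7 ROC curve for the model (axes: FPR, TPR; curves: PCA, FA)

With PCA the AUC is 0.827; with FA it is 0.777.

3.2 LOF and GridLOF

Fig. 8 Cumulative recall curve for LOF and GridLOF ((a) LOF; (b) GridLOF)

Fig. 9 ROC curve for LOF and GridLOF (axes: FPR, TPR)

Tab. 2 AUC of the two algorithms for different k values

k       LOF      GridLOF
50      0.810    0.829
100     0.829    0.835
150     0.825    0.832
200     0.827    0.827
250     0.830    0.826
300     0.832    0.825
350     0.833    0.826
400     0.835    0.827
450     0.836    0.829
500     0.838    0.830

GridLOF matches or exceeds the AUC of LOF for k ≤ 200 and stays within 0.008 of it for larger k.

Fig. 10 Computing time of LOF and GridLOF for datasets of different scale (axes: dataset size, 1000 to 5000; time/s)

3.3 Choice of w and k

3.3.1 Subdivision w

Fig. 11 Scattered points in boundary grids for different subdivision ((a) w = 50; (b) w = 100; (c) w = 150; (d) w = 200)

Tab. 3 Computing time and AUC of GridLOF for different subdivision

w       Time/s    AUC
50      1.39      0.759
100     8.67      0.827
150     17.69     0.832
200     18.31     0.829

Beyond w = 100 the AUC changes by at most 0.005 while the computing time roughly doubles, so w = 100 is used.

3.3.2 Neighborhood size k

With w = 100, k is varied from 50 to 500.

Fig. 12 Relationship between computing time and k (axes: k; time/s)

Fig. 13 ROC (axes: k; AUC, 0.82 to 0.84)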
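The LOF definitions of Section 1.1, Eqs. (1) to (4), can be sketched in a few lines of plain Python. This is only an illustrative brute-force version written for this text, not the authors' implementation and not GridLOF. The function name `lof_scores`, the toy data in `points`, and the choice of Euclidean distance are assumptions. It keeps ties at the k-distance in the neighborhood, as Eq. (1) requires, and it assumes no point has k or more exact duplicates, since the local reachability density is then undefined.

```python
import math


def lof_scores(points, k):
    """Brute-force LOF of every point, following Eqs. (1)-(4); O(n^2) distances."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # Pairwise Euclidean distances d(p, o) (assumed metric).
    dist = [[math.dist(p, o) for o in points] for p in points]

    k_dist = []     # k-distance of each point (Definition 1)
    neighbors = []  # k-distance neighborhood N_k(p), Eq. (1)
    for i in range(n):
        others = sorted(dist[i][j] for j in range(n) if j != i)
        kd = others[k - 1]
        k_dist.append(kd)
        # Ties at the k-distance are kept, so |N_k(p)| may exceed k.
        neighbors.append([j for j in range(n) if j != i and dist[i][j] <= kd])

    # Local reachability density, Eq. (3), with reach_dist from Eq. (2).
    lrd = []
    for i in range(n):
        total = sum(max(k_dist[j], dist[i][j]) for j in neighbors[i])
        if total == 0:
            raise ValueError("too many duplicate points: lrd is undefined")
        lrd.append(len(neighbors[i]) / total)

    # LOF, Eq. (4): mean ratio of the neighbors' lrd to the point's own lrd.
    return [
        sum(lrd[j] / lrd[i] for j in neighbors[i]) / len(neighbors[i])
        for i in range(n)
    ]


# Toy data (hypothetical): a 3 x 3 unit grid plus one isolated point.
points = [(float(x), float(y)) for x in range(3) for y in range(3)]
points.append((10.0, 10.0))

scores = lof_scores(points, k=3)
most_abnormal = max(range(len(points)), key=lambda i: scores[i])
print(points[most_abnormal])  # the isolated point has the largest LOF
```

On this toy set the grid points score close to 1 and the isolated point scores far higher, which is the behaviour the ranking by LOF relies on. For datasets of the size discussed in Section 3 the quadratic cost of this sketch is what motivates the grid screening of Section 1.2.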