Huazhong University of Science and Technology (华中科技大学)
Master's Thesis
Research on an Intrusion Detection System Based on Machine Learning
Name: Cheng En (程恩)
Degree applied for: Master
Major: Computer Software and Theory
Supervisor: Han Zongfen (韩宗芬)
Date: 20060507

[Chinese abstract: body text lost in extraction. Surviving terms: Elman, SVM, Linux, C, C++, DARPA. Surviving figures, in order: 0, 92.7%, 2.3%, 96.2% (Elman); 0, 87.3%, 2.8%, 100% (SVM).]

Abstract

Traditional intrusion detection systems employ feed-forward neural networks to analyze network packet headers. Current studies have shown that packet inter-arrival times follow a packet-train model, yet traditional mechanisms neglect this dynamic characteristic. Furthermore, currently available mechanisms discard the payload and retain only the header of each packet for data analysis. As a result, these systems cannot detect inter-packet sequence anomalies, cannot detect anomalous network traffic at the application level, and cannot detect complicated or distributed intrusions. Host-based intrusion detection systems that use machine learning algorithms, on the other hand, are limited by noise in the training data, which leads to over-fitting. In real-time detection these systems suffer from high false positive rates, which makes it difficult for administrators to analyze intrusions accurately and to configure security policies in a timely manner.

To overcome these limitations, we implemented an intrusion detection system based on machine learning algorithms. The system comprises two subsystems: a network-based intrusion detection subsystem using an Elman network, and a host-based intrusion detection subsystem using a Robust SVMs Nearest Neighbor Classifier. In the former subsystem, a clustering algorithm is applied to the packet payload to distill valuable information beyond the packet header. To obtain an efficient real-time anomaly detector, the BPTT algorithm is used to train the Elman network. Because the Elman network is dynamic, the proposed network detector is also able to detect inter-packet anomalies. In the latter subsystem, a gradient-based weighting scheme is proposed to overcome over-fitting. This weighting scheme also mitigates the curse of dimensionality, so detection performance improves.

The system is implemented on the Linux platform in C and C++. To evaluate its performance, we conducted experiments on the DARPA dataset for network-based and host-based intrusion detection separately. The results indicate that the network-based subsystem attains a detection rate of 92.7% with a zero false positive rate, and reaches 100% with a false positive rate of 2.3%. The host-based subsystem attains a detection rate of 87.3% with a zero false positive rate, and reaches 100% with a false positive rate of 2.8%.

Keywords: Intrusion Detection, Machine Learning, Elman Neural Network, Robust SVM

1 [chapter title lost in extraction]

1.1 [section title lost in extraction]

[Body text lost in extraction. Surviving terms, in order: TCP/IP; QQ; the digit run "2005808968%"; Denning [1], 1986; [2]; Data Mining [3]; Expert System [4]; Artificial Neural Network [5]; Artificial Immune System [6]; Hidden Markov Model [7]; Agent (Automatic Agent) [8].]

1.2 [section title lost in extraction]

1.2.1 [subsection title lost in extraction]

1. [text lost] Anomaly Detection; Misuse Detection; Behavior Based Intrusion Detection; Knowledge Based Intrusion Detection
2. [text lost] Host-Based; Network-Based
3. [text lost; only "IDS" survives]
4. [text lost; only "IDS" survives]
5. [text lost]

1.2.2 [subsection title lost in extraction]

[Body text lost in extraction. Surviving names, dates, and terms, in order:]
April 1980: James P. Anderson, Computer Security Threat Monitoring and Surveillance [9]
1984, 1986: Dorothy Denning; SRI/CSL; Peter Neumann; Intrusion Detection Expert System (IDES) [1]; Subject, Object, Audit Records, Profile, Anomaly Records, Activity Rules
1988: Morris worm
Teresa Lunt; Denning; Intrusion Detection Expert System (IDES)
1995: Next-generation Intrusion Detection Expert System (NIDES) [10]
1990: Davis; L. T. Heberlein; Network Security Monitor (NSM) [11]
1996: GrIDS [12]
1990s: IDS; SRI/CSL; Davis; ISS RealSecure; NFR; CISCO NetRanger; NA CyberCop; Snort [13]; NP-IDS 1100; NIS Detector

1.2.3 [subsection title lost in extraction; the surviving list concerns IDS]

1. Accuracy
2. Completeness
3. Performance
4. Real Time
5. Fault Tolerance (surviving term: DoS)
6. System Resources
7. Scalability
8. Adaptability

1.2.4 [subsection title lost in extraction]

1. [text lost] [4]; CPU; Schonlau [14]: Bayes one-step Markov, Hybrid multi-step Markov, IPAM, Uniqueness, Sequence-Match, Compression; Maxion: Naïve Bayes [2]; Oka: Eigen Co-occurrence Matrix [15]; Lane [16]; Yeung: Hidden Markov Models (HMM) [17]
2. [text lost] Unix/Linux; trace; the digit run "2.7.10Linux22180"; 1996: Forrest [18]; Lee, Forrest: Ripper [19]; Wespi, Forrest [20]; Asaka [21]; Liao, Hu: K-Nearest Neighbor, Support Vector Machine (SVM) [22, 23]; Hamming [24]; [25]; [26]
3. [text lost] IP; Lee: Ripper [27]; Giacinto [28]; Lazarevic [29]; [30]; [31]; [32]; [33]
4. [text lost] Tripwire; Samhain; Unix/Linux

1.3 [section title lost in extraction]

[Body text lost in extraction. Surviving terms: IDS; "1.G"; IDS; DoS/DDoS.]

1.4 [section title lost in extraction]

[Body text lost in extraction. Surviving terms: Elman; SVM.]
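2 [chapter title lost in extraction]

2.1 [section title lost in extraction]

2.1.1 [subsection title lost in extraction]

[Body text lost in extraction. Surviving terms, in order: TCP/IP; Elman [34]; "k-" (term truncated); SVM [35]; Elman; SVM.]

2.1.2 [subsection title lost in extraction]

[Body text lost in extraction. Surviving terms: Elman; SVM; Internet.]

[Figure 2.1: caption lost in extraction]

2.2 [section title lost in extraction]

[Body text lost in extraction. Surviving terms, in order: Elman; SVM; Tcpdump; Back Propagation Through Time (BPTT); BP.]

[Figure 2.2: caption lost in extraction]

[Caption 2.1, text lost in extraction; surviving label: ID 1015]
close execve open mmap open mmap mmap munmap mmap mmap close open mmap close open mmap mmap munmap mmap close close munmap open ioctl access chown ioctl access chmod close close close close close exit

1. [text lost] ASCII
2. [text lost; surviving digit run "141314"]
3. [text lost] Elman; BPTT
4. [text lost] Elman
5. [text lost] SVM
6. [text lost] SVM
7. [text lost] SVM

2.3 [section title lost in extraction]

1. [text lost]
2. [text lost] Elman
3. [text lost] Elman
4. [text lost] SVM

2.4 [section title lost in extraction]

[Body text lost in extraction. Surviving terms: Elman; SVM.]

3 [chapter title lost in extraction; surviving term: Elman]

3.1 [section title lost in extraction]

[Body text lost in extraction. Surviving names and terms, in order: Lee, SYN [36]; HyperView [37]; Rhodes, SOM [38]; Cannady, 9 [39]; TCP/IP; [40]; Feed-forward Neural Network (FNNs); Recurrent Neural Networks (RNNs); TCP/IP; packet_train; Elman [41].]

[Figure 3.1: caption lost in extraction. Surviving labels: Z^-1 (two unit delays); y1(k), y2(k); x1(k), ..., xr(k); u1(k), ..., um(k); c1(k), ..., cr(k).]

3.2 [section title lost in extraction; surviving term: Elman]

[Figure 3.2: caption lost in extraction]

1. [text lost] Elman; Tcpdump; ASCII
2. [text lost] TCP; UDP; k-means; [reference to 3.3]
3. [text lost] BP; Elman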
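The labels that survive from Figure 3.1 (u1(k)..um(k), x1(k)..xr(k), c1(k)..cr(k), y1(k), y2(k), and the Z^-1 delays) match the usual Elman layout, so a minimal forward-pass sketch may help make the recurrence concrete. It assumes the conventional reading: u are inputs, x hidden units, c context units holding the previous hidden state, and y outputs. The class name `ElmanNetwork`, the sigmoid activations, the random weight initialization, and the layer sizes are illustrative assumptions, not details taken from the thesis. Training (BP or BPTT) is not shown.

```python
import math
import random


def sigmoid(z):
    """Logistic activation (an assumed choice for this sketch)."""
    return 1.0 / (1.0 + math.exp(-z))


class ElmanNetwork:
    """Hypothetical minimal Elman network: forward pass only, untrained."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            # Small random weights; a real detector would learn these.
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = matrix(n_hidden, n_in)       # input u(k)   -> hidden x(k)
        self.w_ctx = matrix(n_hidden, n_hidden)  # context c(k) -> hidden x(k)
        self.w_out = matrix(n_out, n_hidden)     # hidden x(k)  -> output y(k)
        self.context = [0.0] * n_hidden          # c(0) = 0

    def reset(self):
        """Clear the context units, e.g. at the start of a new packet train."""
        self.context = [0.0] * len(self.context)

    def step(self, u):
        """Process one input vector u(k) and return the output vector y(k)."""
        hidden = []
        for row_u, row_c in zip(self.w_in, self.w_ctx):
            z = sum(w * v for w, v in zip(row_u, u))
            z += sum(w * c for w, c in zip(row_c, self.context))
            hidden.append(sigmoid(z))
        # Z^-1 delay: the context units for step k+1 are a copy of x(k).
        self.context = list(hidden)
        return [sigmoid(sum(w * h for w, h in zip(row, hidden)))
                for row in self.w_out]


if __name__ == "__main__":
    net = ElmanNetwork(n_in=3, n_hidden=4, n_out=2)
    packet = [1.0, 0.0, 1.0]   # made-up feature vector for one packet
    first = net.step(packet)
    second = net.step(packet)  # same input, different context
    print(first)
    print(second)
```

Feeding the same vector twice yields different outputs because the context units carry the previous hidden state forward. A feed-forward network has no such state, which is why it cannot react to the order in which packets arrive.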