Web Data Mining: 4. Supervised Learning (2)

Supervised Learning (2)

Road Map
- Basic concepts
- Decision tree induction
- Evaluation of classifiers
- Classification using association rules
- Naïve Bayesian classification
- Support vector machines
- K-nearest neighbor
- Ensemble methods: Bagging and Boosting
- Summary

Bayesian Theorem: Basics
- Let X be a data sample whose class label is unknown.
- Let H be some hypothesis, for example that X belongs to class C.
- P(H|X) is the probability that hypothesis H holds given the sample X.
- Example: suppose the samples are fruits, each described by its shape and color. If X stands for "red and round" and H is the hypothesis "X is an apple", then P(H|X) is the probability that X is an apple given that it is red and round.

Bayesian Theorem: Basics
- P(H): the probability that an arbitrary fruit is an apple, regardless of its color and shape.
- P(X): the probability that an arbitrary fruit, whatever kind it is, is red and round.
- P(X|H): the probability that a fruit is red and round, given that it is an apple.

Bayesian Theorem: Basics
- The task: knowing the color and shape of each fruit in the data set, determine what kind of fruit it is, i.e., compute the probability of its belonging to each fruit class and pick the class with the highest probability. So we need P(H|X).
- The other three probabilities, P(H), P(X), and P(X|H), can all be estimated from the data, but P(H|X) cannot be obtained from the data directly. Bayes' theorem gives it to us:

  $P(H|X) = \dfrac{P(X|H)\,P(H)}{P(X)}$

Naïve Bayes Classifier
- Each data sample is represented by an n-dimensional feature vector recording n measurements of the sample on n attributes.
- Suppose there are m classes. Given an unknown data sample X (one with no class label), the classifier predicts that X belongs to the class with the highest posterior probability given X. That is, naïve Bayes assigns X to class $C_i$ if and only if

  $P(C_i|X) > P(C_j|X)$ for $1 \le j \le m$, $j \ne i$.

- In other words, we maximize $P(C_i|X)$; the class $C_i$ that maximizes it is called the maximum posteriori hypothesis. By Bayes' theorem,

  $P(C_i|X) = \dfrac{P(X|C_i)\,P(C_i)}{P(X)}$

Naïve Bayes Classifier
- Since P(X) is constant across classes, only $P(X|C_i)\,P(C_i)$ needs to be maximized.
- If the class prior probabilities are unknown, the classes are usually assumed to be equally likely, i.e., $P(C_1) = P(C_2) = \cdots = P(C_m)$, and only $P(X|C_i)$ is maximized; otherwise $P(X|C_i)\,P(C_i)$ is maximized.
- The class priors can be estimated by $P(C_i) = s_i/s$, where $s_i$ is the number of training samples in class $C_i$ and $s$ is the total number of training samples.

Conditional Independence Assumption
- All attributes are conditionally independent given the class $C = c_j$. Formally, we assume

  $\Pr(A_1 = a_1 \mid A_2 = a_2, \ldots, A_{|A|} = a_{|A|}, C = c_j) = \Pr(A_1 = a_1 \mid C = c_j)$

  and so on for $A_2$ through $A_{|A|}$. That is,

  $\Pr(A_1 = a_1, \ldots, A_{|A|} = a_{|A|} \mid C = c_j) = \prod_{i=1}^{|A|} \Pr(A_i = a_i \mid C = c_j)$

Final Naïve Bayesian Classifier
- Putting the pieces together, we are done:

  $\Pr(C = c_j \mid A_1 = a_1, \ldots, A_{|A|} = a_{|A|}) = \dfrac{\Pr(C = c_j) \prod_{i=1}^{|A|} \Pr(A_i = a_i \mid C = c_j)}{\sum_{r=1}^{|C|} \Pr(C = c_r) \prod_{i=1}^{|A|} \Pr(A_i = a_i \mid C = c_r)}$

- How do we estimate $\Pr(A_i = a_i \mid C = c_j)$? Easy: from counts in the training data.

Classify a Test Instance
- If we only need a decision on the most probable class for the test instance, we only need the numerator, as the denominator is the same for every class.
- Thus, given a test example, we compute the following to decide its most probable class:

  $c = \arg\max_{c_j} \Pr(C = c_j) \prod_{i=1}^{|A|} \Pr(A_i = a_i \mid C = c_j)$

An Example
[The slide shows a small training table with two attributes A1 and A2 and a class C taking values t and f; the table itself did not survive extraction.]

An Example (cont…)
- For class C = t we have

  $\Pr(C = t) \prod_{j=1}^{2} \Pr(A_j = a_j \mid C = t) = \dfrac{1}{2} \times \dfrac{2}{5} \times \dfrac{2}{5} = \dfrac{2}{25}$

- For class C = f we have

  $\Pr(C = f) \prod_{j=1}^{2} \Pr(A_j = a_j \mid C = f) = \dfrac{1}{2} \times \dfrac{1}{5} \times \dfrac{2}{5} = \dfrac{1}{25}$

- C = t is more probable, so t is the final class.

Training Dataset
Classes: C1: buys_computer = "yes"; C2: buys_computer = "no"
Data sample X = (age = "<=30", income = "medium", student = "yes", credit_rating = "fair")

  age     income   student  credit_rating  buys_computer
  <=30    high     no       fair           no
  <=30    high     no       excellent      no
  31…40   high     no       fair           yes
  >40     medium   no       fair           yes
  >40     low      yes      fair           yes
  >40     low      yes      excellent      no
  31…40   low      yes      excellent      yes
  <=30    medium   no       fair           no
  <=30    low      yes      fair           yes
  >40     medium   yes      fair           yes
  <=30    medium   yes      excellent      yes
  31…40   medium   no       excellent      yes
  31…40   high     yes      fair           yes
  >40     medium   no       excellent      no

Naïve Bayesian Classifier: An Example
Compute P(Ci) and P(X|Ci) for each class:
- P(buys_computer = "yes") = 9/14 = 0.643
- P(buys_computer = "no") = 5/14 = 0.357
- P(age = "<=30" | buys_computer = "yes") = 2/9 = 0.222
- P(age = "<=30" | buys_computer = "no") = 3/5 = 0.600
- P(income = "medium" | buys_computer = "yes") = 4/9 = 0.444
- P(income = "medium" | buys_computer = "no") = 2/5 = 0.400
- P(student = "yes" | buys_computer = "yes") = 6/9 = 0.667
- P(student = "yes" | buys_computer = "no") = 1/5 = 0.200
- P(credit_rating = "fair" | buys_computer = "yes") = 6/9 = 0.667
- P(credit_rating = "fair" | buys_computer = "no") = 2/5 = 0.400

For X = (age = "<=30", income = "medium", student = "yes", credit_rating = "fair"):
- P(X | buys_computer = "yes") = 0.222 × 0.444 × 0.667 × 0.667 = 0.044
- P(X | buys_computer = "no") = 0.600 × 0.400 × 0.200 × 0.400 = 0.019
- P(X | buys_computer = "yes") × P(buys_computer = "yes") = 0.044 × 0.643 = 0.028
- P(X | buys_computer = "no") × P(buys_computer = "no") = 0.019 × 0.357 = 0.007

Therefore, X belongs to class buys_computer = "yes".
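The whole computation above is just counting. Below is a minimal sketch in Python, not part of the original slides, that stores the training table as a list of tuples and reproduces the slide's numbers; the names `data` and `classify` are illustrative.

```python
from collections import Counter

# Training table from the slide: (age, income, student, credit_rating, buys_computer)
data = [
    ("<=30",   "high",   "no",  "fair",      "no"),
    ("<=30",   "high",   "no",  "excellent", "no"),
    ("31..40", "high",   "no",  "fair",      "yes"),
    (">40",    "medium", "no",  "fair",      "yes"),
    (">40",    "low",    "yes", "fair",      "yes"),
    (">40",    "low",    "yes", "excellent", "no"),
    ("31..40", "low",    "yes", "excellent", "yes"),
    ("<=30",   "medium", "no",  "fair",      "no"),
    ("<=30",   "low",    "yes", "fair",      "yes"),
    (">40",    "medium", "yes", "fair",      "yes"),
    ("<=30",   "medium", "yes", "excellent", "yes"),
    ("31..40", "medium", "no",  "excellent", "yes"),
    ("31..40", "high",   "yes", "fair",      "yes"),
    (">40",    "medium", "no",  "excellent", "no"),
]

def classify(x):
    """Return the class maximizing P(C=c) * prod_i P(A_i = x_i | C=c)."""
    class_counts = Counter(row[-1] for row in data)   # n_j for each class
    n = len(data)
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / n                               # prior P(C=c), e.g. 9/14
        for i, value in enumerate(x):
            # n_ij: examples of class c whose i-th attribute equals `value`
            n_ic = sum(row[i] == value for row in data if row[-1] == c)
            score *= n_ic / n_c                       # P(A_i = value | C=c)
        scores[c] = score
    return max(scores, key=scores.get), scores

label, scores = classify(("<=30", "medium", "yes", "fair"))
print(scores)               # {'no': 0.00686..., 'yes': 0.02822...}
print("predicted:", label)  # -> yes
```

The values differ from the slide's 0.028 and 0.007 only in the last digits, because the slide rounds each conditional probability to three decimals before multiplying.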
Additional Issues
- Numeric attributes: naïve Bayesian learning assumes that all attributes are categorical, so numeric attributes need to be discretized.
- Zero counts: a particular attribute value may never occur together with a class in the training set. The count-based estimate

  $\Pr(A_i = a_i \mid C = c_j) = \dfrac{n_{ij}}{n_j}$

  (where $n_{ij}$ is the number of training examples with $A_i = a_i$ and class $c_j$, and $n_j$ is the number of training examples in class $c_j$) is then zero, so we need smoothing.
- Missing values: ignored.

Avoiding the 0-Probability Problem
- Naïve Bayesian prediction requires each conditional probability to be non-zero; otherwise the predicted probability

  $P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$

  will be zero.
- Example: suppose a data set with 1000 tuples has income = low (0 tuples), income = medium (990), and income = high (10).
- Use the Laplacian correction (Laplacian estimator): adding 1 to each case gives
  Prob(income = low) = 1/1003
  Prob(income = medium) = 991/1003
  Prob(income = high) = 11/1003
- The "corrected" probability estimates are close to their "uncorrected" counterparts; a small code sketch is given at the end of this section.

On the Naïve Bayesian Classifier
- Advantages:
  - Easy to implement
  - Very efficient
  - Good results obtained in many applications
- Disadvantages:
  - It assumes class conditional independence, so accuracy is lost when the assumption is seriously violated (highly correlated data sets).

Road Map
- Basic concepts
- Decision tree induction
- Evaluation of classifiers
- Classification using association rules
- Naïve Bayesian classification
- Support vector machines
- K-nearest neighbor
- Ensemble methods: Bagging and Boosting
- Summary

Introduction
- Support vector machines were invented by V. Vapnik and his co-workers in the 1970s in Russia and became known to the West in 1992.
- SVMs are linear classifiers that find a hyperplane to separate two classes of data, positive and negative.
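Referring back to the Laplacian correction above: a minimal sketch, assuming per-class raw counts are available in a dict; `laplace_probs` is an illustrative helper, not a library function.

```python
# Laplacian correction: add 1 to each value's count, which adds the number
# of distinct values (here 3) to the denominator: 1000 + 3 = 1003.
def laplace_probs(counts):
    """counts: raw count of each attribute value within one class."""
    total = sum(counts.values()) + len(counts)
    return {value: (n + 1) / total for value, n in counts.items()}

income = {"low": 0, "medium": 990, "high": 10}  # the slide's 1000-tuple example
for value, p in laplace_probs(income).items():
    print(f"Prob(income={value}) = {p:.4f}")    # 1/1003, 991/1003, 11/1003
```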
