Implementing Bayesian Networks with the bnlearn Package in R (1)
Tags: Life. 2013-08-01 22:26, Thursday

1. Loading the packages and importing data

library(bnlearn)
# bnlearn is on CRAN. Install it with install.packages("bnlearn"), or download it and copy it into the library folder.

library(Rgraphviz)
# Used for plotting. This package is not on CRAN and has to be downloaded separately (it is distributed through Bioconductor).

data(learning.test)
# Loads the example data. All variables in the data frame must be either factors (discrete) or numeric (continuous).

lear.test = read.csv("***.csv", colClasses = "factor")
# Data can also be read directly from a csv file.

Note that if the data contain Boolean values coded as 0-1, or ordinal values coded as 1-3 and the like, those columns must be explicitly declared as factors; otherwise many bnlearn functions will throw an error. The read functions only convert character columns to factors automatically (and, since R 4.0.0, not even those by default); other column types are never converted.

The package covers three areas of Bayesian networks: structure learning (via constraint-based, score-based and hybrid algorithms), parameter learning (via maximum likelihood and Bayesian estimators) and inference.

Constraint-based algorithms, also known as conditional independence learners, are all optimized derivatives of the Inductive Causation algorithm (Verma and Pearl, 1991). These algorithms use conditional independence tests to detect the Markov blankets of the variables, which in turn are used to compute the structure of the Bayesian network.

Score-based learning algorithms are general purpose heuristic optimization algorithms which rank network structures with respect to a goodness-of-fit score.

Hybrid algorithms combine aspects of both constraint-based and score-based algorithms, as they use conditional independence tests (usually to reduce the search space) and network scores (to find the optimal network in the reduced space) at the same time.

Several functions for parameter estimation, parametric inference, bootstrap, cross-validation and stochastic simulation are available. Furthermore, advanced plotting capabilities are implemented on top of the Rgraphviz and lattice packages.

Implementing Bayesian Networks with the bnlearn Package in R (2)
Tags: Life. 2013-08-01 22:27, Thursday

2. Constraint-based algorithms

The constraint-based algorithms available in bnlearn are gs, iamb, fast.iamb and inter.iamb. Calling them is simple: use the function name with the data frame as its argument. During structure learning you can also supply custom blacklists and whitelists to bring expert knowledge into the learning process.

res = gs(learning.test)

Available constraint-based learning algorithms

Grow-Shrink (gs): based on the Grow-Shrink Markov Blanket, the first (and simplest) Markov blanket detection algorithm (Margaritis, 2003) used in a structure learning algorithm.

Incremental Association (iamb): based on the Markov blanket detection algorithm of the same name (Tsamardinos et al., 2003), which is based on a two-phase selection scheme (a forward selection followed by an attempt to remove false positives).

Fast Incremental Association (fast.iamb): a variant of IAMB which uses speculative stepwise forward selection to reduce the number of conditional independence tests (Yaramakala and Margaritis, 2005).

Interleaved Incremental Association (inter.iamb): another variant of IAMB which uses forward stepwise selection (Tsamardinos et al., 2003) to avoid false positives in the Markov blanket detection phase.

The computational complexity of these algorithms is polynomial in the number of tests, usually O(N^2) (O(N^4) in the worst case scenario), where N is the number of variables. Execution time scales linearly with the size of the data set.

Available (conditional) independence tests

The conditional independence tests used in constraint-based algorithms in practice are statistical tests on the data set. Available tests (and the respective labels) are:

Discrete case (multinomial distribution)

mutual information: an information-theoretic distance measure. It is proportional to the log-likelihood ratio (they differ by a 2n factor) and is related to the deviance of the tested models. The asymptotic chi-square test (mi), the Monte Carlo permutation test (mc-mi), the sequential Monte Carlo permutation test (smc-mi), and the semiparametric test (sp-mi) are implemented.

shrinkage estimator for the mutual information (mi-sh): an improved asymptotic chi-square test based on the James-Stein estimator for the mutual information.

Pearson's X^2: the classical Pearson's X^2 test for contingency tables. The asymptotic chi-square test (x2), the Monte Carlo permutation test (mc-x2), the sequential Monte Carlo permutation test (smc-x2) and the semiparametric test (sp-x2) are implemented.

Continuous case (multivariate normal distribution)

linear correlation: the exact Student's t test (cor), the Monte Carlo permutation test (mc-cor) and the sequential Monte Carlo permutation test (smc-cor) are implemented.

Fisher's Z: a transformation of the linear correlation with asymptotic normal distribution. Used by commercial software (such as TETRAD II) for the PC algorithm (an R implementation is present in the pcalg package on CRAN). The a