Machine Learning (Image Center)

Chapter 3: The Expectation-Maximization Algorithm

Outline
I. The Expectation-Maximization (EM) algorithm and its applications
II. GMM and its applications

I. The EM algorithm and its applications

1 Maximum Likelihood Estimation

Let the density function of a population $X$ be $f(x;\theta)$, where $\theta$ is a parameter or parameter vector, and let $X_1,\dots,X_n$ be a sample from this population. For a given set of observations $x_1,\dots,x_n$, the joint density is a function of $\theta$, called the likelihood function:

$$L(\theta) = L(x_1,\dots,x_n;\theta) = \prod_{k=1}^{n} f(x_k;\theta)$$

where $\theta\in\Theta$ and $\Theta$ is the parameter set. If there exists $\hat\theta = \hat\theta(x_1,\dots,x_n)\in\Theta$ such that $L(\hat\theta) = \max_{\theta\in\Theta} L(\theta)$, then $\hat\theta(x_1,\dots,x_n)$ is called the maximum likelihood estimate of $\theta$, and $\hat\theta(X_1,\dots,X_n)$ the maximum likelihood estimator.

Steps for finding the maximum likelihood estimate:
Step 1: write down the likelihood function $L(\theta; X_1,\dots,X_n)$ and compute $\ln L$.
Step 2: if the likelihood is differentiable in the parameter, compute $\frac{d\ln L}{d\theta}$.
Step 3: solve the equation $\frac{d\ln L}{d\theta} = 0$; the $\hat\theta$ that maximizes $\ln L$ is the maximum likelihood estimator of the parameter.

Example. Let $X_1,\dots,X_n$ be a sample from the normal population $N(\mu,\sigma^2)$. Find the maximum likelihood estimates of $\mu$ and $\sigma^2$.

Solution: the likelihood function is

$$L(\mu,\sigma^2; X_1,\dots,X_n) = \prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(X_i-\mu)^2}{2\sigma^2}} = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^{\!n} e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(X_i-\mu)^2}$$

Taking logarithms,

$$\ln L = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(X_i-\mu)^2.$$

Setting the partial derivatives to zero,

$$\frac{\partial\ln L}{\partial\mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i-\mu) = 0,\qquad
\frac{\partial\ln L}{\partial\sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(X_i-\mu)^2 = 0,$$

and solving these equations gives

$$\hat\mu = \frac{1}{n}\sum_{i=1}^{n}X_i = \bar X,\qquad \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i-\bar X)^2.$$

Problem 1: fitting points to one line

Solution: least-squares estimation. Minimize $\sum_{i=1}^{n}(y_i - a x_i - b)^2$; setting the derivatives with respect to $a$ and $b$ to zero yields the normal equations

$$\begin{pmatrix}\sum_i x_i^2 & \sum_i x_i\\ \sum_i x_i & n\end{pmatrix}\begin{pmatrix}a\\ b\end{pmatrix} = \begin{pmatrix}\sum_i x_i y_i\\ \sum_i y_i\end{pmatrix}.$$

Fitting points to two lines: how do we determine which line generated each point? Solution: ???

Problem 2: parameter estimation for mixture models

Single Gaussian:

$$p(x\mid\mu,\Sigma) = \frac{1}{(2\pi)^{d/2}|\Sigma|^{1/2}}\, e^{-\frac12 (x-\mu)^T \Sigma^{-1}(x-\mu)}$$

Solution: MLE by maximizing $\sum_{i=1}^{n}\log p(x_i\mid\mu,\Sigma)$, which gives

$$\mu = \frac1n\sum_{i=1}^{n}x_i,\qquad \Sigma = \frac1n\sum_{i=1}^{n}(x_i-\mu)(x_i-\mu)^T.$$

Multiple Gaussians:

$$p(x\mid\Theta) = \sum_{i=1}^{M}\alpha_i\, p_i(x\mid\mu_i,\Sigma_i),\qquad \sum_{i=1}^{M}\alpha_i = 1.$$

Which component does each point belong to? Solution: ???

Common feature. The likelihood of parameter $\Theta$ given data $X$ is $L(\Theta\mid X) := p(X\mid\Theta)$, and we want to maximize it by "tweaking" $\Theta$. Both problems share incompleteness, and the maximization is analytically hard.

Common feature: incompleteness. A paradox: there is an interdependency between the hidden values and the parameters governing the distribution (the observations come from a probability distribution whose parameters are not known, while estimating those parameters requires the hidden values).

A more general problem: estimating the parameters of a p.d.f. when
- there is no direct access to the needed data (missing data);
- each outcome is an accumulation of simpler outcomes (grouped data);
- the number of underlying data points is unknown (truncated data).

A. P. Dempster et al. (1977), "Maximum likelihood from incomplete data via the EM algorithm": gave a general statement of the algorithm, proved its convergence, and coined the term "EM algorithm".

2 The EM Algorithm

- In many practical learning problems, only a subset of the relevant instance features is observable.
- Many methods have been proposed to handle unobserved variables. For example, if some variable is observed in some instances but not in others, the instances where it is observed can be used to predict its value in the instances where it is not.
- The EM algorithm is a widely used learning method in the presence of hidden variables. It applies even when the variables' values are never directly observed, as long as the general form of the probability distribution they follow is known. It is used, among other things, for training GMMs and for training (hidden) Markov models.

Review of facts used below:
- Stochastic independence: $p(A,B) = p(A)\,p(B)$.
- Bayes' rule: $p(X\mid\Theta) = p(X,Y\mid\Theta)\,/\,p(Y\mid X,\Theta)$.
- Logarithm: $\ln(ab) = \ln a + \ln b$.
- Conditional expectation: $E[Y\mid X] = \int y\, f_{Y\mid X}(y\mid x)\,dy$.

We want to maximize $L(\Theta\mid X) := p(X\mid\Theta)$ by tweaking $\Theta$, which is analytically hard. Idea: introduce (nonexistent) data $Y$ so that the incomplete data $X$ is extended to complete data $Z := (X,Y)$. Then

$$p(X\mid\Theta) = \frac{p(X,Y\mid\Theta)}{p(Y\mid X,\Theta)},\qquad
\ln p(X\mid\Theta) = \ln p(X,Y\mid\Theta) - \ln p(Y\mid X,\Theta),$$

and the complete-data problem is easier.

Solution: introduce additional variables to make the data complete.
- Observed data $X = \{x_1, x_2, \dots, x_n\}$; hidden variables $Y = \{y_1, y_2, \dots, y_n\}$; $Z = (X, Y)$.
- Incomplete-data log-likelihood: $\sum_{i=1}^{n}\log p(x_i\mid\Theta)$.
- Complete-data log-likelihood: $\sum_{i=1}^{n}\log p(x_i, y_i\mid\Theta)$.

Define

$$Q(\Theta,\Theta^{(0)}) = E\big[\ln p(X,Y\mid\Theta)\,\big|\, X,\Theta^{(0)}\big].$$

Steps of the EM algorithm:
1. Initialize $\Theta^{(0)}$ randomly or with a guess; set $n = 1$.
2. E-step: use the current parameters $\Theta^{(n-1)}$ to estimate $Q(\Theta,\Theta^{(n-1)}) = E\big[\ln p(X,Y\mid\Theta)\mid X,\Theta^{(n-1)}\big]$.
3. M-step: compute the maximum likelihood estimate $\Theta^{(n)} = \arg\max_{\Theta} Q(\Theta,\Theta^{(n-1)})$ using $Q(\Theta,\Theta^{(n-1)})$.
4. Set $n = n + 1$; repeat until convergence.

Solution to line fitting

Parameters: $\Theta = \{a_1, b_1, a_2, b_2\}$. Posterior probabilities:

$$w_1(i) = p(l=1\mid x_i,\Theta) = \frac{e^{-r_{i1}^2/\sigma^2}}{e^{-r_{i1}^2/\sigma^2} + e^{-r_{i2}^2/\sigma^2}},\qquad
w_2(i) = p(l=2\mid x_i,\Theta) = 1 - w_1(i),$$

where $r_{il} = a_l x_i + b_l - y_i$, $l = 1, 2$.

Expectation:

$$Q(a_1,b_1,a_2,b_2) = -\sum_{l=1}^{2}\sum_{i=1}^{N} w_l(i)\,(y_i - a_l x_i - b_l)^2.$$

Maximization: taking the derivative of $Q$ with respect to $a_l$, $b_l$ and setting the result to zero, we obtain the weighted normal equations

$$\begin{pmatrix}\sum_i w_l(i)\,x_i^2 & \sum_i w_l(i)\,x_i\\ \sum_i w_l(i)\,x_i & \sum_i w_l(i)\end{pmatrix}\begin{pmatrix}a_l\\ b_l\end{pmatrix} = \begin{pmatrix}\sum_i w_l(i)\,x_i y_i\\ \sum_i w_l(i)\,y_i\end{pmatrix},\qquad l = 1, 2.$$

Solution to mixture modelling

Parameters: $\Theta = \{\alpha_1,\mu_1,\Sigma_1,\alpha_2,\mu_2,\Sigma_2,\dots,\alpha_M,\mu_M,\Sigma_M\}$, where $p_l(x\mid\mu_l,\Sigma_l)$ is a single Gaussian centered at $\mu_l$ with covariance matrix $\Sigma_l$, and $\alpha_l$ is the mixing weight, $l = 1,2,\dots,M$. Posterior probability:

$$p(l\mid x_i,\Theta^{(t-1)}) = \frac{\alpha_l^{(t-1)}\, p_l(x_i\mid \mu_l^{(t-1)},\Sigma_l^{(t-1)})}{\sum_{j=1}^{M}\alpha_j^{(t-1)}\, p_j(x_i\mid\mu_j^{(t-1)},\Sigma_j^{(t-1)})}.$$

Expectation:

$$Q(\Theta,\Theta^{(t-1)}) = \sum_{l=1}^{M}\sum_{i=1}^{N}\log\big(\alpha_l\, p_l(x_i\mid\mu_l,\Sigma_l)\big)\; p(l\mid x_i,\Theta^{(t-1)}).$$

Maximization: taking the derivative of $Q$ with respect to $\alpha_l$, $\mu_l$ and $\Sigma_l$ and setting the result to zero, we obtain the following update rules:

$$\alpha_l^{(t)} = \frac1N\sum_{i=1}^{N} p(l\mid x_i,\Theta^{(t-1)})$$

$$\mu_l^{(t)} = \frac{\sum_{i=1}^{N} x_i\, p(l\mid x_i,\Theta^{(t-1)})}{\sum_{i=1}^{N} p(l\mid x_i,\Theta^{(t-1)})}$$

$$\Sigma_l^{(t)} = \frac{\sum_{i=1}^{N} p(l\mid x_i,\Theta^{(t-1)})\,(x_i-\mu_l^{(t)})(x_i-\mu_l^{(t)})^T}{\sum_{i=1}^{N} p(l\mid x_i,\Theta^{(t-1)})}$$

4 Worked examples of the EM algorithm

Example 1. Suppose we observe a random sequence

$$y = (y_1, y_2, y_3, y_4) \sim \text{Multinomial}\!\left(n;\ \tfrac12+\tfrac\theta4,\ \tfrac{1-\theta}4,\ \tfrac{1-\theta}4,\ \tfrac\theta4\right)$$

and want $\hat\theta = \arg\max_\theta L(\theta)$. (We use the EM algorithm to solve this problem; of course, an optimization method such as Newton's method could also be used.)

The log-likelihood is

$$L(\theta) = y_1\log\!\left(\tfrac12+\tfrac\theta4\right) + y_2\log\tfrac{1-\theta}4 + y_3\log\tfrac{1-\theta}4 + y_4\log\tfrac\theta4.$$

Introduce auxiliary variables $x_1 + x_2 = y_1$, $x_3 = y_2$, $x_4 = y_3$, $x_5 = y_4$, with

$$x = (x_1,x_2,x_3,x_4,x_5) \sim \text{Multinomial}\!\left(n;\ \tfrac12,\ \tfrac\theta4,\ \tfrac{1-\theta}4,\ \tfrac{1-\theta}4,\ \tfrac\theta4\right).$$

$X$ is the complete random sequence; $Y$ is the observed one. The complete-data log-likelihood is

$$L(\theta) = \log\!\left[\left(\tfrac12\right)^{x_1}\!\left(\tfrac\theta4\right)^{x_2}\!\left(\tfrac{1-\theta}4\right)^{x_3}\!\left(\tfrac{1-\theta}4\right)^{x_4}\!\left(\tfrac\theta4\right)^{x_5}\right] = \text{const} + (x_2+x_5)\log\theta + (x_3+x_4)\log(1-\theta).$$

E-step:

$$Q(\theta,\theta^n) = E\big[\log f(x\mid\theta)\,\big|\,y,\theta^n\big]
= E\big[\text{const} + (x_2+x_5)\log\theta + (x_3+x_4)\log(1-\theta)\,\big|\,y,\theta^n\big]
= \text{const} + (x_2^n+x_5)\log\theta + (x_3+x_4)\log(1-\theta),$$

where $x_2^n := E(x_2\mid y,\theta^n)$. Note: $x_2\mid y,\theta^n \sim \text{Bin}\!\left(y_1,\ \tfrac{\theta^n}{2+\theta^n}\right)$, so $x_2^n = y_1\,\tfrac{\theta^n}{2+\theta^n}$.

M-step:

$$\frac{dQ(\theta,\theta^n)}{d\theta} = \frac{x_2^n+x_5}{\theta} - \frac{x_3+x_4}{1-\theta} = 0
\quad\Longrightarrow\quad
\theta^{n+1} = \frac{x_2^n+x_5}{x_2^n+x_3+x_4+x_5}.$$

EM iteration results:

Iteration  Theta      Error
1          0.6082474  0.1082474
2          0.624321   0.01607363
3          0.6264889  0.002167829
4          0.6267773  0.0002884433
5          0.6268156  3.830976e-05
6          0.6268207  5.086909e-06
7          0.6268214  6.754367e-07
8          0.6268215  8.968368e-08
9          0.6268215  1.190809e-08
10         0.6268215  1.581141e-09
11         0.6268215  2.099418e-10

Example 2. Now we have data from a
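As a numerical check of the closed-form normal MLE derived in the maximum-likelihood section ($\hat\mu$ = sample mean, $\hat\sigma^2$ dividing by $n$ rather than $n-1$), here is a minimal sketch; the true parameters $\mu = 3$, $\sigma = 2$, the sample size, and the random seed are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, sigma_true = 3.0, 2.0               # assumed true parameters
x = rng.normal(mu_true, sigma_true, 100_000)

mu_hat = x.mean()                            # MLE: mu_hat = sample mean
sigma2_hat = np.mean((x - mu_hat) ** 2)      # MLE: divides by n, not n - 1 (biased)

print(mu_hat, sigma2_hat)
```

With a large sample the estimates land close to the true values (3, 4); note that the MLE of the variance is the biased estimator, exactly as the derivation shows.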
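The normal equations of Problem 1 (single-line least squares) can be solved directly as a 2x2 linear system. The synthetic line $y = 1.7x - 0.6$, the noise level, and the seed are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0.0, 10.0, 100)
y = 1.7 * x - 0.6 + rng.normal(0.0, 0.2, 100)  # assumed true line y = 1.7 x - 0.6

# Normal equations: [[sum x^2, sum x], [sum x, n]] (a, b)^T = (sum x y, sum y)^T
A = np.array([[np.sum(x * x), np.sum(x)],
              [np.sum(x),     x.size]])
a, b = np.linalg.solve(A, np.array([np.sum(x * y), np.sum(y)]))
print(a, b)
```

The recovered slope and intercept are close to the generating values; this is the building block that the two-line EM solution reuses with per-point weights.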
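The two-line fitting EM (E-step: responsibilities $w_l(i)$ from the residuals $r_{il}$; M-step: weighted normal equations per line) can be sketched as follows. The two generating lines, the noise scale $\sigma = 0.1$, the initial guesses, and the seed are all assumptions for illustration; the responsibility ratio is computed in a numerically stable form to avoid underflow of the raw exponentials.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.1  # assumed known noise scale used in the responsibilities

# Synthetic data from two lines (assumed example): y = 2x + 1 and y = -x + 10
x = rng.uniform(0.0, 2.5, 200)
which = rng.integers(0, 2, 200)
y = np.where(which == 0, 2.0 * x + 1.0, -x + 10.0) + rng.normal(0.0, sigma, 200)

def weighted_ls(x, y, w):
    """Solve the weighted normal equations minimizing sum_i w_i (y_i - a x_i - b)^2."""
    A = np.array([[np.sum(w * x * x), np.sum(w * x)],
                  [np.sum(w * x),     np.sum(w)]])
    rhs = np.array([np.sum(w * x * y), np.sum(w * y)])
    return np.linalg.solve(A, rhs)  # returns (a, b)

p1 = np.array([1.0, 2.0])   # initial guess for (a1, b1)
p2 = np.array([0.0, 8.0])   # initial guess for (a2, b2)
for _ in range(20):
    r1 = p1[0] * x + p1[1] - y          # residuals r_{i1} = a1 x_i + b1 - y_i
    r2 = p2[0] * x + p2[1] - y          # residuals r_{i2}
    # E-step: w1(i) = e^{-r1^2/s^2} / (e^{-r1^2/s^2} + e^{-r2^2/s^2}),
    # rewritten as a logistic of the squared-residual difference for stability.
    t = np.clip((r1 ** 2 - r2 ** 2) / sigma ** 2, -500.0, 500.0)
    w1 = 1.0 / (1.0 + np.exp(t))
    # M-step: weighted least squares for each line.
    p1 = weighted_ls(x, y, w1)
    p2 = weighted_ls(x, y, 1.0 - w1)

print(p1, p2)
```

With these (assumed) well-separated lines, the first E-step already assigns nearly every point to the correct line, so the iteration converges to the two generating lines in a handful of passes.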
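The mixture-model update rules can be exercised on a one-dimensional two-component example, a special case of the multivariate formulas in which $\Sigma_l$ reduces to a scalar variance. The component parameters, sample sizes, initial guesses, and seed are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic 1-D data: 30% from N(-2, 0.5^2), 70% from N(3, 1^2) (assumed example)
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(3.0, 1.0, 700)])

def gauss(x, mu, var):
    # single-Gaussian density p_l(x | mu_l, sigma_l^2)
    return np.exp(-(x - mu) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

alpha = np.array([0.5, 0.5])   # mixing weights (initial guess)
mu = np.array([-1.0, 1.0])     # component means (initial guess)
var = np.array([1.0, 1.0])     # component variances (initial guess)

for _ in range(200):
    # E-step: responsibilities p(l | x_i, Theta^(t-1)), shape (2, N)
    num = alpha[:, None] * gauss(x[None, :], mu[:, None], var[:, None])
    w = num / num.sum(axis=0)
    # M-step: the update rules obtained from maximizing Q
    Nl = w.sum(axis=1)
    alpha = Nl / x.size
    mu = (w * x).sum(axis=1) / Nl
    var = (w * (x - mu[:, None]) ** 2).sum(axis=1) / Nl

print(alpha, mu, var)
```

Because the two modes are several standard deviations apart, the iteration recovers the mixing weights, means, and variances of the generating mixture; the same loop structure carries over to the multivariate case with the covariance update shown above.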
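The multinomial EM of Example 1 is two lines of code per pass: the E-step replaces $x_2$ by $x_2^n = y_1\,\theta^n/(2+\theta^n)$ and the M-step applies the closed-form maximizer. The observed counts are not stated in this excerpt; the classic linkage data $y = (125, 18, 20, 34)$ of Dempster, Laird and Rubin (1977), with starting value $\theta^0 = 0.5$, is assumed here, and it reproduces the iterates of the table (0.6082474 at step 1, converging to 0.6268215).

```python
# Observed counts y = (y1, y2, y3, y4): not given in this excerpt,
# so the classic 1977 dataset is assumed for illustration.
y1, y2, y3, y4 = 125, 18, 20, 34
theta = 0.5                                      # assumed starting value theta^0
history = []
for _ in range(11):
    x2n = y1 * theta / (2.0 + theta)             # E-step: x2^n = E[x2 | y, theta^n]
    theta = (x2n + y4) / (x2n + y2 + y3 + y4)    # M-step: closed-form argmax of Q
    history.append(theta)

print(history)
```

Note how quickly the error shrinks: each EM step here contracts the distance to the fixed point by roughly a constant factor, which is the linear convergence typical of EM.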