johnson演讲

整理文档很辛苦,赏杯茶钱您下走!

免费阅读已结束,点击下载阅读编辑剩下 ...

阅读已结束,您可以下载文档离线阅读编辑

资源描述

ParsingandLearningInternationalWorkshoponParsingTechnologyMarkJohnsonBrownUniversity/MacquarieUniversityOctober20091/88ParsingandLearningParsingandgrammarinductiontraditionallyviewedasdistinctIUnknownwordsandphrases)parsermustbepreparedtoadaptMostgrammarlearningsystemsareerror-drivenIparsingisacentralcomponentofmostlearningalgorithmsTraditionalstatisticallearningalgorithmslearnruleprobabilitiesLearninggrammarrulesIvia\generateandtestIvianon-parametricBayes)adaptorgrammars2/88TalkoutlineLearningPCFGruleprobabilitiesfromterminalstringsIcollapsedGibbssamplersexplictlysampleparses,butdon'texplicitlyrepresentruleprobabilitiesNon-parametricextensionstoPCFGswithunboundedlymanypossiblerulesIadaptorgrammarslearnprobabilitiesofarbitrarysubtreesIcollapsedGibbssampleronlyneedstorepresentsampleparses,andnotprobabilitiesofin nitelymanyrulesAdaptorgrammarscanbeusedtolearn:IsimpleconcatenativemorphologyIunsupervisedwordsegmentationIcollocationsintopicmodelsIstructureinnames3/88OutlineLearningruleprobabilitiesLearninggrammarrules(notjustprobabilities)ChineseRestaurantProcessesAdaptorgrammarsBayesianinferenceforadaptorgrammarsAdaptorgrammarsforunsupervisedwordsegmentationTopicCollocationmodelsusingadaptorgrammarsAdaptorgrammarsfornamedentitiesConclusionExtendingAdaptorGrammars4/88Probabilisticcontext-freegrammarsRulesinContext-FreeGrammars(CFGs)expandnonterminalsintosequencesofterminalsandnonterminalsAProbabilisticCFG(PCFG)associateseachnonterminalwithamultinomialdistributionovertherulesthatexpanditProbabilityofatreeistheproductoftheprobabilitiesoftherulesusedtoconstructitRulerrRulerrS!NPVP1:0NP!Sam0:75NP!Sandy0:25VP!barks0:6VP!snores0:4P0B@SamNPSVPbarks1CA=0:45P0B@SandyNPSVPsnores1CA=0:15/88MaximumlikelihoodestimationfromvisibleparsesEachruleexpansionissampledfromparent'smultinomial)MaximumLikelihoodEstimator(MLE)isrule'srelativefrequencySamNPSVPbarksSamNPSVPsnoresSandyNPSVPsnoresRulernrrRulernrrS!NPVP31:0NP!Sam20:66NP!Sandy10:33VP!barks10:33VP!snores20:66ButMLEisoftenoverlycertain,especiallywithsparsedataIE.g.,\accidentalzerosnr=0)r=0.6/88BayesianestimationfromvisibleparsesBayesianestimatorsestimateadistributionoverruleprobabilitiesP(jn)|{z}Posterior/P(nj)|{z}LikelihoodP()|{z}PriorDirichletdistributionsareconjugatepriorsformultinomialsIADirichletdistributionover(1;:::;m)isspeci edbypositiveparameters( 1;:::; m)IIfPrior=Dir( )thenPosterior=Dir( +n)01234500.20.40.60.81P(θ1|α)Ruleprobabilityθ1α=(1,1)α=(3,2)α=(21,11)7/88SparseDirichletpriorsAs !0,Dirichletdistributionsbecomepeakedaround0\Grammarincludessomeoftheserules,butwedon'tknowwhich!01234500.20.40.60.81P(θ1|α)Ruleprobabilityθ1α=(1,1)α=(0.5,0.5)α=(0.25,0.25)α=(0.1,0.1)8/88EstimatingruleprobabilitiesfromstringsaloneInput:terminalstringsandgrammarrulesOutput:ruleprobabilitiesIngeneral,noclosed-formsolutionforIiterativealgorithmsusuallyinvolvingrepeatedlyreparsingtrainingdataExpectationMaximization(EM)proceduregeneralizesvisibledataMLestimatorstohiddendataproblemsInside-Outsidealgorithmisacubic-timeEMalgorithmforPCFGsBayesianestimationofvia:IVariationalBayesorIMarkovChainMonteCarlo(MCMC)methodssuchasGibbssampling9/88GibbssamplerforparsetreesandruleprobabilitiesInput:terminalstrings(x1;:::;xn),grammarrulesandDirichletpriorparameters Output:streamofsampleruleprobabilitiesandparsetreest=(t1;:::;tn)Algorithm:Assignparsetreestothestringssomehow(e.g.,randomly)Repeatforever:ComputerulecountsnfromtSamplefromDir( +n)Foreachstringxi:replacetiwithatreesampledfromP(tjxi;).Afterburn-in,(;t)aredistributedaccordingtoBayesianposteriorSamplingparsetreefromP(tjxi;)involvesparsingstringxi.10/88CollapsedGibbssamplersIntegrateoutruleprobabilitiestoobtainpredictivedistributionP(tijxi;ti)ofparsetiforsentencexigivenotherparsestiCollapsedGibbssamplerForeachsentencexiintrainingdata:ReplacetiwithasamplefromP(tjxi;ti)Aproblem:P(tijxi;ti)isnotaPCFGdistribution)nodynamic-programmingsampler(AFAIK)SNPcatsVPVchaseNPdogsSNPdogsVPVchaseNPdogs11/88Metropolis-HastingssamplersMetropolis-Hastings(MH)acceptance-rejectionprocedureusessamplesfromaproposaldistributiontoproducesamplesfromatargetdistributionWhensentencesizetrainingdatasize,P(tijxi;ti)isalmostaPCFGdistributionIuseaPCFGapproximationbasedontiasproposaldistributionIapplyMHtotransformproposalstoP(tijxi;ti)ToconstructaMetropolis-Hastingssampleryouneedtobeableto:IecientlysamplefromproposaldistributionIcalculateratiosofparseprobabilitiesunderproposaldistributionIcalculateratiosofparseprobabilitiesundertargetdistribution12/88CollapsedMetropolis-within-GibbssamplerforPCFGsInput:terminalstrings(x1;:::;xn),grammarrulesandDirichletpriorparameters Output:streamofsampleparsetreest=(t1;:::;tn)Algorithm:Assignparsetreestothestringssomehow(e.g.,randomly)Repeatforever:Foreachsentencexiintrainingdata:ComputerulecountsnifromtiComputeproposalgrammarprobabilitiesfromniSampleatreetfromP(tjxi;)ReplacetiwithtaccordingtoMetropolis-Hastingsaccept-rejectformula13/88Q:Arethereecientlocal-movetreesamplers?DPPCFGsa

1 / 88
下载文档,编辑使用

©2015-2020 m.777doc.com 三七文档.

备案号:鲁ICP备2024069028号-1 客服联系 QQ:2149211541

×
保存成功