Natural Language Processing Courseware P16


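Introduction to Natural Language Processing
2006.11

Slide 4. Natural Language Processing (NLP)

Slide 5. Terms:
- Chinese Information Processing
- Natural Language Understanding
- Computational Linguistics
- Human Language Technology

Slide 9. GB2312-80: 6763 characters. Coverage by N:

    N       Coverage        N       Coverage
    1       4%              2048    98%
    20      16.7%           3072    99.7%
    32      21%             3838    99.9%
    300     65%             5177    99.99%
    600     81%             6209    99.993%

Slide 15. Language types:
- inflecting language
- analytic language
Example sentences:
- All professors came here.
- Even Professor Zhang came here.
- Editing is very difficult.
- How to become a good editor?

Slide 20. NLP timeline (period labels only): 1950s; 1950s to 1960s; 1970s to 1980s; 1980s

Slide 22. Difficulties:
- Ambiguity
- Ill-Formedness

Slide 23. Readings: (le4), (yue4)

Slide 25. Unknown Words (Out-of-Vocabulary Words)
- My car drinks gasoline like water.

Slide 34. NLP approaches:
- Knowledge Acquisition
- Hybrid Approach

Slide 37. Books:
- Christopher D. Manning, Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
- Daniel Jurafsky, James H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice Hall Press, 2000.

Slide 39. Character encoding:
- ISO/IEC 10646 / Unicode
- Summary, Explanations, And Remarks: GB18030-2000, from Unicode (email: dmeyer@adobe.com)

[Next slide deck; title not recovered]

Slide 4. Probability
For an event $A$ in the sample space $\Omega$:
1. $P(A) \ge 0$
2. $P(\Omega) = 1$
3. For pairwise disjoint events ($A_i \cap A_j = \emptyset$ for $i \ne j$): $P\left(\bigcup_{i=0}^{\infty} A_i\right) = \sum_{i=0}^{\infty} P(A_i)$

Slide 5. Maximum likelihood estimation (MLE)
Let the sample space be $\{s_1, s_2, \ldots, s_n\}$ and let $n_N(s_k)$ be the number of times $s_k$ ($1 \le k \le n$) occurs in $N$ trials. The relative frequency is
$q_N(s_k) = \frac{n_N(s_k)}{N}$, with $\sum_{k=1}^{n} n_N(s_k) = N$ and $\sum_{k=1}^{n} q_N(s_k) = 1$.
As $N$ grows, $q_N(s_k)$ approaches the probability $P(s_k)$:
$\lim_{N \to \infty} q_N(s_k) = P(s_k)$

Slide 6. Relative frequencies of 20 items (labels missing), in descending order:
0.040855, 0.013994, 0.011758, 0.010175, 0.009034,
0.008470, 0.008356, 0.007297, 0.006821, 0.006557,
0.006012, 0.005857, 0.005720, 0.005705, 0.005488,
0.005406, 0.005172, 0.005117, 0.004824, 0.004685

Slide 7. Conditional probability
For events $A$ and $B$ in $\Omega$ with $P(B) > 0$:
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
In general, $P(A \mid B) \ne P(A)$.

Slide 9. Total probability
Let $B_1, B_2, \ldots, B_n$ be a partition of $\Omega$ with $P(B_i) > 0$ for $i = 1, 2, \ldots, n$. Then for any event $A$ in $\Omega$:
$P(A) = P\left(\bigcup_{i=1}^{n} A B_i\right) = \sum_{i=1}^{n} P(A B_i) = \sum_{i=1}^{n} P(B_i) P(A \mid B_i)$

Slide 10. Bayes' Theorem
With the same partition, $P(A) > 0$ and $P(B_i) > 0$ for $i = 1, 2, \ldots, n$:
$P(B_i \mid A) = \frac{P(B_i) P(A \mid B_i)}{\sum_{j=1}^{n} P(B_j) P(A \mid B_j)}$
For $n = 1$: $P(B \mid A) = \frac{P(A \mid B) P(B)}{P(A)}$

Slide 11.
- Prior probability
- Posterior probability

Slide 12. Given $A$, choose the $S$ that maximizes $P(S \mid A)$:
$\hat{S} = \arg\max_S P(S \mid A) = \arg\max_S \frac{P(S) P(A \mid S)}{P(A)} = \arg\max_S P(S) P(A \mid S)$
$P(A)$ does not depend on $S$, so only $P(A \mid S)$ and $P(S)$ are needed.

Slides 13-14. Example: an event $G$ occurs once in 100,000 cases. A test $T$ is positive with probability 0.95 when $G$ holds and with probability 0.005 when it does not.
$P(G) = \frac{1}{100000} = 0.00001$, $P(\bar{G}) = \frac{100000 - 1}{100000} = 0.99999$
$P(T \mid G) = 0.95$, $P(T \mid \bar{G}) = 0.005$
$P(G \mid T) = \frac{P(T \mid G) P(G)}{P(T \mid G) P(G) + P(T \mid \bar{G}) P(\bar{G})} = \frac{0.95 \times 0.00001}{0.95 \times 0.00001 + 0.005 \times 0.99999} \approx 0.002$

Slide 15. Binomial distribution
In $n$ independent trials where event $A$ occurs with probability $p$, let $X$ be the number of times $A$ occurs, $X = 0, 1, \ldots, n$:
$p_r = C_n^r p^r (1 - p)^{n - r}$, where $C_n^r = \frac{n!}{(n - r)!\, r!}$ and $0 \le r \le n$
Written $X \sim B(n, p)$.

Slide 16. Expectation
For a discrete random variable $X$ with $P(X = x_k) = p_k$, $k = 1, 2, \ldots$, if $\sum_{k=1}^{\infty} x_k p_k$ converges:
$E(X) = \sum_{k=1}^{\infty} x_k p_k$

Slide 17. Variance
$Var(X) = E\left((X - E(X))^2\right) = E(X^2) - E^2(X)$

Slide 18. 1948: Shannon

Slides 20-21. Entropy
Let $X$ be a discrete random variable with $p(x) = P(X = x)$, $x \in \mathcal{X}$. The entropy of $X$ is
$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x)$
with the convention $0 \log 0 = 0$. $H(X)$ is also written $H(p)$. Unit: bit.

Slide 23. 26 equally likely letters:
$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x) = -26 \times \left(\frac{1}{26} \log_2 \frac{1}{26}\right) = \log_2 26 = 4.70$

Slide 24. Figures on the slide: 438023; 4.1606

Slide 25. Letter frequencies:

    E  0.1268    L  0.0394    P  0.0186
    T  0.0978    D  0.0389    B  0.0156
    A  0.0788    U  0.0280    V  0.0102
    O  0.0776    C  0.0268    K  0.0060
    I  0.0707    F  0.0256    X  0.0016
    N  0.0706    M  0.0244    J  0.0010
    S  0.0634    W  0.0214    Q  0.0009
    R  0.0594    Y  0.0202    Z  0.0006
    H  0.0573    G  0.0187

Slide 27. Figures on the slide: 6000; 12366; 9.65

Slide 28. Joint Entropy
For $X$, $Y$ with joint distribution $p(x, y)$:
$H(X, Y) = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \log_2 p(x, y)$

Slide 29. Conditional Entropy
For $(X, Y)$ with joint distribution $p(x, y)$:
$H(Y \mid X) = \sum_{x \in \mathcal{X}} p(x) H(Y \mid X = x) = \sum_{x \in \mathcal{X}} p(x) \left[-\sum_{y \in \mathcal{Y}} p(y \mid x) \log p(y \mid x)\right] = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \log p(y \mid x)$

Slide 30. Chain rule
$H(X, Y) = H(X) + H(Y \mid X)$
$H(X_1, \ldots, X_n) = H(X_1) + H(X_2 \mid X_1) + \cdots + H(X_n \mid X_1, \ldots, X_{n-1})$

Slide 31. Proof
$H(X, Y) = -E_{p(x,y)}(\log p(x, y))$
$= -E_{p(x,y)}(\log(p(x) p(y \mid x)))$
$= -E_{p(x,y)}(\log p(x) + \log p(y \mid x))$
$= -E_{p(x)}(\log p(x)) - E_{p(x,y)}(\log p(y \mid x))$
$= H(X) + H(Y \mid X)$

Slide 32. Mutual Information
For $(X, Y)$ with joint distribution $p(x, y)$:
$I(X; Y) = H(X) - H(X \mid Y) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \log_2 \frac{p(x, y)}{p(x) p(y)}$

Slide 33. Mutual Information (continued)
Since $H(X \mid X) = 0$:
$H(X) = H(X) - H(X \mid X) = I(X; X)$

Slide 34. [Figure: relationship between $H(X)$, $H(Y)$, $H(X, Y)$, $H(X \mid Y)$, $H(Y \mid X)$ and $I(X; Y)$]

Slide 35. Relative Entropy, or Kullback-Leibler Divergence
For two distributions $p(x)$ and $q(x)$:
$D(p \| q) = \sum_{x \in \mathcal{X}} p(x) \log \frac{p(x)}{q(x)}$
with the conventions $0 \log(0/q) = 0$ and $p \log(p/0) = \infty$.
$D(p \| q) \ge 0$, and in general $D(p \| q) \ne D(q \| p)$.

Slide 36. Cross Entropy
For $X \sim p(x)$ and a distribution $q(x)$ used to approximate $p(x)$:
$H(X, q) = H(X) + D(p \| q) = -\sum_{x} p(x) \log q(x)$

Slide 37. Cross entropy of a language $L = (X_i) \sim p(x)$ and a model $q$:
$H(L, q) = -\lim_{n \to \infty} \frac{1}{n} \sum_{x_1^n} p(x_1^n) \log q(x_1^n)$
where $x_1^n = x_1, \ldots, x_n$ is a sequence from $L$, $p(x_1^n)$ is its probability in $L$, and $q(x_1^n)$ is the model's estimate for it.

Slide 38. If $L$ is stationary and ergodic:
$H(L, q) = -\lim_{n \to \infty} \frac{1}{n} \log q(x_1^n)$

Slide 39. Perplexity
For a language sample $l_1^n = l_1 \ldots l_n$ of $L$:
$PP_q = 2^{H(L, q)} \approx 2^{-\frac{1}{n} \log_2 q(l_1^n)} = [q(l_1^n)]^{-\frac{1}{n}}$

Slide 40. Noisy channel model
$I$: input, $O$: output.
$\hat{I} = \arg\max_I p(I \mid O) = \arg\max_I \frac{p(I) p(O \mid I)}{p(O)} = \arg\max_I p(I) p(O \mid I)$
Input sequence: $i_1 i_2 i_3 \ldots i_n$; output sequence: $o_1 o_2 o_3 \ldots o_n$.

NLP ... 2006.11

Slide 3. Parts of speech

Slide 4. Word formation examples:
- wide / widely, difficult / difficultly
- college degree, overtake, mad cow disease

Slide 5. Example words: the, a

Slide 7. Word order:
- I put the bagels in the freezer.
- The bagels, I put in the freezer.
- I put in the fridge the bagels (that John had given me)

Slide 8. Substitutable phrases:
    {She | The woman | The tall woman | The very tall woman | The tall woman with sad eyes}
    saw
    {him | the man | the short man | the very short man | the short man with red hair}

Slide 9. Parse tree:
    [S [NP That man] [VP [VBD caught] [NP the butterfly] [PP [IN with] [NP a net]]]]

Slide 10. Noun phrase (NP)
- The homeless old man in the park that I tried to help yesterday

Slide 11. Prepositional phrase (PP)
- In the morning, to the west, at the same place, etc.

Slide 12. Verb phrase (VP)
- Getting to school on time was a struggle.
- He was trying to keep his temper.
- That woman quickly showed me the way to hide.

Slide 13. Adjective phrase (AP)
- She is very sure of herself.
- He seemed a man who was quite certain to succeed.

Slide 14. Start symbol: 'S'

Slide 15. Rewrite rules:
    S  → NP VP
    NP → AT NNS | AT NN | NP PP
    VP → VP PP | VBD | VBD NP
    PP → I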
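As a check on the Bayes' theorem example with events G and T (slide 14 of the probability slides), the posterior can be computed directly. This is a minimal sketch; the function name `posterior` and its parameter names are illustrative and do not come from the slides.

```python
def posterior(prior, p_pos_given_event, p_pos_given_no_event):
    """Return P(G | T) using Bayes' theorem over the partition {G, not G}."""
    # Total probability of a positive test: P(T) = P(T|G)P(G) + P(T|~G)P(~G)
    evidence = p_pos_given_event * prior + p_pos_given_no_event * (1.0 - prior)
    return p_pos_given_event * prior / evidence


# Values from the slide: P(G) = 1/100000, P(T|G) = 0.95, P(T|~G) = 0.005
p_g_given_t = posterior(1 / 100_000, 0.95, 0.005)
print(round(p_g_given_t, 3))  # 0.002
```

Even with a 95% hit rate, the posterior stays near 0.002 because the prior P(G) is so small that false positives dominate the positive results.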
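The entropy definition from the information theory slides can be applied to the two letter distributions discussed there: 26 equally likely letters, and the letter frequency table from slide 25. The sketch below assumes the table values are relative frequencies; because they are rounded and sum to slightly more than 1, they are renormalized before use. The names `entropy` and `LETTER_FREQ` are illustrative.

```python
import math


def entropy(probs):
    """H(X) = -sum p(x) log2 p(x), with the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Letter frequencies copied from the slide 25 table.
LETTER_FREQ = {
    "E": 0.1268, "T": 0.0978, "A": 0.0788, "O": 0.0776, "I": 0.0707,
    "N": 0.0706, "S": 0.0634, "R": 0.0594, "H": 0.0573, "L": 0.0394,
    "D": 0.0389, "U": 0.0280, "C": 0.0268, "F": 0.0256, "M": 0.0244,
    "W": 0.0214, "Y": 0.0202, "G": 0.0187, "P": 0.0186, "B": 0.0156,
    "V": 0.0102, "K": 0.0060, "X": 0.0016, "J": 0.0010, "Q": 0.0009,
    "Z": 0.0006,
}

uniform_entropy = entropy([1 / 26] * 26)

# Renormalize so the rounded table values sum to exactly 1.
total = sum(LETTER_FREQ.values())
table_entropy = entropy([f / total for f in LETTER_FREQ.values()])

print(f"uniform: {uniform_entropy:.2f} bits")  # log2(26), about 4.70
print(f"table:   {table_entropy:.2f} bits")
```

The skewed distribution has lower entropy than the uniform one, since frequent letters such as E and T make the next letter easier to predict.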
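The relations among entropy, relative entropy, cross entropy and perplexity from slides 35 to 39 can be checked numerically on a toy distribution. This is a sketch under stated assumptions: the distributions `p` and `q` are made-up examples, and `perplexity` assumes the model assigns a probability to each token so that q(l_1^n) is the product of those per-token values.

```python
import math


def entropy(p):
    """H(X) in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def kl_divergence(p, q):
    """D(p || q) in bits; assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cross_entropy(p, q):
    """H(X, q) = -sum p(x) log2 q(x)."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def perplexity(token_probs):
    """[q(l_1^n)]^(-1/n), computed in log space to avoid underflow."""
    n = len(token_probs)
    log_q = sum(math.log2(q) for q in token_probs)
    return 2 ** (-log_q / n)


p = [0.5, 0.5]    # hypothetical true distribution
q = [0.25, 0.75]  # hypothetical model distribution

# H(X, q) = H(X) + D(p || q)
print(cross_entropy(p, q), entropy(p) + kl_divergence(p, q))

# D is not symmetric in general.
print(kl_divergence(p, q), kl_divergence(q, p))

# A model giving every token probability 1/4 has perplexity 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

Working in log space mirrors the slide's form 2^{-(1/n) log q(l_1^n)} and keeps the computation stable for long samples.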
