Natural Language Processing


1 What is Natural Language Processing (NLP)
• The process of computer analysis of input provided in a human language (natural language), and conversion of this input into a useful form of representation.
• The field of NLP is primarily concerned with getting computers to perform useful and interesting tasks with human languages.
• The field of NLP is secondarily concerned with helping us come to a better understanding of human language.

2 Forms of Natural Language
• The input/output of an NLP system can be:
  – written text
  – speech
• We will mostly be concerned with written text (not speech).
• To process written text, we need:
  – lexical, syntactic, semantic knowledge about the language
  – discourse information, real world knowledge
• To process spoken language, we need everything required to process written text, and we also face the challenges of speech recognition and speech synthesis.

3 Components of NLP
• Natural Language Understanding
  – Mapping the given input in the natural language into a useful representation.
  – Different levels of analysis are required: morphological analysis, syntactic analysis, semantic analysis, discourse analysis, …
• Natural Language Generation
  – Producing output in the natural language from some internal representation.
  – Different levels of synthesis are required: deep planning (what to say), syntactic generation
• NL Understanding is much harder than NL Generation. Still, both of them are hard.

4 Why is NL Understanding hard?
• Natural language is extremely rich in form and structure, and very ambiguous.
  – How to represent meaning,
  – Which structures map to which meaning structures.
• One input can mean many different things. Ambiguity can be at different levels.
  – Lexical (word level) ambiguity -- different meanings of words
  – Syntactic ambiguity -- different ways to parse the sentence
  – Interpreting partial information -- how to interpret pronouns
  – Contextual information -- context of the sentence may affect the meaning of that sentence.
• Many inputs can mean the same thing.
• Interaction among components of the input is not clear.

5 Knowledge of Language
• Phonology – concerns how words are related to the sounds that realize them.
• Morphology – concerns how words are constructed from more basic meaning units called morphemes. A morpheme is the primitive unit of meaning in a language.
• Syntax – concerns how words can be put together
to form correct sentences and determines what structural role each word plays in the sentence and what phrases are subparts of other phrases.
• Semantics – concerns what words mean and how these meanings combine in sentences to form sentence meaning. The study of context-independent meaning.

6 Knowledge of Language (cont.)
• Pragmatics – concerns how sentences are used in different situations and how use affects the interpretation of the sentence.
• Discourse – concerns how the immediately preceding sentences affect the interpretation of the next sentence. For example, interpreting pronouns and interpreting the temporal aspects of the information.
• World Knowledge – includes general knowledge about the world. What each language user must know about the other’s beliefs and goals.

7 Ambiguity
I made her duck.
• How many different interpretations does this sentence have?
• What are the reasons for the ambiguity?
• The categories of knowledge of language can be thought of as ambiguity resolving components.
• How can each ambiguous piece be resolved?
• Does speech input make the sentence even more ambiguous?
  – Yes – deciding word boundaries

8 Ambiguity (cont.)
• Some interpretations of: I made her duck.
  1. I cooked duck for her.
  2. I cooked duck belonging to her.
  3. I created a toy duck which she owns.
  4. I caused her to quickly lower her head or body.
  5. I used magic and turned her into a duck.
• duck – morphologically and syntactically ambiguous: noun or verb.
• her – syntactically ambiguous: dative or possessive.
• make – semantically ambiguous: cook or create.
• make – syntactically ambiguous:
  – Transitive – takes a direct object. (interpretation 2)
  – Ditransitive – takes two objects. (interpretation 5)
  – Takes a direct object and a verb. (interpretation 4)

9 Ambiguity in a Turkish Sentence
• Some interpretations of: Adamı gördüm.
  1. I saw the man.
  2. I saw my island.
  3. I visited my island.
  4. I bribed the man.
• Morphological Ambiguity:
  – ada-m-ı  ada+P1SG+ACC
  – adam-ı  adam+ACC
• Semantic Ambiguity:
  – gör  to see
  – gör  to visit
  – gör  to bribe

10 Resolve Ambiguities
• We will introduce models and algorithms to resolve ambiguities at different levels.
• part-of-speech tagging -- Deciding whether duck is verb or noun.
• word-sense disambiguation -- Deciding whether make is create or cook.
• lexical disambiguation -- Resolving part-of-speech ambiguities and resolving word-sense ambiguities are two important kinds of lexical disambiguation.
• syntactic ambiguity -- her duck is an example of syntactic ambiguity, and can be addressed by probabilistic parsing.

11 Resolve Ambiguities (cont.)
[Figure: two parse trees for "I made her duck"]
  – [S [NP I] [VP [V made] [NP her] [NP duck]]]
  – [S [NP I] [VP [V made] [NP [DET her] [N duck]]]]

12 Models to Represent Linguistic Knowledge
• We will use certain formalisms (models) to represent the required linguistic knowledge.
• State Machines -- FSAs, FSTs, HMMs, ATNs, RTNs
• Formal Rule Systems -- Context Free Grammars, Unification Grammars, Probabilistic CFGs.
• Logic-based Formalisms -- first order predicate logic, some higher order logic.
• Models of Uncertainty -- Bayesian probability theory.

13 Algorithms to Manipulate Linguistic Knowledge
• We will use algorithms to manipulate the models of linguistic knowledge to produce the desired behavior.
• Most of the algorithms we will study are transducers and parsers.
  – These algorithms construct some structure based on their input.
• Since the language is ambiguous at all levels, these algorithms are never simple processes.
• Most of the algorithms that will be used fall into the following categories.
  – state space search
  – dynamic programming

14 Language and Intel
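
Returning to slides 11 and 13: the two parse trees for "I made her duck" and the mention of dynamic programming can be tied together with a small sketch. The code is a minimal CKY-style chart parser that counts parses, and it is not taken from the slides. The grammar, the lexicon, the helper category `VX` (used only to binarize the ditransitive rule VP -> V NP NP), and the function name `count_parses` are all assumptions made for illustration. The toy grammar covers only the two structures shown on slide 11, not all five interpretations from slide 8.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not from the slides).
# Each rule is (parent, left_child, right_child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),    # transitive: made [her duck]
    ("VP", "VX", "NP"),   # ditransitive, binarized: [made her] [duck]
    ("VX", "V", "NP"),    # VX is an illustrative helper category
    ("NP", "DET", "N"),   # possessive reading: [her duck]
]

# Hypothetical lexicon: "her" is a pronoun (NP) or a determiner (DET),
# "duck" is a noun (N) or a bare noun phrase (NP).
LEXICON = {
    "I": ["NP"],
    "made": ["V"],
    "her": ["NP", "DET"],
    "duck": ["N", "NP"],
}


def count_parses(words, start="S"):
    """Count the parse trees rooted in `start` using a CKY chart."""
    n = len(words)
    # chart[(i, j)][category] = number of ways to derive words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICON.get(word, []):
            chart[(i, i + 1)][category] += 1
    # Fill longer spans from shorter ones (dynamic programming).
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                for parent, b, c in BINARY_RULES:
                    ways = left.get(b, 0) * right.get(c, 0)
                    if ways:
                        chart[(i, j)][parent] += ways
    return chart[(0, n)].get(start, 0)


print(count_parses("I made her duck".split()))  # prints 2
```

Each chart cell is computed once from smaller cells, so the number of parses is obtained without enumerating the trees one by one. A probabilistic parser of the kind mentioned on slide 10 would store rule probabilities in the chart instead of counts.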
