Data-Driven Methods for Large-Scale Knowledge Graph Construction


Data Driven Approaches for Large-scale Knowledge Graph Construction
Yanghua Xiao
Fudan University
Knowledge Works at Fudan (kw.fudan.edu.cn)

Knowledge Graph
• A knowledge graph is a large-scale semantic network consisting of entities/concepts as well as the semantic relationships among them
• Higher coverage over entities and concepts
• Richer semantic relationships
• Usually organized as RDF
• Quality assurance by crowdsourcing
• Why knowledge graphs?
  • Understanding the semantics of text needs background knowledge
  • A robot brain needs a knowledge base to understand the world
• YAGO, WordNet, Freebase, Probase, NELL, Cyc, DBpedia, …

Data Driven vs Hand Crafted
• Manually constructed knowledge graph
  • Examples: WordNet, Cyc
  • Size: Small (huge human cost)
  • Quality: Almost perfect (each relation is checked by experts)
• Auto-constructed knowledge graph
  • Automatically extracted from a huge web corpus
  • Examples: Probase, WikiTaxonomy, etc.
  • Size: Huge (from a huge corpus)
  • Quality: Good (the accuracy can't reach 100%)
  • Because of the huge size, there are many wrong facts

Pipeline of KG construction
Extraction
• End-to-end
• Domain specific
Completion
• Collaborative filtering based completion
• Transitivity inference based completion
Correction
• Graph structure based correction
Cost: Costly human efforts
Quality: Missing data
Quality: Wrong data

Jiaqing Liang, Yanghua Xiao, et al., Probase+: Inferring Missing Links in Conceptual Taxonomies, to be published in TKDE 2017

Probase
• A web-scale taxonomy derived from web pages by Hearst linguistic patterns
• "…famous basketball players such as Michael Jordan …"
• domestic animals such as cats and dogs ...
• China is a developing country.
• Life is a box of chocolate.
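To make the "such as" examples above concrete, here is a minimal sketch of how a single Hearst pattern could turn a sentence into isA pairs. It is an illustration only and not Probase's actual extractor: the function name `extract_such_as` is hypothetical, and the code treats the whole text before "such as" as the hypernym noun phrase, where a real system would rely on a noun-phrase chunker.

```python
import re

# Separators between listed hyponyms: a comma (optionally followed by
# "and"/"or"), or a bare "and"/"or" between two items.
_LIST_SEP = re.compile(r"\s*,\s*(?:(?:and|or)\s+)?|\s+(?:and|or)\s+")

# Characters trimmed from both ends of a phrase (spaces, dots, ellipsis, quotes).
_TRIM = " .…\"“”"


def extract_such_as(sentence):
    """Return (hyponym, hypernym) pairs for the 'NP such as NP, ...' pattern.

    Simplifying assumption: everything before ' such as ' is the hypernym.
    """
    left, sep, right = sentence.partition(" such as ")
    if not sep:
        return []  # pattern not present in this sentence
    hypernym = left.strip(_TRIM)
    hyponyms = [part.strip(_TRIM) for part in _LIST_SEP.split(right)]
    return [(hypo, hypernym) for hypo in hyponyms if hypo]


if __name__ == "__main__":
    for text in [
        "…famous basketball players such as Michael Jordan …",
        "domestic animals such as cats and dogs...",
        "China is a developing country.",
    ]:
        print(text, "->", extract_such_as(text))
```

The third sentence yields no pairs here because it follows the "is a" pattern instead, which would need its own rule.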
• Probase contains 10M concepts and 16M isA relations

Hearst pattern
NP such as NP, NP, ..., and|or NP
such NP as NP,* or|and NP
NP, NP*, or other NP
NP, NP*, and other NP
NP, including NP,* or|and NP
NP, especially NP,* or|and NP

Missing isA relationships in Probase
• "car" and "automobile" are synonyms
• They should share hypernyms
• "automobile" should be a "wheelbase vehicle"
• Missing isA relations hurt the understanding of the concepts of entities
• Is Lincoln Zephyr a car?

Solution idea: CF based missing isA inference
• User-based collaborative filtering!
  • Hypernyms ↔ Items
  • Concepts ↔ Users
  • Synonyms or siblings ↔ Similar users
• Concepts with similar meanings tend to share hypernyms/hyponyms in an isA taxonomy
• To find missing hypernyms for a concept c
  • First find c's synonyms and siblings
  • Then transport their hypernyms to c
Idea: if most similar terms of c have h as a hypernym, c is likely to have the hypernym h.

Problems to be solved
• Effectiveness
  • Sparsity: How to design an effective similarity metric?
    • Noisy-or model amplifying the weak signals
  • Weight aware: How to estimate a frequency for the new isA relation?
    • Build a regression model
  • Diversity: How to select the final hypernyms?
    • Dynamically tuning k for the top-k selection
• Efficiency
  • How to reduce the quadratic complexity of pairwise similarity computation?
    • Upper-bound pruning

Results
• Recover 5.1M missing edges, with precision 87%, recall 80%.
• Probase plus has accuracy 91%
[Figures: case study; precision and recall]

Jiaqing Liang, Yi Zhang, Yanghua Xiao*, Haixun Wang, Wei Wang and Pinpin Zhu, On the Transitivity of Hypernym-hyponym Relations in Data-Driven Lexical Taxonomies (AAAI 2017)

Motivation
• We can use transitivity to find many missing isA relations
  • Example 1
• But it is not trivial; there are wrong cases
  • Example 2 & 3
• If we can determine in which cases transitivity holds, we can generate many missing isA relations
• There are some examples where a isA c is found as a missing isA relation

… human-crafted taxonomies, transitivity in a lexical taxonomy is taken for granted, that is, given hyponym(A, B) and hyponym(B, C), we know hyponym(A, C) (Sang 2007), as shown in Example 1. Transitivity is thus one of the cornerstones in knowledge-based inferencing, and many applications rely on transitivity (e.g., finding all the superconcepts of an instance).

Example 1 Is Einstein a scientist?
hyponym(einstein, physicist), hyponym(physicist, scientist) ⇒ hyponym(einstein, scientist)

Unfortunately, transitivity does not always hold in data-driven lexical taxonomies. Let us consider the following two examples:

Example 2 Is Einstein a profession?
hyponym(einstein, scientist), hyponym(scientist, profession) ⇏ hyponym(einstein, profession)

Example 3 Is a car seat a piece of furniture?
hyponym(car seat, chair), hyponym(chair, furniture) ⇏ hyponym(car seat, furniture)

It is obvious that Einstein is not a profession. However, in a data-driven lexical taxonomy such as Probase, we have strong evidence for hyponym(einstein, scientist) and hyponym(scientist, profession). If transitivity holds, we will draw a conclusion that conflicts with common sense. As for car seat and furniture, we are trapped in a similar situation. Thus, it is clear that transitivity does not always hold in d…
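The problem raised by Examples 1 to 3 can be reproduced with a few lines of code. The sketch below applies transitivity blindly to the isA edges quoted in the examples; it is not the method of the AAAI 2017 paper, and the function name `naive_transitive_isa` is a hypothetical helper introduced for illustration.

```python
from collections import defaultdict


def naive_transitive_isa(edges):
    """Chain isA edges blindly and return the newly inferred (hyponym, hypernym) pairs."""
    parents = defaultdict(set)
    for hypo, hyper in edges:
        parents[hypo].add(hyper)

    inferred = set()
    for start in list(parents):
        # Depth-first walk over all ancestors of `start`.
        stack = list(parents[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(parents.get(node, ()))
        # Keep only ancestors that were not already direct hypernyms.
        for ancestor in seen:
            if ancestor != start and ancestor not in parents[start]:
                inferred.add((start, ancestor))
    return inferred


# Edges taken from Examples 1 to 3.
edges = [
    ("einstein", "physicist"),
    ("physicist", "scientist"),
    ("scientist", "profession"),
    ("car seat", "chair"),
    ("chair", "furniture"),
]

if __name__ == "__main__":
    for pair in sorted(naive_transitive_isa(edges)):
        print(pair)
```

The output contains the correct pair (einstein, scientist) alongside (einstein, profession) and (car seat, furniture), so a blind closure cannot separate Example 1 from Examples 2 and 3. Deciding when transitivity holds needs extra evidence beyond the edges themselves.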
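The collaborative-filtering idea from the Probase+ slides (concepts as users, hypernyms as items) can also be sketched in code. Everything concrete below is an assumption made for illustration: the toy taxonomy, Jaccard similarity over hypernym sets, and a noisy-or combination of neighbor votes. The slides name a noisy-or model, a regression model, and a dynamic top-k, but do not give their formulas, so this is not the Probase+ implementation.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_hypernyms(concept, hypernyms, k=3):
    """Rank hypernyms that `concept` lacks but similar concepts have.

    Each similar concept s that has hypernym h votes with weight sim(concept, s).
    Votes are combined with noisy-or: score(h) = 1 - prod(1 - sim), so several
    weak signals add up to a stronger one.
    """
    own = hypernyms[concept]
    miss_prob = {}  # candidate hypernym -> product of (1 - sim) over its voters
    for other, theirs in hypernyms.items():
        if other == concept:
            continue
        sim = jaccard(own, theirs)
        if sim == 0.0:
            continue  # unrelated concepts do not vote
        for candidate in theirs - own:
            miss_prob[candidate] = miss_prob.get(candidate, 1.0) * (1.0 - sim)
    scored = [(h, 1.0 - p) for h, p in miss_prob.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


# Toy taxonomy (hypothetical data): concept -> set of known hypernyms.
toy_taxonomy = {
    "automobile": {"vehicle", "product"},
    "car": {"vehicle", "product", "wheelbase vehicle"},
    "truck": {"vehicle", "wheelbase vehicle"},
    "apple": {"fruit", "product"},
}

if __name__ == "__main__":
    print(suggest_hypernyms("automobile", toy_taxonomy))
```

In this toy data "wheelbase vehicle" ranks first for "automobile" because both "car" and "truck" vote for it. The pairwise loop is quadratic in the number of concepts, which is the cost that the upper-bound pruning mentioned in the slides is meant to reduce.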
