Knowledge Graph Management


Graph Data Management Lab at Fudan University (GDM@FUDAN)
Yunhaihui (云海会), Shanghai, SAP Research Institute, 2012-11-20
Email: shawyh@fudan.edu.cn

Managing and Mining Knowledge Graphs: Challenges and Opportunities
Yanghua Xiao (肖仰华), Fudan University
GDM@FUDAN http://gdm.fudan.edu.cn

Copyrights
• Fabian Suchanek & Gerhard Weikum, Knowledge Harvesting in the Big Data Era, SIGMOD 2013 Tutorials.
• Bin Shao, Haixun Wang, Yanghua Xiao, Managing and Mining Large Graphs: Systems and Implementations, SIGMOD 2012 Tutorials.
• Yanghua Xiao, Data Fusion and Management for Knowledge Graphs, NSFC Key Program workshop on "Theory and Technology of Network Data Fusion", Suzhou, 2013-8-14.
• Yanghua Xiao, Chinese Knowledge Graph, Italk Salon, Shanghai, 2012.

Outline
• Preliminaries
• Opportunities
• Managing big knowledge graph
• Building big knowledge graph

What is a knowledge graph?
A knowledge graph contains entities and concepts as vertices and semantic relationships as edges.
Chinese Knowledge Graph

What makes a knowledge graph different?
• Ontology
  – Domain dependent
  – Small scale
  – Edited by humans
• Semantic network
  – Focuses on concepts instead of entities
  – Low coverage
• Knowledge graph
  – Large scale
  – Covers both entities and concepts
  – Covers different semantic relationships
  – Automatically harvested from the Web or other large-scale corpora

Web of Data: RDF, Tables, Microdata
60 billion SPO triples (RDF) and growing
Image source: http://richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.png
Projects shown: Cyc, TextRunner/ReVerb, WikiTaxonomy/WikiNet, SUMO, ConceptNet 5, BabelNet, ReadTheWeb

• 10M entities in 350K classes
• 120M facts for 100 relations
• 100 languages
• 95% accuracy

• 4M entities in 250 classes
• 500M facts for 6000 properties
• live updates

• 25M entities in 2000 topics
• 100M facts for 4000 properties
• powers Google knowledge graph

Example SPO triples:
Ennio_Morricone   type            composer
Ennio_Morricone   type            GrammyAwardWinner
composer          subclassOf      musician
Ennio_Morricone   bornIn          Rome
Rome              locatedIn       Italy
Ennio_Morricone   created         Ecstasy_of_Gold
Ennio_Morricone   wroteMusicFor   The_Good,_the_Bad_,and_the_Ugly
Sergio_Leone      directed        The_Good,_the_Bad_,and_the_Ugly

Some Publicly Available Knowledge Bases
YAGO: yago-knowledge.org
DBpedia: dbpedia.org
Freebase: freebase.com
Entitycube: research.microsoft.com/en-us/projects/entitycube/
NELL: rtw.ml.cmu.edu
DeepDive: research.cs.wisc.edu/hazy/demos/deepdive/index.php/Steve_Irwin
Probase: research.microsoft.com/en-us/projects/probase/
KnowItAll / ReVerb: openie.cs.washington.edu, reverb.cs.washington.edu
PATTY: -inf.mpg.de/yago-naga/patty/
BabelNet: lcl.uniroma1.it/babelnet
WikiNet: -its.org/english/research/nlp/download/wikinet.php
ConceptNet: conceptnet5.media.mit.edu
WordNet: wordnet.princeton.edu
Linked Open Data: linkeddata.org

Google Knowledge Graph
• Source
  – CIA Factbook
  – Freebase
  – Wiki
• Current status
  – 500 million entities and more than 3.5 billion facts

• Capture concepts in the human mind
• Represent them in a computable form
• Transform them to machines
• Machines gain a better understanding of the human world
• More than 2.7 million concepts automatically harnessed from 1.68 billion documents
• Computation/reasoning enabled by scoring:
  – Consensus: e.g., is there a company called Apple?
  – Typicality: e.g., how likely are you to think of Apple when you think about companies?
  – Ambiguity: e.g., does the word Apple, sans any context, represent Apple the com
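The Ennio_Morricone facts in the slides are subject-predicate-object (SPO) triples, which matches the definition given earlier: entities and concepts are vertices, semantic relationships are labeled edges. The following is a minimal sketch of that idea, not the storage scheme of any of the systems listed. The class name `TripleGraph` and its methods `objects` and `types_of` are hypothetical names introduced only for illustration; the triples themselves are copied from the slide.

```python
from collections import defaultdict

# Facts copied from the slide as (subject, predicate, object) triples.
TRIPLES = [
    ("Ennio_Morricone", "type", "composer"),
    ("Ennio_Morricone", "type", "GrammyAwardWinner"),
    ("composer", "subclassOf", "musician"),
    ("Ennio_Morricone", "bornIn", "Rome"),
    ("Rome", "locatedIn", "Italy"),
    ("Ennio_Morricone", "created", "Ecstasy_of_Gold"),
    ("Ennio_Morricone", "wroteMusicFor", "The_Good,_the_Bad_,and_the_Ugly"),
    ("Sergio_Leone", "directed", "The_Good,_the_Bad_,and_the_Ugly"),
]


class TripleGraph:
    """Hypothetical in-memory graph: vertices are entities or concepts,
    and each triple is a directed edge labeled with its predicate."""

    def __init__(self, triples):
        # subject -> list of (predicate, object) outgoing edges
        self.out_edges = defaultdict(list)
        for subject, predicate, obj in triples:
            self.out_edges[subject].append((predicate, obj))

    def objects(self, subject, predicate):
        """Return every object o such that (subject, predicate, o) is stored."""
        edges = self.out_edges.get(subject, [])
        return [obj for pred, obj in edges if pred == predicate]

    def types_of(self, entity):
        """Return the direct types of an entity plus every superclass
        reachable by following subclassOf edges."""
        seen = set()
        stack = self.objects(entity, "type")
        while stack:
            cls = stack.pop()
            if cls not in seen:
                seen.add(cls)
                stack.extend(self.objects(cls, "subclassOf"))
        return seen


graph = TripleGraph(TRIPLES)
print(sorted(graph.types_of("Ennio_Morricone")))
print(graph.objects("Ennio_Morricone", "bornIn"))
```

Following `subclassOf` edges is what lets the graph answer that Ennio_Morricone is a musician even though no triple states it directly. An adjacency list keyed by subject is only one possible design choice; knowledge bases at the scales quoted above need indexing and distribution that this sketch does not attempt.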
