Published by Mavis Garrett. Modified over 9 years ago.
1
YAGO
  Reporter: Qi Liu
  Graph Data Management Lab, School of Computer Science, GDM@FUDAN
  http://gdm.fudan.edu.cn
  Email: zerup123@gmail.com
2
What is YAGO?
  A semantic web
  A knowledge base
  A combination of WordNet and Wikipedia
3
Semantic web
  Advocated by the W3C (World Wide Web Consortium)
  Aimed at reconstructing the WWW
  A standard framework: RDF (Resource Description Framework)
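As a rough illustration of the RDF data model named on this slide, the sketch below stores statements as (subject, predicate, object) triples and answers simple pattern queries. The example statements and the helper name match are assumptions made for illustration; they are not taken from the slides or from any RDF library.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Illustrative statements only; not copied from a real dataset.
TRIPLES: List[Triple] = [
    ("AlbertEinstein", "type", "physicist"),
    ("physicist", "subClassOf", "scientist"),
    ("AlbertEinstein", "bornInYear", "1879"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that agrees with the pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o):
            yield t
```

Calling list(match(TRIPLES, s="AlbertEinstein")) returns the two statements whose subject is AlbertEinstein.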
5
Knowledge base
  What it is: a special database for knowledge management
  What it does: provides a means for collecting, organising, searching and utilising information
  Three types:
    Machine-readable knowledge bases (DBpedia)
    Human-readable knowledge bases (Wikipedia)
    Knowledge base analysis and design
7
WordNet
  What it is: a lexical database for English, developed since 1985
  What it does:
    Groups words into synsets (sets of synonyms)
    Provides short, general definitions
    Records the semantic relations between these synsets
  25 basic noun groups and 15 verb groups
8
Key Concepts
  Ontology vs. taxonomy
  Lexicon: the bridge between a language and the knowledge expressed in that language
    Syntactic (there vs. their)
    Semantic (sight vs. site)
    Pragmatic (infer vs. imply)
9
Figure 1: Hierarchy of top-level categories in the KR ontology
  See also http://www.jfsowa.com/ontology/toplevel.htm
10
Semantics of YAGO
  Five relations: Domain, Range, subRelationOf, Type, subClassOf
  Entities: Domain, Relation, Range, Literal......
11
Axiomatic rules
12
Reasoning rules
  Correctness and completeness
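The rules themselves did not survive in this transcript. As a hedged sketch of how rule-based reasoning over the five relations from the "Semantics of YAGO" slide could work, the code below applies a small set of assumed rules repeatedly until no new fact appears. The rule set, the lower-case relation names and the function name closure are assumptions chosen for illustration and may differ from the rules shown in the original slides.

```python
def closure(facts):
    """Return all facts derivable from `facts` under the assumed rules below."""
    derived = set(facts)
    while True:
        new = set()
        for (x, r, y) in derived:
            for (a, r2, b) in derived:
                # Assumed rule 1: subClassOf and subRelationOf are transitive.
                if r == r2 and r in ("subClassOf", "subRelationOf") and y == a:
                    new.add((x, r, b))
                # Assumed rule 2: an instance of a class is an instance of its superclasses.
                if r == "type" and r2 == "subClassOf" and y == a:
                    new.add((x, "type", b))
                # Assumed rule 3: a fact of a sub-relation also holds for the super-relation.
                if r2 == "subRelationOf" and a == r:
                    new.add((x, b, y))
                # Assumed rule 4: the domain and range of a relation type its arguments.
                if r2 == "domain" and a == r:
                    new.add((x, "type", b))
                if r2 == "range" and a == r:
                    new.add((y, "type", b))
        if new <= derived:
            # Fixed point reached: nothing new can be derived.
            return derived
        derived |= new


# Illustrative input facts, not taken from the slides.
FACTS = {
    ("Einstein", "type", "physicist"),
    ("physicist", "subClassOf", "scientist"),
    ("scientist", "subClassOf", "person"),
    ("Albert", "givenNameOf", "Einstein"),
    ("givenNameOf", "subRelationOf", "means"),
    ("means", "range", "entity"),
}
```

The loop terminates because every derived fact reuses terms that already occur in the input, so the set of possible facts is finite.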
13
The YAGO system
  Knowledge extraction
  YAGO storage
  Enriching YAGO
14
Knowledge extraction
  TYPE relation
  SUBCLASSOF relation
  MEANS relation
  Other relations
  Meta-relations
15
TYPE relation extraction
  The Wikipedia category system
    Types: conceptual, administrative, relational, thematic
  Identifying conceptual categories
    Conceptual categories -> TYPE
    Administrative and relational ones: excluded by hand
    Apply shallow linguistic parsing (Noun Group Parser) to the remaining two kinds of categories
    E.g. "Naturalized citizens of United States"
  Domain and range are extracted at the same time
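A minimal sketch of the idea behind shallow noun-group parsing of a category name: split the name into pre-modifier, head and post-modifier, then treat the category as conceptual when the head looks like a plural noun. The preposition list, the naive plural test, the small irregular-plural table and both function names are simplifying assumptions; this is not the Noun Group Parser named on the slide.

```python
PREPOSITIONS = {"of", "in", "from", "by", "at", "for", "on"}  # assumed, incomplete
IRREGULAR_PLURALS = {"people", "children", "men", "women"}    # assumed, incomplete


def parse_category(name):
    """Split a category name into (pre_modifier, head, post_modifier)."""
    words = name.split()
    # The post-modifier starts at the first preposition, if there is one.
    cut = next((i for i, w in enumerate(words) if w.lower() in PREPOSITIONS), len(words))
    noun_group, post = words[:cut], words[cut:]
    if not noun_group:
        return "", "", " ".join(post)
    return " ".join(noun_group[:-1]), noun_group[-1], " ".join(post)


def looks_conceptual(name):
    """Naive heuristic: a plural head suggests a conceptual category."""
    _, head, _ = parse_category(name)
    head = head.lower()
    return head in IRREGULAR_PLURALS or (head.endswith("s") and not head.endswith("ss"))
```

For the slide's example, parse_category("Naturalized citizens of United States") yields the head "citizens", which the heuristic accepts as plural. A suffix test like this misclassifies singular words ending in "s", so a real system would need a proper morphological check.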
16
SUBCLASSOF relation extraction
  Wikipedia categories
    Form a DAG (directed acyclic graph)
    Reflect merely the thematic structure
    Use only the leaf categories of Wikipedia
  Integrating WordNet synsets
    Match, or prefer WordNet
  Establishing subClassOf
    E.g. "American people in Japan"
  Exceptions
    Corrected manually
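One way to picture the subClassOf step is the sketch below: given the pre-modifier and head of a leaf category, look up first the modified head and then the bare head in a lexical resource, and link the category to the first match. TOY_WORDNET is a two-entry stand-in for WordNet with made-up class labels, SINGULAR is an assumed singularisation table, and superclass_for is a hypothetical helper name.

```python
# Toy stand-in for a WordNet noun lookup: lemma -> class label (assumed data).
TOY_WORDNET = {"person": "person.n.01", "city": "city.n.01"}

# Tiny assumed singularisation table; a real system would use a stemmer.
SINGULAR = {"people": "person", "cities": "city"}


def superclass_for(category, pre_modifier, head):
    """Return (category, 'subClassOf', wordnet_class), or None if nothing matches."""
    head_lemma = SINGULAR.get(head.lower(), head.lower())
    # Prefer the more specific candidate (modifier + head) over the bare head.
    candidates = [f"{pre_modifier} {head_lemma}".strip().lower(), head_lemma]
    for lemma in candidates:
        if lemma in TOY_WORDNET:
            return (category, "subClassOf", TOY_WORDNET[lemma])
    return None
```

With the slide's example, superclass_for("American people in Japan", "American", "people") falls back to the bare head and links the category to the toy class for "person".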
17
MEANS relation extraction
  Exploiting WordNet synsets
    A synset: {urban center, metropolis, city}
    Attach a class for the synset "city"
  Exploiting Wikipedia redirects
    A search for "Einstein, Albert" is redirected to "Albert Einstein"
  Parsing person names
    givenNameOf subRelationOf means
    familyNameOf subRelationOf means
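The redirect and name-parsing ideas on this slide can be sketched as follows. The rule "first token is the given name, last token is the family name" is a deliberate simplification, and the function names are hypothetical.

```python
def name_facts(full_name):
    """Derive givenNameOf / familyNameOf facts from a person's full name."""
    parts = full_name.split()
    facts = []
    if len(parts) >= 2:
        # Assumption: first token is the given name, last token the family name.
        facts.append((parts[0], "givenNameOf", full_name))
        facts.append((parts[-1], "familyNameOf", full_name))
    return facts


def redirect_facts(redirects):
    """Turn redirect pairs (source title -> target title) into means facts."""
    return [(source, "means", target) for source, target in redirects.items()]
```

Because givenNameOf and familyNameOf are declared as sub-relations of means, each of these facts also implies a means fact for the same pair.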
18
Other relations extraction
  BornInYear & DiedInYear
  EstablishedIn & LocatedIn
  WrittenInYear
  PoliticianOf
  HasWonPrize
  Filtering the results
19
Meta-relations extraction
  Descriptions
    Individual DESCRIBES URL
  Witness
    Fact FoundIn URL (of its witness page)
    ExtractedBy
  Context
    Linkages between A and B: A Context B
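Witness relations such as FoundIn take a fact as their subject, which requires facts to have identifiers of their own. A minimal sketch of that idea follows; the FactStore class, the "#n" identifier format, the example URL and the extractor name are all hypothetical.

```python
from itertools import count


class FactStore:
    """Facts get identifiers so that other facts can refer to them."""

    def __init__(self):
        self._ids = count(1)
        self.facts = {}  # fact id -> (subject, relation, object)

    def add(self, s, r, o):
        """Store a fact and return its newly assigned identifier."""
        fact_id = f"#{next(self._ids)}"
        self.facts[fact_id] = (s, r, o)
        return fact_id


store = FactStore()
f1 = store.add("AlbertEinstein", "BornInYear", "1879")
# Hypothetical witness URL and extractor name, for illustration only.
f2 = store.add(f1, "FoundIn", "http://example.org/witness-page")
f3 = store.add(f1, "ExtractedBy", "CategoryExtractor")
```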
22
YAGO storage
  The model is independent of storage
  Storage: text files, XML, database tables, RDF
23
Enriching YAGO
  Add the fact (x, r, y)
  Map x and y to existing entities (word sense disambiguation)
    If the mapping fails, add a new entity
  Map r to the YAGO ontology
    If the mapping succeeds, add a FoundIn relation
    If the mapping fails, add a new fact
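The enrichment steps can be sketched as below. The slide is terse, so the code reads it in one particular way, which is an assumption: a relation that is not in the ontology is rejected, unknown entities are added, a fact that is already present only gains an extra FoundIn witness, and anything else is stored as a new fact. The dictionary layout of kb, the resolve callback (standing in for word sense disambiguation) and the URLs are hypothetical.

```python
def add_fact(kb, x, r, y, source_url, resolve=lambda name: name):
    """Add (x, r, y) to kb and record source_url as a witness.

    kb holds 'entities' (set), 'relations' (set) and
    'facts' (dict mapping a fact to the set of URLs it was found in).
    Returns True if the fact is new, False if only a witness was added.
    """
    if r not in kb["relations"]:
        raise ValueError(f"unknown relation: {r}")
    # Stand-in for word sense disambiguation: map names to entity identifiers.
    x, y = resolve(x), resolve(y)
    for entity in (x, y):
        kb["entities"].add(entity)  # adds the entity only if it is not yet known
    fact = (x, r, y)
    is_new = fact not in kb["facts"]
    # Known fact: just one more FoundIn witness. Unknown fact: create it.
    kb["facts"].setdefault(fact, set()).add(source_url)
    return is_new


kb = {"entities": {"AlbertEinstein"}, "relations": {"bornIn"}, "facts": {}}
first = add_fact(kb, "AlbertEinstein", "bornIn", "Ulm", "http://example.org/a")
second = add_fact(kb, "AlbertEinstein", "bornIn", "Ulm", "http://example.org/b")
```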
24
Summary on YAGO1
  1M entities & 5M facts
  Accuracy around 95%
26
YAGO2: In Time, Space and Many Languages
  YAGO: about 100 manually defined relations
  The YAGO2 architecture is built on four kinds of rules:
    Factual rules
      E.g. exceptions, definitions of all relations, domains, ranges and classes
    Implication rules
      Infer new facts from the facts in the database
    Replacement rules
      Normalize numbers, tags and other formats
    Extraction rules
      Extract facts from a given source text
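To make the last two rule kinds concrete, here is a small sketch in which replacement rules clean up the text and extraction rules then pull facts out of it. The regular expressions, the sentence pattern and the function names are assumptions for illustration; they are not YAGO2's actual rule syntax.

```python
import re

# Assumed replacement rules: (pattern, replacement), applied in order.
REPLACEMENT_RULES = [
    (re.compile(r"(?<=\d),(?=\d{3}\b)"), ""),  # 1,234,567 -> 1234567
    (re.compile(r"<[^>]+>"), ""),               # strip markup tags
]

# Assumed extraction rule: a pattern with named groups and the relation it yields.
EXTRACTION_RULES = [
    (re.compile(r"(?P<s>[A-Z][\w ]+?) was born in (?P<o>\d{4})"), "BornInYear"),
]


def normalize(text):
    """Apply every replacement rule to the text, in order."""
    for pattern, replacement in REPLACEMENT_RULES:
        text = pattern.sub(replacement, text)
    return text


def extract(text):
    """Normalize the text, then collect one fact per extraction-rule match."""
    text = normalize(text)
    facts = []
    for pattern, relation in EXTRACTION_RULES:
        for m in pattern.finditer(text):
            facts.append((m.group("s"), relation, m.group("o")))
    return facts
```

Running the replacement rules first keeps the extraction patterns simple, since they never need to account for markup or number formatting.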
27
Temporal Dimension
  People: wasBornOnDate & diedOnDate
  Groups: wasCreatedOnDate & wasDestroyedOnDate
  Artifacts (buildings, songs, cities): [same as above]
  Events: startedOnDate & endedOnDate
  => startExistingOnDate & endExistingOnDate
  Facts: entities in a fact
  => subjectStartRelation & objectStartRelation
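The mapping from entity-specific date relations to the two generic existence relations can be sketched as a lookup. The grouping of relations into START and END follows the slide; the function name existence_span and the ISO date strings are assumptions.

```python
# Relations from the slide that mark the start or the end of an entity's existence.
START = {"wasBornOnDate", "wasCreatedOnDate", "startedOnDate"}
END = {"diedOnDate", "wasDestroyedOnDate", "endedOnDate"}


def existence_span(entity, facts):
    """Return (start, end) dates for an entity; None where no date is known."""
    start = end = None
    for s, r, o in facts:
        if s != entity:
            continue
        if r in START:
            start = o
        elif r in END:
            end = o
    return start, end


FACTS = [
    ("AlbertEinstein", "wasBornOnDate", "1879-03-14"),
    ("AlbertEinstein", "diedOnDate", "1955-04-18"),
    ("AlbertEinstein", "type", "physicist"),
]
```

Here existence_span("AlbertEinstein", FACTS) gives the pair that startExistingOnDate and endExistingOnDate would hold for that entity.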
28
Geo-spatial Dimension
  All physical objects have a location in space
  Define it with geographical coordinates, i.e. latitude and longitude
  => yagoGeoCoordinates, hasGeoCoordinates
  Two sources: Wikipedia, GeoNames
  locatedIn & hasGeoCoordinates &
29
Textual Dimension
  hasWikipediaAnchorText
  hasWikipediaCategory
  hasCitationTitle
  subClassOf hasContext
  Integrating UWN (Universal WordNet) to include 200 languages