YAGO – A Core of Semantic Knowledge


1 YAGO – A Core of Semantic Knowledge
Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum (Max-Planck Institute for Computer Science, Saarbrücken, Germany)

2 Overview
- Motivation
- The Yago ontology
- Content
- Model
- Extension
- Conclusion

3 The Truth about Elvis
Elvis is alive!

4 The Truth about Elvis
Elvis is alive! He works as an astronaut in NASA's special security program.

5 Usual solution
Query: Which NASA astronaut was born when Elvis was born?
Yields only rubbish. Reasons:
1. Google participates in the conspiracy
2. Google does not search knowledge, but Web sites

6 Solution: An ontology
[Diagram: graph fragment with "born" edges pointing to 1935, an unknown node "?", and an "is a" edge to astronaut]

7 Solution: An ontology
[Diagram: the same graph extended. Nodes: 1935, "?", astronaut, person, entity, and the words "Elvis Presley" and "The King". Edge labels: born, is a, subclass, means]

8 Solution: An ontology
[Diagram: the same graph, with its parts labelled as Words ("Elvis Presley", "The King"), Individuals, Classes (astronaut, person, entity) and Relations (born, is a, subclass, means)]

9 Where do we get the ontology from?
Previous approaches:
- Assemble the ontology manually (WordNet, SUMO, GeneOntology). Problem: usually low coverage (MPI is in none of these).
- Extract the ontology from corpora, e.g. the Web (KnowItAll, Espresso, Snowball, LEILA). Problem: usually low accuracy (50%-92%).

10 Where do we get the ontology from?
YAGO approach:
- Assemble the ontology from Wikipedia (=> good coverage)
- Use the category system of Wikipedia (=> good accuracy)

11 Exploiting the Wikipedia category system
[Figure: mock Wikipedia article "Elvis Pr" filled with placeholder text]
Categories: 1935_births
Exploit relational categories: the category yields a "born" edge to 1935.

12 Exploiting the Wikipedia category system
[Figure: the same mock article]
Categories: American_singers
Exploit relational categories (born, 1935).
Exploit conceptual categories: the category yields an "is a" edge to American_singer.

13 Exploiting the Wikipedia category system
[Figure: the same mock article]
Categories: Disputed_articles
Exploit relational categories (born, 1935).
Exploit conceptual categories (is a, American_singer).
Avoid administrative categories: no "is a" edge to Disputed_article.

14 Exploiting the Wikipedia category system
[Figure: the same mock article]
Categories: Rock'n_Roll_Music
Exploit relational categories (born, 1935).
Exploit conceptual categories (is a, American_singer).
Avoid administrative categories.
Avoid thematic categories: no "is a" edge to Rock'n_Roll_Music.
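The category handling on slides 11 to 14 can be sketched roughly as follows. This is only an illustration, not the YAGO extractor: the pattern table and the blacklist are hypothetical and contain just the examples visible on the slides, and `facts_from_category` is an invented helper name.

```python
import re

# Hypothetical pattern table: the slides only show 1935_births -> born.
RELATIONAL_PATTERNS = [
    (re.compile(r"^(\d{4})_births$"), "born"),
]

# Hypothetical blacklist of administrative categories (example from slide 13).
ADMINISTRATIVE = {"Disputed_articles"}


def facts_from_category(entity, category):
    """Return the triples derived from one category name, or [] if none apply."""
    if category in ADMINISTRATIVE:
        return []  # administrative categories carry no knowledge about the entity
    for pattern, relation in RELATIONAL_PATTERNS:
        match = pattern.match(category)
        if match:
            return [(entity, relation, match.group(1))]
    return []  # conceptual vs. thematic categories need the noun phrase heuristic


print(facts_from_category("Elvis_Presley", "1935_births"))
print(facts_from_category("Elvis_Presley", "Disputed_articles"))
```

Categories that match no relational pattern are deliberately left alone here; deciding whether they are conceptual or thematic is a separate step.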

15 Thematic vs Conceptual Categories
Shallow linguistic noun phrase parsing:
  American (premodifier) | singers (head) | of German origin (postmodifier)
Heuristic: if the head is a plural word, the category is conceptual.
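A minimal sketch of the plural-head heuristic, under loud assumptions: the postmodifier is taken to start at the first preposition from a small hand-picked list, and plurality is guessed from a trailing "s". Both are crude stand-ins for a real noun phrase parser and a real plural detector, and all names below are invented for illustration.

```python
# Assumed list of prepositions that open a postmodifier (not from the slides).
PREPOSITIONS = {"of", "in", "by", "from", "for", "with", "at", "on"}


def split_category(name):
    """Split a category name into (premodifier words, head word, postmodifier words)."""
    words = name.replace("_", " ").split()
    # The postmodifier starts at the first preposition, if there is one.
    cut = next((i for i, w in enumerate(words) if w.lower() in PREPOSITIONS), len(words))
    pre_and_head, post = words[:cut], words[cut:]
    if not pre_and_head:
        return [], "", post
    return pre_and_head[:-1], pre_and_head[-1], post


def looks_plural(word):
    """Crude plural test; a real system would use a lexicon or a stemmer."""
    w = word.lower()
    return w.endswith("s") and not w.endswith("ss")


def is_conceptual(name):
    """Apply the slide's heuristic: a plural head marks a conceptual category."""
    _, head, _ = split_category(name)
    return bool(head) and looks_plural(head)


print(split_category("American singers of German origin"))
print(is_conceptual("American singers of German origin"))
print(is_conceptual("Rock'n_Roll_Music"))
```

Note that this heuristic alone would also accept Disputed_articles, which is why administrative categories have to be filtered out separately.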

16 The Upper Model
[Diagram: "born" edge to 1935 and "is a" edge to American_singer; the link from American_singer up to person and entity is marked "?"]

17 The Upper Model: From Wikipedia?
[Diagram: "born" edge to 1935 and "is a" edge to American_singer; candidate Wikipedia super-categories People_by_occupation, Business and Social_group, marked "?"]

18 The Upper Model: From WordNet?
[Diagram: "born" edge to 1935 and "is a" edge to American_singer; candidate WordNet senses Singer#1, Singer#17, ... and Person#3]

19 The Upper Model: From WordNet?
[Diagram: "born" edge to 1935 and "is a" edge to American_singers_of_Jewish_origin; candidate WordNet senses Singer#1, Singer#17, ..., Person#3 and Origin#7]

20 The YAGO ontology
[Diagram: "born" edge to 1935 and "is a" edge to American_singer; American_singer, Singer#1 and Person#3 connected by "subclass"; the words "singer" and "Elvis Presley" attached by "means"]

21 The YAGO ontology: Accuracy
Relation        Accuracy
subclass        97.70% +/- 1.59%
is a            94.54% +/- 2.36%
familyName      97.81% +/- 1.75%
givenName       97.62% +/- 2.08%
establishedIn   90.84% +/- 4.28%
bornInYear      93.14% +/- 3.71%
diedInYear      98.72% +/- 1.30%
locatedIn       98.41% +/- 1.52%
politicianOf    92.43% +/- 3.93%
writtenInYear   94.35% +/- 3.33%
hasWonPrize     98.47% +/- 1.57%

22 The YAGO ontology: Number of Facts
Ontology    Facts
KnowItAll   30,000
SUMO        60,000
WordNet     200,000
OpenCyc     300,000
Cyc         2,000,000
Yago        6,000,000
Ontologies should not be judged purely by the number of facts! This is just an informational overview.

23 The Yago Model: Why binary is not enough
(Elvis, is_a, singer)
(But only from 1953 to 1977)
(We know this from Wikipedia)

24 The Yago Model: Why binary is not enough
#1 (Elvis, is_a, singer)
#2 (#1, time, 1953-1977)
#3 (#1, source, Wikipedia)

25 The Yago model formally
A YAGO ontology over
- a set of relations R
- a set of common entities C
- a set of fact identifiers I
is a function $I \to (R \cup C \cup I) \times R \times (R \cup I \cup C)$
#1 (Elvis, is_a, singer)
#2 (#1, time, 1953-1977)
#3 (#1, source, Wikipedia)
We can talk about
- facts: (#1, source, Wikipedia)
- additional arguments: (#1, time, 1953-1977)
- relations: (time, hasRange, time_interval)
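The model of facts about facts can be illustrated with a plain mapping from fact identifiers to triples. This is a sketch of the idea only, not YAGO's storage format; the helper `annotations` is an invented name.

```python
# Each fact identifier maps to one (first argument, relation, second argument) triple.
# An argument may itself be a fact identifier, which is how facts about facts arise.
facts = {
    "#1": ("Elvis", "is_a", "singer"),
    "#2": ("#1", "time", "1953-1977"),
    "#3": ("#1", "source", "Wikipedia"),
}


def annotations(fact_id, facts):
    """Collect relation -> value for every fact whose first argument is fact_id."""
    return {rel: obj for (subj, rel, obj) in facts.values() if subj == fact_id}


print(facts["#1"], annotations("#1", facts))
```

Because every fact stays a binary triple, the time span and the source of fact #1 need no special n-ary representation.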

26 The Yago model: Logical aspects
Axioms: (x, is_a, y) ∧ (y, subclass, z) => (x, is_a, z) ...
[Diagram: singer "subclass" person, with an "is a" edge]

27 The Yago model: Logical aspects
Axioms: (x, is_a, y) ∧ (y, subclass, z) => (x, is_a, z) ...
[Diagram: three fact sets, {f1, f2, f3, f4, f5}, {f1, f2, f3} and {f1, ..., f10}, connected by arrows labelled "derive facts" and "eliminate facts"; annotated "finite, unique"]
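Deriving facts with the one axiom shown on the slide can be sketched as a fixpoint loop. The slide elides further axioms ("..."), so this sketch applies only the is_a/subclass rule; `derive_is_a` and the toy facts are illustrative assumptions.

```python
def derive_is_a(facts):
    """Apply (x, is_a, y) and (y, subclass, z) => (x, is_a, z) until nothing new appears."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        subclass_edges = [(s, o) for (s, r, o) in closure if r == "subclass"]
        for (x, r, y) in list(closure):
            if r != "is_a":
                continue
            for (sub, sup) in subclass_edges:
                derived = (x, "is_a", sup)
                if sub == y and derived not in closure:
                    closure.add(derived)
                    changed = True
    return closure


# Toy input, not taken from YAGO itself.
base = {
    ("Elvis", "is_a", "singer"),
    ("singer", "subclass", "person"),
    ("person", "subclass", "entity"),
}
for fact in sorted(derive_is_a(base)):
    print(fact)
```

The loop terminates because the set of possible triples over a finite vocabulary is finite and facts are only ever added.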

28 Extending the Ontology
Whom did Elvis marry?
Pattern: X married Y
Text: Elvis married Priscilla
Answer: Priscilla

29 Extending the Ontology
Whom did Elvis marry?
Pattern: X (subj) married Y (obj)
Text: Elvis (subj), the great rock star, married Priscilla (obj)
Answer: Priscilla, found with LEILA

30 Extending the Ontology
[Diagram: Ontology (YAGO) and Information Extraction (LEILA)]

31 The Truth about Elvis
http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/
Which astronaut was born in the same year as Elvis?
Enter your Yago Query:
  "Elvis Presley" bornInYear $year
  $astro bornInYear $year
  $astro isa astronaut
20 results

32 The Truth about Elvis
http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/
Which astronaut codenamed "Roger" was born in the same year as Elvis?
Enter your Yago Query:
  "Elvis Presley" bornInYear $year
  $astro bornInYear $year
  "Roger" givenNameOf $astro
  $astro isa astronaut
$astro = Roger_Chaffee
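How such a query of triple patterns with $-variables might be evaluated can be sketched with a naive nested-loop matcher. This is not the Yago query engine: the fact list is a hand-made toy sample, the "means" layer between words and entities is skipped for brevity, and `is_var`, `match` and `query` are invented names. The walrus operator requires Python 3.8 or later.

```python
def is_var(term):
    """Terms starting with $ are variables, as in the slide's query syntax."""
    return term.startswith("$")


def match(pattern, fact, binding):
    """Extend binding so that pattern equals fact, or return None on a clash."""
    new = dict(binding)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if new.setdefault(p, f) != f:
                return None  # variable already bound to a different value
        elif p != f:
            return None  # constant does not match
    return new


def query(patterns, facts):
    """Join all patterns left to right, keeping every consistent binding."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for fact in facts
                    if (b2 := match(pattern, fact, b)) is not None]
    return bindings


# Toy facts for illustration only.
toy_facts = [
    ("Elvis Presley", "bornInYear", "1935"),
    ("Roger_Chaffee", "bornInYear", "1935"),
    ("Roger_Chaffee", "isa", "astronaut"),
    ("Roger", "givenNameOf", "Roger_Chaffee"),
    ("Neil_Armstrong", "bornInYear", "1930"),
    ("Neil_Armstrong", "isa", "astronaut"),
]

patterns = [
    ("Elvis Presley", "bornInYear", "$year"),
    ("$astro", "bornInYear", "$year"),
    ("Roger", "givenNameOf", "$astro"),
    ("$astro", "isa", "astronaut"),
]
print(query(patterns, toy_facts))
```

A real engine would use indexes instead of scanning every fact for every pattern; the nested loop is chosen here only because it keeps the join logic visible.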

33 Conclusions
- Yago is based on a logically clean model
- Yago has an accuracy of around 95%
- Yago is 3 times larger than the largest competitor
- Elvis is alive

34 Reference
For all details, please refer to our technical report "Yago – A Core of Semantic Knowledge" (Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum), available at http://www.mpii.mpg.de/~suchanek
BibTeX:
@TECHREPORT{yagotr,
  AUTHOR = {Suchanek, Fabian and Kasneci, Gjergji and Weikum, Gerhard},
  TITLE = {Yago: A Core of Semantic Knowledge},
  TYPE = {Research Report},
  INSTITUTION = {Max-Planck-Institut f{\"u}r Informatik},
  ADDRESS = {Stuhlsatzenhausweg 85, 66123 Saarbr{\"u}cken, Germany},
  NUMBER = {MPI-I-2006-5-006},
  YEAR = {2006}
}

