Published by Bryce Norman. Modified over 9 years ago.
1
Gaby Nativ, SDBI 2007
2
Motivation Other Ontologies System overview YAGO Dive IN LEILA NAGA Conclusion
3
Which NASA astronaut was born when Elvis was born?
4
Problem: Web pages are designed to be read by people, not machines. Solution: the Semantic Web. The meaning of information and services is defined, so both people and machines can use web content.
5
Knowledge representation language. Individuals: instances or objects. Classes: concepts or types of objects. Relations: ways in which classes and objects can be related to one another. Facts: instances of a relation between individuals, classes, or relations, e.g. (Elvis Presley, isA, Singer).
6
Directed labeled multigraph G = (V, E, L_v, L_e): V is a set of vertices; E ⊆ V × V is a multiset of edges; L_v is a set of individual and class labels; L_e is a set of relation labels. With each edge we associate a confidence value.
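A minimal sketch of this graph model, assuming Python. The names `Edge`, `KnowledgeGraph`, `add_fact`, and `outgoing` are hypothetical and are not taken from YAGO or NAGA; the confidence values in the usage lines are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One labeled edge: (source, relation label, target) plus a confidence."""
    source: str
    relation: str
    target: str
    confidence: float


class KnowledgeGraph:
    """Directed labeled multigraph; parallel edges are allowed."""

    def __init__(self):
        self.vertices = set()   # V: individuals, classes, words
        self.edges = []         # E: a list, so duplicates (a multiset) are kept
        self._out = {}          # adjacency index: source -> list of edges

    def add_fact(self, source, relation, target, confidence=1.0):
        edge = Edge(source, relation, target, confidence)
        self.vertices.update((source, target))
        self.edges.append(edge)
        self._out.setdefault(source, []).append(edge)
        return edge

    def outgoing(self, source, relation=None):
        """Edges leaving `source`, optionally filtered by relation label."""
        return [e for e in self._out.get(source, [])
                if relation is None or e.relation == relation]


# Illustrative usage (confidence values are made up for the example).
g = KnowledgeGraph()
g.add_fact("Elvis_Presley", "type", "singer", 0.93)
g.add_fact("Elvis_Presley", "born", "1935", 0.98)
for e in g.outgoing("Elvis_Presley"):
    print(e.relation, e.target, e.confidence)
```

Storing edges in a list rather than a set is what makes this a multigraph: the same (source, relation, target) triple may appear several times, for instance once per extraction source.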
7
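[Figure: example knowledge graph. Words ("Elvis Presley", "The King") are attached by means; individuals, including the unknown individual "?", carry born (1935) and type edges; classes (astronaut, person, entity) are linked by subclass. Legend: Words, Individuals, Classes, Relations.]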
8
Motivation Other Ontologies System overview YAGO Dive IN LEILA NAGA Conclusion
9
Assemble the ontology manually: WordNet, SUMO, GeneOntology, etc. Problem: usually low coverage.
10
Semantic lexicon for the English language. Developed at Princeton University since 1985. Groups English words into synsets, provides short, general definitions, and records various semantic relations. Contains about 150,000 words organized in over 115,000 synsets.
12
Concerned with meta-level concepts. First released in December 2000. Maintained by Articulate Software.
13
Part of a larger effort, the Open Biomedical Ontologies. Constructed in 1998 with three models: biological processes, cellular components, and molecular functions. As of 2005, GO contained over 19,000 terms.
14
Automated ontology extraction: KnowItAll (University of Washington), TextToOnto (University of Karlsruhe). These systems use pattern matching and machine learning techniques. Problem: usually low accuracy (50%-92%).
15
Motivation Other Ontologies System overview YAGO Dive IN LEILA NAGA Conclusion
16
[Figure: system architecture. Backend: the Web, LEILA (knowledge acquisition tools), the YAGO KB, and NAGA (query processing and ranking). Interface: a browser for query input and output with tunable parameters, used by the user.]
17
Based on a simple, decidable model. Extensible ontology. High coverage: YAGO knows over 1.7 M entities and 14 M facts. High quality: empirical evaluation shows 95% accuracy.
18
Assemble the ontology from Wikipedia. Good coverage: 7.83 M entities across all languages.
19
Good Accuracy
20
Uses deep linguistic analysis and machine learning techniques (SVM). Input: a binary target relation and a set of Web documents. Output: all pairs of entities that are in the target relation.
22
[Figure: an individual with born 1935 and type American_singer; the candidate classes People_by_occupation, Business, and Social_group are marked with "?".]
23
Each WordNet synset becomes a YAGO class. Extract only Wikipedia's leaf categories. Exclude individuals already known in WordNet, e.g. Albert Einstein will be excluded. In 15,000 cases WordNet and Wikipedia overlap; where they conflict in meaning, prefer WordNet: "Time exposure" is a common noun for WordNet, but an album title for Wikipedia.
24
[Figure: mock Wikipedia article on Elvis Presley with placeholder body text. Categories: 1935_births.] Exploit relational categories: the category 1935_births yields (Elvis, bornInYear, 1935). Relations obtained this way: bornInYear, diedInYear, establishedIn.
25
[Figure: mock Wikipedia article on Elvis Presley with placeholder body text. Categories: American_singers.] Exploit conceptual categories (type, subClassOf): the category American_singers yields (Elvis, type, American_singer).
26
[Figure: mock Wikipedia article on Elvis Presley with placeholder body text. Categories: Rock'n_Roll_Music.] Avoid thematic categories: Rock'n_Roll_Music should not become a type of Elvis.
27
Shallow linguistic noun phrase parsing: "American singers of German origin" = premodifier (American) + head (singers) + postmodifier (of German origin). Heuristic: if the head is a plural word, the category is conceptual.
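The head-plural heuristic can be sketched roughly as follows. This is only an approximation under stated assumptions: the preposition list, the suffix-based plural test (a crude stand-in for a real stemmer), and the function names `split_category`, `looks_plural`, and `is_conceptual` are all hypothetical. The sketch also assumes that relational categories such as 1935_births have already been handled separately.

```python
# Assumption: the postmodifier starts at the first preposition.
PREPOSITIONS = {"of", "in", "by", "from", "for", "at", "on", "with"}


def split_category(name):
    """Split a category name into (premodifier words, head, postmodifier words)."""
    words = name.replace("_", " ").split()
    # Index of the first preposition, or len(words) if there is none.
    cut = next((i for i, w in enumerate(words) if w.lower() in PREPOSITIONS),
               len(words))
    pre_and_head = words[:cut]
    if not pre_and_head:
        return [], "", words
    return pre_and_head[:-1], pre_and_head[-1], words[cut:]


def looks_plural(word):
    """Very rough plural test; a real system would use a stemmer instead."""
    w = word.lower()
    return len(w) > 2 and w.endswith("s") and not w.endswith(("ss", "us", "is"))


def is_conceptual(name):
    """Heuristic from the slide: a plural head marks a conceptual category."""
    _, head, _ = split_category(name)
    return bool(head) and looks_plural(head)


for category in ("American singers of German origin", "Rock'n_Roll_Music"):
    print(category, "->", split_category(category), is_conceptual(category))
```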
28
Pling stemmer
29
[Figure: resulting graph with the relations born (1935), type (American_singer), subclass (Singer, Person), and means (from the words "singer" and "Elvis Presley").]
30
Storing witnesses: for each individual, store the URL of the corresponding Wikipedia page. Storing confidence.
31
YAGO - A Core of Semantic Knowledge. [Figure: the same graph with word senses Singer#1 and Person#3, plus a witness (foundIn wiki/Elvis_Presley) and an extractor (extractedBy LEILA).]
32
Fact: (Elvis, is_a, singer). But only from 1953 to 1977. We know this from Wikipedia.
33
#1 (Elvis, is_a, singer)
#2 (#1, time, 1953-1977)
#3 (#1, source, Wikipedia)
[Figure: the type edge to singer is annotated with time 1953-1977 and source Wikipedia; further labels: LEILA, 0.93.]
34
A YAGO ontology over a set of relations R (e.g. type, subClassOf), a set of common entities C (e.g. entity, class, relation), and a set of fact identifiers I is a function y : I → (I ∪ C ∪ R) × R × (I ∪ C ∪ R). We can talk about: facts, e.g. (#1, source, Wikipedia); additional arguments, e.g. (#1, time, 1953-1977); relations, e.g. (time, hasRange, time_interval).
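As a small illustration of this model, the mapping from fact identifiers to triples can be held in a dictionary. This is a sketch under assumptions: the helper names `statements_about` and `is_well_formed` are hypothetical, and the data simply repeats the Elvis example from the slides.

```python
# y : I -> triples, represented as a dict from fact identifier to triple.
facts = {
    "#1": ("Elvis", "is_a", "singer"),
    "#2": ("#1", "time", "1953-1977"),     # a fact about fact #1
    "#3": ("#1", "source", "Wikipedia"),   # another fact about fact #1
    "#4": ("time", "hasRange", "time_interval"),  # a fact about a relation
}


def statements_about(subject, facts):
    """Collect relation -> object for every fact whose first argument is `subject`."""
    return {rel: obj for (subj, rel, obj) in facts.values() if subj == subject}


def is_well_formed(facts):
    """Every argument that looks like a fact identifier must actually be defined."""
    return all(arg in facts
               for (subj, _, obj) in facts.values()
               for arg in (subj, obj)
               if arg.startswith("#"))


print(statements_about("#1", facts))
print(is_well_formed(facts))
```

Because fact identifiers can themselves appear as arguments, time spans and sources are expressed with ordinary binary facts rather than a separate annotation mechanism.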
35
(subClassOf, type, acyclicTransitiveRelation). Axioms & rules: (x, is_a, y) ∧ (y, subclass, z) ⇒ (x, is_a, z) ... Example: (singer, subClassOf, person).
36
Types Relations
37
{(r1, subRelationOf, r2), (x, r1, y)} -> (x, r2, y)
{(r, type, acyclicTransitiveRelation), (x, r, y), (y, r, z)} -> (x, r, z)
{(r, domain, c), (x, r, y)} -> (x, type, c)
{(r, range, c), (x, r, y)} -> (y, type, c)
{(x, type, c1), (c1, subClassOf, c2)} -> (x, type, c2)
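A naive way to apply these rewrite rules is to iterate them until no new fact appears. The sketch below is an illustration, not the authors' implementation: the function name `derive` is hypothetical, facts are plain (subject, relation, object) tuples, the domain rule is read as matching any fact (x, r, y), and no attention is paid to efficiency.

```python
def derive(facts):
    """Return the closure of `facts` under the five rewrite rules (naive fixpoint)."""
    derived = set(facts)
    while True:
        new = set()
        for (a, rel, b) in derived:
            if rel == "subRelationOf":
                # (r1, subRelationOf, r2), (x, r1, y) -> (x, r2, y)
                new |= {(x, b, y) for (x, r, y) in derived if r == a}
            elif rel == "domain":
                # (r, domain, c), (x, r, y) -> (x, type, c)
                new |= {(x, "type", b) for (x, r, _) in derived if r == a}
            elif rel == "range":
                # (r, range, c), (x, r, y) -> (y, type, c)
                new |= {(y, "type", b) for (_, r, y) in derived if r == a}
            elif rel == "type" and b == "acyclicTransitiveRelation":
                # (x, r, y), (y, r, z) -> (x, r, z) for transitive relation r = a
                new |= {(x, a, z)
                        for (x, r1, y) in derived if r1 == a
                        for (y2, r2, z) in derived if r2 == a and y2 == y}
            if rel == "type":
                # (x, type, c1), (c1, subClassOf, c2) -> (x, type, c2)
                new |= {(a, "type", c2)
                        for (c1, r, c2) in derived
                        if r == "subClassOf" and c1 == b}
        if new <= derived:      # nothing new: fixpoint reached
            return derived
        derived |= new


base = {
    ("subClassOf", "type", "acyclicTransitiveRelation"),
    ("singer", "subClassOf", "person"),
    ("person", "subClassOf", "entity"),
    ("Elvis", "type", "singer"),
}
for fact in sorted(derive(base) - base):
    print(fact)
```

The loop terminates because every derived triple is built from terms that already occur in the input, so only finitely many triples can ever be added.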
38
Axioms: (x, is_a, y) ∧ (y, subclass, z) ⇒ (x, is_a, z) ... [Figure: starting from the fact set {f1, ..., f5}, eliminating derivable facts gives {f1, f2, f3} and deriving facts gives {f1, ..., f10}; both results are finite and unique.]
39
Consistency: a YAGO ontology y is consistent iff ∄ x, r : (r, type, acyclicTransitiveRelation) ∈ D(y) ∧ (x, r, x) ∈ D(y). Since D(y) is finite, the consistency of a YAGO ontology is decidable.
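Because D(y) is a finite set, the consistency condition amounts to a direct scan over it. A minimal sketch, assuming the set of derived facts is already available as (subject, relation, object) tuples; the function name `is_consistent` is hypothetical.

```python
def is_consistent(derived):
    """True iff no acyclic transitive relation r has a reflexive fact (x, r, x)."""
    acyclic = {r for (r, rel, c) in derived
               if rel == "type" and c == "acyclicTransitiveRelation"}
    return not any(x == y and r in acyclic for (x, r, y) in derived)


good = {
    ("subClassOf", "type", "acyclicTransitiveRelation"),
    ("singer", "subClassOf", "person"),
}
# A cycle singer -> person -> singer shows up in the closure as (singer, subClassOf, singer).
bad = good | {("person", "subClassOf", "singer"),
              ("singer", "subClassOf", "singer")}

print(is_consistent(good), is_consistent(bad))
```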
40
Is Lake Victoria "locatedIn" Tanzania? When should an entity be an individual, and when a class? E.g. physics is an individual of the class science.
41
Size comparison (number of facts):
KnowItAll: 30,000
SUMO: 60,000
WordNet: 200,000
OpenCyc: 300,000
Cyc: 2,000,000
YAGO: 14,000,000
42
http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/
Which astronaut was born in the same year as Elvis?
"Elvis Presley" bornInYear $year
$astro bornInYear $year
$astro isa astronaut
20 results
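To show how such a query with variables can be answered, here is a small backtracking matcher over a list of triples. It is a sketch under assumptions, not the query engine behind the demo: the function name `match` is hypothetical, variables are strings starting with "$", and the toy facts are illustrative.

```python
def match(patterns, facts, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        candidate = dict(binding)   # copy, so failed attempts leave no trace
        ok = True
        for term, value in zip(first, fact):
            if term.startswith("$"):
                # Bind the variable, or check it against its existing binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, facts, candidate)


# Toy knowledge base (illustrative subset).
kb = [
    ("Elvis_Presley", "bornInYear", "1935"),
    ("Roger_Chaffee", "bornInYear", "1935"),
    ("Roger_Chaffee", "isa", "astronaut"),
    ("Neil_Armstrong", "bornInYear", "1930"),
    ("Neil_Armstrong", "isa", "astronaut"),
]
query = [
    ("Elvis_Presley", "bornInYear", "$year"),
    ("$astro", "bornInYear", "$year"),
    ("$astro", "isa", "astronaut"),
]
for answer in match(query, kb):
    print(answer)
```

Each pattern is tried against every fact, which is fine for a toy list but would need indexes on subject and relation for a knowledge base of millions of facts.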
43
Roger Bruce Chaffee (born February 15, 1935) was a U.S. Navy pilot who became an American astronaut in the Apollo program. He died during training in the Apollo 1 fire.
44
Motivation Other Ontologies System overview YAGO Dive IN LEILA overview NAGA overview Conclusion
46
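Evidence query: search for evidence for a certain hypothesis. [Figure: query graph over Max Planck, Physicist, and Kiel with edges isA and bornIn.] Discovery query: discover pieces of missing information. [Figure: query graph over Max Planck, Physicist, $X, and $Y with edges isA and bornInYear.]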
47
Regular expression query: the user might be interested in a certain path of relations between pieces of information. [Figure: two example queries. First: Liu (givenNameOf|familyNameOf) $X, $X isA scientist. Second: $X isA river, $X locatedIn* Africa.]
48
Relatedness query: find broad relations between pieces of information, e.g. Bohr connect Einstein. Possible answers: both are physicists and both are scientists; there are Moon craters and asteroid belts named after them; Tom Cruise connects them by being a vegetarian.
49
The answer to a query Q is a subgraph A of the knowledge graph that matches Q. [Figure: Q has the nodes Physicist, Max Planck, $X, and $Y with edges type and bornInYear. In A, $Y is bound to 1858 and $X to Mihajlo Pupin; the edges carry the confidences 0.98, 0.95, 0.96, and 0.97.]
50
Ranking combines three measures: extraction confidence; the informativeness of a fact (e.g. the fact Albert_Einstein isA physicist is more informative than Albert_Einstein isA person); and the compactness of the answer graph (e.g. for "How are Einstein and Bohr related?", "both won a Nobel Prize" ranks above "connected by Tom Cruise").
51
55 queries from TREC 2005/2006; 12 queries from the work on SphereSearch; 18 regular expression queries. The queries were posed to Google, Yahoo! Answers, and NAGA at the same time.
52
Semantic Web vision. System overview. YAGO is based on a logically clean model, with an accuracy of around 95%. YAGO is 7 times larger than its largest competitor. To investigate: the relationship between OWL 1.1 and the YAGO model.
53
"YAGO - A Core of Semantic Knowledge"; "NAGA: Harvesting, Searching and Ranking Knowledge"; "LEILA: Learning to Extract Information by Linguistic Analysis" (Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum …). Available at http://www.mpii.mpg.de/~suchanek
54
Questions ?