Published by Ismael Courtenay. Modified over 10 years ago.
1
YAGO-NAGA Project Presented By: Mohammad Dwaikat To: Dr. Yuliya Lierler CSCI 8986 – Fall 2012
2
Agenda What is YAGO-NAGA? Why YAGO-NAGA? How Does YAGO-NAGA Work? Demonstration YAGO-NAGA Sub-Projects
4
What is YAGO-NAGA? Harvesting, searching, and ranking knowledge from the Web. The goal is to build a conveniently searchable, large-scale, highly accurate knowledge base of common facts in a machine-processable representation. The project has harvested knowledge about millions of entities and facts about their relationships from Wikipedia and WordNet, with careful integration of these two sources.
5
What is YAGO-NAGA? Its vision is a confluence of Semantic Web (Ontologies), Social Web (Web 2.0), and Statistical Web (Information Extraction) assets towards a comprehensive repository of human knowledge.
6
YAGO Yet Another Great Ontology (YAGO) is a huge semantic knowledge base derived from Wikipedia, WordNet, and GeoNames. It knows almost 10 million entities (e.g. persons, organizations, cities) and 120 million facts about these entities. It has a manually confirmed accuracy of 95%. YAGO is an ontology that is anchored in time and space. – It attaches a temporal dimension and a spatial dimension to many of its facts and entities.
7
YAGO It contains all the entities and ontological facts extracted from Wikipedia (from 2010-08-17), with categories mapped to the WordNet class hierarchy. It also contains multi-lingual data from the Universal WordNet (UWN).
8
YAGO It contains all the entities and facts from GeoNames (from a dump of August 2010). It also contains textual and structural data from Wikipedia: – All links and anchor texts between the YAGO entities. – All Wikipedia category names. – The titles of references.
9
YAGO It is particularly suited for disambiguation purposes, as it contains a large number of names for entities. It also knows the gender of people. YAGO is the resulting knowledge base; its facts are represented as RDF (Resource Description Framework) triples. Methods and prototype systems have been developed for querying, ranking, and exploring this knowledge.
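As a rough illustration of the triple representation mentioned on this slide, the sketch below stores a few facts as (subject, predicate, object) tuples. The entity and relation names (Max_Planck, bornIn, hasWonPrize, and so on) are hand-written toy values chosen for this example; they are assumptions, not actual YAGO identifiers, and the output format only resembles N-Triples.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One fact: subject --predicate--> object."""
    subject: str
    predicate: str
    obj: str


# Toy facts written by hand for illustration (not taken from YAGO).
FACTS: List[Triple] = [
    Triple("Max_Planck", "type", "physicist"),
    Triple("Max_Planck", "bornIn", "Kiel"),
    Triple("Max_Planck", "hasWonPrize", "Nobel_Prize_in_Physics"),
    Triple("Kiel", "locatedIn", "Germany"),
]


def objects_of(subject: str, predicate: str, facts: List[Triple] = FACTS) -> List[str]:
    """Return every object o such that (subject, predicate, o) is a known fact."""
    return [t.obj for t in facts if t.subject == subject and t.predicate == predicate]


def to_line(t: Triple) -> str:
    """Render a triple in a simplified, N-Triples-like notation."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


if __name__ == "__main__":
    for fact in FACTS:
        print(to_line(fact))
    print(objects_of("Max_Planck", "bornIn"))
```

Keeping every fact in the same three-part shape is what lets a single query mechanism work across all relations.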
10
NAGA Not Another Google Answer (NAGA) is a semantic search engine that provides ranked answers to queries based on statistical models. It can operate on knowledge bases that are organized as graphs with labeled nodes and edges, so-called relationship graphs. As of now, NAGA uses a projection of YAGO as its knowledge base. The underlying query language supports keyword search for the casual user as well as graph-based queries with regular expressions for the expert user.
11
Agenda What is YAGO-NAGA? Why YAGO-NAGA? How Does YAGO-NAGA Work? Demonstration YAGO-NAGA Sub-Projects
12
Consider These Questions Which German Nobel laureate survived both world wars and outlived all four of his children? – The answer is Max Planck. Which politicians are also accomplished scientists? – The German chancellor Angela Merkel and Benjamin Franklin. How are Max Planck, Angela Merkel, Jim Gray, and the Dalai Lama related? – All four have doctoral degrees from German universities.
13
Why YAGO-NAGA? Three major research directions: – Semantic-Web-style knowledge repositories, such as SUMO, OpenCyc, and WordNet. – Large-scale information extraction. – Social tagging and Web 2.0 communities that constitute the Social Web; Wikipedia is another example of the Social Web paradigm. The challenge is how to extract the important facts from the Web and organize them into an explicit knowledge base that captures entities and the semantic relationships among them.
14
Agenda What is YAGO-NAGA? Why YAGO-NAGA? How Does YAGO-NAGA Work? Demonstration YAGO-NAGA Sub-Projects
15
How Does YAGO-NAGA Work? Querying YAGO adopts concepts from the standardized SPARQL Protocol and RDF Query Language (SPARQL) for RDF data, but extends them with more expressive pattern matching and ranking. The prototype system that implements these features is NAGA.
16
Query for the YAGO Knowledge Base
17
Which big US city has two airports, one named after a World War II hero and one named after a World War II battlefield?
18
Structured Knowledge Queries
Select Distinct ?c Where {
  ?c type City.
  ?c locatedIn USA.
  ?a1 type Airport.
  ?a2 type Airport.
  ?a1 locatedIn ?c.
  ?a2 locatedIn ?c.
  ?a1 namedAfter ?p.
  ?p type WarHero.
  ?a2 namedAfter ?b.
  ?b type BattleField.
}
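To make the structured query on this slide concrete, here is a minimal sketch of how such a conjunction of triple patterns could be matched against a small set of facts by backtracking. It is not how NAGA or any SPARQL engine is actually implemented; the function names (`is_var`, `match`) and the toy facts, including the identifiers OHare_Airport and Midway_Airport, are assumptions made for this example.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Fact = Tuple[str, str, str]

# Hand-written toy facts (an assumption for this sketch, not YAGO data).
FACTS: List[Fact] = [
    ("Chicago", "type", "City"),
    ("Chicago", "locatedIn", "USA"),
    ("OHare_Airport", "type", "Airport"),
    ("Midway_Airport", "type", "Airport"),
    ("OHare_Airport", "locatedIn", "Chicago"),
    ("Midway_Airport", "locatedIn", "Chicago"),
    ("OHare_Airport", "namedAfter", "Edward_OHare"),
    ("Edward_OHare", "type", "WarHero"),
    ("Midway_Airport", "namedAfter", "Battle_of_Midway"),
    ("Battle_of_Midway", "type", "BattleField"),
    ("Boston", "type", "City"),
    ("Boston", "locatedIn", "USA"),
    ("Logan_Airport", "type", "Airport"),
    ("Logan_Airport", "locatedIn", "Boston"),
]

# The patterns from the slide; terms starting with "?" are variables.
QUERY: List[Fact] = [
    ("?c", "type", "City"),
    ("?c", "locatedIn", "USA"),
    ("?a1", "type", "Airport"),
    ("?a2", "type", "Airport"),
    ("?a1", "locatedIn", "?c"),
    ("?a2", "locatedIn", "?c"),
    ("?a1", "namedAfter", "?p"),
    ("?p", "type", "WarHero"),
    ("?a2", "namedAfter", "?b"),
    ("?b", "type", "BattleField"),
]


def is_var(term: str) -> bool:
    return term.startswith("?")


def match(patterns: List[Fact], facts: List[Fact],
          binding: Optional[Dict[str, str]] = None) -> Iterator[Dict[str, str]]:
    """Yield every variable binding that satisfies all patterns."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        candidate = dict(binding)
        consistent = True
        for p_term, f_term in zip(first, fact):
            if is_var(p_term):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(p_term, f_term) != f_term:
                    consistent = False
                    break
            elif p_term != f_term:
                consistent = False
                break
        if consistent:
            yield from match(rest, facts, candidate)


if __name__ == "__main__":
    cities = {b["?c"] for b in match(QUERY, FACTS)}  # set() gives "Distinct"
    print(cities)
```

Boston drops out because its only airport in the toy data has no namedAfter fact, so the seventh pattern cannot be satisfied.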
19
Growing the Knowledge Base [Figure: architecture diagram. Components shown: WordNet, Wikipedia, YAGO Core Extractors, YAGO Core Checker, YAGO Core, Web sources, YAGO Gatherer, hypotheses, YAGO Scrutinizer, YAGO. Annotations: "growing", "knows all entities", "focus on facts".]
20
Information Extraction from Wikipedia
21
YAGO Knowledge Base Combines knowledge from WordNet and Wikipedia. Additional gazetteers (geonames.org).
22
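Searching & Ranking RDF Graphs in NAGA Three kinds of queries, shown as query graphs: – Discovery queries (e.g. $x type scientist, $x bornIn Kiel; or a query over the edges hasWon Nobel prize, diedOn, and hasSon with a ">" comparison between the dates $a and $b). – Queries with regular expressions (edge labels such as hasFirstName | hasLastName, locatedIn*, and (coAuthor | advisor)*, with nodes including Ling, Zhejiang, and Beng Chin Ooi). – Connectedness queries (e.g. how Thomas Mann and Goethe are connected, using "*" and the type German novelist). Ranking is based on confidence, compactness, and relevance.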
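The regular-expression edge labels on this slide, such as locatedIn* or (coAuthor | advisor)*, can be read as "follow zero or more edges whose label is in a given set". The sketch below implements only that reading with a breadth-first search; it does not cover NAGA's ranking or its full query language. The function name `reachable`, the toy facts, and the placeholder people Person_A to Person_C are all assumptions for this example.

```python
from collections import deque
from typing import List, Set, Tuple

Fact = Tuple[str, str, str]

# Hand-written toy facts (assumptions for this sketch, not YAGO data).
FACTS: List[Fact] = [
    ("Zhejiang_University", "locatedIn", "Hangzhou"),
    ("Hangzhou", "locatedIn", "Zhejiang"),
    ("Zhejiang", "locatedIn", "China"),
    ("Person_A", "coAuthor", "Person_B"),
    ("Person_B", "advisor", "Person_C"),
    ("Person_C", "bornIn", "Hangzhou"),
]


def reachable(start: str, labels: Set[str], facts: List[Fact]) -> Set[str]:
    """Nodes reachable from start via zero or more edges labeled in `labels`.

    This corresponds to a path expression of the form (l1 | l2 | ...)*.
    """
    seen = {start}            # zero edges: the start node itself matches
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for subj, pred, obj in facts:
            if subj == node and pred in labels and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return seen


if __name__ == "__main__":
    # locatedIn*: is the university (transitively) located in Zhejiang?
    print("Zhejiang" in reachable("Zhejiang_University", {"locatedIn"}, FACTS))
    # (coAuthor | advisor)*: who is connected to Person_A by such edges?
    print(reachable("Person_A", {"coAuthor", "advisor"}, FACTS))
```

Edges with other labels, such as bornIn here, are ignored by the traversal, which is why Person_C does not lead on to Hangzhou in the second call.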
23
Agenda What is YAGO-NAGA? Why YAGO-NAGA? How Does YAGO-NAGA Work? Demonstration YAGO-NAGA Sub-Projects
24
YAGO Server: UI & API
25
YAGO-UI: – Interactive online demo – RDF with time, space, and provenance annotations – SPARQL + keywords. YAGO-API: two basic web services: – processQuery(String query) – getYagoEntitiesByNames(String[] names). Demo: www.mpi-inf.mpg.de/yago-naga/demo.html
26
YAGO Browse through the YAGO knowledge base. – https://d5gate.ag5.mpi-sb.mpg.de/webyagospotlx/Browser Ask queries on YAGO using SPOTLX patterns. View the results on a map and timeline. – https://d5gate.ag5.mpi-sb.mpg.de/webyagospotlx/WebInterface
27
Agenda What is YAGO-NAGA? Why YAGO-NAGA? How Does YAGO-NAGA Work? Demonstration YAGO-NAGA Sub-Projects
28
YAGO-NAGA has more than 13 sub-projects. AIDA is a method, implemented in an online tool, for disambiguating mentions of named entities that occur in natural-language text or Web tables. – https://d5gate.ag5.mpi-sb.mpg.de/webaida/
29
Names, Surface Patterns & Paraphrases Which chemist was born in London? (I) Named entity disambiguation – chemist → wordnet_chemist, wordnet_pharmacist – born → Bertran_de_Born, Born_Identity_(Movie), Born_(Album) – London → London_UK, London_Arkansas, Antonio_London (II) Mapping surface patterns onto semantic relations – was_born_in → bornIn(, ) – was_born_in → bornOn(, ) (III) Paraphrases of questions – [was] born in – -born – NN VBD VBN IN NNP/LOC → bornIn(, )
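The ambiguity described on this slide can be sketched as a simple enumeration: each mention has several candidate entities, each surface pattern has several candidate relations, and every combination is a possible reading of the question. The candidate lists below are copied from the slide; the function name `interpretations` and the idea of enumerating the full cross product are assumptions for illustration, not a description of how AIDA or NAGA resolve the ambiguity.

```python
from itertools import product
from typing import Dict, List, Tuple

# Candidate entities per mention, as listed on the slide.
ENTITY_CANDIDATES: Dict[str, List[str]] = {
    "chemist": ["wordnet_chemist", "wordnet_pharmacist"],
    "born": ["Bertran_de_Born", "Born_Identity_(Movie)", "Born_(Album)"],
    "London": ["London_UK", "London_Arkansas", "Antonio_London"],
}

# Candidate relations per surface pattern, as listed on the slide.
RELATION_CANDIDATES: Dict[str, List[str]] = {
    "was_born_in": ["bornIn", "bornOn"],
}


def interpretations(subject: str, pattern: str, obj: str) -> List[Tuple[str, str, str]]:
    """Enumerate every (entity, relation, entity) reading of a question.

    Unknown mentions or patterns yield no readings at all.
    """
    subjects = ENTITY_CANDIDATES.get(subject, [])
    relations = RELATION_CANDIDATES.get(pattern, [])
    objects = ENTITY_CANDIDATES.get(obj, [])
    return list(product(subjects, relations, objects))


if __name__ == "__main__":
    readings = interpretations("chemist", "was_born_in", "London")
    print(len(readings))
    for reading in readings:
        print(reading)
```

Even this one short question yields 2 × 2 × 3 readings, which is why a disambiguation step is needed before a question can be answered against the knowledge base.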
30
References YAGO-NAGA Project: – http://www.mpi-inf.mpg.de/yago-naga/ YAGO: – http://yago-knowledge.org NAGA: – http://www.mpi-inf.mpg.de/yago-naga/naga/demo.html