YAGO-NAGA Project Presented By: Mohammad Dwaikat To: Dr. Yuliya Lierler CSCI 8986 – Fall 2012.

1 YAGO-NAGA Project Presented By: Mohammad Dwaikat To: Dr. Yuliya Lierler CSCI 8986 – Fall 2012

2 Agenda What is YAGO-NAGA? Why YAGO-NAGA? How YAGO-NAGA Works? Demonstration YAGO-NAGA Sub-Projects

4 What is YAGO-NAGA? Harvesting, Searching, and Ranking Knowledge from the Web. Building a conveniently searchable, large-scale, highly accurate knowledge base of common facts in a machine-processable representation. Harvested knowledge about millions of entities and facts about their relationships, from Wikipedia and WordNet with careful integration of these two sources.

5 What is YAGO-NAGA? Its vision is a confluence of Semantic Web (Ontologies), Social Web (Web 2.0), and Statistical Web (Information Extraction) assets towards a comprehensive repository of human knowledge.

6 YAGO Yet Another Great Ontology (YAGO) Knowledge base. It is a huge semantic knowledge base, derived from Wikipedia, WordNet, and GeoNames. It knows almost 10 million entities (e.g. persons, organizations, cities) and 120 million facts about these entities. It has a manually confirmed accuracy of 95%. YAGO is an ontology that is anchored in time and space. – It attaches a temporal dimension and a spatial dimension to many of its facts and entities.

7 YAGO It contains all the entities and ontological facts extracted from Wikipedia (from 2010-08-17), with categories mapped to the WordNet class hierarchy. It also contains multi-lingual data from the Universal WordNet (UWN).

8 YAGO It contains all the entities and facts from GeoNames - (from a dump of August 2010). It also contains textual and structural data from Wikipedia. All links+anchor texts between the YAGO entities. All Wikipedia category names. The titles of references.

9 YAGO It is particularly suited for disambiguation purposes, as it contains a large number of names for entities. It also knows the gender of people. YAGO is the resulting knowledge base; its facts are represented as RDF (Resource Description Framework) triples. Methods and prototype systems have been developed for querying, ranking, and exploring knowledge.
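To make the triple representation on this slide concrete, here is a minimal Python sketch of facts stored as subject-predicate-object triples with a wildcard lookup. The identifiers (Max_Planck, bornIn, and so on) are hypothetical toy data, not actual YAGO identifiers, and the `match` helper is an assumption for illustration, not part of YAGO.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Toy facts with hypothetical identifiers (not copied from YAGO).
FACTS = [
    Triple("Max_Planck", "type", "physicist"),
    Triple("Max_Planck", "bornIn", "Kiel"),
    Triple("Max_Planck", "hasWon", "Nobel_Prize"),
    Triple("Kiel", "locatedIn", "Germany"),
]


def match(facts, s=None, p=None, o=None):
    """Return all triples that agree with the given fields; None is a wildcard."""
    return [
        t for t in facts
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    ]


if __name__ == "__main__":
    # Every fact about one entity, then every fact using one relation.
    print(match(FACTS, s="Max_Planck"))
    print(match(FACTS, p="locatedIn"))
```

Keeping each fact as a flat triple is what lets the same store answer entity-centred and relation-centred lookups with one function.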

10 NAGA Not Another Google Answer (NAGA) is a new semantic search engine which provides ranked answers to queries based on statistical models. It can operate on knowledge bases that are organized as graphs with labeled nodes and edges, so-called relationship graphs. As of now, NAGA uses a projection of YAGO as its knowledge base. The underlying query language supports keyword search for the casual user as well as graph-based queries with regular expressions for the expert user.

11 Agenda What is YAGO-NAGA? Why YAGO-NAGA? How YAGO-NAGA Works? Demonstration YAGO-NAGA Sub-Projects

12 Consider These Questions Which German Nobel laureate survived both world wars and outlived all four of his children? – The answer is Max Planck. Which politicians are also accomplished scientists? – The German chancellor Angela Merkel and Benjamin Franklin. How are Max Planck, Angela Merkel, Jim Gray, and the Dalai Lama related? – All four have doctoral degrees from German universities.

13 Why YAGO-NAGA? Three major research directions: – Semantic-Web-style knowledge repositories, such as SUMO, OpenCyc, and WordNet. – Large-scale information extraction. – Social tagging and Web 2.0 communities that constitute the Social Web; Wikipedia is another example of the Social Web paradigm. The challenge is how to extract the important facts from the Web and organize them into an explicit knowledge base that captures entities and semantic relationships among them.

14 Agenda What is YAGO-NAGA? Why YAGO-NAGA? How YAGO-NAGA Works? Demonstration YAGO-NAGA Sub-Projects

15 How YAGO-NAGA Works? YAGO adopts concepts from the standardized SPARQL Protocol and RDF Query Language for RDF data but extends them through more expressive pattern matching and ranking. The prototype system that implements these features is NAGA.

16 Query for the YAGO Knowledge Base

17 A big US city with two airports, one named after a World War II hero, and one named after a World War II battle field?

18 Structured Knowledge Queries
Select Distinct ?c Where {
  ?c type City.
  ?c locatedIn USA.
  ?a1 type Airport.
  ?a2 type Airport.
  ?a1 locatedIn ?c.
  ?a2 locatedIn ?c.
  ?a1 namedAfter ?p.
  ?p type WarHero.
  ?a2 namedAfter ?b.
  ?b type BattleField.
}
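One way to see how such a conjunctive query is answered is a small backtracking matcher over triples. The Python sketch below is an illustration only: it is not NAGA's engine, the `solve` function is a hypothetical helper, and the fact base is hand-written toy data with made-up identifiers.

```python
def is_var(term):
    """Query variables are written with a leading '?', as in the slide."""
    return term.startswith("?")


def solve(patterns, facts, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        new = dict(binding)
        ok = True
        for term, value in zip(first, fact):
            if is_var(term):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from solve(rest, facts, new)


# Hand-written toy facts (hypothetical identifiers).
FACTS = [
    ("Chicago", "type", "City"), ("Chicago", "locatedIn", "USA"),
    ("Boston", "type", "City"), ("Boston", "locatedIn", "USA"),
    ("OHare", "type", "Airport"), ("OHare", "locatedIn", "Chicago"),
    ("Midway", "type", "Airport"), ("Midway", "locatedIn", "Chicago"),
    ("Logan", "type", "Airport"), ("Logan", "locatedIn", "Boston"),
    ("OHare", "namedAfter", "Butch_OHare"), ("Butch_OHare", "type", "WarHero"),
    ("Midway", "namedAfter", "Battle_of_Midway"),
    ("Battle_of_Midway", "type", "BattleField"),
]

# The query from the slide, one tuple per triple pattern.
QUERY = [
    ("?c", "type", "City"), ("?c", "locatedIn", "USA"),
    ("?a1", "type", "Airport"), ("?a2", "type", "Airport"),
    ("?a1", "locatedIn", "?c"), ("?a2", "locatedIn", "?c"),
    ("?a1", "namedAfter", "?p"), ("?p", "type", "WarHero"),
    ("?a2", "namedAfter", "?b"), ("?b", "type", "BattleField"),
]

if __name__ == "__main__":
    # "Select Distinct ?c" corresponds to collecting ?c into a set.
    print({b["?c"] for b in solve(QUERY, FACTS)})
```

The scan over all facts for every pattern is deliberately naive; a real system would index the triples, but the join logic is the same.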

19 Growing the Knowledge Base [Diagram: WordNet + Wikipedia → YAGO Core Extractors → YAGO Core Checker → YAGO Core (knows all entities). Web sources → YAGO Gatherer → Hypotheses → YAGO Scrutinizer → YAGO (growing; focus on facts).]

20 Information Extraction from Wikipedia

21 Combine knowledge from WordNet & Wikipedia. Additional Gazetteers (geonames.org). YAGO Knowledge Base

22 Searching & Ranking RDF Graphs in NAGA
[Figure: example query graphs, reconstructed from the slide's node and edge labels]
Discovery queries:
– $x type scientist; $x bornIn Kiel
– $x hasWon Nobel prize; $x diedOn $a; $x hasSon $y; $y diedOn $b; $a > $b
Connectedness queries:
– Thomas Mann * Goethe, with German novelist attached by a type edge
Queries with regular expressions:
– $x (hasFirstName | hasLastName) Ling; $x type scientist; $x worksFor $y; $y locatedIn * Zhejiang; $x (coAuthor | advisor) * Beng Chin Ooi
Ranking based on confidence, compactness and relevance.
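A starred edge label such as locatedIn * asks for a path of zero or more edges with that label. A minimal Python sketch of that idea, using breadth-first search over a toy edge list, is shown here. The `reachable` helper and the edge data are assumptions for illustration and do not reflect how NAGA evaluates or ranks such queries.

```python
from collections import deque


def reachable(edges, start, labels):
    """Nodes reachable from start via zero or more edges whose label is in labels."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in edges:
            if s == node and p in labels and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen


# Toy edges with hypothetical identifiers.
EDGES = [
    ("Zhejiang_University", "locatedIn", "Hangzhou"),
    ("Hangzhou", "locatedIn", "Zhejiang"),
    ("Zhejiang", "locatedIn", "China"),
    ("Zhejiang_University", "type", "university"),
]

if __name__ == "__main__":
    # Does "Zhejiang_University locatedIn* Zhejiang" hold in the toy graph?
    print("Zhejiang" in reachable(EDGES, "Zhejiang_University", {"locatedIn"}))
```

Passing a set of labels also covers alternations such as (coAuthor | advisor) *, since any label in the set may be followed at each step.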

23 Agenda What is YAGO-NAGA? Why YAGO-NAGA? How YAGO-NAGA Works? Demonstration YAGO-NAGA Sub-Projects

24 YAGO Server: UI & API

25 YAGO-UI – Interactive online demo – RDF with time, space & provenance annotations – SPARQL + keywords YAGO-API Two basic WebServices: – processQuery (String query) – getYagoEntitiesByNames (String[] names) www.mpi-inf.mpg.de/yago-naga/demo.html

26 YAGO Browse through the YAGO knowledge base. – https://d5gate.ag5.mpi-sb.mpg.de/webyagospotlx/Browser Ask queries on YAGO using SPOTLX patterns. View the results on a map and timeline. – https://d5gate.ag5.mpi-sb.mpg.de/webyagospotlx/WebInterface

27 Agenda What is YAGO-NAGA? Why YAGO-NAGA? How YAGO-NAGA Works? Demonstration YAGO-NAGA Sub-Projects

28 More than 13 sub-projects of YAGO-NAGA. AIDA is a method, implemented in an online tool, for disambiguating mentions of named entities that occur in natural-language text or Web tables. – https://d5gate.ag5.mpi-sb.mpg.de/webaida/

29 Names, Surface Patterns & Paraphrases
Which chemist was born in London?
(I) Named entity disambiguation
– chemist → wordnet_chemist, wordnet_pharmacist
– born → Bertran_de_Born, Born_Identity_(Movie), Born_(Album)
– London → London_UK, London_Arkansas, Antonio_London
(II) Mapping surface patterns onto semantic relations
– was_born_in → bornIn(, )
– was_born_in → bornOn(, )
(III) Paraphrases of questions
– [was] born in -born NN VBD VBN IN NNP/LOC → bornIn(, )
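The candidate lists on this slide imply a combinatorial space of readings for a single question. The Python sketch below enumerates that space for "Which chemist was born in London?" using the candidates shown on the slide. The dictionaries and the `interpretations` function are hypothetical illustrations; the sketch only enumerates readings and does not attempt the ranking or disambiguation step itself.

```python
from itertools import product

# Candidate meanings per word, taken from the slide.
CANDIDATES = {
    "chemist": ["wordnet_chemist", "wordnet_pharmacist"],
    "London": ["London_UK", "London_Arkansas", "Antonio_London"],
}

# Candidate relations per surface pattern, taken from the slide.
PATTERNS = {"was_born_in": ["bornIn", "bornOn"]}


def interpretations(class_word, pattern, entity_word):
    """List every (relation, class, entity) reading of the question."""
    return list(product(
        PATTERNS[pattern],
        CANDIDATES[class_word],
        CANDIDATES[entity_word],
    ))


if __name__ == "__main__":
    for reading in interpretations("chemist", "was_born_in", "London"):
        print(reading)
```

Even this tiny question has 2 x 2 x 3 = 12 readings, which is why a disambiguation step is needed before the query can be run against the knowledge base.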

30 References YAGO-NAGA Project: – http://www.mpi-inf.mpg.de/yago-naga/ YAGO: – http://yago-knowledge.org NAGA: – http://www.mpi-inf.mpg.de/yago-naga/naga/demo.html

