Web3.0 and Language Resources Marta Sabou Knowledge Media Institute (KMi) The Open University Exploiting Semantic Web Ontologies: An Experimental Report.


Outline
The Semantic Web
–Online ontologies
–Gateways to the Semantic Web
Exploiting the Semantic Web
–Relation discovery
–Open Domain Question Answering
–Folksonomy Enrichment
Outlook for Language Technology

Scientific American, May 2001:

The Semantic Web
Tim Berners-Lee: “an extension of the current web (1) in which information is given well-defined meaning (2), better enabling computers and people to work in cooperation (3).”
1. The SW will gradually evolve out of the existing Web; it does not compete with the current WWW.
2. Web content is represented in a form that is more easily machine-processable.
3. It is an open platform that allows information to be shared and processed.

Ontology Metadata UoD
Elementaries - The Watson Blog "Oh dear! Where the Semantic Web is going to go now?" -- imaginary user 23 en Watson team Thu, 01 Mar :49:52 GMT Pebble ( …
Zen wisteria Mathieu d'Aquin …
<rdfs:comment rdf:datatype=" >The Knoledge Media Institute of the Open University, Milton Keynes UK …
DOAP FOAF DC RSS TAP WORDNET NCI Galen Music …

SW = A Conceptual Layer over the web

SW is Heterogeneous!

Interlinked, Semantic Data on the Web

Semantic Web Gateways
Search engines for the semantic data: collect, index and provide access to online semantic data.
10K ontologies; 50 million semantic documents; 250K ontologies and metadata

Semantic Web Status
Online semantic data now constitutes the largest and most heterogeneous knowledge resource known in AI/KR. Semantic Web Gateways offer a way to access this data easily. So, the question is… How to use it? How to make the most of it?

Next Generation Semantic Web Applications Dynamically retrieving, exploiting and combining relevant semantic resources from the SW, at large Gateway to the Semantic Web

IEEE Intelligent Systems 23(3), pp , May/June 2008 Key aspects of the paradigm Tech. Infrastructure Concrete Applications


Relation Discovery
SCARLET - relation discovery on the SW
–Automatically selects and combines multiple online ontologies to derive a relation
[Diagram: Concept_A (e.g., Supermarket) and Concept_B (e.g., Building) are passed to Scarlet, which accesses the Semantic Web and deduces a Semantic Relation ( )]
M. Sabou, M. d’Aquin, E. Motta, “Using the Semantic Web as Background Knowledge in Ontology Mapping", Ontology Mapping Workshop, ISWC’06.

Two strategies
Deriving relations from (A) one ontology and (B) across ontologies.
(A) Strategy 1: Supermarket, Building → Scarlet → Semantic Web [diagram labels: Supermarket, Shop, PublicBuilding, Building]
(B) Strategy 2: Cholesterol, OrganicChemical → Scarlet → Semantic Web [diagram labels: Cholesterol, Steroid, Lipid, OrganicChemical; Steroid is shown twice]
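The two strategies can be illustrated with a minimal Python sketch. It is not SCARLET's implementation: the toy ontologies, the direction of the subclass links, the split of the chemistry terms across two ontologies, and every function name are assumptions made for illustration. Only subsumption is handled.

```python
from collections import deque

# Toy ontologies: each maps a class to its direct superclasses.
# Class names come from the slide; the subclass direction and the
# way terms are split across ontologies are assumptions.
ONTOLOGIES = {
    "buildings": {
        "Supermarket": ["Shop"],
        "Shop": ["PublicBuilding"],
        "PublicBuilding": ["Building"],
    },
    "chem_a": {"Cholesterol": ["Steroid"]},
    "chem_b": {"Steroid": ["Lipid"], "Lipid": ["OrganicChemical"]},
}


def ancestors(ontology, term):
    """Return all transitive superclasses of term within one ontology."""
    seen, queue = set(), deque([term])
    while queue:
        for parent in ontology.get(queue.popleft(), []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def strategy_1(a, b):
    """Look for a single ontology in which a is subsumed by b."""
    for name, onto in ONTOLOGIES.items():
        if b in ancestors(onto, a):
            return (a, "subClassOf", b, [name])
    return None


def strategy_2(a, b):
    """Bridge two different ontologies through an intermediate class c."""
    for name_1, onto_1 in ONTOLOGIES.items():
        for c in ancestors(onto_1, a):
            for name_2, onto_2 in ONTOLOGIES.items():
                if name_2 != name_1 and b in ancestors(onto_2, c):
                    return (a, "subClassOf", b, [name_1, name_2])
    return None


def discover(a, b):
    """Try the single-ontology strategy first, then the cross-ontology one."""
    return strategy_1(a, b) or strategy_2(a, b)


print(discover("Supermarket", "Building"))
print(discover("Cholesterol", "OrganicChemical"))
```

Each result carries the list of ontologies used, which mirrors the idea that every derived relation comes with a formal explanation.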

Experiment
Matching two large-scale agricultural thesauri:
–AGROVOC: UN’s Food and Agriculture Organisation (FAO) thesaurus; descriptor terms, non-descriptor terms
–NALT: US National Agricultural Library Thesaurus; descriptor terms, non-descriptor terms
M. Sabou, M. d’Aquin, E. Motta, “Exploring the Semantic Web as Background Knowledge in Ontology Matching", Journal of Data Semantics, 2008.

Results - S1

226 Used Ontologies - S htechsight/Technologies.daml

Results - S2

306 Used Ontologies - S

Evaluation Manual assessment of 1000 mappings (15%) Performed for both strategies Evaluators: –Researchers in the area of the Semantic Web –10 people split in two groups

Evaluation - Precision S1 S2

Indicative Comparison with Other Techniques
–Traditional Matching (only eq.): 54% - 83%
–Using a single, pre-selected domain ontology: 76%
–Using the entire Web (via Google): 38% - 50%
–Using pre-selected, domain texts: 53% - 75%
–Using dynamically selected ontologies: 70%
The Semantic Web offers high quality data that can be used to improve ontology matching.

Evaluation - Error Analysis S1

Error Analysis S2 old Subsumption as generic relation. Subsumption as part-whole. Subsumption as role.

Findings (1)
Online ontologies are good enough to provide performance values comparable with other methods
All relations have a formal “explanation”
BUT:
–Sparseness in domain coverage
–Several modeling errors, most often the misuse of subsumption


PowerAqua
Open domain QA by exploring online available semantic data.
Input: natural language question. Output: answers from online semantic data.

Findings (2)
Online ontologies allowed answering 69% of our question set
BUT:
Weakly populated
–Most ontologies do not have enough instances
Sparseness in domain coverage
–Only 20% of the IR TREC topics covered
Limited amount of non-taxonomic relations
Low quality:
–Several modeling errors, most often the misuse of subsumption
–Unclear labels
–Missing domain and range information


Search in Tag Spaces
Let’s find photos of “animals which live in the water”
Query: Animal Water
5/24 ≈ 21% relevant
[Photo results labeled: Dog, Bird, Tiger, Cat, Landscape, Landscape, Landscape]

Bring in the SW…
[Diagram labels: Dolphin, Seal, Whale, Sea Elephant, Marine Mammal, Mammal, Animal; livesIn; Sea, Ocean, Body of Water, Water; or: Fish, FreshwaterFish, SaltwaterFish, livesIn]

Results
dolphin, seal, whale, sea elephant
18/24 = 75% relevant
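A rough sketch of how an ontology could turn the query "Animal Water" into the concrete tags listed above. The hierarchy and the single livesIn fact are a toy reconstruction from the slide labels, and the function names are hypothetical; this is not the system used in the experiment.

```python
# Toy knowledge base reconstructed from the slide labels (an assumption).
SUBCLASS_OF = {
    "dolphin": "marine mammal",
    "seal": "marine mammal",
    "whale": "marine mammal",
    "sea elephant": "marine mammal",
    "marine mammal": "mammal",
    "tiger": "mammal",
    "mammal": "animal",
    "sea": "body of water",
    "ocean": "body of water",
}
LIVES_IN = {"marine mammal": "sea"}


def is_a(term, target):
    """Walk up the hierarchy and check whether term is subsumed by target."""
    while term is not None:
        if term == target:
            return True
        term = SUBCLASS_OF.get(term)
    return False


def habitat(term):
    """Inherit livesIn from the nearest ancestor that declares it."""
    while term is not None:
        if term in LIVES_IN:
            return LIVES_IN[term]
        term = SUBCLASS_OF.get(term)
    return None


def expand_query(kind, place):
    """Return the most specific terms that are a `kind` living in a `place`."""
    leaves = set(SUBCLASS_OF) - set(SUBCLASS_OF.values())
    return sorted(
        t for t in leaves
        if is_a(t, kind) and habitat(t) is not None and is_a(habitat(t), place)
    )


print(expand_query("animal", "body of water"))
# Share of relevant photos before and after expansion, from the slides.
print(round(5 / 24 * 100), round(18 / 24 * 100))
```

Searching with the expanded tags instead of the literal words "animal" and "water" is what moves the share of relevant photos from 5/24 to 18/24 in the slides.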

FLOR - Folksonomy enrichment
Tags: kitten furry pets cow whiskers whale eye cat cute feline water deer primate bear lion rodent elephant fur ocean rabbit sea grass cute tree goat seal gorilla brown marine wild white cats eyes park animals otter mammal animal zoo nature dolphin farm
Ontology concepts: Dolphin, Seal, Marine Mammal, Sea, hasHabitat, Whale, Body of Water, Ocean, Mammal, Terrestrial Mammal, Tiger, Lion, Sea Elephant, Animal

FLOR - Experiment
[Folksonomy tag cloud]
Diagram labels: WordNet, Structure_WN, Structure_SW, Interface_WN, Interface_SW
Measures: richness of structure; increase in search results

Findings (3)
SW covers (some) multilingual tags
SW covers novel tags
BUT:
–on average, SW leads to fewer senses than WordNet per tag
–on average, SW leads to a weaker structure than obtained from WordNet
YET: Better results obtained when Structure_SW is used for querying
–Better alignment between tags and online concepts
–Less fine-grained structure

Findings
Good results obtained for relation discovery, open domain QA, improvement of search in folksonomies
Large scale
–More than 10K ontologies and growing!!!
–Larger than any knowledge source in KR/AI
Heterogeneous
–W.r.t. size, quality of conceptualization, etc.
Constantly evolving
–Covers new terms that don’t (yet) appear in WordNet
Multi-domain
Multilingual
Tools and APIs exist to allow its exploration

However…
Domain coverage is still rather limited
Ontology quality affects some applications:
–Modeling errors
–Few non-taxonomic relations
–Unclear labels for ontology entities
–Weakly populated
–Fewer senses than in WordNet
–Lack of domain and range information


The Web as a LR
Web 1.0
–Web-based relatedness (Cilibrasi & Vitanyi, 2007)
–Verifying semantic relations (Cimiano et al., 2004)


The Web as a LR
–Web-based relatedness (Cilibrasi & Vitanyi, 2007)
–Verifying semantic relations (Cimiano et al., 2004)
–Wikipedia-based relatedness (Strube et al., 2006)
–Folksonomy-based relatedness (Stumme et al., 2008)
[Figures: folksonomy tag cloud; ontology fragment with Dolphin, Seal, Marine Mammal, Sea, hasHabitat, Whale, Body of Water, Ocean, Mammal, Terrestrial Mammal, Tiger, Lion, Sea Elephant, Animal]
Besides deepening research on the frontier of Web2.0 and LRs, … the next important wave is in exploring Web3.0 resources.

LT SW
LT <--- SW:
–Complementary to existing LRs: additional senses, novel terms and relations
–Combine with other LRs
–How to explore redundancy of knowledge?
–How to explore heterogeneity?
LT ---> SW: Can LT methods help to:
–Increase domain coverage?
–Detect modeling errors? E.g., by checking evidence from Web, Wikipedia
–Improve anchoring? E.g., WSD methods

Thank you!

Strategy 2 - Definition
Principle: If no ontologies are found that contain the two terms, then combine information from multiple ontologies to find a mapping.
[Diagram labels: A, B, rel; Semantic Web: A’, B’, C, C’, rel]
Details: (1) Select all ontologies containing A’ equiv. with A (2) For each ontology containing A’: (a) if find relation between C and B. (b) if find relation between C and B.

Strategy 2 - Examples Vs. (midlevel-onto) (Tap) Ex1: Vs.Ex2: (r1) (pizza-to-go) (SUMO) (Same results for Duck, Goose, Turkey) (r1) Vs.Ex3: (pizza-to-go) (wine.owl) (r3)

Context: Ontology Matching
–Label similarity methods, e.g., Full_Professor = FullProfessor
–Structure similarity methods: using taxonomic/property-related information
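As an illustration of the label similarity idea, the sketch below normalizes labels before comparing them, so that Full_Professor and FullProfessor come out equal. The tokenization rules and function names are assumptions; real matchers typically add string distance measures on top of this kind of normalization.

```python
import re


def normalize(label):
    """Split a label on underscores, hyphens, spaces and camelCase boundaries."""
    # Insert a space where a lowercase letter or digit is followed by an uppercase letter.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    tokens = re.split(r"[\s_\-]+", spaced.strip())
    return tuple(t.lower() for t in tokens if t)


def labels_match(a, b):
    """Treat two labels as equivalent when their normalized tokens are equal."""
    return normalize(a) == normalize(b)


print(normalize("Full_Professor"))
print(labels_match("Full_Professor", "FullProfessor"))
```

Exact matching on normalized tokens only catches spelling variants; it says nothing about terms such as Supermarket and Building, which is where background knowledge comes in.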

New paradigm: use of background knowledge A B Background Knowledge (external source) A’ B’ R R

External Source = One Ontology Aleksovski et al. EKAW’06 Map (anchor) terms into concepts from a richly axiomatized domain ontology Derive a mapping based on the relation of the anchor terms Assumes that a suitable (rich, large) domain ontology (DO) is available.

Strategy 1 - Definition
Find ontologies that contain equivalent classes for A and B and use their relationship in the ontologies to derive the mapping.
[Diagram: A, B, rel; Semantic Web ontologies O1, O2, …, On containing (A1’, B1’), (A2’, B2’), …, (An’, Bn’)]
For each ontology use these rules: …
These rules can be extended to take into account indirect relations between A’ and B’, e.g., between parents of A’ and B’:

External Source = Web
van Hage et al. ISWC’05 rely on Google and an online dictionary in the food domain to extract semantic relations between candidate terms using IR techniques.
[Diagram labels: A, B, rel; + OnlineDictionary; IR Methods]
Precision increases significantly if domain specific sources are used: 50% - Web; 75% - domain texts.
Does not rely on a rich DO.