Web3.0 and Language Resources Marta Sabou Knowledge Media Institute (KMi) The Open University Exploiting Semantic Web Ontologies: An Experimental Report.


1 Web3.0 and Language Resources Marta Sabou Knowledge Media Institute (KMi) The Open University Exploiting Semantic Web Ontologies: An Experimental Report

2

3 Outline The Semantic Web –Online ontologies –Gateways to the Semantic Web Exploiting the Semantic Web –Relation discovery –Open Domain Question Answering –Folksonomy Enrichment Outlook for Language Technology

4 Scientific American, May 2001:

5 The Semantic Web Tim Berners-Lee: –“an extension of the current web (1) in which information is given well-defined meaning (2), better enabling computers and people to work in cooperation (3).” 1. The SW will gradually evolve out of the existing Web; it is not a competitor to the current WWW 2. Represent Web content in a form that is more easily machine-processable 3. An open platform allowing information to be shared and processed

6 Ontology Metadata UoD Elementaries - The Watson Blog http://watson.kmi.open.ac.uk:8080/blog/ "Oh dear! Where the Semantic Web is going to go now?" -- imaginary user 23 en Watson team Thu, 01 Mar 2007 13:49:52 GMT Pebble (http://pebble.sourceforge.net) http://backend.userland.com/rss … Zen wisteria Mathieu d'Aquin … <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string" >The Knoledge Media Institute of the Open University, Milton Keynes UK … DOAP FOAF DC RSS TAP WORDNET NCI Galen Music …

7 SW = A Conceptual Layer over the web

8 SW is Heterogeneous!

9 Interlinked, Semantic Data on the Web 2007 2008 2009

10 Semantic Web Gateways Search engines for the semantic data: collect, index and provide access to online semantic data. 10K ontologies 50 million semantic documents 250K ontologies and metadata

11 Semantic Web Status Online semantic data now constitutes the largest and most heterogeneous knowledge resource known in AI/KR. Semantic Web Gateways offer a way to access this data easily. So, the question is… How to use it? How to make the best out of it?

12 Next Generation Semantic Web Applications Dynamically retrieving, exploiting and combining relevant semantic resources from the SW, at large Gateway to the Semantic Web

13 IEEE Intelligent Systems 23(3), pp. 20-28, May/June 2008 Key aspects of the paradigm Tech. Infrastructure Concrete Applications

14 Outline The Semantic Web –Online ontologies –Gateways to the Semantic Web Exploiting the Semantic Web –Relation discovery –Open Domain Question Answering –Folksonomy Enrichment Outlook for Language Technology

15 Relation Discovery –SCARLET: relation discovery on the SW –http://scarlet.open.ac.uk/ –Automatically selects and combines multiple online ontologies to derive a relation Diagram: Scarlet accesses the Semantic Web to deduce the semantic relation between Concept_A (e.g., Supermarket) and Concept_B (e.g., Building). M. Sabou, M. d’Aquin, E. Motta, “Using the Semantic Web as Background Knowledge in Ontology Mapping”, Ontology Mapping Workshop, ISWC’06.

16 Two strategies Deriving relations from (A) one ontology and (B) across ontologies. (A) Strategy 1: Scarlet relates Supermarket and Building using one Semantic Web ontology containing Supermarket, Shop, PublicBuilding, Building. (B) Strategy 2: Scarlet relates Cholesterol and OrganicChemical across Semantic Web ontologies containing Cholesterol, Steroid, Lipid, OrganicChemical (linked through Steroid).
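Strategy 1 can be illustrated with a minimal Python sketch. It assumes that the concepts listed on the slide (Supermarket, Shop, PublicBuilding, Building) form a chain of subclass links inside one ontology; the transcript gives only the concept names, so that reading, the toy dictionary and the function names are all hypothetical and not Scarlet's actual implementation.

```python
from collections import deque

# Hypothetical toy ontology: each class maps to its direct superclasses.
# Reading the slide's chain as subclass links is an assumption.
ONTOLOGY = {
    "Supermarket": ["Shop"],
    "Shop": ["PublicBuilding"],
    "PublicBuilding": ["Building"],
    "Building": [],
}


def is_subclass_of(ontology, a, b):
    """Return True if b is reachable from a by following superclass links."""
    seen = {a}
    queue = deque([a])
    while queue:
        current = queue.popleft()
        for parent in ontology.get(current, []):
            if parent == b:
                return True
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return False


def strategy1(ontologies, a, b):
    """Return the first relation supported by a single ontology, else None."""
    for ontology in ontologies:
        # Only consider ontologies that mention both terms.
        if a in ontology and b in ontology:
            if is_subclass_of(ontology, a, b):
                return "subClassOf"
            if is_subclass_of(ontology, b, a):
                return "superClassOf"
    return None


print(strategy1([ONTOLOGY], "Supermarket", "Building"))
```

The sketch only follows subsumption links; the relations Scarlet derives in practice may be richer than the two returned here.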

17

18 Experiment Matching two large scale agricultural thesauri: AGROVOC: UN’s Food and Agriculture Organisation (FAO) thesaurus, 28,174 descriptor terms, 10,028 non-descriptor terms. NALT: US National Agricultural Library Thesaurus, 41,577 descriptor terms, 24,525 non-descriptor terms. M. Sabou, M. d’Aquin, E. Motta, “Exploring the Semantic Web as Background Knowledge in Ontology Matching”, Journal of Data Semantics, 2008.

19 Results - S1

20 226 Used Ontologies - S1 http://139.91.183.30:9090/RDF/VRP/Examples/tap.rdf http://reliant.teknowledge.com/DAML/SUMO.daml http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml http://reliant.teknowledge.com/DAML/Economy.daml http://gate.ac.uk/projects/htechsight/Technologies.daml

21 Results - S2

22 306 Used Ontologies - S2 http://139.91.183.30:9090/RDF/VRP/Examples/tap.rdf http://reliant.teknowledge.com/DAML/SUMO.daml http://a.com/ontology http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml http://www.dannyayers.com/2003/08/udef.rdfs http://gate.ac.uk/projects/htechsight/Technologies.daml http://reliant.teknowledge.com/DAML/Economy.daml

23 Evaluation Manual assessment of 1000 mappings (15%) Performed for both strategies Evaluators: –Researchers in the area of the Semantic Web –10 people split in two groups

24 Evaluation - Precision S1 S2

25 Indicative Comparison with Other Techniques Traditional Matching (only eq.): 54% - 83% Using a single, pre-selected domain ontology: 76% Using the entire Web (via Google): 38% - 50% Using pre-selected, domain texts: 53% - 75% Using dynamically selected ontologies: 70% The Semantic Web offers high quality data that can be used to improve ontology matching.

26 Evaluation - Error Analysis S1

27 Error Analysis S2 old Subsumption as generic relation. Subsumption as part-whole. Subsumption as role.

28 Findings (1) Online ontologies are good enough to provide performance values comparable with other methods All relations have a formal “explanation” BUT: Sparseness in domain coverage Several modeling errors, most often the misuse of subsumption

29 Outline The Semantic Web –Online ontologies –Gateways to the Semantic Web Exploiting the Semantic Web –Relation discovery –Open Domain Question Answering –Folksonomy Enrichment Outlook for Language Technology

30 PowerAqua Natural language question Answers from online semantic data Open domain QA by exploring online available semantic data.

31 Findings (2) Online ontologies allowed answering 69% of our question set BUT: Weakly populated –Most ontologies do not have enough instances Sparseness in domain coverage –Only 20% of the IR TREC topics covered Limited amount of non-taxonomic relations Low quality: –Several modeling errors, most often the misuse of subsumption –Unclear labels –Missing domain and range information

32 Outline The Semantic Web –Online ontologies –Gateways to the Semantic Web Exploiting the Semantic Web –Relation discovery –Open Domain Question Answering –Folksonomy Enrichment Outlook for Language Technology

33 Search in Tag Spaces Let’s find photos of “animals which live in the water” Query: Animal Water Result tags: Dog Bird Tiger Cat Landscape Landscape Landscape 5/24 ≈ 21% relevant

34 Bring in the SW… Dolphin Seal Marine Mammal Mammal Sea livesIn Whale Body of Water Ocean Sea Elephant Fish livesIn Animal FreshwaterFish SaltwaterFish livesIn Animal Water or

35 Results dolphin seal whale sea elephant 18/24 ≈ 75% relevant
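The jump from 5/24 to 18/24 relevant photos comes from replacing the literal query terms with ontology concepts that satisfy them. The following Python sketch shows the idea on a hypothetical fragment loosely modelled on the concept names of slide 34; the exact links, the inheritance of livesIn, and the anchoring of the query tag "water" to the concept "Body of Water" are assumptions made for illustration only.

```python
# Hypothetical ontology fragment; the subclass and livesIn links are assumed.
SUBCLASS_OF = {
    "Dolphin": "Marine Mammal",
    "Seal": "Marine Mammal",
    "Whale": "Marine Mammal",
    "Sea Elephant": "Marine Mammal",
    "Marine Mammal": "Mammal",
    "Mammal": "Animal",
    "Sea": "Body of Water",
    "Ocean": "Body of Water",
}
LIVES_IN = {"Marine Mammal": "Sea"}


def ancestors(cls):
    """Yield cls and every superclass above it."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def habitat(cls):
    """Return the livesIn value of cls, inherited from the nearest superclass."""
    for c in ancestors(cls):
        if c in LIVES_IN:
            return LIVES_IN[c]
    return None


def expand(kind, place):
    """Return search tags for leaf classes of `kind` that live in a `place`."""
    leaves = set(SUBCLASS_OF) - set(SUBCLASS_OF.values())
    tags = []
    for cls in sorted(leaves):
        home = habitat(cls)
        if kind in ancestors(cls) and home is not None and place in ancestors(home):
            tags.append(cls.lower())
    return tags


print(expand("Animal", "Body of Water"))
```

With this toy fragment the expanded query consists of the tags dolphin, sea elephant, seal and whale, which are the same tags shown on the results slide.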

36 FLOR - Folksonomy enrichment kitten furry pets cow whiskers whale eye cat cute feline water deer primate bear lion rodent elephant fur ocean rabbit sea grass cute tree goat seal gorilla brown marine wild white cats eyes park animals otter mammal animal zoo nature dolphin farm Dolphin Seal Marine Mammal Sea hasHabitat Whale Body of Water Ocean Mammal Terrestrial Mammal Tiger Lion Sea Elephant Animal

37 FLOR - Experiment kitten furry pets cow whiskers whale eye cat cute feline water deer primate bear lion rodent elephant fur ocean rabbit sea grass cute tree goat seal gorilla brown marine wild white cats eyes park animals otter mammal animal zoo nature dolphin farm Structure_WN Structure_SW Interface_WN Interface_SW Richness of structure Increase in Search results WordNet

38 Findings (3) SW covers (some) multilingual tags SW covers novel tags BUT: on average, SW leads to fewer senses than WordNet per tag on average, SW leads to a weaker structure than obtained from WordNet YET: Better results obtained when Structure_SW is used for querying –Better alignment between tags and online concepts –Less fine-grained structure

39 Findings Good results obtained for relation discovery, open domain QA, improvement of search in folksonomies Large scale –More than 10K ontologies and growing! –Larger than any knowledge source in KR/AI Heterogeneous –W.r.t. size, quality of conceptualization, etc. Constantly evolving –Covers new terms that don’t (yet) appear in WordNet Multi-domain Multilingual Tools and APIs exist to allow its exploration

40 However… Domain coverage is still rather limited Ontology quality affects some applications: –Modeling errors –Few non-taxonomic relations –Unclear labels for ontology entities –Weakly populated –Fewer senses than in WordNet –Lack of domain and range information

41 Outline The Semantic Web –Online ontologies –Gateways to the Semantic Web Exploiting the Semantic Web –Relation discovery –Open Domain Question Answering –Folksonomy Enrichment Outlook for Language Technology

42 The Web as a LR Web 1.0 Web-based relatedness Cilibrasi & Vitanyi, 2007 Verifying semantic relations Cimiano et al., 2004

43 The Web as a LR kitten furry pets cow whiskers whale eye cat cute feline water deer primate bear lion rodent elephant fur ocean rabbit sea grass cute tree goat seal gorilla brown marine wild white cats eyes park animals otter mammal animal zoo nature dolphin farm Web 2.0 + Wikipedia based relatedness Strube et al., 2006 Folksonomy based relatedness Stumme et al., 2008 Web-based relatedness Cilibrasi & Vitanyi, 2007 Verifying semantic relations Cimiano et al., 2004

44 The Web as a LR kitten furry pets cow whiskers whale eye cat cute feline water deer primate bear lion rodent elephant fur ocean rabbit sea grass cute tree goat seal gorilla brown marine wild white cats eyes park animals otter mammal animal zoo nature dolphin farm Dolphin Seal Marine Mammal Sea hasHabitat Whale Body of Water Ocean Mammal Terrestrial Mammal Tiger Lion Sea Elephant Animal Web-based relatedness Cilibrasi & Vitanyi, 2007 Verifying semantic relations Cimiano et al., 2004 Wikipedia based relatedness Strube et al., 2006 Folksonomy based relatedness Stumme et al., 2008 Besides deepening research on the frontier of Web2.0 and LRs, … the next important wave is in exploring Web3.0 resources. Web 3.0 + +

45 LT SW LT <--- SW: –Complementary to existing LRs Additional senses, novel terms and relations –Combine with other LRs –How to explore redundancy of knowledge? –How to explore heterogeneity? LT ---> SW: Can LT methods help to: –Increase domain coverage? –Detect modeling errors? E.g., by checking evidence from Web, Wikipedia –Improve anchoring? E.g., WSD methods

46 Thank you!

47 Strategy 2 - Definition Principle: If no ontologies are found that contain the two terms then combine information from multiple ontologies to find a mapping. AB rel Semantic Web A’ BC C’ B’rel Details: (1) Select all ontologies containing A’ equiv. with A (2) For each ontology containing A’: (a) if find relation between C and B. (b) if find relation between C and B.
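The cross-ontology principle can be sketched in Python. The conditions in steps (a) and (b) did not survive in the transcript, so the sketch assumes one plausible case only: A’ has a superclass C in one ontology, and a different ontology relates C to B by subsumption. The split of the Cholesterol example over two ontologies, the dictionaries and the function names are hypothetical.

```python
def ancestors(ontology, term):
    """Return all classes reachable from term via superclass links."""
    found, stack = set(), [term]
    while stack:
        for parent in ontology.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def strategy2(ontologies, a, b):
    """Relate a and b by chaining two different ontologies through a class c.

    Assumed case: a is below c in one ontology, c is below b in another.
    Returns (a, relation, b, bridging class) or None.
    """
    for first in ontologies:
        if a not in first:
            continue
        for c in ancestors(first, a):
            for second in ontologies:
                if second is not first and b in ancestors(second, c):
                    return (a, "subClassOf", b, c)
    return None


# Hypothetical split of the slide's example across two ontologies.
O1 = {"Cholesterol": ["Steroid"]}
O2 = {"Steroid": ["Lipid"], "Lipid": ["OrganicChemical"]}

print(strategy2([O1, O2], "Cholesterol", "OrganicChemical"))
```

The bridging class is returned alongside the relation so that each derived mapping keeps its formal "explanation", in the spirit of the findings reported for Scarlet.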

48 Strategy 2 - Examples Vs. (midlevel-onto) (Tap) Ex1: Vs.Ex2: (r1) (pizza-to-go) (SUMO) (Same results for Duck, Goose, Turkey) (r1) Vs.Ex3: (pizza-to-go) (wine.owl) (r3)

49 Context: Ontology Matching –Label similarity methods e.g., Full_Professor = FullProfessor –Structure similarity methods Using taxonomic/property related information Diagram values: 1 0.9 1 0.5
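A label similarity method of the kind mentioned above can be sketched as follows. This is only an illustrative Python example, not the matcher used in the reported experiments: it normalizes separators and camelCase, then compares the results with the standard library's `difflib.SequenceMatcher`. The function names are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(label):
    """Split camelCase, drop underscores and hyphens, and lowercase."""
    # Insert a space wherever a lowercase letter or digit precedes an uppercase letter.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    tokens = re.split(r"[\s_\-]+", spaced.strip())
    return " ".join(t.lower() for t in tokens if t)


def label_similarity(a, b):
    """Return a similarity score in [0, 1] between two normalized labels."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


print(normalize("Full_Professor"), "|", normalize("FullProfessor"))
print(label_similarity("Full_Professor", "FullProfessor"))
```

Both labels normalize to the same string, so they are treated as equivalent; labels that differ after normalization receive a score below 1.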

50 New paradigm: use of background knowledge A B Background Knowledge (external source) A’ B’ R R

51 External Source = One Ontology Aleksovski et al. EKAW’06 Map (anchor) terms into concepts from a richly axiomatized domain ontology Derive a mapping based on the relation of the anchor terms Assumes that a suitable (rich, large) domain ontology (DO) is available.

52 Strategy 1 - Definition Find ontologies that contain equivalent classes for A and B and use their relationship in the ontologies to derive the mapping. AB rel Semantic Web A1’ B1’ A2’ B2’ An’ Bn’ O1 O2 On For each ontology use these rules: … These rules can be extended to take into account indirect relations between A’ and B’, e.g., between parents of A’ and B’:

53 External Source = Web van Hage et al. ISWC’05 rely on Google and an online dictionary in the food domain to extract semantic relations between candidate terms using IR techniques AB rel + OnlineDictionary IR Methods Precision increases significantly if domain specific sources are used: 50% - Web; 75% - domain texts. Does not rely on a rich DO

