Published by Nathaniel Monroe. Modified over 11 years ago.
1
Large Scale Integration of Senses for the Semantic Web
Jorge Gracia, Mathieu d'Aquin, Eduardo Mena
Computer Science and Systems Engineering Department (DIIS), University of Zaragoza, Spain
Knowledge Media Institute (KMi), Open University, United Kingdom
18th International World Wide Web Conference, Madrid, Spain, 20th-24th April 2009
2
Outline: Introduction, Method, Optimization study, Experiments, Conclusions
3
Introduction
Current Semantic Web: favoured by the increasing number of online ontologies already available on the Web; hampered by the high heterogeneity that this growing semantic content introduces.
The redundancy problem: an excess of different semantic descriptions, coming from different sources, that describe the same intended meaning.
Our proposal: a method to cluster the ontology terms found on the Semantic Web according to the meaning they intend to represent.
4
Introduction
5
Introduction
6
Introduction
Redundancy problem: many representations of the same meanings.
Figure labels: Watson, apple, The Semantic Web.
7
Introduction
Proposed solution: a pool of cross-ontology integrated senses.
Figure labels: Watson, apple, The Semantic Web; clustered senses: The Fruit, The Tree, The Company.
8
Introduction
Figure labels: Watson; The Semantic Web; Multiontology Semantic Disambiguator; Ontology Evolution; Semantic Browsing; Scarlet Ontology Matching; Folksonomy Enrichment; QueryGen Semantic Query Generation; Question Answering.
9
Method
Figure: flowchart of the method, split into an OFF-LINE part and a RUN-TIME part.
OFF-LINE labels: Watson; ontology terms; keyword maps; synonym expansion; synonym maps; sense clustering (for each synonym map): extraction, similarity computation (CIDER), "similarity > threshold?" (yes: integration), "more ont. terms?"; senses.
RUN-TIME labels: "modify integration?" (yes: modify integration degree); "rise threshold?" (yes: disintegration; no: senses clustering).
10
Method
Keyword maps: ontology terms with identical label.
Figure labels: Watson, apple.
11
Method
Synonym maps: ontology terms with synonym labels.
Figure labels: Watson; apple, apple tree, Apple Inc., manzana.
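The keyword maps and synonym maps described on these two slides can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the example URIs, the hand-written synonym table, the case-insensitive label matching, and the function names `build_keyword_maps` and `build_synonym_map`. It does not reproduce Watson's API or the authors' implementation.

```python
from collections import defaultdict

# Hypothetical ontology terms as (URI, label) pairs; not real Watson output.
TERMS = [
    ("http://example.org/food#Apple", "apple"),
    ("http://example.org/botany#Apple", "Apple"),
    ("http://example.org/botany#AppleTree", "apple tree"),
    ("http://example.org/business#AppleInc", "Apple Inc."),
    ("http://example.org/comida#Manzana", "manzana"),
    ("http://example.org/zoo#Turkey", "turkey"),
]

# Hypothetical synonym table: keyword -> labels treated as its synonyms.
SYNONYMS = {"apple": ["apple tree", "apple inc.", "manzana"]}


def build_keyword_maps(terms):
    """Group term URIs by label (assumed case-insensitive): one map per keyword."""
    maps = defaultdict(list)
    for uri, label in terms:
        maps[label.strip().lower()].append(uri)
    return dict(maps)


def build_synonym_map(keyword, keyword_maps, synonyms):
    """Merge the keyword map of `keyword` with the maps of its synonym labels."""
    labels = [keyword] + synonyms.get(keyword, [])
    merged = []
    for label in labels:
        merged.extend(keyword_maps.get(label.lower(), []))
    return merged


keyword_maps = build_keyword_maps(TERMS)
apple_synonym_map = build_synonym_map("apple", keyword_maps, SYNONYMS)
print(len(keyword_maps["apple"]), len(apple_synonym_map))
```

In this toy data the keyword map for "apple" holds the two terms whose label is "apple", while the synonym map also collects the terms labelled "apple tree", "Apple Inc." and "manzana". The synonym map is the set of candidates that the later clustering step would split into senses.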
12
Method
Agglomerative clustering, using CIDER.
Figure: successive clustering steps over the terms a, b, c, d, e.
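A threshold-based agglomerative clustering of this kind might look like the following Python sketch. It is not the authors' algorithm: CIDER is not reproduced, and a plain Jaccard overlap between hypothetical bags of descriptive words stands in for it. The toy terms a to e, the pooling of descriptions on merge, and the name `cluster_senses` are likewise assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Stand-in similarity in [0, 1]; CIDER itself is not reproduced here."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_senses(terms, threshold, similarity=jaccard):
    """Greedy agglomerative clustering (illustrative only).

    `terms` maps a term name to a set of descriptive words. Each term starts
    as its own sense; the most similar pair of senses is merged repeatedly
    until no pair reaches `threshold`.
    """
    # Each sense is a pair: (member term names, pooled description).
    senses = [({name}, set(desc)) for name, desc in terms.items()]
    while len(senses) > 1:
        (i, j), best = max(
            (((i, j), similarity(senses[i][1], senses[j][1]))
             for i, j in combinations(range(len(senses)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        merged = (senses[i][0] | senses[j][0], senses[i][1] | senses[j][1])
        senses = [s for k, s in enumerate(senses) if k not in (i, j)] + [merged]
    return [members for members, _ in senses]


# Hypothetical descriptions for five terms that share a label.
EXAMPLE_TERMS = {
    "a": {"fruit", "food", "sweet"},
    "b": {"fruit", "food", "red"},
    "c": {"tree", "plant", "fruit"},
    "d": {"company", "computer"},
    "e": {"company", "computer", "phone"},
}

for t in (0.6, 0.3, 0.15):
    print(t, len(cluster_senses(EXAMPLE_TERMS, t)))
```

With this toy data the number of senses falls from 4 to 3 to 2 as the threshold is lowered from 0.6 to 0.3 to 0.15: a lower threshold integrates more terms into the same sense, and a higher one keeps them apart.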
13
Method
Sense maps: semantically equivalent terms grouped.
Figure: using CIDER, the terms labelled apple, apple tree, Apple Inc. and manzana are grouped into three senses: The Fruit, The Tree, The Company.
14
Method
Falling threshold (integration); rising threshold (disintegration); optimal threshold.
15
Optimization study
Integration level varies with similarity threshold.
Integration Level = 1 - (# finalSenses / # initialOntologyTerms)
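As a worked instance of this formula, the short Python sketch below computes the integration level from the two counts. The function name `integration_level` and its input checks are assumptions added for illustration. The sample figures (58 ontology terms, 9 integrated senses at threshold 0.27) are the ones the slides report for the keyword "turkey".

```python
def integration_level(n_final_senses, n_initial_terms):
    """Integration level = 1 - (# final senses / # initial ontology terms)."""
    if n_initial_terms <= 0:
        raise ValueError("need at least one initial ontology term")
    if not 0 < n_final_senses <= n_initial_terms:
        raise ValueError("final senses must be between 1 and the initial term count")
    return 1 - n_final_senses / n_initial_terms


# "turkey" example from the slides: 58 terms integrated into 9 senses.
level = integration_level(9, 58)
print(f"{level:.3f}")  # prints 0.845
```

A level of 0 means that nothing was integrated (every term remains its own sense). The level approaches 1 as more terms collapse into fewer senses.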
16
Optimization study
Which similarity threshold is the best one? Three ways of exploring:
Experimenting with ontology matching benchmarks: obtained a lower bound of 0.13 for the optimal threshold.
Contrasting with human opinion: range of good values between 0.2 and 0.3.
Optimizing response time, because: it reduces the response time of the overall system; it is compatible with the other two ways; and it is not always feasible to have reference alignments or a large enough number of humans to ask.
17
Optimization study
Response time varies with threshold. Optimal value around 0.22.
18
Experiments
Scalability study: 9156 keywords, 73169 different ontology terms to be clustered. Processing time is linear in the number of ontology terms.
19
Experiments
Scalability study: processing time is independent of ontology size.
20
Experiments
Illustrative example: keyword = turkey; synonym map = turkey, Türkei, Türkiye; number of ontology terms = 58; number of integrated senses = 9 (threshold = 0.27).
21
Experiments
More examples (threshold = 0.19)
Keyword        # initial terms    # final senses
appalachian    7                  1
apple          39                 7
free           51                 2
mace           7                  3
plant          52                 18
poll           5                  4
stein          5                  1
turkey         58                 8
22
Experiments
Positive facts:
Terms from different versions of the same ontology are easily detected.
Very different meanings are not wrongly integrated (e.g., plant as a living organism and plant as an industrial building).
Negative facts:
It is hard to obtain a total integration of the same meanings (because of very different semantic descriptions).
23
Conclusions
Redundancy of semantic descriptions on the Web can be significantly reduced.
Our integration technique scales when used on a large body of knowledge.
The proposed method is flexible enough to configure and adapt the integration level to the needs of client applications.
Future work:
More advanced prototype.
More extensive human-based evaluation.
Study and evaluation of the impact on other systems.
24
End of presentation. Thank you!