Retrieval Effectiveness of an Ontology-based Model for Information Selection Khan, L., McLeod, D. & Hovy, E. Presented by Danielle Lee.


Agenda
– Study Purpose
– Target Data Processing
– Ontology Development
– Metadata
– Query Mechanism
– Experiment
– Results
– Discussion

Study Purpose
Keyword-based search retrieves only documents that contain the user’s specified keywords.
Goal: to find documents that contain the desired semantic information even when they do not contain the user-specified keywords.
To develop and evaluate a disambiguation algorithm that prunes irrelevant concepts and expands information selection using an ontology.
– Extraction of semantic concepts from the keywords.
– Document indexing.

Target Data Processing
Sports news audio clips from CNN and Fox.
– Segmentation of audio: identify entry points (jump locations) as the boundaries of news items of interest.
– Content extraction: closed captions from the CNN Web site and Fox Sports are used rather than labor-intensive speech recognition.
– Definition of an audio object: specifies the content of segments. A sequence of contiguous segments is defined as an audio object. Each object is described by metadata such as object ID, starting time, ending time, and description.

Ontology Development (1)
A domain-dependent ontology for sports news.
Each concept has a unique name and a synonyms list.
– The synonyms list is used to disambiguate keywords.
Interrelationships
– Specialization / concept inclusion (IS-A): exclusive or non-exclusive. A generalized concept (super concept) in an exclusive relation is called a nonparticipant concept (NPC); it is not used in metadata generation or SQL query generation. ex) “NFL” is a kind of “Professional” league; “Professional” is an NPC.
– Instantiation (Instance-Of) ex) All players and teams are instances of the concepts “Player” and “Team.”
– Component membership (Part-Of) ex) “NFL” is part of the concept “football.”
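The concept structure described on this slide can be sketched as a small data model. This is a minimal illustration, not the authors' implementation: the class names, field names, and the example synonyms ("Lakers", "LA Lakers", "National Football League") are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One ontology concept with its synonyms and interrelationships."""
    name: str
    synonyms: set = field(default_factory=set)       # alternative surface forms
    is_a: list = field(default_factory=list)         # names of super concepts
    instance_of: list = field(default_factory=list)  # e.g. "Team", "Player"
    part_of: list = field(default_factory=list)      # component membership
    npc: bool = False                                # nonparticipant concept


class Ontology:
    def __init__(self):
        self.concepts = {}

    def add(self, concept):
        self.concepts[concept.name] = concept

    def concepts_for_keyword(self, keyword):
        """Return names of concepts whose name or synonyms match the keyword."""
        key = keyword.lower()
        return sorted(
            c.name
            for c in self.concepts.values()
            if key == c.name.lower() or key in {s.lower() for s in c.synonyms}
        )


# Hypothetical fragment of a sports ontology, following the slide's examples.
ontology = Ontology()
ontology.add(Concept("Professional", npc=True))
ontology.add(Concept("Football"))
ontology.add(Concept("NFL", synonyms={"National Football League"},
                     is_a=["Professional"], part_of=["Football"]))
ontology.add(Concept("Team"))
ontology.add(Concept("Los Angeles Lakers", synonyms={"Lakers", "LA Lakers"},
                     instance_of=["Team"]))

print(ontology.concepts_for_keyword("lakers"))
```

A keyword may match several concepts through their synonyms lists, which is why a later disambiguation step is needed.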

Ontology Development (2)
Disjunct concepts
– A number of concepts associated with a parent concept through IS-A interrelationships.
– An object whose metadata contains one disjunct concept cannot be associated with another object whose metadata contains a different disjunct concept. This helps disambiguate concepts. ex) An object having “NBA” and an object having “NFL”.
– This grouping forms regions.
Creating an ontology
– All possible concepts are listed and then grouped using Yahoo’s hierarchy (team, player, manager, etc.).
– The maximum concept depth is six, and the maximum branching factor is 28.

Test for Ontology Coverage
Aim: to select concepts from the ontology for the annotated text of audio clips from CNN and Fox sports news.
– 90.5% of the clips were associated with concepts in the ontology.
– For 9.5% of the clips, no relevant concepts were found.
– This is due to the incompleteness of the ontology.

Developed Ontology

Metadata
Purpose: to name the concepts of audio objects.
Using the descriptive keywords acquired by content extraction, the authors connected each description with terms in the ontology.
– Descriptive keyword : concept = 1 : many
– A disambiguation algorithm is therefore needed, based on:
  Co-occurrence
  Semantic closeness

Query Mechanism (1)
Keywords in the user request are matched to concepts through the concepts’ synonyms lists. Once the related concepts are found, a DB query is generated in two steps:
1) Pruning irrelevant concepts
2) Query expansion and SQL query generation

Query Mechanism (2)
1) Pruning irrelevant concepts
– Element-score and concept-score: used to choose the most appropriate concept.
– Semantic distance: the shortest path between two concepts in the ontology.
– Propagated-score

Partial Ontology of Selected Concepts
[Figure: partial ontology rooted at NBA, with the teams Vancouver Grizzlies, Cleveland Cavaliers, Los Angeles Lakers, and New Jersey Nets and the players Bryant Reeves, Mark Bryant, and Kobe Bryant. Nodes are annotated with a score and a propagated score (P-Score); the pairs shown are (0.5, 1.5), (1.0, 1.5), (0.5, 0.5), and (0.5, 0.5).]
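Semantic distance, defined above as the shortest path between two concepts in the ontology, can be sketched with a breadth-first search over an undirected concept graph. The function name and the edge list are assumptions for illustration; the transcript does not preserve the figure's actual links, and the scoring formulas are not reproduced here.

```python
from collections import deque


def semantic_distance(edges, start, goal):
    """Number of edges on the shortest path between two concepts, or None."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if start not in graph or goal not in graph:
        return None
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # the two concepts are not connected


# Assumed links between concepts named on the slide.
edges = [
    ("NBA", "Los Angeles Lakers"),
    ("NBA", "Vancouver Grizzlies"),
    ("Los Angeles Lakers", "Kobe Bryant"),
    ("Vancouver Grizzlies", "Bryant Reeves"),
]

print(semantic_distance(edges, "Kobe Bryant", "Los Angeles Lakers"))
print(semantic_distance(edges, "Kobe Bryant", "Bryant Reeves"))
```

Under these assumed links, a player is closer to their own team than to a player on another team, which is the intuition behind using distance to prune candidate concepts for an ambiguous keyword such as "Bryant".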

Query Mechanism (3)
2) Query expansion and SQL query generation
If the selected concept is neither an NPC nor a leaf node concept, no further progress is made. Otherwise, each concept is added to the query in disjunctive form.
ex) Tell me about Los Angeles Lakers
SELECT Time_start, Time_end FROM Audio_news a, Meta_news m WHERE a.Id = m.Id AND (Label = 'NBATeam11' OR Label = 'NBAPlayer9' OR Label = 'NBAPlayer10' …)
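The expansion in the example can be sketched as follows. This is one reading that matches the example query, not the authors' code: the selected concept and its descendants are collected, NPC concepts are skipped (they are not used in SQL generation), and the remaining labels are OR-ed together. The helper names and the child mapping are hypothetical, and the query uses `?` placeholders instead of inline string literals.

```python
def expand_concept(children, npc, selected):
    """Collect the selected concept and its descendants, skipping NPC concepts."""
    labels, stack, seen = [], [selected], set()
    while stack:
        concept = stack.pop()
        if concept in seen:
            continue
        seen.add(concept)
        if concept not in npc:
            labels.append(concept)
        # Reverse so that children are visited in their listed order.
        stack.extend(reversed(children.get(concept, [])))
    return labels


def build_query(labels):
    """Build a parameterized SQL query with the labels in disjunctive form."""
    if not labels:
        raise ValueError("at least one concept label is required")
    condition = " OR ".join("Label = ?" for _ in labels)
    sql = (
        "SELECT Time_start, Time_end "
        "FROM Audio_news a, Meta_news m "
        f"WHERE a.Id = m.Id AND ({condition})"
    )
    return sql, list(labels)


# Hypothetical mapping from a team label to its player labels.
children = {"NBATeam11": ["NBAPlayer9", "NBAPlayer10"]}
labels = expand_concept(children, npc=set(), selected="NBATeam11")
sql, params = build_query(labels)
print(sql)
print(params)
```

Passing the labels as parameters keeps the generated statement safe to execute even if a concept label contains quote characters.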

Experiment
Independent variables
– Search mechanism (keyword-based search vs. query expansion using the ontology)
– Vector space model-based keyword search
– The kind of query
  Broader/generic queries (ex. “tell me about basketball”)
  Narrow/specific queries (ex. “tell me about Los Angeles Lakers”)
  Context query formation (ex. “Tell me Laker’s Kobe/Boxer Mike Tyson”)
Dependent variables
– Precision
– Recall
– F value (the combined value of precision and recall)
Data: 2481 audio clips (usually less than 5 min.; .wav or .ram files with closed captions) and around 7000 concepts in the ontology.

Analytical Results
Equation (F score)
– The F score is high only when both precision and recall are high.
– When no relevant documents have been retrieved, the F score is 0.
– When all retrieved documents are relevant and all relevant documents have been retrieved, the F score is 1.
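The equation on this slide is not preserved in the transcript. The sketch below assumes the F score is the usual harmonic mean of precision and recall, F = 2PR / (P + R), which is consistent with the first two properties listed above; the function name is hypothetical.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F) for one query.

    Assumes F is the harmonic mean of precision and recall.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # With no relevant document retrieved, both values are 0 and F is 0.
    f = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f


# Four clips retrieved, one of them relevant; two relevant clips exist.
p, r, f = precision_recall_f(["c1", "c2", "c3", "c4"], ["c1", "c9"])
print(p, r, round(f, 3))
```

Because the harmonic mean is dominated by the smaller of the two values, a system cannot reach a high F score by maximizing recall alone (retrieving everything) or precision alone (retrieving one safe document).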

Result (Recall)
[Charts: generic/broader queries, specific/narrow queries, context queries]

Result (Precision)
[Charts: generic/broader queries, specific/narrow queries, context queries]

Result (F Score)
[Charts: generic/broader queries, specific/narrow queries, context queries]

Results (Total)

Conclusion & Discussion
Ontology-based query expansion outperformed keyword-based search for all three kinds of query.
The approach is fully automatic and is applicable to other ontology-based systems, such as a job recommender system.
A precondition of this research is a well-defined ontology hierarchy.