Cross-language Information Retrieval

Presentation transcript:

Cross-language Information Retrieval
Joseph Park

After explaining how it works: display
Extraction results
Selection and projection transformation vs. translation of query (flaw is the keyword portion of search)
In-language querying doesn't work as well
Meta-word and stop word removal
Currency and unit conversion
Transliteration of names and places
Translation results
Works better with the whole system
Explain why the two experiments are enough
Conclusion & future work

Motivation
Korean query: 11,000,000원 보다 싸고 마일리지가 320,000km보다 적은 4륜구동 다지 자동차를 찾아라
English query: Find me a Dodge, less than $10,000, less than 200k miles, four wheel drive

Key Concepts
Extraction Ontology: conceptual model for extracting and storing data
ML-HyKSS: MultiLingual Hybrid Keyword and Semantic Search
Query Transformation: semantic rewrite of a search query from one language to another

Language-Agnostic Ontology
ML-HyKSS

English query: Find me a Dodge, less than $10,000, less than 200k miles, four wheel drive
English constraints: Dodge, < 10000, < 200000
(Language-Agnostic Ontology)
Korean constraints: 닷지, < 11257000, < 124274

제조사 (Make)    가격 (Price)    마일리지 (Mileage)
닷지             8100000         148000
                 7100000         148988
                 9000000         106707
                 6500000         44799
                 9500000         3500
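The transformation on this slide can be illustrated with a minimal sketch: each constraint of the English query is rewritten at the concept level (make, price, mileage) and then applied to extracted Korean ad records. This is not the ML-HyKSS implementation. The names `CarQuery`, `to_korean`, `matches`, and `MAKE_LEXICON` are hypothetical, and the exchange rate and the miles-to-kilometres direction are assumptions, so the numbers it produces need not match the figures on the slide.

```python
from dataclasses import dataclass

# Assumed conversion constants, for illustration only.
KRW_PER_USD = 1125.7    # hypothetical exchange rate
KM_PER_MILE = 1.609344  # definition of the international mile

# Hypothetical lexicon mapping an English make name to its Korean form.
MAKE_LEXICON = {"Dodge": "닷지"}


@dataclass(frozen=True)
class CarQuery:
    """Conceptual-level constraints: make, upper price bound, upper mileage bound."""
    make: str
    max_price: float
    max_mileage: float


def to_korean(query: CarQuery) -> CarQuery:
    """Rewrite each constraint into Korean terms and units (won, kilometres)."""
    return CarQuery(
        make=MAKE_LEXICON[query.make],
        max_price=round(query.max_price * KRW_PER_USD),
        max_mileage=round(query.max_mileage * KM_PER_MILE),
    )


def matches(ad: dict, query: CarQuery) -> bool:
    """True if an extracted ad record satisfies every constraint."""
    return (
        ad["make"] == query.make
        and ad["price"] < query.max_price
        and ad["mileage"] < query.max_mileage
    )


english = CarQuery(make="Dodge", max_price=10_000, max_mileage=200_000)
korean = to_korean(english)

ads = [
    {"make": "닷지", "price": 8_100_000, "mileage": 148_000},  # first row of the slide's table
    {"make": "닷지", "price": 12_000_000, "mileage": 90_000},  # hypothetical ad over the price limit
]
hits = [ad for ad in ads if matches(ad, korean)]
print(korean)
print(hits)
```

Because the constraints are carried as typed values rather than as text, only the lexicon entry and the unit factors are language-specific; the comparison operators are untouched by the rewrite.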

Evaluation Results
Validation + Test Sets

Korean Car Ads (20 validation pages + 80 blind test pages)
Columns: Declared, Extracted, Correct, Precision, Recall, F-Measure
모델 (Model): 107 106 0.99
가격 (Price): 1.00
마일리지 (Mileage): 102 0.95
제조사 (Make)
년식 (Year)
색상 (Color)

Car Ad Queries (10 validation queries + 40 blind test queries)
Columns: Recall and Precision for σ (selection), π (projection), κ (keyword)
Korean-to-English: 98% 100% 93% 99% 52%

Selection and projection translations are always correct because ML-HyKSS translates them at the conceptual level, by matching methods and object sets respectively, which are necessarily in a one-to-one correspondence.
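For reference, the three scores in the extraction table can be computed from the three counts as sketched below. This assumes the usual reading of the columns (precision = Correct / Extracted, recall = Correct / Declared, F-measure = their harmonic mean). The function name and the counts in the example are hypothetical and are not taken from the slide.

```python
def precision_recall_f(declared: int, extracted: int, correct: int):
    """Return (precision, recall, F-measure) from annotation counts.

    declared:  values present in the hand-annotated ground truth
    extracted: values the system produced
    correct:   extracted values that agree with the ground truth
    """
    precision = correct / extracted if extracted else 0.0
    recall = correct / declared if declared else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Illustrative counts only.
p, r, f = precision_recall_f(declared=100, extracted=95, correct=90)
print(f"{p:.2f} {r:.2f} {f:.2f}")  # prints "0.95 0.90 0.92"
```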

Conclusions
Cross-language query transformation retains semantics
Extensive knowledge-base required for lexicon mappings
Keyword transformation may be difficult

Future Work
Dynamic augmentation of language-agnostic ontology
Integration of WordNet for meta-word synset