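Integration of Database and Information Retrieval Systems: A Development Study of a Document Database System (資料庫與資訊檢索系統的整合 - 一個文件資料庫系統的開發研究). By Chung-Hong Lee (李俊宏), Assistant Professor, Dept. of Information Management, Chang Jung Christian University.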

Presentation transcript:


AGENDA
• Introduction
• Comparison of the DBMS and IR approaches for document retrieval
• Proposed signature-based IR technique
• System architecture
• Integration method
• Conclusions

A Document Retrieval Model
[Diagram] Query side: User query → Query formulation → Signature of query → Matching of similarity → Results of retrieval → Relevance feedback.
Document side: Document collections → Indexing → Signature of documents → Matching of similarity.

Convergence of Information Retrieval and Database
• Information Retrieval: proprietary; application dependent; rich search capability
• Database: standardization; application independent; limited search capability
• Convergence of the two: standardization; application independent; powerful modeling capability; powerful search capability; rich application development tools

Related work
• A text-search extension to the ORION OODBMS, developed by Lee (1991).
• The integration of the INQUERY text retrieval system and the IRIS OODBMS, proposed by Croft (1992).
• Mapping SGML document structures into OODBMS data models:
– Christophides (1994)
– Macleod (1995)
– Volz (1996), etc.
Unlike those of the above efforts that aim only to model SGML documents in a DBMS, our system is aimed at handling heterogeneous types of documents, such as textual and multimedia documents, and at providing content-based retrieval functions for the stored document objects.

Why OODBMS?
The core features of an OODBMS supported by most such systems are:
1. Complex objects
2. Object identity
3. Encapsulation
4. Types and classes
5. Class or type inheritance
6. Overriding, overloading and late binding
7. Computational completeness

Signature file approach

Concept of the scalable signature file approach
• Document signatures are generated from the Chinese characters that the documents contain.
• Each document signature is divided into two segments: the first segment represents the occurrence of commonly used Chinese characters, while the second segment represents the occurrence of the remaining Chinese characters and of English character bigrams.
• The signature size can be adjusted according to the average document length.
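The two-segment signature described above can be sketched in Python. This is only an illustration under stated assumptions, not the author's implementation: the ten-character `COMMON_CHARS` table, the MD5-based `stable_hash`, the sizing rule in `second_segment_size`, and all function names are hypothetical. The slides do not say how the remaining characters and bigrams are mapped to bit positions, so hashing is assumed here.

```python
import hashlib

# Hypothetical stand-in for a table of commonly used Chinese characters.
COMMON_CHARS = "的一是不了人我在有他"
COMMON_INDEX = {ch: i for i, ch in enumerate(COMMON_CHARS)}


def is_cjk(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def stable_hash(token, size):
    """Deterministic hash of a token into the range [0, size)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % size


def second_segment_size(avg_doc_length, bits_per_char=2, minimum=64):
    """Assumed scaling rule: grow the hashed segment with average length."""
    return max(minimum, avg_doc_length * bits_per_char)


def make_signature(text, seg2_size):
    """Return (segment1, segment2), each an int used as a bit vector."""
    seg1 = 0  # one dedicated bit per commonly used Chinese character
    seg2 = 0  # hashed bits: other Chinese characters, English bigrams
    text = text.lower()
    for i, ch in enumerate(text):
        if ch in COMMON_INDEX:
            seg1 |= 1 << COMMON_INDEX[ch]
        elif is_cjk(ch):
            seg2 |= 1 << stable_hash(ch, seg2_size)
        elif ch.isascii() and ch.isalpha():
            nxt = text[i + 1 : i + 2]
            if nxt.isascii() and nxt.isalpha():
                seg2 |= 1 << stable_hash(ch + nxt, seg2_size)
    return seg1, seg2


def may_match(query_sig, doc_sig):
    """True if every bit set in the query is also set in the document."""
    q1, q2 = query_sig
    d1, d2 = doc_sig
    return (q1 & d1) == q1 and (q2 & d2) == q2
```

A document can only contain the query text if `may_match` returns `True`. The converse does not hold, because different tokens may hash to the same bit, so a positive result still needs to be checked against the document itself.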

System Architecture (1)
[Diagram] Components shown: document input, GUI, interface for full-text search, OODBMS, retrieved document objects.

System Architecture (2)
[Diagram: Search Engine & OODB] Labels shown: IR queries (word, term, phrase), OQL queries, search engine, signature file, OODBMS, retrieved document objects.
Key features:
• Two-stage search
• Both IR and OQL queries are available
• Signature file as a preprocessor for IR queries
• Documents are stored as BLOB objects in the OODBMS
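The two-stage search named in the key features can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the `DocumentStore` class is an in-memory stand-in for the OODBMS, the 64-bit character signature is a simplification of the scheme in the slides, and the second stage is taken to be a plain substring check on the stored text.

```python
import hashlib

SIG_BITS = 64  # assumed signature width, kept small for the example


def signature(text):
    """Superimpose one hashed bit per non-space character."""
    sig = 0
    for ch in text.lower():
        if not ch.isspace():
            digest = hashlib.md5(ch.encode("utf-8")).digest()
            sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


class DocumentStore:
    """In-memory stand-in for the OODBMS document repository."""

    def __init__(self):
        self.objects = {}     # object id -> document text (BLOB stand-in)
        self.signatures = {}  # object id -> signature (the signature file)

    def add(self, oid, text):
        self.objects[oid] = text
        self.signatures[oid] = signature(text)

    def candidates(self, term):
        """Stage 1: cheap bitwise filter over the signature file."""
        q = signature(term)
        return [oid for oid, s in self.signatures.items() if (q & s) == q]

    def search(self, term):
        """Stage 2: verify each candidate against the stored document."""
        needle = term.lower()
        return [oid for oid in self.candidates(term)
                if needle in self.objects[oid].lower()]


store = DocumentStore()
store.add("d1", "Signature file approach")
store.add("d2", "Object database systems")
store.add("d3", "文件資料庫系統")
hits = store.search("signature")
```

The first stage may let through documents that do not actually contain the term (false drops), but it never rejects one that does, which is why the second stage only has to examine the candidates.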

Signature file as a pre-processor of the database queries

Query text processing
How the system formulates the query: the system incrementally transforms quasi-natural-language queries into complex structured queries in the query language.
Goal: free-format queries
Related techniques:
• Key term extraction from the queries
• IR-queries-to-OQL conversion
• Query optimization
• User interface
• NLP
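As a rough illustration of key term extraction followed by IR-query-to-OQL conversion, the Python sketch below turns a free-format query into an OQL-style string. The stop-word list, the `Documents` extent, the `title` attribute, and both function names are hypothetical; the slides do not describe the actual schema or conversion rules, and the incremental refinement and query optimization steps are not modeled.

```python
import re

# Hypothetical stop-word list for quasi-natural-language queries.
STOP_WORDS = {"find", "show", "me", "all", "the", "a", "an", "documents",
              "about", "on", "of", "in", "with", "and", "or"}


def extract_key_terms(query):
    """Keep alphanumeric words and runs of Chinese characters, minus stop words."""
    tokens = re.findall(r"[A-Za-z0-9]+|[\u4e00-\u9fff]+", query.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def to_oql(terms, extent="Documents", attribute="title"):
    """Build an OQL-style select with one wildcard condition per key term."""
    if not terms:
        return f"select d from {extent} d"
    conditions = " and ".join(f'd.{attribute} like "*{t}*"' for t in terms)
    return f"select d from {extent} d where {conditions}"


terms = extract_key_terms("Find all documents about signature files")
oql = to_oql(terms)
```

In this sketch the tokens contain only letters, digits, and Chinese characters, so they can be embedded in the quoted pattern without escaping; a real converter would need to handle arbitrary input.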

Conclusions
The distinctive features of the system developed:
• IR-OODBMS integration
– OODBMS-based document repository
– a loose coupling approach
– signature file filter as a preprocessor for query processing
– two-stage search
– a novel query model
– easy to maintain, including the signature file and database schema
• Signature generation
– a character-based signature method designed for Chinese and English documents
• Applicable to a digital library infrastructure