DB2 Net Search Extender Presenter: Sudeshna Banerji (CIS 595: Bioinformatics)


Topics to discuss:
- Information retrieval
- Text indexing
- DB2 Text Extenders
- DB2 Net Search Extender
- References
- Questions

A Little Background: Information Retrieval
- Information retrieval (IR): extraction of "relevant" information from huge volumes of data scattered across different databases.
- Examples: textual search, image search, video search.
- The efficiency (time and speed) of IR depends on the indexing technology used; indexing improves system performance.
- One example of an indexing technology is text indexing, which is used for textual search.

A Little Background: Text Indexing
- Text indexing is the process of deciding what will be used to represent a given document.
- A text index consists of significant terms extracted from the text documents, each term stored together with information about the document that contains it.
- A search is then handled as a query that looks up the index.
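A text index of this kind is commonly built as an inverted index. The following is a toy Python sketch of that idea only; it does not describe how Net Search Extender stores its index internally. The names `build_index` and `lookup` and the document IDs are assumptions made for the illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            term = token.strip(".,;:!?\"'")
            if term:  # skip tokens that were pure punctuation
                index[term].add(doc_id)
    return index

def lookup(index, term):
    """Handle a one-word search as a lookup in the index."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical document IDs; the sentences are sample rows from this talk.
docs = {
    1: "A man was running down the street.",
    2: "The cat hunts some mice.",
}
index = build_index(docs)
print(lookup(index, "cat"))  # -> [2]
```

The search never scans the documents themselves; it only consults the term-to-document mapping, which is where the speed-up comes from.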

A Little Background: Text Indexing (continued)
Text indexing involves the following:
- Parsing the documents to recognize their structure, e.g. title, date, and other fields.
- Scanning for word tokens, which means handling numbers, special characters, hyphenation, capitalization, etc.
- Stopword removal, based on a short list of common words such as "the", "and", "or".
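The token-scanning and stopword-removal steps can be sketched in a few lines of Python. This is a minimal illustration under simple assumptions: the regular expression, the lowercasing rule, and the function name `significant_terms` are choices made for the example, not the rules a particular product applies. Structure parsing (titles, dates, fields) is left out.

```python
import re

# The short stopword list quoted above; real lists are longer.
STOPWORDS = {"the", "and", "or"}

def significant_terms(text):
    """Tokenize text and drop stopwords.

    Capitalization is folded to lowercase, and hyphenated words
    such as "full-text" are kept as a single token.
    """
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

print(significant_terms("The cat and the full-text index"))
# -> ['cat', 'full-text', 'index']
```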

Indexing only Significant Terms

DB2 Extenders
- IBM products that provide support for data beyond the traditional character and numeric data types.
- Extenders are available for images, voice, video, complex documents (full-text search), spatial objects, etc.
- Trial and beta versions are available for testing.
- Link for extenders:

DB2 Text Extenders
- To meet the increasing demands of content management, IBM has introduced three full-text retrieval applications for DB2 Universal Database (DB2 UDB):
  DB2 Net Search Extender
  DB2 Text Information Extender
  DB2 Text Extender
- When to use what? Link for comparisons of the above:

DB2 Net Search Extender
Replaces DB2 Text Information Extender Version 7.2.
Some important features:
- Indexing speed of about 1 GB per hour.
- Different text formats: ASCII plain text, HTML, XML, GPP.
- Base support for 37 languages, including English, Spanish, French, Japanese and Chinese.
- Sub-second search response times.
- No decrease in search performance with up to 1000 concurrent queries per second.

DB2 Net Search Extender: text-search capabilities
- Searches can be performed using SQL, a fourth-generation language whose queries read almost like English.
- Searches can include:
  Boolean operations.
  Proximity search for words in the same sentence or paragraph (for HTML, XML and GPP).
  "Fuzzy" searches for words spelled similarly to the search term, e.g. "Andrew" and "Andru".
  Thesaurus-related search.
  Restricting the search to sections within documents.
- The user can limit the search results with a "hit count" and can also specify how the results are to be sorted.
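To give a feel for what a "fuzzy" search does, here is a small Python sketch that matches a misspelled term against a vocabulary by string similarity. The slides do not say which similarity measure Net Search Extender uses, so the standard library's `difflib.SequenceMatcher` is swapped in purely for illustration; the function name `fuzzy_matches`, the 0.7 threshold, and the vocabulary are assumptions.

```python
from difflib import SequenceMatcher

def fuzzy_matches(term, vocabulary, threshold=0.7):
    """Return vocabulary words whose spelling is similar to `term`.

    Similarity is difflib's ratio (0.0 to 1.0); the threshold is an
    arbitrary choice for this example.
    """
    term = term.lower()
    return [
        word for word in vocabulary
        if SequenceMatcher(None, term, word.lower()).ratio() >= threshold
    ]

vocabulary = ["Andrew", "Mike", "John"]  # hypothetical indexed terms
print(fuzzy_matches("Andru", vocabulary))  # -> ['Andrew']
```

Raising the threshold makes the match stricter; lowering it returns more, and less similar, candidates.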

DB2 Net Search Extender: system requirements and installation
System requirements:
- DB2 Version 8.1
- Java Runtime Environment (JRE) Version
Windows installation:
- Administrative rights are required.
- Call db2text start to start the DB2 Net Search Extender Instance Services.

DB2 Net Search Extender: a simple example with SQL queries
The following steps are required to do a basic textual search in DB2 Net Search Extender:
1. Creating a database
2. Enabling a database for text search
3. Creating a table
4. Creating a full-text index
5. Loading sample data
6. Synchronizing the text index
7. Searching with the text index

DB2 Net Search Extender
1. Creating a database:
   db2 "create database sample"
2. Enabling a database for text search:
   To start the Net Search Extender service:
   db2text "START"
   To prepare the database for use with DB2 Net Search Extender:
   db2text "ENABLE DATABASE FOR TEXT CONNECT TO sample"

DB2 Net Search Extender
3. Creating a table:
   db2 "CREATE TABLE books (isbn VARCHAR(18) not null PRIMARY KEY, author VARCHAR(30), story LONG VARCHAR, year INTEGER)"
4. Creating a full-text index:
   db2text "CREATE INDEX db2ext.myTextIndex FOR TEXT ON books (story) CONNECT TO sample"

DB2 Net Search Extender
5. Loading sample data:
   db2 "INSERT INTO books VALUES (' ', 'John', 'A man was running down the street.', 2001)"
   db2 "INSERT INTO books VALUES (' ', 'Mike', 'The cat hunts some mice.', 2000)"
6. Synchronizing the text index:
   db2text "UPDATE INDEX db2ext.myTextIndex FOR TEXT CONNECT TO sample"

DB2 Net Search Extender
7. Searching with the text index, using the CONTAINS scalar search function:
   db2 "SELECT author, story FROM books WHERE CONTAINS (story, '\"cat\"') = 1 AND year >= 2000"
   The following result table is returned:
   AUTHOR  STORY
   Mike    The cat hunts some mice.
NOTE:
- To create a text index, the text column must be one of the following data types: CHAR, VARCHAR, LONG VARCHAR, CLOB.
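The order of steps 5 to 7 matters: rows are loaded first, the text index is synchronized next, and only then is the search run. The toy Python model below illustrates why, under the assumption that the index reflects the table only after an explicit update. It is not DB2 code: the `ToyTextIndex` class, its methods, and the keys `isbn-1` and `isbn-2` are hypothetical stand-ins invented for the example.

```python
class ToyTextIndex:
    """Toy stand-in for a full-text index that is refreshed on demand."""

    def __init__(self, table, column):
        self.table = table
        self.column = column
        self.postings = {}  # term -> set of primary keys

    def update(self):
        """Rebuild the index from the current table contents (step 6)."""
        self.postings = {}
        for key, row in self.table.items():
            for word in row[self.column].lower().split():
                self.postings.setdefault(word.strip(".,"), set()).add(key)

    def contains(self, key, term):
        """Return 1 if the indexed text of row `key` has `term`, else 0."""
        return 1 if key in self.postings.get(term.lower(), set()) else 0

books = {}
index = ToyTextIndex(books, "story")

# Step 5: load sample data (placeholder keys, not real ISBNs).
books["isbn-1"] = {"author": "John", "story": "A man was running down the street.", "year": 2001}
books["isbn-2"] = {"author": "Mike", "story": "The cat hunts some mice.", "year": 2000}

def search(term, min_year):
    """Step 7: combine the text predicate with an ordinary column filter."""
    return [
        (row["author"], row["story"])
        for key, row in books.items()
        if index.contains(key, term) == 1 and row["year"] >= min_year
    ]

before_sync = search("cat", 2000)  # index not yet synchronized
index.update()                     # step 6
after_sync = search("cat", 2000)

print(before_sync)  # -> []
print(after_sync)   # -> [('Mike', 'The cat hunts some mice.')]
```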

DB2 Net Search Extender: thesaurus support
- A thesaurus is structured like a network of nodes linked together by relations:
  Associative relations: RELATED_TO
  Synonym relations: SYNONYM_OF
  Hierarchical relations: LOWER_THAN, HIGHER_THAN
- Creating and compiling a thesaurus:
  1. Create a thesaurus definition file (explained below).
  2. Compile the definition file into a thesaurus dictionary using the DB2EXTTH utility.

DB2 Net Search Extender: creating a thesaurus definition file
- Define the thesaurus content in a definition file using a text editor.
- Example of some definition groups:
  :WORDS
    football
    .RELATED_TO goal
    .SYNONYM_OF soccer
  :WORDS
    chapel
    .LOWER_THAN skyscraper
    .HIGHER_THAN house

DB2 Net Search Extender: an example of a thesaurus structure
[Figure: a network with the nodes Game, Ball Game, Tennis, Soccer and Football, linked by HIGHER_THAN and SYNONYM_OF relations.]
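A thesaurus-related search expands a search term along one of these relations before looking it up. The Python sketch below models the football definition group as a small relation table and expands a term with it. It is an illustration only: the helper names `add_relation` and `expand` are invented, and treating SYNONYM_OF and RELATED_TO as symmetric is an assumption of this sketch, not a documented property of the compiled thesaurus dictionary.

```python
from collections import defaultdict

# (term, relation) -> set of terms reachable through that relation
relations = defaultdict(set)

def add_relation(term, relation, other):
    """Record one relation; synonym and associative links are stored
    in both directions (an assumption made for this sketch)."""
    relations[(term, relation)].add(other)
    if relation in ("SYNONYM_OF", "RELATED_TO"):
        relations[(other, relation)].add(term)

def expand(term, relation):
    """Return the term plus everything linked to it by `relation`."""
    return sorted({term} | relations.get((term, relation), set()))

# The first definition group from the example definition file.
add_relation("football", "RELATED_TO", "goal")
add_relation("football", "SYNONYM_OF", "soccer")

print(expand("soccer", "SYNONYM_OF"))    # -> ['football', 'soccer']
print(expand("football", "RELATED_TO"))  # -> ['football', 'goal']
```

A search for "soccer" with synonym expansion would then look up both "soccer" and "football" in the text index.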

DB2 Net Search Extender: references
- document.d2w/report?fn=desu9m03.htm#ToC
- Information retrieval site containing good lecture slides:
- Net Search Extender Administration and User's Guide, Version 8.1 (can be downloaded with the software)

Any questions?