Keyword Searching and Browsing in Databases using BANKS

Keyword Searching and Browsing in Databases using BANKS  Gaurav Bhalotia  Arvind Hulgeri  Charuta Nakhe  Soumen Chakrabarti  S. Sudarshan 18th International Conference on Data Engineering (ICDE'02), 2002 Kushal Bansal Today, we are going to see the paper “Keyword Searching and Browsing in Databases using BANKS” This paper was presented by Gaurav, Arvind, Charuta, Soumen and Sudarshan in the 18th ICDE in the year 2002.

Outline
  Introduction
  Database and Query Model
  Searching for the Best Answers
  Interface and Templates of the BANKS System
  Experiment and Performance
  Conclusion

The talk covers the following topics: an introduction that motivates the need for the BANKS system, the database and query models, how to search for the best answers, the features of BANKS, a small experiment, and the conclusion.

Introduction
  Web search engines accept unstructured queries
    Users type in keywords and follow hyperlinks
  Relational databases are queried with structured query languages such as SQL
    Users need to know the schema of the database
    Difficult for naïve users
  Plain keyword-based techniques are of limited use for data stored in databases
    Data is often split across tables due to normalization

What do we mean by an unstructured query? Search engines like Google impose no particular format on the search terms; this style of querying is called unstructured querying. After we enter the search terms and hit enter, we get a list of links ordered based on the PageRank algorithm. In relational databases, by contrast, the information needed to answer a keyword query is often split across tables and tuples.

Introduction
  BANKS (Browsing And Keyword Searching)
    A system that provides a search-engine-style interface for searching and browsing relational databases
    Allows interaction with controls on the displayed results
    No query language or programming required

BANKS stands for Browsing And Keyword Searching. A user connects to the BANKS system over HTTP, and the system in turn accesses the database through JDBC.

Outline
  Introduction
  Database and Query Model
    Informal Model
    Formal Model
    Query and Answer Model
  Searching for the Best Answers
  Interface and Templates of the BANKS System
  Experiment and Performance
  Conclusion

Database and Query Model
Informal Model
  1. Each database is modeled as a directed graph.
  2. Each tuple in the database is modeled as a node in the graph.
  3. Every foreign-key to primary-key link is modeled as a directed edge.

In the informal model, the entire database is represented as a directed graph: each tuple becomes a node, and every foreign-key to primary-key link becomes a directed edge from the referencing tuple to the referenced tuple.

Database and Query Model
Informal Model
  4. An answer to a query is a subgraph connecting nodes matching the keywords.
  5. The importance of a link depends on the relations it connects and on its semantics.
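
The informal model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy relations (`Author`, `Paper`, `Writes`), their rows, and the helper `build_graph` are hypothetical and are not taken from the BANKS implementation.

```python
from collections import defaultdict

# Hypothetical toy data: a tuple is identified by (relation, primary key).
tuples = {
    ("Paper", "p1"): {"title": "Example Paper"},
    ("Author", "a1"): {"name": "Soumen"},
    ("Author", "a2"): {"name": "Sunita"},
    ("Writes", "w1"): {"author": "a1", "paper": "p1"},
    ("Writes", "w2"): {"author": "a2", "paper": "p1"},
}

# Hypothetical schema: (referencing relation, column) -> referenced relation.
foreign_keys = {
    ("Writes", "author"): "Author",
    ("Writes", "paper"): "Paper",
}

def build_graph(tuples, foreign_keys):
    """Model each tuple as a node and each foreign-key reference as a directed edge."""
    edges = defaultdict(list)
    for (relation, key), row in tuples.items():
        for column, value in row.items():
            target_relation = foreign_keys.get((relation, column))
            if target_relation is not None:
                # Edge from the referencing tuple to the referenced tuple.
                edges[(relation, key)].append((target_relation, value))
    return dict(edges)

graph = build_graph(tuples, foreign_keys)
```

In this toy graph, an answer to a query such as "soumen sunita" would be a subgraph that connects the two `Author` nodes through the `Writes` tuples and the shared `Paper` node.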

Database and Query Model The Schema

Database and Query Model Fragment of the Database

Database and Query Model
Formal Database Model: Node Weight
  Each node u in the graph is assigned a weight N(u)
  The node weight is also known as the node prestige
  N(u) = indegree of node u
  Node score N = root node weight + sum of leaf node weights
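
A minimal sketch of the node weights described above, assuming edges are given as (source, target) pairs. The function names and the sample edges are hypothetical, and the node score follows the formula as stated on this slide.

```python
from collections import Counter

def node_prestige(edges):
    """N(u) = indegree of u, counted over (source, target) edge pairs."""
    return Counter(target for _source, target in edges)

def tree_node_score(prestige, root, leaves):
    """Root node weight plus the sum of the leaf node weights."""
    return prestige[root] + sum(prestige[leaf] for leaf in leaves)

# Hypothetical edges: two Writes tuples reference one paper and two authors.
sample_edges = [("w1", "a1"), ("w1", "p1"), ("w2", "a2"), ("w2", "p1")]
prestige = node_prestige(sample_edges)
score = tree_node_score(prestige, root="p1", leaves=["a1", "a2"])
```

A `Counter` returns 0 for nodes that never appear as a target, so nodes without incoming edges need no special handling.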

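Database and Query Model
Formal Database Model: Edge Weights
  The weight w(u,v) of the directed edge (u,v) depends on which links exist between u and v:

  $$
  w(u,v) =
  \begin{cases}
  s(R(u), R(v)) & \text{if } (u,v) \text{ exists but } (v,u) \text{ does not} \\
  IN(u) \cdot s(R(v), R(u)) & \text{if } (v,u) \text{ exists but } (u,v) \text{ does not} \\
  \min\{\, s(R(u), R(v)),\ IN(u) \cdot s(R(v), R(u)) \,\} & \text{if both exist}
  \end{cases}
  $$

  Here R(u) is the relation that tuple u belongs to, s is the similarity between two relations, and IN(u) is the indegree of u.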
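
The three cases above can be written as one small function. This is a sketch under assumptions: `fk_links`, `relation_of`, `similarity` and `indegree` are hypothetical lookup structures chosen for illustration, and the sample values are made up.

```python
def edge_weight(u, v, fk_links, relation_of, similarity, indegree):
    """Weight of directed edge (u, v) following the three cases on the slide."""
    candidates = []
    if (u, v) in fk_links:
        # Link (u, v) exists: s(R(u), R(v)).
        candidates.append(similarity[(relation_of[u], relation_of[v])])
    if (v, u) in fk_links:
        # Link (v, u) exists: IN(u) * s(R(v), R(u)).
        candidates.append(indegree[u] * similarity[(relation_of[v], relation_of[u])])
    if not candidates:
        raise ValueError("no link between u and v in either direction")
    # If both links exist, the smaller of the two values is used.
    return min(candidates)

# Hypothetical sample: two Writes tuples reference the same author.
fk_links = {("w1", "a1"), ("w2", "a1")}
relation_of = {"w1": "Writes", "w2": "Writes", "a1": "Author"}
similarity = {("Writes", "Author"): 1.0}
indegree = {"a1": 2, "w1": 0, "w2": 0}

forward = edge_weight("w1", "a1", fk_links, relation_of, similarity, indegree)
backward = edge_weight("a1", "w1", fk_links, relation_of, similarity, indegree)
```

The backward edge from a heavily referenced tuple gets a larger weight, so paths through such hub nodes cost more.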

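Database and Query Model
Formal Database Model: Edge Weights
  Score of a single edge e, normalized by the minimum edge weight:

  $$ \mathrm{Escore}(e) = \frac{w(e)}{w_{\min}} $$

  Overall edge score of an answer tree:

  $$ \mathrm{Escore}_{\text{overall}} = \frac{1}{1 + \sum_{e} \mathrm{Escore}(e)} $$

  The overall edge score is in the range [0,1]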
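
As a quick worked example of the two formulas above, here is a sketch in Python. The function name `overall_edge_score` and the sample weights are hypothetical.

```python
def overall_edge_score(edge_weights, w_min):
    """1 / (1 + sum of w(e) / w_min) over the edges of an answer tree."""
    normalized = [w / w_min for w in edge_weights]
    return 1.0 / (1.0 + sum(normalized))

# A tree with edges of weight 1.0 and 2.0, where the minimum edge weight is 1.0.
two_edge_tree = overall_edge_score([1.0, 2.0], w_min=1.0)

# A single-node tree has no edges, so its score is the maximum value.
single_node_tree = overall_edge_score([], w_min=1.0)
```

The score shrinks as a tree gains edges or heavier edges, which favors compact answers.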

Database and Query Model
Formal Database Model: Overall Relevance Score
  The overall relevance score combines the node score N and the edge score E using a weighting factor λ

  Additive:
  $$ (1 - \lambda)\, E + \lambda\, N $$

  Multiplicative:
  $$ E \cdot N^{\lambda} $$
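
The two ways of combining the scores can be compared with a short sketch. The function names and input values are hypothetical; `lam` stands for the weighting factor λ, and both scores are assumed to be already normalized to [0, 1].

```python
def additive_score(edge_score, node_score, lam):
    """Additive combination: (1 - lam) * E + lam * N."""
    return (1.0 - lam) * edge_score + lam * node_score

def multiplicative_score(edge_score, node_score, lam):
    """Multiplicative combination: E * N ** lam."""
    return edge_score * node_score ** lam

# Hypothetical scores for one answer tree.
additive = additive_score(edge_score=0.25, node_score=0.5, lam=0.2)
multiplicative = multiplicative_score(edge_score=0.25, node_score=1.0, lam=0.2)
```

With a small λ, the edge score dominates in both variants, so tree compactness matters more than node prestige.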

Database and Query Model
Query and Answer Model
  Query
    A query consists of search terms t1, t2, ..., tn
    For each term ti we find the set of nodes Si that are relevant to ti
    S = {S1, S2, ..., Sn}
  Answer Model
    An answer to a query is a rooted directed tree connecting the keyword nodes
    The relevance score of an answer tree combines the relevance scores of its nodes and its edge weights
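
The query side of the model can be sketched as follows. Treating a node as relevant to a term when the term appears as a whole word in the node's text (ignoring case) is an assumption made for this illustration; the node ids and texts are hypothetical.

```python
def keyword_node_sets(node_text, terms):
    """Return [S1, ..., Sn]: for each term, the set of node ids whose text contains it."""
    sets = []
    for term in terms:
        needle = term.lower()
        matches = {
            node_id
            for node_id, text in node_text.items()
            if needle in text.lower().split()
        }
        sets.append(matches)
    return sets

# Hypothetical node texts, one entry per tuple.
node_text = {
    "a1": "Soumen Chakrabarti",
    "a2": "Sunita Sarawagi",
    "p1": "Example Paper",
}
S = keyword_node_sets(node_text, ["soumen", "sunita"])
```

Each answer tree must then contain at least one node from every set in `S`.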

Database and Query Model Result of query “soumen and sunita”

Outline
  Introduction
  Database and Query Model
  Searching for the Best Answers
    Backward expanding search algorithm
  Interface and Templates of the BANKS System
  Experiment and Performance
  Conclusion

Searching for the Best Answer
Backward expanding search algorithm
  Assumes that the graph of the database fits in memory
  Starts at the leaf nodes, each of which contains a query keyword
  Runs a single-source shortest-path computation from each such node, all concurrently
  Traverses the graph edges in the reverse direction
  A vertex reached by the backward paths from every keyword identifies the root of an answer tree
  The tree formed is a connection tree, and its root is the information node
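
The idea behind the search can be sketched in Python. This is a simplification, not the algorithm as implemented in BANKS: it assumes one node per keyword, runs each shortest-path computation to completion instead of interleaving them, and ranks candidate roots only by total path weight. The names `reverse_dijkstra` and `answer_roots` and the sample graph are hypothetical.

```python
import heapq

def reverse_dijkstra(edges, source):
    """Shortest distance from every node to `source`, walking edges backward.

    `edges` maps (u, v) -> weight for a directed edge u -> v.
    """
    incoming = {}
    for (u, v), weight in edges.items():
        incoming.setdefault(v, []).append((u, weight))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for pred, weight in incoming.get(node, []):
            candidate = d + weight
            if candidate < dist.get(pred, float("inf")):
                dist[pred] = candidate
                heapq.heappush(heap, (candidate, pred))
    return dist

def answer_roots(edges, keyword_nodes):
    """(total path weight, root) for every node that reaches all keyword nodes."""
    runs = [reverse_dijkstra(edges, k) for k in keyword_nodes]
    common = set(runs[0]).intersection(*runs[1:])
    return sorted((sum(run[n] for run in runs), n) for n in common)

# Hypothetical graph: forward edges from Writes tuples, backward edges from the paper.
edges = {
    ("w1", "a1"): 1.0, ("w1", "p1"): 1.0,
    ("w2", "a2"): 1.0, ("w2", "p1"): 1.0,
    ("p1", "w1"): 0.5, ("p1", "w2"): 0.5,
}
roots = answer_roots(edges, ["a1", "a2"])
```

Here the paper node `p1` comes out as the cheapest root, since it reaches both author nodes with the smallest total path weight.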

Interface
  The BANKS system provides
    A rich interface to browse data stored in a relational database
    Schema browsing and data browsing
    Hyperlinks to the referenced tuple
    Columns can be projected away (dropped)
    Selections can be imposed on any column
    Tuples can be sorted by a specified column

Templates
  The BANKS system provides several predefined templates
    Cross-tabs
    Group by
    Folder views
    Graphical interface for display as a bar, line or pie chart

Experiment and Performance
  For each parameter setting, computed the absolute difference between the rank of each ideal answer and the rank at which that answer was actually returned
  The sum of the rank differences gives the raw error score
  Setting λ = 0.2 with log scaling of edge weights did best, with an error score of 0.0
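
The raw error score described above amounts to a sum of absolute rank differences. A minimal sketch follows; the function name and the rank values are hypothetical and are not results from the paper.

```python
def raw_error_score(ideal_ranks, observed_ranks):
    """Sum of |observed rank - ideal rank| over all ideal answers."""
    return sum(
        abs(observed_ranks[answer] - ideal_ranks[answer])
        for answer in ideal_ranks
    )

# Hypothetical ranks for two ideal answers under two parameter settings.
ideal = {"answer_a": 1, "answer_b": 2}
setting_one = {"answer_a": 1, "answer_b": 2}  # matches the ideal ranking
setting_two = {"answer_a": 1, "answer_b": 4}  # second answer ranked too low

perfect = raw_error_score(ideal, setting_one)
worse = raw_error_score(ideal, setting_two)
```

A score of 0 means every ideal answer appeared exactly at its ideal rank for that parameter setting.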

Error scores vs. parameter choices

Conclusion
  The BANKS system
    Provides an integrated browsing and keyword querying system for relational databases
    Allows users with no knowledge of database systems or of the schema to query and browse relational databases with ease
    Reduces the effort involved in publishing relational data on the web and making it searchable