Automated Creation of a Forms-based Database Query Interface. Magesh Jayapandian, H.V. Jagadish, Univ. of Michigan. VLDB 2008.

Presentation transcript:

Automated Creation of a Forms-based Database Query Interface. Magesh Jayapandian, H.V. Jagadish, Univ. of Michigan. VLDB

Outline: Motivation, Database Analysis, Queriability, Generate Query-Forms, Experiments, Future Work (My Ideas)

Forms-based Database Query 1. What is a forms-based database query? 2. Why do we need forms-based database queries?

Why do we need forms-based database queries? Consider a user faced with a query such as:
for $a in doc()//author, $s in doc()//store
let $b in $s/book
where = “Amazon” and $b/author = $a/id
return { $a/name, count($b) }
The user's questions: $a ?? What is let? Do I need a semi-colon? How do I start writing a query?

What is a Forms-based Database Query?

Automated Forms-based Database Query: why do we need to generate query forms automatically? A database has too many tables, and each table has too many attributes, so it is difficult for a human to design simple forms that cover most of the queries users require.

Survey of real-world websites (page 2 of the paper): even complex query forms cannot handle some of the queries their users pose.

Schema Analysis: the schema is treated as a graph. E is a set of entities; A is a set of attributes, each belonging to a single entity; L is a set of links between nodes (entities or attributes) in the graph.
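A minimal sketch of how the three sets E, A and L could be held in memory. The class name `SchemaGraph`, its methods, the `entity.attribute` naming scheme and the author/book example are all assumptions made for illustration; the slides do not prescribe a representation.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaGraph:
    """Hypothetical container for the schema graph (E, A, L)."""
    entities: set = field(default_factory=set)      # E
    attributes: dict = field(default_factory=dict)  # A: attribute node -> its single owning entity
    links: set = field(default_factory=set)         # L: (node, node) pairs

    def add_entity(self, name):
        self.entities.add(name)

    def add_attribute(self, entity, name):
        if entity not in self.entities:
            raise ValueError(f"unknown entity: {entity}")
        node = f"{entity}.{name}"
        self.attributes[node] = entity   # each attribute belongs to exactly one entity
        self.links.add((entity, node))   # an attribute is linked to its entity
        return node

    def add_link(self, a, b):
        nodes = self.entities | set(self.attributes)
        if a not in nodes or b not in nodes:
            raise ValueError("both ends of a link must be existing nodes")
        self.links.add((a, b))

    def neighbors(self, node):
        # Links are followed in both directions here.
        out = {b for a, b in self.links if a == node}
        inc = {a for a, b in self.links if b == node}
        return out | inc


schema = SchemaGraph()
schema.add_entity("author")
schema.add_entity("book")
schema.add_attribute("author", "name")
schema.add_link("author", "book")
```

Storing each attribute's owner in a dictionary enforces the "single entity" constraint by construction.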

The Graph of the Schema (figure)

The Task: find the most queriable entities and attributes.

What is Queriability? The likelihood that a node appears in a query.

The Basic Idea: queriability is inspired by the approach several search engines take to rank web documents. A document is considered “important” if it is connected (linked) to other “important” documents. This is the idea behind PageRank.

Two Postulates (page 2 of the paper).
POSTULATE 1. The query relevance of an entity depends on how well connected it is to other parts of the schema (similar to PageRank).
POSTULATE 2. The query relevance of an entity depends on how many instances (records) of it occur in the database.

Absolute Cardinality: $C(n)$ = the number of instances in the database that contain node $n$.

Relative Cardinality: $RC(n_i \to n) = C(n_i \to n) / C(n)$, where $C(n_i \to n)$ = the number of instances in the database that contain both $n_i$ and $n$.
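The two cardinalities can be sketched as simple counts. This sketch assumes, purely for illustration, that each database instance is represented as the set of schema nodes it contains; the function names and the sample data are hypothetical.

```python
def absolute_cardinality(instances, n):
    """C(n): number of instances that contain node n."""
    return sum(1 for inst in instances if n in inst)


def relative_cardinality(instances, n_i, n):
    """RC(n_i -> n) = C(n_i -> n) / C(n), following the definition on the slide."""
    c_n = absolute_cardinality(instances, n)
    if c_n == 0:
        return 0.0  # assumption: undefined ratios are reported as 0
    both = sum(1 for inst in instances if n_i in inst and n in inst)
    return both / c_n


# Hypothetical data: every instance lists the nodes it contains.
instances = [
    {"book", "title", "author"},
    {"book", "title"},
    {"book", "title", "author", "price"},
    {"store", "name"},
]

c_book = absolute_cardinality(instances, "book")                   # 3 of the 4 instances
rc_author_book = relative_cardinality(instances, "author", "book")  # 2 / 3
rc_book_author = relative_cardinality(instances, "book", "author")  # 2 / 2
```

Note that the ratio is not symmetric: every instance with an author also has a book, but not every book instance has an author.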

Queriability of Entities (page 3 of the paper). The formula itself is not preserved in the transcript; the slides annotate it as follows. $p$ is a user-defined parameter between 0 and 1. The initial importance of $n$ is $I_n^0 = C(n)$. One term is measured by the weight of the link from node $n_i$ to node $n$. One term is the sum of all nodes' absolute cardinalities $C(n_i)$. $I_e^c = I_n^r$ when $I_n^r$ converges.
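Since the formula is missing from the transcript, the following is only a sketch of a PageRank-style iteration that is consistent with the annotations. The update rule is an assumption, not the paper's formula: importance starts at $C(n)$, each round keeps a fraction $p$ of a node's own importance and adds $(1 - p)$ times the link-weighted importance of its neighbors, and the converged value is divided by the sum of all absolute cardinalities. The function name, the weights and the cardinalities are hypothetical.

```python
def entity_queriability(cardinality, weights, p=0.5, tol=1e-9, max_rounds=1000):
    """Iterate an ASSUMED update rule until the importance values stop changing.

    cardinality: node -> C(n), used as the initial importance I^0
    weights:     (source, target) -> weight of the link source -> target
    """
    importance = dict(cardinality)  # I^0(n) = C(n)
    for _ in range(max_rounds):
        updated = {}
        for n in importance:
            inflow = sum(w * importance[src]
                         for (src, dst), w in weights.items() if dst == n)
            updated[n] = p * importance[n] + (1 - p) * inflow
        delta = max(abs(updated[n] - importance[n]) for n in importance)
        importance = updated
        if delta < tol:
            break
    total = sum(cardinality.values())  # sum of all absolute cardinalities
    return {n: importance[n] / total for n in importance}


# Hypothetical schema: the outgoing weights of each node sum to 1.
cardinality = {"author": 100, "book": 300, "store": 10}
weights = {
    ("author", "book"): 1.0,
    ("book", "author"): 0.5,
    ("book", "store"): 0.5,
    ("store", "book"): 1.0,
}
q = entity_queriability(cardinality, weights, p=0.5)
```

In this example the well-connected `book` entity ends up with the highest score, which is the behaviour Postulate 1 asks for.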

Queriability of Related Entities: should a query form cover only a single entity? That is not appropriate.

Queriability of Related Entities (page 4 of the paper).
POSTULATE 3. The queriability of a collection of related entities depends on the individual queriabilities of the entities in it.
POSTULATE 4. The queriability of a collection of related entities depends on the data cardinality of all pair-wise relationships between the entities in it.

Queriability of Related Entities (page 4 of the paper): $N(e_i \to e_j)$ is the number of instances of entity $e_i$ connected to some instance of entity $e_j$.
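The count $N(e_i \to e_j)$ can be illustrated on a small relationship table. The sketch assumes a relationship is stored as a list of (instance of $e_i$, instance of $e_j$) pairs; the helper name and the data are hypothetical, and the formula that combines these counts into a queriability score is not reproduced because it is not in the transcript.

```python
def connected_count(links, side):
    """Count distinct instances on one side of a relationship that appear in at least one link.

    side=0 counts instances of the first entity, side=1 those of the second.
    """
    return len({pair[side] for pair in links})


# Hypothetical author-book relationship: a1 wrote b1 and b2, a2 wrote b3.
wrote = [("a1", "b1"), ("a1", "b2"), ("a2", "b3")]

n_author_to_book = connected_count(wrote, 0)  # authors linked to some book
n_book_to_author = connected_count(wrote, 1)  # books linked to some author
```

The two directions generally differ, which is why the notation carries an arrow.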

Queriability of Related Entities (page 5 of the paper): considering relationships among more than two entities. The number of permutations of $m$ objects is $m!$.

Queriability of Attributes (page 5 of the paper).
POSTULATE 5. The queriability of an attribute depends on its necessity, i.e., how frequently it appears in the data relative to its parent entity.
Here $a$ is an attribute of entity $e$.
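Necessity, as described in Postulate 5, can be sketched as the fraction of an entity's instances that carry the attribute. The representation of instances as dictionaries, the treatment of `None` as "absent", and the sample records are assumptions for illustration only.

```python
def necessity(entity_instances, attribute):
    """Fraction of the parent entity's instances in which the attribute is present."""
    if not entity_instances:
        return 0.0
    present = sum(1 for inst in entity_instances
                  if inst.get(attribute) is not None)  # assumption: None means absent
    return present / len(entity_instances)


# Hypothetical instances of a "book" entity.
books = [
    {"title": "A", "price": 10},
    {"title": "B"},
    {"title": "C", "price": None},
    {"title": "D", "price": 7},
]

title_necessity = necessity(books, "title")  # present in all four instances
price_necessity = necessity(books, "price")  # present in two of four instances
```

An attribute that almost every instance has (such as `title` here) is a stronger candidate for a form field than a sparsely populated one.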

The Queriability of Operator-Specific Attributes (pages 6 and 7 of the paper). Operations: selection, projection, sorting, aggregation. An attribute's queriability differs from one operation to another.

Choosing Form Fields (page 7 of the paper). $k_a$: the number of most queriable attributes. $k_f$: the number of fields (of any type) per entity in a form. $k_e$: the number of entities in a form. $k_r$: the number of related entities in a form.
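A sketch of how such thresholds might be applied once queriability scores are available: keep the $k_e$ highest-scoring entities and, for each, the $k_f$ highest-scoring attributes. This is an illustrative top-k selection under assumed scores, not the paper's form-generation procedure; the function name and all numbers are hypothetical.

```python
def choose_form_fields(entity_q, attribute_q, k_e, k_f):
    """Pick the k_e most queriable entities and, per entity, its k_f most queriable attributes."""
    top_entities = sorted(entity_q, key=entity_q.get, reverse=True)[:k_e]
    form = {}
    for entity in top_entities:
        attrs = attribute_q.get(entity, {})
        form[entity] = sorted(attrs, key=attrs.get, reverse=True)[:k_f]
    return form


# Hypothetical queriability scores.
entity_q = {"book": 0.5, "author": 0.3, "store": 0.2}
attribute_q = {
    "book": {"title": 0.9, "price": 0.6, "isbn": 0.2},
    "author": {"name": 0.8, "id": 0.1},
    "store": {"name": 0.7},
}

form = choose_form_fields(entity_q, attribute_q, k_e=2, k_f=2)
```

Smaller thresholds give simpler forms that cover fewer queries, so the choice of $k_e$ and $k_f$ trades form complexity against coverage.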

Experimental Methodology: evaluate the usefulness of the generated query forms by measuring how many real users' queries they can satisfy.
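The usefulness measure can be sketched as a coverage ratio. The sketch assumes a query counts as satisfied when every field it needs appears on a single form; that criterion, the function name and the sample forms and queries are assumptions, since the transcript does not define "satisfied" precisely.

```python
def form_coverage(queries, forms):
    """Fraction of queries whose required fields all appear on at least one form."""
    if not queries:
        return 0.0
    satisfied = sum(1 for q in queries if any(q <= f for f in forms))
    return satisfied / len(queries)


# Hypothetical generated forms, each described by the set of fields it offers.
forms = [
    {"book.title", "book.price", "author.name"},
    {"store.name"},
]

# Hypothetical user queries, each described by the set of fields it needs.
queries = [
    {"book.title"},
    {"author.name", "book.price"},
    {"store.name", "book.title"},  # spans two forms, so no single form satisfies it
    {"store.name"},
]

coverage = form_coverage(queries, forms)
```

Here three of the four queries fit on a single form.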

Testing Datasets: MiMI, Geoquery, Jobsquery.

Form Usefulness Testing (six slides of result figures)

Effect of Postulates Testing (four slides of result figures)

Future Work (My Ideas): 1. How can we use the history log? 2. Can we use association mining for related entities? 3. Application in the BCiN project.

(1) How can we use history data? Besides PageRank, search logs also help search engines rank pages better. Can we likewise use the database query log to build better query forms? Could we build personalized query forms (for different roles in an application)?

The Graph of the Schema, with an example log entry: 09/12/09: SELECT * FROM profile WHERE profile.income > 1W

(2) Can we use association mining for related entities? The paper tries all possible combinations of entities. We could use the Apriori algorithm instead, which prunes any combination that contains an infrequent subset.
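A compact sketch of Apriori applied to this idea. It assumes the transactions are the sets of entities touched by each logged query, which combines ideas (1) and (2) and is not something the slides specify; the function name, the log and the support threshold are hypothetical.

```python
from itertools import combinations


def frequent_entity_sets(log, min_support):
    """Apriori: return {entity set: support} for sets appearing in >= min_support transactions."""
    transactions = [frozenset(t) for t in log]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-sets into candidate k-sets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(sub) in frequent
                          for sub in combinations(c, k - 1))
                   and support(c) >= min_support}
    return frequent


# Hypothetical query log: the entities each past query touched.
log = [
    {"author", "book"},
    {"author", "book", "store"},
    {"book", "store"},
    {"author", "book"},
    {"store"},
]
result = frequent_entity_sets(log, min_support=2)
```

With this log the three-entity set is never counted against the data: its subset {author, store} is infrequent, so the prune step discards it first.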

(3) Application in the BCiN project: generate different query forms for different devices (typing keywords on a mobile phone is hard) and for different periods (hurricane coming, hurricane leaving, disaster recovery).

End. Thank you! Any questions?