Query Formation From High-Level Concepts for Relational Databases Guogen Zhang Wesley Chu Frank Meng Gladys Kong Computer Science Department University.

Presentation transcript:

Query Formation From High-Level Concepts for Relational Databases Guogen Zhang Wesley Chu Frank Meng Gladys Kong Computer Science Department University of California Los Angeles, CA

Outline
– Overview
– Semantic Graph Model
– High-Level Query Formation for SPJ Queries
– Incremental Query Formation for Complex Queries
– Conclusions

Overview: Query Formation
– Based on a semantic graph model, including user-defined relationships
– The user specifies requests and constraints
– A simple query is formulated by a graph search technique
  – Candidates are ranked by information measure
  – An English-like query description is produced
– A complex query can be formulated as a series of simple queries

Related Work
– Query formulation as a Steiner tree problem (Wald and Sorenson, 1984)
  – Limited to partial 2-tree graphs
– Simple Select-Project-Join (SPJ) queries formulated via the Universal Relation Model, with no need to specify natural joins (Ullman, 1988; Vardi, 1988)
– Object-oriented query path expression completion: a partial order relationship between different paths is used for ranking (Ioannidis and Lashkari, 1994)
– Query-by-Icon (QBI) (Massari and Chrysanthis, 1995)
– Natural language interfaces (text/voice): logical form to query

Semantic Graph Model
Weighted graph G = (V, E):
– Nodes: entities (strong, weak, user-defined)
– Links: relationships (ISA, HAS, simple, complex, user-defined)
– For relational databases, nodes are relations and links are natural and user-defined joins
– Weight: information measure of a node or link

Semantic Graph Example

Query Feature
Query expression in a semantic graph:
– Query Topic, T: a set of joins represented by links
– Query Constraints, C: query conditions
– Query Aspect, A: attribute list

A query topic for “aircraft can land on airports at geographical locations of countries”
[Figure: subgraph with nodes airports, runways, airfield_chars, geoloc, country and links can land, have, is a, located]

Semi-Automatic Generation of Semantic Model Find natural joins through key and foreign key between nodes. User-defined links can be added into the graph model. Designers need to specify link types and assign names to all the elements in the graph.

Example of Semantic Model Generation
Relations:
– AIRPORT: APORT_NM, GEOLOC_TYPE, GLC_CD, ELEV_FT, …; key: APORT_NM
– RUNWAY: APORT_NM, RUNWAY_NM, GLC_CD, RUNWAY_LENGTH_FT, RUNWAY_WIDTH_FT, …; key: RUNWAY_NM
– GEOLOC: GLC_CD, GLC_NM, CY_CD, LATITUDE, LONGITUDE, …; key: GLC_CD
– COUNTRY: CY_CD, CY_NM, …; key: CY_CD
Links:
– AIRPORT--RUNWAY: APORT_NM
– AIRPORT--GEOLOC: GLC_CD
– RUNWAY--GEOLOC: GLC_CD
– GEOLOC--COUNTRY: CY_CD
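
The link discovery step above can be sketched in a few lines. This is only an illustration, not the authors' tool: it assumes a link exists wherever two relations share an attribute that is the key of at least one of them, and it uses only the attributes actually listed on the slide (the elided ones are left out). The names `SCHEMA` and `natural_join_links` are hypothetical.

```python
from itertools import combinations

# Attributes and keys as listed on the slide; elided attributes are omitted.
SCHEMA = {
    "AIRPORT": (["APORT_NM", "GEOLOC_TYPE", "GLC_CD", "ELEV_FT"], "APORT_NM"),
    "RUNWAY": (["APORT_NM", "RUNWAY_NM", "GLC_CD",
                "RUNWAY_LENGTH_FT", "RUNWAY_WIDTH_FT"], "RUNWAY_NM"),
    "GEOLOC": (["GLC_CD", "GLC_NM", "CY_CD", "LATITUDE", "LONGITUDE"], "GLC_CD"),
    "COUNTRY": (["CY_CD", "CY_NM"], "CY_CD"),
}


def natural_join_links(schema):
    """Return (relation, relation, attribute) for each key/foreign-key match.

    Assumed rule: a shared attribute forms a link only if it is the key
    of one of the two relations.
    """
    links = []
    for (r1, (attrs1, key1)), (r2, (attrs2, key2)) in combinations(schema.items(), 2):
        for attr in attrs1:
            if attr in attrs2 and attr in (key1, key2):
                links.append((r1, r2, attr))
    return links


links = natural_join_links(SCHEMA)
for left, right, attr in links:
    print(f"{left}--{right}: {attr}")
```

Under this rule AIRPORT and RUNWAY are linked through APORT_NM but not through GLC_CD, since GLC_CD is the key of neither. A designer would still have to name the links and add user-defined ones by hand.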

Information Measure
The information measure of a node or link $a$ is
$$I(a) = -\log P(a)$$
where $P(a)$ is the probability of $a$ being used in queries.
Assuming nodes and links are independent, the information measure of a subgraph with a set of elements $A = \{a_i \mid i = 1, \dots, n\}$ is additive:
$$I(A) = \sum_{i=1}^{n} I(a_i)$$

Information Measure (cont.)
– Initial information measure: all the nodes = 1; different nodes have a different value
– The information measure is normalized and converted into counts
– The probability of a node or a link is $P(a_i) = c_i / c$
– Update the information measure
– Ranking is based on the information measure and thus adapts to user feedback
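
A small numeric sketch may help make the count-based measure concrete. It rests on assumptions the slides do not state: that $c$ is the total of all counts, that the logarithm is base 2, and that feedback simply increments the count of each element used in an accepted query. The counts themselves are made up for illustration.

```python
import math


def information(count, total):
    """I(a) = -log2 P(a), with P(a) = count / total (base 2 is an assumption)."""
    return -math.log2(count / total)


def subgraph_information(elements, counts):
    """Additive measure I(A) = sum of I(a_i), assuming independent elements."""
    total = sum(counts.values())
    return sum(information(counts[e], total) for e in elements)


def record_use(counts, elements):
    """Hypothetical feedback step: bump the count of every element used."""
    for e in elements:
        counts[e] += 1


# Made-up usage counts for two nodes and one link.
counts = {"airports": 4, "runways": 2, "have": 2}

before = subgraph_information(["airports", "have", "runways"], counts)
runways_before = information(counts["runways"], sum(counts.values()))

record_use(counts, ["runways"])
runways_after = information(counts["runways"], sum(counts.values()))

print(before, runways_before, runways_after)
```

Elements that are used more often get a higher probability and therefore a lower information measure, which is how a ranking based on this measure can drift toward what users actually select.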

Query Formulation
Goal: formulate (simple) queries without knowledge of the query language or the database schema
Example: Find airports in Tunisia that can land a C-5 cargo plane
User input:
– Query aspect: AIRPORTS.APORT_NM
– Constraints: AIRCRAFT_AIRFIELD_CHARS.AC_TYPE_NM = ‘C-5’; COUNTRY_STATE.CY_NM = ‘Tunisia’
– Links: CAN LAND

Formulated Query
SELECT R3.APORT_NM
FROM AIRCRAFT_AIRFIELD_CHARS R0, AIRPORTS R3, COUNTRY_STATE R11, GEOLOC R12, RUNWAYS R16
WHERE R0.AC_TYPE_NM = 'C-5'
  AND R11.CY_NM = 'Tunisia'
  AND R0.WT_MIN_AVG_LAND_DIST_FT <= R16.RUNWAY_LENGTH_FT
  AND R0.WT_MIN_RUNWAY_WIDTH_FT <= R16.RUNWAY_WIDTH_FT
  AND R12.GLC_CD = R3.GLC_CD
  AND R3.APORT_NM = R16.APORT_NM
  AND R11.CY_CD = R12.CY_CD
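
Once a query topic is complete, turning the (Topic, Constraints, Aspect) triple into SQL is mostly string assembly. The sketch below shows one possible shape for that step. `QueryFeature` and `to_sql` are hypothetical names, the tuple layouts are assumptions, and the example covers only the simpler request "find airports in Tunisia", with join columns taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class QueryFeature:
    aspect: list       # A: (relation, attribute) pairs to return
    constraints: list  # C: (relation, attribute, operator, literal)
    topic: list        # T: (left_rel, left_attr, operator, right_rel, right_attr)


def to_sql(q):
    """Assemble an SPJ query; relations are listed in order of first mention."""
    relations = []

    def note(rel):
        if rel not in relations:
            relations.append(rel)

    for rel, _ in q.aspect:
        note(rel)
    for rel, *_ in q.constraints:
        note(rel)
    for left, _, _, right, _ in q.topic:
        note(left)
        note(right)

    select = ", ".join(f"{r}.{a}" for r, a in q.aspect)
    # Literals are quoted naively here; real code should use bound parameters.
    where = [f"{r}.{a} {op} '{v}'" for r, a, op, v in q.constraints]
    where += [f"{l}.{la} {op} {r}.{ra}" for l, la, op, r, ra in q.topic]
    return (f"SELECT {select} FROM {', '.join(relations)} "
            f"WHERE {' AND '.join(where)}")


feature = QueryFeature(
    aspect=[("AIRPORTS", "APORT_NM")],
    constraints=[("COUNTRY_STATE", "CY_NM", "=", "Tunisia")],
    topic=[
        ("AIRPORTS", "GLC_CD", "=", "GEOLOC", "GLC_CD"),
        ("GEOLOC", "CY_CD", "=", "COUNTRY_STATE", "CY_CD"),
    ],
)
sql = to_sql(feature)
print(sql)
```

Keeping the operator in each topic link lets the same structure carry non-equality joins such as the runway length comparison in the CAN LAND link.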

Query Completion as Graph Search Problem
Given: an incomplete input query topic T_i
Find: a set of links to complete the topic (to make T_i connected)
Minimum Missing Information principle: the query completion candidate T_c (the missing links and nodes) for an incomplete input topic T_i contains the minimum information

Query Formulation Algorithm
Input: subgraph T of the semantic graph G
– Find candidates with the minimum information measure
Two methods are used to limit the search scope:
– L-step-bound paths: paths that connect two components with at most L links, to limit the search to the neighborhood of the input subgraph
– k-minimum completion candidates: at most k candidates with minimum information measure are kept (alpha-beta pruning)
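
The two bounds can be illustrated with a small search over a toy graph. This is a simplified sketch rather than the algorithm from the slides: it connects only two components, enumerates every simple path of at most L links by depth-first search, and trims to the k cheapest afterwards instead of pruning during the search. The weights and link endpoints are made up, and `completion_candidates` is a hypothetical name.

```python
import heapq


def completion_candidates(node_info, links, comp_a, comp_b, max_links, k):
    """Return up to k (cost, path) pairs joining comp_a to comp_b.

    Cost is the information measure of the missing elements: every link
    on the path plus every intermediate node outside both components.
    """
    adjacency = {}
    for u, name, v, info in links:
        adjacency.setdefault(u, []).append((u, name, v, info))
        adjacency.setdefault(v, []).append((v, name, u, info))

    found = []

    def extend(node, visited, path, cost):
        if len(path) == max_links:  # L-step bound reached
            return
        for src, name, nxt, info in adjacency.get(node, []):
            if nxt in visited or nxt in comp_a:
                continue
            step = path + [(src, name, nxt)]
            if nxt in comp_b:
                found.append((cost + info, step))
            else:
                extend(nxt, visited | {nxt}, step, cost + info + node_info[nxt])

    for start in comp_a:
        extend(start, {start}, [], 0.0)
    return heapq.nsmallest(k, found)  # k-minimum candidates


# Toy graph: information measures and link endpoints are illustrative only.
node_info = {"airports": 1.0, "runways": 1.0, "geoloc": 1.0, "country": 1.0}
links = [
    ("airports", "at", "geoloc", 1.0),
    ("geoloc", "located", "country", 1.0),
    ("airports", "have", "runways", 1.0),
    ("runways", "at", "geoloc", 2.0),
]

two_step = completion_candidates(node_info, links, {"airports"}, {"country"}, 2, 3)
three_step = completion_candidates(node_info, links, {"airports"}, {"country"}, 3, 3)
best_only = completion_candidates(node_info, links, {"airports"}, {"country"}, 3, 1)
print(two_step)
```

With L = 2 only the direct route through geoloc qualifies. Raising L to 3 admits the detour through runways, which ranks lower because it carries more missing information.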

Initial Components and 2-Step-Bound Paths for the “CAN LAND” Query
[Figure: (a) initial components; (b) 2-step-bound paths (1) to (6), over the nodes airports, runways, aircrafts, airfield_chars, geoloc and country]

The Semantic Graph for the Transportation Domain
[Figure: relation nodes airports, runways, weather, airfield_chars, geoloc, country; links can land, at, have, is a, located]

Incremental Query Formulation
– Assists the user in reaching a complex query goal through a series of simple queries
– Subsequent queries may depend on the results of preceding queries (derived relations)
Issues:
– Incorporating derived relations into the semantic graph
– Suggesting missing attributes to link isolated derived nodes to the graph

Incremental Query Examples
– Find airports in Tunisia.
– Which of these airports can land a C-5?
– What is the weather at these airports?

Incorporating Derived Relations
– Source relation: contributes attributes to the derived relation
– Derived relation: inherits properties of attributes from its source relations
– Deriving link: links the derived relation to its source relations through inherited keys
– Inherited link: a link inherited from the source relations

Extended semantic graph showing derived nodes, derived links and inherited links
[Figure: relation nodes airports, runways, airfield_chars, weather, geoloc, country; derived nodes airporttunisia, airporttunisiacanland, airporttunisiacanlandweather; legend: Relation Node, Derived Node, Derived Link, Inherited Link]

Suggesting Key Attributes for a Query Find source relations for the isolated derived relation. Suggest key of the source relations as attributes to include.
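
The suggestion step can be sketched as a simple lookup. The code assumes a derived relation is recorded as a list of (source relation, attribute) pairs and that every source relation has a known key; `suggest_key_attributes` is a hypothetical name, and the keys are the ones listed in the semantic model generation example.

```python
# Keys as given in the semantic model generation example.
KEYS = {
    "AIRPORT": ["APORT_NM"],
    "RUNWAY": ["RUNWAY_NM"],
    "GEOLOC": ["GLC_CD"],
    "COUNTRY": ["CY_CD"],
}


def suggest_key_attributes(derived_attrs, keys):
    """Suggest source-relation keys that the derived relation lacks.

    derived_attrs: (source_relation, attribute) pairs kept in the derived
    relation. Without an inherited key, no deriving link can be drawn.
    """
    present = set(derived_attrs)
    sources = []
    for rel, _ in derived_attrs:  # source relations, in order of first use
        if rel not in sources:
            sources.append(rel)
    return [(rel, key) for rel in sources for key in keys[rel]
            if (rel, key) not in present]


# A derived relation that kept only non-key attributes is isolated.
isolated = [("AIRPORT", "ELEV_FT"), ("COUNTRY", "CY_NM")]
suggestions = suggest_key_attributes(isolated, KEYS)

# Once a key is included, nothing more is suggested for that source.
linked = [("AIRPORT", "APORT_NM"), ("AIRPORT", "ELEV_FT")]
no_suggestions = suggest_key_attributes(linked, KEYS)
print(suggestions, no_suggestions)
```

Including a suggested key lets the derived relation join back to its source, which reconnects it to the rest of the semantic graph for the next query in the series.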

Concept and Attribute Specification Interface

Query Constraint Specification

Action Specification

English-Like Query Description and the Formulated Query

Conclusions
– The semantic graph model provides a basis for query formulation search
– Ranking query candidates by information measure during formulation provides adaptive behavior
– Incremental query formulation is effective for complex queries
– GUI and voice interfaces can be built for query formulation from high-level concepts