VLDB ‘99 Edinburgh, Scotland Capturing and Querying Multiple Aspects of Semistructured Data Curtis Dyreson (formerly) Dept. of Comp. Sci., James Cook University.

Capturing and Querying Multiple Aspects of Semistructured Data
VLDB ‘99 Edinburgh, Scotland
Curtis Dyreson, (formerly) Dept. of Comp. Sci., James Cook University
Michael Böhlen, Christian S. Jensen, Nykredit Center for Database Research, Department of Computer Science, Aalborg University

Outline
  meta-data
  representation (properties)
  queries (collapse, match, coalesce)
  AUCQL
  summary

Meta-data
  database meta-data: schema, security, transaction time
  web meta-data: author, language, subject (Dublin Core), privacy
  web 'meta-data' standards: RDF, P3P
  intrinsic
  informational, but also exclusional
  irregular
  ad-hoc

Movie database
  movie data
    Bruce Willis stars in Colour of Night.
    Colour of Night premiered 1/Jul/1995.
  publication meta-data
    language: English
    URL
    publication date: 2/Apr/1997
    privacy/security: 'over 18'
    publication history: v1.2, modified 31/Jul/1998
    subject: Film, Suspense, Thriller
  queries
    Retrieve information published at Danish web sites.
    Find reviews published in the first week of the movie's release.
    Get suspense films starring Bruce Willis.

A semistructured database
  a database consists of nodes, edges with labels, and values
  meta-data: schema, security, language, URL, subject, time
  [Figure: example graph with nodes &1, &2, ...; edge labels movie, name, star, age, Oscars; value Bruce Willis]

Properties
  property name: property value
  default name property
  A label is a set of properties.
  [Figure: the edge to &1 (Colour of Night) shown with three labels in turn: "title", then "name: title", then "name: title, URL:"]

Label semistructure
  meta-meta-data: Joe authored the URL meta-data
  [Figure: the label "name: title, URL:" on the edge to &1 (Colour of Night), drawn as a small graph with name and URL edges, plus an author edge with value Joe]

Properties (continued)
  required properties
  missing properties
  [Figure: an edge labeled "name: movie, security! over 18" into &1, with security marked as required and the URL property missing; an edge labeled "name: title, URL:" to &2 (Colour of Night), with the security property missing]

Property semantics
  transaction time example
  [Figure: graph over nodes &1, &2, &3 with values Color of Night and Colour of Night. Edge labels: "name: reviewed, trans. time: [1/Sep/ uc]"; "name: movie"; "name: title, trans. time: [2/Apr/ /Jul/1998]"; "name: title, trans. time: [1/Aug/ uc]". One route through the graph is marked "Not a path!"]

Using an existing model
  meta-data and data edges
  retrieve titles of reviewed movies
    SELECT X.data
    FROM reviewed R, R.movie M, M.title X
    WHERE R.metadata.transtime INTERSECT M.metadata.transtime
      AND M.metadata.transtime INTERSECT X.metadata.transtime
  [Figure: a title edge from &1 to &2; &2 has a data edge to Colour of Night and a metadata edge to &3; &3 has a transtime edge to 1/Aug/ uc]

Design flaws
  query must enforce semantics to avoid fictive results
  wildcard unintentionally accesses meta-data
    SELECT X.data FROM *.title X
  no means of enforcing required properties
  even correctly formed queries are brittle
    user guesses at meta-data encoding

Shortest paths
  [Figure: shortest-path analogy, pairing Coalesce with min and Collapse with sum]

Collapse
  Collapse the information along a path to a single edge.
  [Figure: graph over &1, &2, &3 with values Color of Night and Colour of Night. Edge labels: "name: reviewed, trans. time: [1/Sep/ uc]"; "name: movie"; "name: title, trans. time: [2/Apr/ /Jul/1998]"; "name: title, trans. time: [1/Aug/ uc]". The two collapsed edges are marked "?"]

Collapse example
  PropertyCollapse for name is concatenation, for trans. time it is temporal intersection.
  [Figure: graph over &1, &2, &3 with values Color of Night and Colour of Night. Edge labels: "name: reviewed, trans. time: [1/Sep/ uc]"; "name: movie"; "name: title, trans. time: [2/Apr/ /Jul/1998]"; "name: title, trans. time: [1/Aug/ uc]". Collapsed edges: "name: reviewed.movie.title, trans. time: [1/Sep/ uc]" and "name: reviewed.movie.title, trans. time: undefined"]

Match (retrieval)
  find paths that meet some condition(s)
  path regular expression
    role - exact match, e.g., title
    regular expression operators (.|?*+)
      (reviewed.movie)*.(title | name)
  only label matching changes
    labels are sets of properties
    required properties
    values may be from non-string domains, use PropertyMatch

LabelMatch example
  query role: name! movie, trans. time: [now - now]
  label in database: name: movie, security! over 18, URL:
  name property - 'movie' compares to 'movie', continue
  transaction time property - missing in target, continue
  URL property - missing in query, continue
  security property - required by database, no match!
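The property-by-property walk in the LabelMatch example can be sketched in Python. This is only an illustration of the steps listed on the slide, not the AUCQL implementation: the `Label` class, the integer timestamps, the `NOW` value, the placeholder URL, and the choice of equality for names and interval overlap for transaction times are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    """Hypothetical encoding of a label: a set of properties."""
    props: dict                                  # property name -> value
    required: set = field(default_factory=set)   # names written with "!"


def property_match(prop, a, b):
    # Assumption: transaction times are (start, end) tuples that match when
    # they overlap; every other property matches on equality.
    if prop == "trans. time":
        return a[0] <= b[1] and b[0] <= a[1]
    return a == b


def label_match(query, target):
    """Return True when the query role matches the label in the database."""
    for prop in query.props.keys() | target.props.keys():
        in_query = prop in query.props
        in_target = prop in target.props
        if in_query and in_target:
            if not property_match(prop, query.props[prop], target.props[prop]):
                return False
        elif in_query and prop in query.required:
            return False  # required by the query, missing in the target
        elif in_target and prop in target.required:
            return False  # required by the database, missing in the query
        # otherwise the property is missing on one side and optional: continue
    return True


NOW = 100  # hypothetical timestamp standing in for "now"
URL = "http://example.org/"  # placeholder; the slide's URL value is not given

query = Label({"name": "movie", "trans. time": (NOW, NOW)}, {"name"})
guarded = Label({"name": "movie", "security": "over 18", "URL": URL}, {"security"})
unguarded = Label({"name": "movie", "URL": URL})

print(label_match(query, guarded))    # security is required by the database
print(label_match(query, unguarded))  # every shared property matches
```

The loop touches each property once, which is in line with the O(m) cost the talk gives for LabelMatch over m properties.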

Retrieval queries
  retrieval queries
    replace only LabelMatch
    test validity of each path with Collapse
  cost
    LabelMatch now O(m) where m is number of properties
    Collapse is O(m*n) where n is length of path
  backwards compatible
    implicit name property
    LabelMatch is string comparison
    Collapse can be ignored
    both kinds of labels can coexist

Additional operations
  Coalesce - compute a distributed property value
  [Figure: two edges from &1 to &2, labeled "name: review, security! developer, trans. time: [1/Jul/ /Jul/1999]" and "name: review, security! subscriber, trans. time: [16/Jul/ uc]"; the coalesced value is "trans. time: [1/Jul/ uc]"]
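For transaction time, coalescing amounts to merging the intervals found on parallel edges. A minimal sketch, assuming closed intervals over integer day numbers (so that an interval ending on day 15 and one starting on day 16 are adjacent) and a hypothetical `UC` constant for "until changed"; the day numbers are made up for the example and do not come from the slide:

```python
UC = float("inf")  # hypothetical encoding of "uc" (until changed)


def coalesce_intervals(intervals):
    """Merge overlapping or adjacent closed intervals of integer days."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or directly follows the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Two parallel "review" edges, one visible to developers, one to subscribers.
developer_edge = (1, 15)
subscriber_edge = (16, UC)

print(coalesce_intervals([developer_edge, subscriber_edge]))
```

Intervals separated by a gap stay separate, so the coalesced value is in general a list of intervals and not a single one.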

Meta-data modification
  framework is extensible
    specify the semantics and domain
    or just use it, with default semantics

                    name            trans. time     default
  domain            strings         time intervals  objects
  PropertyCollapse  concatenation   intersect       last
  PropertyMatch     =               overlaps        =
  PropertySlice     semantic error  intersect       semantic error
  PropertyCoalesce  union           coalesce

AUCQL
  Lorel SELECT statement derivative
  example, retrieve all movie titles
    SELECT Title FROM movie.title Title;
  AUCQL replaces role with unordered list of properties
    SELECT Title FROM (name! movie).(name! title) Title;
  default to required name property

AUCQL (continued)
  can use any property, retrieve current movie titles
    SELECT Title
    FROM (name! movie, trans. time: [now - now]).(name! title, trans. time: [now - now]) Title;
  can set properties for entire query
    SET PROPERTY (trans. time: [now - now]);
    SELECT Title FROM movie.title Title;

AUCQL (continued)
  can use MATCH, COALESCE, COLLAPSE
  example, show names along all current paths in the database
    SELECT PROPERTY(name, COLLAPSE(All))
    FROM (trans. time: [now - now])* All;
  result, e.g.,
    reviewed
    reviewed.movie
    reviewed.movie.title
    …
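What the last query computes can be imitated in a few lines of Python: walk every path whose edges are current and concatenate the name properties along the way. This is a sketch of the idea only, not of how the AUCQL engine evaluates the query. The edge list, the node identifiers, the `root` start node, the integer timestamps, and the `NOW` value are all hypothetical, and the graph is assumed to be acyclic.

```python
UC = float("inf")  # hypothetical encoding of "uc" (until changed)
NOW = 100          # hypothetical timestamp standing in for "now"

# (source, target, name, transaction time or None when the property is missing)
EDGES = [
    ("root", "&1", "reviewed", (30, UC)),
    ("&1", "&2", "movie", None),
    ("&2", "&3", "title", (20, UC)),
    ("&2", "&4", "title", (10, 19)),
]


def is_current(interval, now=NOW):
    # A missing transaction time does not block the match.
    return interval is None or interval[0] <= now <= interval[1]


def current_path_names(edges, start, now=NOW):
    """Collapsed name of every path from start that uses only current edges."""
    names = []
    stack = [(start, "")]
    while stack:  # terminates because the graph is assumed acyclic
        node, prefix = stack.pop()
        for source, target, name, interval in edges:
            if source == node and is_current(interval, now):
                collapsed = f"{prefix}.{name}" if prefix else name
                names.append(collapsed)
                stack.append((target, collapsed))
    return sorted(names)


for path_name in current_path_names(EDGES, "root"):
    print(path_name)
```

The superseded title edge, whose transaction time ended before `NOW`, contributes no path, so `reviewed.movie.title` is reported once.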

Summary
  meta-data representation
    labels with properties
    property semantics
  new query operations
  extensible
  AUCQL website
    implemented research prototype
    free, downloadable, Unix environment
    interactive query engine
    tutorials

Related work
  Lorel (Abiteboul et al., JDL 97)
  non-simple labels
    Chlorel/DOEM (Chawathe et al., ICDE ‘98)
    Deterministic Paths (Buneman et al., ICDT ‘99)
  RDF query languages (QL ‘98)
    Query Service for RDF (Decker et al.)
    P3P (Cranor)
    RDF Query Specification (Malhotra and Sundaresan)

Future work
  XML/RDF/DCD translation
  labels can share common properties
  no container termination
  property terminators?
  recursive semi-structured labels
  heterogeneous meta-data
    does security mean security?
  AUCQL has single property name space
  dynamic scoping of properties
  property semantics keyed to single property name

Future work (continued)
  soundness and completeness
    incomplete with respect to graph operations
  minimal set of operations
    information preserving?
    property-specific basis
  design guidelines for property semantics
  implementation
    path indexing (when labels have properties)
    query optimization

Collapse mechanics
  collapse pair-wise along path
  LabelCollapse: Label X Label -> Label
    for each property in both labels
      if property is in both then apply PropertyCollapse
      else add to result
  PropertyCollapse is a property-specific constructor
    T X T -> T U {undefined}
  required properties stay required
  path is valid if no property is undefined
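The mechanics above translate fairly directly into a pair-wise fold over the labels of a path. The following Python sketch makes several assumptions that are not in the talk: the `Label` class, the `UNDEFINED` sentinel, the `UC` constant, and the integer timestamps are all hypothetical, and only the two property semantics named in the collapse example (concatenation for name, temporal intersection for trans. time) are registered, so any other shared property would raise a `KeyError`.

```python
from dataclasses import dataclass, field
from functools import reduce

UNDEFINED = object()  # stands in for the slide's "undefined"
UC = float("inf")     # hypothetical encoding of "uc" (until changed)


@dataclass
class Label:
    """Hypothetical encoding of a label: a set of properties."""
    props: dict                                  # property name -> value
    required: set = field(default_factory=set)   # names written with "!"


def collapse_name(a, b):
    return f"{a}.{b}"  # concatenation


def collapse_time(a, b):
    start, end = max(a[0], b[0]), min(a[1], b[1])  # temporal intersection
    return (start, end) if start <= end else UNDEFINED


PROPERTY_COLLAPSE = {"name": collapse_name, "trans. time": collapse_time}


def label_collapse(x, y):
    """LabelCollapse: Label X Label -> Label."""
    props = {}
    for prop in x.props.keys() | y.props.keys():
        if prop in x.props and prop in y.props:
            a, b = x.props[prop], y.props[prop]
            if a is UNDEFINED or b is UNDEFINED:
                props[prop] = UNDEFINED  # undefined stays undefined
            else:
                props[prop] = PROPERTY_COLLAPSE[prop](a, b)
        else:
            # Present in only one label: add it to the result unchanged.
            props[prop] = x.props[prop] if prop in x.props else y.props[prop]
    return Label(props, x.required | y.required)  # required stays required


def collapse_path(labels):
    return reduce(label_collapse, labels)


def is_valid(label):
    return all(value is not UNDEFINED for value in label.props.values())


reviewed = Label({"name": "reviewed", "trans. time": (30, UC)})
movie = Label({"name": "movie"})
new_title = Label({"name": "title", "trans. time": (20, UC)})
old_title = Label({"name": "title", "trans. time": (10, 19)})

good = collapse_path([reviewed, movie, new_title])
bad = collapse_path([reviewed, movie, old_title])
print(good.props["name"], is_valid(good))
print(bad.props["name"], is_valid(bad))
```

Both paths collapse to the name `reviewed.movie.title`, but only the first has a defined transaction time, which mirrors the two collapsed edges in the collapse example. Each pair-wise step visits every property once, consistent with the O(m*n) cost the talk gives for a path of length n.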