The Trio System for Data, Uncertainty, and Lineage: Overview and Demo

The Trio System for Data, Uncertainty, and Lineage: Overview and Demo
Anish Das Sarma
Stanford University

Original Motivation for the Project
New application domains:
- Many involve data that is uncertain (approximate, probabilistic, inexact, incomplete, imprecise, fuzzy, inaccurate, ...)
- Many of the same ones need to track the lineage (provenance) of their data
Neither uncertainty nor lineage is supported in current database systems.

Sample Applications
- Data integration
- Information extraction
- Scientific experiments
- Sensor data management
- Deduplication (“data cleaning”)
- Approximate query processing

Our Goal
Develop a new kind of database management system (DBMS) in which data, uncertainty, and lineage are all first-class, interrelated concepts, with all the “usual” DBMS features.

Another “Trio” in Trio
- Data Model: simplest extension to the relational model that’s sufficiently expressive
- Query Language: simple extension to SQL with well-defined semantics and intuitive behavior
- System: a complete open-source DBMS that people want to use

Another “Trio” in Trio
- Data Model: Uncertainty-Lineage Databases (ULDBs)
- Query Language: TriQL
- System: Trio-One, built on top of a standard DBMS
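
The slides name the ULDB data model without showing it, so here is a minimal Python sketch of the idea, offered as an illustration rather than as Trio-One's actual implementation. It assumes that an uncertain tuple is a set of mutually exclusive alternatives, each with a confidence, and that a derived alternative records as lineage the base alternatives it came from. The class names, the `join_on_car` helper, and the sample data are all hypothetical. The product rule for confidences assumes the joined base alternatives are independent.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# Hypothetical identifier: (id of the uncertain tuple, index of the alternative).
AltId = Tuple[str, int]


@dataclass(frozen=True)
class Alternative:
    values: Tuple[str, ...]                 # one possible value of the tuple
    confidence: float                       # probability that this alternative holds
    lineage: FrozenSet[AltId] = frozenset() # base alternatives this one was derived from


@dataclass(frozen=True)
class XTuple:
    tid: str
    alternatives: Tuple[Alternative, ...]   # mutually exclusive possibilities


def join_on_car(saw: List[XTuple], drives: List[XTuple]) -> List[XTuple]:
    """Join Saw(witness, car) with Drives(person, car) on the car attribute.

    Each result alternative keeps lineage pointing at the two base
    alternatives that produced it. Its confidence is their product,
    which is only valid under the independence assumption stated above.
    """
    result = []
    for s in saw:
        for d in drives:
            alts = []
            for i, sa in enumerate(s.alternatives):
                for j, da in enumerate(d.alternatives):
                    if sa.values[1] == da.values[1]:
                        alts.append(Alternative(
                            values=(sa.values[0], da.values[0]),
                            confidence=sa.confidence * da.confidence,
                            lineage=frozenset({(s.tid, i), (d.tid, j)}),
                        ))
            if alts:
                result.append(XTuple(f"{s.tid}x{d.tid}", tuple(alts)))
    return result


# Illustrative data: Amy saw either a Honda or a Toyota, but not both.
saw = [XTuple("s1", (Alternative(("Amy", "Honda"), 0.5),
                     Alternative(("Amy", "Toyota"), 0.3)))]
drives = [XTuple("d1", (Alternative(("Jimmy", "Toyota"), 1.0),)),
          XTuple("d2", (Alternative(("Billy", "Honda"), 0.8),))]

accuses = join_on_car(saw, drives)
for x in accuses:
    for a in x.alternatives:
        print(x.tid, a.values, a.confidence, sorted(a.lineage))
```

In this sketch the two derived tuples trace back to different alternatives of `s1`, so their lineage shows they cannot both be true. Confidences alone would not capture that dependency, which is one reason to treat lineage and uncertainty together.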

Demo

Ongoing and Future Work
- Efficient confidence computation
- Top-K queries
- Aggregation
- External lineage
- Data modifications and versioning
- Continuous uncertainty
- Dependency theory for ULDBs
- Marrying Trio and Bayes nets
- System development and applications

Trio Players, Present and Past
Current: Jennifer Widom, Jeffrey Ullman, Parag Agrawal, Anish Das Sarma, Raghotham Murthy, Martin Theobald
Alums: Omar Benjelloun, Ashok Chandra, Julien Chaumond, Alon Halevy, Chris Hayworth, Ander de Keijzer, Michi Mutsuzaki, Shubha Nabar, Tomoe Sugihara

Thank you! Search “stanford trio”