Auditing Batches of SQL Queries
Rajeev Motwani, Shubha Nabar, Dilys Thomas
Stanford University


Database Query Auditing
Auditing aggregate (sum, max, median) queries
Perfect privacy
Auditing SQL queries
Auditing a batch of SQL queries

Aggregate Queries
[C86] Chin: Security problems on inference control for sum, max and min queries. JACM 1986
[CO82] Chin, Ozsoyoglu: Auditing and inference control in statistical databases. TSE 1982
[DJL79] Dobkin, Jones, Lipton: Secure databases: Protection against user influence. TODS 1979
[KMN05] Kenthapadi, Mishra, Nissim: Simulatable auditing. PODS 2005
[KPR00] Kleinberg, Papadimitriou, Raghavan: Auditing Boolean attributes. PODS 2000
[R79] Reiss: Security in databases: A combinatorial study. JACM 1979

Aggregate Queries
How many aggregate (sum, max, median) queries can you pose to a database of numbers before you learn the value of an individual element?
A body of work on this question dates from the 1980s.
It is theoretically interesting and the basis of more practical schemes today.
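As a small illustration of the question above, the following sketch shows two sum queries that together disclose a single value. The data and the `sum_query` helper are hypothetical and not taken from the slides.

```python
# Hypothetical private values; the user may only see them through sum queries.
salaries = {"a": 50, "b": 70, "c": 90}

def sum_query(names):
    """Answer a sum query over the named elements."""
    return sum(salaries[n] for n in names)

q1 = sum_query(["a", "b", "c"])   # sum over all three elements
q2 = sum_query(["b", "c"])        # sum over two of them

# Neither answer alone reveals an element, but their difference does.
leaked_a = q1 - q2
```

An auditor for aggregate queries would have to refuse the second query (or the first) to keep the value of `a` hidden.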

Perfect Privacy
[MS04] Miklau, Suciu: A formal analysis of information disclosure in data exchange. SIGMOD 2004
[MG06] Machanavajjhala, Gehrke: On the efficiency of checking perfect privacy. PODS 2006

Perfect Privacy [MS04, MG06]
Table: Patient(Name, Phone number)
Secret to protect: all phone numbers in the database
Query: SELECT name FROM Patient
Perfect privacy violation! The query reveals some information: the phone database is not empty.
This notion of privacy is too strong.

SQL Auditing: Single Table
Audit for the address, SSN and phone number of all patients with diabetes.
Say Alice has diabetes.
Then any query that returns the address, SSN and phone number of Alice is suspicious wrt the audit expression.
[ABFKRS04] Agrawal, Bayardo, Faloutsos, Kiernan, Rantzau, Srikant: Auditing compliance with a Hippocratic database. VLDB 2004

Auditing SQL Queries [ABFKRS04]
An audit expression is like a SQL query:
AUDIT audit list
FROM table list
WHERE condition list

Example
Query:
SELECT zipcode FROM Patients p WHERE p.disease = 'diabetes'
Audit expression 1:
AUDIT zipcode FROM Patients p WHERE p.disease = 'high blood pressure'
The query is not suspicious wrt this expression.
Audit expression 2:
AUDIT disease FROM Patients p WHERE p.zipcode =
The query is suspicious wrt this expression if someone in that zipcode has diabetes.

Formally, SQL Auditing
Query: $Q = \pi_{C_{OQ}}(\sigma_{P_Q}(T \times R))$
Audit expression: $A = \pi_{C_{OA}}(\sigma_{P_A}(T \times S))$
where
$T = T_1 \times T_2 \times \dots \times T_n$
$R = R_1 \times R_2 \times \dots \times R_n$
$S = S_1 \times S_2 \times \dots \times S_n$

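SQL Auditing: Q Suspicious wrt A
[Diagram: the product $T \times R \times S$ with a shared tuple $v$ in $T$]
(1) $\exists v \in T$ such that
(a) $\sigma_{R \wedge T}(R \times \{v\}) \neq \emptyset$
(b) $\sigma_{S \wedge T}(\{v\} \times S) \neq \emptyset$
(2) All audited columns are projected by the query
Requires execution of queries on the database.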
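The two conditions can be sketched for the simplest case, a single table with no extra tables R or S, so that condition (1) reduces to "some row satisfies both the query predicate and the audit predicate". The table contents, the dictionary layout (`cols`, `pred`) and the function name are assumptions made for illustration only.

```python
def is_suspicious(table, query, audit):
    """Single-table sketch: query and audit are dicts with
    'cols' (set of column names) and 'pred' (function row -> bool)."""
    # Condition (2): every audited column is projected by the query.
    if not audit["cols"] <= query["cols"]:
        return False
    # Condition (1): some tuple is selected by both predicates.
    return any(query["pred"](row) and audit["pred"](row) for row in table)

# Hypothetical data.
patients = [
    {"name": "Alice", "disease": "diabetes", "zipcode": "11111"},
    {"name": "Bob", "disease": "flu", "zipcode": "22222"},
]
query = {"cols": {"name", "zipcode"},
         "pred": lambda r: r["disease"] == "diabetes"}
audit_same = {"cols": {"zipcode"},
              "pred": lambda r: r["disease"] == "diabetes"}
audit_other = {"cols": {"zipcode"},
               "pred": lambda r: r["disease"] == "high blood pressure"}

hit = is_suspicious(patients, query, audit_same)     # shared tuple: Alice
miss = is_suspicious(patients, query, audit_other)   # no shared tuple
```

Because the predicates are evaluated against actual rows, this check needs access to the data, which matches the remark that the test requires executing queries on the database.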

Auditing a Batch of SQL Queries
Previous work covers:
(1) Batches of aggregate queries such as sum, max and median: can the answers be stitched together to reveal more than any single query reveals?
(2) Singleton SQL queries
We introduce the notion of auditing a batch of SQL queries.

SQL Auditing
A batch of SQL queries, each of the form
Project col_1, col_2, col_3, …, col_k
From R
Where C_1 and C_2 and C_3 and … and C_j
Each C_i is one of: (col_m = value), (col_m <= value), (col_m >= value), (value_1 <= col_m <= value_2)
col_1, col_2, …, col_k include the primary key, so that the result of a query can be joined with other results.

Semantically Suspicious
A query batch Q_1, Q_2, …, Q_n is said to be semantically suspicious wrt an audit expression A if an expression combining the results of these queries as base tables is suspicious wrt A.
This is the natural extension of a suspicious query to a query batch.

Syntactically Suspicious
A query batch is said to be syntactically suspicious wrt an audit expression A if there exists an instantiation of the database tables for which it is suspicious wrt A.
This does not require execution of the queries against the table.

SQL Batch Auditing
[Figure: Query 1, Query 2, Query 3 and Query 4 overlaid on the audit expression; together they cover the columns of an audited tuple]
A query batch is suspicious wrt an audit expression iff the queries together cover all audited columns of at least one audited tuple:
semantically, on the table T;
syntactically, on some table T.

Syntactic and Semantic Auditing
Semantically suspicious implies syntactically suspicious.
To check semantic suspiciousness, check for syntactic suspiciousness first, then execute the queries on the tables to verify.
How to check syntactic suspiciousness is covered next.

Compatible Queries
A batch of queries is compatible if the conjunction of their selection conditions is satisfiable.
To test the compatibility of a set of queries, it is enough to check pairwise compatibility [Helly's Theorem].
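The pairwise test can be sketched as interval intersection, since every allowed condition restricts one column to a range. The representation (a dict mapping a column name to a closed interval `(lo, hi)`) and the function names are assumptions for illustration, and each query's own conditions are assumed to be satisfiable.

```python
from itertools import combinations
import math

def pair_compatible(c1, c2):
    """Two condition sets are compatible if their intervals overlap
    on every column that both of them constrain."""
    return all(max(c1[col][0], c2[col][0]) <= min(c1[col][1], c2[col][1])
               for col in c1.keys() & c2.keys())

def batch_compatible(conds):
    """Pairwise check over the whole batch."""
    return all(pair_compatible(a, b) for a, b in combinations(conds, 2))

def batch_compatible_direct(conds):
    """Direct check: intersect all intervals per column."""
    merged = {}
    for c in conds:
        for col, (lo, hi) in c.items():
            mlo, mhi = merged.get(col, (-math.inf, math.inf))
            merged[col] = (max(mlo, lo), min(mhi, hi))
    return all(lo <= hi for lo, hi in merged.values())

# Hypothetical conditions: (col = v) is (v, v), (col <= v) is (-inf, v), etc.
q1 = {"age": (30, 50)}
q2 = {"age": (40, 60), "zipcode": (1, 5)}
q3 = {"age": (55, 70)}
```

Here `q1` and `q2` overlap on `age`, as do `q2` and `q3`, but `q1` and `q3` do not, so the batch of all three is incompatible under both checks. The two functions agree because intervals on a line that intersect pairwise share a common point.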

Syntactic Auditing: Graph Problem
[Figure: graph over Query 1, Query 2, Query 3 and Query 4 = AuditExp]
Suspicious iff there exists an independent set, including the audit expression, that covers all audited colors.

Syntactic Auditing
A query batch is suspicious iff there is a subset of queries that is compatible with the audit expression and covers all audited columns.
Hyperedges need not be considered: by Helly's Theorem, checking pairwise compatibility is enough.
An independent set means the chosen queries are compatible.
Having all audited colors means all audited columns are covered.
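A brute-force version of this test might look as follows. It enumerates subsets, so it runs in exponential time in the batch size; it is only a sketch of the definition, not an algorithm from the slides. The dictionary layout (`cols` as a set of column names, `conds` as column intervals) and all names are assumptions.

```python
from itertools import combinations

def compatible(c1, c2):
    """Interval conditions overlap on every column both constrain."""
    return all(max(c1[k][0], c2[k][0]) <= min(c1[k][1], c2[k][1])
               for k in c1.keys() & c2.keys())

def syntactically_suspicious(batch, audit):
    # Only queries compatible with the audit expression can take part.
    cands = [q for q in batch if compatible(q["conds"], audit["conds"])]
    for size in range(1, len(cands) + 1):
        for subset in combinations(cands, size):
            # The chosen queries must be pairwise compatible.
            if not all(compatible(a["conds"], b["conds"])
                       for a, b in combinations(subset, 2)):
                continue
            # Together they must cover every audited column.
            covered = set().union(*(q["cols"] for q in subset))
            if audit["cols"] <= covered:
                return True
    return False

# Hypothetical batch and audit expression.
audit = {"cols": {"zipcode", "disease"}, "conds": {"age": (40, 60)}}
q1 = {"cols": {"id", "zipcode"}, "conds": {"age": (30, 45)}}
q2 = {"cols": {"id", "disease"}, "conds": {"age": (50, 70)}}
q3 = {"cols": {"id", "disease"}, "conds": {"age": (42, 44)}}
```

With this data, `q1` and `q2` each overlap the audit expression but not each other, so the batch `[q1, q2]` is not flagged, while `[q1, q3]` is flagged because the two queries are compatible and jointly cover `zipcode` and `disease`.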

Syntactic Auditing is NP-complete
Reduction from 3-SAT
[Figure: reduction graph for a 3-SAT instance over the variables X1, X2, X3, X4]

Semantic Auditing
If the table is given by an implicit representation, the problem is NP-complete.
If the table is given explicitly, there is a polynomial-time algorithm.
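The slides do not spell out the polynomial-time algorithm. One straightforward check that follows the covering characterization (the batch is suspicious if the queries together cover all audited columns of at least one audited tuple) is sketched below for an explicit single table; the data layout and names are assumptions, not the authors' method.

```python
def semantically_suspicious(table, batch, audit):
    """table: list of rows (dicts); batch items and audit are dicts with
    'cols' (set of column names) and 'pred' (function row -> bool)."""
    for row in table:
        if not audit["pred"](row):
            continue                      # not an audited tuple
        covered = set()
        for q in batch:
            if q["pred"](row):            # this query returns the tuple
                covered |= q["cols"]
        if audit["cols"] <= covered:      # all audited columns revealed
            return True
    return False

# Hypothetical data.
patients = [
    {"id": 1, "age": 45, "disease": "diabetes", "zipcode": "11111"},
    {"id": 2, "age": 30, "disease": "flu", "zipcode": "22222"},
]
audit = {"cols": {"zipcode", "disease"},
         "pred": lambda r: r["disease"] == "diabetes"}
q1 = {"cols": {"id", "zipcode"}, "pred": lambda r: r["age"] >= 40}
q2 = {"cols": {"id", "disease"}, "pred": lambda r: r["age"] <= 50}

both = semantically_suspicious(patients, [q1, q2], audit)
only_q1 = semantically_suspicious(patients, [q1], audit)
```

The loop visits each row once per query, so the running time is proportional to the number of rows times the number of queries, which is polynomial in the size of an explicitly given table.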

THANK YOU! Questions?