Joint work with Emilien Antoine, Gerome Miklau, Julia Stoyanovich and Vera Zaychik Moffitt ICDE 2012Mai 30, 2012 Introducing Access Control in Webdamlog.

Slides:



Advertisements
Similar presentations
Viewing the Web as a Distributed Knowledge Base Serge Abiteboul INRIA Saclay, Collège de France and ENS Cachan ICDE 2012Mai 30, 2012.
Advertisements

Provenance-Aware Storage Systems Margo Seltzer April 29, 2005.
Lecture 11: Datalog Tuesday, February 6, Outline Datalog syntax Examples Semantics: –Minimal model –Least fixpoint –They are equivalent Naive evaluation.
Logic Programming Automated Reasoning in practice.
Peer-to-Peer Networking for Distributed Learning Repositories: The Edutella Network Diplomarbeit von Boris Wolf.
University of Washington Database Group Tiresias The Database Oracle for How-To Queries Alexandra Meliou § ✜ Dan Suciu ✜ § University of Massachusetts.
1 Intelligent Agents Software analog to human agents real estate agent, librarian, salesperson Perform tasks individually, or in collaboration Static and.
28.2 Functionality Application Software Provides Applications supply the high-level services that user access, and determine how users perceive the capabilities.
The KB on its way to Web 2.0 Lower the barrier for users to remix the output of services. Theo van Veen, ELAG 2006, April 26.
1 WebdamExchange and WebdamLog: some models for web data management Emilien Antoine, Meghyn Bienvenu, Alban Galland Webdam WS, 04/03/2011.
1 WebdamExchange and WebdamLog: some models for web data management Alban Galland INRIA Saclay & ENS Cachan Grenoble, 10/12/2010.
Constraint Logic Programming Ryan Kinworthy. Overview Introduction Logic Programming LP as a constraint programming language Constraint Logic Programming.
Visual Web Information Extraction With Lixto Robert Baumgartner Sergio Flesca Georg Gottlob.
Chapter 16 The World Wide Web Chapter Goals Compare and contrast the Internet and the World Wide Web Describe general Web processing Write basic.
Exploiting Preferences for Minimal Credential Disclosure in Policy-Driven Trust Negotiations Philipp Kärger, Daniel Olmedilla, Wolf-Tilo Balke L3S Research.
70-293: MCSE Guide to Planning a Microsoft Windows Server 2003 Network, Enhanced Chapter 7: Planning a DNS Strategy.
Building Knowledge-Driven DSS and Mining Data
Cambodia-India Entrepreneurship Development Centre - : :.... :-:-
Installing software on personal computer
Audumbar Chormale Advisor: Dr. Anupam Joshi M.S. Thesis Defense
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
Elements of Computing Systems, Nisan & Schocken, MIT Press, 2005, Chapter 10: Compiler I: Syntax Analysis slide 1www.idc.ac.il/tecs.
INTRODUCTION TO WEB DATABASE PROGRAMMING
Object Oriented Software Development
Katanosh Morovat.   This concept is a formal approach for identifying the rules that encapsulate the structure, constraint, and control of the operation.
Imperative Programming
1 Serge Abiteboul - Monitoring 1 Monitoring of distributed applications (in P2P) Serge Abiteboul, Pierre Bourhis, Bogdan Marinoiu, INRIA Saclay and Université.
Chapter 16 The World Wide Web. 2 The Web An infrastructure of information combined and the network software used to access it Web page A document that.
16-1 The World Wide Web The Web An infrastructure of distributed information combined with software that uses networks as a vehicle to exchange that information.
CPS120: Introduction to Computer Science The World Wide Web Nell Dale John Lewis.
C Copyright © 2009, Oracle. All rights reserved. Appendix C: Service-Oriented Architectures.
Unit B: Expanding Your Productivity Page: 24 to 37.
Tim Finin University of Maryland, Baltimore County 29 January 2013 Joint work with Anupam Joshi, Laura Zavala and our students SRI Social Media Workshop.
Chapter 16 The World Wide Web. 2 The Web is an infrastructure of distributed information combined with software that uses networks as a vehicle to exchange.
Benjamin Gamble. What is Time?  Can mean many different things to a computer Dynamic Equation Variable System State 2.
Introduction To System Analysis and Design
DATA-DRIVEN UNDERSTANDING AND REFINEMENT OF SCHEMA MAPPINGS Data Integration and Service Computing ITCS 6010.
RELATIONAL FAULT TOLERANT INTERFACE TO HETEROGENEOUS DISTRIBUTED DATABASES Prof. Osama Abulnaja Afraa Khalifah
The Data Ring: Community Content Sharing Serge Abiteboul (INRIA) Alkis Polyzotis (UC Santa Cruz)
1 Security on Social Networks Or some clues about Access Control in Web Data Management with Privacy, Time and Provenance Serge Abiteboul, Alban Galland.
© 2006 BBN Technologies 1SPINDLE Project Intentional Naming and Deferred Binding in DTN Prithwish Basu BBN Technologies DTNRG meeting, IETF.
1 Problem Solving with C++ The Object of Programming Walter Savitch Chapter 1 Introduction to Computers and C++ Programming Slides by David B. Teague,
Database Design and Management CPTG /23/2015Chapter 12 of 38 Functions of a Database Store data Store data School: student records, class schedules,
Future and Emerging Technologies (FET) Future and Emerging Technologies (FET) The roots of innovation Proactive initiative on: Global Computing (GC) Proactive.
SECURE WEB APPLICATIONS VIA AUTOMATIC PARTITIONING S. Chong, J. Liu, A. C. Myers, X. Qi, K. Vikram, L. Zheng, X. Zheng Cornell University.
SEMANTIC AGENT SYSTEMS Towards a Reference Architecture for Semantic Agent Systems Applied to Symposium Planning Usman Ali.
Simultaneously Learning and Filtering Juan F. Mancilla-Caceres CS498EA - Fall 2011 Some slides from Connecting Learning and Logic, Eyal Amir 2006.
Modeling Speech Acts and Joint Intentions in Modal Markov Logic Henry Kautz University of Washington.
INRIA - Progress report DBGlobe meeting - Athens November 29 th, 2002.
Issues in Ontology-based Information integration By Zhan Cui, Dean Jones and Paul O’Brien.
An answer to your common XACML dilemmas Asela Pathberiya Senior Software Engineer.
13-1 Chapter 13 Concurrency Topics Introduction Introduction to Subprogram-Level Concurrency Semaphores Monitors Message Passing Java Threads C# Threads.
Lecture 4 Mechanisms & Kernel for NOSs. Mechanisms for Network Operating Systems  Network operating systems provide three basic mechanisms that support.
Raluca Paiu1 Semantic Web Search By Raluca PAIU
The International RuleML Symposium on Rule Interchange and Applications Visualization of Proofs in Defeasible Logic Ioannis Avguleas 1, Katerina Gkirtzou.
1 An infrastructure for context-awareness based on first order logic 송지수 ISI LAB.
A formal study of collaborative access control in distributed datalog Serge Abiteboul – Inria & ENS Cachan Pierre Bourhis CNRS & Lille Univ. & Inria Victor.
Chapter 8 E-Commerce Technologies Introduction to Business Information Systems by Mark Huber, Craig Piercy, Patrick McKeown, and James Norrie.
Types for Programs and Proofs
Prof. Leonardo Mostarda University of Camerino
Knowledge out there on the Web
Arab Open University 2nd Semester, M301 Unit 5
Service-centric Software Engineering
Lecture 16: Probabilistic Databases
Patterns.
Probabilistic Databases
Datalog Inspired by the impedance mismatch in relational databases.
Data Provenance.
Objectives Explain the role of computers in client-server and peer-to-peer networks Explain the advantages and disadvantages of client- server and peer-to-peer.
Presentation transcript:

Joint work with Emilien Antoine, Gerome Miklau, Julia Stoyanovich and Vera Zaychik Moffitt ICDE 2012Mai 30, 2012 Introducing Access Control in Webdamlog Serge Abiteboul INRIA Saclay & ENS Cachan

Abiteboul – DBPL The Web as a distributed knowledge base Webdamlog: a rule-based language for the Web Access control in Webdamlog The Webdamlog system Conclusion

Abiteboul – DBPL A typical Web user’s data What kinds of data?  data: photos, music, movies, reports,  metadata: photo taken by Alice in Paris on...  ontologies: Alice’s ontology and mapping with other ontologies  localization: Alice’s pictures are on Picasa, back-ups are at INRIA  security: Facebook credentials (Alice, )  annotations: Alice likes Elvis’ website  beliefs: Alice believes Elvis is alive  external knowledge: Bob keeps copies of Alice’s pictures  time, provenance,... all kinds Social data

Abiteboul – DBPL A typical Web user’s data What kinds of data? Where is the data?  laptop, desktop, smartphone, tablet, car computer  mail, address book, agenda  Facebook, LinkedIn, Picasa, YouTube, Tweeter  svn, Google docs  also access to data / information of family, friends, companies associations all kinds everywhere

Abiteboul – DBPL A typical Web user’s data What kinds of data? Where is the data? all kinds everywhere What kind of organization?  terminology: different ontologies  systems: personal machines, social networks  distribution: different localization  security: different protocols  quality: incomplete / inconsistent information heterogeneous

Abiteboul – DBPL Example of processing Alice and Bob are getting engaged. Their friends want to offer them an album of photos where they are together To make such a photo album Find friends of Alice & Bob (say with Facebook) for each friend, find where she keeps her photos (say, Picassa) find the means to access her photos possibly via friends find the photos that feature Bob and Alice together, e.g., using tags or face recognition software possibly ask someone to verify the results Some reasoning is needed to execute these tasks automatically!

Abiteboul – DBPL A typical Web user Overwhelmed by the mass of information Cannot find the information needed Is not aware of important events Cannot manage/control how others access and use his/her own data 7

Abiteboul – DBPL YOU need help! How can systems help? We need to move from a Web of text to a Web of knowledge  In the spirit of semantic Web To better support user needs,  Systems need to analyze what is happening and construct knowledge  Systems should exchange knowledge  Systems should reason and infer knowledge 8

Abiteboul – DBPL Thesis All this forms a distributed knowledge base with processing based on automated reasoning 9

Abiteboul – DBPL Our topic Distributed reasoning Exchanging facts and rules Webdamlog Access control with access control

Abiteboul – DBPL The Web as a distributed knowledge base Webdamlog: a rule-based language for the Web Access control in Webdamlog The Webdamlog system Conclusion

Abiteboul – DBPL Webdamlog: a datalog-style language Datalog A prehistoric language by Web time... + nice and compact syntax + well-studied with many extensions + recursion essential: network cycles Webdamlog Not as simple/beautiful & procedural Needed for real Web applications! Webdamlog is not datalog

Abiteboul – DBPL Webdamlog: an extension of datalog Datalog programfof(x,y) :- friend(x,y) fof(x,y) :- friend(x,z), fof(z,y) Extensional facts(stored in the database) friend(“peter”,”paul”) friend(“paul”, “mary”) friend(“mary”,”sue”) Intentional facts (derived) fof(“peter”,”paul”) fof(“peter”,”mary”) fof(“peter”, “sue”) fof(“paul”, “mary”)fof(“paul”, “sue”) fof(“mary”,”sue”) 13

Abiteboul – DBPL Webdamlog: an extension of datalog Extends datalog negation, updates, distribution, delegation, time For a world that is distributed: autonomous and asynchronous peers dynamic: knowledge evolves; peers come and go Influenced by Active XML (INRIA) - for distribution & intentional data Dedalus (UC Berkeley) - for time & implementation

Abiteboul – DBPL Facts Facts are of the form 1,..., a n ), where m is a relation name& p is a peer name a 1,..., a n are data values (n is the arity of the set of data values includes the relations and peer names Examples “paul”) extensional “paul”) intentional

Abiteboul – DBPL Examples of facts data & metadata: 1771.jpg, “Paris”, 11/11/2011 ) ontology: theKing) annotations: encyclopedia) localization: picasa/alice) access rights: friends, read) security:

Abiteboul – DBPL Rules Rules are of the form :- (not) $R 1 ($U 1 ),..., (not) $R n ($U n ) where $R, $R i are relation terms $P, $P i are peer terms $U, $U i are tuples of terms Safety condition $R and $P must appear positively bound in the body each variable in a negative literal must appear positively bound in the body A term is a variable or a constant Examples coming up, stay tuned

Abiteboul – DBPL State transition Choose some peer p randomly – asynchronously Compute the transition of p the database updates at p the messages sent to other peers the delegations of rules to other peers Keep going forever (I 0, Γ 0, ∅ ) ➝ (I 1, Γ 1, Γ 1 * ) ➝... ➝ (I n, Γ n, Γ n * ) ➝... Fair sequence: each peer is selected infinitely often

Abiteboul – DBPL The semantics of rules Classification based on locality and nature of head predicates (intentional or extensional) Local rule at my-laptop: all predicates in the body of the rules are from my-laptop Local with local intentional headclassic datalog Local with local extensional headdatabase update Local with non-local extensional headmessaging between peers Local with non-local intentional head view delegation Non-localgeneral delegation 19

Abiteboul – DBPL Local rules with local intentional head Example: Rule at peer my-laptop friend is extensional, fof is intentional $y) :- :- iphone($z,$y) fof is the transitive closure of friend Datalog = Webdamlog with only local rules and local intentional head

Abiteboul – DBPL Local rules with local extensional head A new fact is inserted into the local database $loc) :- $loc),

Abiteboul – DBPL Local rules with non-local extensional head A new fact is sent to an external peer via a message “Happy birthday!”) :- $message, $peer, $date) Extensional facts: 6) “sendmail”, “gmail.com”, March 6) “Happy birthday”)

Abiteboul – DBPL Local rules with non-local intentional head View delegation! $boy) :- $loc), $loc) Semantics of gossip-site is a join of relations girls and boys from my-iphone Formally, my-iphone delegates a rule gossip-site (g,b) for each g, b, l,

Abiteboul – DBPL Non-local rules: general delegation (at my-iphone): $boy) :- $loc), $loc) Suppose that “Julia's birthday”) holds. Then my-iphone installs the following rule at alice-iphone (at alice-iphone): $boy) :- “Julia's birthday”) When “Julia's birthday”) no longer holds, my-iphone uninstalls the rule

Abiteboul – DBPL Non-local rules: general delegation (at my-iphone): $boy) :- $loc), $loc) An alternative, more database-ish, way of looking at this: at my-iphone : $loc):- $loc) at alice-iphone : $boy) :- $loc), $loc) view delegation

Abiteboul – DBPL Complexity of delegation: illustration fof(x,y) :- friend(x,y) (at p) :- $q ), $q (x,y) If contains tuples 1 ),...., ) This rule will install rules! for i=1 to (at q i ) :- i (x,y) Data complexity transformed into program complexity

Abiteboul – DBPL Summary of results [PODS 2011] Formal definition of the semantics of Webdamlog Results on expressivity  the model with delegation is more general, unless all peers and programs are known in advance Convergence is very hard to achieve  positive Webdamlog  strongly stratified programs with negation

Abiteboul – DBPL The Web as a distributed knowledge base Webdamlog: a rule-based language for the Web Access control in Webdamlog The Webdamlog system Conclusion

Abiteboul – DBPL Requirements Data access Users would like to control who can read and modify their information Data dissemination Users would like to control how their data are transferred from one participant to another, and how they are combined, with the owner of each piece of data keeping some control over it Application control Users would like to control which applications can run on their behalf, and what information these applications can access. 29

Abiteboul – DBPL The general picture The privileges we consider: read, write, grant For read: Coarse grained access control: at the relation level Fine grain access control: at the tuple level 30

Abiteboul – DBPL Insertion in extentional relations Definition of intensional relations Requires write privilege on the target relation [at Alice] :- “Friend”), $p), $f) [at Alice] : [at Bob] :- 31

Abiteboul – DBPL Who can read a fact ? – default Extensional relations: if you have read privilege to the relation Intensional relations: if you have read privilege to the relation & if you can read all the tuples that have been used to create this fact – provenance of the fact 32

Abiteboul – DBPL Digression: provenance Provenance of a tuple How it was constructed: conjunction Alternatives: disjunction 33

Abiteboul – DBPL Digression: provenance graph John) rule 3 × Julia’s birthday) Julia's birthday) rule 1 × × John) × + (Also used for maintenance in case of update)

Abiteboul – DBPL Coarse grain access control [at Alice] :- “Friend”), $p), $f) is extensional Whoever has read access to sees all the relation 35

Abiteboul – DBPL Fine grain access control [at Alice] : [at Bob] :- is intensional Sue who has read privilege to and alicePhotos only, can see only the photos of Alice in allPhotos Lili who has read privilege to the three relations, sees everything 36

Abiteboul – DBPL Overwriting the default for intensional data Let us change the rule to: [at Alice] :- Issue: you can read the photos only if you also have read privilege to 37

Abiteboul – DBPL Overwriting the default for intensional data [at Alice] :- [hide Hide: block the provenance from Similar mechanism for extensional data – expose 38

Abiteboul – DBPL Issues with non local rules [at Bob] hate you”) :- :- Ignoring access rights, by delegation, this results in running [at Alice] hate you”) :- :- 39

Abiteboul – DBPL Default solution: sand box We run the rule at Alice in a Sandbox We use the access rights of Bob So the second rule does not succeed in sending secrets The message specifies that this is done at Bob’s request So requires authentication/signatures Alternative: delegation without sandbox. Possible if the peer that asks for the delegation is given the privilege to install rules at the other peer – Here if Alice gives Bob the right to install a rule in her environment 40

Abiteboul – DBPL Access control implementation A program with access control is compiled locally in a Webdamlog program without that is executed Access control data is managed like any other data Relation acl (defines relation access) Relation kind (ext or int) Based on provenance implemented as a distributed graph On-going work on optimization 41

Abiteboul – DBPL The Web as a distributed knowledge base Webdamlog: a rule-based language for the Web Access control in Webdamlog The Webdamlog system

Abiteboul – DBPL The Webdamlog engine Based on Bud developed at UC Berkeley Manages knowledge  Stores facts and rules  exchanges knowledge with other engines  performs reasoning

Abiteboul – DBPL The engine: beyond Bud Compilation of (Bud’s language) Main Webdamlog features not supported by Bud 1. Variable relation and peer names 2. Delegations with dynamic changes of the program Webdamlog+AC ⇒ Webdamlog ⇒ Bloom

Abiteboul – DBPL The Webdamlog peer Support communication with other peers and with users Support common security protocols Support wrappers to external systems such as Facebook Provides Web interfaces

Abiteboul – DBPL Provenance graphs Records the history of derivation Provenance semiring semantics [Green et al. 07] Used for performance optimization Used for fine grain access control Other possible uses such as explanation of results

Abiteboul – DBPL The Web as a distributed knowledge base Webdamlog: a rule-based language for the Web Access control in Webdamlog The Webdamlog system Conclusion

Abiteboul – DBPL Thesis Let us turn the Web into a distributed knowledge base with billions of users supported by billions of systems analyzing information extracting knowledge exchanging knowledge inferring knowledge 48

Abiteboul – DBPL Webdamlog Language A language for distributed data management [PODS 2011] Datalog with distribution, updates, messaging Main novelty: delegation Implementation WebdamExchange peer in Java [demo ICDE 2011] Webdamlog engine based on Bud [demo Sigmod 2013] Access control: on-going work with Miklau- Stoyanovich Probabilistic Webdamlog: on-going work with Deutch- Vianu 49

Cambridge University Press, Grazie !