Download presentation
Presentation is loading. Please wait.
Published byCecil Dalton Modified over 9 years ago
1
Enabling Sovereign Information Sharing Using Web Services R. Agrawal, D. Asonov, R. Srikant IBM Almaden Research Center P. Baliga, L. Liang, B. Porst Additional contributors: P. Baliga, L. Liang, B. Porst
2
A Reusable Platform for Building Sovereign Information Sharing Applications R. Agrawal, D. Asonov, P. Baliga, L. Liang, B. Porst R. Srikant IBM Almaden Research Center
3
Outline Background Background Implementation Architecture Implementation Architecture Resource discovery, Schema mapping, and Authentication Resource discovery, Schema mapping, and Authentication Performance Performance Conclusion Conclusion R. Agrawal, A. Evfimievski, R. Srikant. Information Sharing Across Private Databases. SIGMOD 03.
4
Assumption: Information in each database can be freely shared. Information Integration Today Mediator Q R Federated Q R Centralized
5
Need for a new style of information sharing Compute queries across databases so that no more information than necessary is revealed (without using a trusted third party). Compute queries across databases so that no more information than necessary is revealed (without using a trusted third party). Need is driven by several trends: Need is driven by several trends: –End-to-end integration of information systems across companies (virtual organizations) –Simultaneously compete and cooperate. –Security: need-to-know information sharing
6
Security Application Security Agency finds those passengers who are in its list of suspects, but not the names of other passengers. Security Agency finds those passengers who are in its list of suspects, but not the names of other passengers. Airline does not find anything. Airline does not find anything. Agency Suspect List Airline Passenger List http://www.informationweek.com/story/showArticle.jhtml?articleID=184010%79
7
Medical Research Validate hypothesis between adverse reaction to a drug and a specific DNA sequence. Validate hypothesis between adverse reaction to a drug and a specific DNA sequence. Researchers should not learn anything beyond 4 counts: Researchers should not learn anything beyond 4 counts: Mayo Clinic DNA Sequences Drug Reactions Adverse Reaction No Adv. Reaction Sequence Present ?? Sequence Absent ??
8
R S R must not know that S has b & y S must not know that R has a & x u v RSRSau v x bu v y R S Count (R S) R & S do not learn anything except that the result is 2. Minimal Necessary Sharing
9
Problem Statement: Minimal Sharing Given: Given: –Two parties (honest-but-curious): R (receiver) and S (sender) –Query Q spanning the tables R and S –Additional (pre-specified) categories of information I Compute the answer to Q and return it to R without revealing any additional information to either party, except for the information contained in I Compute the answer to Q and return it to R without revealing any additional information to either party, except for the information contained in I –For intersection, intersection size & equijoin, I = { |R|, |S| } –For equijoin size, I also includes the distribution of duplicates & some subset of information in R S
10
A Possible Approach Secure Multi-Party Computation Secure Multi-Party Computation –Given two parties with inputs x and y, compute f(x,y) such that the parties learn only f(x,y) and nothing else. –Can be solved by building a combinatorial circuit, and simulating that circuit [Yao86]. Prohibitive cost for database-size problems. Prohibitive cost for database-size problems. –Intersection of two relations of a million records each would require 144 days (Yao’s protocol)
11
Intersection Protocol RS R S Secret key a b f b (S ) Shorthand for { f b (s) | s S } Commutative Encryption f a (f b (s)) = f b (f a (s)) f(s,b,p) = s b mod p
12
R Intersection Protocol S R S f b (S) f a (f b (S )) a b f b (f a (S )) Commutative property
13
R Intersection Protocol S R S f a (R ) f b (f a (S )) { } a b { } Since R knows
14
Implementation: Grid of Data Services DP DB Server meta data Data Provider SIS Server n DP DB Server meta data Data Provider SIS Server 1 Application SIS Client User Application Developer Client Metadata SIS Platform Constructs web service query requests against multiple data providers, and collects responses. Mapping information and data provider access information. Thin layer on top of the SIS client: invokes the required SIS operations, provides an interface to a SIS user. Includes view information to retrieve data from the data provider database, database access information, and context information. Provides the necessary functionality on the data provider side to enable sovereign sharing. Templates to aid application development
15
Implementation Environment Data resides in DB2 v.8.1. database systems, installed on 2.4GHz/ 512MB RAM Intel workstations, connected by a 100Mbit LAN network. Data resides in DB2 v.8.1. database systems, installed on 2.4GHz/ 512MB RAM Intel workstations, connected by a 100Mbit LAN network. Web services run on top of the IBM WebSphere Application Server v.5.0 and use Apache AXIS v.1.1. SOAP library for messaging. Web services run on top of the IBM WebSphere Application Server v.5.0 and use Apache AXIS v.1.1. SOAP library for messaging. IBM private UDDI registry installed on one of the machines. IBM private UDDI registry installed on one of the machines.
16
Issues How does the application developer find the necessary data sources and their schemas? How does the application developer find the necessary data sources and their schemas? – need a resource discovery mechanism How does the application developer link the data between different providers? How does the application developer link the data between different providers? –need a schema mapping mechanism How to ensure that only eligible users can carry out the computation? How to ensure that only eligible users can carry out the computation? –need an authentication mechanism
17
Resource Discovery Problem: Finding the necessary data sources and their schemas Problem: Finding the necessary data sources and their schemas Solution: Employ a UDDI registry to store and search Solution: Employ a UDDI registry to store and search –data providers and operations they support –available schemas for each data provider TSA and AL publish their schemas in the Business Services elements of the private UDDI. TSA and AL publish their schemas in the Business Services elements of the private UDDI. <Table Name=“Passenger List” Schema=“AL”> Passenger Name … <Table Name=“Suspect List” Schema=“TSA”> Suspect Name …
18
Schema Mapping 1. Global: Data providers use a standard domain-specific vocabulary (e.g. Rosetta Net) and schema.Data providers use a standard domain-specific vocabulary (e.g. Rosetta Net) and schema. Data providers map local schema to global schema.Data providers map local schema to global schema. 2. Application Specific: Every data provider maps local schema to the application schema, separately for each application.Every data provider maps local schema to the application schema, separately for each application. Every data provider updates its mapping as the application evolves.Every data provider updates its mapping as the application evolves. 3. Sovereign: Data providers publish schemas in their own vocabularies.Data providers publish schemas in their own vocabularies. Developers link the schemas.Developers link the schemas. - Least burden on data providers - Maximal autonomy for data providers and developers
19
Schema Mapping: TSA-AL scenario <Table Name=“Passenger List” Schema=“AL”> Passenger Name … <Table Name=“SuspectList” Schema=“TSA”> Suspect Name … Airline schemaTSA schema The application developer determines mapping (possibly negotiating using the information in the Business Entity element of the UDDI registry)
20
Authentication Across Multiple Domains Auth Token User: Bob Authentication Authority (AA) Username: Bob Password: **** Authenticate Bob Certified by AA Client application 1 User: Bob Certified by AA Sovereign Airlines Access Manager Web Service Request 4 Application receives an authentication token from AA 23 The token is used to authenticate the client application to the client web service.
21
Authentication Across Multiple Domains Auth Token User: Bob Authentication Authority (AA) Username: Bob Password: **** Authenticate Bob Certified by AA Client application 1 User: Bob Certified by AA Sovereign Airlines Access Manager Web Service Request 4 Certified by SA Token Validation Web Service Request User: Bob Certified by AA 5 6 TSA Access Manager 23 Client web service signs the token from AA to authenticate the request to the server web service.
22
Authentication Across Multiple Domains Auth Token User: Bob Authentication Authority (AA) Username: Bob Password: **** Authenticate Bob Certified by AA Client application 1 User: Bob Certified by AA Sovereign Airlines Access Manager Web Service Request 4 Certified by SA Token Validation Web Service Request User: Bob Certified by AA 5 6 TSA Access Manager Web Service Response 7 Token Validation Web Service Response 9 8 23
23
Security Application: Execution TSA encrypts Suspect Name column of Suspect List table. TSA encrypts Suspect Name column of Suspect List table. TSA sends an intersection web-service request to AL, with the encrypted Suspect List as a SOAP attachment. TSA sends an intersection web-service request to AL, with the encrypted Suspect List as a SOAP attachment. AL encrypts Passenger Name column of Passenger List table and double-encrypts the encrypted Suspect List from TSA. AL encrypts Passenger Name column of Passenger List table and double-encrypts the encrypted Suspect List from TSA. AL sends a web-service response to TSA with both encrypted tables as attachments. AL sends a web-service response to TSA with both encrypted tables as attachments. TSA double-encrypts Passenger List from AL. TSA double-encrypts Passenger List from AL. Finally, TSA uses both double-encrypted tables to perform the intersection and returns the results to the application. Finally, TSA uses both double-encrypted tables to perform the intersection and returns the results to the application.
24
PerformanceImplementationms Java program 32 Java DB2 UDF 33-34 Exponentiation time for one number (Intel P3) 65 ms MS Visual C++ (Crypto++ library)
25
Making Encryption Faster: Software Approaches The main component of encryption is exponentiation: enc(x, k, p) = x k mod p The main component of encryption is exponentiation: enc(x, k, p) = x k mod p Tried custom implementations of exponentiation that used preprocessing based on Tried custom implementations of exponentiation that used preprocessing based on –fixed exponent (k) –fixed base (x) Fixed exponent implementation turned out to be slower than the Java native implementation Fixed exponent implementation turned out to be slower than the Java native implementation Fixed based is beneficial if the same value is encrypted multiple times with different keys (not useful for intersection where each value is encrypted once) Fixed based is beneficial if the same value is encrypted multiple times with different keys (not useful for intersection where each value is encrypted once)
26
Making Encryption Faster: Hardware Accelerator Use SSL card to speed-up exponentiation Use SSL card to speed-up exponentiation Multiple threads (100+) must post exponentiation request simultaneously to the card API to get the advertised speed-up Multiple threads (100+) must post exponentiation request simultaneously to the card API to get the advertised speed-up AEP scheduler distributes exponentiation requests between multiple cards automatically; linear speed-up AEP scheduler distributes exponentiation requests between multiple cards automatically; linear speed-up Example: AEP SSL CARD Runner 2000 ≈ $2k
27
Execution time: Encryption UDF Encryption Engine Number of rows in the table 1,0005,00010,000 CPU Intel III 2.0 Ghz 34s 175s 320s AEP Runner 2000 3.5s 19s 37s
28
Application Performance Encryption speed is 20K encryptions per minute using one accelerator card ($2K per card) Encryption speed is 20K encryptions per minute using one accelerator card ($2K per card) TSA-Airline: 150,000 (daily) passengers and 1 million people in the watch list: TSA-Airline: 150,000 (daily) passengers and 1 million people in the watch list: 120 minutes with one accelerator card 12 minutes with ten accelerator cards Epidemiological research: 1 million patient records in the hospital and 10 million records in the Genebank: Epidemiological research: 1 million patient records in the hospital and 10 million records in the Genebank: 37 hours with one accelerator cards 3.7 hours with ten accelerator cards
29
Related Work [Naor & Pinkas 99]: Two protocols for list intersection problem [Naor & Pinkas 99]: Two protocols for list intersection problem –Oblivious evaluation of n polynomials of degree n each. –Oblivious evaluation of n 2 linear polynomials. [Huberman et al 99]: find people with common preferences, without revealing the preferences. [Huberman et al 99]: find people with common preferences, without revealing the preferences. –Intersection protocols are similar [Clifton et al, 2003]: Secure set union and set intersection [Clifton et al, 2003]: Secure set union and set intersection –Similar protocols
30
Summary and Challenges New applications require us to go beyond traditional centralized and federated information integration: sovereign information integration New applications require us to go beyond traditional centralized and federated information integration: sovereign information integration Demonstrated feasibility of realizing sovereign sharing Demonstrated feasibility of realizing sovereign sharing Need models of minimal disclosure and corresponding protocols for Need models of minimal disclosure and corresponding protocols for –other database operations –combination of operations Need faster commutative encryption Need faster commutative encryption Need further study of tradeoff between efficiency and Need further study of tradeoff between efficiency and –additional information disclosed –approximation
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.