Private Information Retrieval Amir Houmansadr CS660: Advanced Information Assurance Spring 2015 Content may be borrowed from other resources. See the last.

Presentation transcript:

Private Information Retrieval Amir Houmansadr CS660: Advanced Information Assurance Spring 2015 Content may be borrowed from other resources. See the last slide for acknowledgements!

AOL search data scandal (2006). Queries logged under one user number: clothes for age single men; best retirement city; jarrett arnold; jack t. arnold; jaylene and jarrett arnold; gwinnett county yellow pages; rescue of older dogs; movies for dogs; sinus infection. The user: Thelma Arnold, 62-year-old widow, Lilburn, Georgia.

Observation: The owners of the database know a lot about the users! This poses a risk to users’ privacy. E.g. consider a database with stock prices… Can we do something about it? Yes, we can: trust the owners to protect our privacy, or use cryptography! Really?

How can crypto help? Note: this problem has nothing to do with side-channels, website fingerprinting, etc. (Diagram: user U and database D.)

Threat Model (diagram: user U talks to database D over a secure link). A new primitive: Private Information Retrieval (PIR).

Private Information Retrieval (PIR) [CGKS95] Goal: allow user to query database while hiding the identity of the data-items she is after. Note: hides identity of data-items; not existence of interaction with the user. Motivation: patient databases; stock quotes; web access; many more.... Paradox(?): imagine buying in a store without the seller knowing what you buy. (Encrypting requests is useful against third parties; not against owner of data.)

Model. Server: holds an n-bit string x (n should be thought of as very large). User: wishes to retrieve x_i and to keep i private.

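Private Information Retrieval (PIR) (diagram: SERVER holds x = x_1, x_2, ..., x_n ∈ {0,1}^n; USER holds i ∈ {1,…,n} and obtains x_i; the server cannot tell whether the user asked for i or for some other index j).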

Non-Private Protocol: USER sends i ∈ {1,…,n} to SERVER; SERVER, holding x = x_1, x_2, ..., x_n, returns x_i. Communication: 1. NO privacy!!!

Trivial Private Protocol: SERVER sends the entire database x = x_1, x_2, ..., x_n to USER, who reads off x_i. Information-theoretic privacy. Communication: n. Not optimal!
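
The two baseline protocols can be contrasted in a few lines. This is a minimal sketch, not taken from the slides: the function names are hypothetical, indices are 0-based, and communication is counted as upstream plus downstream bits (the slide counts only the 1-bit answer for the non-private protocol).

```python
import math

def non_private_retrieve(x, i):
    """User sends i in the clear; the server returns x[i]."""
    server_view = i  # the server learns exactly which index was requested
    upstream_bits = max(1, math.ceil(math.log2(len(x))))  # bits needed to name an index
    downstream_bits = 1                                    # the single answer bit
    return x[i], server_view, upstream_bits + downstream_bits

def trivial_private_retrieve(x, i):
    """Server sends the whole database; the user selects x[i] locally."""
    server_view = None   # nothing that depends on i ever leaves the user
    received = list(x)   # n bits travel downstream
    return received[i], server_view, len(x)
```

The first function is cheap but hands i to the server; the second hides i perfectly but costs n bits. PIR protocols aim for the privacy of the second at a cost closer to the first.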

Other solutions? (1) User asks for additional random indices. Drawback: leaks information, reduces communication efficiency. (2) Employ general crypto protocols to compute x_i privately. Drawback: highly inefficient (polynomial in n). (3) Anonymity (e.g., via Anonymizers). Note: different concern: hides identity of user; not the fact that x_i is retrieved.

Two Approaches for PIR. Information-Theoretic PIR [CGKS95,Amb97,...]: replicate the database among k servers; the user queries all the servers. Computational PIR [CG97,KO97,CMS99,...]: computational privacy, based on cryptographic assumptions.

Known Comm. Upper Bounds (all sub-linear in n). Multiple servers, information-theoretic PIR: 2 servers, comm. n^{1/3} [CGKS95]; k servers, comm. n^{1/Ω(k)} [CGKS95, Amb96,…,BIKR02]; log n servers, comm. poly(log n) [BF90, CGKS95]. Single server, computational PIR: comm. poly(log n) under appropriate computational assumptions [KO97,CMS99].

Approach I: k-Server PIR (diagram: user U holds i; servers S_1, S_2, ..., S_k each hold x ∈ {0,1}^n). Correctness: User obtains x_i. Privacy: No single server gets information about i.

A 2-Server Information-Theoretic PIR (diagram: user U holds i ∈ {1,…,n}; servers S_1 and S_2 each hold x).

Protocol I: 2-server PIR. U picks a uniformly random subset Q_1 of {1,…,n} and sends it to S_1; U sends Q_2 = Q_1 ⊕ {i} (Q_1 with the membership of i flipped) to S_2. Each server replies with the XOR of the bits x_j for all j in the query set it received, and the XOR of the two replies is x_i. Each query on its own is a uniformly random subset, so a single server learns nothing about i. Each query is n bits long. Weakness: Servers should not collude!
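
The two-server protocol can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, indices are 0-based, subsets are represented as 0/1 characteristic vectors, and both servers are simulated in one process.

```python
import secrets

def make_queries(n, i):
    """User side: build two subset queries, each uniformly random on its own."""
    q1 = [secrets.randbelow(2) for _ in range(n)]  # characteristic vector of a random subset Q1
    q2 = list(q1)
    q2[i] ^= 1                                     # Q2 = Q1 with the membership of i flipped
    return q1, q2

def server_answer(x, q):
    """Server side: XOR of the database bits selected by the query."""
    a = 0
    for bit, selected in zip(x, q):
        a ^= bit & selected
    return a

def retrieve(x, i):
    q1, q2 = make_queries(len(x), i)
    a1 = server_answer(x, q1)  # computed by S1, which sees only q1
    a2 = server_answer(x, q2)  # computed by S2, which sees only q2
    return a1 ^ a2             # every index except i cancels out
```

Every index other than i appears in both queries or in neither, so its bit cancels in the final XOR. This basic version sends n bits to each server; the sub-linear bounds quoted earlier in the deck need more refined constructions.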

Computational PIR. Only one server, no need to trust. Based on cryptographic assumptions. Downside: the server has to run over the whole database for every query, otherwise it leaks information, so the computation load on the server is high. CS660 - Advanced Information Assurance - UMassAmherst
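
To make the single-server idea concrete, here is a toy sketch in the style of the quadratic-residuosity construction usually attributed to [KO97]. The slides do not spell out any particular scheme, so treat the details as assumptions: the database is arranged as a bit matrix, the function names are hypothetical, and the modulus is built from two small known Mersenne primes, which is far too weak for real use.

```python
import math
import secrets

# Toy parameters (assumption): two known Mersenne primes. Not secure.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q  # the server sees N but not its factors

def is_residue(a, p):
    """Euler's criterion: is a a nonzero square modulo the odd prime p?"""
    return pow(a, (p - 1) // 2, p) == 1

def random_unit():
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            return r

def random_square():
    r = random_unit()
    return r * r % N

def random_pseudosquare():
    """A non-square modulo both P and Q; without the factors it looks like a square."""
    while True:
        r = random_unit()
        if not is_residue(r, P) and not is_residue(r, Q):
            return r

def make_query(num_cols, col):
    """User side: one number per column; only the wanted column gets a non-square."""
    return [random_pseudosquare() if j == col else random_square()
            for j in range(num_cols)]

def server_answer(db, query):
    """Server side: touches every bit of the database, one product per row."""
    answers = []
    for row in db:
        z = 1
        for bit, y in zip(row, query):
            z = z * (y if bit else y * y) % N
        answers.append(z)
    return answers

def decode(answers, row):
    """User side: the row's answer is a square exactly when the wanted bit is 0."""
    return 0 if is_residue(answers[row], P) else 1
```

The server's loop visits every database bit, which illustrates the computation cost noted on the slide. With a square-shaped matrix, both the query and the answer hold about √n numbers.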

PIR-Tor: Scalable Anonymous Communication Using Private Information Retrieval. Prateek Mittal, University of Illinois Urbana-Champaign. Joint work with: Femi Olumofin (U Waterloo), Carmela Troncoso (KU Leuven), Nikita Borisov (U Illinois), Ian Goldberg (U Waterloo). Original slides from the authors, USENIX Security 2011.

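Tor Background (diagram: a trusted directory authority issues a signed server list (relay descriptors) to the directory servers; the client asks the directory servers "List of servers?" and builds a circuit through guard, middle, and exit relays). Relay selection considers: 1. Load balancing 2. Exit policy.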

Performance Problem in Tor’s Architecture: Global View. Every client asks the directory servers for the list of servers. Global view: not scalable [Torsk, CCS09]. Need solutions without a global system view.

Current Solution: Peer-to-peer Paradigm. Morphmix [WPES 04]: broken [PETS 06]. Salsa [CCS 06]: broken [CCS 08, WPES 09]. NISAN [CCS 09]: broken [CCS 10]. Torsk [CCS 09]: broken [CCS 10]. ShadowWalker [CCS 09]: broken and fixed(??) [WPES 10]. Very hard to argue security of a distributed, dynamic and complex P2P system.

Design Goals. A scalable client-server architecture with easy-to-analyze security properties: avoid increasing the attack surface. Equivalent security to Tor. Preserve Tor’s constraints: guard/middle/exit relays, load balancing. Minimal changes: only the relay selection algorithm.

Key Observation. Need only 18 random middle/exit relays in 3 hours, so don’t download all 2000! Naïve approach: download a few random relays from directory servers. Problems: malicious servers; route fingerprinting attacks. Example: Bob asks a directory server for relays #10 and #25 and receives "10: IP address, key; 25: IP address, key"; a circuit through relays 10 and 25 then supports the inference that the user is likely to be Bob. Instead: download selected relay descriptors without letting directory servers know the information we asked for, using Private Information Retrieval (PIR).

Private Information Retrieval (PIR). Information-theoretic PIR: multi-server protocol; a threshold number of servers don’t collude. Computational PIR: single-server protocol; computational assumption on the server. Only ITPIR-Tor in this talk; see the paper for CPIR-Tor.

ITPIR-Tor: Database Locations. Tor places significant trust in guard relays: 3 compromised guard relays suffice to undermine user anonymity in Tor. Choose the client’s guard relays to be directory servers. Case 1: at least one guard relay is honest, so ITPIR guarantees user privacy. Case 2: all guard relays are compromised, so ITPIR does not provide privacy, but in this case Tor anonymity is broken anyway (exit relay compromised: end-to-end timing analysis; exit relay honest: deny service). Result: equivalent security to the current Tor network.
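
The "at least one honest guard" guarantee can be illustrated with a simple XOR-sharing scheme over fixed-size records. This is a simplified stand-in, not the Percy scheme used in the evaluation: the function names are hypothetical, records are assumed to be padded to equal length, and three servers are simulated in one process.

```python
import secrets

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_queries(num_records, index, num_servers):
    """Split the selection vector for `index` into XOR shares, one per server."""
    shares = [[secrets.randbelow(2) for _ in range(num_records)]
              for _ in range(num_servers - 1)]
    last = [0] * num_records
    last[index] = 1
    for share in shares:
        last = [a ^ b for a, b in zip(last, share)]
    return shares + [last]  # all shares XOR together to the unit vector at `index`

def server_answer(records, query):
    """Server side: XOR of all records selected by this server's share."""
    out = bytes(len(records[0]))
    for record, selected in zip(records, query):
        if selected:
            out = xor_bytes(out, record)
    return out

def retrieve(records, index, num_servers=3):
    result = bytes(len(records[0]))
    for query in make_queries(len(records), index, num_servers):
        result = xor_bytes(result, server_answer(records, query))
    return result
```

Any set of shares that leaves one out is uniformly random, so the servers learn the index only if all of them pool their queries. That mirrors the slide's two cases: one honest guard keeps the query private, and privacy is lost only when every guard is compromised.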

ITPIR-Tor Database Organization and Formatting. Middles, exits: separate databases. Exit policies: standardized exit policies; relays grouped by exit policy (Exit Policy 1, Exit Policy 2, non-standard exit policies). Load balancing: relay descriptors sorted by bandwidth. (Diagram: middles m1 to m8 and exits e1 to e8, sorted by bandwidth.)

ITPIR-Tor Architecture (diagram: trusted directory authority; guard relays acting as PIR directory servers; middle and exit databases). Steps: 1. Download PIR database. 2. Initial connect. 3. Signed meta-information. 4. Load balanced index selection. 5. 18 PIR queries (1 middle/exit). 6. PIR response.
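
Step 4 can be sketched as a bandwidth-weighted draw of database indices, which the client would then fetch through PIR queries. The slides do not give the format of the signed meta-information, so this sketch assumes it yields one bandwidth value per database index; the function name is hypothetical.

```python
import random

def pick_relay_indices(bandwidths, count, rng=None):
    """Draw `count` database indices with probability proportional to bandwidth.

    `bandwidths[k]` is the (assumed) advertised bandwidth of the relay stored at
    index k of the middle or exit database.
    """
    rng = rng or random.SystemRandom()  # OS-backed randomness for relay selection
    return rng.choices(range(len(bandwidths)), weights=bandwidths, k=count)
```

For example, `pick_relay_indices([900, 50, 30, 20], 18)` returns 18 indices, most of them pointing at the high-bandwidth relay at index 0. Because the draw happens on the client, the directory servers do not see which indices were chosen.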

Performance Evaluation. Percy [Goldberg, Oakland 2007]: multi-server ITPIR scheme. Testbed: 2.5 GHz, Ubuntu. Descriptor size: 2100 bytes (max size in the current database). Exit database size: half of the middle database. Methodology: vary the number of relays; measure total communication and server computation.

Performance Evaluation: Communication Overhead. Current Tor network: 5x--100x improvement. Advantage of PIR-Tor becomes larger due to its sublinear scaling: 100x--1000x improvement. (Figure data labels: 1.1 MB, 216 KB, 12 KB.)

Performance Evaluation: Server Computational Overhead. Current Tor network: less than 0.5 sec. 100,000 relays: about 10 seconds (does not impact user latency).

Performance Evaluation: Scaling Scenarios. Table columns: Scenario / Explanation; Relays; Clients; Tor Communication (per client); ITPIR Communication (per client); ITPIR Core Utilization. Rows as extracted: Current Tor: 2,000 250, MB 0.2 MB 0.425 %. 10x relay/client: 20,000 2.5M 11 MB 0.5 MB 4.25 %. Clients turn relays: 250, MB 1.7 MB 0.425 %.

Conclusion. PIR can be used to replace descriptor download in Tor. Improves scalability: 10x current network size is very feasible; 100x current network size is plausible. Easy to understand security properties. Side conclusion: Yes, PIR can have practical uses! Questions?

Acknowledgement. Some of the slides, content, or pictures are borrowed from the following resources, and some pictures are obtained through Google search without being referenced below: Stefan Dziembowski, Private Information Retrieval; Amos Beimel, Private Information Retrieval; Prateek Mittal, PIR-Tor.