Auditing Information Leakage for Distance Metrics Yikan Chen David Evans www.securecomputation.org TexPoint fonts used in EMF. Read the TexPoint manual.

Slides:



Advertisements
Similar presentations
Review: Search problem formulation
Advertisements

Efficient Information Retrieval for Ranked Queries in Cost-Effective Cloud Environments Presenter: Qin Liu a,b Joint work with Chiu C. Tan b, Jie Wu b,
Discovering Lag Interval For Temporal Dependencies Larisa Shwartz Liang Tang, Tao Li, Larisa Shwartz1 Liang Tang, Tao Li
Solving Problem by Searching
School of Computer Science and Engineering Finding Top k Most Influential Spatial Facilities over Uncertain Objects Liming Zhan Ying Zhang Wenjie Zhang.
Maintaining Variance and k-Medians over Data Stream Windows Brian Babcock, Mayur Datar, Rajeev Motwani, Liadan O’Callaghan Stanford University.
Serverless Search and Authentication Protocols for RFID Chiu C. Tan, Bo Sheng and Qun Li Department of Computer Science College of William and Mary.
Los Angeles September 27, 2006 MOBICOM Localization in Sparse Networks using Sweeps D. K. Goldenberg P. Bihler M. Cao J. Fang B. D. O. Anderson.
Advanced Topics in Algorithms and Data Structures Lecture 6.1 – pg 1 An overview of lecture 6 A parallel search algorithm A parallel merging algorithm.
Advanced Topics in Algorithms and Data Structures Page 1 Parallel merging through partitioning The partitioning strategy consists of: Breaking up the given.
Seminar in Foundations of Privacy 1.Adding Consistency to Differential Privacy 2.Attacks on Anonymized Social Networks Inbal Talgam March 2008.
Review: Search problem formulation
How Should We Solve Search Problems Privately? Kobbi Nissim – BGU A. Beimel, T. Malkin, and E. Weinreb.
1 Jun Wang, 2 Sanjiv Kumar, and 1 Shih-Fu Chang 1 Columbia University, New York, USA 2 Google Research, New York, USA Sequential Projection Learning for.
Parallel Merging Advanced Algorithms & Data Structures Lecture Theme 15 Prof. Dr. Th. Ottmann Summer Semester 2006.
Computing Sketches of Matrices Efficiently & (Privacy Preserving) Data Mining Petros Drineas Rensselaer Polytechnic Institute (joint.
1 An Empirical Study on Large-Scale Content-Based Image Retrieval Group Meeting Presented by Wyman
Preference Analysis Joachim Giesen and Eva Schuberth May 24, 2006.
Parallel Computation in Biological Sequence Analysis Xue Wu CMSC 838 Presentation.
Foundations of Privacy Lecture 11 Lecturer: Moni Naor.
Online Learning Algorithms
Impact Analysis of Database Schema Changes Andy Maule, Wolfgang Emmerich and David S. Rosenblum London Software Systems Dept. of Computer Science, University.
C&O 355 Mathematical Programming Fall 2010 Lecture 17 N. Harvey TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AA A.
Time Complexity Dr. Jicheng Fu Department of Computer Science University of Central Oklahoma.
A Secure Protocol for Computing Dot-products in Clustered and Distributed Environments Ioannis Ioannidis, Ananth Grama and Mikhail Atallah Purdue University.
Preserving Link Privacy in Social Network Based Systems Prateek Mittal University of California, Berkeley Charalampos Papamanthou.
Basel Alomair, Krishna Sampigethaya, and Radha Poovendran University of Washington TexPoint fonts used in EMF.
Statistical Databases – Query Auditing Li Xiong CS573 Data Privacy and Anonymity Partial slides credit: Vitaly Shmatikov, Univ Texas at Austin.
Problems and MotivationsOur ResultsTechnical Contributions Membership: Maintain a set S in the universe U with |S| ≤ n. Given an x in U, answer whether.
Lecture 2 Computational Complexity
Distributed Protein Structure Analysis By Jeremy S. Brown Travis E. Brown.
Bandwidth Estimation TexPoint fonts used in EMF.
Ranking Queries on Uncertain Data: A Probabilistic Threshold Approach Wenjie Zhang, Xuemin Lin The University of New South Wales & NICTA Ming Hua,
Click to edit Present’s Name Xiaoyang Zhang 1, Jianbin Qin 1, Wei Wang 1, Yifang Sun 1, Jiaheng Lu 2 HmSearch: An Efficient Hamming Distance Query Processing.
Continued Fractions, Euclidean Algorithm and Lehmer’s Algorithm Applied Symbolic Computation CS 567 Jeremy Johnson TexPoint fonts used in EMF. Read the.
Analysis of Algorithms
Chapter 3 Sec 3.3 With Question/Answer Animations 1.
Towards Robust Indexing for Ranked Queries Dong Xin, Chen Chen, Jiawei Han Department of Computer Science University of Illinois at Urbana-Champaign VLDB.
Disclosure risk when responding to queries with deterministic guarantees Krish Muralidhar University of Kentucky Rathindra Sarathy Oklahoma State University.
Major objective of this course is: Design and analysis of modern algorithms Different variants Accuracy Efficiency Comparing efficiencies Motivation thinking.
David Luebke 1 10/25/2015 CS 332: Algorithms Skip Lists Hash Tables.
Leonardo Guerreiro Azevedo Geraldo Zimbrão Jano Moreira de Souza Approximate Query Processing in Spatial Databases Using Raster Signatures Federal University.
Spatio-temporal Pattern Queries M. Hadjieleftheriou G. Kollios P. Bakalov V. J. Tsotras.
Exhaustion, Branch and Bound, Divide and Conquer.
Exact indexing of Dynamic Time Warping
1 Longest Common Subsequence as Private Search Payman Mohassel and Mark Gondree U of CalgaryNPS.
Machine Learning and Data Mining Clustering (adapted from) Prof. Alexander Ihler TexPoint fonts used in EMF. Read the TexPoint manual before you delete.
By: Gang Zhou Computer Science Department University of Virginia 1 Medians and Beyond: New Aggregation Techniques for Sensor Networks CS851 Seminar Presentation.
1 Diffie-Hellman (Key Exchange) Protocol Rocky K. C. Chang 9 February 2007.
C&O 355 Lecture 19 N. Harvey TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: A A A A A A A A A A.
Mustafa Gokce Baydogan, George Runger and Eugene Tuv INFORMS Annual Meeting 2011, Charlotte A Bag-of-Features Framework for Time Series Classification.
On the R ange M aximum-Sum S egment Q uery Problem Kuan-Yu Chen and Kun-Mao Chao Department of Computer Science and Information Engineering, National Taiwan.
CMPT 438 Algorithms.
Introduction to Algorithms: Divide-n-Conquer Algorithms
Recursive Algorithms Section 5.4.
Advanced Algorithms Analysis and Design
Private Data Management with Verification
On the Size of Pairing-based Non-interactive Arguments
MPC and Verifiable Computation on Committed Data
The Variable-Increment Counting Bloom Filter
Privacy Preserving Similarity Evaluation of Time Series Data
Machine Learning and Data Mining Clustering
Time Series Filtering Time Series
Spatio-temporal Pattern Queries
Ioannis Ioannidis, Ananth Grama and Ioannis Ioannidis
High-Level Synthesis for Side-Channel Defense
Objective of This Course
Minimizing the Aggregate Movements for Interval Coverage
Published in: IEEE Transactions on Industrial Informatics
Efficient Aggregation over Objects with Extent
Presentation transcript:

Auditing Information Leakage for Distance Metrics Yikan Chen David Evans TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAAA A A PASSAT 2011, Boston, MA Oct11, 2011

Motivating Scenario ServerClient ? ………. ? How to determine whether it is safe to release the response?

From the Attacker’s View Hamming distance between X and Q Smallest-Entropy Strategy [Neuwirth, 1982]

Contribution Target: class of distance functions An efficient query auditor to estimate maximum information leakage Technique for single bit information auditing

Strategy Determine # of secrets consistent with all queries – For a special class of functions: Reduce to 0-1 integer programming Too expensive to compute exhaustively – Use heuristics [find a meaningful bound] Assumption – clients are authenticated by the server – clients cannot share information with each other

Additively Decomposable Functions Definition A function, f (A, B), where A, B, is additively decomposable if: Edit distance Ins,del and sub Chebyshev distance Lee distance Hamming distance Squared Euclidean distance Manhattan distance

Influence Function Influence function for any sequence X’: For additively decomposable functions, for each bit is independent and for any consistent sequence over a set of queries: : changes made on ith bit of X Hamming distance:

Reduction for Hamming Distance Convert a Hamming distance sequence leakage problem to: where AK=0 X=1111Q 1 =1100 Q 2 =1110 find the number of possible 0-1 solutions for K Example:

Lower bound is good enough! Divide-and-merge algorithm Computing Number of Solutions Exact methods: exponential growth Exhaustive Search Gröbner Basis ([Bertsimas, 2000])

Divide-and-merge Algorithm Divide and exhaustive search Divide matrix A into small blocks Analyze each output exhaustively Sort and select Sort output Select best combinations Merge Merge adjacent blocks AK=0 Find # of 0-1 solutions for the equation set: X, {Q i }

Complexity r = 3 A= linear to N linear to r 2 linear to N linear to r 2 X= Q 1 = Q 2 = AK=0

Evaluation Implementation experiment settings Matlab R2010b, 4 GB RAM 2.13 GHz CPU Experiment settings Averaging from 5 experiments Scalability Tightness Real Application Performance (Iris Recognition)

Evaluation Scalability M: #queries

Evaluation Tightness See the paper for detailed results and performance on iris application

Related Work Estimating Information Leakage – No auditor ([Goodrich, Oakland 2009]) – Weak auditor ([Wang, Wang et al., CCS 2009]) Differential Privacy – Add noise to protect secret information ([Dwork, ICALP 2006]) Auditing Aggregate Queries – Auditing SUM, MEAN or MAX queries over a set of private entries in a database ([Elshiekh, 2008],[Nabar et al., VLDB 2006], [Wang, Li et al., CCS 2009])

Summary Query auditor for the server – Fast – Performance adjustable – Single-bit leakage (see paper) Challenges – Non-additive decomposable functions – Secure two-party computation protocol

Thanks !

Single Bit Information Leakage Motivation One bit leakage can also be crucial: SNP (single nucleotide polymorphism) Assumptions – The client knows nothing – Any single bit information is leaked only when this bit is 100% determined

Single Bit Information Leakage Straightforward idea AK=0 Bit fully determined = corresponding k i can only be 1 Check if there is a non-zero solution when k i =0 Make it quicker…… Build pre-computed libraries containing all possible combinations Libraries can be pre-computer since they are not related to sequence length (N), X and Q Without libraries: With libraries: checks