Space for things we might want to put at the bottom of each slide. Part 6: Open Problems 1 Marianne Winslett 1,3, Xiaokui Xiao 2, Yin Yang 3, Zhenjie Zhang.


Part 6: Open Problems
Marianne Winslett 1,3, Xiaokui Xiao 2, Yin Yang 3, Zhenjie Zhang 3, Gerome Miklau 4
1 University of Illinois at Urbana-Champaign, USA
2 Nanyang Technological University, Singapore
3 Advanced Digital Sciences Center, Singapore
4 University of Massachusetts, Amherst, USA

Practical usability & deployment
Privacy parameters
  - How should the data owner set privacy parameters to comply with regulation or internal policies?
Efficiency & scalability
  - Executing some mechanisms is still expensive.
  - Some adaptive mechanisms require an optimization step prior to execution.
  - Dependence on the domain of the data is problematic.
Automated mechanisms for novice users
  - Allow novice users to get the best utility for their task.

Error bounds & optimal mechanisms
For a given task T, what is the most accurate method of performing T under epsilon-DP?
  - Current answers to this question exist only for limited T.
When is a task or query set "hard"?
  - Note: sensitivity is not a sufficient answer!
  - For data-independent mechanisms: how can we measure the "hardness" of a task T or a set of queries, in terms of the accuracy achievable?
  - For data-dependent mechanisms: how can we measure the "hardness" of a task T on a dataset D?
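
For reference, the baseline that "most accurate" is usually judged against is the Laplace mechanism, which adds noise with scale sensitivity/epsilon. The sketch below is a minimal illustration, assuming a single numeric query with a known L1 sensitivity; the function names are hypothetical and not taken from the tutorial.

```python
import random


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale b = sensitivity / epsilon used by the Laplace mechanism."""
    if sensitivity < 0 or epsilon <= 0:
        raise ValueError("need sensitivity >= 0 and epsilon > 0")
    return sensitivity / epsilon


def expected_squared_error(sensitivity: float, epsilon: float) -> float:
    """Variance of Laplace(0, b) noise, which is 2 * b**2."""
    b = laplace_scale(sensitivity, epsilon)
    return 2.0 * b * b


def laplace_mechanism(true_answer: float, sensitivity: float,
                      epsilon: float, rng: random.Random) -> float:
    """Return true_answer plus Laplace(0, b) noise.

    The difference of two i.i.d. exponential variables with rate 1/b
    is Laplace-distributed with scale b.
    """
    b = laplace_scale(sensitivity, epsilon)
    if b == 0:
        return float(true_answer)
    noise = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
    return true_answer + noise
```

This baseline depends on the query only through its sensitivity, which is why sensitivity alone cannot separate tasks that a better mechanism could answer with very different accuracy.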

Complex data models
Classical DP is defined for a database consisting of a single relation.
Naïve extensions to more complex data models do not always provide desirable privacy properties.
  - E.g., for graph data: node DP vs. edge DP.
  - Extensions to complex schemas with key/foreign-key relationships are not clear.
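
The node DP vs. edge DP distinction can be made concrete with a small sketch. It assumes a simple undirected graph and the edge-count query; the helper names and the star-graph example are illustrative, not from the tutorial. Under edge DP, neighboring graphs differ in one edge; under node DP, they differ in one node together with all of its incident edges.

```python
def make_edges(pairs):
    """Represent an undirected simple graph as a set of 2-element frozensets."""
    return {frozenset(p) for p in pairs}


def remove_edge(edges, u, v):
    """Edge-DP neighbor: the same graph without the edge {u, v}."""
    return edges - {frozenset((u, v))}


def remove_node(edges, node):
    """Node-DP neighbor: the same graph without `node` and its incident edges."""
    return {e for e in edges if node not in e}


# Star graph: node 0 is connected to nodes 1..4.
star = make_edges([(0, 1), (0, 2), (0, 3), (0, 4)])

# Change in the edge count between the graph and each kind of neighbor.
edge_neighbor_change = len(star) - len(remove_edge(star, 0, 1))
node_neighbor_change = len(star) - len(remove_node(star, 0))
```

Removing one edge changes the edge count by 1, while removing the center node changes it by 4 here, and by up to n - 1 in a graph with n nodes. Node DP therefore needs far more noise for the same query, or a different mechanism.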

Measuring utility
Are we designing differentially private methods with the right utility measures in mind?
A typical approach:
  - Given a target task T, decompose it into a query set Q.
  - Develop a method to compute Q with low "error", where "error" is the maximum or average squared error.
For complex tasks, does this approach lead to the best utility?
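
The two error measures named above can be written down directly. This is a minimal sketch, assuming the true and noisy answers to Q are given as equal-length lists of numbers; the function names are illustrative.

```python
def squared_errors(true_answers, noisy_answers):
    """Per-query squared error between true and noisy answers."""
    if len(true_answers) != len(noisy_answers) or not true_answers:
        raise ValueError("need two non-empty lists of equal length")
    return [(t - n) ** 2 for t, n in zip(true_answers, noisy_answers)]


def max_squared_error(true_answers, noisy_answers):
    """Worst-case squared error over the query set."""
    return max(squared_errors(true_answers, noisy_answers))


def avg_squared_error(true_answers, noisy_answers):
    """Mean squared error over the query set."""
    errors = squared_errors(true_answers, noisy_answers)
    return sum(errors) / len(errors)
```

Both measures treat every query in Q alike, which is part of the slide's concern: a downstream task may be far more sensitive to errors in some queries than in others.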

Inconsistency
Perturbed DP output often violates constraints known to hold on the true data.
Coping with inconsistency:
  - Analysts often cannot use data that violates constraints known to hold on the true data.
  - Remove inconsistency: find the closest consistent output.
  - Use the exponential mechanism to select a consistent output.
Exploiting inconsistency:
  - When and why does removing inconsistency from noisy output improve utility?
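
"Find the closest consistent output" can be illustrated on the smallest possible case. The sketch below assumes three noisy counts (a total and its two parts), a single constraint total = left + right, and Euclidean distance as the notion of "closest". These choices are assumptions for illustration, not the tutorial's method.

```python
def closest_consistent(total: float, left: float, right: float):
    """Project (total, left, right) onto the plane total = left + right.

    The constraint's normal vector is (1, -1, -1), with squared norm 3,
    so the least-squares projection moves each coordinate by residual / 3
    in the direction that shrinks the residual.
    """
    residual = total - left - right
    shift = residual / 3.0
    return total - shift, left + shift, right + shift


# Noisy answers that violate the constraint: 10 != 3 + 4.
consistent = closest_consistent(10.0, 3.0, 4.0)
```

Here the result is (9.0, 4.0, 5.0), which satisfies 9 = 4 + 5. Because the projection uses only the noisy output and never touches the true data, it is post-processing and does not weaken the differential privacy guarantee.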