1
A model for data revelation
Poorvi Vora Dept. of Computer Science George Washington University
2
“Security” frameworks
Binary: divide the world into trusted and untrusted parties. Provides complete revelation of information or complete protection. E.g. multiparty computation, encrypted data. 2/27/2019 Poorvi Vora/CS/GWU
3
Even a statistic or aggregate reveals “private” information
Secure multiparty computation reveals f(x1, x2, ..., xn) and nothing more. Yet this reveals information about all xi. Thus, typical security assurances are not enough.
4
What is privacy Control over information
Extent of information revelation. Tensions between: access to aggregate information for the community vs. individual control; reputation vs. prejudice.
5
Information is often given up for something in return
Individual control requires more than binary security of personal information. Examples: the Safeway card; a monthly charge to be kept out of phone books. Information for community statistics: health statistics; collaborative filtering/personalization in virtual communities.
6
A model: introduce uncertainty
Maximum uncertainty (i.e. secrecy) corresponds to crypto protocols.
Alice and Bob determine: a binary data point from Alice’s personal information, x; a probability of truth, p; a return, y.
Alice reveals a variable z, where z = x with probability p. Bob provides, in return, y.
z exists in the ether as Alice’s value x with probability p.
This is not mutually exclusive with cryptographic protection (p = 0.5 is cryptographic).
Used in the public health community for twenty-odd years.
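The revelation step can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names `reveal` and `estimate_fraction` are hypothetical, and the estimator assumes every respondent uses the same p, with p not equal to 0.5. The survey literature usually calls this kind of scheme randomized response.

```python
import random

def reveal(x: int, p: float, rng: random.Random) -> int:
    """Return z = x with probability p, otherwise the flipped bit 1 - x."""
    return x if rng.random() < p else 1 - x

def estimate_fraction(zs, p: float) -> float:
    """Estimate the fraction of true 1s from the noisy answers zs.

    Uses E[z] = pi * p + (1 - pi) * (1 - p), solved for pi.
    Undefined at p = 0.5, where z carries no information about x.
    """
    observed = sum(zs) / len(zs)
    return (observed - (1 - p)) / (2 * p - 1)

if __name__ == "__main__":
    rng = random.Random(0)
    truth = [1] * 3000 + [0] * 7000          # hypothetical population
    answers = [reveal(x, 0.8, rng) for x in truth]
    print(estimate_fraction(answers, 0.8))   # close to 0.3
```

Bob learns little about any single x when p is near 0.5, yet the aggregate remains recoverable from many answers, which is the trade-off the slide describes.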
7
Outcome Protocol is a mathematical game between Alice and Bob
Optimal situation not when no information is revealed, but when Alice gets maximum benefit for her information. Think about this: should women in Africa test for HIV when they will certainly not obtain any treatment for it?
8
An analogy The protocol is a communication channel
The sender is Alice, the receiver (malicious?) Bob. The probability of error is the probability of a lie.
9
Security properties of randomization
Repeated queries: error → 0 as n → ∞, and n → ∞ as error → 0. The cost to the attacker increases without bound if the error is not bounded above zero. This is a repetition code over the channel.
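A small worked computation makes the repetition-code behaviour concrete. The sketch below assumes Bob asks the same binary query n times (n odd), that the answers are independent, and that he decodes by majority vote; the function name `majority_error` is hypothetical.

```python
from math import comb

def majority_error(p: float, n: int) -> float:
    """Probability that a majority vote over n independent answers is wrong.

    Each answer is truthful with probability p; n is assumed odd, so the
    vote fails exactly when at least (n + 1) // 2 answers are lies.
    """
    q = 1 - p
    return sum(comb(n, k) * q**k * p**(n - k)
               for k in range((n + 1) // 2, n + 1))

if __name__ == "__main__":
    for n in (1, 3, 11, 51):
        print(n, majority_error(0.8, n))
```

For p = 0.8 the error falls steadily as n grows, while at p = 0.5 it stays at one half no matter how many times Bob repeats the query.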
10
Other attacks Query 1: Graying? Query 2: Balding? Query 3: Weight?
Query 4: Sports? Really asking about age and gender. How does one characterize all such attacks? What can one say about security with respect to such attacks?
11
An analogy The attributes that Bob wants to determine form the message
The protocol is a communication channel. The sender is Alice, the receiver (malicious?) Bob. The probability of error is the probability of a lie. The attributes that Bob wants to determine form the message.
12
A simple attack Query 1: Female? Query 2: Over 40?
Query 3: Losing Calcium? Query 3 checks the answers to Queries 1 and 2. It is a parity-check bit.
13
An analogy All attacks are communication over channel
Good attacks are codes. What Bob queries is a codeword bit. What he receives is the transmitted codeword, which he decodes.
14
Shannon’s theorems apply
In fact, assuming any functions of Alice’s data points as queries (adaptive, related queries) and error probability → 0 as n → ∞: the number of queries required per bit of entropy is asymptotically tightly bounded below by the inverse of the channel capacity. Above this bound, error tends exponentially to 0. Below it, it increases exponentially with n.
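To get a feel for the bound, the sketch below computes the inverse capacity for one specific channel. It assumes the protocol behaves as a binary symmetric channel whose crossover probability is the probability of a lie, 1 - p; that modelling choice and the function names are assumptions made for illustration, not something the slides state.

```python
from math import log2

def binary_entropy(p: float) -> float:
    """Entropy in bits of a biased coin with head probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p: float) -> float:
    """Capacity of a binary symmetric channel with truth probability p."""
    return 1.0 - binary_entropy(p)

def min_queries_per_bit(p: float) -> float:
    """Asymptotic lower bound on queries per bit of entropy: 1 / capacity."""
    c = bsc_capacity(p)
    return float("inf") if c == 0 else 1.0 / c

if __name__ == "__main__":
    for p in (0.5, 0.6, 0.8, 0.9, 1.0):
        print(p, bsc_capacity(p), min_queries_per_bit(p))
```

At p = 0.5 the capacity is zero and no number of queries suffices, which matches the earlier remark that p = 0.5 corresponds to cryptographic protection.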
15
Questions
How does one determine the entropy of a particular data set, or a general data set?
What kinds of attacks are computationally feasible? This was a very powerful attacker. What are reasonable limits on the attacker’s abilities?
Result in itself, independent of model. Partly published at Int. Symp. Info. Theory, 2003. Journal paper in review, at website.
16
Value-free model Human rights aspects covered through crypto protocols
Necessary health information and community information can be gathered. Consumer behaviour treated through this game. Criticism: very adversarial model.
17
Another application: anonymous delivery Crowds: Reiter and Rubin/Lucent and AT&T
At node i+1: node i is more likely to be the sender than any other node. Receiver: node i+1. Message: sending node. Received symbol: node i. Channel characteristic: probability that the true sender is node i, probability that other nodes are senders. Traffic analysis/data mining: correlations among senders (communication across the channel, less efficient than some error-correcting code).
[Figure: nodes A, B, C, D, E. N nodes; pf is the probability of forwarding.]
18
An example of model use to measure the value of information with Yu-An Sun and Sumit Joshi
Auction bids reveal much about an individual’s profile. Consider the Vickrey (sealed-bid, second-highest-price) auction. Optimal strategy: to bid one’s valuation. Bids (and hence valuations) can be protected with secure multiparty computation. But bids allow determination of market demand (efficient markets). Need for an aggregate value, not well-defined at the moment of the auction.
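For reference, the standard Vickrey rule that this slide starts from can be sketched as follows. This covers only the ordinary sealed-bid second-price auction, not the variably private variant; the function name `vickrey` and the bidder labels are hypothetical, and at least two bids are assumed.

```python
from typing import Dict, Tuple

def vickrey(bids: Dict[str, float]) -> Tuple[str, float]:
    """Return (winner, price) for a sealed-bid second-price auction.

    The highest bidder wins and pays the second-highest bid.
    Requires at least two bids; ties are broken by input order.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price

if __name__ == "__main__":
    print(vickrey({"alice": 10.0, "bob": 7.0, "carol": 3.0}))
```

Because the winner's payment does not depend on her own bid, bidding one's true valuation is the optimal strategy, which is why the bids reveal so much about the bidders.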
19
Variably Private Vickrey – Bidding Round Introduce uncertainty
The seller announces a minimum sale price and a maximum randomization setting. Each bidder submits a sealed interval containing her bid. The size of the interval is her choice. A bidder is in the running with the high end of her interval and committed to the low end.
20
Variably Private Vickrey – Revealing Round
Bidders not in the running will reveal no more information on their valuations. Largest of the others will reveal which half of their interval contains valuation.
21
Sale Price: Buyer pays, Seller gets
Divided among all bidders proportional to the interval width.
22
Properties? Provides various demand statistics
In general, accuracy of future bid estimation is lower for more uncertainty. Allows the bidder to vary uncertainty, and pay for it. Allows the seller to obtain more than in a regular Vickrey auction, depending on how much information is valued. The bidder with the highest valuation still wins the auction as long as she can tolerate revealing her valuation to the extent required.
23
Summary A model that we hope will:
Provide choices not currently typically available to users.
Extend the security framework to include problems like those in statistical databases.
Provide a means of measuring uncertainty in situations where there is some uncertainty, rather than none or complete.
Include other leakage from security-related protocols such as anonymous delivery and ciphers.
Be useful for measuring the economic value of information.