Secure Computation of Surveys. Joan Feigenbaum, Benny Pinkas, Raphael S. Ryger, Felipe Saint Jean. Workshop on Secure Multiparty Protocols (SMP 2004)
Surveys and other Naturally Centralized Multiparty Computations ● Consider – Sealed-bid auctions – Elections – Referenda – Surveys ● Each participant weighs the hoped-for payoffs against any revelation penalty (“loss of privacy”) and is concerned that the computation be fault-free and honest. ● The implementor, in control of the central computation, must configure auxiliary payoffs and privacy assurances to encourage (honest) participation.
CRA Taulbee Survey: Computer Science Faculty Salaries ● Computer science departments in four tiers, all the rest ● Academic faculty in four ranks: full, associate, and assistant professors, and non-tenure-track teaching faculty ● Intention: Convey salary distribution statistics per tier-rank to the community at large without revealing department-specific information.
CRA Taulbee Survey: The Current Computation ● Inputs, per department and faculty rank: – Minimum – Maximum – Median – Mean ● Outputs, per tier and faculty rank: – Minimum, maximum, mean of department minima – Minimum, maximum, mean of department maxima – Median of department means (not weighted) – Mean (weighted mean of department means)
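The current computation above can be sketched in a few lines of Python. This is an illustration only, not CRA's code: the function name `taulbee_outputs`, the dictionary keys, and the sample figures are hypothetical, and the weights for the weighted mean are assumed to be per-department faculty counts (`n`), which the slide does not list among the inputs. The department medians are collected but are not used by any output named on the slide, so the sketch ignores them.

```python
from statistics import mean, median


def taulbee_outputs(departments):
    """Aggregate per-department summaries for one tier and one faculty rank.

    Each department is a dict with keys "min", "max", "mean", and "n".
    "n" (faculty count) is an assumption used only for the weighted mean.
    """
    minima = [d["min"] for d in departments]
    maxima = [d["max"] for d in departments]
    means = [d["mean"] for d in departments]
    total_faculty = sum(d["n"] for d in departments)
    return {
        "min_of_minima": min(minima),
        "max_of_minima": max(minima),
        "mean_of_minima": mean(minima),
        "min_of_maxima": min(maxima),
        "max_of_maxima": max(maxima),
        "mean_of_maxima": mean(maxima),
        # Unweighted: every department counts once, whatever its size.
        "median_of_means": median(means),
        # Weighted by the assumed faculty counts.
        "weighted_mean": sum(d["mean"] * d["n"] for d in departments) / total_faculty,
    }


# Made-up figures for three departments in one tier-rank.
sample = [
    {"min": 90, "max": 150, "mean": 120, "n": 10},
    {"min": 80, "max": 200, "mean": 130, "n": 30},
    {"min": 100, "max": 160, "mean": 125, "n": 20},
]
outputs = taulbee_outputs(sample)
```

Every output here is a statistic of department-level summaries, which is why fuller cross-departmental distribution statistics cannot be derived from these inputs.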
CRA Taulbee Survey: The Problem ● CRA wishes to provide fuller statistics than the meager data currently collected can support. ● The current level of data collection already compromises department-specific information. Asking for submission of full faculty-salary information greatly raises the threshold for trust in CRA's intentions and its security competence. Furthermore, detailed disclosure, even if anonymized, may be explicitly prohibited by the school. ● Hence, there is a danger of significant non-participation in the Taulbee Survey.
Communication Pattern: General Secure Function Evaluation
Communication Pattern: Surveys (Insecure, Natural Computation), or SFE “Ideal Model” (Trusted Party)
Communication Pattern: M-for-N-Party Secure Function Evaluation
Real-World Human-Input Network Computation ● Opportunistic participation: Input is provided if/when humans, computers, and networking are available and operative. The exact participation is not predictable. ● The “function” being computed, then, is not known until the input-collection phase is closed, at which point the participants are generally no longer available for interaction. ● Solution: Two major modular phases, – secure collection of (“N”) inputs into M-node hub – M-party secure function evaluation ● The entire process to be supervised by a control node.
CRA Taulbee Survey: Secure Input Collection [Diagram, built up over five slides. Nodes: Participant, Control, Compute 1, Compute 2. Message labels, in order of appearance: Register; Log In; Session ID; Session ID, Data Shares (once per compute server); Session ID, # Data Points (once per compute server); Acknowledgment.]
CRA Taulbee Survey: Securely evaluate what function(s)? ● The implemented prototype supports secure computation of salary distribution statistics in each tier-rank. ● Exactly the same approach is applicable to the secure computation of distribution statistics for the departmental rank aggregates – minima, maxima, medians, and means – for each rank, for each tier. ● The approach strives to compute as little as possible securely, a minimal secure computation feeding a postprocessing phase that computes the statistics CRA wishes to publish.
CRA Taulbee Survey: The Proposed Computation (1) ● Secure input collection (control aside): – Salary and rank data entry by department head – Per rank, in JavaScript, computation of XOR shares of the individual salaries for the two (M = 2) computation servers – Per rank, HTTPS transmission of XOR shares to their respective computation servers ● CRA closes the input-collection phase, and then...
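The XOR-sharing step can be illustrated as follows. The slide states that the prototype does this in JavaScript in the department head's browser; the Python below is only a sketch of the same idea. The names `xor_share` and `share_rank` and the 32-bit salary width are assumptions, and the HTTPS transmission is omitted.

```python
import secrets

SALARY_BITS = 32  # assumed bit width; the slides do not specify one


def xor_share(salary, bits=SALARY_BITS):
    """Split one salary into two XOR shares, one per computation server."""
    if not 0 <= salary < (1 << bits):
        raise ValueError("salary does not fit in the assumed bit width")
    share1 = secrets.randbits(bits)  # uniformly random mask
    share2 = salary ^ share1         # salary hidden by the mask
    return share1, share2


def share_rank(salaries):
    """Share all salaries entered for one rank.

    Returns two lists: the shares destined for server 1 and for server 2.
    """
    pairs = [xor_share(s) for s in salaries]
    return [p[0] for p in pairs], [p[1] for p in pairs]


# Made-up salaries for one rank in one department.
entered = [95000, 120000, 87500]
for_server1, for_server2 = share_rank(entered)
recombined = [a ^ b for a, b in zip(for_server1, for_server2)]
```

Each share on its own is a uniformly random bit string, so neither computation server learns anything about a salary unless the two servers pool their shares.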
CRA Taulbee Survey: The Proposed Computation (2) ● Per tier and rank, construction of a Boolean circuit to – reconstruct inputs by XOR-ing their shares – sort the inputs in an odd-even sorting network ● Secure computation, per tier and rank: – Fairplay implementation of the Yao two-party SFE protocol for the constructed circuit and the collected input shares – output is a sorted list of all salaries in the tier-rank ● Postprocessing, per tier and rank: – arbitrary, insecure computation on the sorted, cross-departmental salary list
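To make the function computed by the circuit concrete, the sketch below evaluates it in the clear: it XORs the shares back together, sorts with a fixed network of compare-exchange gates, and then postprocesses the sorted list. It is not Fairplay and not the Yao protocol, and it offers no privacy; in the proposed system the reconstruction and sorting happen inside the secure two-party evaluation. The slide says only "odd-even sorting network", so the use of an odd-even transposition network here is an assumption chosen for brevity, and all function names are hypothetical.

```python
def odd_even_transposition_network(n):
    """Yield the comparator positions (i, i + 1) of an n-wire network.

    The layout depends only on n, never on the data, which is what
    allows it to be compiled into a fixed Boolean circuit.
    """
    for round_index in range(n):
        for i in range(round_index % 2, n - 1, 2):
            yield i, i + 1


def evaluate_in_clear(shares1, shares2):
    """Reconstruct inputs from XOR shares, then sort them with the network."""
    wires = [a ^ b for a, b in zip(shares1, shares2)]
    for i, j in odd_even_transposition_network(len(wires)):
        if wires[i] > wires[j]:  # compare-exchange gate
            wires[i], wires[j] = wires[j], wires[i]
    return wires


def postprocess(sorted_salaries):
    """Ordinary, insecure statistics on the (non-empty) sorted output."""
    n = len(sorted_salaries)
    mid = n // 2
    if n % 2:
        med = sorted_salaries[mid]
    else:
        med = (sorted_salaries[mid - 1] + sorted_salaries[mid]) / 2
    return {
        "min": sorted_salaries[0],
        "max": sorted_salaries[-1],
        "median": med,
        "mean": sum(sorted_salaries) / n,
    }


# Made-up salaries and masks standing in for the collected shares.
salaries = [30, 10, 40, 20]
masks = [7, 1, 12, 5]
other_shares = [s ^ m for s, m in zip(salaries, masks)]
sorted_list = evaluate_in_clear(masks, other_shares)
stats = postprocess(sorted_list)
```

Because the output is the full sorted list for the tier-rank, any further statistic (percentiles, for example) can be added in `postprocess` without changing the circuit.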
Open Questions ● Input “sanity checking” in a privacy-preserving system lacking strong natural incentives for truthfulness and accuracy: – data-entry error trapping – detection/deterrence of intentional, possibly gross misrepresentation by participants ● Traditional SFE considerations regarding maliciousness, as they arise in the M-for-N-party protocol setting ● Economy of the core (symmetric) SFE protocols ● Economy of the Boolean circuits and of their generation ● The legal difficulty: uncharted territory