Privacy as a Tool for Robust Mechanism Design in Large Markets (A Case Study)

Based on joint works with: Rachel Cummings, Justin Hsu, Zhiyi Huang, Sampath Kannan, Michael Kearns, Mallesh Pai, Jamie Morgenstern, Ryan Rogers, Tim Roughgarden, Jon Ullman, and Steven Wu
Approximately Stable, School Optimal, and Student-Truthful Many-to-One Matchings (via Differential Privacy)

Aaron Roth
Joint work with: Sampath Kannan, Jamie Morgenstern, and Steven Wu
Many-to-one Stable Matchings
Many-to-one Stable Matchings

In a stable matching problem there are $n$ students and $m$ schools.
- Each student $i$ has a total order $\succ_i$ over the schools.
- Each school $c$ has a total order $\succ_c$ over the students.
- Students can be matched to at most one school; schools to at most $s$ students.

Definition: A matching $\mu: [n] \to [m]$ is stable if it satisfies:
- Feasibility: for each school $c$, $|\mu^{-1}(c)| \le s$.
- No blocking pairs with filled seats: for each $i \in [n]$ and $c \in [m]$ such that $\mu(i) \ne c$, either $\mu(i) \succ_i c$, or for every $j \in \mu^{-1}(c)$, $j \succ_c i$.
- No blocking pairs with empty seats: for every $c$ such that $|\mu^{-1}(c)| < s$, and every $i \in [n]$ such that $i \succ_c \emptyset$: $\mu(i) \succ_i c$.
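To make the definition concrete, here is a minimal stability checker, a sketch under a hypothetical encoding (not from the talk): `mu` maps students to schools, and preferences are rank dictionaries where a higher rank means more preferred; students absent from a school's dictionary are unacceptable to it (below the empty seat).

```python
def is_stable(mu, student_pref, school_pref, s):
    """Check feasibility and both no-blocking-pair conditions.

    mu: dict student -> school (None if unmatched)
    student_pref: dict student -> {school: rank}, higher rank = more preferred
    school_pref: dict school -> {student: rank}; absent students are
                 unacceptable, i.e. ranked below the empty seat
    s: common school capacity
    """
    enrolled = {}
    for i, c in mu.items():
        if c is not None:
            enrolled.setdefault(c, []).append(i)

    # Feasibility: |mu^{-1}(c)| <= s for every school.
    if any(len(group) > s for group in enrolled.values()):
        return False

    for i, prefs in student_pref.items():
        for c, rank in prefs.items():
            if mu[i] == c:
                continue
            # Skip unless i prefers c to their current match (or is unmatched).
            if mu[i] is not None and prefs[mu[i]] > rank:
                continue
            if i not in school_pref[c]:
                continue  # c would never admit i
            seats = enrolled.get(c, [])
            if len(seats) < s:
                return False  # blocking pair with an empty seat
            if any(school_pref[c].get(j, float("-inf")) < school_pref[c][i]
                   for j in seats):
                return False  # blocking pair with a filled seat
    return True
```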
Many-to-one Stable Matchings

Simple mechanisms compute the student-optimal/school-optimal matchings (student-/school-proposing deferred acceptance). But...

Worst-case results:
- Even in the one-to-one case, no mechanism is dominant-strategy truthful for both sides of the market [Dubins and Freedman 1981, Roth 1982].
- In the many-to-one case, no school-optimal mechanism is dominant-strategy truthful for either side of the market [Roth 1984].

Can we circumvent them with approximation and large-market assumptions?
"Traditional" Economic Approach

e.g. [Immorlica and Mahdian 05], [Kojima and Pathak 09], [Lee 11], [Azevedo and Budish 12], ...

- Make a strong distributional assumption about how preferences are generated.
  - e.g. ([IM 05], [KP 09]): students have preference lists of constant length $k$, drawn i.i.d. from a product distribution.
- Show that as the "market grows large," when the exact school-optimal matching is computed, the fraction of people who have an incentive to deviate diminishes.
  - e.g. as $n \to \infty$ (with $k$ fixed), with high probability, a $1 - o(1)$ fraction of students have no incentive to mis-report.
Here: A More Robust "Dual" Approach

- Make no assumptions about student or school preferences.
- Ask for truthful reporting to be an asymptotic dominant strategy for every student.
- Make no "large market" assumptions, except that schools have sufficiently many slots.
- Instead: perturb the process by which matchings are computed, and find "approximately stable," "approximately school-optimal" matchings.
- Also: ask for small finite-market bounds (not just limit results).
Approximately Stable Matchings

Recall: a matching $\mu: [n] \to [m]$ is stable if it is feasible ($|\mu^{-1}(c)| \le s$ for each school $c$) and has no blocking pairs with filled or empty seats.

Definition: A matching $\mu: [n] \to [m]$ is $\alpha$-approximately stable (envy free) if it satisfies feasibility and no-blocking-pairs-with-filled-seats as before, with the empty-seats condition relaxed to:
- No blocking pairs with empty seats at under-enrolled schools: for every $c$ such that $|\mu^{-1}(c)| < (1 - \alpha)s$, and every $i \in [n]$ such that $i \succ_c \emptyset$: $\mu(i) \succ_i c$.

Schools tolerate a small degree of under-enrollment.
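Under the same hypothetical encoding as above, only the empty-seat test changes: a school can form a blocking pair with an empty seat only if it is under-enrolled even after the $\alpha$ tolerance. A sketch of just that relaxed clause:

```python
def relaxed_empty_seat_condition(mu, student_pref, school_pref, s, alpha):
    """Check the one clause that alpha-approximate stability relaxes:
    schools enrolled below (1 - alpha) * s must admit no acceptable
    student who prefers them to their own match."""
    enrolled = {}
    for i, c in mu.items():
        if c is not None:
            enrolled.setdefault(c, []).append(i)

    for c, acceptable in school_pref.items():
        if len(enrolled.get(c, [])) >= (1 - alpha) * s:
            continue  # within the tolerated under-enrollment
        for i in acceptable:  # students c ranks above the empty seat
            cur = mu[i]
            # Blocking pair unless i strictly prefers their own match to c.
            if cur != c and (cur is None
                             or student_pref[i][c] > student_pref[i][cur]):
                return False
    return True
```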
Approximately School Optimal Matchings

Definition: Let $\mu^*$ be the school-optimal stable matching. A matching $\mu$ is school dominant if for every school $c$, and every pair of students $i, j$ such that $i \in \mu^{-1}(c) \setminus (\mu^*)^{-1}(c)$ and $j \in (\mu^*)^{-1}(c) \setminus \mu^{-1}(c)$: $i \succ_c j$.

i.e., every student matched to $c$ in a school-dominant matching must be at least as preferred as every student matched to $c$ in the school-optimal matching. But there may be fewer of them.
Approximate Dominant Strategy Truthfulness

A utility function $u_i: [m] \to [0, 1]$ is consistent with an ordering $\succ_i$ if for every $c, c'$: $c \succ_i c'$ if and only if $u_i(c) > u_i(c')$.

Definition: A matching mechanism $M$ is $\eta$-approximately dominant strategy truthful if for every $\succ = (\succ_1, \ldots, \succ_n)$, every $i \in [n]$ and deviation $\succ_i'$, and every utility function $u_i$ consistent with $\succ_i$:
$$\mathbb{E}_{c \sim M(\succ)_i}[u_i(c)] \ge \mathbb{E}_{c \sim M(\succ_i', \succ_{-i})_i}[u_i(c)] - \eta$$
Our Result

Theorem: There is a computationally efficient algorithm for computing $\alpha$-approximately stable, school-dominant matchings that makes it an $\eta$-approximate dominant strategy for every student to report truthfully, whenever school capacity is sufficiently large:
$$s \ge \Omega\left(\frac{\sqrt{m} \cdot \log n}{\eta \alpha}\right)$$

When students have constant-length preference lists, we only require:
$$s \ge \Omega\left(\frac{\log n}{\eta \alpha}\right)$$

When $s = \omega(\sqrt{m} \cdot \log n)$ (respectively, $\omega(\log n)$), we can take $\alpha, \eta \to 0$.
Differential Privacy [DMNS06]: A Measure of Algorithmic Stability

Let $t \in \mathcal{T}^n$ denote an arbitrary type profile, and let $t_i' \in \mathcal{T}$ be any possible report for agent $i$. Then a mechanism $M: \mathcal{T}^n \to \mathcal{O}$ is $\epsilon$-differentially private if for all $S \subseteq \mathcal{O}$:
$$\Pr[M(t) \in S] \le e^{\epsilon} \Pr[M(t_i', t_{-i}) \in S]$$

In particular, for any $u: \mathcal{O} \to \mathbb{R}_{\ge 0}$:
$$\mathbb{E}_{x \sim M(t)}[u(x)] \le e^{\epsilon}\, \mathbb{E}_{x \sim M(t_i', t_{-i})}[u(x)]$$

Algorithmically enforced informational smallness.
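As a concrete (if standard) example of an $\epsilon$-differentially private mechanism, here is a Laplace-noised counting query; this is a textbook sketch, not part of the talk, and the names are illustrative:

```python
import numpy as np

def private_count(reports, predicate, epsilon, rng=None):
    """epsilon-DP count of how many reports satisfy `predicate`.

    Changing one agent's report moves the true count by at most 1
    (sensitivity 1), so Laplace noise of scale 1/epsilon yields
    epsilon-differential privacy.
    """
    rng = rng or np.random.default_rng()
    true_count = sum(1 for r in reports if predicate(r))
    return true_count + rng.laplace(scale=1.0 / epsilon)

# e.g. a noisy count of students ranking school "c1" first:
# private_count(preference_lists, lambda p: p[0] == "c1", epsilon=0.1)
```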
A Helpful Change in Perspective: Admissions Thresholds

Think of school preferences $\succ_c$ as being represented by assigning a rating $r_i^c \in \{1, \ldots, U\}$ to each student $i$: $i \succ_c j \Leftrightarrow r_i^c > r_j^c$.

A set of admissions thresholds $T = (t_1, \ldots, t_m)$ induces a matching:
$$\mu_\succ^T(i) = \arg\max_{\succ_i} \{c : r_i^c \ge t_c\}$$
(i.e., students go to their favorite school that will have them).

Say thresholds $T$ are $\alpha$-approximately stable if $\mu_\succ^T$ is.

Idea: try to find $\alpha$-approximately stable, school-dominant thresholds, subject to differential privacy.
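The induced matching is straightforward to compute; a sketch under a hypothetical encoding (ratings as `ratings[i][c]`, preference lists from most to least preferred):

```python
def match_from_thresholds(thresholds, ratings, pref_order):
    """Compute mu_T: each student attends their favorite school whose
    admissions threshold their rating clears (None if no school will
    have them).

    thresholds: dict school -> t_c
    ratings: dict student -> {school: r_i^c}
    pref_order: dict student -> list of schools, most preferred first
    """
    return {
        i: next((c for c in prefs if ratings[i][c] >= thresholds[c]), None)
        for i, prefs in pref_order.items()
    }
```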
Differential Privacy Yields Approximate DSIC

Theorem: Let $M: (\succ)^n \to [0, U]^m$ be an $\epsilon$-differentially private algorithm for computing admissions thresholds. The algorithm $A$ which takes as input preferences $\succ_1, \ldots, \succ_n$ and:
1. computes $T = M(\succ)$, and
2. outputs $\mu_\succ^T$
is $\epsilon$-approximately dominant strategy truthful for all students.

The matching is computed subject to "joint differential privacy": the thresholds are differentially private in everyone's reports, while each student's own assignment may depend arbitrarily on their own report.
Differential Privacy Yields Approximate DSIC

Proof: Fix a set of preferences $\succ$, a student $i$, a deviation $\succ_i'$, and a utility function $u_i$ consistent with $\succ_i$.

$\mathbb{E}_{c \sim A(\succ)}[u_i(c)]$
$= \mathbb{E}_{T \sim M(\succ)}\left[u_i\left(\arg\max_{\succ_i} \{c : r_i^c \ge t_c\}\right)\right]$
$\ge e^{-\epsilon}\, \mathbb{E}_{T \sim M(\succ_i', \succ_{-i})}\left[u_i\left(\arg\max_{\succ_i} \{c : r_i^c \ge t_c\}\right)\right]$ (differential privacy)
$\ge e^{-\epsilon}\, \mathbb{E}_{T \sim M(\succ_i', \succ_{-i})}\left[u_i\left(\arg\max_{\succ_i'} \{c : r_i^c \ge t_c\}\right)\right]$ (argmax and consistency)
$= e^{-\epsilon}\, \mathbb{E}_{c \sim A(\succ_i', \succ_{-i})}[u_i(c)]$
$\ge \mathbb{E}_{c \sim A(\succ_i', \succ_{-i})}[u_i(c)] - \epsilon$ ($e^{-\epsilon} \ge 1 - \epsilon$ and $u_i \in [0, 1]$)

Goal: design a private algorithm to compute approximately stable, school-dominant thresholds.
School Proposing Deferred Acceptance

1. Set all school thresholds $t_c = n + 1$, an initially empty matching $\mu$, and initial enrollment counts $E_c = 0$ for each school.
2. While there exists an under-enrolled school $c$ (i.e., $E_c < s$ and $t_c > 0$):
   - Lower the threshold for school $c$: $t_c \leftarrow t_c - 1$.
   - For each student $i$, if $\mu(i) \ne \arg\max_{\succ_i} \{c' : r_i^{c'} \ge t_{c'}\}$, then: $E_{\mu(i)} \leftarrow E_{\mu(i)} - 1$; $\mu(i) \leftarrow \arg\max_{\succ_i} \{c' : r_i^{c'} \ge t_{c'}\}$; $E_{\mu(i)} \leftarrow E_{\mu(i)} + 1$.
3. Output $T = (t_1, \ldots, t_m)$.

How can we make this differentially private?
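A minimal non-private sketch of this loop, reusing the hypothetical encoding from above (thresholds start at $n + 1$, i.e., the slides take ratings to be ranks with $U = n$):

```python
def school_proposing_da(ratings, pref_order, schools, s, n):
    """Threshold-descent form of school-proposing deferred acceptance.
    Returns the final thresholds T = (t_c)."""
    t = {c: n + 1 for c in schools}      # start above every rating
    mu = {i: None for i in pref_order}   # empty matching
    E = {c: 0 for c in schools}          # enrollment counts

    def favorite(i):
        # Student i's favorite school currently willing to admit them.
        return next((c for c in pref_order[i] if ratings[i][c] >= t[c]), None)

    while True:
        under = [c for c in schools if E[c] < s and t[c] > 0]
        if not under:
            break
        c = under[0]
        t[c] -= 1                        # school c lowers its threshold
        for i in pref_order:             # students re-optimize
            best = favorite(i)
            if mu[i] != best:
                if mu[i] is not None:
                    E[mu[i]] -= 1
                mu[i] = best
                if best is not None:
                    E[best] += 1
    return t
```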
Some Useful Privacy Properties

Theorem (Post-processing): If $M(\succ)$ is $\epsilon$-differentially private, and $f$ is any (randomized) function, then $f(M(\succ))$ is $\epsilon$-differentially private.
Some Useful Privacy Properties

Theorem (Composition): If $M_1, \ldots, M_k$ are each $\epsilon$-differentially private, then $M(\succ) \equiv (M_1(\succ), \ldots, M_k(\succ))$ is $\approx \sqrt{k}\,\epsilon$-differentially private.
So...

We can go about designing algorithms as we normally would:
- just access the data using differentially private "subroutines,"
- and keep track of your "privacy budget" as a resource.

Private algorithm design, like regular algorithm design, can be modular.
School Proposing Deferred Acceptance

1. Set all school thresholds $t_c = n + 1$, an initially empty matching $\mu$, and initial enrollment counts $E_c = 0$ for each school.
2. While there exists an under-enrolled school $c$ (i.e., $E_c < s$ and $t_c > 0$):
   - Lower the threshold for school $c$: $t_c \leftarrow t_c - 1$.
   - For each student $i$, if $\mu(i) \ne \arg\max_{\succ_i} \{c' : r_i^{c'} \ge t_{c'}\}$, then: $E_{\mu(i)} \leftarrow E_{\mu(i)} - 1$; $\mu(i) \leftarrow \arg\max_{\succ_i} \{c' : r_i^{c'} \ge t_{c'}\}$; $E_{\mu(i)} \leftarrow E_{\mu(i)} + 1$.
3. Output $T = (t_1, \ldots, t_m)$.

Only data access: keeping track of enrollment counts.
Privately Maintaining Counts

[Dwork Naor Pitassi Rothblum 10] and [Chan Shi Song 10] give exactly the tool we need: a private algorithm to maintain a running count. Given a stream of $n$ bits, maintain an estimate of the running count to accuracy $\pm \frac{\Delta \cdot \mathrm{polylog}(n)}{\epsilon}$, where each person can affect at most $\Delta$ entries in the stream.

For us: $\Delta = 2$. (No student changes enrollment status at any school more than twice.)

[Figure: a stream of bits and its running count.]
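A compact sketch of the tree-based counter from those papers, using Laplace noise (the slides draw the per-node noise as $N(0, \frac{\log n}{\epsilon})$; the class and method names here are mine). Each dyadic interval of the stream gets one noisy partial sum, and any running count is assembled from at most $\log T$ of them:

```python
import math
import numpy as np

class PrivateCounter:
    """Binary-mechanism running counter, in the style of [DNPR10, CSS10].

    A prefix count is a sum of O(log T) noisy dyadic partial sums, so the
    additive error is O(polylog(T) / epsilon) with high probability
    (times Delta, if one person can affect Delta stream entries).
    """

    def __init__(self, T, epsilon, rng=None):
        self.levels = max(1, math.ceil(math.log2(T + 1)))
        self.scale = self.levels / epsilon   # split the budget across levels
        self.rng = rng or np.random.default_rng()
        self.partials = {}                   # (level, index) -> noisy partial sum
        self.stream = []

    def update(self, value):
        self.stream.append(value)
        t = len(self.stream)
        # Every dyadic interval that ends at time t is finalized with noise.
        for lvl in range(self.levels):
            width = 2 ** lvl
            if t % width == 0:
                true_sum = sum(self.stream[t - width : t])
                self.partials[(lvl, t // width)] = (
                    true_sum + self.rng.laplace(scale=self.scale)
                )

    def count(self):
        # Tile the prefix [1, t] with stored dyadic intervals, following
        # the binary representation of t, and add up their noisy sums.
        t, pos, total = len(self.stream), 0, 0.0
        for lvl in reversed(range(self.levels)):
            width = 2 ** lvl
            if t & width:
                pos += width
                total += self.partials[(lvl, pos // width)]
        return total

# Demo: stream five bits, then read a noisy running count (~3).
counter = PrivateCounter(T=1024, epsilon=0.5)
for bit in [1, 0, 1, 1, 0]:
    counter.update(bit)
print(counter.count())
```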
Privately Maintaining Counts

[Figure: the binary-tree counter. Each node stores the partial sum of its dyadic interval, perturbed with noise of scale $\approx \frac{\log n}{\epsilon}$; any running count is assembled from $O(\log n)$ noisy nodes.]
Private School Proposing Deferred Acceptance

Idea: Run school-proposing deferred acceptance, but maintain enrollment counts privately.
- Privacy of the counters + post-processing + composition implies privacy of the whole algorithm.
- $\eta$-DP implies $\eta$-approximate dominant strategy truthfulness.
- There are $m$ schools to keep track of, so the total counter error is $E = O\left(\frac{\sqrt{m} \cdot \log n}{\eta}\right)$.
- So as never to over-enroll, run as if capacity were shaded down by $E$.
- So long as capacity $s \ge \frac{E}{\alpha} = O\left(\frac{\sqrt{m} \cdot \log n}{\eta \alpha}\right)$, the under-enrollment due to capacity shading and counter error is $\le \alpha \cdot s$.
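A sketch of how the counters slot into the earlier loop, reusing the hypothetical `PrivateCounter` above; the budget split and the error bound `E` are illustrative placeholders, not the paper's exact parameterization:

```python
import math

def private_school_proposing_da(ratings, pref_order, schools, s, n, eta):
    """School-proposing DA over privately maintained enrollment counts.
    Assumes s exceeds the counter error bound E (sketch)."""
    m = len(schools)
    eps0 = eta / math.sqrt(m)  # per-counter budget via advanced composition (sketch)
    T = 2 * n                  # each student joins/leaves a school at most once
    E = math.ceil(math.log2(T + 2) ** 2 / eps0)  # crude w.h.p. error bound (sketch)
    shaded = s - E             # shade capacity so noise never over-enrolls

    counters = {c: PrivateCounter(T=T, epsilon=eps0) for c in schools}
    t = {c: n + 1 for c in schools}
    mu = {i: None for i in pref_order}

    def favorite(i):
        return next((c for c in pref_order[i] if ratings[i][c] >= t[c]), None)

    while True:
        # Reading counters repeatedly is post-processing of noise already
        # released at update time, so it costs no extra privacy.
        under = [c for c in schools if counters[c].count() < shaded and t[c] > 0]
        if not under:
            break
        c = under[0]
        t[c] -= 1
        for i in pref_order:
            best = favorite(i)
            if mu[i] != best:
                if mu[i] is not None:
                    counters[mu[i]].update(-1)  # departure
                mu[i] = best
                if best is not None:
                    counters[best].update(+1)   # arrival
    return t
```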
Private School Proposing Deferred Acceptance

Privacy ⇒ approximate dominant strategy truthfulness. Utility guarantees?

Enrollments are always underestimated, and so the sequence of proposals is always a subsequence of the proposals made by some trajectory of the (exact) school-proposing deferred acceptance algorithm. Hence:
- No blocking pairs with filled seats
- School dominance
- Excess under-enrollment of at most $E$
- The only blocking pairs with empty seats are at almost fully enrolled schools.
Stepping back...

Differential privacy is a tool that can be used to design robust mechanisms in large markets:
- Ex-post guarantees for all players, even in settings of incomplete information
- No distributional assumptions
- A shift in the mechanism-design perspective: explicitly perturb mechanisms to yield distributional robustness, rather than proving structural properties about exact solutions on random instances.
Stepping back...

Other applications:
- Privately computing Walrasian equilibrium prices: asymptotically truthful combinatorial auctions with item pricings.
- Privately computing correlated/Nash equilibria: mediators for equilibrium selection that make truth-telling an ex-post Nash equilibrium.
- Privately selecting alternatives: a general recipe for mechanism design without money [McSherry Talwar 07, Nissim Smorodinsky Tennenholtz 11].

There should be more! Let's involve mechanism/market designers!
Stepping back more...

"Markets for Privacy"
- Can we find a "market price" for $\epsilon$? It depends on individual costs of privacy risk, as well as on the value of the resulting data analysis.
- Disclosures viewed as public goods? (Talk to John)

"Markets for Data"
- Information is very interesting as a commodity: lots of complicated complementarities, because of inferences.
- Differential privacy removes some kinds of complementarities (by making reconstruction impossible), but leaves others.
- Privacy trades off in non-trivial ways with the "price of data."

Let's involve economists!
Thanks!