1
Boosting and Differential Privacy
Cynthia Dwork, Microsoft Research
2
The Power of Small, Private, Miracles
Joint work with Guy Rothblum and Salil Vadhan
3
Boosting [Schapire, 1989]
A general method for improving the accuracy of any given learning algorithm.
Example: learning to recognize spam e-mail. The "base learner" receives labeled examples and outputs a heuristic; labels are {+1, −1}.
Run it many times; combine the resulting heuristics.
4
[Diagram: the boosting loop. S, a set of labeled examples drawn from D, goes to the base learner, which outputs a hypothesis A that does well on ½ + η of D; D is updated; terminate? If not, repeat; at the end, combine A_1, A_2, ….]
5
[Same diagram, with the question "How?" attached to the update-D step.]
6
Boosting for People [variant of AdaBoost, FS95]
The initial distribution D is uniform on database rows.
S is always a set of k elements drawn according to D, i.e., S ∼ D^k.
The combiner is majority vote.
Weight update: if an example is correctly classified by the current A, decrease its weight by a factor of e ("subtract 1 from the exponent"); if incorrectly classified, increase its weight by a factor of e ("add 1 to the exponent"). Re-normalize to obtain the updated D. A minimal sketch follows.
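The loop just described fits in a few lines. This is a non-private sketch under assumptions of my own: `base_learner` is a hypothetical stand-in that maps a sample to a hypothesis A with A(i) ∈ {−1, +1}, and `labels` holds the ±1 labels.

```python
import numpy as np

def boost_for_people(labels, base_learner, rounds, k, rng):
    """Sketch of the boosting-for-people loop (non-private)."""
    m = len(labels)
    D = np.full(m, 1.0 / m)              # initial distribution: uniform on rows
    hypotheses = []
    for _ in range(rounds):
        S = rng.choice(m, size=k, p=D)   # S: k elements drawn from D (S ~ D^k)
        A = base_learner(S)
        hypotheses.append(A)
        preds = np.array([A(i) for i in range(m)], dtype=float)
        # correct (labels * preds = +1): weight shrinks by a factor of e;
        # incorrect (labels * preds = -1): weight grows by a factor of e
        D = D * np.exp(-labels * preds)
        D /= D.sum()                     # re-normalize to obtain the updated D
    # combiner: majority vote over A_1, A_2, ...
    return lambda i: np.sign(sum(A(i) for A in hypotheses))
```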
7
Why Does it Work?
Update rule: multiply the weight by exp(−c_t(i)), where c_t(i) = +1 if A_t(i) is correct and −1 otherwise:
D_{t+1}(i) = D_t(i) · exp(−c_t(i)) / N_t
Multiplying through by the normalizers and unrolling:
N_t · D_{t+1}(i) = D_t(i) · exp(−c_t(i))
N_t · N_{t−1} ⋯ N_1 · D_{t+1}(i) = D_1(i) · exp(−Σ_s c_s(i)) = (1/m) · exp(−Σ_s c_s(i))
Summing over i (the D_{t+1}(i) sum to 1):
Π_s N_s = (1/m) · Σ_i exp(−Σ_s c_s(i))
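A quick numerical sanity check of the unrolled identity, on made-up ±1 values of c_t(i):

```python
import numpy as np

rng = np.random.default_rng(0)
m, T = 8, 5
c = rng.choice([-1.0, 1.0], size=(T, m))  # c_t(i) = +1 if A_t(i) correct, else -1

D = np.full(m, 1.0 / m)                   # D_1: uniform
prod_N = 1.0
for t in range(T):
    w = D * np.exp(-c[t])                 # unnormalized round-t weights
    N = w.sum()                           # normalizer N_t
    D = w / N
    prod_N *= N

# unrolled identity: prod_s N_s = (1/m) * sum_i exp(-sum_s c_s(i))
assert np.isclose(prod_N, np.mean(np.exp(-c.sum(axis=0))))
```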
11
Π_s N_s = (1/m) · Σ_i exp(−Σ_s c_s(i))
Π_s N_s shrinks exponentially (at a rate that depends on η): the normalizers are sums of weights, and at the start of each round the weights sum to 1; because the base learner is good, "more" weight decreases than increases: more weight has its exponent shrink than otherwise.
Moreover, Σ_i exp(−Σ_s c_s(i)) = Σ_i exp(−y_i · Σ_s A_s(i)).
This is an upper bound on the number of incorrectly classified examples: if y_i ≠ sign[Σ_s A_s(i)] (= majority{A_1(i), A_2(i), …}), then y_i · Σ_s A_s(i) ≤ 0, so exp(−y_i · Σ_s A_s(i)) ≥ 1.
Therefore the number of incorrectly classified examples is at most m · Π_s N_s, which is exponentially small in t.
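Checking the counting step numerically: the inequality holds for any ±1 hypothesis outputs, so random ones suffice.

```python
import numpy as np

rng = np.random.default_rng(1)
m, T = 8, 5
y = rng.choice([-1.0, 1.0], size=m)           # true labels
A = rng.choice([-1.0, 1.0], size=(T, m))      # A_t(i): hypothesis outputs

margin = y * A.sum(axis=0)                    # y_i * sum_s A_s(i)
errors = np.sum(np.sign(A.sum(axis=0)) != y)  # majority vote wrong
# every misclassified i has margin <= 0, hence exp(-margin) >= 1
assert errors <= np.exp(-margin).sum()
```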
12
[Diagram: the boosting-for-people loop. Initially D is uniform on DB rows; the base learner receives S, labeled examples drawn from D, and outputs A, which does well on ½ + η of D; weights are updated by −1/+1 in the exponent and re-normalized; terminate? The combiner is majority vote. Annotation: Privacy?]
13
Private Boosting for People
The base learner must be differentially private.
The main concern is rows whose weight grows too large; this affects the termination test, the sampling, and the re-normalization.
The problem is similar to one arising when learning in the presence of noise, and the solution is similar too: smooth boosting.
Remove (give up on) elements that become too heavy. Carefully! Removing one heavy element and re-normalizing may cause another element to become heavy…
Ensure this is rare (else we give up on too many elements and hurt accuracy). A sketch of the capping step follows.
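A sketch of the capping step, assuming a weight vector and a boolean mask of still-active elements; the cap value and function name are hypothetical:

```python
import numpy as np

def cap_heavy_elements(D, active, cap):
    """Retire elements whose weight exceeds `cap`, repeating because
    re-normalizing can push new elements over the threshold."""
    D = D.copy()
    active = active.copy()
    while True:
        heavy = active & (D > cap)
        if not heavy.any():
            return D, active
        active &= ~heavy
        D[~active] = 0.0            # give up on the heavy elements
        s = D.sum()
        if s == 0.0:                # gave up on everything; caller handles this
            return D, active
        D /= s                      # re-normalize over surviving elements
```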
14
Iterative Smoothing Not today.
15
Boosting for Queries?
Goal: given a database DB and a set Q of low-sensitivity queries, produce an object O (e.g., a synthetic database) such that ∀ q ∈ Q one can extract from O an approximation of q(DB).
Assume the existence of an (ε_0, δ_0)-dp base learner producing an object O that does well on more than half of D:
Pr_{q∼D}[ |q(O) − q(DB)| ≤ λ ] ≥ ½ + η
16
[Diagram: the same loop for queries. Initially D is uniform on Q; the base learner receives S, labeled examples drawn from D, and outputs A, which does well on ½ + η of D; D is updated and A_1, A_2, … are combined.]
17
[Diagram: the boosting-for-queries loop. Initially D is uniform on Q; weights are updated by −1/+1 in the exponent and re-normalized; terminate? The combiner is the median. Annotation: Privacy? An individual can affect many queries at once!]
18
Privacy is Problematic
In smooth boosting for people, at each round an individual has only a small effect on the probability distribution.
In boosting for queries, an individual can affect the quality of q(A_t) simultaneously for many q.
As time progresses, the distributions on neighboring databases could evolve completely differently, yielding very different A_t's.
This is slightly ameliorated by sampling (if only a few samples are taken, maybe the q's on the edge can be avoided?).
How can we make the re-weighting less sensitive?
19
Private Boosting for Queries [variant of AdaBoost]
The initial distribution D is uniform on the queries in Q.
S is always a set of k queries drawn according to D, so S ∈ Q^k.
The combiner is the median [viz. Freund92].
Weight update for queries: if q is very well approximated by A_t (error at most λ), decrease its weight by a factor of e ("−1"); if very poorly approximated (error at least λ + μ), increase its weight by a factor of e ("+1"); in between, scale the exponent linearly with the distance from the midpoint (down or up):
2 · ( |q(DB) − q(A_t)| − (λ + μ/2) ) / μ
One row changes |q(DB) − q(A_t)| by at most the query sensitivity ρ, so the exponent has sensitivity 2ρ/μ. See the sketch below.
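A minimal sketch of this update rule; the function name and the use of clipping to express the piecewise rule are my own choices:

```python
import numpy as np

def query_exponent(err, lam, mu):
    """Exponent for reweighting a query with |q(DB) - q(A_t)| = err:
    -1 if err <= lam, +1 if err >= lam + mu, linear in between."""
    u = 2.0 * (err - (lam + mu / 2.0)) / mu
    return np.clip(u, -1.0, 1.0)

# usage: multiply each query's weight by e^{exponent}, then re-normalize:
#   D = D * np.exp(query_exponent(errs, lam, mu)); D /= D.sum()
# one row moves err by at most rho, so the exponent moves by at most 2*rho/mu
```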
20
Theorem (minus some parameters)
Let all q ∈ Q have sensitivity ≤ ρ. Run the query-boosting algorithm for T = log|Q|/η² rounds with
μ = ( (log|Q|/η²)² · ρ · √k ) / ε
The resulting object O is ((ε + T·ε_0), T·δ_0)-dp and, whp, gives (λ + μ)-accurate answers to all the queries in Q.
Better privacy (smaller ε) gives worse utility (larger μ); a better base learner (smaller k, larger η) helps.
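To get a feel for the trade-off, here is a throwaway instantiation of the formulas as reconstructed above; every parameter value is invented for illustration:

```python
import numpy as np

# illustrative only: all parameter values below are made up
Q_size, eta, rho, k, eps = 2**20, 0.25, 1.0, 1000, 1.0
T = np.log(Q_size) / eta**2                              # number of rounds
mu = ((np.log(Q_size) / eta**2) ** 2) * rho * np.sqrt(k) / eps
print(f"T = {T:.0f} rounds; answers accurate to lambda + {mu:.0f}")
```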
21
Proving Privacy
Technique #1: Pay Your Debt and Move On.
Fix A_1, A_2, …, A_t and record the D vs. D' confidence gain ("pay your debt"); then focus on the gain from the selection of S ∈ Q^k in round t+1 ("move on"). That selection is based on the distributions D_{t+1} and D'_{t+1} determined in round t; call them D and D'.
Technique #2: Evolution of Confidence [DiDwN03] ("delay payment until the final reckoning").
Choose q_1, q_2, …, in turn. For each q ∈ Q, bound the log-ratio |ln(D[q]/D'[q])| by A and its expectation |E_{q∼D} ln(D[q]/D'[q])| by B. Then
Pr_{q_1,…,q_k}[ |Σ_i ln(D[q_i]/D'[q_i])| > z·√k·(A + B) + k·B ] < exp(−z²/2)
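The tail bound on the last line is simple to evaluate; a small helper (names mine):

```python
import numpy as np

def confidence_tail(A, B, k, z):
    """Evolution-of-confidence bound: the total log-ratio over k sampled
    queries exceeds z*sqrt(k)*(A+B) + k*B with probability < exp(-z^2/2)."""
    return z * np.sqrt(k) * (A + B) + k * B, np.exp(-z ** 2 / 2.0)

# e.g. A = 0.1 and B = 2*A**2 = 0.02 (next slide), k = 100, z = 3:
# threshold = 3*10*0.12 + 100*0.02 = 5.6, failure prob < exp(-4.5)
```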
22
Bounding E_{q∼D} ln( D[q] / D'[q] )
Assume D and D' are A-dp with respect to one another, for A < 1. Then 0 ≤ E_{q∼D} ln[D(q)/D'(q)] ≤ 2A² (that is, B ≤ 2A²).
KL(D||D') = Σ_q D(q) · ln[D(q)/D'(q)]; this is always ≥ 0. So
KL(D||D') ≤ KL(D||D') + KL(D'||D)
= Σ_q D(q) · ( ln[D(q)/D'(q)] + ln[D'(q)/D(q)] ) + (D'(q) − D(q)) · ln[D'(q)/D(q)]
≤ Σ_q 0 + |D'(q) − D(q)| · A
= A · Σ_q [ max(D(q), D'(q)) − min(D(q), D'(q)) ]
≤ A · Σ_q [ e^A · min(D(q), D'(q)) − min(D(q), D'(q)) ]
= A · (e^A − 1) · Σ_q min(D(q), D'(q))
≤ 2A² when A < 1 (since Σ_q min(D(q), D'(q)) ≤ 1 and e^A − 1 ≤ 2A for A < 1).
Compare DiDwN03.
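A numerical spot-check of the lemma on a pair of randomly perturbed distributions:

```python
import numpy as np

rng = np.random.default_rng(2)
base = rng.random(16)
D = base / base.sum()
Dp = D * np.exp(rng.uniform(-0.3, 0.3, size=16))  # multiplicatively perturb
Dp /= Dp.sum()

A = np.max(np.abs(np.log(D / Dp)))  # D, D' are A-dp wrt one another
assert A < 1
kl = np.sum(D * np.log(D / Dp))     # KL(D||D') = E_{q~D} ln(D[q]/D'[q])
assert 0.0 <= kl <= 2 * A ** 2      # the lemma: B <= 2A^2
```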
23
Motivation and Application
Boosting for people: logistic regression on 3000+-dimensional data. A slight twist on CM did pretty well (ε = 1.5); we thought about alternatives.
Boosting for queries: reducing the dependence on the concept class in the work on synthetic databases in DNRRV09 (Salil's talk). We had over-interpreted the polytime Dinur-Nissim-style attacks (we were spoiled): one can't answer cn queries with error o(√n), but
BLR08: cn queries with error O(n^{2/3})
DNRRV09: O(n^{1/2} · |Q|^{o(1)})
Now: O(n^{1/2} · log²|Q|)
The result is more general, but we only know of a base learner for counting queries.
24
[Diagram: the boosting loop once more, as a recap. The base learner receives S, labeled examples drawn from D, and outputs A, which does well on ½ + η of D; D is updated; terminate? Combine A_1, A_2, ….]