Tradeoffs Between Fairness and Accuracy in Machine Learning
Aaron Roth
The Basic Setup
Given: data consisting of $d$ features and a label.
Goal: find a rule to predict the label from the features.
Example features $x$: College? Bankruptcy? Tall? Employed? Homeowner? Example label $y$: Paid back loan?
Example rule: (College and Employed and not Bankruptcy) or (Tall and Employed and not College).
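For concreteness, the example rule can be written as a tiny predicate. This is just an illustrative sketch using the slide's toy feature names:

```python
def paid_back_loan(college, bankruptcy, tall, employed, homeowner):
    """The slide's example rule for predicting 'Paid back loan?'.

    Note: 'homeowner' is a feature in the toy data but is not used
    by this particular rule.
    """
    return (college and employed and not bankruptcy) or \
           (tall and employed and not college)

# One hypothetical applicant: tall, employed, no college, no bankruptcy.
print(paid_back_loan(college=False, bankruptcy=False, tall=True,
                     employed=True, homeowner=False))  # True
```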
The Basic Setup
Given data, select a hypothesis $h : \{0,1\}^d \to \{0,1\}$.
The goal is not prediction on the training data, but prediction on new examples.
The Basic Setup
Training data $D = (x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$ is fed to the learning algorithm, which outputs a hypothesis $h$. On a new example $(x, y)$, the example is misclassified if $h(x) \neq y$.
The Basic Setup
Goal: find a classifier to minimize
$err(h) = \Pr_{(x,y) \sim P}[h(x) \neq y]$.
We don't know $P$... but we can minimize the empirical error:
$err(h, D) = \frac{1}{n}\,\bigl|\{(x,y) \in D : h(x) \neq y\}\bigr|$.
The Basic Setup
Empirical Error Minimization: try a lookup table!
$h(x) = 1$ if $x$ matches one of the positively labeled training examples; $h(x) = 0$ otherwise.
Then $err(h, D) = 0$. This would over-fit:
We haven't learned,
just memorized. Learning
must summarize.
The Basic Setup
Instead, limit hypotheses to come from a "simple" class $C$, e.g. linear functions or small decision trees.
Compute $h^* = \arg\min_{h \in C} err(h, D)$. Then
$err(h^*) \;\leq\; err(h^*, D) \;+\; \max_{h \in C} |err(h) - err(h, D)|$
(training error plus generalization error).
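A minimal sketch of empirical error minimization over a deliberately simple class is shown below; the synthetic data, the feature count, and the "single-feature rule" class are invented for illustration and are not from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 5 binary features; the label is a fixed boolean function of them.
def make_data(n):
    X = rng.integers(0, 2, size=(n, 5))
    y = ((X[:, 0] & X[:, 3]) | (X[:, 2] & (1 - X[:, 1]))).astype(int)
    return X, y

X_train, y_train = make_data(200)
X_test, y_test = make_data(2000)

def emp_err(h, X, y):
    """Empirical error err(h, D): fraction of examples with h(x) != y."""
    return float(np.mean(h(X) != y))

# A deliberately 'simple' hypothesis class C: predict a single feature or its negation.
def single_feature_rule(j, s):
    return lambda X: (X[:, j] == s).astype(int)

C = [single_feature_rule(j, s) for j in range(5) for s in (0, 1)]

# Empirical error minimization: h* = argmin_{h in C} err(h, D).
h_star = min(C, key=lambda h: emp_err(h, X_train, y_train))
print("training error:", emp_err(h_star, X_train, y_train))
print("test error:    ", emp_err(h_star, X_test, y_test))
```

Here $C$ contains only 10 rules, so $\max_{h \in C}|err(h) - err(h, D)|$ is small and the training error is a reasonably reliable estimate of the true error.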
Why might machine learning be "unfair"?
Many reasons:
- Data might encode existing biases. E.g. labels are not "Committed a crime?" but "Was arrested."
- Data collection feedback loops. E.g. we only observe "Paid back loan?" if the loan was granted.
- Different populations with different properties. E.g. "SAT score" might correlate with the label differently in populations that employ SAT tutors.
- Less data (by definition) about minority populations.
Why might machine learning be "unfair"? First, consider different populations with different properties.
[Figure: scatter plots of Population 1 and Population 2, plotted by SAT Score vs. Number of Credit Cards; the two populations have different feature distributions.]
Upshot: To be "fair", the algorithm may need to explicitly take into account group membership. Else, optimizing accuracy fits the majority population.
Why might machine learning be "unfair"? Next, less data (by definition) about minority populations.
[Figure: SAT Score vs. Number of Credit Cards scatter plot of Populations 1 and 2, with relatively few points from Population 2.]
Upshot: Algorithms trained on minority populations will be less accurate. Qualified individuals will be denied at a higher rate.
Why might machine learning be "unfair"? Next, data collection feedback loops.
Toy Model
Two kinds of loan applicants.
Type 1: pays back a loan with unknown probability $p_1$.
Type 2: pays back a loan with unknown probability $p_2$.
Initially, the bank believes $p_1$ and $p_2$ are uniform in $[0,1]$.
Every day, the bank makes a loan to the type most likely to pay it back according to its posterior distribution on $p_1, p_2$.
The bank observes whether the loan is repaid, but not counterfactuals. The bank then updates its posteriors.
Toy Model
Suppose the bank has made $n_i$ loans to type $i$, of which $r_i$ have been paid back and $d_i$ have defaulted ($n_i = r_i + d_i$). The expected payoff of the next loan to type $i$ (the posterior mean) is $\frac{r_i + 1}{n_i + 2}$.

Worked example with true repayment probabilities $p_1 = p_2 = \frac{1}{2}$, where the very first loan goes to type 2 and is not repaid. From then on, type 1 looks better every day and receives every loan:

Loans made | $r_1$ | $d_1$ | $r_2$ | $d_2$ | Expected payoff, type 1 | Expected payoff, type 2
0 | 0 | 0 | 0 | 0 | 0.50 | 0.50
1 | 0 | 0 | 0 | 1 | 0.50 | 0.33
2 | 1 | 0 | 0 | 1 | 0.66 | 0.33
3 | 2 | 0 | 0 | 1 | 0.75 | 0.33
4 | 2 | 1 | 0 | 1 | 0.60 | 0.33
5 | 2 | 2 | 0 | 1 | 0.50 | 0.33
6 | 3 | 2 | 0 | 1 | 0.57 | 0.33
7 | 4 | 2 | 0 | 1 | 0.625 | 0.33
8 | 4 | 3 | 0 | 1 | 0.56 | 0.33
Toy Model
Upshot: Algorithms making myopically optimal decisions may forever discriminate against an equally qualified population because of an unlucky prefix: once $n_1 = x$ loans have gone to type 1 with roughly $x/2$ repaid and $x/2$ defaulted, type 1's posterior mean stays near $0.5$ while type 2's remains $0.33$, so type 2 never receives another loan.
Less myopic algorithms balance exploration and exploitation, but this has its own unfairness issues.
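A minimal simulation sketch of this toy model (my own, not from the talk): both types are equally qualified, the bank lends greedily by posterior mean, and we count how often one type ends up effectively locked out:

```python
import random

def simulate(p_true=0.5, rounds=1000, seed=None):
    """Myopic bank: each day, lend to the type with the higher posterior mean.

    With a uniform prior, the posterior mean for type i after r_i repayments
    and d_i defaults is (r_i + 1) / (r_i + d_i + 2).
    """
    rng = random.Random(seed)
    r, d = [0, 0], [0, 0]          # repayments / defaults per type
    loans = [0, 0]
    for _ in range(rounds):
        means = [(r[i] + 1) / (r[i] + d[i] + 2) for i in range(2)]
        if means[0] == means[1]:
            i = rng.randrange(2)   # break ties at random
        else:
            i = 0 if means[0] > means[1] else 1
        loans[i] += 1
        if rng.random() < p_true:  # both types are equally qualified
            r[i] += 1
        else:
            d[i] += 1
    return loans

# Count runs in which one (equally qualified) type is almost completely shut out.
trials, shutouts = 500, 0
for s in range(trials):
    if min(simulate(seed=s)) < 10:
        shutouts += 1
print(f"{shutouts} of {trials} runs gave one type fewer than 10 of 1000 loans")
```

Even though the two types are statistically identical, a noticeable fraction of runs leave one of them with almost no loans, purely because of an unlucky prefix.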
So what can we do? A solution in a stylized model…
Joint work with: Matthew Joseph, Michael Kearns, Jamie Morgenstern, Seth Neel.
A Richer Model
The Contextual Bandit Problem
$k$ populations $i \in \{1, \ldots, k\}$, each with an unknown function $f_i : \mathbb{R}^d \to [0,1]$, e.g. $f_i(x) = \langle \beta_i, x \rangle$.
In rounds $t \in 1, \ldots, T$:
- A member of each population $i$ arrives with an application $x_i^t \in \mathbb{R}^d$.
- The bank chooses an individual $i_t$ and obtains reward $r_{i_t}^t$ such that $\mathbb{E}[r_{i_t}^t] = f_{i_t}(x_{i_t}^t)$.
- The bank does not observe the reward of individuals it did not choose.
The Contextual Bandit Problem
Measure performance of the algorithm via regret:
$\text{Regret} = \frac{1}{T}\sum_t \left[\max_i f_i(x_i^t) - f_{i_t}(x_{i_t}^t)\right]$
"UCB"-style algorithms can get regret $\tilde{O}(\sqrt{dk/T})$.
A Fairness Definition
"Fair Equality of Opportunity": "...assuming there is a distribution of natural assets, those who are at the same level of talent and ability, and have the same willingness to use them, should have the same prospects of success regardless of their initial place in the social system." (John Rawls)
A Fairness Definition
"Fair Equality of Opportunity": The algorithm should never preferentially favor (i.e. choose with higher probability) a less qualified individual over a more qualified individual.
A Fairness Definition
"Fair Equality of Opportunity": For every round $t$,
$\Pr[\text{Alg chooses } i] > \Pr[\text{Alg chooses } j]$ only if $f_i(x_i^t) > f_j(x_j^t)$.
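One way to make this concrete (my own hypothetical helper, not from the paper) is a check that a round's selection probabilities never favor a less qualified individual:

```python
def satisfies_fairness(select_prob, quality, tol=1e-12):
    """Check the round-t constraint: Pr[choose i] > Pr[choose j] is allowed
    only if f_i(x_i^t) > f_j(x_j^t).

    select_prob[i]: probability the algorithm chooses individual i this round.
    quality[i]:     true expected reward f_i(x_i^t) of individual i.
    """
    k = len(select_prob)
    for i in range(k):
        for j in range(k):
            if select_prob[i] > select_prob[j] + tol and not quality[i] > quality[j]:
                return False
    return True

# Favoring the strictly better applicant is fine...
assert satisfies_fairness([0.7, 0.3], [0.9, 0.4])
# ...but favoring a weaker or equally qualified applicant is not.
assert not satisfies_fairness([0.3, 0.7], [0.9, 0.4])
assert not satisfies_fairness([0.7, 0.3], [0.5, 0.5])
```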
Is Fairness in Conflict with Profit?
Note: the optimal policy is fair! Every day, it selects $\arg\max_i f_i(x_i^t)$, and the constraint only prohibits favoring worse applicants. So maybe the goals of the learner are already aligned with this constraint?
Main takeaway: Wrong. Fairness is still an obstruction to learning the optimal policy. Even for $d = 0$, it requires $\Omega(k^3)$ rounds to obtain non-trivial regret.
FairUCB
[Figure: each applicant's estimated reward with a confidence interval $[\hat{f}_i - w_i, \hat{f}_i + w_i]$. The "standard" choice [Auer et al. '02] is to play the arm with the highest upper confidence bound. But if two intervals overlap, we don't know which is better, so FairUCB plays the overlapping arms with equal probability.]
"Standard Approach" Biases Towards Uncertainty
Example with 2 features: $x_1$ is "level of education", $x_2$ is "number of top hats owned".
For Group 1: $\beta = (1, 0)$. 90% of applications: random with $x_1 = x_2$ (perfectly correlated); 10%: random ($x_1$, $x_2$ independent).
For Group 2: $\beta = (0.5, 0.5)$. All applications uniform.
"Standard Approach" Biases Towards Uncertainty
"Unfair rounds": rounds $t$ in which the algorithm selects a sub-optimal applicant.
Discrimination index for an individual: the probability of victimization, conditioned on being present at an unfair round.
Chaining
[Figure: overlapping confidence intervals are linked into a "chain"; every arm chained to the top arm is treated as potentially optimal.]
FairUCB Pseudo-Code
For $t = 1$ to $\infty$:
1. Using OLS estimators $\hat{\beta}_i$, construct reward estimates $\hat{f}_i^t = \langle \hat{\beta}_i, x_i^t \rangle$ and widths $w_i$ such that, with high probability, $f_i(x_i^t) \in [\hat{f}_i^t - w_i, \hat{f}_i^t + w_i]$.
2. Compute $i^* = \arg\max_i (\hat{f}_i^t + w_i)$.
3. Select an individual $i_t$ uniformly at random from among the set of individuals chained to $i^*$.
4. Observe reward $r_{i_t}^t$ and update $\hat{\beta}_{i_t}$.
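Below is a minimal Python sketch of the chaining idea in the pseudo-code above, run against a synthetic linear environment. The environment, the ridge-regularized OLS update, and the heuristic confidence-width formula are my own simplifying assumptions, not the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(1)
k, d, T = 4, 3, 2000
beta = rng.uniform(0, 1, size=(k, d))          # unknown true coefficients per group
beta /= beta.sum(axis=1, keepdims=True)        # keep expected rewards in [0, 1]

# Per-group regression state (ridge-regularized so A is always invertible).
A = [np.eye(d) for _ in range(k)]              # X^T X + I
b = [np.zeros(d) for _ in range(k)]            # X^T r

for t in range(1, T + 1):
    x = rng.uniform(0, 1, size=(k, d))         # one applicant per group this round
    est, width = np.zeros(k), np.zeros(k)
    for i in range(k):
        A_inv = np.linalg.inv(A[i])
        beta_hat = A_inv @ b[i]
        est[i] = x[i] @ beta_hat
        # Heuristic confidence width (stand-in for the paper's exact bound).
        width[i] = 2.0 * np.sqrt(x[i] @ A_inv @ x[i])
    lo, hi = est - width, est + width

    # Chain intervals: start from the arm with the highest upper bound and
    # greedily add any arm whose interval overlaps the chain built so far.
    order = np.argsort(-hi)
    chained, floor = [order[0]], lo[order[0]]
    for i in order[1:]:
        if hi[i] >= floor:                     # interval overlaps the chain
            chained.append(i)
            floor = min(floor, lo[i])
        else:
            break

    i_t = rng.choice(chained)                  # play uniformly within the chain
    reward = x[i_t] @ beta[i_t] + rng.normal(0, 0.05)
    A[i_t] += np.outer(x[i_t], x[i_t])
    b[i_t] += reward * x[i_t]
```

Standard UCB would deterministically play $\arg\max_i(\hat{f}_i^t + w_i)$; the only change here is to randomize uniformly over everything chained to that arm, so no applicant is favored over one it cannot yet be distinguished from.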
FairUCB's Guarantees:
Regret is $\tilde{O}(\sqrt{d k^3 / T})$: worse than without fairness, but only by a factor of roughly $k$.
When the $f_i$ are linear, fairness has a cost, but only a mild one. What about when the $f_i$ are non-linear?
Can quickly, fairly learn ⟺ can quickly find tight confidence intervals.
The "cost of fairness" can sometimes be exponential in $d$.
Discrimination Index
[Figure: discrimination index of Standard UCB vs. Fair UCB.]
Performance
[Figure: performance comparison of Standard UCB and Fair UCB.]
Discussion
Even without a taste for discrimination "baked in", natural (optimal) learning procedures can disproportionately benefit certain populations.
Imposing "fairness" constraints is not without cost: we must quantify the tradeoffs between fairness and other desiderata.
There is no Pareto-dominant solution, so how should we choose between different points on the tradeoff curve?
None of these points is unique to "algorithmic" decision making; algorithms just allow formal analysis to bring these universal issues to the fore.
Fairness Definition
Where does the cost of fairness come from?
A distribution over instances: $k$ Bernoulli arms, where the only uncertainty is their means. Arm $i$ has mean $\mu_i$, which is either $b_i$ or $b'_i$ (see the figure below).
Lower bound particulars
[Figure sequence: the candidate means $b'_k, b_k, b'_{k-1}, b_{k-1}, \ldots, b'_2, b_2, b'_1, b_1$ (labeled $p'_i, p_i$ in the final slide) plotted on a line; for each adjacent overlapping pair, the algorithm must play the two arms with equal probability until it is confident their means differ.]
So, there's a separation between fair and unfair learning.
Without fairness one can achieve regret $\tilde{O}(\sqrt{k/T})$, but this lower bound implies that any fair algorithm needs $\Omega(k^3)$ rounds before it achieves non-trivial regret.