Proper Scoring Rules conitzer@cs.duke.edu.

Slides:



Advertisements
Similar presentations
Copyright © Cengage Learning. All rights reserved. 7 Probability.
Advertisements

Chain Rules for Entropy
Sec 15.6 Directional Derivatives and the Gradient Vector
Scoring Rules, Generalized Entropy and Utility Maximization Victor Richmond R. Jose Robert F. Nau Robert L. Winkler The Fuqua School of Business Duke University.
CPS Proper scoring rules Vincent Conitzer
Optimal False-Name-Proof Voting Rules with Costly Voting Liad WagmanVincent Conitzer Duke University Malvika Rao CS 286r Class Presentation Harvard University.
Computing Shapley values, manipulating value division schemes, and checking core membership in multi-issue domains Vincent Conitzer, Tuomas Sandholm Computer.
1 © 2015 Pearson Education, Inc. Consumer Decision Making In our study of consumers so far, we have looked at what they do, but not why they do what they.
The Cauchy–Riemann (CR) Equations. Introduction The Cauchy–Riemann (CR) equations is one of the most fundamental in complex function analysis. This provides.
4.2 The Mean Value Theorem.
Chapter 2 Sets and Functions.
Market Scoring Rules
Bayesian games and their use in auctions
Computational problems, algorithms, runtime, hardness
Copyright © 2009 Pearson Education, Inc.
Copyright © Cengage Learning. All rights reserved.
Theory of Capital Markets
Psychology Unit Research Methods - Statistics
Sampling Distributions
5.3 The Fundamental Theorem of Calculus
Chapter 32 Exchange.
Comp/Math 553: Algorithmic Game Theory Lecture 09
Wednesday, october 25th “Consider the postage stamp:  its usefulness consists in the ability to stick to one thing till it gets there.”
Computational Optimization
Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal Lecture 8 Justin Kern October 10 and 12, 2017.
From last time: on-policy vs off-policy Take an action Observe a reward Choose the next action Learn (using chosen action) Take the next action Off-policy.
Vincent Conitzer CPS Repeated games Vincent Conitzer
CPS Cooperative/coalitional game theory
Sampling Distributions
Week 10 Chapter 16. Confidence Intervals for Proportions
Overview and Basics of Hypothesis Testing
Strategic Information Transmission
2.2 Dealing with Errors LEARNING GOAL
Introduction to Summary Statistics
Introduction to Summary Statistics
Copyright © Cengage Learning. All rights reserved.
Inferential Statistics
6.1 – Integration by Parts; Integral Tables
Copyright © Cengage Learning. All rights reserved.
CONDITIONAL PROBABILITY
Probability, Part I.
Vincent Conitzer CPS 173 Mechanism design Vincent Conitzer
Copyright © Cengage Learning. All rights reserved.
WARM – UP A two sample t-test analyzing if there was a significant difference between the cholesterol level of men on a NEW medication vs. the traditional.
Sampling Distributions
Sampling Distributions
COUNTING AND PROBABILITY
CPS 173 Computational problems, algorithms, runtime, hardness
Computing and Statistical Data Analysis / Stat 7
The Fundamental Theorem of Calculus
Copyright © Cengage Learning. All rights reserved.
Derivatives of Logarithmic Functions
Vincent Conitzer Repeated games Vincent Conitzer
Power Section 9.7.
Derivatives in Action Chapter 7.
Back to Cone Motivation: From the proof of Affine Minkowski, we can see that if we know generators of a polyhedral cone, they can be used to describe.
Copyright © Cengage Learning. All rights reserved.
Recurrence Probabilites of Weather Events What does this "40 percent" mean? ...will it rain 40 percent of of the time? ...will it rain.
Copyright © Cengage Learning. All rights reserved.
CHAPTER – 1.2 UNCERTAINTIES IN MEASUREMENTS.
Sampling Distributions
Copyright © Cengage Learning. All rights reserved.
Copyright © Cengage Learning. All rights reserved.
Statistics and Probability-Part 5
From Randomness to Probability
Copyright © Cengage Learning. All rights reserved.
L’Hôpital’s Rule Chapter 9.2.
Vincent Conitzer CPS Repeated games Vincent Conitzer
CHAPTER – 1.2 UNCERTAINTIES IN MEASUREMENTS.
CPS Bayesian games and their use in auctions
Presentation transcript:

Proper Scoring Rules conitzer@cs.duke.edu

Probability forecasts rain (1,0,0) no rain rain 1 snow (0,1,0) no precipitation (0,0,1)

What makes a probability forecaster good? Calibration: in the long run, of all the times you forecasted x%, roughly x% should turn out “yes” (for all x) What’s an easy way to be calibrated? Sharpness: more extreme forecasts are preferred How to trade these off? “Hypermind forecast calibration over 2 years on 181 question[s] and 472 possible event outcomes. Every day at noon, the estimated probability of each outcome was recorded. Once all the questions are settled, we can compare, at each level of probability, the percentage of events predicted to occur and the percentage that actually occurred. The size of data points  indicates the number of forecasts recorded at each level of probability.” (https://blog.hypermind.com/2016/06/25/lessons-from-brexit/)

Definitions Let S(p, ω) denote the reward for outcome ω after reporting p If the outcomes are {0, 1}, just report p = p1 S is proper if for all p, p is in arg maxp’ Eω~p S(p’, ω) S is strictly proper if for all p, {p} = arg maxp’ Eω~p S(p’, ω)

A scoring rule Let S(p, ω) = pω. Is this proper?

Some example scoring rules (proofs that they are strictly proper on the board) Quadratic: S(p, 1) = 2p - p2 - (1-p)2, s(p, 0) = 2(1-p) - p2 - (1-p)2 Generally: S(p, ω) = 2pω - Σω’ (pω’)2 Logarithmic: S(p, 1) = ln(p), s(p, 0) = ln(1-p) Generally: S(p, ω) = ln(pω) Vince’s crazy proper scoring rule: S(p, 1) = (2-p)ep S(p, 0) = (1-p)ep (Do you want to make your own?) What’s nice / not nice about some of these rules?

Can we go beyond individual examples? Can we come up with a way of generating more proper scoring rules? How would we ever know we have found the optimal proper scoring rule (according to some metric)? If we have constraints, how do we know that a proper scoring rule exists that satisfies them?

Expected value of reporting truthfully Let G(p) denote your expected reward if you believe and report p G(p)=pS(p,1) +(1-p)S(p,0) S(p, 1) S(p, 0) Expected reward for reporting p as a function of true probability p

Proper scoring rules must have convex G Let q = p+ε and r = p-ε. Consider the following manipulation. Sometimes, when you believe q, report p. Similarly (equally often, say also at rate α), when you believe r, report p For all these misreports, the actual probability is (α(p+ε)+α(p-ε))/2α = p So these misreports on average give you G(p) Reporting truthfully, you would have received on average (G(q)+G(r))/2 So for the rule to be proper, need G(p) ≤ (G(q)+G(r))/2 (strictly proper: <) Interpretation: Destroying information should not help you! Sharpness is valued Can generalize this argument to conclude that G(p) must be (strictly) convex for a (strictly) proper scoring rule (on the board)

Convexity of G… Let G(p) denote your expected reward if you believe and report p S(q, 1) S(r, 0) S(p, 1) S(p, 0) S(r, 1) S(q, 0) r p q

Conversely: for any (strictly) convex G there is a (strictly) proper scoring rule Just use tangent lines! S(r, 0) S(p, 1) S(p, 0) S(r, 1) r p

Slight caveat S(p, 1) S(p, 0) p Some convex functions may not have a well defined derivative everywhere … but any subtangent line will do S(p, 1) S(p, 0) p

General characterization [Savage 1971, Gneiting and Raftery 2007] Theorem. A scoring rule is (strictly) proper if and only if there exists a (strictly) convex function G such that S(p, ω) = G(p) + G*(p) · (eω - p) where the vector G*(p) is a subgradient of G at p, that is, for all r, G*(p) · (r - p) ≤ G(r) - G(p)

Principal-aligned proper scoring rules [Shi, Conitzer, Guo 2009] Spending millions of dollars on some kind of fantasy league terror game is absurd and, frankly, ought to make every American angry. –Senators Wyden, Dorgan 2003 Suppose we are worried about the forecaster taking undesirable actions to affect the outcome Asking a developer to predict when the product will be ready Asking someone capable of committing terrorist acts whether there will be a terrorist act Let uω denote the principal’s utility for outcome ω. We would like the proper scoring rule to be aligned with the principal’s utility, i.e., not create incentives to reduce it Theorem. A proper scoring rule is aligned with principal utility function u if and only if it corresponds to G(p) = g(p · u) where g is convex and non-decreasing. What examples of such a function have we seen? Proof – can you figure it out?