PICKING THE RIGHT DATES AND QUANTITIES Continuous Estimation.



Forecasting isn’t just about saying yes or no. We also care about timing and amounts!
• By what date will Bashar al-Assad vacate the office of president of Syria?
• How many Japanese nuclear reactors will be operational by June 1, 2014? (This question appeared last year.)

Thinking about Uncertainty
• You can’t definitively know the answers in advance, but some quantities and dates are more likely than others.
• Depending on what you know or learn about an event, you can weight one option more than another.
• The more you know, the more you can assign higher probabilities to a few options; the less you know, the broader the range of possible outcomes should be.
[Figure labels: Know More, Know Less]

Dealing with Uncertainty
People deal with uncertainty all the time:
• How long will my commute be?
• How much will my vacation cost?
• What rate of return can I expect from my portfolio?
The goals of this training are to:
1. Introduce you to the continuous estimation (CE) format of forecasting problems.
2. Teach you to express uncertainty in quantitative terms and think about uncertainty in politics.
3. Explain how your forecasts will be scored.

Question Format
Every forecasting problem will be posed in the form of non-overlapping intervals that cover all possibilities.
Example: How many Japanese nuclear reactors will be operational as of June 1, 2014?
• 0 reactors
• 1 to 4 reactors
• 5 to 9 reactors
• 10 or more reactors

Answer Format
Example question: How many Japanese nuclear reactors will be operational as of June 1, 2014?
You have a total of 100 tokens to assign across all bins. Each token represents 1% of your belief that the “bin” is the right one.
1. Put tokens into each bin until you’ve exhausted your token supply. You can assign as many as 100 tokens (or as few as 0) to a bin by clicking on the line associated with that bin.
2. If you want to reset the number of tokens for a particular bin, click on the trash can icon.
3. When you are done, click “save”.
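The token rule above (100 tokens, each worth 1% of belief) can be sketched in a few lines of Python. This is only an illustration: the function name `tokens_to_probabilities` and the example allocation are hypothetical, and the allocation was chosen to agree with the cumulative probabilities (10%, 40%, 45%) used in the scoring example later in this training.

```python
def tokens_to_probabilities(tokens):
    """Convert per-bin token counts into probabilities (1 token = 1%).

    `tokens` maps a bin label to the number of tokens placed in that bin.
    The function name and input shape are illustrative assumptions.
    """
    if any(count < 0 for count in tokens.values()):
        raise ValueError("token counts cannot be negative")
    total = sum(tokens.values())
    if total != 100:
        raise ValueError(f"expected 100 tokens in total, got {total}")
    return {label: count / 100 for label, count in tokens.items()}


# Hypothetical allocation for the reactor question.
allocation = {"0": 10, "1 to 4": 30, "5 to 9": 5, "10 or more": 55}
probabilities = tokens_to_probabilities(allocation)
print(probabilities)
```

Rejecting allocations that do not sum to 100 mirrors the instruction to keep assigning tokens until the supply is exhausted.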

Tips on forecasting dates and quantities
• Hunt for the right information: any clues as to what might happen and how similar events have resolved in the past.
• Think about plausible scenarios that can lead to outcomes within each of the listed intervals.
• Beware of overconfidence in some outcomes versus others. Why might your favored outcomes not occur?
• Update your forecasts as you become aware of new information.
CHAMPS KNOW, the next training module, will give you more detailed guidelines on making good forecasts.

How you’ll be scored
Our chosen metric is the Brier score:
• A Brier score is the squared deviation between probability judgments and reality.
• Brier scores range from 0 to 2 in this context. As in golf, lower scores reflect more accurate predictions!

Brier Scoring explained
• Brier scoring harshly penalizes extreme overconfidence. Because errors are squared, your Brier score gets worse quickly with wrong extreme predictions. It also penalizes under-confidence in the long run.
• For example, consider these three forecasts for the question: Will Nouri al-Maliki be Prime Minister of Iraq on 9/30/2014? (Assume the true outcome is ‘No’, coded as 0.)
Forecast #1: Probability ‘Yes’ = 50%. Brier Score = $2(0.5-0)^2 = 0.5$
Forecast #2: Probability ‘Yes’ = 100%. Brier Score = $2(1-0)^2 = 2$ (Worst possible score)
Forecast #3: Probability ‘Yes’ = 0%. Brier Score = $2(0-0)^2 = 0$ (Best possible score)
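The three forecasts above can be checked with a short Python sketch. The function name `brier_score` is an illustrative choice, not something defined in the training. The factor of 2 comes from summing the squared error over both outcomes (‘Yes’ and ‘No’), which for a two-outcome question equals twice the squared error on ‘Yes’.

```python
def brier_score(p_yes, outcome_is_yes):
    """Two-outcome Brier score on the 0-to-2 scale: 2 * (p - outcome)^2."""
    outcome = 1.0 if outcome_is_yes else 0.0
    return 2 * (p_yes - outcome) ** 2


# The true outcome in the example is 'No'.
forecast_scores = {p: brier_score(p, outcome_is_yes=False) for p in (0.5, 1.0, 0.0)}
for p, score in forecast_scores.items():
    print(f"P(Yes) = {p:.0%}: Brier score = {score}")
```

Note how moving from 50% to 100% on the wrong outcome quadruples the score, which is the harsh penalty for overconfidence described above.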

Scoring Forecast Intervals
To score CE questions, we essentially break each larger question down into mini questions aligned to each bin. We then average the Brier scores from the bins.
For example, the cumulative forecasted probability that 0 reactors will be operational is 10%, that there will be 4 or fewer operational reactors is 40%, and that there will be 9 or fewer operational reactors is 45%. The correct answer is 0.
We then score the options in the following way:

Cut-point              Outcome   Cumulative probability that the real value is at or below the cut-point   Brier score
0 reactors             TRUE      0.10                                                                      $2(0.10-1)^2 = 1.62$
4 or fewer reactors    TRUE      0.40                                                                      $2(0.40-1)^2 = 0.72$
9 or fewer reactors    TRUE      0.45                                                                      $2(0.45-1)^2 = 0.605$

Ordinal Brier Score (average) = $(1.62 + 0.72 + 0.605)/3 \approx 0.98$
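The cut-point procedure described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions: each cut-point is scored with the same $2(p - \text{outcome})^2$ form used for yes/no questions, the final cut-point (which always has cumulative probability 1) is left out, and the per-bin probabilities 0.10, 0.30, 0.05, and 0.55 are the ones implied by the cumulative values 10%, 40%, and 45%. The function name `ordinal_brier_score` is hypothetical.

```python
from itertools import accumulate


def ordinal_brier_score(bin_probabilities, true_bin_index):
    """Average the Brier scores of the 'at or below cut-point' mini questions.

    bin_probabilities: probabilities for the ordered bins (summing to 1).
    true_bin_index: position of the bin that contains the real value.
    """
    # Cumulative probability at each cut-point, dropping the last one (always 1).
    cumulative = list(accumulate(bin_probabilities))[:-1]
    scores = []
    for k, cum_p in enumerate(cumulative):
        # The mini question "is the value at or below cut-point k?" is TRUE
        # whenever the true bin is bin k or an earlier bin.
        outcome = 1.0 if true_bin_index <= k else 0.0
        scores.append(2 * (cum_p - outcome) ** 2)
    return cumulative, scores, sum(scores) / len(scores)


# Bins: 0 reactors, 1 to 4, 5 to 9, 10 or more. The correct answer is bin 0.
cumulative, scores, average = ordinal_brier_score([0.10, 0.30, 0.05, 0.55], 0)
print([round(c, 2) for c in cumulative])
print([round(s, 3) for s in scores])
print(round(average, 2))
```

Scoring cumulative probabilities rather than individual bins means a forecast that puts weight near the correct bin is penalized less than one that puts the same weight far away.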

Scores over Time
• We compute your score on each question daily.
• For the best possible score, you should update your forecasts regularly.
• If you don’t make a forecast on the first day a question is open, you will be assigned the median Brier score across all forecasters for each day, up to the day you make your first forecast.
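The daily scoring rule above can be sketched as follows. Two things here are assumptions rather than statements from the training: that a question's overall score is the plain average of its daily scores, and that the daily values shown are made-up numbers for illustration. The function name `question_score` is hypothetical.

```python
from statistics import mean


def question_score(own_daily_scores, median_daily_scores):
    """Average daily Brier score for one question.

    own_daily_scores: one entry per day, None for days before the first forecast.
    median_daily_scores: median Brier score across all forecasters for each day.
    Averaging over days is an assumption made for this sketch.
    """
    filled = [
        median if own is None else own
        for own, median in zip(own_daily_scores, median_daily_scores)
    ]
    return mean(filled)


# Hypothetical five-day question: the first forecast arrives on day 3.
own = [None, None, 0.50, 0.32, 0.32]
median = [0.80, 0.70, 0.60, 0.60, 0.50]
score = question_score(own, median)
print(round(score, 3))
```

Under this sketch, waiting to forecast neither helps nor hurts relative to the typical forecaster, while an early accurate forecast beats the median on those first days.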

Leaderboard
During the season, everyone will be listed by their screen name and rank-ordered by their mean Brier scores for all closed questions. Do well and find yourself at the top of the leaderboard!

Setting the outer limits
Before you begin forecasting on any question, you will give the highest and lowest values that you think are feasible for the date or variable. You will be asked for:
1. Lower bound: A value so low that there is only a 2% chance the actual outcome would be below it.
2. Upper bound: A value so high that there is only a 2% chance the actual outcome would be above it.

How Wide or Narrow Should the Range Be?
Most people make the same mistake when setting confidence intervals: they fail to make the range wide enough.
• When people estimate a 90% confidence interval, the truth often falls inside the range less than 50% of the time. You will have to work hard to counteract the natural tendency to set your range too narrowly.
• What might an extreme outcome look like? How could you potentially be surprised by the outcome?

How Wide or Narrow Should the Range Be? You could make the range infinitely wide to be certain the value will fall within it. But that would be uninformative and useless as a forecast. You could set the range very narrowly, but then it is unlikely to contain the correct answer. Think carefully, using the strategies outlined earlier, and strike a balance.

Response Format for Setting Ranges
You will be asked to provide your forecast bounds in a box such as that shown below. Think carefully and give bounds wide enough that the range between them has a 96% chance of containing the true value.

Scoring Ranges
In addition to using the Brier score to score probability forecasts, we will use a different scoring rule to score bounds. For that purpose, we will use the Quantile Scoring Rule:
$\text{Quantile score} = p\,(x - q) + (q - x)\,I$, where
• q is one of the bounds you set,
• x is the true value that occurs,
• p is the cumulative probability associated with q (either 2% or 98%), and
• I is a variable that equals 1 if x < q and 0 otherwise.

Scoring Ranges
For example:
• I set bounds of 20 and 400.
• The true answer is 60.
• LOWER BOUND: $0.02(60-20) + (20-60)\cdot 0 = 0.8$. Here I = 0 because 20 < 60.
• UPPER BOUND: $0.98(60-400) + (400-60)\cdot 1 = 6.8$. Here I = 1 because 400 > 60.
• My Quantile Score is the average of my scores for the two bounds: $(0.8 + 6.8)/2 = 3.8$.
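The worked example above can be reproduced with a small Python sketch of the Quantile Scoring Rule. The function name `quantile_score` is an illustrative choice, and the indicator is taken to be 1 when the true value falls below the bound, which is what the example's values of I imply.

```python
def quantile_score(q, x, p):
    """Quantile score p*(x - q) + (q - x)*I for one bound.

    q: the bound you set, x: the true value, p: cumulative probability for q.
    I is 1 when the true value is below the bound (x < q), otherwise 0.
    """
    indicator = 1 if x < q else 0
    return p * (x - q) + (q - x) * indicator


true_value = 60
lower_score = quantile_score(q=20, x=true_value, p=0.02)   # lower bound, 2%
upper_score = quantile_score(q=400, x=true_value, p=0.98)  # upper bound, 98%
range_score = (lower_score + upper_score) / 2

print(round(lower_score, 2), round(upper_score, 2), round(range_score, 2))
```

Rounded to two decimals, the three printed values match the 0.8, 6.8, and 3.8 computed by hand. The upper bound of 400 is scored worse than the lower bound of 20 because it sits much farther from the true value of 60.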

Scoring Ranges
Because we want to be able to compare your Brier score with the score on your quantile range, we will rescale your quantile scores from 0 to 2.
• The most accurate quantiles will receive a score of 2.
• The least accurate quantiles will receive a score of 0.
• Scores in between the best and the worst will be rescaled linearly onto the 0-2 scale.
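One way to picture the linear rescaling is the sketch below. The training does not say how the best and worst raw scores are determined, so this sketch assumes they are the lowest and highest raw quantile scores among the forecasters on a question, with a lower raw score meaning more accurate quantiles. The function name `rescale_quantile_scores` and the raw scores are hypothetical.

```python
def rescale_quantile_scores(raw_scores):
    """Map raw quantile scores linearly onto 0-2.

    Assumption for this sketch: the lowest raw score is the most accurate
    (mapped to 2) and the highest raw score is the least accurate (mapped to 0).
    """
    best, worst = min(raw_scores), max(raw_scores)
    if best == worst:
        # Everyone tied, so there is no spread to rescale; give all the top score.
        return [2.0 for _ in raw_scores]
    return [2 * (worst - score) / (worst - best) for score in raw_scores]


# Hypothetical raw quantile scores for three forecasters.
raw = [3.8, 10.0, 1.2]
rescaled = rescale_quantile_scores(raw)
print([round(value, 2) for value in rescaled])
```

The rescaling preserves the ordering and relative spacing of the raw scores, so it changes only the scale and not the ranking on the range leaderboard.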

Quantile Scoring Rule
• This Quantile Scoring Rule is designed to reward accuracy: you get the best score when you provide the most accurate quantiles.
• We calculate this score for both of your bounds and take the average as your score.
• While the range of this score varies depending on the values being estimated, we will scale the scores to be between 0 and 2, like the Brier score.
• In addition to the leaderboard for Brier scores, there will also be a leaderboard for range scores.
• With careful thought and good forecasting, you can be near the top of both leaderboards!