SPF workshop UBCO February 2014 CH1. What is what CH2. A simple SPF CH3. EDA CH4. Curve fitting CH5. A first SPF CH6: Which fit is fitter CH7: Choosing.

Presentation transcript:

CH1. What is what CH2. A simple SPF CH3. EDA CH4. Curve fitting CH5. A first SPF CH6: Which fit is fitter CH7: Choosing the objective function CH8: Theoretical stuff (skip) CH9: Adding variables CH10. Choosing a model equation
Workshop Objectives: a. Learn how to fit an SPF to data b. Understand what SPFs can and cannot do

What is what.
1. What are SPFs?
2. What information do (should) they give us?
3. What is that information used for?
Loosely speaking, SPFs are tools that give information about the safety of units such as road segments, intersections, ramps, grade crossings … What is this?

What is Safety? Here is a count of injury accidents for a freeway segment in Colorado. What is its SAFETY? Here is a (monthly) count of accidents for an intersection in Toronto. What is its SAFETY?
[Figures: segment of urban freeway in Denver; intersection in Toronto]

… “What is its safety?” implies that SAFETY is a property of UNITS. What is a ‘unit’? A unit can be a road segment, an intersection, Mr. C.J. Smith, heavy trucks on the 401, etc.

What is the safety of a unit?
[Figure: 1.9-mile-long segment of 6-lane urban freeway in Denver, Colorado]
Had I defined Safety = Accident Counts, that would mean that safety improved from 1986 to 1987, deteriorated from 1987 to 1988, etc. Such a definition is not useful for safety management because safety would change even when there is no change in safety-relevant traits (exposure, traffic control, physical features, user demography, etc.).

We need a definition of the safety of a unit such that, as long as the ‘safety-relevant’ traits of the unit do not change, its ‘safety’ does not change.
[Figures: three-period running averages, freeway segment, Colorado; thirteen-period running averages, intersection, Toronto]
One can rightly imagine that behind the fluctuations there is a gradually changing safety property that is some kind of average.
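The running averages mentioned on this slide can be sketched as follows. This is a minimal illustration, assuming a centered window (the slides do not say whether the window is centered or trailing); the counts are hypothetical and are not the Colorado or Toronto data.

```python
def running_average(counts, window):
    """Centered running average; positions without a full window are skipped."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    return [
        sum(counts[i - half:i + half + 1]) / window
        for i in range(half, len(counts) - half)
    ]

# Hypothetical yearly accident counts (illustration only).
yearly_counts = [12, 7, 15, 9, 11, 14, 8]
smoothed = running_average(yearly_counts, window=3)
print(smoothed)
```

The smoothed series fluctuates less than the raw counts, which is the sense in which it hints at an underlying, slowly changing property.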

There are three elements in the graph:
1. Observed values ●
2. The invisible (unknown) safety property μ
3. Our estimate of the unknown property ○
[Figure: thirteen-period running averages, intersection, Toronto; labels: Reality, Abstraction]

What is the ‘safety of a unit’? We are now ready. Definition: The safety property of a unit is the number of accidents, by type and severity, expected to occur on it in a specified period of time. It will always be denoted by μ and its estimate by μ̂.
[Table: accident type (rear-end, angle, single-vehicle, pedestrian) by accident severity (PDO, injury, fatal)]

We are gradually assembling the elements needed to say with clarity what an SPF is. Eventually it will be a function of ‘variables’. What is the link between safety and variables? The ‘safety’ of a unit depends on its ‘traits’.

Traits & Safety

S-R traits. Definition: A trait is ‘safety-related’ (s-r) if, when it changes, μ changes. Consequence: Units with the same s-r traits have the same μ. Corollary: Units that differ in some s-r traits differ in their μ’s.

Populations. Units that share some traits form a population of units. Example: (1) rural, (2) two-lane road segments in (3) flat terrain of (4) Colorado. Because only some traits are common, the units differ in many s-r traits and therefore differ in their μ’s. We will describe the safety of a population by the mean of the μ’s, E{μ}, and the standard deviation of the μ’s, σ{μ}.

Populations: real and imagined. Example: segments of rural two-lane roads in Colorado form a population. Their shared traits are: (1) State: Colorado, (2) Road Type: two-lane, (3) Setting: rural. Adding a trait defines a new population (a subset): (1) & (2) & (3) & (4) Terrain: flat.

The more traits, the fewer units. Colorado data:
Shared traits (1) State: Colorado & (2) Road Type: two-lane & (3) Setting: rural: 5323 segments.
Add 2.5 < Segment Length < 3.5 miles: 597 segments.
Add 1000 < AADT < 2000 vpd: 119 segments.
If the bin is 2400 < AADT < 2420 there are no units, even in the rich data. But the SPF will still provide an estimate of E{μ} for a population, albeit an ‘imagined’ one.

Finally: “What is an SPF?” A Safety Performance Function is a tool which, for a multitude of populations, provides estimates of:
1. The mean of the μ’s in populations, E{μ}, and
2. The standard deviation of the μ’s in these populations, σ{μ}.

Notational conventions to remember
μ: the expected number of crashes for a unit.
μ̂: estimate of μ. A caret above always means ‘estimate of ...’.
E{μ}: average of the μ’s in a population of units. E{.} always means ‘average or expected value of whatever the dot stands for’.
σ{μ}: standard deviation of the μ’s in a population of units. σ{.} always means ‘standard deviation of whatever the dot stands for’.

The information we get from an SPF is not about units; it is always about a population of units. When we use the SPF information to estimate the safety of a specific unit we argue as follows: “This unit has the same traits as the units in the population. Therefore my best guess of its μ is E{μ}.”

In interim summary: We needed to be clear about what an SPF is. To get there we had to say what we mean by ‘safety of a unit’ and that it depends on its safety-relevant traits. Further, we had to mention that units that share some safety-relevant traits form populations of units. The safety of a population of units can be described by E{μ} and σ{μ}. These are necessary for practical applications. An SPF provides estimates of E{μ} and σ{μ} for many populations.

What are the estimates of E{μ} and σ{μ} needed for? Two groups of applications:
Group I: We really need the E{μ}. Examples: (a) To judge what is deviant we have to know what is ‘normal’. (b) How different are the E{μ}’s of segments with and without (say) paved shoulders?
Group II: We really need the μ of a specific unit, and E{μ} helps us to estimate it. Examples: (a) Is this road segment a ‘blackspot’? (b) How did the μ of this unit change from ‘before’ treatment to ‘after’ treatment?

Group I: We need the E{μ} of a population. Questions: What is normal for a unit? How different are the means of two populations?
Group II: We need the μ of a unit. Questions: Is this unit a ‘blackspot’? What might be the safety benefit of treating it? What was the safety benefit of treating it?
To answer: [symbols lost in extraction]

Is there a Group III? Some believe that we want to know the function linking E{μ} and traits in order to be able to say how a change in the level of a trait will affect the E{μ} of units. Opinions differ on whether such a use of an SPF can be trusted. I do not think so, and will give my reasons in Session 5. I hope that by the end of the workshop there will be more CMF skeptics.

What are the estimates of E{μ} and σ{μ} used for? A sequence of simple illustrations. Go to ‘Spreadsheets to accompany PowerPoints.’ Open Spreadsheet #1 ‘Connecticut Drivers’ on the ‘1. Data’ workbook.
1. How many units are deviant?
2. How well will my screen work?
3. What will be the accident savings of a treatment?
4. How effective was the treatment?

Preliminaries: get estimates of E{μ} and σ{μ}. Data: Connecticut drivers.
[Table: crashes, k, against number of drivers, n(k); total = 29531]

Computing sample mean and variance. Open workbook ‘2. Mean and variance estimates’ (of #1).
[Spreadsheet columns: A = k; B = n(k); C = B/B$11; D = A*C; E = (A-D$11)^2*C]

Stay on workbook ‘2. Mean and variance estimates’ (of #1).
[Spreadsheet columns: A = k; B = n(k); C = B/B$11; D = A*C; E = (A-D$11)^2*C]
Estimate of V{μ}, σ̂{μ} = √0.26 = 0.51. Naturally σ{μ}>0. Even if we used age, gender and exposure as traits, there would still be differences.
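The spreadsheet columns above can be mirrored in a few lines of Python. This is only a sketch: the frequency table is a hypothetical toy example (the Connecticut cell values are not in this transcript), and the last step assumes that each driver's count is Poisson distributed around that driver's μ, so that the variance of the counts exceeds their mean by V{μ}. Whether the workbook uses exactly this step is an assumption.

```python
def frequency_moments(table):
    """Mean and variance of k from a {k: n(k)} frequency table.

    Mirrors the spreadsheet: C = n(k)/total, D = k*C, E = (k - mean)^2 * C.
    """
    total = sum(table.values())
    share = {k: n / total for k, n in table.items()}                # column C
    mean = sum(k * c for k, c in share.items())                     # sum of column D
    variance = sum((k - mean) ** 2 * c for k, c in share.items())   # sum of column E
    return mean, variance

# Hypothetical toy table, NOT the Connecticut data.
toy_table = {0: 7, 1: 2, 3: 1}
mean_k, var_k = frequency_moments(toy_table)

# Assumption: Poisson counts, so Var{k} = E{mu} + V{mu}.
var_mu = max(var_k - mean_k, 0.0)
sigma_mu = var_mu ** 0.5
print(mean_k, var_k, var_mu, sigma_mu)
```

Clipping at zero guards against a sample in which the variance happens to fall below the mean.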

Use estimates of E{μ} and σ{μ} for: Screening. Question: What % of these drivers have a μ that is, say, more than 5 times the mean (μ > 5*0.24 = 1.2 acc. in six years)? Open workbook ‘3. How many High mu drivers’ (of #1). GAMMADIST(μ, b, 1/a, TRUE)

Answer:
1. Assume that the μ’s are Gamma distributed.
2. Compute the parameters a and b of the Gamma distribution.
3. Use the Excel function GAMMADIST(μ, b, 1/a, TRUE).
4. P(μ<1.20) ≈ 0.99, so there are (≈ 29,531*0.01 =) 295 such (5×) drivers.
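The same calculation can be sketched outside Excel. The block below assumes a = 3.55 and b = 0.85, the values that appear in the EB example later in the deck, with b as the Gamma shape and 1/a as the scale (the argument order of GAMMADIST). The helper `gamma_cdf` is a hypothetical stand-in for GAMMADIST(..., TRUE), written with the standard power series for the regularized incomplete gamma function so that no external library is needed.

```python
import math

def gamma_cdf(x, shape, rate):
    """P(mu <= x) for a Gamma(shape, rate) variable (series expansion)."""
    if x <= 0:
        return 0.0
    z = rate * x
    term = 1.0 / shape
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= z / (shape + n)
        total += term
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))

a, b = 3.55, 0.85        # assumed Gamma rate and shape, taken from the EB slide
n_drivers = 29531
threshold = 1.2          # 5 times the mean of 0.24

share_high = 1.0 - gamma_cdf(threshold, shape=b, rate=a)
count_high = n_drivers * share_high
print(f"P(mu > {threshold}) = {share_high:.4f}, about {count_high:.0f} drivers")
```

Under these assumptions the share comes out at roughly 1%, in line with the slide's ≈ 295 drivers.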

Use estimates of E{μ} and σ{μ} for: Screen Performance. Question: If we decide to ‘treat’ those 51 (out of 29,531) who had 4 or more accidents, how will such a screen do?
[Table: Connecticut drivers; crashes, k, against number of drivers, n(k); total = 29531]

To answer we have to determine how many of those drivers with 4, 5, 6 or 7 crashes are truly ‘high μ’. If in a population of units μ is Gamma distributed, then the μ’s of those units with k crashes are also Gamma distributed, with parameters k+b and a+1 (the EB result, whose mean is (k+b)/(a+1)). Open workbook ‘4. Gamma with k=4, 5, 6, 7’ (of #1).
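A short derivation may help show why the μ’s of units with k crashes stay Gamma distributed. It assumes, as is usual in this setting but not stated on the slide, that the count of a unit with a given μ is Poisson distributed, and that b is the Gamma shape and a the rate.

```latex
\begin{align*}
f(\mu) &= \frac{a^{b}}{\Gamma(b)}\,\mu^{b-1}e^{-a\mu}
  && \text{(Gamma over the population: shape } b,\ \text{rate } a\text{)}\\
P(K=k\mid\mu) &= \frac{\mu^{k}e^{-\mu}}{k!}
  && \text{(Poisson count, an assumption)}\\
f(\mu\mid k) &\propto \mu^{k+b-1}e^{-(a+1)\mu}
  && \text{(Gamma again: shape } k+b,\ \text{rate } a+1\text{)}\\
E\{\mu\mid k\} &= \frac{k+b}{a+1}
\end{align*}
```

The last line is the expression (k+b)/(a+1) used in the ‘Anticipating benefit’ illustration.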

Modify the formula in B7 and copy down. First answer: Amongst those who recorded 4 crashes, 66% have μ<1.2. Do the same for k=5, 6, and 7. Record.

Use estimates of E{μ} and σ{μ} for: Screen Performance.
[Table columns: k; n(k); P(μ≤1.2); False Positives; Correct Positives; with column sums]
Answer: Of the 295 drivers with μ>1.2, 21 are correctly identified (caught) and the remaining 274 are missed; another 30 drivers are incorrectly identified (false positives).
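The screen-performance table can be approximated as follows. This sketch assumes a = 3.55 and b = 0.85 (from the EB example), takes the driver counts for k = 4 to 7 from the ‘Before’ arithmetic later in the deck (33, 14, 3, 1), and reuses a hypothetical `gamma_cdf` helper in place of Excel's GAMMADIST.

```python
import math

def gamma_cdf(x, shape, rate):
    """P(mu <= x) for a Gamma(shape, rate) variable (series expansion)."""
    if x <= 0:
        return 0.0
    z = rate * x
    term = 1.0 / shape
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= z / (shape + n)
        total += term
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))

a, b = 3.55, 0.85                      # assumed Gamma rate and shape
drivers = {4: 33, 5: 14, 6: 3, 7: 1}   # n(k) for the 51 screened drivers
threshold = 1.2

false_positives = 0.0
correct_positives = 0.0
share_below = {}
for k, n_k in drivers.items():
    # Drivers with k crashes: Gamma with shape k+b and rate a+1.
    p_low = gamma_cdf(threshold, shape=k + b, rate=a + 1)
    share_below[k] = p_low
    false_positives += n_k * p_low          # flagged, but mu <= 1.2
    correct_positives += n_k * (1 - p_low)  # flagged, and mu > 1.2

print(f"k=4: P(mu<=1.2) = {share_below[4]:.2f}")
print(f"false positives = {false_positives:.1f}, correct = {correct_positives:.1f}")
```

With these inputs the totals should land close to the slide's 30 false and 21 correct positives.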

Use estimates of E{μ} and σ{μ} for: Anticipating benefit. Preliminaries:
CMF ≡ (expected accidents ‘with’) / (expected accidents ‘without’)
Reduction in accidents = (expected accidents ‘without’) × (1 − CMF)
Question: How many accidents will be saved if a treatment with CMF=0.95 is administered to Connecticut drivers with k≥4?

Open workbook ‘5. Anticipating benefit’ (of #1). Recall that the EB estimate of μ for a driver with k crashes is (k+b)/(a+1). Thus, e.g., for k=4, (4+0.85)/(3.55+1) = 1.07 crashes.
[Table columns: k; n(k); (k+b)/(a+1); n(k)*(k+b)/(a+1)]
Expected reduction = 59.4×(1−0.95) = 2.97 acc. in six years.
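The arithmetic of this slide can be reproduced with a short sketch. The values a = 3.55, b = 0.85 and CMF = 0.95 come from the slides; the driver counts for k = 4 to 7 are those used in the ‘Before’ calculation later in the deck. Variable names are illustrative.

```python
a, b = 3.55, 0.85                      # Gamma parameters quoted on the slide
cmf = 0.95                             # crash modification factor of the treatment
drivers = {4: 33, 5: 14, 6: 3, 7: 1}   # n(k) for drivers with k >= 4

def eb_estimate(k):
    """EB estimate of mu for a driver who recorded k crashes."""
    return (k + b) / (a + 1)

# Expected crashes without treatment, summed over the screened drivers.
expected_without = sum(n_k * eb_estimate(k) for k, n_k in drivers.items())
expected_reduction = expected_without * (1 - cmf)

print(f"EB estimate for k=4: {eb_estimate(4):.2f}")
print(f"expected without treatment: {expected_without:.1f}")
print(f"expected reduction: {expected_reduction:.2f}")
```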

Use estimates of E{μ} and σ{μ} for: Research about CMF. The 51 drivers with k≥4 received some treatment. Question: If the treatment had no effect, and nothing else changed, how many crashes are they expected to have in a 6-year ‘after treatment’ period? Just as before:
[Table columns: k; n(k); (k+b)/(a+1); n(k)*(k+b)/(a+1)]

[Table columns: k; n(k); (k+b)/(a+1); n(k)*(k+b)/(a+1)]
Before: 4*33+5*14+6*3+7*1 = 227 crashes in six years. If ineffective, expected after = 59 crashes in six years. Difference: 227 − 59 = 168. How come drivers with 227 accidents are expected to have only 59.4? Regression to the mean!
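Regression to the mean can also be seen in a small simulation. The sketch below is an illustration under stated assumptions, not the workshop's spreadsheet: each simulated driver gets a μ drawn from a Gamma distribution with shape b = 0.85 and rate a = 3.55, crash counts in two equal periods are Poisson with that same μ, and nothing changes between periods. The function names, sample size and seed are arbitrary choices.

```python
import math
import random

A, B = 3.55, 0.85   # assumed Gamma rate and shape, from the EB example

def poisson_draw(rng, mean):
    """Knuth's multiplication method; adequate for the small means here."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate(n_drivers=200_000, threshold=4, seed=1):
    """Select drivers with >= threshold crashes 'before'; compare with 'after'."""
    rng = random.Random(seed)
    selected = before_total = after_total = 0
    for _ in range(n_drivers):
        mu = rng.gammavariate(B, 1.0 / A)   # second argument is the scale
        before = poisson_draw(rng, mu)
        after = poisson_draw(rng, mu)       # same mu: no treatment effect
        if before >= threshold:
            selected += 1
            before_total += before
            after_total += after
    return selected, before_total / selected, after_total / selected

selected, before_mean, after_mean = simulate()
print(f"{selected} drivers selected; mean before = {before_mean:.2f}, "
      f"mean after = {after_mean:.2f}")
```

Although nothing was done to the selected drivers, their average count in the second period falls far below their first-period average, roughly in line with the 59.4/51 ≈ 1.2 crashes per driver implied by the slide.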

Summary of illustrations: We used estimates of E{μ} and VAR{μ} to: estimate how many deviant units are in a population; estimate how many deviants are in subpopulations of units with many crashes (correct and false positives and negatives); estimate how many crashes will be saved by a treatment and how many to expect after an ineffective treatment.

Two perspectives on SPF. E{μ} and σ{μ} = f(Traits, parameters). The two perspectives: applications-centered, and cause-and-effect-centered. The perspective determines how modeling is done.

E{μ} and σ{μ} = f(Traits, parameters). Applications-centered perspective. Here the question is: “How to do modeling to get good estimates of E{μ} and σ{μ}?” The perspective determines how modeling is done.

E{μ} and σ{μ} = f(Traits, parameters). Cause-and-effect-centered perspective. Here the question is: “How to do modeling to get the right ‘f’ and parameters, so that I can compute the change in E{μ} caused by a change in a trait?” The perspective determines how modeling is done.

Summary of 1.
1. We defined ‘safety’;
2. The safety of a unit is determined by its s-r traits;
3. Units that share some traits form a population;
4. The safety of a population is described by E{μ} and σ{μ};
5. The SPF is ... A Safety Performance Function is a tool which, for a multitude of populations, provides estimates of: (1) the mean of the μ’s in populations, E{μ}, and its accuracy; (2) the standard deviation of the μ’s in these populations, σ{μ}.