Simulating Sports: The Inputs and the Engines Paul Bessire General Manager, Co-Founder PredictionMachine.com September 29, 2010.



Table of Contents
– Intro
– PredictionMachine.com & Simulation Overview
– Simulating Baseball
– Plate Appearance Decision Tree
– Examples (more in second presentation)

Introduction
– 2004: University of Cincinnati, BBA, Finance and QA
– 2005: MSQA, Master’s Project (with Dr. Fry): Measuring Individual and Team Effectiveness in the NBA Through Multivariate Regression
– 2004 – 2009: WhatIfSports.com/FOXSports.com, Director, Content and Quantitative Analysis
– 2010: Launched PredictionMachine.com in February

About PredictionMachine.com
“We play the game 50,000 times before it’s actually played.”
– Built by Paul Bessire to focus on content after six years at WhatIfSports.com/FOXSports
– February: Launched with Super Bowl Prediction (Indianapolis 28 – New Orleans 27)
– “Predictalator”: Simulation engine plays entire NFL season 50,000 times in 8 seconds
– March Madness, NBA Playoffs, MLB Daily, College Football, NFL
– Customizable Predictalator: Any teams, Any where, Any line
– Fantasy Football Projections
– Live simulator built to analyze in-game winning probabilities and value in coaching decisions

Sports Simulation
Play-by-play
– A “play” means something different for each sport
– Probabilities for every individual outcome
– Random number generation
– Pitch-by-pitch (or basketball/hockey pass-by-pass) not needed
– Account for every possible statistical interaction during a game
Can be recreated quickly
– 50,000+ games/second
– All data tracked
– Every outcome is different
– Boxscores
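The play-by-play idea above (a probability for every outcome, plus random number generation, repeated many times) can be sketched as follows. This is a minimal illustration, not the Predictalator's actual engine: the outcome names, the probabilities in `PLAY_PROBS`, and the helper names are all hypothetical.

```python
import random

def sample_outcome(probabilities, rng):
    """Draw one outcome from a {name: probability} mapping."""
    roll = rng.random()
    cumulative = 0.0
    for outcome, p in probabilities.items():
        cumulative += p
        if roll < cumulative:
            return outcome
    return outcome  # guard against floating-point shortfall in the sum

def simulate_many(probabilities, n_plays, seed=0):
    """Replay the same "play" n_plays times and track every result."""
    rng = random.Random(seed)
    counts = {name: 0 for name in probabilities}
    for _ in range(n_plays):
        counts[sample_outcome(probabilities, rng)] += 1
    return counts

# Hypothetical per-play probabilities (placeholders, not values from the slides).
PLAY_PROBS = {"out": 0.68, "walk": 0.08, "single": 0.16, "double": 0.05, "home_run": 0.03}

counts = simulate_many(PLAY_PROBS, 50_000)
print(counts)
```

Because every play is drawn independently, each replay differs, while the aggregate counts settle close to the input probabilities.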

Significant Stats
Pitchers
– HBP/BF
– BB/(BF – HBP)
– OAV
– 1B/Hit Allowed
– 2B/Hit Allowed
– 3B/Hit Allowed
– HR/Hit Allowed
– K/Out
– GO/FO
– BF
– Pitches Thrown/BF
– Relative Range Factor
– Fielding Percentage
– Handedness
– Ballpark Effects
– League Averages
Hitters
– HBP/PA
– BB/(PA – HBP)
– AVG
– 1B/Hit
– 2B/Hit
– 3B/Hit
– HR/Hit
– K/Out
– GO/FO
– PA
– Relative Range Factor
– Fielding Percentage
– Catcher Arm Rating
– CS% (Runner)
– Speed Rating
– Handedness
– Ballpark Effects
– League Averages

Insignificant Stats
Pitchers
– Wins
– Losses
– Saves
– Holds
– Complete Games
– Shutouts
– ERA (kind of – 2B and 3B approx)
– Unearned Runs
– Games Started
– Pitch Types
– Performance in Counts
– Other Situational Stats
Hitters
– RBI
– IBB
– Runs (kind of – in Speed Formula)
– GIDP (kind of – in Speed Formula)
– SF (kind of – in PA, but also situational)
– SH (kind of – in PA, but also situational)
– SBA (kind of – attempts, but also setting)
– Performance in Counts
– Other Situational Stats

Ballpark Effects

Ballparks – Extremes (Min. 3 seasons)

Effect          Ballpark (High)       High     Ballpark (Low)             Low
Hits            Coors Field           1.182    Petco Park                 .908
2B              Baker Bowl            1.291    Dodger Stadium             .795
3B              Palace of the Fans    1.868    Great American Ballpark    .523
HR_RF           Coors Field           1.374    Municipal Stadium          .636
HR_LF           Coors Field           1.385    Municipal Stadium          .634
Runs (unused)   Coors Field           1.380    Petco Park                 .830
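As a rough illustration of how a ballpark multiplier could enter a simulation, the sketch below scales a park-neutral hit probability by the Hits multipliers from the table. The slides do not say exactly how the multiplier is applied, so the simple product, the cap at 1.0, and the function and dictionary names are assumptions.

```python
# Hits multipliers taken from the extremes table; parks not listed default to 1.0.
HITS_MULTIPLIER = {"Coors Field": 1.182, "Petco Park": 0.908}

def park_adjusted_hit_prob(neutral_prob, ballpark):
    """Scale a park-neutral hit probability by the park's Hits multiplier.

    Assumption: the multiplier is applied as a plain product, capped at 1.0.
    """
    multiplier = HITS_MULTIPLIER.get(ballpark, 1.0)
    return min(1.0, neutral_prob * multiplier)

for park in ("Coors Field", "Petco Park", "Neutral Park"):
    print(park, round(park_adjusted_hit_prob(0.250, park), 4))
```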

PA Decision Tree - Normalization
Every step in PA uses modified* log5 normalization (Bill James AVG example):

$$\frac{H}{AB} = \frac{\dfrac{AVG \cdot OAV}{LgAVG}}{\dfrac{AVG \cdot OAV}{LgAVG} + \dfrac{(1 - AVG)(1 - OAV)}{1 - LgAVG}}$$

Where,

$$LgAVG = \frac{PLgAVG + BLgAVG}{2}$$

Pedro vs Ruth Example:

$$\frac{H}{AB} = \frac{\dfrac{.393 \cdot .167}{.2791}}{\dfrac{.393 \cdot .167}{.2791} + \dfrac{(1 - .393)(1 - .167)}{1 - .2791}}$$

Where LgAVG = (PLgAVG + BLgAVG)/2, or .2791
Result = .2504

* Modified due to a flaw in the assumption above that the batter and pitcher carry equal (50/50) weights on each possible outcome of the PA event. Also accounts for handedness and ballpark.
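The unmodified log5 formula above is short enough to check directly. The sketch below implements only the plain 50/50 version, without the matchup weights, handedness, or ballpark adjustments; the function name `log5` is ours. With the rounded inputs shown in the Pedro vs Ruth example it evaluates to roughly .251, close to the .2504 reported on the slide.

```python
def log5(avg, oav, lg_avg):
    """Plain log5 estimate of H/AB for one batter/pitcher matchup.

    avg    -- batter's batting average
    oav    -- pitcher's opponent batting average
    lg_avg -- league average, (PLgAVG + BLgAVG) / 2
    """
    hit = (avg * oav) / lg_avg
    no_hit = ((1 - avg) * (1 - oav)) / (1 - lg_avg)
    return hit / (hit + no_hit)

# Pedro vs Ruth inputs as shown on the slide (rounded).
estimate = log5(avg=0.393, oav=0.167, lg_avg=0.2791)
print(round(estimate, 4))
```

A useful sanity check on the formula: when the pitcher is exactly league average (OAV equal to LgAVG), the estimate collapses to the batter's own AVG.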

PA Decision Tree – Steps 1*
Plate Appearance
– Unusual Event (IBB, WP, PB, SB, CS, SH, Hit and Run, Pickoff, Balk)
– Normal PA
    – HBP (per PA or BFP)
    – Not HBP
        – BB (per PA or BFP – HBP)
        – At Bat…
* No ballpark or handedness adjustments made yet.

PA Decision Tree – Steps 2
At-Bat
– Out
    – Strikeout (K/Out)
    – Normal (Logic to determine direction and GO or FO)
        – Hit (Poor Play)
        – Error (Fielding Percentage)
        – Normal
– Hit… (AVG vs. OAV)*
* Historical handedness adjustment and ballpark hits multiplier used.

PA Decision Tree – Steps 3
Hit*
– HR* (HR/Hit)
– Normal – In Play
    – Out (Plus Play)
    – Normal Hit
        – 3B* (3B/Hit * multiplier for lost HR)
        – 2B* (2B/Hit * multiplier for lost HR)
        – 1B
* Ballpark multipliers used.
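The three steps of the decision tree can be sketched as a single function that walks one plate appearance from top to bottom. This is a simplified illustration under stated assumptions, not the actual engine: it skips unusual events, errors, poor and plus plays, and all ballpark and handedness adjustments; every probability in `EXAMPLE_PROBS` is a placeholder; and the "multiplier for lost HR" is assumed here to be 1 / (1 - HR/Hit), which rescales 3B/Hit and 2B/Hit to the hits that stayed in play.

```python
import random

# Hypothetical matchup probabilities (placeholders, not values from the slides).
EXAMPLE_PROBS = {
    "hbp_per_pa": 0.01,
    "bb_per_pa_less_hbp": 0.09,
    "hit_per_ab": 0.26,
    "k_per_out": 0.25,
    "hr_per_hit": 0.12,
    "triple_per_hit": 0.02,
    "double_per_hit": 0.20,
}

def simulate_pa(p, rng):
    """Walk one plate appearance down the tree and return its outcome."""
    # Step 1: HBP first, then BB among the remaining PAs, otherwise an at-bat.
    if rng.random() < p["hbp_per_pa"]:
        return "HBP"
    if rng.random() < p["bb_per_pa_less_hbp"]:
        return "BB"
    # Step 2: the at-bat ends in an out (strikeout or other) or a hit.
    if rng.random() >= p["hit_per_ab"]:
        return "K" if rng.random() < p["k_per_out"] else "OUT"
    # Step 3: home run, or a ball in play that becomes a 3B, 2B, or 1B.
    if rng.random() < p["hr_per_hit"]:
        return "HR"
    # Assumed "multiplier for lost HR": rescale rates to non-HR hits only.
    scale = 1.0 / (1.0 - p["hr_per_hit"])
    roll = rng.random()
    if roll < p["triple_per_hit"] * scale:
        return "3B"
    if roll < (p["triple_per_hit"] + p["double_per_hit"]) * scale:
        return "2B"
    return "1B"

rng = random.Random(2010)
outcomes = [simulate_pa(EXAMPLE_PROBS, rng) for _ in range(10_000)]
print({name: outcomes.count(name) for name in sorted(set(outcomes))})
```

Each branch draws its own random number, mirroring the slide's structure in which every node is a separate conditional probability (HBP per PA, BB per PA less HBP, and so on).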

PA Decision Tree – Matchup Weights
Addresses previous 50/50 assumption using League-Adjusted Variance to form batter and pitcher weights for each step:

             HBP/PA    BB/(PA-HBP)    H/AB    K/(OUT)    HR/HIT    2B/HIT    3B/HIT
Pitcher %
Hitter %

Matchup Weights: What does this mean?
Batter always has more control (even with HBP and BB)
– Makes final decision (swing or not)
– Dictates strike zone
– Less consistent
Doubles and triples are (mostly) out of pitcher’s control (BABIP)
Does not necessarily mean batting is more important
– 9 vs. 1
– Fewer pitcher outliers means elite pitchers are more valuable

PA Decision Tree - Normalization
Batting Average Example using Matchup Weights:

$$\frac{H}{AB} = \frac{\dfrac{(1.066 \cdot AVG)(.934 \cdot OAV)}{LgAVG}}{\dfrac{(1.066 \cdot AVG)(.934 \cdot OAV)}{LgAVG} + \dfrac{(1 - 1.066 \cdot AVG)(1 - .934 \cdot OAV)}{1 - LgAVG}}$$

Where,

$$LgAVG = \frac{.934 \cdot PLgAVG + 1.066 \cdot BLgAVG}{2}$$

Pedro vs Ruth Example (with handedness): AVG = .393, OAV = .167
Where LgAVG = (1.066 * BLgAVG + .934 * .276)/2, or .2795
Result = .2502
After the handedness adjustment, Final Result = .2614
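One possible reading of the weighted formula is sketched below: the batter's AVG is scaled by 1.066 and the pitcher's OAV by .934 everywhere they appear, including in the complement terms. That placement of the weights is an assumption, since the slide text is incomplete, and the function name is ours. The sketch leaves out the handedness and ballpark adjustments, so it is not expected to reproduce the slide's reported figures.

```python
def weighted_log5(avg, oav, lg_avg, batter_weight=1.066, pitcher_weight=0.934):
    """Log5 estimate of H/AB with matchup weights applied to AVG and OAV.

    Assumption: the weights scale AVG and OAV in both the hit and the
    no-hit terms. lg_avg is the already-weighted league average.
    """
    weighted_avg = batter_weight * avg
    weighted_oav = pitcher_weight * oav
    hit = (weighted_avg * weighted_oav) / lg_avg
    no_hit = ((1 - weighted_avg) * (1 - weighted_oav)) / (1 - lg_avg)
    return hit / (hit + no_hit)

# Pedro vs Ruth inputs from the slide, before any handedness adjustment.
weighted_estimate = weighted_log5(avg=0.393, oav=0.167, lg_avg=0.2795)
print(round(weighted_estimate, 4))
```

Setting both weights to 1.0 recovers the plain 50/50 log5 formula, which makes the effect of the weights easy to isolate.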