Simulating Sports: The Inputs and the Engines. Paul Bessire, Product Manager, Quantitative Analysis and Content, FOX Sports Interactive, WhatIfSports.com. July 15, 2009. Table of Contents: WhatIfSports.com Overview; Challenges with Simulating Baseball; Plate Appearance Decision Tree.


Simulating Sports: The Inputs and the Engines

Paul Bessire

Product Manager, Quantitative Analysis and Content

FOX Sports Interactive, WhatIfSports.com

July 15, 2009

Table of Contents
  • WhatIfSports.com Overview
  • Challenges with Simulating Baseball
  • Plate Appearance Decision Tree
    • Or “Improving the log5 Normalization Model for Batter/Pitcher Matchups”

  • Pedro vs. Ruth (mostly second presentation)
About WhatIfSports.com
  • February 2000 – Launched in Cincinnati with SimMatchup
  • 2001 – SimLeague Baseball (like Strat-o-Matic) and Basketball; Paul Bessire runs free leagues
  • 2002 – SimLeague Football and Hockey; Paul wins own baseball league with “Streaking Ho-Hos”
  • 2004 – Hoops Dynasty and Gridiron Dynasty; Paul joins WIS part-time “between school”
  • 2005 – WhatIfSports.com acquired by FOX Interactive Media; Paul comes on full-time
  • 2006 – Hardball Dynasty and Clutch Racing Dynasty; All simulations rewritten with Paul’s help
  • 2008 – FC Dynasty
  • Present – 600,000+ registered users, part of FOX Sports TV group
Sports Simulation
  • Play-by-play
    • A “play” means something different for each sport
    • Probabilities for every individual outcome
    • Random number generation
    • Pitch-by-pitch (or basketball/hockey pass-by-pass) not needed
    • Account for every possible statistical interaction during a game
  • Can be recreated quickly
    • 200+ games/second
    • All data tracked
    • Every outcome is different
    • Boxscore (link)
    • Many relevant applications (second presentation)
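The play-by-play idea above (a probability for every outcome plus a random number) can be sketched as follows. This is a minimal illustration, not WhatIfSports code: the function name `sample_outcome` and the outcome probabilities are made up for the example.

```python
import random
from collections import Counter

def sample_outcome(probabilities, rng):
    """Pick one outcome by comparing a random roll to cumulative probabilities."""
    roll = rng.random()
    cumulative = 0.0
    for outcome, p in probabilities.items():
        cumulative += p
        if roll < cumulative:
            return outcome
    return outcome  # guard against floating-point rounding in the last bucket

# Hypothetical outcome probabilities for a single "play" (they sum to 1).
play_probabilities = {"out": 0.68, "single": 0.16, "walk": 0.09, "extra_base_hit": 0.07}

rng = random.Random(2009)  # seeded so a run can be reproduced
tally = Counter(sample_outcome(play_probabilities, rng) for _ in range(10_000))
print(tally)
```

Because each play is only a random roll and a few comparisons, whole games can be replayed many times per second, and every replay produces a different box score.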
Baseball Challenges
  • Missing Player Data
  • Defensive Metrics
  • Ballpark Effects
  • Era Adjustments
  • Assigning Value (Salaries ~ RC27# * PA or ERC# * BF + Fielding + Extremes)
  • Career “Seasons” (Pujols #3 in career $/PA, Musial #16; Gibson #31 in $/IP)
  • Fatigue (Projected PA vs Actual PA/162 or Projected IP & GP% vs Actual IP/162 & Historical GP%)
Missing Player Data
  • Typically solved with Regression
    • Linear: Pitchers’ 2B or 3B per hit allowed or Pitches Thrown per BF
    • Multivariate: Ballpark Effects
    • May be Era and/or Ballpark Adjusted
  • Discriminant Analysis/Cluster Analysis
    • Catcher’s Arm Ratings
    • Basketball Positional Effectiveness
  • Fitting to a curve/distribution
    • Player Generation and Development
    • Assigning Ratings or Grades
Significant Stats (# has missing data)

Pitchers
  • HBP/BF
  • BB/(BF – HBP)
  • OAV
  • 1B/Hit Allowed
  • 2B/Hit Allowed # (regression)
  • 3B/Hit Allowed # (regression)
  • HR/Hit Allowed
  • K/Out # (regression)
  • GO/FO # (regression for GO)
  • BF # (approx. outs + hits + BB + HBP)
  • Pitches Thrown/BF # (regression)
  • Relative Range Factor # (WIS formula)
  • Fielding Percentage # (fit to curve for grade)
  • Handedness (historical impact)
  • Ballpark Effects # (multivariate regression)
  • League Averages

Hitters
  • HBP/PA
  • BB/(PA – HBP)
  • AVG
  • 1B/Hit
  • 2B/Hit
  • 3B/Hit
  • HR/Hit
  • K/Out # (regression)
  • GO/FO # (regression for GO)
  • PA
  • Relative Range Factor # (WIS formula)
  • Fielding Percentage # (fit to curve for grade)
  • Catcher Arm Rating # (discriminant analysis)
  • CS% (Runner) # (regression for CS)
  • Speed Rating # (WIS formula)
  • Handedness (historical impact)
  • Ballpark Effects # (multivariate regression)
  • League Averages
Insignificant Stats

Pitchers
  • Wins
  • Losses
  • Saves
  • Holds
  • Complete Games
  • Shutouts
  • ERA (kind of – 2B and 3B approx)
  • Unearned Runs
  • Games Started
  • Pitch Types
  • Performance in Counts
  • Other Situational Stats

Hitters
  • RBI
  • IBB
  • Runs (kind of – in Speed Formula)
  • GIDP (kind of – in Speed Formula)
  • SF (kind of – in PA, but also situational)
  • SH (kind of – in PA, but also situational)
  • SBA (kind of – attempts, but also setting)
  • Performance in Counts
  • Other Situational Stats
WIS Relative Range Factor
  • Range Factor
    • Important because range can turn hits into outs and outs into hits
    • Generally defined as (Putouts + Assists)/(Innings/9)
    • Reliant on many factors
    • Wildly inconsistent across eras
    • Does not include errors
    • Need another metric…
  • WIS Relative Range Factor
    • Similar to Bill James RRF, but not as robust (data limitations)
    • Approximates plays made/possible plays made
    • Used to approximate + and – plays
    • Includes errors
    • Era-adjusted
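For reference, the general Range Factor definition quoted above can be written as a one-line function. This sketch covers only the generic (Putouts + Assists)/(Innings/9) form; the WIS Relative Range Factor formula is not given in the slides, so it is not attempted here. The fielder totals in the usage line are invented for illustration.

```python
def range_factor(putouts, assists, innings):
    """Generic Range Factor: plays made per nine innings in the field."""
    return (putouts + assists) / (innings / 9)

# Hypothetical season line: 250 putouts, 400 assists, 1300 innings.
print(range_factor(250, 400, 1300))  # 4.5
```

Note that errors never enter this calculation, which is one of the limitations the slide lists for the plain metric.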
PA Decision Tree - Normalization

Every step in the PA uses modified* log5 normalization (Bill James AVG example):

$$\mathrm{H/AB} = \frac{\dfrac{AVG \cdot OAV}{LgAVG}}{\dfrac{AVG \cdot OAV}{LgAVG} + \dfrac{(1 - AVG)(1 - OAV)}{1 - LgAVG}}$$

where $LgAVG = (PLgAVG + BLgAVG)/2$.

2000 Pedro vs. 1923 Ruth Example:

$$\mathrm{H/AB} = \frac{\dfrac{.393 \cdot .167}{.2791}}{\dfrac{.393 \cdot .167}{.2791} + \dfrac{(1 - .393)(1 - .167)}{1 - .2791}}$$

where $LgAVG = (.283 + .276)/2$ or .2791.

Result = .2504

* Modified due to a flaw in the assumption above that the batter and pitcher carry equal (50/50) weights on each possible outcome of the PA event. Also accounts for handedness and ballpark.
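The unmodified log5 formula above translates directly into a few lines of Python. This is a sketch of the textbook form only, before any of the modifications the footnote mentions; the function and argument names are chosen for this example.

```python
def log5(batter_avg, pitcher_oav, league_avg):
    """Unweighted log5 estimate of H/AB for one batter/pitcher matchup."""
    hit_term = batter_avg * pitcher_oav / league_avg
    out_term = (1 - batter_avg) * (1 - pitcher_oav) / (1 - league_avg)
    return hit_term / (hit_term + out_term)

# Slide inputs: 1923 Ruth AVG .393, 2000 Pedro OAV .167, LgAVG .2791.
estimate = log5(0.393, 0.167, 0.2791)
print(round(estimate, 4))

# Sanity check: against a league-average pitcher the batter keeps his own average.
print(round(log5(0.300, 0.260, 0.260), 4))
```

With the rounded inputs printed on the slide, the first call evaluates to about .251, close to the slide's .2504; the small gap is presumably down to rounding in the published inputs.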

PA Decision Tree – Steps 1*

  • Plate Appearance
    • Unusual Event (IBB, WP, PB, SB, CS, SH, Hit and Run, Pickoff, Balk)
    • Normal PA
      • HBP (per PA or BFP)
      • Not HBP
        • BB (per PA or BFP – HBP)
        • At Bat…

* No ballpark or handedness adjustments made yet.

PA Decision Tree – Steps 2

* Historical handedness adjustment and ballpark hits multiplier used.

PA Decision Tree – Steps 3

  • Hit*
    • HR* (HR/Hit)
    • Normal – In Play
      • Out (Plus Play)
      • Normal Hit
        • 3B* (3B/Hit * multiplier for lost HR)
        • 2B* (2B/Hit * multiplier for lost HR)
        • 1B

* Ballpark multipliers used.
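One way to read the decision tree is as a chain of random rolls, each deciding a single branch. The sketch below follows that reading under several assumptions: the dictionary keys are hypothetical names, every probability is taken to be already normalized and conditional on reaching its step, unusual events are skipped, and the at-bat step (whose diagram is not in this transcript) is reduced to a single hit-or-out roll.

```python
import random

def simulate_pa(p, rng):
    """Walk one normal plate appearance down the tree and return its outcome."""
    if rng.random() < p["hbp"]:          # HBP per PA
        return "HBP"
    if rng.random() < p["bb"]:           # BB per (PA - HBP)
        return "BB"
    if rng.random() >= p["hit"]:         # assumed stand-in for the at-bat step
        return "OUT"
    if rng.random() < p["hr_per_hit"]:   # HR/Hit
        return "HR"
    if rng.random() < p["plus_play"]:    # fielder's plus play turns the hit into an out
        return "OUT"
    if rng.random() < p["triple"]:       # 3B/Hit, already adjusted for lost HR
        return "3B"
    if rng.random() < p["double"]:       # 2B/Hit, already adjusted for lost HR
        return "2B"
    return "1B"

# Hypothetical matchup probabilities, for illustration only.
matchup = {"hbp": 0.01, "bb": 0.09, "hit": 0.26, "hr_per_hit": 0.12,
           "plus_play": 0.03, "triple": 0.03, "double": 0.22}

rng = random.Random(42)
outcomes = [simulate_pa(matchup, rng) for _ in range(1000)]
print({o: outcomes.count(o) for o in sorted(set(outcomes))})
```

Ordering the rolls from rarest branch to most common keeps each step a simple yes/no decision, which matches the branch-by-branch layout of the slides.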

PA Decision Tree – Matchup Weights

Addresses previous 50/50 assumption using League-Adjusted Variance to form batter and pitcher weights for each step:

Matchup Weights: What does this mean?
  • Batter always has more control (even with HBP and BB)
    • Makes final decision (Swing or not)
    • Dictates strike zone
    • Less consistent
  • Doubles and Triples are (mostly) out of pitcher’s control (BABIP)
  • Does not necessarily mean batting is more important
    • 9 vs. 1
    • Fewer pitcher outliers means elite pitchers are more valuable
PA Decision Tree - Normalization

Batting Average Example using Matchup Weights:

$$\mathrm{H/AB} = \frac{\dfrac{(1.066 \cdot AVG)(.934 \cdot OAV)}{LgAVG}}{\dfrac{(1.066 \cdot AVG)(.934 \cdot OAV)}{LgAVG} + \dfrac{(1.066 - 1.066 \cdot AVG)(.934 - .934 \cdot OAV)}{1 - LgAVG}}$$

where $LgAVG = (.934 \cdot PLgAVG + 1.066 \cdot BLgAVG)/2$.

2000 Pedro vs. 1923 Ruth Example (with handedness):

$$\mathrm{H/AB} = \frac{\dfrac{(1.066 \cdot .393)(.934 \cdot .167)}{.2795}}{\dfrac{(1.066 \cdot .393)(.934 \cdot .167)}{.2795} + \dfrac{(1.066 - 1.066 \cdot .393)(.934 - .934 \cdot .167)}{1 - .2795}}$$

where $LgAVG = (1.066 \cdot .283 + .934 \cdot .276)/2$ or .2795.

Result * Handedness = .2502 * 1.045

Final Result = .2614
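The weighted batting-average formula and the handedness multiplier can be sketched the same way. The weights 1.066 and .934 and the multiplier 1.045 come from the slide; the function name, argument names, and the choice to pass LgAVG in directly (using the slide's .2795) are assumptions of this example.

```python
def weighted_log5(batter_avg, pitcher_oav, league_avg,
                  batter_weight=1.066, pitcher_weight=0.934):
    """log5 estimate of H/AB with matchup weights applied as written on the slide."""
    hit_term = (batter_weight * batter_avg) * (pitcher_weight * pitcher_oav) / league_avg
    out_term = ((batter_weight - batter_weight * batter_avg)
                * (pitcher_weight - pitcher_weight * pitcher_oav) / (1 - league_avg))
    return hit_term / (hit_term + out_term)

HANDEDNESS_MULTIPLIER = 1.045  # from the slide's Pedro vs. Ruth example

base = weighted_log5(0.393, 0.167, 0.2795)
final = base * HANDEDNESS_MULTIPLIER
print(round(base, 4), round(final, 4))
```

With the slide's rounded inputs this gives about .2507 before and .262 after the handedness multiplier, near the slide's .2502 and .2614. In the form transcribed here, the product of the two weights appears in both the hit term and the out term and so cancels in the ratio, which means the weights affect the result only through LgAVG; the production engine may well apply them differently.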

Thanks

Questions? – @lunch or after second presentation

Email: PBessire@WhatIfSports.com

Phone: 513-291-0321

See me for business card with promo code