
PitchFX: Sounds great! ... Now, where do I get it?

Daniel I. Brooks

The University of Iowa


  • Tracks each and every pitch thrown in MLB

    • in real-time

  • Provides all of the parameters necessary to very accurately model the flight of the baseball from the pitcher’s hand to the plate

  • Data includes accompanying play-by-play info

  • Available at a very affordable price. (Free!)

  • Sounds great! ... Now, where do I get it?


Prologue: Accessing PitchFX Data in 2007

PitchFX Data Today

Part 1: How can interested fans get access to it?

Part 2: What can they get out of it?

Part 3: How could availability and analysis improve?

Prologue: How to Get PitchFX Data Circa 2007

(The First?) Step by Step Guide

  • Alan Nathan

    • Published August 6th, 2007

    • http://webusers.npl.illinois.edu/~a-nathan/pob/tracking.htm

  • Contains a section on “How to Download”.

  • Ought to be straightforward enough, right?

How to Download

MLB Extended Gameday Pitch Logs

I. How to Download

Downloading the data: Go to the web site http://gd2.mlb.com/components/game/mlb/. Click on the year, then on the month; on the next page click on the day; on the next page click on the specific game; on the next page click on pbp; on the next page click on pitchers. For the Baltimore vs. Boston game played on August 1, 2007, the full link is as follows: http://gd2.mlb.com/components/game/mlb/year_2007/month_08/day_01/gid_2007_08_01_balmlb_bosmlb_1/pbp/pitchers/

The above steps take you to a page with a bunch of links that are of the form zzzzzz.xml, where zzzzzz is a six-digit code for a specific pitcher (see section III). For the above game, click on 122201.xml, which will get you to the pitch logs of Paul Shuey, who pitched to two batters in the 7th inning.

There is no way you could have known this.
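The click path above amounts to assembling a URL from the game date, the two team codes, and a game number. A minimal sketch of that assembly follows. The function names are hypothetical, and the ordering of the team codes (away first, then home) and the "mlb" suffix are assumptions inferred from the single example link; the source does not confirm that the server still serves these paths.

```python
from datetime import date

# Base address quoted in the download instructions.
BASE_URL = "http://gd2.mlb.com/components/game/mlb/"


def gameday_pitchers_url(game_date, away, home, game_number=1):
    """Build the pitch-log directory URL for one game.

    Assumption: the game id lists the away team code, then the home team
    code, each followed by "mlb", as in the one example link available.
    """
    y, m, d = game_date.year, game_date.month, game_date.day
    gid = f"gid_{y:04d}_{m:02d}_{d:02d}_{away}mlb_{home}mlb_{game_number}"
    return f"{BASE_URL}year_{y:04d}/month_{m:02d}/day_{d:02d}/{gid}/pbp/pitchers/"


def pitcher_log_url(directory_url, pitcher_id):
    """Append the six-digit pitcher code to get one pitcher's XML file."""
    return f"{directory_url}{pitcher_id}.xml"


url = gameday_pitchers_url(date(2007, 8, 1), "bal", "bos")
print(url)
print(pitcher_log_url(url, 122201))
```

This only removes the clicking. You still have to know the pitcher's six-digit code in advance.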

Here’s What You Get

Error Message?

How to Download

You will then see a lot of numbers on the screen. Use whatever tools you have with your browser (e.g., "save page as") to save it as 122201.xml in some convenient folder.

Now launch Excel. From the File menu, open the file you just saved. An Open XML box will pop up. Check the "As an XML list" box, then click OK, and the file will load. You should see columns A through AK (37 columns total) filled and with headers in the first row. Immediately save it as an Excel file. The number of columns may change depending on when the file was written, and there is no guarantee that the number will remain the same into the future. However, the header names will hopefully stay constant. In the next section, I will discuss the meaning of the important parameters in the database.

As long as your version of Excel supports this… earlier versions (and some Mac versions) do not.
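If Excel cannot open the file, a short script can read it instead. The sketch below is an illustration, not the format of the real files: the sample XML, its element nesting, and the attribute names (des, start_speed, pfx_x, pfx_z) are assumptions. It reads each pitch by attribute name rather than by column position, so it keeps working if the number of columns changes while the header names stay constant.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for a saved pitcher file such as 122201.xml.
SAMPLE_XML = """<pitching>
  <atbat>
    <pitch des="Ball" start_speed="91.2" pfx_x="-4.1" pfx_z="9.8"/>
    <pitch des="Called Strike" start_speed="84.5" pfx_x="2.3" pfx_z="1.2"/>
  </atbat>
</pitching>"""


def load_pitches(xml_text):
    """Return one dict of attribute name -> string value per <pitch> element."""
    root = ET.fromstring(xml_text)
    return [dict(pitch.attrib) for pitch in root.iter("pitch")]


def column_names(pitches):
    """Collect every attribute name seen, in first-seen order."""
    names = []
    for pitch in pitches:
        for key in pitch:
            if key not in names:
                names.append(key)
    return names


pitches = load_pitches(SAMPLE_XML)
print(len(pitches), "pitches")
print(column_names(pitches))
```

To read a file saved from the browser, pass its contents to load_pitches, for example with open("122201.xml").read().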

It gets harder…

  • But that’s only if you want to look at two batters’ worth of data (pitched by Paul Shuey!)

  • Want to look at multiple starts? Then you need a database. And you probably need perl or php or some other scripting language with easy XML parsing. And then you need a database front-end, and you need to learn SQL to access your database…
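To give a sense of what the database step involves, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names, the rows, and the second pitcher code (999999) are all made up for illustration; only the pitcher code 122201 comes from the example above.

```python
import sqlite3

# Hypothetical rows: (game date, pitcher code, release speed in mph).
ROWS = [
    ("2007-08-01", 122201, 91.0),
    ("2007-08-01", 122201, 89.0),
    ("2007-08-05", 122201, 90.5),
    ("2007-08-05", 999999, 95.0),
]

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pitches (game_date TEXT, pitcher_id INTEGER, start_speed REAL)"
)
conn.executemany("INSERT INTO pitches VALUES (?, ?, ?)", ROWS)


def average_speed_by_game(connection, pitcher_id):
    """Return (game_date, pitch_count, average_speed) per game for one pitcher."""
    cursor = connection.execute(
        "SELECT game_date, COUNT(*), AVG(start_speed) "
        "FROM pitches WHERE pitcher_id = ? "
        "GROUP BY game_date ORDER BY game_date",
        (pitcher_id,),
    )
    return cursor.fetchall()


for game_date, count, avg_speed in average_speed_by_game(conn, 122201):
    print(game_date, count, round(avg_speed, 1))
```

Even this small version requires a schema, a loading step, and an SQL query, which is the point of the slide.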

Accessing PitchFX data in 2007…

  • Is really hard.

  • Is really time consuming.

  • Requires a high level of technical expertise.

  • And that’s before you even get into what it means.

  • What has changed to help remedy this problem?

Part 1: The Casual Sabermetrician, or How can interested fans access PitchFX Data?

The Casual Sabermetrician

  • A new group of baseball viewers

  • “Casual Sabermetricians”

  • This group is roughly made up of:

    • Bloggers

    • Forum Dwellers


    • Sportswriters

    • Major League Scouts

The Casual Sabermetrician

  • The casual sabermetrician :

    • Wants to answer data-driven questions

    • Knows PitchFX Data is out there

    • Lacks expertise to access PitchFX data

How to Access PitchFX Data?

  • There are now a few different ways these individuals can access PitchFX data:

    • Josh Kalk’s Website (now offline)

    • Fangraphs.com

    • BrooksBaseball.net

PitchFX Tools

  • Fangraphs.com

    • Seasonal detail

    • Some game-by-game info

    • Lots of other sabermetric statistics handy

  • BrooksBaseball.net

    • Lots of game-by-game detail

    • Easily view other pitchers from same game

    • Strikezone maps / Splits / Situational Graphs

PitchFX Tools

  • These tools simplify getting information

  • Still require that you can interpret the data once you have it…

  • …but they offload the busy work onto computers


Part 2: What can we get from the PitchFX data?

Let’s Pick a Pitcher

  • Suppose we were interested in Jon Lester.

  • Let’s generate a “scouting report”:

    • What does he throw?

    • How hard does he throw the ball?

    • What mix of pitches does he use in games?

    • Which pitches worked for him?

    • When does he throw different pitches?
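Several of these questions (what he throws, how hard, and in what mix) reduce to grouping pitches by type and summarizing each group. A minimal sketch under stated assumptions: the pitch-type labels and speeds below are invented for illustration and are not Jon Lester's actual data, and the dictionary keys are hypothetical names rather than PitchFX field names.

```python
from collections import defaultdict

# Hypothetical pitches; labels and speeds are made up for the example.
SAMPLE_PITCHES = [
    {"pitch_type": "fastball", "speed": 93.0},
    {"pitch_type": "fastball", "speed": 94.0},
    {"pitch_type": "curveball", "speed": 76.0},
    {"pitch_type": "cutter", "speed": 89.0},
]


def scouting_report(pitches):
    """Summarize count, share of all pitches, and average speed per pitch type."""
    speeds_by_type = defaultdict(list)
    for pitch in pitches:
        speeds_by_type[pitch["pitch_type"]].append(pitch["speed"])

    total = len(pitches)
    report = {}
    for pitch_type, speeds in speeds_by_type.items():
        report[pitch_type] = {
            "count": len(speeds),
            "share": len(speeds) / total,
            "avg_speed": sum(speeds) / len(speeds),
        }
    return report


for pitch_type, summary in scouting_report(SAMPLE_PITCHES).items():
    print(pitch_type, summary)
```

Questions such as which pitches worked, or when he throws them, would need outcome and count fields on each pitch, grouped the same way.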

PitchFX through B-Ref

Lefty/Righty Splits



PitchFX Tools

  • Using a combination of PitchFX tools we can get an incredible amount of information about how a pitcher has performed.

  • Fangraphs: season-wide perspective

  • BrooksBaseball: start-by-start perspective

PitchFX Tools

  • Each tool provides other information that can help evaluate a pitcher:

  • FanGraphs provides easy access to other sabermetric pitching statistics

  • BrooksBaseball provides easy access to other pitcher detail from the same game

One More Case Study: Aroldis Chapman

“He has a fastball clocked at 101 or 102 MPH, and a plus curveball and plus slider, to use the scouts' vernacular.”


“Aroldis Chapman has a tantalizing 100 mph fastball, but also question marks about his other pitches -- and his maturity.”

-…also ESPN

“He throws 100 and 101 mph… If he polishes up his changeup and tightens up his slider, he can be a young Randy Johnson.”

-His Agent

“His fastball was clocked from anywhere between 97 and 100 mph.”


"In order to become the best pitcher, I still need lots of things. I need to improve professionally. I need to work. I need to work with curveballs. I need to work with other kinds of pitches."


Case Study: Aroldis Chapman

Can Aroldis Throw 100 mph?

Case Study: Aroldis Chapman

  • You can go do this at home.

  • You need to know virtually nothing about computers; you just need to know who Aroldis Chapman is and when he might have pitched.

The New Access Barrier

  • The casual sabermetrician :

    • Wants to answer data-driven questions

    • Can easily access PitchFX data

      Problem Solved! ... Right?

    • Two existing problems:

      • Data analysis is non-trivial

      • How trustworthy is the data?

Data Analysis is Non-Trivial

  • Identifying pitches can be difficult at first

    • though it gets easier with practice

  • Sabermetricians are notoriously descriptive statisticians.

  • You could read dozens of articles online and not find a single inferential statistical test or any measure of variability.

  • This is exacerbated by a strange fascination with small sample sizes.
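As one illustration of a measure of variability, the sketch below reports a standard error and a rough 95% interval alongside an average velocity. The speeds are invented, the function names are hypothetical, and the interval uses a normal approximation (the 1.96 multiplier), which is an assumption that understates the uncertainty for very small samples.

```python
import math
import statistics


def mean_with_standard_error(values):
    """Return (mean, standard error of the mean) for a sample."""
    if len(values) < 2:
        raise ValueError("need at least two values to estimate variability")
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return mean, standard_error


def approx_95_interval(values):
    """Rough 95% interval for the mean, using a normal approximation."""
    mean, standard_error = mean_with_standard_error(values)
    return mean - 1.96 * standard_error, mean + 1.96 * standard_error


# Hypothetical fastball speeds: three pitches versus thirty.
small_sample = [92.0, 95.0, 93.5]
large_sample = small_sample * 10

for label, sample in (("3 pitches", small_sample), ("30 pitches", large_sample)):
    low, high = approx_95_interval(sample)
    print(f"{label}: {low:.1f} to {high:.1f} mph")
```

Both samples have the same average, but the interval around the three-pitch average is several times wider, which is why small samples deserve caution.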

Problems with Trust

  • Our tools purport to show lots of information

  • How accurate is this info?

  • Scouts/Teams may feel that the data isn’t trustworthy enough to use to evaluate pitchers.

  • May feel that due to obvious errors, data is bad

    • Consistent pitch classification is a huge problem.

  • May feel that due to odd conventions, data makes no sense

Problems with Trust

  • Certain conventions that the community has adopted are strange and educated fans/scouts get frustrated

    • Vertical Movement (rising fastballs, etc)

  • Certain results from the data are so counterintuitive that people get worried

    • The large majority of sinkers don’t really sink.
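One way to see why both points confuse people is a rough calculation. It assumes the commonly described convention that vertical movement is reported relative to a hypothetical spinless pitch, with gravity's contribution left out. The sketch also assumes constant speed with no drag, a 55 ft flight distance, and invented movement values, so the numbers are illustrative only.

```python
G_FT_PER_S2 = 32.174  # gravitational acceleration in ft/s^2


def flight_time_s(speed_mph, distance_ft):
    """Time of flight, assuming constant speed (drag ignored)."""
    speed_ft_per_s = speed_mph * 5280 / 3600
    return distance_ft / speed_ft_per_s


def gravity_drop_in(speed_mph, distance_ft):
    """Inches a spinless pitch would fall due to gravity alone."""
    t = flight_time_s(speed_mph, distance_ft)
    return 0.5 * G_FT_PER_S2 * t ** 2 * 12


def net_drop_in(vertical_movement_in, speed_mph, distance_ft=55.0):
    """Actual drop: gravity's drop minus the reported upward movement."""
    return gravity_drop_in(speed_mph, distance_ft) - vertical_movement_in


# Hypothetical pitches: a "rising" fastball and a sinker.
fastball_drop = net_drop_in(vertical_movement_in=10.0, speed_mph=95.0)
sinker_drop = net_drop_in(vertical_movement_in=5.0, speed_mph=92.0)
print(f"fastball falls about {fastball_drop:.0f} in")
print(f"sinker falls about {sinker_drop:.0f} in")
```

Under these assumptions the "rising" fastball still falls, just less than a spinless pitch would. The sinker also shows positive vertical movement: it sinks only relative to the fastball.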

Existing Problems

  • Existing problems in accessibility relate to:

    • Poor ability to analyze / understand data

    • Lack of trust of data

  • With very basic instruction, both of these problems could be easily overcome

The New Access Barrier

  • Rather than technical, the new access barrier is subject-driven:

    • How do different pitches move?

    • How do I identify pitches?

    • What kind of information can I extract from the PitchFX dataset?

The New Access Barrier

  • My opinion: these kinds of limitations are better than technical ones.

  • Sort of like running a complex statistical analysis:

    • hard and time consuming by hand…

    • but the problem is really in the interpretation


  • But, for most people who are:

    • a) familiar with major league pitching

    • b) able to read a graph or table

    • c) familiar with very basic statistics

  • Accessing & learning how to understand PitchFX data is greatly simplified.

  • Instantly provides a wealth of previously unavailable information to the interested fan.

…since most of us don’t have these guys at our disposal


  • Sportvision

  • The Many “Daves” of Fangraphs.com

  • Alan Nathan

  • Alex Clapp

  • Good friends at sonsofsamhorn.net


For Webcasters - Feel free to email:

Contact info: dan@brooksbaseball.net