TRANSFERRING LEAN SIX SIGMA AND DFSS DATA SIMPLY AND EFFECTIVELY
“Baseball Analytics”

4th Annual Design for Six Sigma Conference

James M. Wasiloff
Cary Young
US Army TACOM LCMC

9 February 2009

Baseball is the only field of endeavor where a man can succeed three times out of ten and be considered a good performer.  ~Ted Williams

Agenda

  • Introduction of Baseball Analytics

  • Descriptive statistics and graphical data analysis

  • Hypothesis development and testing

  • Analysis of Variance (ANOVA)

  • Pearson Correlation Coefficient

  • Simple Linear Regression

  • Multiple Regression and Best Fit Model

  • Predictive Models

  • Statistical Process Control

  • Next Steps / Application in Other Sports

Introduction

  • Why the session…

  • Better way to understand and teach LSS and DFSS Tools

  • Can Money Spent = Wins?

  • Keep it “Statistically Simple”

  • Just the Beginning

The charm of baseball is that, dull as it may be on the field, it is endlessly fascinating as a rehash.  ~Jim Murray

Test of Hypothesis

  • Null Hypothesis:

    Ho: μ1 = μ2

    MLB example: Ho: Mean Batting Average of the NY Yankees from 2006-2008 equals the Mean Batting Average of the Tampa Bay Rays from 2006-2008

  • Alternative Hypothesis:

    Ha: μ1 ≠ μ2 or μ1 > μ2

    “They are not the same”

During my 18 years I came to bat almost 10,000 times.  I struck out about 1,700 times and walked maybe 1,800 times.  You figure a ballplayer will average about 500 at bats a season.  That means I played seven years without ever hitting the ball.  ~Mickey Mantle, 1970

Batting Stats

American League

National League

It ain't like football.  You can't make up no trick plays.  ~Yogi Berra

Test of Hypothesis

  • Are the batting averages of the National League different from those of the American League?

  • T-test

  • Interpretation: “P Low, null must go – P High, null will fly”

Two-Sample T-Test and CI: AL, NL

      N     Mean    StDev  SE Mean
AL   14  0.27086  0.00772   0.0021
NL   16  0.26356  0.00704   0.0018

Difference = mu (AL) - mu (NL)
Estimate for difference: 0.007295
95% CI for difference: (0.001717, 0.012872)
T-Test of difference = 0 (vs not =): T-Value = 2.69  P-Value = 0.012
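
The Minitab result above can be approximated from the summary statistics alone. The sketch below is not the presenters' procedure. It assumes the unpooled (Welch) form of the two-sample t-test, which reproduces the reported T-value, and it obtains the two-sided p-value by numerically integrating the Student t density. The function name `welch_t_test` and the `steps` parameter are illustrative choices.

```python
import math

def welch_t_test(mean1, sd1, n1, mean2, sd2, n2, steps=20000):
    """Two-sample t-test from summary statistics, variances not pooled (Welch)."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)                 # standard error of the difference
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

    # Student t probability density with df degrees of freedom
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    # Simpson's rule for the area between 0 and |t| (steps must be even)
    a = abs(t)
    h = a / steps
    area = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    p_two_sided = 1 - 2 * area              # both tails beyond |t|
    return t, df, p_two_sided

# Summary statistics copied from the Minitab output (AL first, then NL)
t, df, p = welch_t_test(0.27086, 0.00772, 14, 0.26356, 0.00704, 16)
print(f"T-Value = {t:.2f}, df = {df:.1f}, P-Value = {p:.3f}")
```

With p below 0.05, the "P Low, null must go" rule rejects the hypothesis that the two league means are equal.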

Are Salaries Correlated to Team Performance?

  • The trend is…

  • Problem statement:

    • Will increasing player salaries lead to more success?

Baseball was the major American sport in which money bought success. George Will, Moneyball

Use This Simple Graphic?

Pearson Correlation Coefficient Definition

Values of r

Correlation Coefficient

  • Graphic approximation… what do you think?

  • Minitab results: Pearson correlation of Total Salary 2008 and Wins in 2008 = 0.323

  • Interpretation of results
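
For readers who want to check a value of r by hand, here is a minimal sketch of the Pearson correlation coefficient computed from paired observations. The payroll and win figures are invented placeholders, not the 2008 data behind the Minitab result, and `pearson_r` is an illustrative name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical team payrolls (millions of dollars) and season wins
payroll = [40, 60, 80, 100, 120]
wins = [75, 70, 88, 81, 95]
print(f"r = {pearson_r(payroll, wins):.3f}")
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 a weak one. The sign gives the direction of the trend.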

American League West in 2002 (“Moneyball” Data Set)

Pearson correlation of Wins and Payroll = -0.928

ANOVA

A baseball fan has the digestive apparatus of a billy goat.  He can, and does, devour any set of diamond statistics with insatiable appetite and then nuzzles hungrily for more.  ~Arthur Daley

  • Null Hypothesis:

    Ho: μ1 = μ2 = μ3 = … = μn

    MLB example: Ho: Mean Batting Average of the NY Yankees equals the Mean Batting Average of the Tampa Bay Rays equals the Mean Batting Average of the NY Mets equals the Mean Batting Average of the …

  • Alternative Hypothesis:

    Ha: At least one μk is different from one other μk

    MLB example: At least one team has a Mean Batting Average different from that of at least one other team
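
The F statistic behind a one-way ANOVA can be sketched in a few lines. This assumes the standard between-group versus within-group decomposition. The batting averages are made-up placeholders, and `one_way_anova` is an illustrative name. Turning F into a p-value requires the F distribution, which is left to a statistics package.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more points than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual points around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical season batting averages for three teams (invented values)
team_a = [0.270, 0.275, 0.268]
team_b = [0.262, 0.259, 0.265]
team_c = [0.271, 0.266, 0.273]
f_stat, df_b, df_w = one_way_anova([team_a, team_b, team_c])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the team means differ by more than the season-to-season scatter within teams would explain.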

Regression Analysis

  • Is it possible to model and predict number of wins for a season based on statistical parameters?

  • The initial simple linear regression model, 2002 data:
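
As a rough illustration of how a simple linear regression of this kind could be fitted, the sketch below applies ordinary least squares to one predictor. The numbers are invented placeholders rather than the 2002 data, and `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical predictor (runs scored) and response (wins); invented values
runs = [650, 700, 750, 800, 850]
wins = [70, 76, 83, 88, 96]
b0, b1 = fit_line(runs, wins)
print(f"Wins = {b0:.1f} + {b1:.4f} * Runs")
```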

Multiple Regression and Best Fit Model

  • Regression studies the relationship between the mean value of a random variable and the corresponding values of one or more independent variables.

    • A model for predicting one variable from another.

    • A statistical analysis assessing the association between two variables.

  • Regression analysis is a method of analysis that enables you to quantify the relationship between two or more variables (X) and (Y) by fitting a line or plane through all the points such that they are evenly distributed about the line or plane.

  • Multiple regression is a method of determining the relationship between a continuous process output (Y) and several factors (Xs).
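
To make "fitting a plane" concrete, here is a minimal sketch of multiple regression by ordinary least squares. It solves the normal equations with Gaussian elimination. This is one common approach, not necessarily what Minitab does internally. The function name and the sample data are hypothetical.

```python
def fit_multiple_regression(rows, y):
    """Return [intercept, b1, b2, ...] minimizing squared error of y on the Xs."""
    # Prepend 1.0 to each row so the first coefficient is the intercept
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Augmented matrix [X'X | X'y] of the normal equations
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    # Forward elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique fit")
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(col + 1, p):
            factor = A[row][col] / A[col][col]
            for c in range(col, p + 1):
                A[row][c] -= factor * A[col][c]
    # Back substitution
    coef = [0.0] * p
    for i in reversed(range(p)):
        rhs = A[i][p] - sum(A[i][j] * coef[j] for j in range(i + 1, p))
        coef[i] = rhs / A[i][i]
    return coef

# Hypothetical Xs (team ERA, team batting average) and Y (wins); invented values
xs = [(4.2, 0.265), (3.9, 0.270), (4.6, 0.258), (4.0, 0.262), (4.4, 0.275)]
wins = [81, 90, 70, 84, 83]
print(fit_multiple_regression(xs, wins))
```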

American League West in 2002 (“Moneyball” Data Set)

Exploratory Data Analysis

What does it mean?

Testing the Predictive Model

  • Tigers 2008 data…

    • Here is the predictive transfer function from Minitab:

      Wins = 32.1 + 1.48 Average Age - 34.5 Team ERA + 154 Team Batting Average + 0.582 Saves (P) + 0.150 Runs (P) - 0.0202 Walks (P) - 0.0087 SO (P)

    • Testing on 2008 Data:

      • Actual win count = 74

      • Predicted win count = 74.26
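
The transfer function can be wrapped in a small function so that any season's team statistics can be plugged in. The coefficients are copied from the equation above. The parameter names are illustrative, and the `_p` suffix simply mirrors the "(P)" label, whose meaning the slides do not state. The Tigers' 2008 inputs are not listed in the slides, so none are supplied here.

```python
def predict_wins(average_age, team_era, team_batting_average,
                 saves_p, runs_p, walks_p, so_p):
    """Evaluate the Minitab transfer function for predicted season wins."""
    return (32.1
            + 1.48 * average_age
            - 34.5 * team_era
            + 154 * team_batting_average
            + 0.582 * saves_p
            + 0.150 * runs_p
            - 0.0202 * walks_p
            - 0.0087 * so_p)

# Sensitivity check using only the coefficients: one run of ERA costs 34.5 wins
base = predict_wins(0, 4.0, 0, 0, 0, 0, 0)
worse = predict_wins(0, 5.0, 0, 0, 0, 0, 0)
print(f"Change in predicted wins for +1.00 ERA: {worse - base:.1f}")
```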

Statistical Process Control and Statistical Thinking

  • “Statistical process control is the application of statistical methods to identify and control the special cause of variation in a process” –

  • Statistical Thinking: The process of using wide ranging and interacting data to understand processes, problems, and solutions.

    • The opposite of “one factor at a time” where the tendency is to change one factor and “see” what happens.

    • Statistical thinking is the tendency to want to understand situational phenomena over a wide range of data where several control factors may be interacting at once to produce an outcome.

    • Common cause variation becomes your friend and special cause variation your enemy.

    • Attribute judgements of good and bad are replaced with estimates of significance with given confidence.

Example 1: Notional Data – Status at Game 37

Range outside UCL indicates “out of control”

-Need to investigate “special cause”
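
One way to flag a range above the UCL is a moving range chart on game-by-game data. The slides do not say which chart type was used, so this sketch is an assumption. It uses the standard control chart constant D4 = 3.267 for ranges of two consecutive points. The data are invented placeholders and `moving_range_signals` is an illustrative name.

```python
def moving_range_signals(values, d4=3.267):
    """Return (UCL, indices of points whose moving range exceeds the UCL)."""
    if len(values) < 2:
        raise ValueError("need at least 2 observations")
    # Moving range: absolute change from one game to the next
    ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(ranges) / len(ranges)      # centre line of the range chart
    ucl = d4 * mr_bar                       # upper control limit
    # i + 1 is the index of the later game in each consecutive pair
    signals = [i + 1 for i, r in enumerate(ranges) if r > ucl]
    return ucl, signals

# Hypothetical per-game statistic (invented values)
per_game = [4, 5, 3, 6, 4, 5, 4, 5, 3, 4, 18, 5]
ucl, signals = moving_range_signals(per_game)
print(f"UCL = {ucl:.2f}, out-of-control game indices = {signals}")
```

In practice the limits would usually be set from a baseline period known to be stable, so that a special cause does not inflate its own control limit.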

Which Method is Earliest at Detecting a “Special Cause”?

Analytics Approach

Old Way

Next Steps

  • Additional MLB Analytics

  • System approach to baseball

  • Other sports?

    • Golf Fishbone Cause and Effect Analysis example

Baseball statistics are like a girl in a bikini.  They show a lot, but not everything.  ~Toby Harrah, 1983

world class batters who use consistent, disciplined, and proven methods of eliminating or preventing hitting problems, thereby providing our fans excellence in batting, league leading run creation resulting in high level fan satisfaction




  • Lean Six Sigma Analytics

  • Design for Six Sigma

  • Statistical Methods

  • Correlation/Regression Analysis

  • Design of Experiments

  • VOC / QFD

  • Taguchi Methods

  • Innovation Methods

Systems Based Potential




Optimal Batting System Design

Pre Emptive Batting Problem Discovery

Wasiloff – Young Baseball Analytics

“Systems Approach to Batting”

Analytic Based Reactive Batting Problem Solving


Baseball?  It's just a game - as simple as a ball and a bat.  Yet, as complex as the American spirit it symbolizes.  It's a sport, business - and sometimes even religion.  ~Ernie Harwell, "The Game for All America," 1955

Questions / comments?