1. 1 An Outlier Detection Methodology with Consideration for an Inefficient Frontier By
Andy Johnson
Leon McGinnis
2. 2 Outline Background and motivation
Current two-stage DEA method
Proposed changes to two-stage DEA
Definitions Relative to an Inefficient Frontier
Leave-one-out method / threshold value
Iterative Outlier Detection
Second Stage Bootstrap
Summary
3. 3 Background Outlier detection in a non-parametric framework is important because many of these methods do not account for measurement error or random fluctuations when constructing a frontier
Thus, allowing overstated data into the reference set can bias not just one efficiency estimate but several, if the overstated observation is used to construct the frontier
4. 4 Motivation The iDEAs project for warehouse performance benchmarking
On-line data collection requires more scrutiny than data collected and analyzed by a single person
What could cause a negatively skewed efficiency distribution?
5. 5 Motivation To investigate the impact of environmental characteristics on efficiency, the two-stage DEA method has been developed
A data set of units using a similar technology is identified
In the first stage, efficiency estimates are calculated
In the second stage, the estimates are regressed against environmental characteristics
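
The two stages above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a single input, a single output, and constant returns to scale, in which case the DEA efficiency estimate reduces to each unit's output/input ratio divided by the best ratio. The data and the function names `crs_efficiency` and `ols_slope_intercept` are hypothetical.

```python
from statistics import mean

def crs_efficiency(inputs, outputs):
    """First stage: each unit's output/input ratio relative to the best ratio."""
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

def ols_slope_intercept(z, theta):
    """Second stage: least-squares fit of theta on one environmental variable z."""
    z_bar, t_bar = mean(z), mean(theta)
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar) * (ti - t_bar) for zi, ti in zip(z, theta))
    slope = sxy / sxx
    return slope, t_bar - slope * z_bar

# Hypothetical data: five units, one input, one output, one environmental variable.
inputs = [2.0, 4.0, 5.0, 8.0, 10.0]
outputs = [2.0, 3.0, 5.0, 4.0, 5.0]
environment = [1.0, 2.0, 1.0, 4.0, 5.0]

theta = crs_efficiency(inputs, outputs)            # stage 1
slope, intercept = ols_slope_intercept(environment, theta)  # stage 2
print(theta, slope, intercept)
```

With several inputs or outputs the first stage would require solving a linear program per unit; the ratio shortcut is only valid in this one-input, one-output case.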
6. 6 Current Two-stage DEA Method when Outlier Detection is Considered Consider the problem of understanding sources of inefficiency
Identify outliers relative to the efficient frontier and remove them
Calculate efficiency estimates
Regress efficiency estimates against environmental variables
7. 7 An Outlier Detection Methodology with Consideration for an Inefficient Frontier
8. 8 An Outlier Detection Methodology with Consideration for an Inefficient Frontier A proposed improvement on the current method requires an outlier methodology
First, identify outliers relative to both the efficient and the inefficient frontiers
Then use a two-stage DEA method in which DEA estimates are calculated in the first stage and regressed against environmental variables in the second stage
9. 9 An Outlier Detection Methodology with Consideration for an Inefficient Frontier
10. 10 Definitions Relative to an Inefficient Frontier The production possibility set when the inefficient frontier is included
Shephard's input inefficient distance function can be defined as
Shephard's output inefficient distance function can be defined as
11. 11 Definition of an Inefficient Frontier The Multiple-Output Inefficient Production Frontiers and the Measure of Technical Efficiency
The inefficient frontier with respect to the subset X(y) can be found by
12. 12 Linear Program for Calculating Inefficiency The Multiple-Output Inefficient Production Frontiers and the Measure of Technical Efficiency
The inefficiency estimate calculated from the input perspective can be found by solving the following linear program
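
The linear program from the slide is not preserved in this transcript. One plausible formulation, written by analogy with the standard input-oriented variable-returns DEA program with the inequalities reversed, is sketched below. The notation is an assumption: unit $0$ is the one being evaluated, $x_{ij}$ is input $i$ of unit $j$, $y_{rj}$ is output $r$ of unit $j$, and $\lambda_j$ are the intensity weights.

```latex
\begin{aligned}
\hat{\theta}^{\,\mathrm{ineff}}_{0} \;=\; \max_{\theta,\,\lambda}\quad & \theta \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j x_{ij} \;\ge\; \theta\, x_{i0}, && i = 1,\dots,m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \;\le\; y_{r0}, && r = 1,\dots,s, \\
& \sum_{j=1}^{n} \lambda_j = 1, \qquad \lambda_j \ge 0, && j = 1,\dots,n.
\end{aligned}
```

Under this sketch $\lambda_0 = 1$, $\theta = 1$ is always feasible, so $\hat{\theta}^{\,\mathrm{ineff}}_{0} \ge 1$, with equality when unit $0$ lies on the inefficient frontier.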
13. 13 Outlier Detection
As suggested by Simar (2003), an outlier needs to be identified by both an input-oriented and an output-oriented detection method
14. 14 Advantages of the Hyperbolic Orientation Advantage #1
Assuming all data are positive, the hyperbolic-oriented measure can always be calculated (even in cases where the input or output orientation cannot be calculated)
15. 15 Advantages of the Hyperbolic Orientation
16. 16 Leave-One-Out DEA Program The leave-one-out hyperbolic oriented DEA inefficiency estimate is
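
The formula from this slide is not preserved in the transcript. As a rough illustration of the leave-one-out idea, the sketch below again assumes one input, one output, and constant returns to scale. In that special case the hyperbolic measure, which scales the input by a factor and the output by its reciprocal, has a closed form: the square root of the unit's ratio divided by the frontier ratio. The function name `loo_hyperbolic` and the data are hypothetical.

```python
from math import sqrt

def loo_hyperbolic(inputs, outputs, k):
    """Leave-one-out hyperbolic measures for unit k (one input, one output, CRS).

    Returns (eff, ineff):
      eff   = sqrt(r_k / best ratio among the other units); a value above 1
              means unit k lies beyond the efficient frontier of the others.
      ineff = sqrt(r_k / worst ratio among the other units); a value below 1
              means unit k lies beyond the inefficient frontier of the others.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    others = ratios[:k] + ratios[k + 1:]   # drop unit k from the reference set
    return sqrt(ratios[k] / max(others)), sqrt(ratios[k] / min(others))

# Hypothetical data: unit 0 looks unusually good, unit 4 unusually bad.
inputs = [1.0, 2.0, 4.0, 4.0, 8.0]
outputs = [4.0, 2.0, 4.0, 3.0, 0.5]
scores = [loo_hyperbolic(inputs, outputs, k) for k in range(len(inputs))]
for k, (eff, ineff) in enumerate(scores):
    print(k, round(eff, 3), round(ineff, 3))
```

Because every ratio is positive when the data are positive, both measures are always defined, which mirrors the advantage claimed for the hyperbolic orientation.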
17. 17 Threshold Value for Identifying Outliers
18. 18 Iterative Outlier Detection Identify outliers based on hyperbolic orientation detection method
Remove identified outliers
Rerun outlier detection method
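
The detect, remove, rerun loop above might look like the following sketch. It reuses the one-input, one-output, constant-returns simplification, and the threshold `tol` is a hypothetical placeholder, since the threshold value from the slides is not preserved in this transcript.

```python
from math import sqrt

def loo_scores(ratios, pos):
    """Leave-one-out hyperbolic measures against the remaining ratios."""
    others = ratios[:pos] + ratios[pos + 1:]
    return sqrt(ratios[pos] / max(others)), sqrt(ratios[pos] / min(others))

def iterative_outliers(inputs, outputs, tol=1.5, max_rounds=10):
    """Repeat detection and removal until a round flags no unit.

    tol is an assumed threshold: a unit is flagged when it lies beyond the
    efficient frontier by more than tol, or beyond the inefficient frontier
    by more than 1/tol.
    """
    active = list(range(len(inputs)))
    removed = []
    for _ in range(max_rounds):
        if len(active) < 3:          # too few units left to compare
            break
        ratios = [outputs[i] / inputs[i] for i in active]
        flagged = []
        for pos, i in enumerate(active):
            eff, ineff = loo_scores(ratios, pos)
            if eff > tol or ineff < 1.0 / tol:
                flagged.append(i)
        if not flagged:
            break
        removed.extend(flagged)
        active = [i for i in active if i not in flagged]
    return removed, active

# Hypothetical data: unit 1 is only exposed once unit 0 has been removed.
inputs = [1.0, 1.0, 2.0, 4.0, 4.0, 5.0]
outputs = [9.0, 3.0, 2.0, 4.0, 3.0, 5.0]
removed, active = iterative_outliers(inputs, outputs)
print(removed, active)
```

In this data the first pass flags only unit 0; unit 1 is flagged on the second pass, which is the reason for rerunning the detection after each removal.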
19. 19 Bootstrapping method for the second stage of the two-stage DEA method Necessary because of the correlation among error terms in the second stage regression
Sample n observations, call this set b, with replacement from the set of input/output data
Calculate efficiency estimates for each of the original n observations relative to the set b
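
A minimal sketch of the two resampling steps above, assuming one input, one output, and constant returns to scale so that no LP solver is needed. The function names, the number of replicates, and the data are hypothetical. This is plain resampling with replacement as described in the steps; any smoothing of the resampled data is left out.

```python
import random

def efficiency_relative_to(inputs, outputs, ref_idx):
    """Efficiency of every original unit against the frontier of the units in ref_idx."""
    best = max(outputs[j] / inputs[j] for j in ref_idx)
    return [(y / x) / best for x, y in zip(inputs, outputs)]

def bootstrap_efficiencies(inputs, outputs, n_boot=200, seed=0):
    """Draw n_boot reference sets b and re-estimate all n original units each time."""
    rng = random.Random(seed)
    n = len(inputs)
    replicates = []
    for _ in range(n_boot):
        b = [rng.randrange(n) for _ in range(n)]   # sample n units with replacement
        replicates.append(efficiency_relative_to(inputs, outputs, b))
    return replicates

# Hypothetical data.
inputs = [2.0, 4.0, 5.0, 8.0, 10.0]
outputs = [2.0, 3.0, 5.0, 4.0, 5.0]
original = efficiency_relative_to(inputs, outputs, range(len(inputs)))
replicates = bootstrap_efficiencies(inputs, outputs)
print(original)
print(replicates[0])
```

A replicate can exceed 1 for a unit whenever the best-practice units happen not to be drawn into b, because the resampled frontier then sits inside the original one.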
20. 20 Second Stage Bootstrap Bootstrapping method for the second stage of the two-stage DEA method
Because bias is present, the efficiency estimate is corrected for bias
Confidence interval estimates
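
The formulas from this slide are not preserved in the transcript. A commonly used form of bootstrap bias correction and interval estimation is sketched below as an assumption about what is meant. Here $\hat{\theta}_k$ is the original estimate for unit $k$, $\hat{\theta}^{*}_{k,b}$ is its estimate in bootstrap replicate $b = 1,\dots,B$, and $q_{k,p}$ is the empirical $p$-quantile of the replicates.

```latex
\begin{aligned}
\widehat{\mathrm{bias}}_k &= \frac{1}{B}\sum_{b=1}^{B} \hat{\theta}^{*}_{k,b} \;-\; \hat{\theta}_k, \\
\tilde{\theta}_k &= \hat{\theta}_k - \widehat{\mathrm{bias}}_k
 \;=\; 2\,\hat{\theta}_k - \frac{1}{B}\sum_{b=1}^{B} \hat{\theta}^{*}_{k,b}, \\
\mathrm{CI}_{1-\alpha}(\theta_k) &= \Bigl[\, 2\,\hat{\theta}_k - q_{k,\,1-\alpha/2},\;\; 2\,\hat{\theta}_k - q_{k,\,\alpha/2} \,\Bigr].
\end{aligned}
```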
21. 21 Banker and Morey Data When outliers are determined based only on the efficient frontier, the finding is that population has no impact on efficiency
When outliers are determined based on both an efficient and an inefficient frontier, the finding is that population is negatively correlated with efficiency at the 95% confidence level
22. 22 Summary When an inefficient frontier is not considered, the second-stage results of the two-stage method are biased. The bootstrapping method does not correct this bias.
Developed a more comprehensive outlier detection methodology for non-parametric efficiency methods
23. 23 Thank You