Stochastic DEA: Myths and misconceptions Timo Kuosmanen (HSE & MTT) Andrew Johnson (Texas A&M University) Mika Kortelainen (University of Manchester) XI EWEPA 2009, Pisa, Italy

What is stochastic DEA? “DEA is truly a stochastic frontier estimation method, and it is incorrect to classify it as a deterministic method.” Banker & Natarajan (2008) Operations Research, p. 49

What is stochastic DEA? • The term stochastic (from Greek “στόχος”, meaning “aim” or “guess”) generally refers to statistical random variation

Elements of random variation in DEA • Random sampling of observations from the production possibility set (sampling error) • Random sampling of observations outside the production possibility set (outliers) • Random outcome of production process (stochastic technology) • Random measurement errors, omitted variables, and other disturbances (stochastic noise)

Common myths and misconceptions • Confusing stochastic noise with sampling variation, outliers, or stochastic technology • Statistical inference on sampling error is believed to improve robustness to noise • Robustness to outliers is seen as the same as robustness to noise (or at least closely related)

[Figure: Sampling error. True frontier; axes: input x, output y]

[Figure: Sampling error. True frontier with a random sample of observations (DMUs, firms); axes: x, y]

[Figure: Sampling error. DEA frontier compared with the true frontier; axes: x, y]

Statistical foundation of DEA • Banker (1993) Management Science • Korostelev, Simar & Tsybakov (1995) Annals Stat. • Kneip, Park & Simar (1998) Econometric Theory • Simar & Wilson (2000) JPA • Deterministic technology • No outliers or noise • Data randomly sampled from the production possibility set (PPS) • The DEA frontier converges to the true frontier as the sample size approaches infinity • In a finite sample, the DEA frontier is downward biased
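
The finite-sample bias and the convergence stated on this slide can be checked with a small simulation. The sketch below is an illustration only, under assumptions that are not in the slides: one input, one output, constant returns to scale (so the DEA frontier reduces to the ray with the largest observed y/x), a true frontier y = 2x, and uniformly distributed efficiency. The function names are hypothetical.

```python
import random

def dea_crs_slope(xs, ys):
    """CRS DEA frontier with one input and one output: the ray through the
    origin whose slope is the largest observed productivity y/x."""
    return max(y / x for x, y in zip(xs, ys))

def simulate_slope(n, true_slope=2.0, seed=0):
    """Estimate the frontier slope from n noise-free DMUs (assumed design)."""
    rng = random.Random(seed)
    xs = [rng.uniform(1.0, 10.0) for _ in range(n)]
    # Efficiency lies in (0, 1], so every observation is inside the PPS.
    ys = [true_slope * x * (1.0 - rng.random()) for x in xs]
    return dea_crs_slope(xs, ys)

def mean_estimate(n, reps=200):
    """Average the DEA slope estimate over independent samples of size n."""
    return sum(simulate_slope(n, seed=r) for r in range(reps)) / reps

if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The average stays below the true slope of 2.0 and rises with n.
        print(n, round(mean_estimate(n), 4))
```

Because no observation can lie above the true frontier in this design, the estimate can only err downward, and the gap shrinks as more observations fall close to the frontier.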

Statistical foundation of DEA • Statistical inference on sampling error is possible by using • Asymptotic sampling distribution (Banker 1993) • Bootstrapping (Simar & Wilson 1998) • Such inferences have nothing to do with • outliers • stochastic technology • stochastic noise

Bootstrapping • The purpose of the smooth consistent bootstrap (Simar & Wilson 1998, 2000) is to mimic the original random sampling in order to estimate the sampling bias • The bias-corrected DEA frontier will always lie above the original DEA frontier • In noisy data, DEA tends to overestimate the frontier • Assuming away noise, and “correcting” for the small sample bias by bootstrapping, we will shift the frontier upward => If noise is a problem, then bias correction will only make it worse
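
The argument on this slide can be traced numerically. The sketch below is a simplified illustration, not the smooth bootstrap of Simar & Wilson: it swaps in a naive resampling bootstrap, which is known to be unreliable for frontier estimation but shifts the estimate in the same upward direction. The data-generating process (one input, one output, constant returns to scale, true slope 2, multiplicative lognormal noise with sigma 0.2) and all function names are assumptions made for this example.

```python
import math
import random

def dea_crs_slope(data):
    """CRS DEA with one input and one output: largest observed y/x."""
    return max(y / x for x, y in data)

def noisy_sample(n, true_slope, sigma_v, rng):
    """DMUs with inefficiency AND multiplicative noise (assumed design)."""
    data = []
    for _ in range(n):
        x = rng.uniform(1.0, 10.0)
        efficiency = 1.0 - rng.random()            # in (0, 1]
        noise = math.exp(rng.gauss(0.0, sigma_v))  # can push y above the frontier
        data.append((x, true_slope * x * efficiency * noise))
    return data

def naive_bias_corrected(data, reps, rng):
    """Naive bootstrap bias correction (stand-in for the smooth bootstrap)."""
    estimate = dea_crs_slope(data)
    boot = [dea_crs_slope(rng.choices(data, k=len(data))) for _ in range(reps)]
    # A resample's maximum never exceeds the sample maximum, so the estimated
    # bias is never positive (up to rounding) and the correction moves up.
    bias_hat = sum(boot) / reps - estimate
    return estimate - bias_hat

TRUE_SLOPE = 2.0
rng = random.Random(1)
data = noisy_sample(200, TRUE_SLOPE, 0.2, rng)
estimate = dea_crs_slope(data)
corrected = naive_bias_corrected(data, 500, rng)
print("true:", TRUE_SLOPE, "DEA:", estimate, "bias-corrected:", corrected)
```

With noise of this size the DEA estimate already lies above the true slope, so any correction that can only move the estimate upward increases the error.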

[Figure: Simulated example; axes: x, y]

Critique of Löthgren & Tambour (LT) “LT bootstrap involves measuring the distance from a different, random (as opposed to fixed) point to the [frontier] on each replication of the bootstrap Monte Carlo exercise. It seems entirely unclear what this procedure estimates. Certainly, it does not estimate anything of interest.” … “LT method assumes not only that [the frontier] is unknown, but also (implicitly) that the point from which one wishes to measure distance to the frontier is unknown. This is absurd.” Simar & Wilson (2000), JPA, pp. 67-68.

[Figure: Outliers. Outliers and the true frontier; axes: x, y]

[Figure: Outliers. DEA frontier compared with the true frontier; axes: x, y]

Outliers • Super-efficiency approach (Wilson 1995 JPA) • Peeling the onion; context dependent DEA (Seiford & Zhu 1999 Management Science) • Robust efficiency measures / efficiency depth (Kuosmanen & Post 1999 DP, Cherchye, Kuosmanen & Post 2000 DP) • Conditional order-m and order-α quantile frontiers (Aragon, Daouia & Thomas-Agnan 2002 DP; Cazals, Florens & Simar 2002 J Econometrics; Daouia & Simar 2007 J Econometrics; Daraio & Simar 2007 book) • Deterministic technology • Improve robustness to outliers by not enveloping the most extreme observations • Outliers are different from noise • Noise affects all observations
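
The idea of not enveloping the most extreme observations can be illustrated with an output-oriented order-m frontier in the single-input, single-output case. The sketch below is a minimal, unconditional version written for this example: it evaluates the expected maximum output of m DMUs drawn with replacement from those using no more input than x0, using the exact formula for the empirical distribution instead of Monte Carlo draws. The data set and function names are hypothetical.

```python
def fdh_output(x0, data):
    """FDH output frontier at x0: largest output among DMUs using at most x0."""
    return max(y for x, y in data if x <= x0)

def order_m_output(x0, data, m):
    """Order-m output frontier at x0: expected maximum output of m DMUs drawn
    with replacement from the DMUs using at most x0. Assumes at least one
    such DMU exists."""
    ys = sorted(y for x, y in data if x <= x0)
    n = len(ys)
    # P(maximum of m draws is the k-th smallest value) = (k/n)^m - ((k-1)/n)^m
    return sum(y * ((k / n) ** m - ((k - 1) / n) ** m)
               for k, y in enumerate(ys, start=1))

# Hypothetical data: six regular DMUs, then one outlier at (3, 9.0).
data = [(1, 1.0), (2, 1.8), (3, 2.4), (4, 2.9), (5, 3.3), (6, 3.6)]
with_outlier = data + [(3, 9.0)]

fdh_shift = fdh_output(6, with_outlier) - fdh_output(6, data)
order_m_shift = order_m_output(6, with_outlier, 3) - order_m_output(6, data, 3)
print("FDH shift:", fdh_shift, "order-m shift (m=3):", order_m_shift)
```

The full frontier jumps all the way to the outlier, while the order-m frontier moves by a fraction of that distance. As m grows, the order-m value approaches the FDH value, so m controls how much robustness is traded against envelopment.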

[Figure: Stochastic technology. Quantile frontiers with Pr[f(x) ≤ f] = 0.05, 0.50, and 0.95; axes: x, y]

Chance constrained DEA • Land, Lovell & Thore (1993) Managerial & Decision Econ. • Olesen & Petersen (1995) Management Science • Cooper, Huang & Li (1996) Annals of OR • Huang & Li (2001) JPA • Stochastic technology, stochastic noise, both?

Chance constrained stochastic DEA • Huang & Li (2001) JPA • Assume inputs and outputs are multivariate normal random variables, with known expected values and covariance matrix
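
Under the normality assumption, a chance constraint has a deterministic equivalent, which is what makes such models solvable. The sketch below shows this for a single output constraint only and is not the full model of any of the cited papers: for a weighted combination W of jointly normal outputs, Pr(W ≥ y0) ≥ 1 − α holds exactly when mean(W) − z(1−α)·sd(W) ≥ y0. The weights, means, covariance matrix, and function names are hypothetical.

```python
import math
from statistics import NormalDist

def combination_moments(weights, means, cov):
    """Mean and standard deviation of a weighted sum of jointly normal outputs."""
    mean = sum(w * mu for w, mu in zip(weights, means))
    var = sum(wi * wj * cov[i][j]
              for i, wi in enumerate(weights)
              for j, wj in enumerate(weights))
    return mean, math.sqrt(var)

def chance_constraint_holds(weights, means, cov, y0, alpha):
    """Deterministic equivalent of Pr(weighted output >= y0) >= 1 - alpha."""
    mean, sd = combination_moments(weights, means, cov)
    z = NormalDist().inv_cdf(1.0 - alpha)
    return mean - z * sd >= y0

# Hypothetical "known" parameters for two reference DMUs.
weights = [0.5, 0.5]
means = [10.0, 12.0]
cov = [[1.0, 0.5], [0.5, 1.0]]
cov_x4 = [[4.0 * c for c in row] for row in cov]  # same means, larger variance

print(chance_constraint_holds(weights, means, cov, y0=9.5, alpha=0.05))
print(chance_constraint_holds(weights, means, cov_x4, y0=9.5, alpha=0.05))
```

The two calls differ only in the covariance matrix, yet the constraint holds in the first case and fails in the second, so the outcome depends directly on parameters that the model treats as known.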

Chance constrained stochastic DEA • How do we get “knowledge” about the expected values of inputs and outputs? • Cannot be estimated from cross-sectional data • Panel data estimation would require that the true inputs and outputs do not change over time • How do we get “knowledge” about the variances and covariances of the error terms? • Uncertainty of the parameter estimates is not taken into account in the model

[Figure: Stochastic noise. True frontier; axes: x, y]

Stochastic DEA models to deal with noise • DEA+ • Gstach (1998) JPA • Banker & Natarajan (2008) Operations Research • “Stochastic DEA” • Banker, Datar & Kemerer (1991) Management Science • Stochastic FDH/DEA estimators • Simar & Zelenyuk (2008) DP • Stochastic Nonparametric Envelopment of Data (StoNED) • Kuosmanen (2006) DP; Kuosmanen & Kortelainen (2007) DP

Stochastic DEA models to deal with noise • Estimation of a fully deterministic frontier based on data perturbed by noise • The shape of the frontier can be estimated without parametric assumptions • Estimation of inefficiency (efficiency scores) is very challenging in a cross-sectional setting • Observed output contains the noise term • Only the conditional expected value of inefficiency can be estimated • Even the SFA efficiency estimator is not consistent!
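
The point about the conditional expected value can be made concrete with the standard conditional-mean formula used in SFA under a normal noise term and a half-normal inefficiency term. The sketch below is an illustration under those distributional assumptions, with the composite error defined as eps = v − u and the variance parameters treated as known. The function name and the two example firms are hypothetical.

```python
import math
from statistics import NormalDist

def conditional_mean_inefficiency(eps, sigma_u, sigma_v):
    """E[u | eps] for eps = v - u, with v ~ N(0, sigma_v^2) and
    u half-normal with scale sigma_u (normal/half-normal assumption)."""
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    std = NormalDist()
    ratio = mu_star / sigma_star
    return mu_star + sigma_star * std.pdf(ratio) / std.cdf(ratio)

# Two hypothetical firms with different true inefficiency u but the same
# composite error eps = v - u = -0.5.
firms = {"A": (0.25, -0.25), "B": (0.5, 0.0)}  # name -> (u, v)
for name, (u, v) in firms.items():
    estimate = conditional_mean_inefficiency(v - u, sigma_u=0.3, sigma_v=0.3)
    print(name, "true u:", u, "E[u | eps]:", round(estimate, 4))
```

Both firms receive the same estimate because the estimator sees only the composite error, and a larger cross-section does not change this for any individual firm. The formula is monotonically decreasing in eps, so the estimates order firms the same way as the composite errors do.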

Stochastic DEA models to deal with noise • In a cross-sectional setting, identifying inefficiency and noise requires strong assumptions • Assuming away noise completely is a strong assumption, too • Distributional assumptions do not influence the efficiency rankings • Ondrich & Ruggiero (2001) EJOR

Conclusions • Stochastic noise should not be confused with sampling error, outliers, or stochastic technology • Correcting for small sample bias by bootstrapping does not improve robustness to noise; it can even make things worse • Improving robustness to outliers is different from improving robustness to stochastic noise, which perturbs all observations
