An Efficient Data Envelopment Analysis with a large data set in Stata


An Efficient Data Envelopment Analysis with a large data set in Stata. 15-16 July, 2010 Boston10 Stata Conference Choonjoo Lee, Kyoung-Rok Lee sarang90@kndu.ac.kr, bloom.rampike@gmail.com Korea National Defense University. Contents. Part I. A Large Data Set in Stata/DEA

Presentation Transcript

An Efficient Data Envelopment Analysis with a large data set in Stata

15-16 July, 2010

Boston10 Stata Conference

Choonjoo Lee, Kyoung-Rok Lee

sarang90@kndu.ac.kr, bloom.rampike@gmail.com

Korea National Defense University

Contents

Part I. A Large Data Set in Stata/DEA

  • Large Data Set in DEA?
  • Computational Aspects of Large Data Set
  • The Scope of this Study
  • Efficiency Matters in Stata/DEA/Linear Programming
  • Tasks to be covered

Part II. Malmquist Index Analysis with the Panel Data

  • Basic Concept of Malmquist Index
  • The User Written Command “malmq”

Part I. A Large Data Set in Stata/DEA

  • Large Data Set in DEA?
  • Computational Aspects of Large Data Set
  • The Scope of this Study
  • Efficiency Matters in Stata/DEA/Linear Programming
  • Tasks to be covered
Large Data Set in DEA?
  • Graphical illustration of DEA concept
Large Data Set in DEA?
  • Limits on the Number of Variables and Observations, by Type of DEA Program (Language)
    • Statistical-package-based DEA programs
    • Spreadsheet-based DEA programs
    • Language-based DEA codes
  • Performance of the Linear Program (LP): Efficiency and Accuracy
    • LP is the critical component of a DEA program
    • Approaches to solving an LP: simplex, interior point methods (IPMs)

☞ Numerous Variants of the Basic LP Approach

  • DEA Report Format (User Interface Design)
    • Results (input, output)
    • Graphical Display
    • Log
Computational Aspects of Large Data Set
  • Matrix Size for the Data Set in Matrix Format
    • Number of rows and columns (variables and observations) allowed by the program
    • The storage limit of the computer's memory
    • Advances in computer hardware and in how data in memory are accessed
  • Matrix Density
    • Number of nonzero elements in the matrix
    • How many elements of the matrix are zero?
  • DEA Is Computationally Demanding Because of the LP
    • In the worst case, the number of simplex iterations needed to solve a problem grows exponentially with the number of variables and observations
  • Numerical Difficulties
    • Inaccuracy and inefficiency due to floating-point arithmetic with finite precision
    • Limited numerical precision due to the binary representation of numbers
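Matrix density can be made concrete with a small sketch. The helper names `density` and `with_slacks` and the 2×2 matrix are illustrative assumptions, not part of the presentation; the point is only that appending slack columns (an identity block) to a dense constraint matrix lowers its density.

```python
def density(matrix):
    """Share of nonzero entries in a matrix given as a list of rows."""
    rows, cols = len(matrix), len(matrix[0])
    nonzeros = sum(1 for row in matrix for value in row if value != 0)
    return nonzeros / (rows * cols)


def with_slacks(matrix):
    """Append one slack column per row (an identity block) to the matrix."""
    m = len(matrix)
    return [list(row) + [int(i == k) for k in range(m)]
            for i, row in enumerate(matrix)]


A = [[1, 2],
     [3, 4]]                      # hypothetical dense constraint matrix
print(density(A))                 # 1.0
print(density(with_slacks(A)))    # 0.75
```

For a DEA problem with many DMUs the data block itself is usually dense, so whether sparse-matrix techniques pay off depends on the model; this sketch only shows how the measure is computed.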
The Scope of this Study
  • Performance of DEA code
    • Linear Program/Simplex Method
    • Computational Technique
    • Illustration
  • Panel Data in DEA
    • Malmquist Index Analysis
Efficiency Matters in Stata/DEA/LP
  • A DEA program demands heavy computation
    • Computation time depends heavily on the number of observations (DMUs), the number of variables (inputs, outputs), the LP procedure, etc.
  • Stata uses RAM (memory) to store data
    • Memory size matters for a large data set
Efficiency Matters in Stata/DEA/LP
  • The performance of Input Oriented DEA models

※ Stata SE

Efficiency Matters in Stata/DEA/LP
  • Understanding the difference in computation
  • What happens when the number of observations (n) becomes significantly larger than the number of variables (m)?
Efficiency Matters in Stata/DEA/LP
  • Data
    • Source: Cooper et al. (2006), Table 3-7
  • Tableau and Revised Simplex in DEA/LP
Efficiency Matters in Stata/DEA/LP
  • Tableau and Revised Simplex in DEA/LP
  • For DMU A
  • The Basic DEA Models
Efficiency Matters in Stata/DEA/LP
  • Program Syntax

dea ivars = ovars [if] [in] [, rts(crs | vrs | drs | irs) ort(in | out) stage(1 | 2) trace saving(filename)]

  • rts(crs | vrs | drs | irs) specifies the returns to scale. The default, rts(crs), specifies constant returns to scale.
  • ort(in | out) specifies the orientation. The default is ort(in), meaning input-oriented DEA.
  • stage(1 | 2) specifies how the efficiency slacks are identified. The default is stage(2), meaning two-stage DEA.
  • trace saves the full sequence of output displayed in the Results window to the dea.log file. By default, only the final results are saved to dea.log.
  • saving(filename) specifies that the results be saved in filename.dta.
Efficiency Matters in Stata/DEA/LP
  • Develop the Basic Data Bank(input oriented CRS)
  • Canonical form
  • Standard form
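The formulas on this slide are not part of the transcript. For orientation, the input-oriented CRS (CCR) envelopment model as it is commonly written in the DEA literature is given below; the notation is an assumption rather than the slide's own: DMU $o$ is the unit under evaluation, $j = 1, \dots, n$ indexes DMUs, $i$ indexes inputs, $r$ indexes outputs, and $s_i^-$, $s_r^+$ are slack variables.

Canonical form (inequality constraints):

$$
\begin{aligned}
\min_{\theta,\,\lambda}\quad & \theta \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta\, x_{io} \quad \text{for every input } i, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{ro} \quad \text{for every output } r, \\
& \lambda_j \ge 0, \quad j = 1, \dots, n.
\end{aligned}
$$

Standard form (equality constraints with slacks):

$$
\begin{aligned}
\min_{\theta,\,\lambda,\,s^-,\,s^+}\quad & \theta \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^- = \theta\, x_{io} \quad \text{for every input } i, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} - s_r^+ = y_{ro} \quad \text{for every output } r, \\
& \lambda_j \ge 0, \quad s_i^- \ge 0, \quad s_r^+ \ge 0.
\end{aligned}
$$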
Efficiency Matters in Stata/DEA/LP
  • Model V1: Tableau DEA
    • Efficiency score(θ) of DMU A is 14/15
Efficiency Matters in Stata/DEA/LP
  • Model V3: Revised DEA

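$$
\begin{array}{cc|c}
c & 0 & 0 \\ \hline
A & I & b
\end{array}
\;\longrightarrow\;
\begin{array}{cc|c}
c_B & c_N & 0 \\ \hline
B & N & b
\end{array}
\;\longrightarrow\;
\begin{array}{cc|c}
0 & c_N - c_B B^{-1} N & c_B B^{-1} b \\ \hline
I & B^{-1} N & B^{-1} b
\end{array}
$$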

Efficiency Matters in Stata/DEA/LP
  • Model V3: Revised DEA
    • Step1: Set up the initial tableau factors.
    • Step2: Find entering variable.
    • Step3: Find leaving variable.
    • Step4: Update the tableau. (Update the basis.)
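The four steps can be sketched in a few lines. This is a minimal illustration, not the authors' Stata/Mata code and not the DEA data from the slides: it assumes a generic maximization problem with ≤ constraints and a nonnegative right-hand side, so that the slack variables form the starting basis. The function names `solve_linear` and `revised_simplex` and the small test problem are hypothetical, and exact fractions are used so that pivoting is free of rounding error.

```python
from fractions import Fraction


def solve_linear(B, rhs):
    """Solve B x = rhs exactly by Gauss-Jordan elimination."""
    n = len(B)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(B, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def revised_simplex(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, assuming b >= 0."""
    m, n = len(A), len(c)
    # Step 1: initial factors. Slack columns are appended and form B = I.
    cols = [[Fraction(A[i][j]) for i in range(m)] for j in range(n)]
    cols += [[Fraction(int(i == k)) for i in range(m)] for k in range(m)]
    cost = [Fraction(v) for v in c] + [Fraction(0)] * m
    basis = list(range(n, n + m))
    while True:
        B = [[cols[j][i] for j in basis] for i in range(m)]
        B_transposed = [list(cols[j]) for j in basis]
        x_B = solve_linear(B, b)                                    # B^-1 b
        y = solve_linear(B_transposed, [cost[j] for j in basis])    # cB B^-1
        # Step 2: entering variable = largest reduced cost cN - cB B^-1 N.
        reduced = {j: cost[j] - sum(y[i] * cols[j][i] for i in range(m))
                   for j in range(n + m) if j not in basis}
        enter = max(reduced, key=reduced.get)
        if reduced[enter] <= 0:
            break                                   # no improving variable
        # Step 3: leaving variable = minimum ratio over positive entries.
        d = solve_linear(B, cols[enter])            # B^-1 times entering column
        ratios = [(x_B[i] / d[i], i) for i in range(m) if d[i] > 0]
        if not ratios:
            raise ValueError("problem is unbounded")
        _, leave = min(ratios)
        # Step 4: update the basis.
        basis[leave] = enter
    x = [Fraction(0)] * (n + m)
    for i, j in enumerate(basis):
        x[j] = x_B[i]
    return sum(cost[j] * x[j] for j in range(n)), x[:n]


# Textbook example: max 3x + 5y s.t. x <= 4, 2y <= 12, 3x + 2y <= 18.
value, x = revised_simplex([3, 5], [[1, 0], [0, 2], [3, 2]], [4, 12, 18])
print(value, x)   # optimum 36 at x = 2, y = 6
```

Only the m×m basis system is solved at each iteration, rather than updating the full tableau, which is why the revised method is attractive when the number of columns is much larger than the number of rows. A production code would update a factorization of B instead of re-solving from scratch as this sketch does.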

Tableau factors: $c_N$, $c_B$, $N$, $B$, $b$.

Efficiency Matters in Stata/DEA/LP
  • Model V3: Revised DEA

  • 1st step: Set up the initial tableau factors $B$, $x_B$, $c_B$, and $c_B B^{-1}$.
  • 2nd step: Find the entering variable.
    • Compute $c_N - c_B B^{-1} N$; the variable with the maximum value is selected as the entering variable.
  • 3rd step: Find the leaving variable by the minimum ratio test.
    • $\min\{x_B / (B^{-1} N)\} = \min\{\times, \times, 70/90, 6/8\} = 6/8$ (← $x_4$)
Efficiency Matters in Stata/DEA/LP
  • Model V3: Revised DEA

  • 4th step: Update the tableau (factors $c_N$, $c_B$, $N$, $B$, $b$).

Tasks to be covered
  • Computational Accuracy
    • Example: Obtaining Inverse Matrix
      • Matrix D
Tasks to be covered
  • Computational Accuracy
    • Example: Obtaining Inverse Matrix
      • Inverse of matrix D computed with Stata/Mata “luinv(D)”
Tasks to be covered
  • Computational Accuracy
    • Example: Obtaining Inverse Matrix
      • $D \cdot D^{-1}$ in Stata/Mata (default tolerance)
  • Should it be the identity matrix?
Tasks to be covered
  • Computational Accuracy
    • Example: Obtaining Inverse Matrix
      • $D \cdot D^{-1}$ in Excel
  • Where does the computational inaccuracy come from?
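The effect can be reproduced outside Stata and Excel. The sketch below is an illustration under stated assumptions: the matrix D from the slides is not in the transcript, so a 6×6 Hilbert matrix (a well-known ill-conditioned matrix) stands in for it, and a plain Gauss-Jordan inverse written here replaces Mata's `luinv()`. All function names are hypothetical. The same routine is run once with binary floating-point numbers and once with exact fractions.

```python
from fractions import Fraction


def inverse(M):
    """Gauss-Jordan inverse with partial pivoting; works for float or Fraction."""
    n = len(M)
    num = type(M[0][0])
    aug = [list(row) + [num(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def max_deviation_from_identity(P):
    """Largest absolute difference between P and the identity matrix."""
    n = len(P)
    return max(abs(P[i][j] - (1 if i == j else 0))
               for i in range(n) for j in range(n))


def hilbert(n, num):
    """n x n Hilbert matrix with entries 1 / (i + j + 1) of numeric type num."""
    return [[num(1) / num(i + j + 1) for j in range(n)] for i in range(n)]


D_float = hilbert(6, float)
D_exact = hilbert(6, Fraction)

# Floating point: small but typically nonzero deviation from the identity.
print(max_deviation_from_identity(matmul(D_float, inverse(D_float))))
# Exact rational arithmetic: the product is exactly the identity.
print(max_deviation_from_identity(matmul(D_exact, inverse(D_exact))))  # 0
```

Comparing the two printed values separates the error introduced by finite-precision arithmetic from any property of the algorithm itself, since both runs execute identical steps.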
Tasks to be covered
  • Computational Accuracy
    • One of the possible reasons: Decimal and Binary numbers

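17 (decimal number)

      • 17 / 2 = 8, remainder 1
      • 8 / 2 = 4, remainder 0
      • 4 / 2 = 2, remainder 0
      • 2 / 2 = 1, remainder 0
      • 1 / 2 = 0, remainder 1

= 10001 (binary number), reading the remainders from bottom to top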

  • How does a computer store a = 0.75, b = 0.7 + 0.05, and c = 0.6 + 0.1 + 0.05?
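A short sketch of both points, written in Python rather than Stata as an illustration. It assumes IEEE 754 double precision, which Python floats use on common platforms; the function name `to_binary` is hypothetical. `Fraction(x)` recovers the exact value that a float actually stores, which shows which decimal numbers survive the conversion to binary unchanged.

```python
import math
from fractions import Fraction


def to_binary(n):
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 2)
        digits.append(str(remainder))
    # The last remainder is the most significant bit, so reverse the list.
    return "".join(reversed(digits))


print(to_binary(17))                      # 10001

# 0.75 = 1/2 + 1/4 has a finite binary expansion and is stored exactly.
print(Fraction(0.75) == Fraction(3, 4))   # True
# 0.7 has no finite binary expansion, so the stored value is only close.
print(Fraction(0.7) == Fraction(7, 10))   # False
# A familiar consequence of such rounding:
print(0.1 + 0.2 == 0.3)                   # False

a = 0.75
b = 0.7 + 0.05
c = 0.6 + 0.1 + 0.05
# Inspect the stored bit patterns; whether they coincide depends on how the
# rounding errors of the individual terms happen to combine.
print(a.hex(), b.hex(), c.hex())
# Comparing with a tolerance is the safer test for computed values.
print(math.isclose(a, b), math.isclose(a, c))
```

Because the individual terms are rounded before they are added, equality tests on computed sums are unreliable in general, which motivates the tolerance settings discussed next.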
Tasks to be covered
  • Accuracy
    • Tolerance
      • to set an upper or lower limit on the number of iterations.
      • to stop an unattended run if the algorithm falls into a cycle.
    • Preprocessing: Scaling
      • to narrow the range of magnitudes among the matrix entries and obtain a numerically safe solution.

Example: rank(D)


Part II. Malmquist Index Analysis with the Panel Data

  • Basic Concept of Malmquist Index
  • The User Written Command “malmq”
Basic Concept of Malmquist Index
  • The Malmquist Productivity Index (MPI) measures productivity change over time and can be decomposed into efficiency change and technical change.
Basic Concept of Malmquist Index

The input-oriented MPI can be expressed in terms of input-oriented CRS efficiency, as in Equations 1 and 2, using the observations at times t and t+1.

Basic Concept of Malmquist Index

The geometric mean of the input-oriented MPI can be decomposed into input-oriented technical change and input-oriented efficiency change, as given in Equation 4.
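The numbered equations are not part of the transcript. As a hedged illustration, the sketch below uses the decomposition commonly found in the DEA literature, computed from four CRS efficiency scores. The function name `malmquist`, the argument names, and the example scores are assumptions, and sign or inversion conventions differ between authors, so the output of `malmq` may be defined differently.

```python
import math


def malmquist(e_t_t, e_t_t1, e_t1_t, e_t1_t1):
    """
    Malmquist index and its decomposition from four CRS efficiency scores.

    e_s_r is the efficiency of the period-r observation measured against
    the period-s frontier (t1 stands for t+1). In this convention a value
    above 1 indicates improvement.
    """
    m_t = e_t_t1 / e_t_t          # index relative to the period-t frontier
    m_t1 = e_t1_t1 / e_t1_t       # index relative to the period-(t+1) frontier
    mpi = math.sqrt(m_t * m_t1)   # geometric mean of the two indexes
    efficiency_change = e_t1_t1 / e_t_t
    technical_change = math.sqrt((e_t_t1 / e_t1_t1) * (e_t_t / e_t1_t))
    return mpi, efficiency_change, technical_change


# Hypothetical scores for one DMU.
mpi, ec, tc = malmquist(e_t_t=0.8, e_t_t1=1.0, e_t1_t=0.64, e_t1_t1=0.8)
print(mpi, ec, tc)
print(math.isclose(mpi, ec * tc))   # True: MPI = efficiency change x technical change
```

In this example the DMU's own-period efficiency is unchanged, so the whole productivity change is attributed to a shift of the frontier.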

The User Written Command “malmq”
  • Program Syntax

malmq ivars = ovars [if] [in] [, ort(in | out) period(varname) trace saving(filename)]

  • ort(in | out) specifies the orientation. The default is ort(in), meaning input-oriented DEA.
  • period(varname) identifies the time variable.
  • trace saves the full sequence of output displayed in the Results window to the malmq.log file. By default, only the final results are saved to malmq.log.
  • saving(filename) specifies that the results be saved in filename.dta.
Notes
  • The data and code related to the presentation will be available from the Conference website.
References
  • Cooper, W. W., Seiford, L. M., & Tone, K. (2006). Introduction to Data Envelopment Analysis and Its Uses. Springer Science+Business Media.
  • Ji, Y., & Lee, C. (2010). “Data Envelopment Analysis”. The Stata Journal, 10(2), pp. 267-280.
  • Lee, C., & Ji, Y. (2009). “Data Envelopment Analysis in Stata”. DC09 Stata Conference.
  • Maros, I. (2003). Computational Techniques of the Simplex Method. Kluwer Academic Publishers.