1 / 30

Implementing Legacy Statistical Algorithms in a Spreadsheet Environment

Implementing Legacy Statistical Algorithms in a Spreadsheet Environment. Stephen W. Liddle Information Systems Faculty Rollins eBusiness Center. John S. Lawson Department of Statistics. Brigham Young University Provo, UT 84602. Overview. Introduction Fundamentals of VBA in Excel

robertkirby
Download Presentation

Implementing Legacy Statistical Algorithms in a Spreadsheet Environment

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Implementing Legacy Statistical Algorithms in a Spreadsheet Environment Stephen W. Liddle Information Systems Faculty Rollins eBusiness Center John S. Lawson Department of Statistics Brigham Young University Provo, UT 84602

  2. Overview • Introduction • Fundamentals of VBA in Excel • Retargeting traditional algorithms to a spreadsheet environment • Converting FORTRAN to VBA • Conclusions

  3. Why Convert FORTRAN Programs to Run in a Spreadsheet Environment? • Useful code available that is not implemented in standard statistical packages • FORTRAN compilers not usually available on normal Windows workstation • Many textbooks refer to published FORTRAN algorithms

  4. Sources for Published FORTRAN Algorithms • STATLIB (http://lib.stat.cmu.edu/) • General Archive • Applied Statistics Archive • Journal of Quality Technology Archive • JASA Software Archive • JCGS Archive

  5. Advantages of Running Legacy FORTRAN Code in Excel • Comfortable environment for practitioners • More user friendly input from spreadsheet • Output to spreadsheet allows further graphical and computational analysis of results with Excel functions

  6. Proposed Methodology • Understand original FORTRAN program • Choose suitable I/O methods • Convert original FORTRAN code to VBA • Test and use resulting Excel code

  7. Visual Basic For Applications • Built on ANSI BASIC • Language engine of Microsoft Office • Modern structured programming language • Has vast array of types, functions, programming helps • Powerful support environment (Office platform) • Popular in business contexts

  8. Application Workbooks (Workbook) Worksheets (Worksheet) Range Chart Excel Object Model • Objects in Excel are addressable in VBA • Each object has: • Properties • Methods

  9. Output Region Input Region Clicking these buttons runs the ORPS1 and ORPS2 algorithms. Input/Output Methods • Non-interactive • Files, databases • Worksheet cells • Interactive • Message boxes • Input boxes • Custom GUI forms

  10. FORTRAN vs. VBA • VBA: (-b+Sqr (b^ 2-4*a*c))/(2*a) • FORTRAN: (-b+SQRT(b**2-4*a*c))/(2*a)

  11. More Operators • .EQ. = • .NE. <> • .LT. < • .LE. <= • .GT. > • .GE. >= • .AND. And • .OR. Or • .NOT. Not • // &

  12. Data Types • INTEGER Byte, Integer, Long • REAL Single • DOUBLE PRECISION Double • COMPLEX Non-primitive in VBA • LOGICAL Boolean • CHARACTER String • CHARACTER*length String*length • Other notable VBA types: • Currency, Decimal, Date, Variant

  13. Worksheet Functions • ChiDist(x,deg_freedom) • Returns one-tailed probability of the χ2 distribution. • Correl(array1,array2) • Returns the correlation coefficient of two cell ranges. • Fisher(x) • Returns the Fisher transformation at a given x. • Pearson(array1,array2) • Returns the Pearson product moment correlation coefficient for two sets. • Quartile(array,quart) • Returns the requested quartile of a data set. • StDev(array) • Returns the standard deviation of a data set. • ZTest(array,x,sigma) • Returns the two-tailed P-value of a z-test.

  14. Flow-Control Statements

  15. Subtle Differences (“Gotchas”) • Implicit conversion of real to integer values • FORTRAN: truncate • VBA: round • Solution: use VBA’s Fix(), which truncates • Both languages allow implicit typing • This introduces ambiguity • Solution: supply explicit types everywhere

  16. Eliminating Goto Statements • Computer science accepts the axiom that goto is generally “considered harmful” • We advocate rewriting alogrithms to use structured programming techniques where feasible • Sine qua non is “make it work” • It’s a good idea for maintainability, understandability to move to structured form

  17. Eliminating Goto Statements DO 8 J=1,3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 IF(J.EQ.3) GO TO 8 XK=BESTK-STEP 8 CONTINUE

  18. Eliminating Goto Statements For j=1 To 3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 IF(J.EQ.3) GO TO 8 XK=BESTK-STEP 8 Next j

  19. Eliminating Goto Statements For j=1 To 3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 If j <> 3 Then xk = bestk - step End If Next j

  20. Eliminating Goto Statements For j=1 To 3 ... Do Until objfn > bestfn ... Loop If j <> 3 Then xk = bestk - step End If Next j

  21. Our Reasoning • Digital assets are fragile • FORTRAN is not universally available • Excel is a ubiquitous, powerful platform • VBA is a full-featured language capable of handling sophisticated statistical computations

  22. Conclusions • We recommend creating a Web-based repository of Excel/VBA implementations of classic statistical algorithms • We can preserve our legacy algorithms in this modern spreadsheet environment • E-mail us if you want a copy of our manuscript (liddle or lawson@byu.edu)

More Related