1 / 23

Simon Murphy simon.murphy@codematic Developer – Codematic Ltd

Management Paper - Comparison of Spreadsheets with other development tools (limitations, solutions, workarounds and alternatives). Simon Murphy simon.murphy@codematic.net Developer – Codematic Ltd. Spreadsheet background. 5-30 Mb size 20-200,000 formulas 1-10,000 unique formulas

vanna
Download Presentation

Simon Murphy simon.murphy@codematic Developer – Codematic Ltd

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Management Paper - Comparison of Spreadsheets with other development tools (limitations, solutions, workarounds and alternatives) Simon Murphy simon.murphy@codematic.net Developer – Codematic Ltd

  2. Spreadsheet background • 5-30 Mb size • 20-200,000 formulas • 1-10,000 unique formulas • 5-10,000 lines of VBA • £1M to Billions in values • Often linked to other technologies such as OLAP, ADO, COM or .net etc. • Finance, Banking and Sales and Marketing areas • Growth rate 500 pa pa (12k data items ph)

  3. Agenda • 2 Apologies – Excel v Spreadsheet, Critical • Motivation • Definitions • Brief Summary Of Main Points • In Depth Analysis And Demos • Select by location • Worksheet & VBA insecurity • Formula complexity • Culture • Summary • Conclusion • Any Questions

  4. Motivation (For writing the paper) • Spreadsheets seem more difficult to test effectively than databases and source code. • Most mainstream/formally trained developers shun spreadsheets as a development tool. • Most spreadsheet developers only work in spreadsheets, rarely databases, or procedural languages. • Spreadsheets seem to be stuck on ‘Garbage in Garbage Out’ whilst mainstream development has moved to the much more robust ‘No Garbage In’ approach. • Many observers recommend extra tools, methodologies or training to improve spreadsheet quality, they miss the key point – People use spreadsheets because of their flexibility not in spite of it. • Is it realistic to comply with the spirit of Sarbanes-Oxley with spreadsheets? • I believe spreadsheets should have a much more limited role in important information systems.

  5. Definitions • Spreadsheet – powerful and flexible, single end user, analysis and presentation tool, optimised for speed of initial development. • Spreadsheet Paradigm – normal reference based formulas, excludes lookups and pivots etc. • Systems Development lifecycle – Requirements, Analysis, Design,<Technology Choice>, Construction, Test, Release, Maintain. In some shape or form. • Spreadsheet Development lifecycle – “Oh! I need a model” – clickety-click, reasonableness check, release, (Test & Maintain in live environment).

  6. Brief Summary • Spreadsheets are great for many jobs, but their flexibility make them ... • Inherently fragile • Not type safe • Only have global scope • Lack of data/logic separation • Insecure • Don’t scale well • Limited development tools • VBA generally poorly written • Often ad-hoc • External links can be dangerous Possible workaround Issue • Focus on smaller models • Clear layout, data validation • Use blocks with spaces • Layout for understanding • Use compiled language • Use in requirements phase • 3rd Party tools from web • Take training/coaching • Sketch design on paper first • Use auditable import routine

  7. External link example from a commercial model • 1 workbook linked to 34 other workbooks, • 20 of which were found • 14 missing (one was ‘book1’ (i.e. not saved)) • over 100 links found in total. • Excel generally does not know or warn about circular references through external links. • Links lock directory structure • Note: Approx 1.4 Million data items

  8. In Depth and Demos • Select by Location v Select by Value • Worksheet and VBA Insecurity • Formula complexity

  9. In Depth and Demo 1Select by location not value Original paper spreadsheet had only one mode of access, by the human user, reading and writing the text and numbers.

  10. In Depth and Demo 1Select by location not value Electronic spreadsheets have two modes of access 1.The human user, reading and typing the text and numbers, as before. 2.The spreadsheet itself, calculating formulas based on cell address (location). This disconnect is great for flexibility but makes spreadsheets very fragile.

  11. In Depth and Demo 1Database approach In a database the user issues queries based on the values in the actual data, the database uses these same values to provide results. There is only one access mode, there is no disconnect. SELECT * FROM PL WHERE LineItem = ‘Sales’

  12. In Depth and Demo 1Select by location not value Demo 1: Which worksheet is correct Demo 2: The database approach

  13. In Depth and Demo 2Worksheet and VBA Insecurity Key point 1: Security is measured in time and effort. Eg. Safes – TXTL-60X6 – Torch, explosive and tool resistant for 60 minutes (all 6 walls) In software – User skill, access and specialist tools are considerations Key point 2: Spreadsheet security is not weak or poor. If anything it is probably too strong (for a single end user tool). (Personal experience: Relying on spreadsheet security can make bad things happen.) Demo 1: Break sheet protection faster than setting it. Demo 2: Cracking VBA protection for fun.

  14. In Depth and Demo 3Formula Complexity (1) How many people are intermediate or better in spreadsheets? How many people have any experience of C#?

  15. In Depth and Demo 3Formula Complexity (2) Who can tell us what this (array) formula does? {=INDEX(Circulation!CU1:CU175, MAX(ROW(Circulation!CU71:CU175) *(Circulation!CU71:CU175<>0)))}

  16. In Depth and Demo 3Formula Complexity (3) Who can tell us what this custom worksheet formula does? =LastNonZeroValueFromVerticalList( Circulation!CU71:CU175)

  17. In Depth and Demo 3Formula Complexity (4) Note: function defined once, used many times in workbook Note: Although it is possible to write worksheet functions in .net, c++ xlls are still the standard for performance reasons.

  18. In Depth and Demo 3Formula Complexity (4) (Uses XLL+ from Planatech)

  19. Optional Relative complexitySpreadsheets v mainstream Glass suggests that for each 25% increase in problem complexity the complexity of the required solution doubles.

  20. Culture • If Spreadsheets have such clear limits, why are they so regularly abused? • Must be commercial pressure, preference for speed over accuracy. Considering current skills, time and cost to build, spreadsheets are a great tool. Considering Total Cost of Ownership they aren’t. • If the research is right about error rates why no large scale business collapse? • Spreadsheet errors must not be material (or not mission critical at least). Probably spreadsheets are only a part of the decision support system.

  21. Summary • Although spreadsheets have many uses, they also have some features that make them inappropriate for certain types of development. We looked at 3 in depth: • Inherent weakness of the ‘Select by location’ approach compared to ‘Select by value’ • No real security to protect model integrity or intellectual property. • Rather cryptic native syntax (even experienced developers sometimes have difficulty) • Spreadsheet robustness and reliability can be increased by use of complimentary technologies like .net and databases.

  22. Conclusion • The ‘Select by Location’ approach is a fundamental cause of spreadsheet complexity and fragility. • Spreadsheets are a superb tool with many uses, mainly at the smaller and shorter-term end of system developments • Be aware of their limitations • You have a choice of tools – choose wisely.

  23. Questions? • simon.murphy@codematic.net • Spreadsheet consulting, reviewing, maintaining, rescuing, migrating, add-in development etc. • Websites • www.codematic.net • www.xlanalyst.co.uk

More Related