
Chris Mohl Statistics Canada

The Continuing Evolution of Generalized Systems at Statistics Canada for Business Survey Processing


Presentation Transcript


  1. The Continuing Evolution of Generalized Systems at Statistics Canada for Business Survey Processing Chris Mohl Statistics Canada

  2. Outline • Why Generalize? • Factors Influencing the Evolution • The Systems • Development, Support and Maintenance • Lessons Learned • Possible Future Activities • Conclusions

  3. Why Generalize Systems? • Fully researched methods • Thoroughly tested • Complete documentation • Expert support team • Minimal user programming required – improves timeliness • Coherent methods across surveys

  4. Factors Influencing the Evolution • Changes in technology • Mainframe to PC/UNIX processing • Some underlying software no longer supported • Statistics Canada’s SAS site license • Need for new or more sophisticated methods

  5. The Systems • Can be classified into three groupings • Mature Systems • No new development • Redesign Systems • Reengineering of old systems • New Development Systems • New methodologies

  6. Mature Systems • The longest surviving generalized systems • No new functionality being added – only maintenance • SAS macros • Interface built with SAS/AF • Can be run in batch mode (macro call within SAS program) or via interface • PC or UNIX

  7. Mature Systems • Generalized Sampling (GSAM) • Performs functions related to sample selection for ongoing and ad hoc surveys • Stratification, Allocation, Sampling, Frame Maintenance • Generalized Estimation System (GES) • Performs functions related to weighting and estimation • One-stage element and cluster, two-phase element designs • Mostly design based, some synthetic, jackknife
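To make the sample selection and estimation functions named on slide 7 concrete, here is a minimal sketch in Python. It is not GSAM or GES code. It assumes proportional allocation, simple random sampling within strata, and an expansion estimator of a total. The slides do not say which methods the systems use, and all function names are hypothetical.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Split a total sample size n across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    raw = {h: n * size / total for h, size in stratum_sizes.items()}
    alloc = {h: int(r) for h, r in raw.items()}
    # Largest-remainder rounding so the allocations sum exactly to n.
    leftover = n - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda h: raw[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc

def select_sample(frame, alloc, seed=0):
    """Simple random sample without replacement within each stratum.

    frame maps stratum -> list of unit values; alloc maps stratum -> sample size.
    """
    rng = random.Random(seed)
    return {h: rng.sample(units, alloc[h]) for h, units in frame.items()}

def estimate_total(stratum_sizes, sample):
    """Expansion estimator: each sampled unit carries the design weight N_h / n_h."""
    return sum(
        stratum_sizes[h] / len(values) * sum(values)
        for h, values in sample.items()
        if values  # strata with no sampled units contribute nothing
    )

# Illustrative frame (made-up values): three size strata of businesses.
frame = {"small": [1.0] * 60, "medium": [10.0] * 30, "large": [100.0] * 10}
sizes = {h: len(units) for h, units in frame.items()}
alloc = proportional_allocation(sizes, 10)
sample = select_sample(frame, alloc)
total_hat = estimate_total(sizes, sample)
```

Proportional allocation can give a very small stratum a sample size of zero, in which case this estimator silently leaves that stratum out. A production system would need a rule for that case, such as a minimum sample size per stratum.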

  8. Example of GES Interface Screen

  9. Redesigned Systems • Generalized systems previously existed that performed similar functions but needed replacement • Why? • Often due to outdated architecture – mainframe, obsolete software • New capabilities in SAS • New methodologies couldn’t be integrated into previous system

  10. Redesigned Systems • Banff (replaces Oracle based GEIS) • Performs edit and imputation of numeric continuous data • Nine custom built SAS procedures • SAS Enterprise Guide based “interface” (Banff wizards)

  11. Example of Banff SAS Procedure

  12. Example of Banff Wizard

  13. Redesigned Systems • New CONFID • Performs protection of tabular economic data • SAS-based custom built procedures (like Banff) and macros for PC and UNIX • Jasper (replacement for ACTR) • Performs automated coding of character strings • Retains interface-based processing, but may later build SAS-based custom built procedures

  14. New Development Systems • Fill needs for functionality not already available in other generalized systems • Replace customized programs that may already exist

  15. New Development Systems • Statistical Macro Extensions (StatMx) • New functionality not available in GES / GSAM • Multi-stage design estimation, Lavallée-Hidiroglou allocation, extended synthetic estimation • SAS macros, no interface • Forillon • Time Series processing • Benchmarking sub-annual series, Raking to retain additivity, trend computations, variance calculations, analytical tools • SAS-based procedures and Enterprise Guide “interface”
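The "raking to retain additivity" item on slide 15 can be illustrated with a small sketch. The slides do not describe Forillon's algorithm, so this example uses classic two-way raking (iterative proportional fitting) as a stand-in: the cells of a table are rescaled in turn until their rows and columns add up to known totals. The function name, tolerance, and data are assumptions made for illustration.

```python
def rake(table, row_totals, col_totals, tol=1e-9, max_iter=1000):
    """Two-way raking (iterative proportional fitting).

    Rescales the cells of `table` so that rows sum to `row_totals` and
    columns sum to `col_totals`. Assumes positive cells and that both
    sets of totals have the same grand total.
    """
    cells = [row[:] for row in table]  # work on a copy
    for _ in range(max_iter):
        # Step 1: scale each row to its target total.
        for i, row in enumerate(cells):
            s = sum(row)
            if s > 0:
                cells[i] = [x * row_totals[i] / s for x in row]
        # Step 2: scale each column to its target total.
        for j, target in enumerate(col_totals):
            s = sum(row[j] for row in cells)
            if s > 0:
                for row in cells:
                    row[j] *= target / s
        # Columns are exact after step 2, so only the rows need checking.
        if all(abs(sum(row) - t) <= tol for row, t in zip(cells, row_totals)):
            return cells
    raise RuntimeError("raking did not converge")

# Made-up preliminary estimates that do not add up to the published margins.
preliminary = [[10.0, 20.0], [30.0, 40.0]]
raked = rake(preliminary, row_totals=[35.0, 65.0], col_totals=[45.0, 55.0])
```

After raking, every row and column of `raked` matches its margin, while the relative structure of the preliminary cells is kept as far as the margins allow.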

  16. Development, Support and Maintenance • Most systems developed and maintained by teams of individuals from two groups • Mathematical statisticians (Methodology Branch) • Programmers (Informatics Branch) • Certain projects are the sole responsibility of one group • Moving away from such situations

  17. Development • Methodologists review mathematical needs • Consultation with potential users, literature searches, research into mathematical methods • Programmers review informatics needs • Methodologists write specifications • Programmers produce new version • Methodologists do final certification • Documentation is written

  18. Support • Team members are not directly responsible for implementing the systems; they assist users • Mathematical questions go to methodologists, informatics questions to programmers • Amount of support depends upon number of users, complexity of the methods, “newness” of the system

  19. Maintenance • May consist of bug fixes or adding new functionality • May be identified by the users or by team members • Team members work together to identify if it merits attention and then implement and certify the change

  20. Costs • Generalized systems require a very significant outlay of resources • Varies significantly from project to project • Development of a large project • 2-3 methodologists, 2-3 programmers over several years • Support and maintenance • 1 methodologist, 1 programmer per year

  21. Lessons Learned • Reduce Software Diversity • Emphasis put on SAS, reduce reliance on different programming languages • Easier to move people from one project to another • Users only need to know one language • Learning SAS is part of staff’s early training

  22. Lessons Learned • Traditional interfaces are expensive – there are alternatives • Interface development can cost as much as the mathematical functionality • Changes can be difficult • Often does not upgrade as well as rest of the system • Most users prefer batch processing for production • Can be necessary when tool is used by non-technical personnel • SAS Enterprise Guide being successfully used

  23. Lessons Learned • People like things they are familiar with • Customized SAS procedures (Banff, Forillon) have been favorably received • Centralization of resources is beneficial • People can take ideas used in one project and apply them to others • Examples: Enterprise Guide interfaces, Customized SAS procedures

  24. Lessons Learned • Modularity and flexibility are important • Some early systems too rigid – successful ones had more flexibility • Users only want pieces of certain systems • Reduce custom-built systems, put in generalized systems • People often “borrow” other programs and don’t understand all the implications • Support is a problem when person leaves project • However, timing sometimes makes it necessary

  25. Lessons Learned • Buy when possible, but don’t get cornered • No need to build certain components, e.g. a linear programming function • Ensure that changing to an alternate component is not difficult • Make sure that the support is there • Stay up to date on technology • Don’t wait too long to react to advances • E.g. Mainframe → PC in the 1990s, Linux
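One way to read "ensure that changing to an alternate component is not difficult" is to have calling code depend on a small interface rather than on a specific product. The sketch below is an assumption about how that could look, not a description of any Statistics Canada system. To keep the example short, it uses a one-dimensional minimizer as a stand-in for a purchased optimization component such as a linear programming function. All class and function names are hypothetical.

```python
from typing import Callable, Protocol

class Minimizer(Protocol):
    """The only contract the calling code relies on."""
    def minimize(self, f: Callable[[float], float], lo: float, hi: float) -> float: ...

class GridMinimizer:
    """Stand-in for an in-house component: evaluate f on an evenly spaced grid."""
    def __init__(self, points: int = 10001):
        self.points = points

    def minimize(self, f, lo, hi):
        step = (hi - lo) / (self.points - 1)
        return min((lo + i * step for i in range(self.points)), key=f)

class GoldenSectionMinimizer:
    """Stand-in for a purchased component; assumes f is unimodal on [lo, hi]."""
    def __init__(self, tol: float = 1e-8):
        self.tol = tol

    def minimize(self, f, lo, hi):
        inv_phi = (5 ** 0.5 - 1) / 2
        a, b = lo, hi
        while b - a > self.tol:
            c = b - inv_phi * (b - a)
            d = a + inv_phi * (b - a)
            if f(c) < f(d):
                b = d  # minimum lies in [a, d]
            else:
                a = c  # minimum lies in [c, b]
        return (a + b) / 2

def best_scale_factor(values, target, solver: Minimizer) -> float:
    """Find the factor k in [0, 10] that brings sum(k * values) closest to target."""
    total = sum(values)
    return solver.minimize(lambda k: (k * total - target) ** 2, 0.0, 10.0)

# Swapping the component is a one-argument change for the caller.
k_grid = best_scale_factor([1.0, 2.0, 3.0], 12.0, GridMinimizer())
k_golden = best_scale_factor([1.0, 2.0, 3.0], 12.0, GoldenSectionMinimizer())
```

Because `best_scale_factor` only knows about `Minimizer`, replacing one backend with another does not touch the statistical code, which is the property the slide asks for.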

  26. Possible Future Activities • Current Systems • Banff – categorical data capabilities • New CONFID – add functionality • Jasper – review of methodology used • Forillon – add functionality • StatMx – advanced variance calculations?

  27. Possible Future Activities • General avenues • Continue movement towards SAS based procedures and Enterprise Guide interfaces • Buy components when possible – free up programming resources for specialized tasks • Metadata table-based processor

  28. Conclusions • Generalized Systems have become a critical part of business survey processing • Due to the investments made in development we have to keep them relevant • Moving towards a more standardized look and feel • Use what we have learned in the past to help shape the future

  29. For more information, please contact Chris Mohl: Chris.Mohl@statcan.ca
