1 / 28

SAS Macros are the Cure for Quality Control Pains

SAS Macros are the Cure for Quality Control Pains. Gary McQuown Data and Analytic Solutions. Rants and Raves of a SAS Programmer. Purpose. I. Quality Control II. SAS Macros for Quality Control III. Sources of SAS Macros and QC Code. I. Quality Control.

efia
Download Presentation

SAS Macros are the Cure for Quality Control Pains

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SAS Macros are the Cure for Quality Control Pains Gary McQuown Data and Analytic Solutions

  2. Rants and Raves of a SAS Programmer

  3. Purpose I. Quality Control II. SAS Macros for Quality Control III. Sources of SAS Macros and QC Code

  4. I. Quality Control An ongoing effort for validation, improvement and facilitation of the data related process to insure that data meets the business needs.

  5. Quality Control “Quality control means you can have what you need, how you need it, when you need it.” E. Demming

  6. Why Practice QC? • It Saves Time • It Saves Money • It Makes Money • Ignorance is not Bliss

  7. How Good Data Goes Bad • “Love Child” ... Design by “Accident” • Adopted ... Someone Else’s Design • Child Abuse ... Poorly Nurtured • Terrible Teens ... Growing Pains

  8. The QC Process • Define Requirements • Identify Data Issues • Analyze Options • Improve Data Quality • Document every step and repeat

  9. Define Requirements • What do you need? • Requires an understanding of the business process, the data, the operating system and the users. • Documentation, business specs and “experts”.

  10. Devil’s Advocate • What is correct for one task / group may be incorrect for another. • What is correct now may be incorrect later. • What is correct now ... may not be able to be repeated.

  11. Identify Data Issues • Define Quality Criteria What do you need? • Define Data Metrics What do you have? • Identify Questionable Values What you have vs. What you need

  12. Identify Questionable Values • Accuracy • Completeness • Consistency • Timeliness • Uniqueness • Validity

  13. G = Good F = Fair B = Bad

  14. Analyze Options • What do you need? • What do you have? • What changes need to be made? • Will you break anything along the way?

  15. Improve Data Quality • Selective Processing • Clean Existing Values. • Correcting Existing Values. • Delete “bad” observations. • Add additional data. • Document original and new values.

  16. Documentation • Design Process ... business specs • “As You Go” ... in the code, log, email • Input and Output files • Modifications .... “as per xxx “, email • Exceptions (Errors and Issues) • User’s Manual • Elizabeth Axelrod ... Big ‘D’ “Just Shoot Them”

  17. General Suggestions • Avoid placing blame: “Drive Out Fear”. • Repair as “Early” as possible. • Obtain “Buy In” from all parties. • Keep it “Simple” ... use macros. • Monitor results. • Document everything, every time.

  18. II. SAS Macros Macros allow you to use, re-use and share “object-oriented” code. QC is very redundant .... the same or similar process performed on each data set, each variable and each process.

  19. FREQOUT.SAS Produces Frequencies for multiple variables % FREQOUT (data= /* input dataset name */, out= freqout /* output data set name , vars= /* list of variables */, by = /* list of by variables */, fmtassign = /* var fmt var fmt */, debugging = NO /* YES or NO */ Author: Ian Whitlock Location: SESUG 2003 PROCEDINGS

  20. EAP_RPT.SAS Produce frequencies identifying questionable data. Values will be compared to the specified format and judged to be G "Good", M "Missing" or N "Non Conforming”. An output file will be produced with cross tabs of the variable name by the result of the comparison. The output file name will be the same as the input file, preceded by the letter Q and ending in an LST extension. The output file can be redirected to a different directory.

  21. %EAP_RPT ( _VARS= , _FMTS= , LIBIN= , LIBOUT=, DSN=); _VARS= list of character variables to review .. paired with _FMTS _FMTS= list of formats to apply ... paired with _VARS DSN = Name of input SAS data set LIBIN= SAS library of input data set LIBOUT= SAS library of output data set Example: %EAP_RPT(_VARS = AGE INCOME EDUCATION , _FMTS = AGE INC EDU , LIBIN = PROJ_IN , LIBOUT = PROJ_OUT , DSN = STUDY_1); Author: Gary McQuown Location: SESUG 2003 PROCEDINGS

  22. Data Cleaning TIP00128a - Cleansing Macro, Data Scrubbing routine (see tip 00128 for more) %cleanse(schlib=work, schema=, strlen=50, var=, target=target, replace=replace, case=nocase); Author: Charles Patridge Version: 2.1 (sug. by Ian Whitlock) Location: www.sconsig.com

  23. REMOVE OUTLIERS %outlier ( data = _SAS_dataset_name_, out = _SAS_output_dataset_name var = _variable_to_screen pass = _number_of_passes except = _exception_report_data_set_, mult = _multiplier_of_standard_deviations_) The %OUTLIER macro completes outlier screens based on statistical values of a numeric variable in a SAS data set. It is set up to remove any outlier records that are within a given number of Standard Deviations from the mean, and will run that screen a given number of times. For example, a "3-Pass-2" outlier screen will remove any values outside 3 standard deviations from the mean, and will run that outlier screen twice. The given numbers can be any integer. Author: Unknown Location: www.spikeware.com

  24. KEEPDBLS: Documents Duplicates TIP000367- KeepDbls %MACRO KeepDbls (SourceDs =_LAST_, TargetDs =, Overwrit =N, IdList =, Where =); Moves duplicate observations to another file. Author: Jim Groeneveld Location: www.sconsig.com

  25. LOG FILTER: Examines and Reports on SAS Log Log Filter checks your log for errors, warnings, and other "interesting" messages. It then displays what it finds in its summary window. Double-click on a row and it'll reposition the log window to display the message in context (if it's an external log file, it'll open it in a viewer window and position it for you). Author: Ratcliffe Location: http://ratcliffe.co.uk/rest_logfilt.htm

  26. III. Sources of SAS Macros and QC Code www.sas.com (SUGI Proceedings and examples) www.sconsig.com www.nesug.com www.sesug.com www.ratcliffe.co.uk www.statetechservices.com www.spikeware.com

  27. More Sources www.mcw.edu/pcor/rsparapa/sasmacro.html www.math.yorku.ca/scs/friendly.html www.stat.ncsu.edu/sas/samples/index.html www.dasconsultants.com SAS-L Books By Users:Ron Cody’s Data Cleaning Numerous books on Macros .... “By Example”

  28. Questions ? Gary McQuown mcquown@DASconsultants.com www.DASconsultants.com

More Related