worst, but still importable data I’ve ever seen - PowerPoint PPT Presentation

Presentation Transcript

  1. Worst, but still importable data I’ve ever seen. Arthur Tabachneck, Insurance Bureau of Canada

  2. Suppose you had the following Excel file (spreadsheet image not reproduced). The callouts on the slide label the cell formats as: text; as shown; m/d/yyyy; d/m/yyyy; text; d-mon; text.

  3. How the file got so bad: members of a secretarial pool were asked to enter the data, in Excel, while they were covering the front desk. They (four different secretaries) obviously weren’t given sufficient instructions. Their task was simply to enter some data, which happened to include a date.

  4. proc import can only be used if: (1) you license SAS/ACCESS Interface to PC File Formats, and (2) at least half of the relevant rows (based on your system’s and SAS’s guessing rows settings) are formatted as dates, or (3) you manually edit the spreadsheet and/or change your guessing rows settings so that condition 2 holds.

  5. If proc import can be used, three steps are necessary. Step 1: use mixed=no,

  6. which will import date-formatted cells and assign missing values to the other cells.

  7. Step 2: use mixed=yes, which will import all cells as text.
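Steps 1 and 2 might look like the sketch below. The output names inputa and inputb come from the step 3 code, and the path is the one used in the DDE section; dbms=excel, getnames=yes, and replace are assumptions not shown on the slides.

```sas
/* Step 1 (sketch): date-formatted cells are imported as dates, other cells become missing */
proc import datafile="c:\worst data.xls" out=inputa dbms=excel replace;
  getnames=yes;   /* assumption: first row holds the column names */
  mixed=no;
run;

/* Step 2 (sketch): same file, but mixed columns are imported as text */
proc import datafile="c:\worst data.xls" out=inputb dbms=excel replace;
  getnames=yes;   /* assumption, as above */
  mixed=yes;
run;
```

Both imports read the same rows in the same order, which is what lets step 3 pair the two data sets up observation by observation.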

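  8. Step 3: merge the two files and use inputn to read the missing dates.

      /* options is a global statement, so it is set once, ahead of the step */
      options datestyle=dmy;

      data want (drop=bdate);
        set inputa;                          /* mixed=no import: dates or missing */
        set inputb (rename=(date=bdate));    /* mixed=yes import: same rows as text */
        if missing(date) then
          date=inputn(bdate, 'anydtdte', 20);
        /* still missing: retry with the first two '-' delimited fields swapped */
        if missing(date) then
          date=inputn(catt(scan(bdate,2,'-'), scan(bdate,1,'-'),
                           scan(bdate,3,'-')), 'anydtdte', 20);
      run;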

  9. resulting in the following file
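To make the step 3 fallback concrete, the sketch below shows only the string that the second inputn call receives. The sample value is hypothetical; the slides do not list the actual cell contents.

```sas
data _null_;
  length bdate swapped $20;
  bdate = '13-02-2010';   /* hypothetical text cell, not taken from the slides */
  /* swap the first two dash-delimited fields; catt joins them without separators */
  swapped = catt(scan(bdate,2,'-'), scan(bdate,1,'-'), scan(bdate,3,'-'));
  put bdate= swapped=;    /* writes bdate=13-02-2010 swapped=02132010 to the log */
run;
```

Note that catt drops the dashes, so the retry hands anydtdte an undelimited string rather than a re-punctuated date.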

  10. However, if proc import can’t be used, or if you simply want a better solution,

  11. you can do it with DDE. Step 1: set the desired options and filename.

      options noxsync noxwait xmin;
      filename sas2xl dde 'excel|system';

  12. Step 2: Open Excel.

      data _null_;
        length fid rc start stop time 8;
        fid=fopen('sas2xl','s');
        if (fid le 0) then do;             /* Excel is not running yet */
          rc=system('start excel');
          start=datetime();
          stop=start+10;                   /* keep trying for up to 10 seconds */
          do while (fid le 0);
            fid=fopen('sas2xl','s');
            time=datetime();
            if (time ge stop) then fid=1;  /* timed out: force the loop to end */
          end;
        end;
        rc=fclose(fid);
      run;

  13. Step 3: Open workbook and insert old-style macro sheet.

      data _null_;
        file sas2xl;
        put '[open("c:\worst data.xls")]';
      run;

      data _null_;
        file sas2xl;
        put '[workbook.next()]';
        put '[workbook.insert(3)]';
      run;

      filename xlmacro dde 'excel|macro1!r1c1:r99c1' notab lrecl=200;

  14. Step 4: Create and run Excel macro.

      data _null_;
        /* write the macro, one formula per row, into the macro sheet */
        file xlmacro;
        put '=set.name("Tag",!$b$1)';
        put '=formula("<>",Tag)';
        put '=set.name("OldValue",!$c$1)';
        put '=set.name("NewValue",!$b$2)';
        put '=for.cell("CurrentCell",sheet1!$a$2:$a$99,true)';
        put '=formula(get.cell(5,CurrentCell),OldValue)';
        put '=formula("=concatenate(Tag,OldValue)",NewValue)';
        put '=formula(NewValue, CurrentCell)';
        put '=next()';
        put '=halt(true)';
        put '!dde_flush';
        /* run the macro, then save sheet1 as CSV (file type 6) and quit Excel */
        file sas2xl;
        put '[run("macro1!r1c1")]';
        put '[workbook.activate("sheet1")]';
        put '[error(false)]';
        put '[save.as("c:\DateTest",6)]';
        put '[quit()]';
      run;

  15. Step 5: Import the data.

      /* options is a global statement, so it is set once, ahead of the step */
      options datestyle=dmy;

      data want (keep=date);
        infile "c:\DateTest.csv" dsd dlm="," lrecl=32768 firstobs=2;
        informat rawdate $20.;
        input rawdate;
        format date date9.;
        rawdate=substr(rawdate,3);        /* drop the two-character "<>" tag */
        if anyalpha(rawdate) then do;     /* contains letters: parse as a date string */
          date=inputn(rawdate, 'anydtdte', 20);
          if missing(date) then
            date=inputn(catt(scan(rawdate,2,'-'), scan(rawdate,1,'-'),
                             scan(rawdate,3,'-')), 'anydtdte', 20);
        end;
        else date=input(rawdate, 20.)-21916;  /* Excel serial number to SAS date */
      run;

  16. and obtain the desired result, regardless of your system’s guessing rows setting or how your data is arranged.
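The constant 21916 in step 5 is the number of days between Excel's day zero (30 December 1899 in the 1900 date system) and SAS's day zero (1 January 1960). The sketch below checks that arithmetic and applies it to one tagged value; the serial number 40222 is a hypothetical example, not a value from the slides.

```sas
data _null_;
  /* days from Excel's day zero to SAS's day zero */
  offset = '01JAN1960'd - '30DEC1899'd;
  put offset=;                        /* offset=21916 */

  length rawdate $20;
  rawdate = '<>40222';                /* hypothetical tagged value from the CSV */
  rawdate = substr(rawdate, 3);       /* strip the "<>" tag, as in step 5 */
  sas_date = input(rawdate, 20.) - offset;
  put sas_date= date9.;               /* sas_date=13FEB2010 */
run;
```

The offset holds for serial numbers from March 1900 onward; earlier serials are off by one because Excel's 1900 date system counts a nonexistent 29 February 1900.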

  17. Author Contact Information. Your comments and questions are valued and encouraged. Contact the author: Dr. Arthur Tabachneck, Director, Data Management, Insurance Bureau of Canada, Toronto, Ontario L3T 5K9, Canada. atabachneck at ibc dot ca or art297 at netscape dot net

  18. Key References

      Microsoft Corporation. Function Reference, Microsoft EXCEL Spreadsheet with Business Graphics and Database: Version 4.0 for Apple® Macintosh® Series or Windows™ Series. Document AB26298-0592, 1992.

      Vyverman, K. Excel Exposed: Using Dynamic Data Exchange to Extract Metadata from MS Excel Workbooks. SESUG 17, 2003, paper TU15, St. Pete Beach, FL.

      Vyverman, K. Re: How to flag special formatting from Excel in a SAS dataset. SAS-L Post, 2002. http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0209a&L=sas-l&D=1&O=A&P=12088

      Vyverman, K. Re: MS Excel column widths. SAS-L Post, 2002. http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0201b&L=sas-l&P=25268

      Vyverman, K. Using Dynamic Data Exchange to Export Your SAS Data to MS Excel – Against All ODS, Part I. SUGI 26, 2001, paper 190-27, Long Beach, CA.