1 / 37

Kim Michalski Office of the Actuary Rick Andrews Office of Research, Development, and Information

Basic SAS Functions in Version 8.2. Kim Michalski Office of the Actuary Rick Andrews Office of Research, Development, and Information. There are over 300 SAS Function in Version 8.2 This presentation will describe approximately 2 dozen in the following areas: Truncation Character

ellenday
Download Presentation

Kim Michalski Office of the Actuary Rick Andrews Office of Research, Development, and Information

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Basic SAS Functionsin Version 8.2 Kim MichalskiOffice of the Actuary Rick AndrewsOffice of Research, Development, and Information

  2. There are over 300 SAS Function in Version 8.2 • This presentation will describe approximately 2 dozen in the following areas: • Truncation • Character • Conversion • Date and Time • For a complete list visit: • http://v8doc.sas.com/sashtml/ - Base SAS Software - SAS Language and Reference Dictionary - Functions and CALL Routines Where to begin?

  3. 1.A - Round up to the next largest integer - CEIL • 1.B - Round down to the next smallest integer - FLOOR • 1.C - Remove the fractional part of a number - INT • 1.D - Round a numeric value to specified unit - ROUND Truncation

  4. 2.A - Compute sum of values in a list - SUM • 2.B - Compute average values of a list - MEAN • 2.C - Determine smallest value in a list - MIN • 2.D - Determine largest value in a list - MAX Descriptive Statistics

  5. 3.A - Split text containing a delimiter (1) - SCAN • 3.B - Split text containing a delimiter (2) - SCAN • 3.C - Justify character values - LEFT • 3.D - Return part of character expression - SUBSTR • 3.E - Return location of character string - INDEX • Should we add TRIM? Character

  6. 3.A - Split text containing a delimiter (1) - SCAN • 3.B - Split text containing a delimiter (2) - SCAN • 3.C - Justify character values - LEFT • 3.D - Return part of character expression - SUBSTR • 3.E - Return location of character string - INDEX • Should we add TRIM? Character

  7. 5.A - Various date functions - Multiple • 5.B - Intervals of a time span - INTCK • 5.C - Advance a date or time interval - INTNX • 5.D- Compute age as of last birthday - YRDIF Date and Time

  8. DATA temp1a; rate1 = CEIL(3.2); rate2 = CEIL(3.8); rate3 = CEIL(-3.2); rate4 = CEIL(-3.8); payment1 = CEIL(999.22); payment2 = CEIL(999.88); RUN; ******************************************************** rate1 rate2 rate3 rate4 payment1 payment2 ------- ------- ------- ------- ---------- ---------- 4 4 -3 -3 1000 1000 ******************************************************** 1.A - Round up to the next largest integer( CEIL )

  9. DATA temp1b; rate1 = FLOOR(3.2); rate2 = FLOOR(3.8); rate3 = FLOOR(-3.2); rate4 = FLOOR(-3.8); payment1 = FLOOR(999.22); payment2 = FLOOR(999.88); RUN; ******************************************************** rate1 rate2 rate3 rate4 payment1 payment2 ------- ------- ------- ------- ---------- ---------- 3 3 -4 -4 999 999 ******************************************************** 1.B - Round down to the next smallest int.( FLOOR )

  10. DATA temp1c; rate1 = INT(3.2); * Same as FLOOR for positive numbers; rate2 = INT(3.8); rate3 = INT(-3.2); * Same as CEIL for negative numbers; rate4 = INT(-3.8); payment1 = INT(999.22); payment2 = INT(999.88); RUN; ********************************************************** rate1 rate2 rate3 rate4 payment1 payment2 ------- ------- ------- ------- ---------- ---------- 3 3 -3 -3 999 999 ********************************************************** 1.C - Remove the fractional part of a number( INT )

  11. DATA temp1d; rate1 = ROUND(3.2); rate2 = ROUND(3.8); rate3 = ROUND(-3.2); marg1 = ROUND(12.49); marg2 = ROUND(12.49,.1); dec2 = ROUND(450.98999,.01); five = ROUND(73.3,5); ten = ROUND(23.99,10); RUN; *********************************************************** rate1 rate2 rate3 marg1 marg2 dec2five ten ----- ----- ----- ----- ----- ------- ----- ---- 3 4 -3 12 12.5 450.99 75 20 *********************************************************** 1.D - Round a numeric value to specified unit( ROUND )

  12. 2.A - Compute sum of values in a list - SUM • 2.B - Compute average values of a list - MEAN • 2.C - Determine smallest value in a list - MIN • 2.D - Determine largest value in a list - MAX Descriptive Statistics

  13. DATA temp2a1; x=100; y=200; z=.; tot1=SUM(x,y,z); tot2=x+y+z; *<-- BEWARE: tot2 will be missing *; * since z is missing *; RUN; ********************************************************* x y z tot1 tot2 ----- ----- ----- ------ ------ 100 200 . 300 . ********************************************************* 2.A - Compute sum of values in a list ( SUM )

  14. DATA temp2a2; a1=300; a2=.; a3=400; a4=500; tot3=SUM(a1,a2,a3,a4); tot4=SUM(OF a1-a4); * If the OF argument is not used *; tot5=SUM(a1-a4); * then a4 will be subtracted a1 *; RUN; ********************************************************* a1 a2 a3 a4tot3 tot4 tot5 ----- ----- ----- ------ ------ ------ ------ 300 . 400 500 1200 1200 -200 ********************************************************* 2.A - Compute sum of values in a list ( SUM )

  15. DATA temp2a3; x=100; y=200; z=.; a1=300; a2=.; a3=400; a4=500; tot6=SUM(OF a1-a4,x,y,z); tot7=SUM(100,200,.,300,.,400,500); tot8=SUM(z,a2); * Add a zero to the list if a *; tot9=SUM(0,z,a2); * missingvalue is not desired *; RUN; * when all values are missing *; ********************************************************* tot6 tot7 tot8 tot9 ------ ------ ------ ------ 1500 1500 . 0 ********************************************************* 2.A - Compute sum of values in a list ( SUM )

  16. 4.A - Convert Character date to SAS date - INPUT • 4.B - Use of the YEARCUTOFF Option - INPUT • 4.C - Convert character to numeric - INPUT • 4.D - Convert numeric to character - PUT Conversion

  17. DATA temp3a; dir = 'G:\IMG\Data'; level1 = SCAN(dir,1,'\'); level2 = SCAN(dir,2,'\'); level3 = SCAN(dir,3,'\'); RUN; ***************************************** dir level1 level2 level3 ----------- ------ ------ ------ G:\IMG\Data G: IMG Data ***************************************** 3.A - Split text containing a delimiter( SCAN )

  18. DATA temp3b; state_cnty = 'Montgomery, MD'; county = SCAN(state_cnty,1,','); state = SCAN(state_cnty,2,','); RUN; ***************************************** state_cnty county state -------------- ---------- ----- Montgomery, MD Montgomery MD Note the space! ***************************************** 3.B - Split text containing a delimiter( SCAN )

  19. DATA temp3c; SET temp3b; state = LEFT(state); OUTPUT; state = RIGHT(state); OUTPUT; RUN; ***************************************** state_cnty county state -------------- ---------- ----- Montgomery, MD Montgomery MD Montgomery, MD Montgomery MD ***************************************** 3.C - Justify character values( LEFT / RIGHT)

  20. DATA temp3d; *1234567890; state_cnty = 'Montgomery, MD'; county = SUBSTR(state_cnty,1,10); state = SUBSTR(state_cnty,13,2); RUN; ***************************************** state_cnty county state -------------- ---------- ----- Montgomery, MD Montgomery MD ***************************************** 3.D - Return part of character expression( SUBSTR )

  21. DATA temp3e; state_cnty = 'Montgomery, MD'; idx = INDEX(state_cnty,','); county = SUBSTR(state_cnty,1, idx-1 ); state = SUBSTR(state_cnty, idx+2 ,2); RUN; ***************************************** state_cnty idx county state -------------- --- ---------- ----- Montgomery, MD 11 Montgomery MD ***************************************** Determine location of the comma. 3.E - Return location of character string( INDEX ) Use that location within the substring.

  22. OPTIONS YEARCUTOFF=1900; * Can cause an ERROR *; * for 2-digit years *; DATA temp4a; *1234567890; * Month / Day / Year *; char_date = '02/29/2004'; sas_date = INPUT(char_date,MMDDYY10.); OUTPUT; char_date = '02-29-2004'; sas_date = INPUT(char_date,MMDDYY10.); OUTPUT; char_date = '02-29-04'; sas_date = INPUT(char_date,MMDDYY8.); OUTPUT; char_date = '022903'; sas_date = INPUT(char_date,?? MMDDYY6.); OUTPUT; 4.A - Convert CHAR date into SAS date( INPUT ) Note that question marks can be used to suppress error messages in the log.

  23. ********************************************************* char_date sas_date Format --------- ---------- --------- 02/29/2004 02/29/2004 MMDDYY10. 02-29-2004 02/29/2004 MMDDYY10. 02-29-04 02/29/1904 MMDDYY8. 022903 . ?? MMDDYY6. Note the invalid date of February 29, 2003 ! ********************************************************* 4.A - Convert CHAR date into SAS date( Continued )

  24. OPTIONS YEARCUTOFF=1920; *<-- Set date greater than largest expected 2-digit year !!!; DATA temp4b; *1234567890; * Day / Month / Year *; char_date = '02/29/04'; sas_date = INPUT(char_date,DDMMYY8.); OUTPUT; char_date = '02/29/2004'; sas_date = INPUT(char_date,DDMMYY10.); OUTPUT; * Year / Month / Day *; char_date = '20040229'; sas_date = INPUT(char_date,YYMMDD8.); OUTPUT; char_date = '2004/02/29'; sas_date = INPUT(char_date,YYMMDD10.); OUTPUT; 4.B - Use of the YEARCUTOFF Option( INPUT )

  25. DATA temp4b; *1234567890; * Day / Char Month / Year *; char_date = '29feb04'; sas_date = INPUT(char_date,DATE7.); OUTPUT; char_date = '29feb2004'; sas_date = INPUT(char_date,DATE9.); OUTPUT; ********************************************************* char_date sas_date Format --------- ---------- --------- 02/03/04 03/01/2003 DDMMYY8. 02/03/2004 03/01/2003 DDMMYY10. 20040229 01/31/2003 YYMMDD8. 2004/02/29 01/31/2003 YYMMDD10. 29feb04 01/01/2003 DATE7. 29feb2004 01/01/2003 DATE9. ********************************************************* 4.B - Convert CHAR date into SAS date ( Continued )

  26. DATA temp4c; LENGTH char $25; FORMAT num 15.2; char = '123'; num = char + 0; OUTPUT; * Note ADDITION *; char = '123.45'; num = INPUT(char,6.2); OUTPUT; char = '12345'; num = INPUT(char,5.); OUTPUT; char = '123456789012345'; num = INPUT(char,15.); OUTPUT; char = '123,456,789,012'; num = INPUT(char,COMMA15.); OUTPUT; RUN; 4.C - Convert Character to Numeric( INPUT )

  27. ********************************************************** char num Statement --------------- --------------- -------------------- 123 123.00 char + 0 123.45 123.45 INPUT(char,6.2) 12345 12345.00 INPUT(char,5.) 123456789012345 123456789012345 INPUT(char,15.) 123,456,789,012 123456789012.00 INPUT(char,COMMA15.) ********************************************************** 4.C - Convert Character to Numeric( Continued )

  28. DATA temp4d; LENGTH char $25; FORMAT num 15.2; num = 123; char = num || ''; OUTPUT; * Note CONCATENATION *; num = 123.45; char = PUT(num,6.2); OUTPUT; num = 12345; char = PUT(num,5.); OUTPUT; num = 123456789012345; char = PUT(num,15.); OUTPUT; num = 123456789012; char = PUT(num,COMMA15.); OUTPUT; RUN; 4.D - Convert Numeric to Character( PUT )

  29. Note the leading spaces when using concatenation operator. ********************************************************** num char Statement --------------- --------------- ----------------- 123.00 123 num || '' 123.45 123.45 PUT(num,6.2) 12345.00 12345 PUT(num,5.) 123456789012345 123456789012345 PUT(num,15.) 123456789012.00 123,456,789,012 PUT(num,COMMA15.) 1234 01234 PUT(num,Z5.) ********************************************************** 4.D - Convert Numeric to Character( Continued ) Note the leading zero. The “Z5.“ informat can be used to convert numeric procedure codes imported from Excel.

  30. DATA temp5a; var1 = TODAY(); OUTPUT; var1 = DATETIME(); OUTPUT; var1 = DATEPART(var1); OUTPUT; var1 = MONTH( TODAY() ); OUTPUT; var1 = YEAR( TODAY() ); OUTPUT; var1 = DAY( TODAY() ); OUTPUT; var1 = QTR( TODAY() ); OUTPUT; RUN; 5.A - Date Functions

  31. ****************************************** var1 Statement ---------- ----------------- 17085 TODAY() 1476200685 DATETIME() 17085 DATEPART( var1 ) 1 MONTH( TODAY() ) 2007 YEAR( TODAY() ) 10 DAY( TODAY() ) 1 QTR( TODAY() ) ****************************************** 5.A - Date Functions

  32. DATA temp5b; FORMAT beg_date end_date MMDDYY10.; beg_date = '01JAN2001'D; end_date = '31DEC2002'D; example = INTCK('DAY', beg_date,end_date); OUTPUT; example = INTCK('MONTH',beg_date,end_date); OUTPUT; example = INTCK('MONTH',beg_date,end_date + 1); OUTPUT; example = INTCK('YEAR', beg_date,end_date); OUTPUT; example = INTCK('YEAR', beg_date,end_date + 1); OUTPUT; example = INTCK('QTR', beg_date,end_date); OUTPUT; RUN; 5.B - Intervals of a Time Span( INTCK )

  33. ********************************************************** beg_date (b) end_date (e) example Statement ----------- ------------ ------- ---------------------- 01/01/2001 12/31/2002 729 INTCK('DAY', b,e) 01/01/2001 12/31/2002 23 INTCK('MONTH',b,e) 01/01/2001 12/31/2002 24 INTCK('MONTH',b,e + 1) 01/01/2001 12/31/2002 1 INTCK('YEAR', b,e) 01/01/2001 12/31/2002 2 INTCK('YEAR', b,e + 1) 01/01/2001 12/31/2002 7 INTCK('QTR', b,e) ********************************************************** 5.B - Intervals of a Time Span( Continued )

  34. DATA temp5c; FORMAT orig_date example MMDDYY10.; orig_date = '31DEC2002'D; new_date = INTNX('DAY', orig_date,0); OUTPUT; new_date = INTNX('DAY', orig_date,1); OUTPUT; new_date = INTNX('MONTH',orig_date,1); OUTPUT; new_date = INTNX('MONTH',orig_date,3); OUTPUT; new_date = INTNX('YEAR', orig_date,1); OUTPUT; new_date = INTNX('YEAR', orig_date,2,'END'); OUTPUT; RUN; 5.C - Advance a Date or Time Interval ( INTNX )

  35. ***************************************************** orig_date(d) new_date Statement ------------- ---------- --------------------- 12/31/2002 12/31/2002 INTNX('DAY' ,d,0) 12/31/2002 01/01/2003 INTNX('DAY' ,d,1) 12/31/2002 01/01/2003 INTNX('MONTH',d,1) 12/31/2002 03/01/2003 INTNX('MONTH',d,3) 12/31/2002 01/01/2003 INTNX('YEAR' ,d,1) 12/31/2002 12/31/2004 INTNX('YEAR' ,d,2,'END') ****************************************************** 5.C - Advance a Date or Time Interval ( Continued )

  36. DATA temp5d; format dob1 dob2 mmddyy10.; dob = '23DEC1970'd; tmp = YRDIF(dob, TODAY(), 'ACTUAL'); age = FLOOR(tmp); RUN; ****************************** dob tmp age ---------- -------- ----- 12/23/1970 35.8740 35 ****************************** Note - Age in this example correspond to TODAY; 5.D - Compute Age as of Last Birthday( YRDIF )

  37. Kim Michalski (410) 786-0516 Kimberly.Michalski@cms.hhs.gov Rick Andrews (410) 786-4088 Richard.Andrews@cms.hhs.gov Centers for Medicare and Medicaid Services7500 Security BoulevardBaltimore, MD 21244 About the Speakers

More Related