1 / 163

Chapter 7

Chapter 7. DATA Step Programming. Section 7.1. Reading SAS Data Sets and Creating Variables. Objectives. Create a SAS data set using another SAS data set as input. Create SAS variables. Use operators and SAS functions to manipulate data values.

Download Presentation

Chapter 7

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 7 DATA Step Programming

  2. Section 7.1 Reading SAS Data Sets and Creating Variables

  3. Objectives • Create a SAS data set using another SAS data set as input. • Create SAS variables. • Use operators and SAS functions to manipulate data values. • Control which variables are included in a SAS data set.

  4. SAS Vocabulary • DATA Statement TODAY() • SET Statement MDY(month,day,year) • Assignment Statement YEAR(SAS-date) • Operators QTR(SAS-date) • Operands MONTH(SAS-date) • SAS Function WEEKDAY(SAS-date) • SUM DROP/KEEP

  5. Reading a SAS Data Set Create a temporary SAS data set named onboard from the permanent SAS data named ia.dfwlaxand create a variable that represents the total passengers on board. Sum FirstClass and Economy values to compute the new variable, Total. New Variable SAS date values ia.dfwlax

  6. Reading a SAS Data Set To create a SAS data set using a SAS data set as input, you must use a • 1. DATA statement - • 2. SET statement - ...

  7. Reading a SAS Data Set To create a SAS data set using a SAS data set as input, you must use a • 1. DATA statement – starts the DATA step and tells SAS the name of the data set you want to CREATE. (output data set:onboard) • 2. SET statement - tells SAS the data set to USE to create the data set named on the DATA statement. (input data set:ia.dfwlax)

  8. Reading a SAS Data Set General form of a DATA step: • By default, the SET statement reads all • observations from the input SAS data set • variables from the input SAS data set. DATA output-SAS-data-set; SET input-SAS-data-set;additional SAS statementsRUN;

  9. The DATA Step – Example libname ia 'SAS-data-library'; data onboard; set ia.dfwlax; run; … to create this data set. Use this data set… The DATA step creates a data set. The DATA statement tells SAS the name of the data set being created.

  10. The DATA Step – Example At compile time, SAS reads the descriptor portion of ia.dfwlax and creates a spot in the PDV for each of the variables in the order they appear in the data set. libname ia 'SAS-data-library'; data onboard; set ia.dfwlax; run; PDV

  11. Reading a SAS Data Set • To create a variable, you must use an • Assignment statement to assign the new variable, Total, as the sum of the values of the variables FirstClass and Economy. Example: Total = FClass + EClass; New variable

  12. Assignment Statements • An assignment statement • evaluates an expression • assigns the resulting value to a variable. General form of an assignment statement: variable=expression;

  13. SAS Expressions An expression contains operands andoperators that form a set of instructions that produce a value. • Operandsare • variable names • constants • Operators are • symbols that request arithmetic calculations • SAS functions We look at functions later in this section.

  14. Selected operators for basic arithmetic calculations in an assignment statement: Using Operators (Remember “Please Excuse My Dear Aunt Sally…”) You can also use parentheses to enforce a different order.

  15. Example libname ia 'SAS-data-library'; data onboard; set ia.dfwlax; Total=FirstClass+Economy; run;

  16. Compiling the DATA Step – Example libname ia 'SAS-data-library'; data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; The + tells SAS that Total is numeric PDV

  17. Compiling the DATA Step – Example libname ia 'SAS-data-library'; data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; PDV SAS creates the variables in the order that it sees them. Total is created at the end of the step, so Total is last in the PDV.

  18. Executing the DATA Step – Example First, SAS initializes the PDV to missing. Character variables get set to a blank, and the numeric variables get set to a period. data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; PDV onboard

  19. Executing the DATA Step – Example There is a pointer to the first observation in the ia.dfwlax data set. The SET statement tells SAS to load the observation into the PDV. ia.dfwlax data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; PDV

  20. Executing the DATA Step – Example The pointer then moves down to the next observation in the ia.dfwlax data set. ia.dfwlax data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; PDV

  21. Executing the DATA Step – Example ia.dfwlax data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; PDV onboard

  22. Automatic return Automatic output Executing the DATA Step – Example ia.dfwlax data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; PDV onboard

  23. Remember Executing the DATA Step – Example Every variable in a SAS data set has to have a value. Variables that are not read from the data set in the SET statement are reinitialized in the PDV. Variables that were read from the SAS data set will always have a value.

  24. Executing the DATA Step – Example ia.dfwlax Reinitialize Total to missing data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; PDV onboard

  25. Executing the DATA Step – Example ia.dfwlax data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; PDV onboard

  26. Executing the DATA Step – Example ia.dfwlax data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; PDV onboard

  27. Automatic return Automatic output Executing the DATA Step – Example ia.dfwlax data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; PDV onboard

  28. Executing the DATA Step – Example ia.dfwlax data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; PDV onboard

  29. Assignment Statements – Example proc print data=onboard; format Date date9.; run; The SAS System First Obs Flight Date Dest Class Economy Total 1 439 11DEC2000 LAX 20 137 157 2 921 11DEC2000 DFW 20 131 151 3 114 12DEC2000 LAX 15 170 185 4 982 12DEC2000 dfw 5 85 90 5 439 13DEC2000 LAX 14 196 210 6 982 13DEC2000 DFW 15 116 131 7 431 14DEC2000 LaX 17 166 183 8 982 14DEC2000 DFW 7 88 95 9 114 15DEC2000 LAX . 187 . 10 982 15DEC2000 DFW 14 31 45 Why is Totalmissing in observation 9?

  30. Assignment Statements – Example • If any of the variables have missing values in an expression involving the +, -, * or /operators, the result will be a missing value. • It could be useful to see a missing value for Total, as it might indicate that you need to check the data. • However, if you prefer to ignore missing values and see 187 for Total, you will need to use a SAS function.

  31. Using SAS Functions • A SAS function is a routine that returns a value that is determined from specified arguments. • General form of a SAS function: function-name(argument1,argument2, . . .) Total=sum(FirstClass,Economy);

  32. Using SAS Functions You mustinclude the parentheses. They are part of the function syntax. If you forget them, the function name will be mistaken for a variable name. If there are multiple arguments inside the parentheses, they must be separated by a comma. function-name(argument1,argument2, . . .)

  33. Using SAS Functions • SAS functions • perform arithmetic operations • compute sample statistics • for example: sum, mean, and standard deviation • manipulate SAS dates and process character values • perform many other tasks. Sample statistics functionsignore missing values.

  34. Using the SUM Function – Example data onboard; set ia.dfwlax; Total=FirstClass+Economy; run; proc print data=onboard; format Date date9.; run; Replace the equation that using + operator with the SUM function. Note: There is no limit on the number of variables you can list inside the SUM function. data onboard; set ia.dfwlax; Total=sum(FirstClass,Economy); run; proc print data=onboard; format Date date9.; run;

  35. Using the SUM Function – Example proc print data=onboard; format Date date9.; run; The SAS System First Obs Flight Date Dest Class Economy Total 1 439 11DEC2000 LAX 20 137 157 2 921 11DEC2000 DFW 20 131 151 3 114 12DEC2000 LAX 15 170 185 4 982 12DEC2000 dfw 5 85 90 5 439 13DEC2000 LAX 14 196 210 6 982 13DEC2000 DFW 15 116 131 7 431 14DEC2000 LaX 17 166 183 8 982 14DEC2000 DFW 7 88 95 9 114 15DEC2000 LAX . 187 187 10 982 15DEC2000 DFW 14 31 45

  36. Using SAS Functions • Example • What if you wanted the difference between the two values? • You could use the – operator. There is not a function to subtract values. However, you could use the SUM function by adding a negative sign in front of the variable you want to subtract. Total=sum(FirstClass,-Economy);

  37. Exercise • This exercise reinforces the concepts discussed previously.

  38. Exercises Use the depression data set to create a new data set called DepressionData and a list report about economic factors during the Great Depression. Add two columns to the above data set. The first new variable called Change should calculate the difference between the number of banks and the number of banks suspending operations that year. Format the variable Changewith commas and no decimal places. The second variable called PercentChange should calculate the percentage of banks that suspended operations during that year. Use the PERCENT5. format for the PercentChangecolumn. The list report should contain the variables: Year, UnemploymentRate, NumberOfBanks, BankSuspensions, Change, and PercentChange.

  39. Exercises data Depression; set depressiondata; Change=NumberOfBanks-BankSuspensions; PercentChange=(BankSuspensions/NumberOfBanks); format change comma7. percentchange percent5.; run; proc print data=Depression noobs; var Year UnemploymentRate NumberOfBanks BankSuspensions Change PercentChange; run; Partial Output Unemployment Number Bank Percent Year Rate OfBanks Suspensions Change Change 1929 3.3 24,633 659 23,974 3% 1930 8.9 22,773 1,350 21,423 6% 1931 15.9 19,970 2,293 17,677 11% 1932 23.6 18,397 1,453 16,944 8% 1933 24.9 15,015 4,000 11,015 27% 1934 21.7 16,096 57 16,039 0% 1935 20.1 . . . .

  40. Exercises • Use the purchases data set to create a new data set called totalpurchases and a list report. • The first variable is called BudgetedCost, which is the Total Items Needed times the Cost per Item (including the discount). • The second variable is called BackOrdered and is the difference between TotNeeded and TotPurchased, ignoring missing values. • Format Cost and BudgetedCost with a dollar sign and two decimal places.

  41. Exercises data totalpurchases; set purchases; BudgetedCost=totneeded*(Cost-Cost*PercentDiscount/100); BackOrdered=sum(TotNeeded, - TotPurchased); format cost dollar6.2 budgetedcost dollar9.2; run; proc print data=totalpurchases; run; Tot Tot Percent Budgeted Back Obs ItemNo ItemName Needed Purchased Cost Discount Cost Ordered 1 636589 Mugs 12 oz 128 64 $6.80 5 $826.88 64 2 167643 Mugs 20 oz 24 12 $7.20 0 $172.80 12 3 891674 Wine Glass Fluted 100 85 $3.50 0 $350.00 15 4 197646 Wine Glass Clear 100 . $3.50 0 $350.00 100 5 324346 Cutting Board Maple 15 15 $11.50 10 $155.25 0 6 354664 Carving Knife 10 inch 30 15 $43.60 15 $1,111.80 15 7 994643 Paring Knife 30 4 $23.50 15 $599.25 26 8 764646 Drink Carafe 48 48 $34.25 25 $1,233.00 0 9 546546 Fluted Pitchers 36 40 $34.99 20 $1,007.71 -4

  42. Using Date Functions • You can use SAS date functions to • create SAS date values • extract information from SAS date values.

  43. Date Functions: Create SAS Dates • TODAY( )obtains the date value from the system clock and returns the number of days the date is from January 1, 1960. Notice there are no arguments inside the ( ). You will get an error if you add an argument. SAS uses the system clock on the computer to return the date. If you set an incorrect date, the incorrect date will be returned.

  44. Date Functions: Create SAS Dates • MDY(month,day,year) uses numeric month, day, and year values to return the corresponding SAS date value. If you want to supply SAS with a month, day, and year, you can create a SAS date. This can be a numeric literal or a variable name that represents a number. • Example: The Declaration of Independence was signed July 4, 1776. To see this as a SAS date, use the MDY function. Date = mdy(7, 4, 1776); -67019

  45. Exercise • This exercise reinforces the concepts discussed previously.

  46. Exercises • What is your birthday as a SAS date value? Use PROC PRINT to display the value. Verify the date was calculated successfully by formatting the value using the DATE9. format. data bday; myBday = ; run;

  47. Exercises • Solution: Use Martin Luther King, Jr.’s birthday for the myBday value, which is January 15th, 1929. data bday; myBday = mdy(1,15,1929); format mybday date9.; run; proc print data=bday; run;

  48. Exercises • Modify the program. • What is today as a SAS date? (format permanently as DATE9.) • How many days old are you? • If you divide the number of days old you are by 365, how old are you by SAS calculations? data bday; myBday = ; today_is = ; num_days = ; Age = ; run;

  49. Exercises • Solution: Use Martin Luther King, Jr.’s Birthday for the myBday value, which is January 15th, 1929. data bday; myBday = mdy(1,15,1929); today_is = today(); num_days = today_is - myBday; Age = Num_days / 365; format mybday today_is date9.; run; proc print data=bday; run;

  50. Date Functions: Extracting Information • YEAR(SAS-date)extracts the year from a SAS date and returns a four-digit value for year. • Example: The Declaration of Independence was signed July 4, 1776, which is -67019 as a SAS date. To extract the year, use the YEAR function. IndepYear = year(-67019); 1776

More Related