SAS uses the usual arithmetic signs: + , - , * , / , **

hamish

Presentation Transcript


  1. Chapter 3, “Working With Your Data,” concerns programming in the DATA step: the lines of SAS code between a DATA statement and the next PROC statement.
     • Creating new variables or modifying existing variables is one of the main tasks in DATA step programming. The syntax is: variable = expression ;
     • The variable on the left side can be new or existing. If it is new, SAS adds it to the dataset; if it already exists, SAS redefines it, replacing the old values with the values of the expression on the right.
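A minimal sketch of the assignment statement from slide 1. The dataset name `demo` and the variables `price` and `total` are made up for illustration and are not from the book:

```sas
data demo;
  input price;
  total = price * 1.08;   /* total is new, so SAS adds it to the dataset    */
  price = price / 100;    /* price exists, so its old values are replaced   */
datalines;
250
1200
;
run;

proc print data=demo;
run;
```

Because `total` is computed before `price` is redefined, it uses the value of `price` as read from the data line.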

  2. SAS uses the usual arithmetic signs: + , - , * , / , ** (where ** is exponentiation)
     • follows the usual rules for precedence (exponentiation first, then multiplication & division, then addition and subtraction)
     • follows the usual rules for parentheses (do what’s in parentheses first), so when in doubt, use parentheses
     • Example: write a simple SAS program using the 5 arithmetic operations to create new variables, then write some programming statements to show how parentheses can affect those variables
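One way the effect of parentheses on precedence might be demonstrated. The dataset name `prec` and the variable names are arbitrary choices for this sketch:

```sas
data prec;
  x = 2;
  a = 3 + x * 4;       /* multiplication before addition: 3 + 8 = 11 */
  b = (3 + x) * 4;     /* parentheses first: 5 * 4 = 20               */
  c = x ** 2 * 3;      /* exponentiation first: 4 * 3 = 12            */
  d = x ** (2 * 3);    /* parentheses first: 2 ** 6 = 64              */
run;

proc print data=prec;
run;
```

Each pair of statements uses the same operands and operators; only the parentheses change the result.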

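  3. options ls=80;
     data test;
       input x @@;                  /* @@ reads several values of x from one data line */
       y=x+10;
       z=x/y;
       w=x**2;                      /* x squared */
       t=x**.5;                     /* square root of x */
       bd=mdy(1,15,1966);           /* SAS date value for January 15, 1966 */
       new=round(log10(150));       /* nested functions */
     datalines;
     1 2 3 4 5 6 7 8 9 10
     ;
     proc print;
       format bd mmddyy10.;
     run;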

  4. Many times new variables are created with the built-in functions of SAS. Sections 3.2 & 3.3 have a sampling of those, put in categories so you can see the variety. The categories are:
     • Character, Date, Financial, Macro, Mathematical
     • Probability, Random Number, Simple Statistical
     • State & Zip Code
     • Functions operate on their arguments and have the form function(arg1, …, argn); so to create a new variable with a function, use new_variable = function(argument_list);
     • Functions can be nested within each other: newvar=sqrt(log10(X));
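A short sketch of the new_variable = function(argument_list) pattern, including one nested call. The dataset name `funcs` and the variable names are illustrative only; the functions themselves (SQRT, LOG10, MEAN, UPCASE, SUBSTR) are standard SAS functions:

```sas
data funcs;
  x = 10000;
  root    = sqrt(x);                      /* 100                                */
  nested  = sqrt(log10(x));               /* log10(10000) = 4, then sqrt(4) = 2 */
  avg     = mean(4, 8, .);                /* MEAN skips missing arguments: 6    */
  initial = upcase(substr('sas', 1, 1));  /* character functions nest too: 'S'  */
run;

proc print data=funcs;
run;
```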

  5. DATA contest;
        *INFILE 'c:\MyRawData\Pumpkin.dat';
        * Formatted input is column-sensitive, so the data lines must start in column 1 when run;
        INPUT Name $16. Age 3. +1 Type $1. +1 Date MMDDYY10.
              (Scr1 Scr2 Scr3 Scr4 Scr5) (4.1);
        AvgScore = MEAN(Scr1, Scr2, Scr3, Scr4, Scr5);
        DayEntered = DAY(Date);
        Type = UPCASE(Type);
     DATALINES;
     Alicia Grossman  13 c 10-28-1999 7.8 6.5 7.2 8.0 7.9
     Matthew Lee       9 D 10-30-1999 6.5 5.9 6.8 6.0 8.1
     Elizabeth Garcia 10 C 10-29-1999 8.9 7.9 8.5 9.0 8.8
     Lori Newcombe     6 D 10-30-1999 6.7 5.6 4.9 5.2 6.1
     Jose Martinez     7 d 10-31-1999 8.9 9.510.0 9.7 9.0
     Brian Williams   11 C 10-29-1999 7.8 8.4 8.5 7.9 8.0
     ;
     PROC PRINT DATA = contest;
        TITLE 'Pumpkin Carving Contest';
     RUN;

  6. Go over the functions in section 3.3 in detail - general statements on p. 80 and specific examples on p. 81. Note the variety of arguments for the various functions…

  7. Another way to create new variables is with IF-THEN statements. The syntax is: IF condition THEN action ;
     • Most conditions are based on comparison operators: EQ, NE, GT, LT, GE, LE, IN (or use the symbols = , ~= or ^= , > , < , >= , <= ; IN has no symbol)
     • You can combine multiple conditions with logical operators: OR, AND, NOT (or use | , & , ~ or ^)
     • You can specify multiple actions using a DO-END group (see p. 82 - also next slide)

  8. In the conditional IF condition THEN action ;
     • condition can also be compound, formed by using the logical operators AND, OR, NOT or by using the IN operator (see the examples on pages 82 and 83)
     • You can also use the following construction:
          IF condition THEN DO;
             action1;
             action2;
          END;
     • this DO group can contain many SAS statements, but they are all treated as a unit and executed one after the other until the END statement is reached
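The pieces from slides 7 and 8 (a compound condition, the IN operator, and a DO group) could be combined as in the sketch below. The dataset `grades`, its variables, and the data values are invented for illustration:

```sas
data grades;
  input name $ score section $;
  length note $ 12;
  /* compound condition: AND combined with the IN operator */
  if score ge 90 and section in ('A', 'B') then note = 'honors';
  /* a missing value compares as lower than any number, so exclude it explicitly */
  if score lt 60 and score ne . then do;
    note = 'follow up';    /* both statements in the DO group run      */
    flagged = 1;           /* only when the condition above is true    */
  end;
datalines;
Ann 95 A
Ben 52 C
Cy 75 B
;
run;

proc print data=grades;
run;
```

The LENGTH statement is there so that `note` is long enough for the longest value assigned to it, whichever assignment appears first.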

  9. IF-THEN statements are often used to create new categorical variables from existing variables with many different values, or to convert numeric variables to character variables - do some examples…
     • IF-THEN-ELSE statements help with this task by ensuring that the categories are mutually exclusive: once one condition is true, the remaining ELSE branches are skipped. Sometimes the final ELSE statement has no IF…THEN, so it catches every observation not matched earlier.
     • See example on page 85 - note how missing values are handled…

  10. * Group observations by cost;
      DATA homeimprovements;
         *INFILE 'c:\MyRawData\Home.dat';
         * Column input is column-sensitive, so the data lines must start in column 1 when run;
         INPUT Owner $ 1-7 Description $ 9-33 Cost;
         IF Cost = . THEN CostGroup = 'missing';
            ELSE IF Cost < 2000 THEN CostGroup = 'low';
            ELSE IF Cost < 10000 THEN CostGroup = 'medium';
            ELSE CostGroup = 'high';
      DATALINES;
      Bob     kitchen cabinet face-lift 1253.00
      Shirley bathroom addition         11350.70
      Silvia  paint exterior            .
      Al      backyard gazebo           3098.63
      Norm    paint interior            647.77
      Kathy   second floor addition     75362.93
      ;
      PROC PRINT DATA = homeimprovements;
         TITLE 'Home Improvement Cost Groups';
      RUN;

  11. HW: Email to me by noon on Wednesday:
     • Use the diabetes data to create the following new variables:
     • BMI (Body Mass Index = weight in kg divided by the square of height in meters)
     • A character variable based on the BMI that puts the participants into groups: below 18.5 is underweight, between 18.5 and 24.9 is normal, 25.0 to 29.9 is overweight, and 30.0 and above is obese.
     • age classes - make classes by decade (teens, 20s, 30s, etc.)
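As a rough starting point only, the BMI formula and the grouping logic might be sketched as follows. The variable names `weight_kg`, `height_m`, and `age` and the single data line are hypothetical; the actual diabetes data will have its own variable names and possibly different units, so conversions may be needed first:

```sas
data bmi_sketch;
  input weight_kg height_m age;       /* hypothetical names and units */
  length bmi_group $ 11;              /* long enough for 'underweight' */
  bmi = weight_kg / height_m**2;      /* ** is evaluated before /      */
  if bmi = . then bmi_group = ' ';    /* keep missing BMI out of the groups */
  else if bmi < 18.5 then bmi_group = 'underweight';
  else if bmi < 25   then bmi_group = 'normal';
  else if bmi < 30   then bmi_group = 'overweight';
  else bmi_group = 'obese';
  decade = floor(age / 10) * 10;      /* 10 = teens, 20 = 20s, 30 = 30s, ... */
datalines;
70 1.75 34
;
run;
```

Using cut-offs of 25 and 30 with ELSE IF avoids gaps between 24.9 and 25.0 or between 29.9 and 30.0; that is a design choice of this sketch rather than something the slide specifies.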
