1 / 16

BMTRY 789 Lecture 4: Construction of Flat Files from vertical files, merging, subsetting

BMTRY 789 Lecture 4: Construction of Flat Files from vertical files, merging, subsetting. Readings – Chapter 14 In-Class Practice - 14.1, 14.4, 14.5, 14.6, 14.11. You have to have Goals …. Vertical & Flat File Definitions.

qamra
Download Presentation

BMTRY 789 Lecture 4: Construction of Flat Files from vertical files, merging, subsetting

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. BMTRY 789 Lecture 4: Construction of Flat Files from vertical files, merging, subsetting Readings – Chapter 14 In-Class Practice - 14.1, 14.4, 14.5, 14.6, 14.11

  2. You have to have Goals … BMTRY 789 Introduction to SAS Programming

  3. Vertical & Flat File Definitions • A vertical file is a file that has multiple observations per identifier (i.e. patient) • A flat file is a file that has one observation per identifier. Subj Age Date HR BP 001 26 07/23/98 58 12080 [Vertical] 001 27 02/14/99 62 11060 … Subj Age Date1 HR1 BP1 Date2 HR2 BP2 001 26 07/23/98 58 12080 02/14/99 62 11060 [Flat] 002 25 02/28/00 65 11070 09/01/01 65 12070 … BMTRY 789 Introduction to SAS Programming

  4. Data Manipulation • Why do you spend 85% of your programming time doing data manipulation? • You get data one way, but it needs to be another way in order to perform the analysis. • Will need to do things like, subset, merge, rename variables, create new variables, etc. in order to structure the data into the format that you need it to be. BMTRY 789 Introduction to SAS Programming

  5. Conditional Logic [If-Then-Else] • Frequently, you want an assignment statement to apply to some observations but not all - under some conditions, but not others. 1) IF condition THEN action; 2) IF condition THEN action; ELSE IF condition THEN action; ELSE IF condition THEN action; 3) IF condition THEN action; ELSE IF condition THEN action; ELSE action; 4) IF condition THEN DO; Perform a set of actions; END; BMTRY 789 Introduction to SAS Programming

  6. IF-THEN-ELSE Rules • A single IF-THEN statement can only have one action. If you add the keywords DO and END, then you can execute more than one action. • You can also specify multiple conditions with the keywords AND and OR • The IN operator also makes comparisons, IN compares the value of a variable to a list of values. *Remember SAS considers missing values to be smaller than non-missing values. BMTRY 789 Introduction to SAS Programming

  7. Comparison Operators • These operators can be coded using Symbols or Mnemonics. SymbolMnemonicMeaning = EQ Equals ~= NE Not Equal > GT Greater Than < LT Less Than >= GE Greater than or Equal <= LE Less than or Equal & AND All comparisons must be true | OR Only one comparison must be true BMTRY 789 Introduction to SAS Programming

  8. Subsetting • Often programmers find that they want to use some of the observations in a data set and exclude the rest. The most common way to do this is with a subsetting IF statement in a DATA step. Syntax: IF expression; Ex: IF Sex = ‘f’; BMTRY 789 Introduction to SAS Programming

  9. Computing a Difference Between first and last observations ProcSort Data= Annie.PATIENTS; By id Date; Run; Data First_Last; Set Annie.PATIENTS; By Id; Format Date mmddyy10.; Retain First_HR First_SBP First_DBP; If First.id and Last.id then delete; ***Purpose of code: if it is the first visit assign values to the first visit variables, same for last visit; If First.ID Then Do; First_HR = HR; First_SBP = SBP; First_DBP = DBP; End; If Last.ID Then Do; Last_HR = HR; Last_SBP = SBP; Last_DBP = DBP; Diff_HR = Last_HR - First_HR; Diff_SBP = Last_SBP - First_SBP; Diff_DBP = Last_DBP - First_DBP; Output; End; Run; BMTRY 789 Introduction to SAS Programming

  10. Subsetting (cont.) • If the expression is true, then SAS continues with the DATA step. If the expression is false, then no further statements are processed for that observation; that observation is not added to the data set being created; and SAS moves to the next observation. • While the subsetting IF statement tells SAS which observations to include, the DELETE statement tells SAS which observations to exclude: IF expression THEN DELETE; IF Sex = ‘f’ THEN DELETE; BMTRY 789 Introduction to SAS Programming

  11. Stacking Data Sets • Used to combine data sets with all or most of the same variables but different observations. • The order of observations is determined by the order of the list of old data sets • If one of the data sets has a variable not contained in the other data sets, then the observations from the other data sets will have missing values for that variable. DATA new-data-set; SET old-data-set1 old-data-set2; RUN; BMTRY 789 Introduction to SAS Programming

  12. Interleaving Data Sets • If your data sets are already sorted, combine data sets while preserving that order. PROC SORT first! • If one of the data sets has a variable not contained in the other data sets, then the observations from the other data sets will have missing values for that variable. DATA new-data-set; SET old-data-set1 old-data-set2; BY variable-list; RUN; BMTRY 789 Introduction to SAS Programming

  13. Combining Data Sets Using One-to-One Match Merge • When you have two data sets with related data and you want to combine them. • If you merge two data sets, and they have variables with the same names – besides the BY variables, then variables from the second data set will overwrite any variables having the same names in the first data set. • All observations from old data sets will be included in the new data set whether they have a match or not. BMTRY 789 Introduction to SAS Programming

  14. Match Merge Example Proc Sort Data = Rat; BY RatID Date; Run; Proc Sort Data = Rat2; BY RatID Date; Run; DATA BigRat; MERGE Rat Rat2; BY RatID Date; Run; BMTRY 789 Introduction to SAS Programming

  15. Tracking and Selecting Obs with the IN=Option • This is most often used with MERGE • IN=varName is temporary, existing only during the current DATA step • SAS gives the IN=varName variable a value of ‘0’ if the data set did not contribute to the current observation and a value of ‘1’ if the data set did contribute BMTRY 789 Introduction to SAS Programming

  16. IN=varName Example Proc Sort Data = Worker; By Year; Run; Proc Sort Data = Exp; By Year; Run; Data WorkExp; Merge Worker (IN=InWork) Exp; By Year; If InWork; Run; BMTRY 789 Introduction to SAS Programming

More Related