1 / 13

Chapter 20 Creating a Single Observation from Multiple Records

Chapter 20 Creating a Single Observation from Multiple Records. Objectives Read multiple records sequentially and create a single observation Read multiple records non-sequentially and create a single observation. Multiple Records Per Observation. Farr, Sue Anaheim, CA 869-7008

Download Presentation

Chapter 20 Creating a Single Observation from Multiple Records

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 20 Creating a Single Observation from Multiple Records Objectives Read multiple records sequentially and create a single observation Read multiple records non-sequentially and create a single observation

  2. Multiple Records Per Observation Farr, Sue Anaheim, CA 869-7008 Anderson, Kay B. Chicago, IL 483-3321 Tennenbaum, Mary Ann Jefferson, MO 589-9030 • A raw data file has three records per employee. • Record 1 contains the first and last names, • Record 2 contains the city and state of residence, and • Record 3 contains the employee’s phone number.

  3. Desired Output The SAS data set should have one observation per employee. LName FName City State Phone Farr Sue Anaheim CA 869-7008 Anderson Kay B. Chicago IL 483-3321 Tennenbaum Mary Ann Jefferson MO 589-9030

  4. The INPUT Statement • The SAS System loads a new record into the input buffer when it encounters an INPUT statement. • You can have multiple INPUT statements in one DATA step. • Each INPUT statement ends with a semicolon. DATASAS-data-set; INPUTvar-1 var-2 var-3; INPUTvar-4 var-5; additional SAS statements

  5. Multiple INPUT Statements data address; length LNameFName $ 20 City $ 25 State $ 2 Phone $ 8; infile 'raw-data-file' dlm=','; input LName $ FName $; input City $ State $; input Phone $; run; Load Record Load Record Load Record ...

  6. Use Line Pointer Controls • You can also use line pointer controls ‘/ ’ to control when SAS loads a new record. • SAS loads the next record when it encounters a forward slash. DATASAS-data-set; INPUTvar_1 var_2 var_3/var_4 var_5; additional SAS statements

  7. Reading Multiple Records Per Observation sequentially data address; length LNameFName $ 20 City $ 25 State $ 2 Phone $ 8; infile 'raw-data-file' dlm=','; input LName $ FName $ / City $ State $ / Phone $; run; Load Record Load Record Load Record ...

  8. Reading Multiple Records Per Observation Partial Log NOTE: 9 records were read from the infile 'addresses.dat'. The minimum record length was 8. The maximum record length was 20. NOTE: The data set WORK.ADDRESS has 3 observations and 5 variables.

  9. Reading Multiple Records Per Observation proc print data=address noobs; run; PROC PRINT Output LName FName City State Phone Farr Sue Anaheim CA 869-7008 Anderson Kay B. Chicago IL 483-3321 Tennenbaum Mary Ann Jefferson MO 589-9030

  10. Some issues when using sequential line pointer control The / line pointer control in the INPUT statement instructs SAS to go to next record and read variables as defined. • One can use as many / as needed to read records sequentially for an observation. • The number of records per observation must be the same. • If the # of records per observation is not the same, then, the / line point is not good.

  11. Reading Multiple Records Non-sequentially In some situations, you may want to read record #2 first, then go back to read record #1 and #3. This is non-sequential. SAS uses the #n line pointer control to specify the record # to be read. For example: The following INPUT statement reads line 2, then, line 1, the line 3. INPUT #2 department $ 1-12 Jobcode $ 15-19 #1 Lname $ Fname $ #3 salary comma10. ;

  12. Combining / and #n line pointer control for reading multiple records per observation Depending on the data structure, one can combine the use of / for sequential input and #n for non-sequential record input. Note: • The / line pointer control can only move forward, the pointer control is specified after the values in the current record are read. • The #n line pointer control can read any record in any order. It must be specified before the variable name are defined. • A semicolon is placed only at the end of the complete INPUT statement.

  13. Exercise Open program c20_1. Run and observe the results to learn how to read an observation from multiple records . A sample of data ADAMS, LARRY 1352 133 SALES 32,696.78 BICKETT, PAT 1238 133 SALES 35,099.50 BIRCH, CARLA 1252 133 SALES 37,098.71 CARTER, WANDA 1424 105 MARKETING 17,098.71

More Related