multiple uses for a simple sql procedure l.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Multiple Uses for a Simple SQL Procedure PowerPoint Presentation
Download Presentation
Multiple Uses for a Simple SQL Procedure

Loading in 2 Seconds...

play fullscreen
1 / 28

Multiple Uses for a Simple SQL Procedure - PowerPoint PPT Presentation


  • 157 Views
  • Uploaded on

Multiple Uses for a Simple SQL Procedure. Rebecca Larsen University of South Florida. Introduction. The SQL Procedure…Equivalent or advantageous to other SAS data steps and procedures? Handy SQL Procedure: CREATE TABLE… INSERT INTO…SELECT

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Multiple Uses for a Simple SQL Procedure' - jacob


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
multiple uses for a simple sql procedure

Multiple Uses for a Simple SQL Procedure

Rebecca Larsen

University of South Florida

introduction
Introduction
  • The SQL Procedure…Equivalent or advantageous to other SAS data steps and procedures?
  • Handy SQL Procedure:

CREATE TABLE…

INSERT INTO…SELECT

  • Several brief examples of using the above procedure will be presented

SAS is a registered trademark or trademark of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Other brand and product names are registered trademarks or Trademarks of their respective companies.

overview of topics
Overview of Topics
  • Inserting Rows from Multiple Tables into One Table
  • Inserting Rows from Multiple Queries into One Table
  • Changing the Order of Variables in a SAS Data Set
  • Changing Variable Attributes and/or Names in a SAS Data Set
1 inserting rows from multiple tables into one table
1. Inserting Rows from Multiple Tables into One Table
  • Why would you want to insert rows from multiple tables into one table?
    • Information related to one customer/patient/client is stored in multiple data sets and linked by a common identifier
    • Information related to one issue/problem you would like to study is stored in multiple data sets and needs to be combined
health care claims example
Health Care Claims Example
  • In this first example, there are multiple source files with different record layouts
    • HomeHealth_clm – record layout 1
    • Hospice_clm – record layout 1
    • SkilledNursing_clm – record layout 1
    • Inpatient_clm – record layout 1
    • Outpatient_clm – record layout 2
    • PartB_clm – record layout 2
  • All files contain information needed to complete your research study
  • You need to combine information from all source data sets into one “master” analysis data set
record layouts of health care claims example data sets
Record Layout 1:

person_id

health_status_cd

service_category_cd

race_cd

procedure_amt*

procedure_dt*

Record Layout 2:

person_id

health_status_cd

service_category_cd

race_cd

service_amt*

service_dt*

Record Layouts of Health Care Claims Example Data Sets

*In this example, source variables have different names in the different record layouts, although they contain analogous information and should be analyzed as one variable in the “master” data set

create a new blank table using the following code
Create a new, blank table using the following code:

PROC SQL;

CREATE TABLE allclaims

(

person_id char(11),

health_status_cd char(2),

service_category_cd char(2),

race_cd char(1),

serv_amt_cu num,

service_dt num

)

;

QUIT;

use a macro to insert values from multiple source files into one data set
Use a macro to insert values from multiple source files into one data set

/** 1. Define macro **/

%MACRO insert(amount, date, file);

PROC SQL;

INSERT INTO allclaims

SELECT person_id,

status_cd,

service_category_cd,

race_cd,

&amount.,

&date.

FROM &file.;

QUIT;

%MEND insert;

continued
(continued)

/** 2. Call macro **/

%insert(service_amt, service_dt, HomeHealth_clm);

%insert(service_amt, service_dt, Hospice_clm);

%insert(service_amt, service_dt, Inpatient_clm);

%insert(service_amt, service_dt, SkilledNursing_clm);

%insert(procedure_amt, procedure_dt, Outpatient_clm);

%insert(procedure_amt, procedure_dt, PartB_clm);

notice
Notice…
  • …that different source variables for the original files are inserted into a single variable in the new file
    • Both variables must be of the same type in the original file as being created in the new file (i.e. numeric into numeric)
      • service_amt (num) vs. procedure_amt (num)
      • service_dt (num) vs. procedure_dt (num)
  • One master data set was created for the purpose of completing analysis on information related to health care claims data from multiple source files
2 inserting rows from multiple queries into one table
2. Inserting Rows from Multiple Queries into One Table
  • Why would you want to insert rows from multiple queries into one table?
    • You want to create summary records from your master data set based on different “where clause” conditions in separate queries and insert them into one summary data set
health care claims example part 2
Health Care Claims Example, Part 2
  • Continuing with the health care claims example provided above, a summary table with one record per person is desired to be used in statistical analyses, i.e. ANOVA, PROC LOGISTIC, etc.
  • You want the summary data set to be created with the following characteristics:
    • One record per person,
    • “Race_cd” (5 class levels) to be categorized to create the new variable, “minority_cd” (2 class levels),
continued13
(continued)
  • Summary variable “mean_cu” to be created as the mean amount paid per person,
  • Records with missing service amount to be excluded,
  • Persons with more than one value for minority_cd were excluded (to eliminate persons with indistinguishable minority status)
    • For statistical tests with a CLASS variable such as minority status, you want to make sure the classes are mutually exclusive, or you will compromise the validity of your results
first create the blank summary table using the following code
First, create the blank summary table using the following code:

PROC SQL;

CREATE TABLE minority

(

person_id char(11),

health_status_cd char(2),

service_category_cd char(2),

minority_cd char(4),

mean_cu num

)

;

next insert values from multiple queries into one summary data set query 1
Next, Insert values from multiple queries into one summary data set (Query 1)

/** Query 1 **/

INSERT INTO minority

SELECT

distinct person_id,

health_status_cd,

service_category_cd,

case

when race_cd = ‘1’

then ‘1’

else ‘0’

end as minority_cd,

mean(serv_amt_cu) as mean_cu

FROM allclaims

WHERE serv_amt_cu is not missing

and service_category_cd = ‘mh’

and count(distinct minority_cd) = 1

GROUP BY person_id;

continue inserting values into the summary data set query 2
Continue inserting values into the summary data set (Query 2)

/** Query 2 **/

INSERT INTO minority

SELECT

distinct person_id,

health_status_cd,

service_category_cd,

case

when race_cd = ‘1’

then ‘1’

else ‘0’

end as minority_cd,

mean(serv_amt_cu) as mean_cu

FROM allclaims

WHERE serv_amt_cu is not missing

and service_category_cd = ‘ph’

and count(distinct minority_cd) = 1

GROUP BY person_id;

QUIT;

cool code
Cool Code…
  • The count function in this WHERE clause is an example of PROC SQL code that can accomplish in one step, what would take more than one step and several more lines of regular SAS code

where serv_amt_cu is not missing

and service_category_cd = ‘ph’

and count(distinct minority_cd) = 1

group by person_id

  • The case expression can be used to create a new variable that is a “re-categorization” of the values of another variable

case

when race_cd = ‘1’then ‘1’

else ‘0’

end as minority_cd

3 changing the order of variables in a sas data set
3. Changing the Order of Variables in a SAS Data Set
  • Why would you want to change the order of the variables in your data set?
    • Perhaps your data step manipulations have altered the order of the variables from the original, and you need to export the data in the same format you received it
    • Or you wish to arrange certain variable types together for coding processes to be run on sequential variables in your data set
example of changing the order of variables in your data set
Example of Changing the Order of Variables in Your Data Set
  • In the following example, oldfile (Table 1) has variables in a certain order that you would like to change, and newfile (Table 2) has variables in the order to which you would like to change
the following code was used to change the variable order
/** Step 1 **/

PROC SQL;

CREATE TABLE newfile

(

pin char(11),

sex_cd char(1),

race_cd char(1),

DOB num,

age num,

diagnosis_cd char(5),

procedure_cd char(5),

enrollment_dt num,

termination_dt num,

death_dt num

);

/** Step 2 **/

INSERT INTO newfile

SELECT

pin,

sex_cd,

race_cd,

DOB,

age,

diagnosis_cd,

procedure_cd,

enrollment_dt,

termination_dt,

death_dt

FROM oldfile;

QUIT;

The following code was used to change the variable order:
remember
Remember…
  • …the order in which you specify the variables in the INSERT INTO…SELECT statement must be the exact order of the variables that are created in the blank data set, NOT the order of the data set from which they came.
4 changing variable attributes and or names in a sas data set
4. Changing Variable Attributes and/or Names in a SAS Data Set
  • The names of the variables do not have to be identical in the newly created table as the table from which you are selecting the records
  • Other attributes (i.e. formats, informats, labels, character length, etc.) can be easily added or changed
  • For example, we will create a new file called, “newfile2”, with different variable names and attributes from the data set in the previous example called, “newfile”
use the following code to change the name of variables and add a date formats in your data set
/** Step 1 **/

PROC SQL;

CREATE TABLE newfile2

(

pin_id char(11),

sex_cd char(1),

race_cd char(1),

DOB_dt num format=date9.,

age_iv num,

diagnosis_cd char(5),

procedure_cd char(5),

enrollment_dt num format=date9.,

termination_dt num format=date9.,

death_dt num format=date9.

);

/** Step 2 **/

INSERT INTO newfile2

SELECT

pin,

sex_cd,

race_cd,

DOB,

age,

diagnosis_cd,

procedure_cd,

enrollment_dt,

termination_dt,

death_dt

FROM newfile;

QUIT;

Use the following code to change the name of variables and add a date formats in your data set:
notice25
Notice…
  • If all you want to do is change names or attributes of variables in a data set, then PROC DATASETS is a much more efficient tool; however,
  • If you have other steps that need to be accomplished by writing a new file, then the ability to change names and attributes is available in PROC SQL
conclusion
Conclusion
  • There are many more applications for the CREATE TABLE…INSERT INTO…SELECT statements in PROC SQL than the few examples shown
    • Many times it is just another way to do the same thing
    • Other times it may cut down the number of sorts, data steps & procedures or lines of code required
    • Try it out for yourself…you may find something you will wish you had always known!
slide27
My preferred SQL coding resource is the following user’s guide:
    • SAS Institute Inc. (2000), SAS® SQL Procedure User’s Guide, Version 8, Cary, NC: SAS Institute Inc.
about the speaker
Speaker:

Location of Company:

Telephone:

Fax:

Email:

Rebecca Larsen,

Assistant in Research

13301 Bruce B. Downs Boulevard

MHC2617

University of South Florida

Tampa, Florida 33612-3807

(813) 974-7206

(813) 974-6411

rlarsen@fmhi.usf.edu

About the Speaker

To view the presentation slides online, go to http://psrdc.fmhi.usf.edu/SQL_Savannah_2002.ppt