1 / 35

Module 5: SAS/SQL

Module 5: SAS/SQL. 15 July 2009. Introduction - What is PROC SQL. Proc SQL is the SAS implementation of SQL Proc SQL is a powerful SAS procedure that combines the functionality of the SAS data step with the SQL language Proc SQL can sort, subset, merge and summarize data – all at once

miyoko
Download Presentation

Module 5: SAS/SQL

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Module 5: SAS/SQL 15 July 2009

  2. Introduction - What is PROC SQL • Proc SQL is the SAS implementation of SQL • Proc SQL is a powerful SAS procedure that combines the functionality of the SAS data step with the SQL language • Proc SQL can sort, subset, merge and summarize data – all at once • Proc SQL can combine standard SQL functions with virtually all SAS functions • Proc SQL can work remotely with RDBMS such as Oracle

  3. PROC SQL – What can do? • To perform a query – Using SELECT statement. • To save queried result into SAS dataset – Using CREATE TABLE statement • To save the query itself – Using CREATE VIEW statement • To sort dataset • To merge more than one datasets in a number of ways • To import dataset from Oracle Clinical to SAS • To enter new records into a SAS dataset • To modify/ edit the SAS dataset

  4. PROC SQL - Why • The Advantage of using SQL • Combined functionality • Faster for smaller tables • SQL code is more portable for non-SAS applications • Not require presorting • Not require common variable names to join on. (need same type , length)

  5. Performing Query – SELECT Statement • It is used to perform a query. It does not create any dataset. • The simplest SQL code, need 3 statements • By default, it will print the resultant query, use NOPRINT option to suppress this feature • Begin with PROC SQL, end with QUIT; not RUN; • Need at least one SELECT… FROM statement

  6. Performing Query – SELECT Statement To select all the variables use ‘*’ after SELECT statement PROC SQL; SELECT * FROM VITALS; QUIT;

  7. Performing Query – SELECT Statement To select only particular variable(s) write down the variable names after SELECT statement. Variable names should be separated by commas. PROCSQL; SELECT Patient, pulse FROM VITALS; QUIT;

  8. Performing Query – SELECT Statement To select only distinct observations and to delete duplicate observations. PROCSQL; SELECT DISTINCT Patient FROM VITALS; QUIT;

  9. Ordering/Sorting Query Results PROCSQL ; SELECT * FROM Vitals ORDER BY date; QUIT; Sorting by Date • SELECT * means we select all variables from dataset VITALS • Put ORDER BY after FROM.

  10. Subsetting: - Character searching in WHERE Print observations with name containing ‘J’. PROCSQL; SELECT * FROM vitals WHERE Name CONTAINS 'J'; QUIT; • Always put WHERE after FROM • CONTAINS in WHERE statement only for character variables

  11. Subsetting- Character searching in WHERE Print observations with name containing ‘o’ in between. PROCSQL; SELECT * FROM vitals WHERE Name LIKE ‘%o%'; QUIT; • LIKE in WHERE statement only for character variables

  12. Creating New Data • In SELECT, the results of a query are converted to an output object (printing). • Query results can also be stored as data. • The CREATE TABLE statement creates a table with the results of a query. • The CREATE VIEW statement stores the query itself as a view. Either way, the data identified in the query can beused in later SQL statements or in other SAS steps.

  13. Creating New Data - Create Table PROCSQL; CREATE TABLE bp AS SELECT patient, date, pulse FROM Vitals WHERE temp>98.5; QUIT; CREATE TABLE … AS… Statement Creates a New table from an existing table. PROCSQL; CREATE TABLE bp ASSELECT * FROM Vitals WHERE temp>98.5; QUIT; These statements will copy all the variables to the new dataset

  14. Creating New Data - Create Table We can also assign different variable name, Label, Length, and format name PROCSQL; CREATE TABLE bp AS SELECT patient AS Patient LABEL='Subject number' LENGTH =5, date AS Date LABEL='Date of Expt' FORMAT=WORDDATE8., pulse, temp FROM Vitals WHERE temp>98.5; QUIT;

  15. Creating New Data - Create View PROC SQL; CREATEVIEW bp AS SELECT patient, date, pulse, temp FROM Vitals; WHERE temp>98.5 QUIT; • First step-creating a view,no output is produced. • When a table is created, the query is executed and the resulting data is stored in a file. When a view is created, the query itself is stored in the file. The data is not accessed at all in the process of creating a view.

  16. Source variable New Variable Case Logic - reassigning/recategorize PROCSQL; CREATE TABLE BP AS SELECT Patient, Pulse, CASE Patient WHEN101THEN 'Male' WHEN102THEN 'Female' WHEN103THEN 'Female' ELSE 'Male' ENDAS Gender FROM Vitals; QUIT; • The order of each statement is important • CASE …END AS should in between SELECT and FROM • Use WHEN … THEN ELSE… to redefine variables • New variable GENDER is created from PATIENT.

  17. Combining Datasets: Joins InnerJoin Full Join If a and b; If a or b; Right Join Left Join If b; If a;

  18. Combining Datasets: Joins Dataset: Dosing

  19. Combining Datasets: Joins Dataset: Vitals

  20. Join Tables (Merge datasets) - Inner Join: Using WHERE PROCSQL; CREATETABLE new AS SELECT dosing.patient, dosing.date, dosing.med, vitals.pulse, vitals.temp FROM dosing, vitals WHERE dosing.patient=vitals.patient AND dosing.date=vitals.date; QUIT; • No prior sorting required – one advantage over DATA MERGE • Use comma (,) to separate two datasets in FROM • Without WHERE, all possible combinations of rows from each tables is produced, all columns are included

  21. Join Tables (Merge datasets) - Inner Join

  22. Join Tables (Merge datasets) - Left Joins using ON PROCSQL; CREATETABLE new1 AS SELECT dosing.patient, dosing.date, dosing.med, vitals.pulse, vitals.temp FROM dosing LEFT JOIN vitals ON dosing.patient=vitals.patient AND dosing.date=vitals.date; QUIT; Resultant dataset will contain all & only those observations which comes from DOSING dataset.

  23. Join Tables (Merge datasets) - Left Joins using ON

  24. Join Tables (Merge datasets) - Right Joins using ON PROCSQL; CREATE TABLE new1 AS SELECT dosing.patient, dosing.date, dosing.med, vitals.pulse, vitals.temp FROM dosing RIGHT JOIN vitals ON dosing.patient=vitals.patient AND dosing.date=vitals.date; QUIT; Resultant dataset will contain all & only those observations which comes from VITALS dataset.

  25. Join Tables (Merge datasets) - Right Joins using ON

  26. Join Tables (Merge datasets) - Full Joins using ON PROCSQL; CREATE TABLE new1 AS SELECT dosing.patient, dosing.date, dosing.med, vitals.pulse, vitals.temp FROM dosing FULL JOIN vitals ON dosing.patient=vitals.patient AND dosing.date=vitals.date; QUIT; Resultant dataset will contain all observation if they come from at least one of the datasets.

  27. Join Tables (Merge datasets) - Full Joins using ON

  28. SQL Functions • PROC SQL supports almost all the functions available to the SAS DATA step that can be used in a proc sql select statement • Common Functions: • COUNT • DISTINCT • MAX • MIN • SUM • AVG • VAR • STD • STDERR • NMISS • RANGE • SUBSTR • LENGTH • UPPER • LOWER • CONCAT • ROUND • MOD

  29. PROC SQL functions PROCSQL; SELECT avg(Age) AS mean, std(Age) AS sd, min(Age) AS min, max(Age) AS max, count(Age) AS count, N (Age) AS Count FROM sashelp.class; quit;

  30. PROC SQL functions PROCSQL; SELECT sex, avg(Age) AS mean, std(Age) AS sd, min(Age) AS min, max(Age) AS max, count(Age) AS count, N (Age) AS Count FROM sashelp.class; GROUP BY Sex quit;

  31. Editing Data – Deleting rows and Dropping columns /*Deleting rows*/ PROCSQL; DELETE FROM class WHERE age le 13; QUIT; /*Droping variables*/ PROCSQL; CREATE TABLE New (DROP=age) AS SELECT * FROM Class; QUIT; • Deleting columns can be done in SELECT or in DROP on created table

  32. Importing data from OC to SAS

  33. Importing data from OC to SAS

  34. Importing data from Excel to SAS PROCIMPORT DATAFILE="c:\myfile.xls" OUT=sasuser.accounts DBMS=EXCEL2000 REPLACE; SHEET='Prices'; GETNAMES=yes; RUN;

  35. Finale • PROC SQL is an additional tool with its own strengths and challenges • Many times it is just another way to do the same thing • BUT other times it might be much more efficient and may cut down the number of sorts, data steps & procedures or lines of code required.

More Related