1 / 48

PROC SQL – Select Codes To Master For Power Programming Codes and Examples from SAS.com

PROC SQL – Select Codes To Master For Power Programming Codes and Examples from SAS.com. Nethra Sambamoorthi, PhD Northwestern University Master of Science in Predictive Analytics Program . Data Processing Terminologies Across Data Sciences…. Why PROC SQL or What Can It Do For Analysts?.

erica
Download Presentation

PROC SQL – Select Codes To Master For Power Programming Codes and Examples from SAS.com

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. PROC SQL – Select Codes To Master For Power Programming Codes and Examples from SAS.com Nethra Sambamoorthi, PhD Northwestern University Master of Science in Predictive Analytics Program

  2. Data Processing Terminologies Across Data Sciences…

  3. Why PROC SQL or What Can It Do For Analysts? • Generate reports • Generate summary statistics • Retrieve data from tables or views • Combine data from tables or views • Create tables, views, and indexes • Update the data values in PROC SQL tables • Update and retrieve data from database management system (DBMS) tables • Modify a PROC SQL table by adding, modifying, or dropping columns • PROC SQL can be used in an interactive SAS session or within batch programs, and it • Can include global statements, such as TITLE and OPTIONS.

  4. An Example of Extracting, Summarizing, and Printing Using Data Step title 'Large Countries Grouped by Continent'; proc summary data=sql.countries; where Population > 1000000; class Continent; var Population; output out=sumPopsum=TotPop; run; procsort data=SumPop; by totPop; run; procprint data=SumPopnoobs; var Continent TotPop; format TotPop comma15.; where _type_=1; run; /* Extracting and summarizing */ /* Sorting to arrange the output */ /* Printing */

  5. Creating The Same Using PROC SQL procsql; title 'Population of Large Countries Grouped by Continent'; select Continent, sum(Population) as TotPop format=comma15. from sql.countries where Population gt 1000000 group by Continent order by TotPop; quit;

  6. Countries Table

  7. WordCityCoords Table

  8. USCityCoords Table

  9. UnitedStatesTable

  10. PostalCodes Table

  11. Worldtemps Table

  12. Oilprod Table

  13. OILRSRVS Table

  14. CONTINENTS Table

  15. FEATURES Table

  16. SELECT statement

  17. Three Important Aspects – Describe, Print, Quit /* Helps understand the structure of the table */ PROC SQL; Describe table sql.unitedstates; Quit;

  18. SELECT means PRINTING is Included Unless • SELECT * /* all columns */ • SELECT city, state /* specific columns */ • SELECT distinct continent /* specific columns but avoid dup */ So it is possible to run this

  19. The output is…

  20. Suppress column headings…

  21. Calculated columns and alias name…

  22. Retrieving Data From Multiple Tables • Means we are JOINING tables • If there is no JOIN statement, it means (1) Cartesian product of records [no subset condition ] or (2) inner joins [ we need some subset condition] • Alias names can be used for tables too; it helps simplify calling specific columns of a table

  23. SELECT … FROM table1, table2; A Cartesian Product

  24. Order the output from INNER JOIN INNER JOIN can be used explicitly

  25. INNER JOIN with comparison values on another column…

  26. Effect of Null Values on JOINS

  27. NOT MISSING option

  28. Multicolumn JOINS

  29. Columns are directly comparable between two tables… Capitals FROM sql.unitedstates City FROM sql.uscitycoord Postalcodes FROM sql.postalcodes

  30. Is it possible to do SELFJOIN?

  31. Two Types of OUTERJOIN – LEFTJOIN and RIGHTJOIN

  32. FULLJOIN …

  33. SPECIALTY JOINS

  34. NATURAL is applicable for both LEFT and RIGHT JOIN. The purpose is to reduce verbose to match on multiple common columns… Gives the same output; Non matching rows have missing values

  35. Use COALESCE to combine multiple columns to create new matching variables

  36. Using SUB QUERY or NESTED QUERY – SINGLE VALUE =

  37. Correlated SUBQUERY = NESTED QUERY

  38. Where “EXISTS” option

  39. Multiple NESTED QUERY

  40. Combine a JOIN with a SUBQUERY

  41. QUERY strategies…

  42. UNION is ROWWISE (PROC APPEND), while JOIN is COLUMNWISE (MERGE by) Keep the dups

  43. OUTER UNION = KEEP ONLY FROM – Key word EXCEPT

  44. To overlay data better: keyword CORRESPONDING

More Related