
Quick Data Summaries in SAS

  • Start by bringing in data

    • Use permanent data set for these examples

  • Proc Summary

    • Produces summaries relatively easily

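    • Designed to produce an output data set that can be manipulated further. ***This is a critical difference from Proc Tabulate, which is built primarily for formatted display***

    • Need to pre-sort the data by any “by” group variables

    • Need to print out the results: Proc Summary writes to its output data set and displays nothing by default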
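As a minimal sketch of "bringing in" a permanent data set: a libname statement points SAS at the folder where the data set is stored. The library name mylib, the folder path, and the data set name mydata are all hypothetical placeholders, not names from the slides.

```sas
/* Hypothetical library name and folder: substitute your own. */
libname mylib 'C:\sasdata';

/* List the variables in the permanent data set before summarizing it.
   mylib.mydata is an assumed name used only for illustration. */
proc contents data=mylib.mydata;
run;
```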



  • Basic Summary Syntax:

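/* Sort the most recently created data set by the "by" variables */
proc sort;
  by var1 var2;
run;

/* Summarize variable3 within each var1 x var2 group */
proc summary;
  by var1 var2;
  var variable3;
  output out=new_table mean=mean_name n=n_name /* …. more statistic=name pairs */;
run;

/* Print the summary table (new_table is now the most recently created data set) */
proc print;
run;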
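A concrete version of the same pattern may help. The sketch below uses sashelp.class, a sample data set that ships with SAS, instead of the presenter's own data. The output names class_sorted, height_stats, mean_height and n_height are made up for illustration. Because the sashelp library is read-only, the sort writes to a new data set with out=.

```sas
/* Sort a copy of the sample data by the grouping variable */
proc sort data=sashelp.class out=work.class_sorted;
  by sex;
run;

/* One output row per value of sex, holding the mean and count of height */
proc summary data=work.class_sorted;
  by sex;
  var height;
  output out=work.height_stats mean=mean_height n=n_height;
run;

/* Proc Summary displays nothing by default, so print the output data set */
proc print data=work.height_stats;
run;
```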


Statistics available in Proc Summary

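  • Mean (mean), n (n), standard deviation (std), variance (var), coefficient of variation (cv), sum (sum)

  • Minimum (min), maximum (max), range (range), number of missing observations (nmiss), median (median)

  • The keyword in parentheses is what you write in the output statement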
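As an illustration, each statistic listed above can be requested in a single output statement by pairing its keyword with a name for the new variable. This sketch again assumes the sashelp.class sample data set. The names on the right of each equals sign are arbitrary.

```sas
/* No by statement here, so the output data set has a single row */
proc summary data=sashelp.class;
  var weight;
  output out=work.weight_stats
    mean=mean_wt n=n_wt std=sd_wt var=var_wt cv=cv_wt sum=sum_wt
    min=min_wt max=max_wt range=range_wt nmiss=nmiss_wt median=median_wt;
run;

proc print data=work.weight_stats;
run;
```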


Some Quirks of Proc Summary

  • Whenever you use proc summary, it adds two new variables to the output data set: _type_ and _freq_ (note the underscores at the beginning and end of the variable names)

    • _freq_ gives the number of observations summarized in each output row

    • _type_ identifies which combination of class variables each output row summarizes; with only a by statement and no class statement, it is always 0

  • You can ignore these variables in virtually all cases

  • You need to remember which dataset is the “active” one, or specify the dataset that summary will operate on with the data= option

    • By default, the active dataset is the most recently created dataset
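The sketch below shows these quirks on the sashelp.class sample data set, which is an assumption and not the data used in the presentation. It names the input explicitly with data=, and it uses a class statement in place of a by statement so that _type_ takes more than one value: 0 for the overall row and 1 for the rows broken out by sex. The nway option and the drop= data set option in the second step are one way to discard what you do not need.

```sas
/* data= names the input explicitly, so the "active" data set does not matter */
proc summary data=sashelp.class;
  class sex;          /* class, unlike by, does not require sorted data */
  var height;
  output out=work.by_sex mean=mean_height;
run;

/* Inspect _type_ and _freq_ alongside the requested statistic */
proc print data=work.by_sex;
run;

/* Same summary, keeping only the per-group rows (nway)
   and dropping the two automatic variables */
proc summary data=sashelp.class nway;
  class sex;
  var height;
  output out=work.by_sex_clean (drop=_type_ _freq_) mean=mean_height;
run;
```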


Shannon’s Diversity Index

H = -∑ p_i ln(p_i), summed over all species i, where p_i is the proportion of individuals that belong to species i
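One way the index could be computed with Proc Summary is sketched below. The data set work.counts, its variables site, species and count, and the four data lines are invented for illustration. The approach is to total the counts per site, convert each count to a proportion, and then sum p * log(p) per site (log is the natural logarithm in SAS).

```sas
/* Made-up example data: counts of each species at each site */
data work.counts;
  input site $ species $ count;
  datalines;
A oak 10
A pine 10
B oak 18
B pine 2
;
run;

proc sort data=work.counts;
  by site;
run;

/* Total number of individuals at each site */
proc summary data=work.counts;
  by site;
  var count;
  output out=work.totals (drop=_type_ _freq_) sum=total;
run;

/* Attach the site total to every species row and compute p * ln(p) */
data work.terms;
  merge work.counts work.totals;
  by site;
  p = count / total;
  if p > 0 then term = p * log(p);
  else term = 0;      /* a species with zero count contributes nothing */
run;

/* Sum the terms within each site */
proc summary data=work.terms;
  by site;
  var term;
  output out=work.sums (drop=_type_ _freq_) sum=sum_term;
run;

/* H is the negative of that sum */
data work.shannon;
  set work.sums;
  H = -sum_term;
  drop sum_term;
run;

proc print data=work.shannon;
run;
```

Site A has two species in equal proportions, so its H equals ln(2), about 0.693. Site B is dominated by one species and has a lower H.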

