1 / 20

Part 2 UNC – Charlotte Department of Accounting

WRDS Workshop. Part 2 UNC – Charlotte Department of Accounting. By : Keejae Hong. COMPUSTAT. Compustat changed variable names (#18 -> ib ) and database structure a few years ago. Tables , funda or fundq (old format compann ( compqtr ,) Industrial Annual

tara-kelley
Download Presentation

Part 2 UNC – Charlotte Department of Accounting

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. WRDS Workshop Part 2 UNC – Charlotte Department of Accounting By: Keejae Hong

  2. COMPUSTAT • Compustat changed variable names (#18 -> ib) and database structure a few years ago. • Tables, fundaor fundq (old format compann(compqtr,) • Industrial Annual • NYE, American Stock Exchange, S&P500, utility subsidiaries • Primary, Supplementary and Tertiary (NO distinction in WRDS) • Full Coverage Annual • NASDAQ companies, firms listed in regional exchanges • Backdata (20 year + 20 years Industrial and Full coverage ) • Research • We don’t have to worry about this in WRDS.

  3. COMPUSTAT • Unique Identifier • Gvkey • CUSIP is not an identifier • Annual Compustat (Gvkeyyeara)  (gvkeydatadate) + others ( if indfmt='INDL'; • if datafmt='STD'; • if POPSRC='D'; • if Consol='C'; • Quarterly Compustat (Gvkeyyear qtr)  (gvkeydatadate) + others • Delete duplicates to begin with… • CUSIP available now: Cnum (6digit) + first 2 digits of CIC(3digit) = CUSIP (8digit number) • CUSIPstands for Committee on Uniform Securities Identification Procedures. A CUSIP number identifies most securities. • When there is a change in Cnum (or cusip), compustat changes all previous years’cnum retrospectively.  this may create a problem when you combine the data with CRSP or IBES Per-share-basis numbers (e.g. EPS ) are reported as shown on the current financial statements. In other words, they are unadjusted numbers. Use adjustment factor (#27  ajx, annual compustat) to adjust number. • Firms with fiscal year end January through May have yeara +1 will give the calendar year. For example, • CONAME FYR YEARA • AAA 12 1994  199401 - 199412 • BBB 9 1994  199310 - 199409 • CCC 5 1994  199406 - 199505

  4. COMPUSTAT

  5. CRSP • Many different Tables (mse, msf, and etc.) • Unique Identifier • PERMNO • CUSIP or NCUSIP is not an identifier • Use CUSIP(current cusip) to merge crsp data with compustat • Use NCUSIP (historical cusip) to merge crsp data with IBES • Web Interface option often gives you limited choice. Be prepared to use crsp tables directly when needed.

  6. IBES • Unique Identifier • IBES Ticker (not the same as the stock market ticker, which is referred as “official ticker”) • CUSIP in IBES is a historical CUSIP. • Forecast numbers are already split-adjusted. BE VERY CAREFUL when you merge IBES and COMPUSTAT data. • You don’t want to merge 1993 IBES with 1995 IBES. • Unadjusted numbers are available at request • SAS code available through wrds

  7. WRDS Data Structure • Relational Database • Stores data in a structure consisting of one or more tables of rows and columns, which may be interconnected. • A row corresponds to a record; columns correspond to attributes (fields) in the record. • Each database (i.e., COMPUSTAT or CRSP) consists of many tables. • Each table consists of many rows and columns.

  8. SQL (Structured Query Language) • The advantage of using SQL • Combined functionality • Faster for smaller tables • Not require presorting • Not require common variable names to join on (need same type, length)

  9. Proc SQL

  10. 1. Creating a new variable of grand mean proc sql; create table kids1 as select *, mean(age) as mean_age from kids; quit; proc print data=kids1 noobs; run;

  11. 2. Creating a new variable of group mean proc sql; create table kids2 as select *, mean(age) as mean_age from kids group by famid; quit;

  12. 3. Creating multiple variables of summary statistics at once proc sql; create table kids3 as select famid, mean(age) as mean_age , std(age) as std_age, mean(wt) as mean_wt, std(wt) as std_wt from kids group by famid; quit;

  13. 4. Creating multiple summary statistics variables in the original data set proc sql; create table fam5 as select *, mean(age) as mean_age, std(age) as std_age, mean(wt) as mean_wt, std(wt) as std_wt from kids group by famid order by famid, kidname desc; quit;

  14. Proc SQL : one-to-one merge proc sql; create table dadfam1 as select * from dads,faminc where dads.famid=faminc.famid order by dads.famid; quit;

  15. Proc SQL : one-to-many merge proc sql; create table dadkid2 as select * from dads, kids where dads.fid=kids.famid order by dads.fid, kids.kidname; quit;

  16. Using full join to handle mismatching records in a one-to-one merge proc sql; create table dadkid3 as select * from dads, faminc where dads.famid=faminc.famid order by dads.famid;

  17. Using full join to handle mismatching records in a one-to-one merge proc sql; create table dadkid4 as select *, (dads.famid=faminc.famid) as indic, (dads.famid ~=.) as dadind, (faminc.famid ~=.) as famind, coalesce(dads.famid, faminc.famid) as fid from dads full join faminc on dads.famid=faminc.famid; quit;

  18. Producing all the possible distinct pairs of the values in a column proc sql; create table discity as select distinct city from places; quit;

  19. All the possible distinct pairs of the values in a column (con’t) proc sql; create table pair_places as select f1.city as orig , f2.city as dest from discity as f1 , discity as f2 where f1.city ne ' ' & f1.city < f2.city order by f1.city, f2.city; quit;

  20. Source • http://www.ats.ucla.edu/stat/SAS/modules/sqlmerge.htm • SAS Reference books • The little SAS book by Delwiche and Slaughter • Using SAS in Financial Research by Boehmer, Broussard, Kallunki • http://belkcollegeofbusiness.uncc.edu/khong5/workshop_sch.htm

More Related