SDA: a tool for teaching and research with microdata

Laine Ruus, laine.ruus@utoronto.ca. University of Toronto, Data Library Service. 2007/05/17.

Presentation Transcript


  1. SDA: a tool for teaching and research with microdata Laine Ruus laine.ruus@utoronto.ca University of Toronto. Data Library Service 2007/05/17

  2. What this poster covers: • Introduction • Demo of main SDA capabilities • Advantages and disadvantages for teaching and research • Common questions about SDA • What we know to be coming in future SDA editions

  3. SDA@UT is brought to you by: • University of California, Berkeley. Computer-assisted Survey Methods Program (CSM) – writes and supports the server-side software • University of Toronto. Centre for Computing in the Humanities and Social Sciences (CHASS) – provides the hardware, buys the software, and provides system support wetware • University of Toronto. Libraries – provides the budget to purchase the data, and care, feeding and user support wetware

  4. Our experience with SDA • CHASS installed SDA in the fall of 2004 • At last count, we have 474 data files in SDA • Some have only the metadata that was generated from the original syntax files (SAS/SPSS/Stata), but a number also have full question text • Most are microdata, but a few are aggregate statistics (census files) • A number of voracious users now expect to find the data in SDA

  5. Advantages for teaching: • Stable environment, 24x7 access • Very easy to explain to novice users • Reduces/eliminates need for computer labs or statistical software • Teach statistics rather than software • Students get hands on data quickly • Switch easily between weighted and unweighted distributions
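
The weighted versus unweighted switch mentioned on this slide can be illustrated outside SDA. The following is a minimal Python sketch, not SDA's own interface (SDA is used through a web browser); the records, category labels and weights are invented for illustration. The unweighted distribution counts each respondent once, while the weighted distribution sums the sampling weights.

```python
from collections import Counter, defaultdict

# Hypothetical survey records: (response category, sampling weight).
records = [
    ("yes", 1.0),
    ("yes", 1.0),
    ("no", 3.0),
    ("no", 1.0),
]


def unweighted_distribution(rows):
    """Share of respondents in each category, counting each row once."""
    counts = Counter(category for category, _ in rows)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def weighted_distribution(rows):
    """Share of the estimated population in each category, summing weights."""
    sums = defaultdict(float)
    for category, weight in rows:
        sums[category] += weight
    total = sum(sums.values())
    return {category: s / total for category, s in sums.items()}


unweighted = unweighted_distribution(records)
weighted = weighted_distribution(records)
print("unweighted:", unweighted)
print("weighted:  ", weighted)
```

In this invented sample the two categories are evenly split among respondents, but one "no" respondent carries a larger weight, so "no" takes a larger share of the weighted distribution.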

  6. Advantages for teaching (2): • Measures of association and tests of significance comparable to SAS • Design effects, where cluster/sample variables are available • Interactive demonstration of statistical concepts • Share recoded variables • Can quickly mount additional data to fulfill your teaching needs

  7. Advantages for research: • Stable environment, 24x7 access • Access to latest available version of the data • Basic exploratory data analysis, e.g. are there enough cases for my subset? • Download data and import to SAS/SPSS/Stata on own workstation • Share recoded variables • Integrated variable descriptions (selected data files)

  8. Advantages for data management: • Creates metadata from SAS/SPSS/Stata syntax or DDI-format XML files • Very easy and fast to import files with good syntax files • Control over what users can and cannot do • Outputs include SAS/SPSS/Stata syntax or DDI-format XML files • Overhead: size of uncompressed data + about 50%

  9. Disadvantages of SDA: • Can’t search for variables/values within/between data files (yet) • Can’t download created/recoded variables (yet) • No random sampling function • Graphics minimal, e.g. no stem-and-leaf plots, box plots, etc. • Can only output to Word/Excel from IE, not from Netscape/Mozilla/Firefox • Doesn’t output SAS/SPSS/Stata system/export files • Little support for Study/File level metadata (DDI) • No support for nCubes (DDI 2)

  10. Common questions from researchers & students: • When to weight versus not to weight • Does it only do cross-tabs? • But I want the raw data, not a cross-tabulation! • Why can’t I get a cross-tab of this [eg continuous income] variable? • Differences between syntax, data, and system files.
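
The question about cross-tabulating a continuous variable such as income usually comes down to recoding it into a few categories first. Below is a minimal Python sketch of that idea, not SDA's recode syntax; the records, cut points and labels are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical records: (household income in dollars, region).
records = [
    (12_000, "East"),
    (38_500, "East"),
    (41_250, "West"),
    (97_300, "West"),
    (20_000, "West"),
]

# Hypothetical cut points: lower bounds of the second and third income groups.
CUT_POINTS = [20_000, 50_000]
LABELS = ["under 20k", "20k to under 50k", "50k and over"]


def income_group(income):
    """Map a continuous income to one of a few ordered categories."""
    return LABELS[bisect_right(CUT_POINTS, income)]


# Cross-tabulate income group by region: count the cases in each cell.
crosstab = Counter((income_group(income), region) for income, region in records)

for (group, region), n in sorted(crosstab.items()):
    print(f"{group:18} {region:5} {n}")
```

Without the recode, every distinct income value would become its own row of the table, which is why a cross-tab of the raw variable is rarely useful.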

  11. An application we wouldn’t have tackled without SDA: • Q: I need the average expenditure on eye care in Canada by age group of household head for as long a time-period as possible. • A: Once we explained SDA, the student had generated these statistics from each of the FAMEX/SHS files, 1969-2004, in under 30 minutes. (He knew only Stata.)
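
The statistic requested in this example is a mean broken down by a grouping variable, repeated for each survey year. The sketch below shows the calculation in Python under stated assumptions: it is not how SDA computes it, the records and weights are invented rather than taken from FAMEX/SHS, and the use of household weights is an assumption.

```python
from collections import defaultdict

# Hypothetical records: (survey year, age group of household head,
# eye-care expenditure in dollars, household weight). Values are invented.
records = [
    (1969, "under 35", 10.0, 2.0),
    (1969, "under 35", 20.0, 1.0),
    (1969, "65 and over", 30.0, 1.0),
    (2004, "under 35", 100.0, 1.0),
    (2004, "65 and over", 200.0, 3.0),
    (2004, "65 and over", 120.0, 1.0),
]


def weighted_means(rows):
    """Weighted mean expenditure for each (year, age group) cell."""
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for year, age_group, spend, weight in rows:
        weighted_sum[(year, age_group)] += spend * weight
        weight_total[(year, age_group)] += weight
    return {cell: weighted_sum[cell] / weight_total[cell] for cell in weighted_sum}


means = weighted_means(records)
for (year, age_group), mean in sorted(means.items()):
    print(year, age_group, round(mean, 2))
```

Each cell is the sum of weight times expenditure divided by the sum of weights, so households with larger weights count for more of the estimated average.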

  12. Functions we know to be coming in SDA • Within and between file variable searching • Will allow users to load own data files (Archiver in SDA 3.1) – we have not played with this yet

  13. Questions: • Question 1: Where will I find the SDA server at University of Toronto? • Answer 1: The URL is: http://www.chass.utoronto.ca/datalib/ Select ‘Microdata analysis and extraction’

  14. Questions (cont’d): • Question 2: How are files chosen to be mounted on the SDA server at UT? • Answer 2: All significant Canadian microdata files, e.g. by Statistics Canada, as released by DLI; other files based on faculty/student requests

  15. Questions (cont’d): • Question 3: My research is being done collaboratively with a colleague at another Canadian university. Can my colleague get access to SDA? • Answer 3: SDA is available as a subscription service to other Canadian DLI-member universities and colleges. Current subscribers include: U of Victoria, Ryerson U, and Memorial U