Evolution of Data Documentation


donny

Presentation Transcript


  1. Evolution of Data Documentation

  2. In the beginning… …was the codebook.

  3. …early digital codebooks… Codebook listed to tape

  4. …early digital codebooks… OSIRIS Dictionaries

  5. …early digital codebooks… SPSS (and SAS) code

  6. …early digital codebooks… PDFs

  7. What do early digital codebooks have in common? 1. Tied to a particular physical layout of a data file. Example: VARIABLE 6 OPINION OF COUNTRY OVERALL DECK 1/35

  8. What do early digital codebooks have in common? 1. Tied to a particular physical layout of a data file. 2. Each uses its own special syntax. Examples: VARIABLE 6 OPINION OF COUNTRY OVERALL DECK 1/35; D HUFAMINC 2 39; CITY $ 77-94

  9. What do early digital codebooks have in common? 3. Some included information intended for human consumption.

     Q1. THINKING ABOUT THE COUNTRY OVERALL, DO YOU THINK THINGS IN THE U.S. ARE GENERALLY GOING IN THE RIGHT DIRECTION, OR DO YOU FEEL THINGS ARE SERIOUSLY OFF ON THE WRONG TRACK?

     VALUE LABEL        VALUE   N OF CASES
     ---------------    -----   ----------
     RIGHT DIRECTION      1        223
     WRONG TRACK          2        237
     NO OPINION           8         48
     NOT APPLICABLE*      9        500
                                -------
     TOTAL                        1008

     *NOT FORM A
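To make the "tied to a physical layout" point concrete, here is a minimal Python sketch of what reading such a file involves. It is an illustration only, not taken from the presentation: the `LAYOUT` and `VALUE_LABELS` names, the helper functions, and the sample record are hypothetical, and reading "DECK 1/35" as "deck 1, column 35" is an assumption.

```python
# Illustrative only: a column layout in the spirit of "CITY $ 77-94".
# Column numbers are 1-based and inclusive, as in the codebook examples.
LAYOUT = {
    "OPINION": (35, 35),  # assumption: "DECK 1/35" read as deck 1, column 35
    "CITY": (77, 94),     # from the "CITY $ 77-94" example
}

# Value labels copied from the Q1 frequency table on the slide.
VALUE_LABELS = {
    "OPINION": {
        "1": "RIGHT DIRECTION",
        "2": "WRONG TRACK",
        "8": "NO OPINION",
        "9": "NOT APPLICABLE",
    }
}


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width record into named fields."""
    return {
        name: line[start - 1:end].strip()
        for name, (start, end) in layout.items()
    }


def apply_labels(record, labels=VALUE_LABELS):
    """Replace coded values with their labels where a label is known."""
    return {
        name: labels.get(name, {}).get(value, value)
        for name, value in record.items()
    }


# A made-up 94-column record: code "2" in column 35, a city in columns 77-94.
sample = " " * 34 + "2" + " " * 41 + "ANN ARBOR".ljust(18)
record = parse_record(sample)
labelled = apply_labels(record)
```

Every detail here (column positions, codes, labels) had to be typed in by hand from the codebook, and it stops working as soon as the physical file layout changes. That is the re-keying burden the following slides describe.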

  10. Problems of early digital codebooks (part 1) [Diagram labels: PDF, Osiris, Osiris dictionary, SPSS cards, SPSS, CBLT, Book]

  11. Machine “readable” but not machine “actionable” (the user has to re-create information in order to re-use it) [Diagram labels: PDF, Osiris, Osiris dictionary, SPSS cards, SPSS, CBLT, Book]

  12. XML helps solve the problem • XML is not tied to any single piece of software. • XML is designed to be easily parsed by computer. • XML is (to some extent) self-documenting or self-descriptive. • XML can include information intended both for humans and machines. • XML is non-proprietary, open, flexible.

  13. XML helps solve the problem • Many tools exist to read/convert XML (Java, JavaScript, Perl, PHP, etc.). • XSL and XSLT were created explicitly for converting XML. With them XML can be converted to HTML, PDF, other XML, etc. • XML is highly structured, so it can be predictably converted.
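As a rough illustration of the points on the two XML slides, the Python sketch below parses a small XML codebook fragment with the standard library and converts it to HTML. The element and attribute names (`codebook`, `variable`, `label`, `category`) are hypothetical and are not the DDI schema; the variable text is borrowed from the earlier codebook example.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical markup for illustration; NOT the actual DDI element names.
CODEBOOK_XML = """\
<codebook>
  <variable name="V6">
    <label>OPINION OF COUNTRY OVERALL</label>
    <category value="1">RIGHT DIRECTION</category>
    <category value="2">WRONG TRACK</category>
  </variable>
</codebook>
"""


def read_variables(xml_text):
    """Parse the codebook into a plain dictionary keyed by variable name."""
    root = ET.fromstring(xml_text)
    variables = {}
    for var in root.findall("variable"):
        variables[var.get("name")] = {
            "label": var.findtext("label"),
            "categories": {
                cat.get("value"): cat.text for cat in var.findall("category")
            },
        }
    return variables


def to_html(variables):
    """Render the same information as a simple HTML fragment for people."""
    parts = []
    for name, info in variables.items():
        parts.append(f"<h2>{html.escape(name)}: {html.escape(info['label'])}</h2>")
        parts.append("<dl>")
        for value, text in info["categories"].items():
            parts.append(f"<dt>{html.escape(value)}</dt><dd>{html.escape(text)}</dd>")
        parts.append("</dl>")
    return "\n".join(parts)


variables = read_variables(CODEBOOK_XML)
page = to_html(variables)
```

The same source serves both audiences: `read_variables` gives a program something it can act on, while `to_html` produces a page for human readers. In practice an XSLT stylesheet could do the second step, as the slide notes.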

  14. DDI 1 and 2: Built to emulate early code BOOKS and digital codebooks… • 1.0 DOCUMENT DESCRIPTION • 2.0 STUDY DESCRIPTION • 3.0 DATA FILES DESCRIPTION • 4.0 VARIABLE DESCRIPTION • 5.0 OTHER STUDY-RELATED MATERIALS

  15. Problems of early digital codebooks (part 2) • Static, inflexible. • Meant to document the end point of research -- views research as linear. • Hard to re-use the information for new research.

  16. Problems of DDI 1 and 2 • Emulated the Code Book • Not flexible enough • We could do so much more…

  17. Three Stages of Technological Change

  18. Three Stages of Technological Change

  19. Three Stages of Technological Change

  20. DDI 1 and 2: Document Description • Study Description • Data Files Description • Variable Description • Other Study-Related Materials

  21. DDI 1 and 2: Document Description • Study Description • Data Files Description • Variable Description • Other Study-Related Materials. DDI 3: Study Concept • Data Collection • Data Processing • Data Distribution • Data Archiving • Data Discovery • Data Analysis • Repurposing

  22. Life Cycle of Research, Data, Documentation

  23. A modular approach • Study Unit - Research question - Funding - Concepts - Background research

  24. A modular approach • Study Unit • Data Collection - Instrument - Data collection process - Questionnaire

  25. A modular approach • Study Unit • Data Collection • Logical Product - Intellectual content of data - Relationship to questions and concepts - Relationship to processing (recodes, weighting, derivations, imputations)

  26. A modular approach • Study Unit • Data Collection • Logical Product • Physical Data Product - Describes the structure (microdata, tabular, aggregate, Ncube…) (e.g., STF 1A)

  27. A modular approach • Study Unit • Data Collection • Logical Product • Physical Data Product • Physical instance - Each describes a single data file (e.g., STF 1A by state: each state is an instance)

  28. A modular approach • Study Unit • Data Collection • Logical Product • Physical Data Product • Physical instance • “Instance” - An instance module “wraps” the other modules. Like a table of contents for a group of studies, files, and modules, it brings everything together.

  29. A modular approach • Study Unit • Data Collection • Logical Product • Physical Data Product • Physical instance • “Instance” • Archive - Each archive can add its own local information with an archive module.

  30. A modular approach • Study Unit • Data Collection • Logical Product • Physical Data Product • Physical instance • “Instance” • Archive

  31. A modular approach (but wait… there’s more!) • Group module - Describe concepts, questions, and variables that occur in several studies. - Describe a series (e.g., CBP, CPS, Eurobarometer). - Describe a collection of studies (not a series) and identify the common comparable concepts, questions, and variables.

  32. A modular approach (but wait… there’s more!) • Group module • Comparative module - The Comparative module contains information for comparing concepts, questions, and variables between or among Study Units that have been housed in a Group.

  33. A modular approach (but wait… there’s more!) • Group module • Comparative module • Conceptual components module - Describe concepts and their relationships as concept groups. - Can use known vocabularies and indicate the level of similarity between two concepts by describing the extent of difference.

  34. A modular approach: Study Unit • Data Collection • Logical Product • Physical Data Product • Physical instance • “Instance” • Archive • Group module • Comparative module • Conceptual components module
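One way to picture how a few of these modules nest is the loose Python sketch below. It is not the DDI 3 schema: the class names only echo the module names on the slides, all fields and methods are hypothetical, and most modules (Data Collection, Logical Product, Archive, and so on) are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalInstance:
    """One data file, e.g. one state's file in a by-state release."""
    file_name: str


@dataclass
class StudyUnit:
    """A single study with its concepts and the files that hold its data."""
    title: str
    concepts: list[str] = field(default_factory=list)
    physical_instances: list[PhysicalInstance] = field(default_factory=list)


@dataclass
class Group:
    """Several studies (a series or a collection) described together."""
    name: str
    study_units: list[StudyUnit] = field(default_factory=list)

    def shared_concepts(self) -> set[str]:
        """Concepts that occur in every study of the group."""
        sets = [set(study.concepts) for study in self.study_units]
        return set.intersection(*sets) if sets else set()


@dataclass
class Instance:
    """Wrapper that works like a table of contents over the other modules."""
    groups: list[Group] = field(default_factory=list)

    def table_of_contents(self) -> list[str]:
        lines = []
        for group in self.groups:
            lines.append(group.name)
            for study in group.study_units:
                lines.append(f"  {study.title}")
                for inst in study.physical_instances:
                    lines.append(f"    {inst.file_name}")
        return lines


# Made-up studies and file names, purely for illustration.
wave1 = StudyUnit("Wave 1", ["employment", "income"], [PhysicalInstance("wave1.dat")])
wave2 = StudyUnit("Wave 2", ["employment", "health"], [PhysicalInstance("wave2.dat")])
series = Group("Example series", [wave1, wave2])
toc = Instance([series]).table_of_contents()
```

The design point carried over from the slides is reuse: each module is described once, and the wrapping instance and the group refer to it rather than copying it.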
