
The Data Documentation Initiative (DDI)

The Data Documentation Initiative (DDI). Ron Nakao, Social Science Data and Software (SSDS), Stanford University Libraries. With input from Gretchen Gano, Sanda Ionescu, Jim Jacobs, Nancy McGovern, Wendy Thomas, and Mary Vardigan. Presented to the DLF Fall Forum 2007, Philadelphia, PA.


Presentation Transcript


  1. The Data Documentation Initiative (DDI): A Metadata Specification for Social Science Data • Ron Nakao, Social Science Data and Software (SSDS), Stanford University Libraries • With input from Gretchen Gano, Sanda Ionescu, Jim Jacobs, Nancy McGovern, Wendy Thomas, Mary Vardigan • Presented to the DLF Fall Forum 2007 - Philadelphia, PA

  2. Presentation Overview • What is the DDI? • What is the DDI Alliance? • A taste of the DDI specification • Futures

  3. What is the DDI? “The Data Documentation Initiative (DDI) is an effort to establish an international XML-based standard for the content, presentation, transport, and preservation of documentation for datasets in the social and behavioral sciences.”

  4. What is the DDI Alliance? • Host Institutions • Member Institutions • Organization Structure • Director • Steering Committee • Expert Committee • Working Groups

  5. DDI Alliance Host Institutions and Associations • Inter-University Consortium for Political and Social Research (ICPSR) - b.1962 • Roper Center for Public Opinion Research - b.1946+ • Council of European Social Science Data Archives (CESSDA) - b.1976 • International Federation of Data Organizations (IFDO) - b.1977 • International Association of Social Science Information, Service, and Technology (IASSIST) - b.1974+

  6. DDI Alliance Member Institutions (30) • University of Alberta, Canada • University of California, Berkeley -- Computer-Assisted Survey Methods Program and UCDATA • University of California, California Digital Library • Centre for Survey Research and Methodology (ZUMA) • Centro De Investigaciones Sociologicas (CIS), Spain

  7. DDI Alliance Member Institutions (30) • CEPS/INSTEAD -- Luxembourg • Danish Data Archive • Data Archiving and Networked Services (DANS), The Netherlands • Emory University • Finnish Social Science Data Archive • German Socio-Economic Panel Study (SOEP) • University of Guelph, Canada • Harvard-MIT Data Center

  8. DDI Alliance Member Institutions (30) • Inter-university Consortium for Political and Social Research (ICPSR) • Massachusetts Institute of Technology (MIT) • University of Minnesota • National Opinion Research Center (NORC) • Norwegian Social Science Data Service (NSD) • Open Data Foundation, Tucson, Arizona • Princeton University • Roper Center • Stanford University

  9. DDI Alliance Member Institutions (30) • University of Surrey, United Kingdom • Swedish Social Science Data Service (SSD) • Swiss Data Archive for the Social Sciences (SIDOS) • United Kingdom Data Archive (UKDA) • University of Wisconsin • World Bank, Development Data Group (DECDG) • Yale University • Zentralarchiv fuer Empirische Sozialforschung, University of Koeln

  10. DDI Alliance Structure • Technical Implementers Committee • Director • Controlled Vocabularies • Steering Committee • Expert Committee (Voting members + observers) • Qualitative Data • Outreach and Usability • Aggregate, Geography & Time • Comparative Data/Families of Datasets • Instrument Documentation

  11. What is the DDI specification? • A word about social science data & codebooks • DDI 1 & 2 • Data Life Cycle & DDI 3

  12. Here’s some Data… 00100 1 D10 9999004924100470150049783005023700510840052982005469900556410057759005778500587170059279006230000641310064859006871800710760072313007503300765530079731008055200976350139703016589301907750227186023402202644340325362000000076292 00100 1A D10 9999003395200344030034779003505500353130035740003639400370300037711003842100390550039919004023700408420041169004208700426620042901004305500414920040699004080500410920044523004475400496720050359004810500473330049092000000076292 00100 1B D10 9999000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000034140006909001011300112370011405001090400108220011315001213576292 00100 1C D10 9999000156000016580001672000171300017770001891000184500018800002278000231300025570003250000357000041590003795000394000041970005419000637300057480006488000672600076970006896000686700074410010829001477800206060021973001933376292 00100 1D D10 9999001372900109540013332001346900139940015351001646000167310017770001705100171050016110001849300191300019895002269100242170023993002560500293130032544003302100454320081375010415901224250154593016023501856730242982000000076292 00100 1 SS0 9999004924100470150049783005023700510840052982005469900556410057759005778500587170059279006230000641310064859006871800710760072313007503300765530079731008055200976350128679015279901581420185555019990502276000267854000000076292 00100 1A SS0 9999003395200344030034779003505500353130035740003639400370300037711003842100390550039919004023700408420041169004208700426620042901004305500414920040699004080500410920041012004122000411750041133004109200407390040416000000076292 00100 1B SS1 9999000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000341420063634009314800931490093149009314800931500093147009315076292 00100 1C SS1 
9999001560300165790016715001712900177740018914001844800187960022777002312800255730032498003570200415870037950003940400419720054185006372500574800064883006725700769680063514006324700616810088444012624001773610180887014839476292 00100 1D SS0 9999001372900109540013332001346900139940015351001646000167310017770001705100171050016110001849300191300019895002269100242170023993002560500293130032544003302100454320074952009593901014840126263013687401598100200035000000076292 0010334 I62 9999000000000000000000000000167200017610001837000195300020640002163000213300022350002390000251000026810002870000313100034470003739000402200043020004692000510000054890006141000697200079410008859001000000115010012998000000076292 0010334 X51 9999000000000000000000000000000000000530000043000006300000570000048-00001400000480000069000005000000680000070000009100001010000085000007600000700000091000008700000760000119000013500001390000116000012900001500000130000000076292 0010664 62 9999000000000000000000000000000000000000003312000335000033950003497000363400037880003913000402700041320004281000445500046590004888000513700053540005591000587800062300006598000698200076510008821001000000111020012344000000076292 0010664 X51 9999000000000000000000000000000000000000000000000001100000130000030000003900000420000033000002900000260000036000004100000460000049000005100000420000044000005100000600000059000005800000960000153000013400001100000112000000076292 0010770 D32 9999000534500054630005661000773800074450007554000777700084850009454001015000096630010255001145900119760012518001374300153960016712001836000192730021524002466400283720031735003768100524250077311007961200906610102778000000076292 0010771 D32 9999000606100059570005903000817400080510007672000796600089170009847001082200101320010723001201600125750013276001446000162190017616001938900203420022653002581700297110033208003892000536110078556008140500923040105775000000076292 0010774 D12 
9999000000000000000000000000000000000000004039000397900039840004087000418100040620004023000407300040900004070000411800042020004266000435800043670004335000448300047300004989000542600066710009176001000000101620011100000000076292 0010775 D12 9999000000000000000003832000481800046770004340000428800043290004444000457500042740004156000419700041730004130000418100042700004323000438800043850004342000443500047710005006000539800065790009243001000000101060011001000000076292 0010776AXD61 1975000000000000000000000000000000000000000000000000000000000000000000057000005190000512000051200004920000482000051200005470000536000055300005180000514000055500005750000542000061800009520001216000100000011230001353000128976292 0010776IAD61 1975000020600002030000243000027700002030000166000015900001580000170000025100001710000145000015300001420000145000041500002860000103000009100000990000097000016400001830000221000036200004700001461000100000005640000395000038176292 0010776IAZ 2 9999000042300004160000498000056700004170000341000032600003240000348000051500003500000297000031400002910000298000085000005870000212000018600002030000198000033700003750000453000074300009630002996000205000011570000810000078176292

  13. And Here’s the Codebook

  14. A digital Codebook (pdf)
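  Note on slides 12-14: the raw records are only usable together with a codebook that says which columns hold which variable. The sketch below shows that idea in Python. The field names and column positions are invented for illustration (the actual codebook is not reproduced in this transcript), and the spacing of the records may have been altered when the slide was transcribed, so treat the layout as an assumption rather than a decoding of the real file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One codebook entry: a named slice of a fixed-width record."""
    name: str
    start: int             # 1-based first column, as codebooks usually state it
    width: int             # number of columns the field occupies
    numeric: bool = False  # convert the slice to int when True


# HYPOTHETICAL layout, invented for this sketch; not the study's real codebook.
LAYOUT = [
    Field("record_id", 1, 5),
    Field("subcode", 7, 2),
    Field("series", 10, 3),
    Field("flag", 14, 4),
    Field("value_1", 18, 7, numeric=True),
    Field("value_2", 25, 7, numeric=True),
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width line into a dict keyed by field name."""
    record = {}
    for field in layout:
        begin = field.start - 1  # codebook columns are 1-based, Python is 0-based
        raw = line[begin:begin + field.width].strip()
        record[field.name] = int(raw) if field.numeric and raw else raw
    return record


# Opening characters of the second record shown on slide 12.
SAMPLE = "00100 1A D10 999900339520034403"

if __name__ == "__main__":
    print(parse_record(SAMPLE))
```

  A machine-readable codebook, which is what the DDI aims to provide, would let a layout table like LAYOUT be generated from the documentation instead of being typed by hand.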

  15. Evolution of the DDI • Concept of DDI and definition of needs grew out of the data archival community • 1995 - DDI efforts initiated by ICPSR • 1997 - XML DTD released • 2000 - DDI 1.0 released • 2003 - DDI 2.0 released - DDI Alliance formed • 2007 - DDI 3.0 Candidate Draft Release • 2008 - DDI 3.0 Final Release

  16. DDI: Early Development • 2000 – DDI 1.0 • Simple survey • Archival data formats • Microdata only • 2003 – DDI 2.0 • Aggregate data (based on matrix structure) • Added geographic material to aid geographic search systems and GIS users

  17. DDI versions 1 & 2 • Document Description • Study Description • Data Files Description • Variable Description • Other Study-Related Materials
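  Note on slide 17: the five sections correspond to five top-level elements of a DDI 1/2 codebook document. The Python sketch below builds an empty skeleton to show the shape. The tag names (codeBook, docDscr, stdyDscr, fileDscr, dataDscr, otherMat) are the ones commonly seen in DDI 2 codebooks and are not given on the slide, so check them against the published specification before relying on them.

```python
import xml.etree.ElementTree as ET

# Slide 17 section names paired with the tag assumed for each one.
SECTIONS = [
    ("docDscr", "Document Description"),
    ("stdyDscr", "Study Description"),
    ("fileDscr", "Data Files Description"),
    ("dataDscr", "Variable Description"),
    ("otherMat", "Other Study-Related Materials"),
]


def build_skeleton():
    """Return an empty codebook element with one child per section."""
    root = ET.Element("codeBook")
    for tag, _label in SECTIONS:
        ET.SubElement(root, tag)
    return root


if __name__ == "__main__":
    skeleton = build_skeleton()
    for (tag, label), child in zip(SECTIONS, skeleton):
        print(f"<{child.tag}>  {label}")
    print(ET.tostring(skeleton, encoding="unicode"))
```

  A real codebook would fill each section with content (citation, file layout, variable definitions); the skeleton only illustrates the five-part organization.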

  18. DDI 3: The Data Life Cycle

  19. Capturing the Data Life Cycle • Study Unit - Research question - Funding - Concepts - Background research

  20. Capturing the Data Life Cycle • Study Unit • Data Collection - Instrument - Data collection process - Questionnaire

  21. Capturing the Data Life Cycle • Study Unit • Data Collection • Logical Product - Intellectual content of data - Relationship to questions and concepts - Relationship to processing (recodes, weighting, derivations, imputations)

  22. Capturing the Data Life Cycle • Study Unit • Data Collection • Logical Product • Physical Data Product - Describes the structure (microdata, tabular, aggregate, Ncube…)

  23. Capturing the Data Life Cycle • Study Unit • Data Collection • Logical Product • Physical Data Product • Physical instance - Each describes a single data file (e.g., Census data by state...each state is an instance)

  24. Capturing the Data Life Cycle • Study Unit • Data Collection • Logical Product • Physical Data Product • Physical instance • “Instance” (METS-inspired) • An instance module “wraps” the other modules. Like a table of contents to a group of studies, files, and modules, it brings everything together.

  25. Capturing the Data Life Cycle • Study Unit • Data Collection • Logical Product • Physical Data Product • Physical instance • “Instance” • Archive - Each archive can add its own local information with an archive module.

  26. Capturing the Data Life Cycle • Group module - Describe concepts, questions, and variables that occur in several studies. - Describe a series (e.g., CPS, Eurobarometer) - Describe a collection of studies (not a series) and identify the common comparable concepts, questions and variables.

  27. Capturing the Data Life Cycle • Group module • Comparative module • The Comparative module contains information for comparing concepts, questions, and variables between or among Study Units that have been housed in a Group.

  28. DDI 3.0 Geography Example
      <?xml version="1.0" encoding="UTF-8"?>
      <r:Coverage xmlns:r="ddi:reusable:0_1"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                  xmlns:xhtml="http://www.w3.org/1999/xhtml"
                  xsi:schemaLocation="ddi:reusable:0_1 Schemas/reusable.xsd">
        <r:SpatialCoverage>
          <r:Identification>
            <r:ID>GEOCOV</r:ID>
            <r:IdentifyingAgency>MPC</r:IdentifyingAgency>
            <r:Version>1.0</r:Version>
          </r:Identification>
          <r:BoundingBox>
            <r:WestLongitude>-177.1</r:WestLongitude>
            <r:EastLongitude>-61.48</r:EastLongitude>
            <r:SouthLatitude>+13.71</r:SouthLatitude>
            <r:NorthLatitude>+76.63</r:NorthLatitude>
          </r:BoundingBox>
          <r:Description translated="false" translatable="true">
            <xhtml:p>United States, Region, Division, State, County, County Subdivision, Place, Tract/Block Numbering Area within Place/Remainder within County Subdivision.</xhtml:p>
          </r:Description>
          <r:SpatialObject>Polygon</r:SpatialObject>
          <r:GeographicStructure>
            <r:Geography>
              <r:Identification>
                <r:ID>G001</r:ID>
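  Note on slide 28: structured coverage metadata of this kind is what lets a search system answer "does this study cover this location?". The Python sketch below reads the bounding box from a shortened copy of the slide's fragment and tests a point against it. The fragment has been cut down and closed so that it parses (the slide's example is truncated), the function names are hypothetical, and the point test assumes a box that does not cross the 180° meridian.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the slide's example.
NS = {"r": "ddi:reusable:0_1"}

# Shortened, well-formed copy of the slide 28 fragment (an assumption made
# for this sketch; the slide's own example is cut off mid-document).
FRAGMENT = """<r:Coverage xmlns:r="ddi:reusable:0_1">
  <r:SpatialCoverage>
    <r:BoundingBox>
      <r:WestLongitude>-177.1</r:WestLongitude>
      <r:EastLongitude>-61.48</r:EastLongitude>
      <r:SouthLatitude>+13.71</r:SouthLatitude>
      <r:NorthLatitude>+76.63</r:NorthLatitude>
    </r:BoundingBox>
  </r:SpatialCoverage>
</r:Coverage>"""


def bounding_box(xml_text):
    """Return the r:BoundingBox edges as a dict of floats."""
    root = ET.fromstring(xml_text)
    box = root.find(".//r:BoundingBox", NS)
    if box is None:
        raise ValueError("no r:BoundingBox element found")
    # Tags arrive as "{namespace}LocalName"; keep only the local name.
    return {child.tag.split("}", 1)[1]: float(child.text) for child in box}


def contains(box, lon, lat):
    """True if the point lies inside the box (no antimeridian handling)."""
    return (box["WestLongitude"] <= lon <= box["EastLongitude"]
            and box["SouthLatitude"] <= lat <= box["NorthLatitude"])


if __name__ == "__main__":
    box = bounding_box(FRAGMENT)
    print(box)
    print(contains(box, -122.17, 37.43))  # a point in California
```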

  29. DDI - User Community • Data archives and libraries world-wide (e.g., ICPSR, CESSDA) • Health Canada • Statistics Canada • World Bank • WHO (World Health Surveys) • Gallup-Europe • Metadata Management Toolkit (IHSN)

  30. International Household Survey Network (IHSN) • To coordinate and improve survey collecting operations in developing countries • Developed to support the survey collection activities of the International Household Survey Network (IHSN) • Sponsors: 18 organizations, such as ILO, UNESCO, World Bank, UNICEF, WHO, UNDP, Eurostat • Goal: improve the quality of collected data and encourage more dissemination and long-term preservation • 100% DDI compliant

  31. Futures • Continued development of DDI • Outreach, train, promote • Expand Alliance membership • Foster tools development • Build ties & interoperability with other metadata specifications • Funding • ISO Standard status

  32. That’s all folks! Thanks! http://www.ddialliance.org/
