
Data-Sharing: Challenges

Data-Sharing: Challenges. Johanna Walter, MDRC, April 22, 2015. Data Sharing: Goals. To share data with other researchers, thereby permitting additional analyses that test our conclusions and/or lead to additional findings. To protect the identity of our study participants.


Presentation Transcript


  1. Data-Sharing: Challenges Johanna Walter MDRC April 22, 2015

  2. Data Sharing: Goals
     • To share data with other researchers, thereby permitting additional analyses that test our conclusions and/or lead to additional findings.
     • To protect the identity of our study participants.

  3. MDRC has a long-standing commitment to sharing data, and does so as often as is feasible.
     • Selected public use files (PUFs) and restricted access files (RAFs) in national repositories:
         • National Supported Work Evaluation Study, 1975-1979
         • National Evaluation of Welfare-to-Work Strategies (NEWWS)
         • New Hope Project: Income and Employment Effects on Children and Families, 1994-2003
         • Employment Retention and Advancement Project, 2000-2007
         • Evaluation of Enhanced Academic Instruction in After-School Programs
         • Supporting Healthy Marriage Evaluation: Eight Sites within the United States, 2003-2013
         • Head Start CARES Demonstration: National Evaluation of Three Approaches to Improving Preschoolers' Social and Emotional Competence, 2009-2011
     • A number of PUFs are available directly from MDRC. Visit http://www.mdrc.org/available-public-use-files for more information.

  4. What prevents you from sharing your own data more often?
     • Our research has many constituents:
         • Funders
         • Sample members
         • Data providers
         • Secondary data users/research community
     • As a result, there are a number of needs and concerns to balance in sharing research data.

  5. Funders
     • Some require creation of a PUF or RAF.
     • Others are silent on the matter, but may not provide (adequate) funding for creation of a PUF/RAF.
     • Yet others restrict sharing of data, or require data to be provided only to the funder.
     • At times, we have been required to destroy all data at the end of a study.

  6. Sample Members
     • In order to protect the privacy of individuals who agree to participate in our studies and share data, including sensitive data such as health measures and criminal activity, all MDRC projects:
         • Require IRB review, based on a formal plan for the protection of human subjects.
         • Provide assurances to sample members regarding how their data will be used.
         • Use informed consent forms (as appropriate) to inform sample members of these protections and to provide documentation to data providers allowing release of data.

  7. Data Providers
     • Data providers control access to data:
         • whether we may acquire data,
         • at what cost,
         • what form the data will take, and
         • what restrictions apply to how the data can be used (e.g., whether they can be included in a PUF/RAF).
     • These decisions are driven by issues such as the providers' costs, data security and privacy concerns, statutes, etc.

  8. Limitations of Administrative Data
     • These data and systems are established for other uses, and may not be well-suited to all research uses.
     • For example, the National Directory of New Hires (NDNH) is a good source for national quarterly wage and new hires data, but:
         • Data are deleted from the database 24 months after date of entry
         • Data are available for research use only for federally-funded projects
             • For research likely to contribute to achieving the purposes of Part A (TANF) or Part D (Child Support) of the Social Security Act.
         • Data are provided without identifiers
         • Data are not available for inclusion in a public use file
     • MDRC has been able to make good use of the NDNH in a number of studies, but these data may not be a viable option for other research. Other considerations, in addition to those above:
         • Time-consuming request process
         • High cost
     • For more information on the NDNH, see: http://www.acf.hhs.gov/programs/css/resource/a-guide-to-the-national-directory-of-new-hires
     • Also, see Investigating Alternative Sources of Quarterly Wage Data (Christin Durham and Laura Wheaton, 2012) for an overview of several federal data sources.

  9. Secondary Data Users/Research Community
     • In many MDRC studies, we link key administrative records from various sources, representing a rich array of outcomes.
     • In some studies, we also conduct surveys of sample members in order to obtain data that are not readily available through administrative records, such as direct child assessments, health measures, criminal activity, more detailed job data, etc.
     • Linking of several rich data sources provides a unique opportunity to analyze important outcomes that may not be found in a single data set.

  10. Special Concerns Regarding Linked Data Sets
     • In order to share linked data sets that include extremely sensitive data, as well as other information that together could allow identification of individuals or groups of individuals:
         • Data masking is required to protect the identity of sample members.
     • National repositories such as ICPSR or NCES have been good partners. They are also committed to protecting participants and their data, and provide helpful guidance in producing usable research data while protecting the identity of sample members.
     • The main challenge is the cost of producing these files.
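
As an illustration of what "data masking" can involve, the sketch below applies three common disclosure-limitation steps to a toy file: dropping direct identifiers, top-coding earnings, and coarsening age into bands, followed by a check for small cells that might still single someone out. This is not MDRC's procedure or any repository's required method; the field names (`name`, `ssn`, `age`, `site`, `earnings`), the thresholds, and the helper functions `mask_record` and `small_cells` are all hypothetical, chosen only to make the idea concrete.

```python
from collections import Counter

# Hypothetical field names and thresholds; a real study file would differ.
DIRECT_IDENTIFIERS = {"name", "ssn"}
EARNINGS_CAP = 100_000   # illustrative top-code threshold
MIN_CELL_SIZE = 3        # illustrative minimum group size


def mask_record(record):
    """Return a copy with identifiers dropped and risky values coarsened."""
    masked = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Top-code earnings so extreme values cannot single out a person.
    masked["earnings"] = min(masked["earnings"], EARNINGS_CAP)
    # Replace exact age with a ten-year band.
    low = (masked.pop("age") // 10) * 10
    masked["age_band"] = f"{low}-{low + 9}"
    return masked


def small_cells(records, keys, min_size=MIN_CELL_SIZE):
    """List quasi-identifier combinations shared by fewer than min_size records."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return sorted(cell for cell, n in counts.items() if n < min_size)


# Made-up records for demonstration only.
sample = [
    {"name": "A", "ssn": "000-00-0001", "age": 23, "site": "X", "earnings": 18_000},
    {"name": "B", "ssn": "000-00-0002", "age": 27, "site": "X", "earnings": 250_000},
    {"name": "C", "ssn": "000-00-0003", "age": 21, "site": "X", "earnings": 22_000},
    {"name": "D", "ssn": "000-00-0004", "age": 45, "site": "Y", "earnings": 31_000},
]

masked = [mask_record(r) for r in sample]
risky = small_cells(masked, keys=("age_band", "site"))
```

Any combination returned in `risky` would need further coarsening or suppression before release. Each such round of review is part of why producing these files is costly.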

  11. MDRC’s goal is to share data we collect, thereby permitting additional analyses that test our conclusions and lead to more findings. We are committed to transparency and reproducibility. However, we are also committed to protecting the identity of our study participants.
     • Our recommendations:
         • Public Use Files (PUFs) and Restricted Access Files (RAFs)
         • Hosted by official repositories, such as ICPSR or NCES
         • Adequate funding needed to create masked datasets for sharing
     • In sum, balancing the needs and concerns of various constituent groups presents challenges that affect our ability to mount studies, acquire data, and share data for secondary analysis.
