Collaborative Data Management for Longitudinal Studies

Stephen Brehm

[coauthors: L. Philip Schumm & Ronald A. Thisted]

University of Chicago

(Supported by National Institute on Aging Grant P01 AG18911-01A1)


Agenda

1. Background on Study

2. Problem – Data Management Deficiencies

3. Solution – Collaborative Data Management

4. STATA Programs – maketest & makedata


Background on Study

  • NIH-funded Longitudinal Study

  • Loneliness & Health

  • Thousands of Measures

    • Loneliness

    • Depression

  • 230 subjects

  • Repeated Yearly


Problem – Data Management Deficiencies

  • Code Not Modular

    …Difficult to manage the data cleaning code

    …Limited code reuse from year to year

    …Difficult to collaborate among interns

  • No Established Set of Data Cleaning Steps

    …Difficult for research assistants (turn-over)

    …Inconsistent data cleaning techniques

    …Data cleaning code difficult to read


Problem – Data Management Deficiencies

[Diagram: a single Core File Set surrounded by five Research Assistants]

Solution – Collaborative Data Management

  • Process

    • Established Steps

    • File System Layout

    • Automated Tests

    • Collaboration

  • Concepts

    • Module (e.g., loneliness)

    • Batch (e.g., yr1, yr2, yr3)

    • “Data Certification”

  • STATA Programs

    • maketest

    • makedata


Solution – Collaborative Data Management

Set of Files for Each Module

  • Acquire & Fix: acquire-[module].do & fix-[module].do

  • Test: test-[module].do

  • Derive: derive-[module].do

  • Label: label-[module].do

  • Files are marked either Year-Specific or Shared Between Years (60% Code Reuse)
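
The per-module file set follows one naming convention, so a short loop can report which step files exist for a given module. The following is only a minimal sketch, not code from the presentation: the module name and the assumption that the files sit in the current working directory are hypothetical.

```stata
* Hypothetical check, not part of maketest or makedata.
* Assumption: the step files live in the current working directory.
local module "loneliness"

* Walk the steps in pipeline order and build each expected file name.
foreach step in acquire fix test derive label {
    local dofile "`step'-`module'.do"
    capture confirm file "`dofile'"
    if _rc == 0 {
        display as text "found:   `dofile'"
    }
    else {
        display as error "missing: `dofile'"
    }
}
```

The loop only reports on the files and never runs them, so it is safe to try in any directory.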


STATA Program – maketest

  • Purpose:

    • Auto-generation of Data Certifying Tests

  • Functionality:

    • Tests Variable Type

    • Checks Consistency of Value Labels

    • Verifies Existence of Variable
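
The slides do not show what a generated test file looks like. As a rough illustration, the three kinds of checks listed above can be written with built-in Stata commands. This is a sketch under stated assumptions, not actual maketest output: it uses Stata's shipped auto dataset and its variables (make, price, foreign) purely as stand-ins for study data.

```stata
* Hypothetical certification checks; not actual maketest output.
* Assumption: the shipped auto dataset stands in for a study module.
sysuse auto, clear

* 1. Verify that each required variable exists.
confirm variable make price foreign

* 2. Test variable type.
confirm string variable make
confirm numeric variable price foreign

* 3. Check that the expected value label is attached to the variable.
local lbl : value label foreign
assert "`lbl'" == "origin"

* 4. Check that the value label still maps a code to the expected text.
local txt : label origin 1
assert "`txt'" == "Foreign"
```

Each command stops the do-file with an error if its check fails, so a file of such checks runs to the end only when the data pass every one.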


STATA Program – maketest

  • Syntax:

    • maketest [varlist] using filename [, REQuire(varlist) append replace]

  • Example:

    • maketest using filename.do, replace

  • Options:

    • using: specifies file to write

    • REQ: requires presence of variables in list

    • append: add to existing test .do file

    • replace: overwrite existing .do file


STATA Program – makedata

“Bringing it all together”


STATA Program – makedata

  • Syntax:

    • makedata [namelist], Pattern(string) [replace clear Noisily Batch(namelist) TESTonly]

  • Example:

    • makedata ats, p("acquire-*.do") b(yr1) clear replace

  • Options:

    • p: pattern – file naming convention

    • replace: overwrite existing data file

    • clear: clear current data in memory

    • Noisily: full output (default = summary)

    • b: batch – year, wave, center

    • TESTonly: only run tests step
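
The slides do not explain how the pattern and batch options combine into file names. One plausible reading is sketched below. Both mechanisms are assumptions rather than documented makedata behavior: that the "*" in the pattern is replaced by each module name, and that each batch corresponds to a subdirectory. The module list is hypothetical as well.

```stata
* Hypothetical illustration only; this is not the makedata source code.
* Assumption: "*" in the pattern is replaced by each module name.
* Assumption: each batch (yr1, yr2, ...) is a subdirectory of do-files.
local pattern "acquire-*.do"
local modules "ats loneliness"
local batches "yr1 yr2"

foreach b of local batches {
    foreach m of local modules {
        * Substitute the module name for the first "*" in the pattern.
        local dofile : subinstr local pattern "*" "`m'"
        display "`b'/`dofile'"
    }
}
```

Under these assumptions the loop displays paths such as yr1/acquire-ats.do, one per batch and module.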


Other Applications

  • Beyond Longitudinal Data

  • Teaching Data Cleaning with STATA

  • Contact Information

    • Stephen Brehm

    • L. Philip Schumm

    • Ronald A. Thisted

  • Supported by National Institute on Aging

    Grant P01 AG18911-01A1

