Collaborative Data Management for Longitudinal Studies

Stephen Brehm [coauthors: L. Philip Schumm & Ronald A. Thisted], University of Chicago. Supported by National Institute on Aging Grant P01 AG18911-01A1.


Collaborative Data Management for Longitudinal Studies

Stephen Brehm

[coauthors: L. Philip Schumm & Ronald A. Thisted]

University of Chicago

(Supported by National Institute on Aging Grant P01 AG18911-01A1)

Agenda

1. Background on Study

2. Problem – Data Management Deficiencies

3. Solution – Collaborative Data Management

4. STATA Programs – maketest & makedata

Background on Study
  • NIH-funded Longitudinal Study
  • Loneliness & Health
  • Thousands of Measures
    • Loneliness
    • Depression
  • 230 subjects
  • Repeated Yearly
Problem – Data Management Deficiencies
  • Code Not Modular
    • Difficult to manage the data cleaning code
    • Limited code reuse from year to year
    • Difficult to collaborate among interns
  • No Established Set of Data Cleaning Steps
    • Difficult for research assistants (turn-over)
    • Inconsistent data cleaning techniques
    • Data cleaning code difficult to read

Problem – Data Management Deficiencies

[Diagram: five boxes labeled "Research Assistant" and one box labeled "Core File Set"]

Solution – Collaborative Data Management
  • Process
    • Established Steps
    • File System Layout
    • Automated Tests
    • Collaboration
  • Concepts
    • Module (e.g., loneliness)
    • Batch (e.g., yr1, yr2, yr3)
    • “Data Certification”
  • STATA Programs
    • maketest
    • makedata
Solution – Collaborative Data Management

Set of Files for Each Module
  • acquire-[module].do & fix-[module].do
  • test-[module].do
  • derive-[module].do
  • label-[module].do

[Diagram: the steps Acquire & Fix, Test, Derive, Label, with the annotations "Year-Specific" and "60% Code Reuse – Files Shared Between Years"]
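
The per-module file set suggests a simple driver that runs each step in turn. The following is a minimal sketch, not the authors' makedata implementation. It assumes the do-files sit in the current working directory and run in the order listed on the slide. The module name loneliness is taken from the slides; the loop itself is hypothetical.

```stata
* Hypothetical driver: run each cleaning step for one module, in slide order.
* Assumes acquire-loneliness.do, fix-loneliness.do, test-loneliness.do,
* derive-loneliness.do and label-loneliness.do exist in the working directory.
local module loneliness

foreach step in acquire fix test derive label {
    display as text "Running `step'-`module'.do"
    do "`step'-`module'.do"
}
```

Because do stops at the first error by default, a failed assertion in the test step would halt the run before the derive and label steps execute.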

STATA Program – maketest
  • Purpose:
    • Auto-generation of Data Certifying Tests
  • Functionality:
    • Tests Variable Type
    • Checks Consistency of Value Labels
    • Verifies Existence of Variable
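
The slides do not show what maketest actually writes. As an illustration only, a hand-written test do-file covering the three checks listed above might look like the sketch below. The variable name lonely1 and the value label name agree4 are hypothetical, and checking the attached label name is just one simple reading of "consistency of value labels".

```stata
* test-loneliness.do: hand-written sketch, NOT actual maketest output.
* lonely1 and agree4 are hypothetical names used for illustration.

* 1. Verify that the variable exists
confirm variable lonely1

* 2. Test the variable's type (numeric in this example)
confirm numeric variable lonely1

* 3. Check that the expected value label is attached to the variable
local lbl : value label lonely1
assert "`lbl'" == "agree4"
```

Each command exits with an error if its check fails, so running the file against a new batch of data either completes silently or stops at the first discrepancy.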
STATA Program – maketest
  • Syntax:
    • maketest [varlist] using filename [, REQuire(varlist) append replace]
  • Example:
    • maketest using filename.do, replace
  • Options:
    • using: specifies file to write
    • REQuire: requires presence of variables in list
    • append: add to existing test .do file
    • replace: overwrite existing .do file
STATA Program – makedata

“Bringing it all together”

STATA Program – makedata
  • Syntax:
    • makedata [namelist], Pattern(string) [replace clear Noisily Batch(namelist) TESTonly]
  • Example:
    • makedata ats, p("acquire-*.do") b(yr1) clear replace
  • Options:
    • p: pattern – file naming convention
    • replace: overwrite existing data file
    • clear: clear current data in memory
    • Noisily: full output (default = summary)
    • b: batch – year, wave, center
    • TESTonly: only run tests step
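
The options above can be combined in other ways. The calls below are a sketch built only from the syntax diagram and the slide's own example (module ats, pattern "acquire-*.do", batch yr1). They assume the user-written makedata program is installed. How makedata handles several batches in one call is not stated on the slides, so the second call is an assumption based on Batch() accepting a namelist.

```stata
* Same call as the slide's example, with option names spelled out and
* full output requested instead of the default summary.
makedata ats, pattern("acquire-*.do") batch(yr1) noisily clear replace

* Assumed usage: Batch() takes a namelist, so more than one batch
* can presumably be named in a single call.
makedata ats, pattern("acquire-*.do") batch(yr1 yr2) clear replace
```

The capital letters in the syntax diagram mark the shortest accepted abbreviation of each option, which is why p() and pattern() refer to the same option.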
Other Applications
  • Beyond Longitudinal Data
  • Teaching Data Cleaning with STATA
  • Contact Information
    • Stephen Brehm: sbrehm@uchicago.edu
    • L. Philip Schumm: pschumm@uchicago.edu
    • Ronald A. Thisted: thisted@health.bsd.uchicago.edu
  • Supported by National Institute on Aging Grant P01 AG18911-01A1