Distributed Data Management for Biomedical Research

UC Cloud Summit 2011. Dingying Wei and David B. Keator, UCI.

Presentation Transcript


  1. Distributed Data Management for Biomedical Research. UC Cloud Summit 2011. Dingying Wei and David B. Keator, UCI.

  2. FBIRN Consortium
     [Figure: consortium sites. Legend: PostgreSQL HID database, firewall, GridFTP server. Site labels: UMN, BWH, MGH, UI, Yale, UCSF, VA, UCLA, Duke, UCI, VA, UCSD, MIND/]

  3. Data Management Requirements
     • Each site controls its own data
     • Collected data, processed data, and metadata
     • Access control
     • Site = RW, Project Group = R, User = Site
     • Replication
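The access rule on this slide can be illustrated with a minimal sketch. It assumes that "User = Site" means a user inherits the rights of their home site; the slide does not spell this out. The names `User`, `DataItem`, `allowed`, and the group name `fbirn_example` are hypothetical and are not taken from the HID software.

```python
from dataclasses import dataclass

# Hypothetical permission table reflecting the slide: the owning site may
# read and write (RW); a project group may only read (R).
PERMISSIONS = {"site": {"read", "write"}, "project_group": {"read"}}


@dataclass(frozen=True)
class User:
    name: str
    site: str
    groups: frozenset = frozenset()


@dataclass(frozen=True)
class DataItem:
    owner_site: str
    project_group: str


def allowed(user: User, item: DataItem, action: str) -> bool:
    """Return True if `user` may perform `action` ("read" or "write") on `item`."""
    # Assumption: "User = Site" means a user gets the rights of their home site.
    if user.site == item.owner_site and action in PERMISSIONS["site"]:
        return True
    # Members of the item's project group get read-only access.
    if item.project_group in user.groups and action in PERMISSIONS["project_group"]:
        return True
    return False


alice = User("alice", site="UCI")
bob = User("bob", site="UCSD", groups=frozenset({"fbirn_example"}))
scan = DataItem(owner_site="UCI", project_group="fbirn_example")

print(allowed(alice, scan, "write"))  # user at the owning site
print(allowed(bob, scan, "read"))     # project-group member at another site
print(allowed(bob, scan, "write"))    # group membership does not grant write
```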

  4. Data Flow
     [Diagram; recoverable labels: Image / Behavior Data, Genetic Data, Clinical Data, Database, Multi-Site Query, GridFTP, RLS, Processing Pipeline, Analysis Results, Result Images and XML wrapper in Data Grid]

  5. Data Storage and Replication
     • Data files are stored in the BIRN hierarchy on GridFTP servers; metadata and clinical data are stored in Postgres databases
     • The BIRN Portal manages users and groups
     • MyProxy is used for single sign-on authentication
     • Data access is checked at both the file-system and application levels
     • The Replica Location Service (RLS) is used for replication
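A replica catalog generally maps one logical file name to the physical locations that hold a copy. The sketch below is a toy in-memory stand-in for that idea only; it is not the Globus RLS API, and the class name `ReplicaCatalog`, the host names, and the paths are all invented for illustration.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy in-memory replica catalog (an illustration, not the Globus RLS API)."""

    def __init__(self):
        # logical file name -> list of physical URLs holding a copy
        self._replicas = defaultdict(list)

    def register(self, logical_name, physical_url):
        """Record that `physical_url` holds a copy of `logical_name`."""
        if physical_url not in self._replicas[logical_name]:
            self._replicas[logical_name].append(physical_url)

    def lookup(self, logical_name):
        """Return all known physical locations (empty list if none)."""
        return list(self._replicas.get(logical_name, []))


catalog = ReplicaCatalog()
# Hypothetical hosts and paths, for illustration only.
catalog.register("subj001/scan01.nii", "gsiftp://gridftp.site-a.example/birn/subj001/scan01.nii")
catalog.register("subj001/scan01.nii", "gsiftp://gridftp.site-b.example/birn/subj001/scan01.nii")

for url in catalog.lookup("subj001/scan01.nii"):
    print(url)
```

A client that needs the file can try each returned location in turn, which is what makes replicas useful when one site is unreachable.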

  6. Extensible Database with Web Interface

  7. Form Builder Client

  8. Data Input from Web Form

  9. Data Input from Portable Device
     • Data are submitted in XML files
     • Web service loads data into the database
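The XML-to-database step might look roughly like the following sketch. The XML layout, the `assessment` table, and the function `load_submission` are assumptions made up for this example (the real submissions and the HID schema are not shown in the slides), and SQLite stands in for PostgreSQL so the snippet runs without a server.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical submission format; the actual XML layout is not shown in the slides.
SAMPLE_XML = """<submission subject="S001" visit="1">
  <item name="age">34</item>
  <item name="handedness">right</item>
</submission>"""


def load_submission(conn, xml_text):
    """Parse one XML submission and insert its items; return the row count."""
    root = ET.fromstring(xml_text)
    rows = [
        (root.get("subject"), root.get("visit"), item.get("name"), (item.text or "").strip())
        for item in root.findall("item")
    ]
    # Using the connection as a context manager commits on success
    # and rolls back if any insert fails.
    with conn:
        conn.executemany(
            "INSERT INTO assessment (subject, visit, name, value) VALUES (?, ?, ?, ?)",
            rows,
        )
    return len(rows)


# SQLite is a stand-in here; the slides describe PostgreSQL databases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assessment (subject TEXT, visit TEXT, name TEXT, value TEXT)")
loaded = load_submission(conn, SAMPLE_XML)
print(loaded, "rows loaded")
```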

  10. Data Input from Excel File
     • Metadata worksheet
     • Data codes worksheet
     • Subject data worksheet

  11. Data Input from Command Line
     • Anonymizing data files
     • Validating data
     • Extracting metadata
     • Performing quality assurance
     • Uploading data

  12. Dynamic Multi-Database Query
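Since each site keeps its own database, a multi-database query has to send the same query to every site and merge the answers. The slide shows only a screenshot, so the following is a generic fan-out sketch under assumed names: the `subject` table, the site labels, and `query_all_sites` are hypothetical, and in-memory SQLite databases stand in for the per-site PostgreSQL servers.

```python
import sqlite3


def make_site_db(rows):
    """Create an in-memory database standing in for one site's server."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE subject (id TEXT, age INTEGER)")
    conn.executemany("INSERT INTO subject VALUES (?, ?)", rows)
    return conn


# Hypothetical sites and records, for illustration only.
SITES = {
    "site_a": make_site_db([("A1", 25), ("A2", 41)]),
    "site_b": make_site_db([("B1", 38)]),
}


def query_all_sites(sites, sql, params=()):
    """Run the same parameterized query at every site and tag rows by site."""
    results = []
    for site_name, conn in sites.items():
        for row in conn.execute(sql, params):
            results.append((site_name, *row))
    return results


older = query_all_sites(SITES, "SELECT id, age FROM subject WHERE age > ? ORDER BY id", (30,))
for row in older:
    print(row)
```

Tagging each row with its site of origin keeps provenance visible after the merge.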

  13. Data Export
     • Shopping cart in the web application
     • Add scans and assessments from multiple sites for download (via job scheduler)
     • CSV file for assessment data
     • Excel spreadsheet
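The CSV export of assessment data could be sketched as below. The cart contents, the column names, and the function `export_assessments_csv` are invented for this example; the real export columns are not listed in the slides.

```python
import csv
import io


def export_assessments_csv(cart):
    """Serialize cart entries (a list of dicts) to CSV text with a header row."""
    fieldnames = ["site", "subject", "assessment", "score"]  # assumed columns
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(cart)
    return buf.getvalue()


# Hypothetical cart holding assessment records from two sites.
cart = [
    {"site": "UCI", "subject": "S001", "assessment": "example_scale", "score": 12},
    {"site": "UCSD", "subject": "S014", "assessment": "example_scale", "score": 9},
]

csv_text = export_assessments_csv(cart)
print(csv_text)
```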

  14. Data Exploration (Statistics)

  15. XCEDE XML for Data Exchange
     • Data provenance

  16. Project Status Tracking
     [Figure panels: Data from Database; Data from Grid]

  17. QC Status

  18. Open-Source Software
     • XCEDE XML schema (www.xcede.org)
        • XML schema for describing/documenting research and clinical studies
     • Database (www.nitrc.org/projects/hid)
        • Query interface, workflow pipeline documentation, image download
     • Clinical Assessment Layout Manager (www.nitrc.org/projects/hid)
        • Graphical, web-enabled form builder for data entry

  19. Imaging Processing Example
     • FSL package for the comprehensive management of large-scale multi-site fMRI projects, including data storage, retrieval, calibration, analysis, multi-modal integration, and quality control

  20. Cloud?
