
Prototyping a virtual filesystem for storing and processing petascale neural circuit datasets

Art Wetzel, Greg Hood and Markus Dittrich. National Resource for Biomedical Supercomputing, Pittsburgh Supercomputing Center. awetzel@psc.edu, 412-268-3912, www.psc.edu and www.nrbsc.org.


Presentation Transcript


  1. Prototyping a virtual filesystem for storing and processing petascale neural circuit datasets. Connectomics Data Project Overview, Jan 11, 2012.
     Art Wetzel, Greg Hood and Markus Dittrich, National Resource for Biomedical Supercomputing, Pittsburgh Supercomputing Center. awetzel@psc.edu, 412-268-3912, www.psc.edu and www.nrbsc.org
     • R. Clay Reid, Jeff Lichtman, Wei-Chung Allen Lee (Harvard Medical School, Allen Institute for Brain Science; Center for Brain Science, Harvard University)
     • Davi Bock (HHMI Janelia Farm)
     • David Hall and Scott Emmons (Albert Einstein College of Medicine)

  2. Reconstructing brain circuits requires high-resolution electron microscopy over “long” distances == BIGDATA
     • Vesicles are ~30 nm in diameter.
     • A synaptic junction is >500 nm wide, with a cleft gap of ~20 nm.
     • For scale: recent ICs have 32 nm features, 22 nm chips are being delivered, and gate oxide is 1.2 nm thick.
     [Figure labels: dendritic spine, dendrite. Image source: www.coolschool.ca/lor/BI12/unit12/U12L04.htm]

  3. A 10 Tvoxel dataset aligned by our group was an essential part of the March 2011 Nature paper with Davi Bock, Clay Reid and Harvard colleagues. Now we are working on two datasets of 100 TB each and expect to reach PBs in 2-3 years.

  4. The CS project is to implement and test a prototype virtual filesystem that addresses common problems associated with neural circuit and other massive datasets.
     • The most important aim is to reduce unwanted data duplication as raw data are preprocessed for final analysis. The virtual filesystem addresses this by replacing redundant storage with on-the-fly computing.
     • The second aim is to provide a convenient framework for efficient on-the-fly computation on multidimensional datasets within high-performance parallel computing environments, using both CPU and GPGPU processing.
     • The Filesystem in Userspace (FUSE) mechanism provides a convenient implementation basis that will work across a variety of systems. Many existing FUSE codes serve as useful examples.
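The first aim above, replacing a stored preprocessed copy with computation at read time, can be sketched in a few lines of Python. This is an illustration only, not the project's code: the class name `VirtualDerivedFile`, the 8-bit voxel assumption, and the linear intensity rescaling (`gain`, `bias`) are all hypothetical stand-ins for real preprocessing. The `read(offset, length)` method mimics the kind of request a FUSE read handler receives; no actual FUSE binding is used here.

```python
import io


class VirtualDerivedFile:
    """Read-only view whose bytes are computed from raw data on demand.

    Hypothetical example: the "preprocessing" is a linear intensity
    rescaling of 8-bit voxels, so no processed copy is ever stored.
    """

    def __init__(self, raw, gain, bias):
        self.raw = raw  # any seekable binary file object holding raw voxels
        # Precompute the 256-entry lookup table for clip(gain * v + bias).
        self.table = bytes(
            min(255, max(0, round(gain * v + bias))) for v in range(256)
        )

    def size(self):
        # The derived file has the same length as the raw file.
        return self.raw.seek(0, io.SEEK_END)

    def read(self, offset, length):
        # Fetch only the requested raw bytes, then transform them.
        self.raw.seek(offset)
        chunk = self.raw.read(length)
        return chunk.translate(self.table)


# Usage: an in-memory buffer stands in for a raw dataset on disk.
raw = io.BytesIO(bytes(range(256)))
view = VirtualDerivedFile(raw, gain=2, bias=10)
first_four = view.read(0, 4)
```

Only the bytes that a reader actually requests are ever transformed, which is the trade the slide describes: extra computation per read in exchange for not storing a second copy of the dataset.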

  5. We would eventually like to have a flexible software framework that allows a combination of common prewritten and user-written application codes to operate together and take advantage of parallel CPU and GPGPU technologies.

  6. This work will include multidimensional data structures that provide efficient random and sequential access, analogous to the 1D representations provided by standard filesystems. Students working on this project will have access to a parallel cluster that holds our large datasets, along with the compilers and other tools required. Minimal end-to-end functionality with simple linear transforms can likely be achieved in about 8 weeks and then extended as time permits. Please contact Art Wetzel (awetzel@psc.edu) with further questions.
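One common way to give a 3D volume locality in all three dimensions is to store it as small cubic blocks rather than as one long row-major array. The slides do not say which layout the project uses, so the sketch below is an assumption for illustration: the function names `blocked_offset` and `blocks_for_region`, the cubic block shape, and the requirement that volume dimensions be multiples of the block size are all hypothetical.

```python
def blocked_offset(x, y, z, dims, block):
    """Linear voxel index of (x, y, z) in a volume stored block by block.

    dims = (nx, ny, nz) in voxels, assumed to be multiples of `block`.
    Blocks are ordered x-fastest, and voxels inside a block likewise.
    """
    nx, ny, nz = dims
    if not (0 <= x < nx and 0 <= y < ny and 0 <= z < nz):
        raise IndexError("voxel coordinate outside the volume")
    bx, by = nx // block, ny // block  # number of blocks along x and y
    block_index = (z // block) * by * bx + (y // block) * bx + (x // block)
    inner = (z % block) * block * block + (y % block) * block + (x % block)
    return block_index * block ** 3 + inner


def blocks_for_region(lo, hi, block):
    """Set of block coordinates touched by the half-open box [lo, hi)."""
    return {
        (i, j, k)
        for i in range(lo[0] // block, (hi[0] - 1) // block + 1)
        for j in range(lo[1] // block, (hi[1] - 1) // block + 1)
        for k in range(lo[2] // block, (hi[2] - 1) // block + 1)
    }


# Usage: an 8x8x8 volume stored as 4x4x4 blocks.
dims, block = (8, 8, 8), 4
corner = blocked_offset(7, 7, 7, dims, block)
touched = blocks_for_region((2, 2, 2), (6, 6, 6), block)
```

With this layout, a small cubic subvolume maps to a handful of contiguous blocks instead of many short runs scattered across the file, which is one way to approximate in 3D the cheap sequential reads that ordinary files offer in 1D.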
