Lightweight replication of heavyweight data



Lightweight Replication of Heavyweight Data

Scott Koranda

University of Wisconsin-Milwaukee &

National Center for Supercomputing Applications

Heavyweight Data from LIGO

  • Sites at Livingston, LA (LLO) and Hanford, WA (LHO)

  • 2 interferometers at LHO, 1 at LLO

  • Thousands of channels recorded at rates of 16 kHz, 16 Hz, 1 Hz, …

  • Output is binary ‘frame’ files, each holding 16 seconds of data with a GPS timestamp

    ~ 100 MB from LHO

    ~ 50 MB from LLO

  • ~ 1 TB/day in total

  • S1 run ~ 2 weeks

  • S2 run ~ 8 weeks

4 km LIGO interferometer at Livingston, LA
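
The daily total on this slide can be checked with a little arithmetic. The sketch below assumes the quoted sizes (~100 MB and ~50 MB) are per 16-second frame file at each site and that units are decimal; both are readings of the slide rather than stated facts.

```python
# Back-of-the-envelope check of the raw data rate quoted on the slide.
FRAME_SECONDS = 16               # each frame file holds 16 s of data
LHO_FRAME_MB = 100               # approximate frame size from Hanford (assumed per frame)
LLO_FRAME_MB = 50                # approximate frame size from Livingston (assumed per frame)
SECONDS_PER_DAY = 24 * 60 * 60

frames_per_day = SECONDS_PER_DAY // FRAME_SECONDS
raw_mb_per_day = frames_per_day * (LHO_FRAME_MB + LLO_FRAME_MB)
raw_tb_per_day = raw_mb_per_day / 1_000_000   # assumes 1 TB = 10^6 MB

print(f"{frames_per_day} frames/day per site, about {raw_tb_per_day:.2f} TB/day raw")
```

Under these assumptions the result lands a little under 1 TB/day, which is consistent with the rounded figure above.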

Networking to IFOs Limited

  • LIGO IFOs remote, making bandwidth expensive

  • Couple of T1 lines for email/administration only

  • Ship tapes to Caltech (SAM-QFS)

  • Reduced data sets (RDS) generated and stored on disk

    ~ 20 % size of raw data

    ~ 200 GB/day

GridFedEx protocol
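
A rough calculation shows why tapes beat the network here. The sketch below uses the standard T1 line rate of 1.544 Mbit/s and ignores protocol overhead, so it gives an upper bound; the raw and reduced volumes are the approximate figures from the slides.

```python
# How much data could one T1 line move per day, at best?
T1_BITS_PER_SECOND = 1_544_000   # standard T1 line rate
SECONDS_PER_DAY = 86_400
RAW_GB_PER_DAY = 1000            # ~1 TB/day of raw data (from the slides)
RDS_FRACTION = 0.20              # reduced data sets are ~20% of raw

t1_gb_per_day = T1_BITS_PER_SECOND / 8 * SECONDS_PER_DAY / 1e9
rds_gb_per_day = RAW_GB_PER_DAY * RDS_FRACTION
t1_lines_for_raw = RAW_GB_PER_DAY / t1_gb_per_day

print(f"One T1 line: at most {t1_gb_per_day:.1f} GB/day")
print(f"Reduced data sets: about {rds_gb_per_day:.0f} GB/day")
print(f"T1 lines needed for the raw data: about {t1_lines_for_raw:.0f}")
```

Even the reduced data sets exceed what a couple of T1 lines could carry in a day, before any email or administrative traffic is counted.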

Replication to University Sites








Why Bulk Replication to University Sites?

  • Each has compute resources (Linux clusters)

    • Early plan was to provide one or two analysis centers

    • Now everyone has a cluster

  • Cheap storage is cheap

    • $1/GB for drives

    • TB RAID-5 < $10K

    • Throw more drives into your cluster

  • Analysis applications read a lot of data

    • Different ways to slice some problems, but most want access to large sets of data for a particular instance of search parameters

LIGO Data Replication Challenge

  • Replicate 200 GB/day of data to multiple sites securely, efficiently, robustly (no babysitting…)

  • Support a number of storage models at sites

    • CIT → SAM-QFS (tape) and large IDE farms

    • UWM → 600 partitions on 300 cluster nodes

    • PSU → multiple 1 TB RAID-5 servers

    • AEI → 150 partitions on 150 nodes with redundancy

  • Coherent mechanism for data discovery by users and their codes

  • Know what data we have, where it is, and replicate it fast and easy

Prototyping “Realizations”

  • Need to keep “pipe” full to achieve desired transfer rates

    • Mindful of overhead of setting up connections

    • Set up GridFTP connection with multiple channels, tuned TCP windows and I/O buffers and leave it open

    • Sustained 10 MB/s between Caltech and UWM, peaks up to 21 MB/s

  • Need cataloging that scales and performs

    • Globus Replica Catalog (LDAP) < 10^5 and not acceptable

    • Need solution with relational database backend that scales to 10^7 with fast updates/reads

  • No need for “reliable file transfer” (RFT)

    • Problem with any single transfer? Forget it, come back later…

  • Need robust mechanism for selecting collections of files

    • Users/sites demand flexibility choosing what data to replicate

  • Need to get network people interested

    • Do your homework, then challenge them to make your data flow faster
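
The "forget it, come back later" idea can be sketched as a loop that never retries a failed file in place but defers it to a later pass. This is an illustration only, not LDR's code: `replicate`, `flaky_transfer`, `max_passes`, and the file names are all hypothetical, and the transfer function is a stand-in for a real GridFTP client.

```python
from collections import deque

def replicate(pending, transfer, max_passes=3):
    """Try each file once per pass; failures are deferred to the next pass.

    `transfer` is a hypothetical callable returning True on success.
    Returns (done, still_pending).
    """
    queue = deque(pending)
    done = []
    for _ in range(max_passes):
        deferred = deque()
        while queue:
            name = queue.popleft()
            try:
                ok = transfer(name)
            except OSError:
                ok = False
            if ok:
                done.append(name)
            else:
                deferred.append(name)   # forget it for now, come back later
        if not deferred:
            break
        queue = deferred
    return done, list(queue)

attempts = {}

def flaky_transfer(name):
    # Stand-in for a real transfer: "frame_b" fails on its first attempt only.
    attempts[name] = attempts.get(name, 0) + 1
    return not (name == "frame_b" and attempts[name] == 1)

done, left = replicate(["frame_a", "frame_b", "frame_c"], flaky_transfer)
print(done, left)
```

The point of the design is that one bad file never stalls the pipe: the remaining files keep flowing, and the failed one is picked up on the next pass.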

LIGO, err… Lightweight Data Replicator (LDR)

  • What data we have…

    • Globus Metadata Catalog Service (MCS)

  • Where data is…

    • Globus Replica Location Service (RLS)

  • Replicate it fast…

    • Globus GridFTP protocol

    • What client to use? Right now we use our own

  • Replicate it easy…

    • Logic we added

    • Is there a better solution?
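
The split between "what data we have" and "where data is" can be illustrated with two relational tables. The sketch below uses Python's built-in `sqlite3` as a stand-in; it is not the MCS or RLS API, and the schema, file names, host names, and the `find` helper are all made up for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- "What data we have": attributes describing each logical file name (lfn).
    CREATE TABLE metadata (lfn TEXT PRIMARY KEY, site TEXT,
                           gps_start INTEGER, duration INTEGER);
    -- "Where data is": physical locations of each logical file.
    CREATE TABLE replica (lfn TEXT, url TEXT, PRIMARY KEY (lfn, url));
""")
db.executemany("INSERT INTO metadata VALUES (?, ?, ?, ?)", [
    ("lho-frame-0", "LHO", 700000000, 16),
    ("lho-frame-1", "LHO", 700000016, 16),
    ("llo-frame-0", "LLO", 700000000, 16),
])
db.executemany("INSERT INTO replica VALUES (?, ?)", [
    ("lho-frame-0", "gsiftp://host-a.example.org/data/lho-frame-0"),
    ("lho-frame-1", "gsiftp://host-b.example.org/data/lho-frame-1"),
])

def find(site, gps_from, gps_to):
    """Return (lfn, url) pairs for frames of `site` overlapping [gps_from, gps_to)."""
    rows = db.execute(
        """SELECT m.lfn, r.url
           FROM metadata m JOIN replica r ON r.lfn = m.lfn
           WHERE m.site = ? AND m.gps_start < ? AND m.gps_start + m.duration > ?
           ORDER BY m.gps_start""",
        (site, gps_to, gps_from),
    )
    return rows.fetchall()

print(find("LHO", 700000000, 700000016))
```

Selecting by attributes first and resolving locations second is also one way to give users the flexible choice of file collections that the prototyping work called for.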

Lightweight Data Replicator

  • Replicated 20 TB to UWM thus far

  • Just deployed at MIT, PSU, AEI

  • Deployment in progress at Cardiff

  • LDRdataFindServer running at UWM

Lightweight Data Replicator

  • “Lightweight” because we think it is the minimal collection of code needed to get the job done

  • Logic coded in Python

    • Use SWIG to wrap Globus RLS

    • Use pyGlobus from LBL elsewhere

  • Each site is any combination of publisher, provider, subscriber

    • Publisher populates metadata catalog

    • Provider populates location catalog (RLS)

    • Subscriber replicates data using information provided by publishers and providers

  • Take “Condor” approach with small, independent daemons that each do one thing

    • LDRMaster, LDRMetadata, LDRSchedule, LDRTransfer,…
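
The three roles can be sketched with two in-memory dictionaries standing in for the catalogs. This is a toy model, not LDR's implementation: the function names, the `ifo` attribute, and the file names are hypothetical, while the site names CIT and UWM come from the slides.

```python
metadata_catalog = {}   # lfn -> attributes; filled by publishers
location_catalog = {}   # lfn -> set of sites holding a copy; filled by providers

def publish(lfn, **attributes):
    """Publisher role: describe a file in the metadata catalog."""
    metadata_catalog[lfn] = attributes

def provide(site, lfn):
    """Provider role: record that `site` holds a copy of `lfn`."""
    location_catalog.setdefault(lfn, set()).add(site)

def plan_subscription(local_site, wanted):
    """Subscriber role: map each wanted, not-yet-local file to its providers."""
    plan = {}
    for lfn, attrs in sorted(metadata_catalog.items()):
        holders = location_catalog.get(lfn, set())
        if wanted(attrs) and holders and local_site not in holders:
            plan[lfn] = sorted(holders)
    return plan

publish("frame-0", ifo="LHO")
publish("frame-1", ifo="LLO")
publish("frame-2", ifo="LHO")
for lfn in ("frame-0", "frame-1", "frame-2"):
    provide("CIT", lfn)
provide("UWM", "frame-0")

# UWM subscribes to LHO data only; frame-0 is already local, frame-1 is LLO.
plan = plan_subscription("UWM", lambda attrs: attrs["ifo"] == "LHO")
print(plan)
```

Because each role touches only one catalog, each can run as its own small daemon, in the spirit of the "Condor" approach described above.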

Future

  • LDR is a tool that works now for LIGO

  • Still, we recognize a number of projects need bulk data replication

    • There has to be common ground

      • What middleware can be developed and shared?

    • We are looking for “opportunities”

      • Code for “solve our problems for us…”

    • Want to investigate Stork, DiskRouter, ?

    • Do contact me if you do bulk data replication…