
Storage on the Lunatic Fringe

Thomas M. Ruwart

University of Minnesota

Digital Technology Center

Intelligent Storage Consortium

Orientation
  • Who are the lunatics?
  • What are their requirements?
  • Why is this interesting to the Storage Industry?
  • What is SNIA doing about this?
  • Conclusions
Who are the Lunatics?
  • DoE Accelerated Strategic Computing Initiative (ASCI)
    • BIG data, locally and widely distributed, high bandwidth access, relatively few users, secure, short-term retention
  • High Energy Physics (HEP) – Fermilab, CERN, DESY
    • BIG data, locally distributed, widely available, moderate number of users, sparse access, long-term retention
  • NASA – Earth Observing System Data Information Systems (EOSDIS)
    • Moderately sized data, locally distributed, widely available, large number of users, very long-term retention
  • DoD – NSA
    • Lots of little data – trillions of files, locally distributed, relatively few users, secure, long-term retention
  • DoD – Army High Performance Computing Centers and Naval Research Center
    • BIG data, locally and widely distributed, relatively few users, high bandwidth access, secure, very long-term reliable retention
A bit of History
  • 1990 – Supercomputer Centers operating with HUGE disk farms of 50-100 GB!
  • 1990 – Laptop computers have 50MB internal disk drives!
  • 1992 – Fast/wide SCSI runs at breakneck speeds of 20 MB/sec!
  • 1994 – Built a 1+TB array of disks with a single SGI xFS file system and wrote a single 1TB file
    • Used 4GB disks in 7+1 RAID 5 disk arrays
    • 36 disk arrays mounted in 5 racks
  • 1997 ASCI Mountain Blue - 75TB – distributed
  • 2002 ASCI Q – 700TB – online, high performance, pushing limits of traditional [legacy] block-based file systems
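The 1994 figures above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes decimal units (1 TB = 1000 GB) and ignores file system overhead, neither of which the slide states.

```python
# Figures from the 1994 entry above.
# ASSUMPTION: decimal units (1 TB = 1000 GB), no file system overhead.
DISK_GB = 4        # capacity of one disk
DATA_DISKS = 7     # data disks in each 7+1 RAID 5 array
PARITY_DISKS = 1   # one disk's worth of parity per array
ARRAYS = 36        # arrays mounted across the 5 racks


def usable_gb(arrays=ARRAYS, data_disks=DATA_DISKS, disk_gb=DISK_GB):
    """Capacity left for data once parity is set aside."""
    return arrays * data_disks * disk_gb


def raw_gb(arrays=ARRAYS, data_disks=DATA_DISKS,
           parity_disks=PARITY_DISKS, disk_gb=DISK_GB):
    """Capacity of every spindle, parity included."""
    return arrays * (data_disks + parity_disks) * disk_gb


if __name__ == "__main__":
    print("disks:", ARRAYS * (DATA_DISKS + PARITY_DISKS))
    print("raw GB:", raw_gb())
    print("usable GB:", usable_gb())
```

Under these assumptions, 36 arrays of 7 data disks at 4GB each give 1008GB usable, which is consistent with the "1+TB" description.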
The not-too-distant Future
  • 2004 ASCI Red Storm – 240TB – online, high bandwidth, massively parallel
  • 2005 ASCI Purple – 3000TB – online, high performance, OSD/Lustre
  • 2006 NASA RDS – 6000TB – online, global access, CAS,OSD, Data Grids, Lustre?
  • 2007 DoE Fermi Lab / CERN – 3 PB/year online / nearline, global sparse access
  • 2010 Your laptop will have a 1TB internal disk that will still be barely adequate for MS Office™
DoE ASCI
  • 1998 – Mountain Blue – Los Alamos
    • 48 128-Processor SGI Origin 2000 systems
    • 75TB disk storage
  • 2002 – Q
    • 310 32-processor machines + 64 32-processor I/O nodes
    • 2048 2Gb/s FC connections to 64 I/O nodes
    • 2048 2Gb/s FC connections to disk storage subsystem
    • 692 TB disk storage, 20GB/sec bandwidth
      • 2 file systems of 346TB each
      • 4 file system layers between the application and the disk media
  • 2004 – Red Storm
    • 10,000 processors, 10TB Main Memory
    • 240TB Disk, 50 GB/sec bandwidth
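A rough way to read the ASCI Q numbers is to divide them out per file system and per I/O node. The sketch below does only that; it assumes decimal units and an even spread of capacity and bandwidth, which the slide does not claim.

```python
# Figures from the 2002 ASCI Q entry above.
# ASSUMPTION: decimal units, and capacity/bandwidth spread evenly.
TB = 10**12
GB = 10**9

CAPACITY_BYTES = 692 * TB
FILE_SYSTEMS = 2
BANDWIDTH_BYTES_PER_SEC = 20 * GB
IO_NODES = 64
FC_LINKS_TO_IO_NODES = 2048


def per_file_system_tb():
    """Capacity of each file system if the total is split evenly."""
    return CAPACITY_BYTES // FILE_SYSTEMS // TB


def links_per_io_node():
    """FC connections terminating on each I/O node."""
    return FC_LINKS_TO_IO_NODES // IO_NODES


def per_io_node_gb_per_sec():
    """Share of the aggregate bandwidth carried by one I/O node."""
    return BANDWIDTH_BYTES_PER_SEC / IO_NODES / GB


def full_scan_hours():
    """Time to read the whole store once at the aggregate bandwidth."""
    return CAPACITY_BYTES / BANDWIDTH_BYTES_PER_SEC / 3600


if __name__ == "__main__":
    print("TB per file system:", per_file_system_tb())
    print("FC links per I/O node:", links_per_io_node())
    print("GB/sec per I/O node:", per_io_node_gb_per_sec())
    print("hours for one full scan:", round(full_scan_hours(), 1))
```

Even at the full 20GB/sec, a single pass over 692TB takes a little under ten hours, which hints at why backup and migration are hard at this scale.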


DoE ASCI Purple Requirements
  • Parallel I/O Bandwidth - Multiple (up to 60,000) clients access one file at hundreds of GB/sec.
  • Support for very large (multi-petabyte) file systems
  • Single files of multi-terabyte size must be permitted.
  • Scalable file creation & Metadata Operations
    • Tens of Millions of files in one directory
    • Thousands of file creates per second within the same directory
  • Archive Driven Performance - The file system should support high bandwidth data movement to tertiary storage.
  • Adaptive Pre-fetching - Sophisticated pre-fetch and write-behind schemes are encouraged, but a method to disable them must accompany them.
  • Flow Control & Quality of I/O Service
HEP – Fermilab and CMS
  • The Compact Muon Solenoid (CMS)
    • $750M Experiment being built at CERN in Switzerland
    • Will be active in 2007
    • Data rate from the detectors is ~1 PB/sec
    • Data rate after filtering is ~hundreds of MB/sec
  • The Data Problem
    • Dataset for a single experiment is ~1PB
    • Several experiments per year are run
    • Must be made available to 5000 scientists all over the planet (Earth primarily)
    • Dense dataset, sparse data access by any one user
    • Access patterns are not deterministic
  • HEP experiments cost US$1B, last 20 years, involve thousands of collaborators at hundreds of institutions world-wide, and collect and analyze several petabytes of data per year
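The CMS data rates above imply a very large filtering ratio. The sketch below works it out under one loud assumption: the filtered rate is taken as 100 MB/sec, the low end of "hundreds of MB/sec". The function names are illustrative only.

```python
# Detector rate and dataset size are from the slide.
# ASSUMPTION: filtered rate of 100 MB/sec (low end of "hundreds of MB/sec").
PB = 10**15
MB = 10**6

DETECTOR_BYTES_PER_SEC = 1 * PB
ASSUMED_FILTERED_BYTES_PER_SEC = 100 * MB
DATASET_BYTES = 1 * PB
SECONDS_PER_DAY = 24 * 60 * 60


def reduction_factor(raw_rate, filtered_rate):
    """How many raw bytes are discarded for each byte kept."""
    return raw_rate / filtered_rate


def days_to_collect(dataset_bytes, rate):
    """Days of continuous running needed to accumulate the dataset."""
    return dataset_bytes / rate / SECONDS_PER_DAY


if __name__ == "__main__":
    factor = reduction_factor(DETECTOR_BYTES_PER_SEC,
                              ASSUMED_FILTERED_BYTES_PER_SEC)
    days = days_to_collect(DATASET_BYTES, ASSUMED_FILTERED_BYTES_PER_SEC)
    print("reduction factor:", factor)
    print("days to collect 1 PB:", round(days, 1))
```

At the assumed rate, the filter keeps about one byte in ten million, and a ~1PB dataset takes roughly 116 days of continuous running to accumulate. A higher filtered rate shortens that proportionally.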





LHC Data Grid Hierarchy
  • CMS as example; ATLAS is similar
  • CMS detector: 15m × 15m × 22m, 12,500 tons, $700M
  • [Diagram: tiered data grid. Recovered labels: Online System; event simulation; Tier 0+1; Tier 1 (FermiLab USA Regional Center, German Regional Center, French Regional Center, Italian Center); Tier 2 (Tier2 Centers); Tier 3; Tier 4; Institute ~0.25TIPS; Physics data cache. Recovered link rates: ~100 MBytes/sec, ~2.5 Gbits/sec, ~0.6-2.5 Gbps, 100 - 1000 Mbits/sec]
  • CERN/CMS data goes to 6-8 Tier 1 regional centers, and from each of these to 6-10 Tier 2 centers.
  • Physicists work on analysis “channels” at 135 institutes. Each institute has ~10 physicists working on one or more channels.
  • 2000 physicists in 31 countries are involved in this 20-year experiment in which DOE is a major player.
  • Courtesy Harvey Newman, CalTech and CERN


NASA EOSDIS
  • Remote Data Store Project:
    • Build a 6PB Data archive with a life expectancy of at least 20 years, probably more
    • Make data and data products available to 2 million users
  • What to use?
    • Online versus Nearline
    • SCSI vs ATA
    • Tape vs Optical
    • How much of each and when?
  • Data Grids?
  • Dealing with Technology Life Cycles – continual migration
DoD NSA
  • How to deal with a trillion files?
    • At 256 bytes of metadata per file -> 256TB just for the file system metadata for one trillion files
    • File System resiliency
    • Backups? Forget it.
  • File Creation Rate is a challenge – 32,000 files per second for 1 year will generate 1 trillion files
  • How to search for any given file
  • How to search for any given piece of information inside all the files
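Both trillion-file figures above follow from simple multiplication. The sketch below reproduces them, assuming decimal terabytes and a 365-day year; the function names are illustrative only.

```python
# Figures from the slide.
# ASSUMPTION: decimal TB (10**12 bytes) and a 365-day year.
METADATA_BYTES_PER_FILE = 256
FILES = 10**12                 # one trillion files
CREATES_PER_SEC = 32_000
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def metadata_tb(files=FILES, per_file=METADATA_BYTES_PER_FILE):
    """Terabytes needed for file system metadata alone."""
    return files * per_file // 10**12


def files_created_in_a_year(rate=CREATES_PER_SEC):
    """Files produced by a sustained creation rate over one year."""
    return rate * SECONDS_PER_YEAR


def days_to_reach(files=FILES, rate=CREATES_PER_SEC):
    """Days of sustained creation needed to reach the file count."""
    return files / rate / SECONDS_PER_DAY


if __name__ == "__main__":
    print("metadata TB:", metadata_tb())
    print("files in one year:", files_created_in_a_year())
    print("days to one trillion:", round(days_to_reach(), 1))
```

A sustained 32,000 creates per second yields just over a trillion files in a year, and the metadata alone comes to 256TB before any file contents are stored.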
DoD MSRC
  • 500TB per year data growth
  • Longevity of data retention is critical
    • 100% reliable access of any piece of data for 20+ years
  • Security is critical
  • Reasonably quick access to any piece of data from anywhere at any time
  • Heterogeneous computing and storage environment

History has shown…
  • The problems that the Lunatic Fringe is working on today are the problems that the mainstream storage industry will face in 5-10 years
  • Legacy Block-based File Systems break at these scales
  • Legacy Network File System protocols cannot scale to meet these extreme requirements
What happens when….
  • NEC Announces a 10Tbit Memory Chip
  • Disk drives reach 1TByte and beyond
  • MEMS devices become commercially viable
  • Holographic Storage Devices become commercially viable
  • Interface speeds reach 1Tbit/sec
  • Intel develops the sub-space channel
  • Vendors need better ways to exploit the capabilities of these technologies rather than react to them
Common thread
  • Their data storage capacity, access, and retention requirements are continually increasing
  • Some of the technologies and concepts the Lunatic Fringe is looking at include:
    • Object-based Storage Device
    • Intelligent Storage
    • Data Grid
    • Borg Assimilation Technologies, …etc.
How does SNIA make a difference?
  • Act as a point to achieve critical mass behind emerging technologies such as OSD, SMI, and Intelligent Storage
  • Make sure that these emerging technologies come to market from the beginning as standards (not proprietary implementations that migrate to standards)
  • Help to get beyond the potential barrier for emerging technologies such as OSD and Intelligent Storage
  • Help to generate vendor and user awareness and education regarding future trends and emerging technologies
Conclusions
  • Lunatic Fringe users will continue to push the limits of existing hardware and software technologies
  • Lunatic Fringe is a moving target – there will always be a Lunatic Fringe well beyond where you are
  • The Storage Industry at large should pay more attention to
    • What they are doing
    • Why they are doing it
    • What they learn
  • University of Minnesota Digital Technology Center – ASCI –
  • Fermilab –
  • NSA –
Contact Info