
Active Storage and Its Applications




  1. 2007 Scientific Data Management All Hands Meeting Snoqualmie, WA Active Storage and Its Applications Jarek Nieplocha, Juan Piernas-Canovas Pacific Northwest National Laboratory

  2. Outline
  • Description of the Active Storage Concept
  • New Implementation of Active Storage
  • Programming Framework
  • Examples and Applications

  3. Active Storage in Parallel Filesystems
  • Active Storage exploits the old concept of moving computation to the data source to avoid data transfer penalties: applications use compute resources on the storage nodes
  • Storage nodes are full-fledged computers with plenty of CPU power, standard operating systems, and standard processors
  [Figure: Traditional approach vs. Active Storage. In the traditional approach, data X is read from the filesystem on the I/O nodes, moved over the network to the compute nodes, which compute Y = foo(X), and Y is written back. In Active Storage, Y = foo(X) runs directly on the I/O nodes, so the data never crosses the network.]

  4. Example
  • BLAS DSCAL on disk: Y = α·Y
  • Experiment
  • Traditional: the input file is read from the filesystem, and the output file is written to the same filesystem. The input file has 120,586,240 doubles.
  • Active Storage: each server receives the factor, reads the array of doubles locally from its disk, and stores the resulting array on the same disk. Each server processes 120,586,240/N doubles, where N is the number of servers.
  • Speedup is attributed to using multiple OSTs and avoiding data movement between client and servers (no network bottleneck)

  5. Related Work
  • The Active Disk/Storage concept was introduced a decade ago to use processing resources 'near' the disk: on the disk controller, or on processors connected to the disks, reducing network bandwidth/latency limitations
  • References
  • DiskOS stream-based model (ASPLOS'98: Acharya, Uysal, Saltz)
  • Active Storage for Large-Scale Data Mining and Multimedia (VLDB'98: Riedel, Gibson, Faloutsos)
  • Research proved the Active Disk idea interesting, but difficult to take advantage of in practice: processors in disk controllers were not designed for the purpose, and vendors have not been providing SDKs

  6. Lustre Architecture
  [Figure: O(10000) clients, O(10) MDSs, and O(1000) OSTs connected over the network. The MDSs handle directory metadata and concurrency, recovery, file status, and file creation; the OSTs handle file I/O and locking.]

  7. Active Storage in Kernel Space
  • When the client writes to file A:
  • ASOBD makes a copy of the data and sends it to ASDEV
  • The processing component reads from and writes to the char device
  • Original data ends up in A, processed data in B
  [Figure: kernel-space Active Storage module on an OST. The stack runs from the NAL and OST through ASOBD and the OBDfilter down to Ldiskfs and the disk (files A and B); ASDEV exposes a char device to the user-space processing component.]

  8. Active Storage Application: High Throughput Proteomics
  • 9.4 Tesla high-throughput mass spectrometer: 1 experiment per hour, 5000 spectra per experiment, 4 MB per spectrum
  • Per instrument: 20 GB per hour, 480 GB per day
  • Next-generation technology will increase data rates 200x
  • Application problem: given two float inputs, target mass and tolerance, find all the possible protein sequences that fit into the specified range
  • Active Storage solution: each OST receives the float pair sent by the client, processes its part of the data, and stores the resulting output in its Lustre OBD (object-based disk)

  9. SC'2004 StorCloud Most Innovative Use Award
  • Proteomics application
  • 320 TB Lustre, 984 400-GB disks
  • 40 Lustre OSSs running Active Storage, each with 4 logical disks (160 OSTs total) and 2 Xeon processors
  • 1 MDS
  • 1 client creating files
  • Sustained 4 GB/s Active Storage write processing
  [Figure: client system and Lustre MDS connected over a gigabit network to Lustre OSS 0 through OSS 39, each serving Lustre OSTs.]

  10. Active Storage in User Space
  • Problems with the kernel-space implementation: portability, maintenance, extra memory copies
  • We developed a user-space implementation
  • Most filesystems allow the storage nodes to be clients
  • Most filesystems allow creating files with a given layout
  • Our framework launches processing components on the storage nodes that hold the files to be processed
  • Processing components read from and write to local files
  • Highly portable implementation
  • Used with Lustre 1.6 and PVFS2 2.7
  • Bug in Lustre 1.4 (and SFS): frequent kernel crashes when mounting the filesystem on the storage nodes
  • Held initial discussions with IBM on a GPFS port

  11. Active Storage in User Space
  [Figure: compute nodes (the parallel filesystem's clients) connect over the network interconnect to the metadata server and to storage nodes 0 through N-1 (the parallel filesystem's components, themselves also clients of the filesystem). Each storage node runs the Active Storage Runtime Framework (ASRF) and a processing component; asmaster coordinates them, separate from the data I/O traffic.]

  12. Performance Evaluation
  • AMINOGEN bioinformatics application
  • Input file: ASCII file with mass and tolerance pairs, one per line; total size = 44 bytes
  • Output file: binary file containing amino acid sequences; total size = 14.2 GB
  [Chart: overall execution time]

  13. Enhanced Implementation of Active Storage for Striped Files
  • Striped files: widely used for performance, but not supported by earlier Active Storage work
  • Enhanced implementation
  • Uses striping data from the filesystem
  • New component: AS Mapper
  • Locality awareness in the processing component: compute on local chunks
  • Climate application with netCDF: computes statistics of key variables from a Global Cloud Resolving simulation (U. Colorado); eliminated >95% of network traffic
  [Figure: the processing component's read and write calls go through LIBAS inside the Active Storage Runtime Framework, which maps the contiguous logical file (chunks 0 1 2 3 4 …) to the chunks stored locally (e.g. 2 6 10 14 18 …) using GLIBC read/write.]
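The chunk-to-offset mapping that such locality-aware processing needs can be sketched as follows (an illustrative simplification assuming plain round-robin striping; the real implementation obtains the layout from the filesystem):

```python
def local_chunks(node_index, stripe_count, stripe_size, file_size):
    """Logical byte ranges of the chunks that one storage node holds,
    assuming round-robin striping: chunk i lives on node i % stripe_count.
    (Illustrative only; a real mapper reads the layout from the filesystem.)"""
    ranges = []
    chunk = node_index
    while chunk * stripe_size < file_size:
        start = chunk * stripe_size
        ranges.append((start, min(start + stripe_size, file_size)))
        chunk += stripe_count
    return ranges

# With 4 storage nodes, node 2 holds chunks 2, 6, 10, 14, 18, ...
chunks = local_chunks(2, 4, 1048576, 20 * 1048576)
```

With 4 storage nodes, node 2 owns the chunk sequence 2, 6, 10, 14, 18, … shown on the slide; the processing component iterates over exactly these ranges and never touches remote data.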

  14. Examples and Applications (Juan Piernas-Canovas)

  15. Active Storage in the DSCAL Example
  [Figure: compute nodes mounting /lustre (the parallel filesystem's clients) connect over the network interconnect to the MDS & MGS node running asmaster and to storage nodes OST31 and OST43 (the parallel filesystem's components). Each OST runs a dscal processing component: doubles.20 on OST31 and doubles.15 on OST43 are processed locally into doubles.20.out and doubles.15.out, separate from the data I/O traffic.]

  16. Non-Striped Files
  <?xml version="1.0"?>
  <rule>
    <match>
      <pattern>/lustre/doubles.*</pattern>
    </match>
    <program>
      <path arch="any">/lustre/dscal</path>
      <arguments>12345.67890 @ @.out</arguments>
    </program>
  </rule>
  Input: /lustre/doubles.15 in OST43. Output: /lustre/doubles.15.out in OST43 (new file).

  17. Climate Application
  • Collaboration with SciDAC GCRM SAP (Karen)
  • Problem: compute averages for variables generated from scientific simulation
  • Stored in striped output files
  • Geodesic grid
  • netCDF data format
  • Objective: optimize performance by exploiting data locality in Active Storage processing components to minimize network traffic

  18. Non-Striped Files
  <?xml version="1.0"?>
  <rule>
    <match>
      <pattern>/lustre/doubles.*</pattern>
    </match>
    <program>
      <path arch="any">/lustre/dscal</path>
      <arguments>12345.67890 @ @.out</arguments>
    </program>
  </rule>
  Input: /lustre/doubles.20 in OST31. Output: /lustre/doubles.20.out in OST31 (new file).
  Execution: /lustre/asd/asmaster /lustre/dscal.xml
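In these rules the '@' placeholder stands for the matched file path. The substitution can be sketched in a few lines (a hypothetical simplification of what asmaster does; it ignores the @{...} modifiers shown on later slides):

```python
def expand_arguments(arg_template, matched_path):
    """Expand the '@' placeholder in a rule's <arguments> string with
    the file path that matched the rule's <pattern>.
    (Simplified sketch; @{...} modifiers are not handled here.)"""
    return [tok.replace("@", matched_path) for tok in arg_template.split()]

# A rule whose <pattern> matched /lustre/doubles.20:
argv = expand_arguments("12345.67890 @ @.out", "/lustre/doubles.20")
# → ["12345.67890", "/lustre/doubles.20", "/lustre/doubles.20.out"]
```

The expanded list is exactly the command line the processing component receives on the storage node holding the matched file.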

  19. Processing Patterns
  • In user space, it is easy to support different processing patterns:
  [Figure: two patterns. In 1W0, a client data stream feeds a processing component that produces no output file; in 1W#W, the processing component writes several output files.]

  20. 20 No Output File (Pattern 1W0) <?xml version="1.0"?> <rule> <match> <pattern>/lustre/doubles.*</pattern> </match> <program> <path arch="any">/lustre/dscal1</path> <arguments>12345.67890 @</arguments> </program> </rule> /lustre/doubles.15 in OST43

  21. Several Output Files (Pattern 1W#W)
  <?xml version="1.0"?>
  <rule>
    <match>
      <pattern>/lustre/doubles.*</pattern>
    </match>
    <program>
      <path arch="any">/lustre/dscal3</path>
      <arguments>12345.67890 @ @.out @.err</arguments>
    </program>
  </rule>
  Input: /lustre/doubles.15 in OST43. Outputs: /lustre/doubles.15.out and /lustre/doubles.15.err in OST43 (new files).

  22. Transparent Access to Striped Files
  <?xml version="1.0"?>
  <rule>
    <match>
      <pattern>/lustre/doubles.*</pattern>
    </match>
    <program>
      <path arch="any">/lustre/dscal</path>
      <arguments>12345.67890 @{hidechunks} @{copystriping,hidechunks}.out</arguments>
    </program>
  </rule>
  • @{hidechunks}: transparent access to the chunks of the input file
  • @{copystriping,hidechunks}.out: transparent access to the chunks of the output file; the new output file gets the same striping as the input file

  23. Mapper and Striped netCDF Files
  [Figure: a striped netCDF file (header followed by variable data) is spread in chunks across storage nodes 0 through N-1, behind a metadata server on the network interconnect. Each storage node runs the ASRF; asmaster consults the Mapper (e.g. Mapper(0, 2)) to start processing components only on the nodes that hold chunks of the requested variable, separate from the data I/O traffic.]

  24. Processing of netCDF Files
  <?xml version="1.0"?>
  <rule>
    <stdfiles>
      <stdout>@.out-${NODENAME}</stdout>
    </stdfiles>
    <match>
      <pattern>/lustre/data.*</pattern>
    </match>
    <program>
      <path arch="any">/lustre/processnetcdf.py</path>
      <arguments>@ ta</arguments>
    </program>
    <mapper>
      <path arch="any">/lustre/netcdfmapper.py</path>
      <arguments>@ ta ${CHUNKNUM} ${CHUNKSIZE}</arguments>
    </mapper>
  </rule>
  • @ is the input file (e.g. /lustre/data.37); 'ta' is the variable name in the netCDF file
  • ${CHUNKNUM} and ${CHUNKSIZE} carry the striping information of /lustre/data.37
  • The non-striped output file is, e.g., /lustre/data.37.out-ost43
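Given ${CHUNKNUM} and ${CHUNKSIZE}, the mapper decides per chunk whether that chunk contains data of the requested variable. Its core test can be sketched as an interval overlap (an illustrative simplification; the real /lustre/netcdfmapper.py must parse the netCDF header to find the variable's byte range):

```python
def chunk_holds_variable(var_start, var_end, chunk_num, chunk_size):
    """True if chunk 'chunk_num' of the striped file overlaps the byte
    range [var_start, var_end) occupied by the variable's data.
    (Sketch; locating var_start/var_end requires parsing the netCDF header.)"""
    chunk_start = chunk_num * chunk_size
    chunk_end = chunk_start + chunk_size
    return chunk_start < var_end and var_start < chunk_end
```

A processing component is then launched only on the nodes that own at least one overlapping chunk, which is how the framework avoids moving variable data over the network.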

  25. PVFS2 Support
  <?xml version="1.0"?>
  <rule>
    <match>
      <pattern>/lustre/doubles.*</pattern>
    </match>
    <program>
      <path arch="any">/lustre/dscal</path>
      <arguments>12345.67890 @{hidechunks} @{copystriping,hidechunks}.out</arguments>
    </program>
    <filesystem>
      <type>pvfs</type>
      <mntpoint>/pvfs2</mntpoint>
    </filesystem>
  </rule>
  The <filesystem> element selects PVFS2 and its mount point.

  26. Local File System with Virtual Striping
  <?xml version="1.0"?>
  <rule>
    <match>
      <pattern>/lustre/doubles.*</pattern>
    </match>
    <program>
      <path arch="any">/lustre/dscal</path>
      <arguments>12345.67890 @{hidechunks} @{copystriping,hidechunks}.out</arguments>
    </program>
    <filesystem>
      <type>localfs</type>
      <striping>8:1048576</striping>
    </filesystem>
  </rule>
  The <filesystem> element selects the local file system; <striping>8:1048576</striping> defines a virtual striping with stripe count 8 and stripe size 1 MB.

  27. Further Information
  • Technical paper: J. Piernas, J. Nieplocha, E. Felix, "Evaluation of Active Storage Strategies for the Lustre Parallel Filesystem", Proc. SC'07
  • Website: http://hpc.pnl.gov/projects/active-storage
  • Upcoming release in December 2007: support for Lustre 1.6, PVFS2, and Linux local file systems
  • Source code available now upon request. Just send us an e-mail!
  • Jarek Nieplocha <jarek.nieplocha@pnl.gov>
  • Juan Piernas-Canovas <juan.piernascanovas@pnl.gov>

  28. Questions? Active Storage and Its Applications Jarek Nieplocha, Juan Piernas-Canovas Pacific Northwest National Laboratory
