
Synchronizing Lustre file systems

Dénes Németh (nemeth.denes@iit.bme.hu), Balázs Fülöp (fulop.balazs@ik.bme.hu), Dr. János Török (torok@ik.bme.hu), Dr. Imre Szeberényi (szebi@iit.bme.hu)

Presentation Transcript


  1. Synchronizing Lustre file systems. Dénes Németh (nemeth.denes@iit.bme.hu), Balázs Fülöp (fulop.balazs@ik.bme.hu), Dr. János Török (torok@ik.bme.hu), Dr. Imre Szeberényi (szebi@iit.bme.hu)

  2. The current state of the art
     • Partially solved
       • Conventional local file systems
       • Off-line operation (rsync)
     • Problems
       • Must walk through the whole directory structure
       • Have to know what will change (inotify)
       • Does not work on distributed file systems
       • Scalability problems
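To make the "walk through the directory structure" problem concrete, here is a minimal sketch of an off-line change detector in the style the slide criticizes. It is not rsync's algorithm; the function names `snapshot` and `diff` are hypothetical, and comparing size plus modification time is an assumption chosen for brevity.

```python
import os

def snapshot(root):
    """Map every file below root (relative path) to (size, mtime in ns).

    The whole tree is visited on every call, so the cost grows with the
    number of files, not with the number of changes.
    """
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            state[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
    return state

def diff(old, new):
    """Compare two snapshots; return (created, deleted, modified) path lists."""
    created = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    modified = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return created, deleted, modified
```

Calling `diff(snapshot(root), snapshot(root))` at two points in time yields the changes in between, but each call stats every file, which is the scalability problem named on the slide.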

  3. The environment: Lustre
     • Distributed
       • Stripes (parts of a file) on separate hosts
       • ~100-1000 clients (reading and writing)
     • Redundant
       • File system and file metadata
       • Fault tolerance
     • Transaction-driven operations
       • Rollback capability

  4. Lustre – synchronization
     • Distributed
       • Hosts → absolute event sequencing
         • Is the time accurate enough?
       • Clients → extreme efficiency
     • Redundant – fault tolerance
       • Pulling the plug during synchronization
       • Moving, tracking events
     • Rollback → synchronize to transactions
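The question "is the time accurate enough?" can be illustrated with a small sketch that merges per-host event streams into one global order by timestamp. This is only an illustration under stated assumptions, not the sequencer from the talk: the `Event` record, its fields, and `global_sequence` are hypothetical names, and each per-host stream is assumed to be already sorted by its local clock.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple

class Event(NamedTuple):
    timestamp: float  # local clock reading of the reporting host (assumed field)
    host: str         # reporting host (assumed field)
    local_seq: int    # per-host sequence number (assumed field)
    op: str           # description of the file system operation

def global_sequence(*streams: Iterable[Event]) -> Iterator[Event]:
    """Merge per-host streams, each sorted by local time, into one order.

    Ties on the timestamp are broken by host name and local sequence
    number so that the result is deterministic.
    """
    return heapq.merge(*streams,
                       key=lambda e: (e.timestamp, e.host, e.local_seq))

# Host B's clock runs 1.5 s behind: its event really happened at t = 2.0,
# after "create f" on host A, but it is stamped 0.5 and sequenced first.
host_a = [Event(1.0, "A", 1, "create f"), Event(3.0, "A", 2, "unlink f")]
host_b = [Event(0.5, "B", 1, "write f")]
ordered = [e.op for e in global_sequence(host_a, host_b)]
```

With the skewed clock, `ordered` places the write before the create, which is why timestamp-based sequencing needs clocks that are accurate relative to the spacing of the events.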

  5. The basic Lustre concept. [Diagram labels: Lustre server side; Lustre client side; Metadata Server (failover); Object Storage Targets; "inode"; ~100-1000]

  6. Moving the information: metadata. [Diagram labels: Lustre server side; Lustre client side; Metadata Server; kernel space; Lustre metadata access; Local Event Sequencer; Global Event Sequencer; Event Reporter; Event Multiplexer; Event Processor; Object Storage Targets; ~100-1000; events]

  7. How to move the information
     • Big difficulties
       • Sequencing = accurate timing
       • Network delay
       • Delay from file system overload
       • Connection to all MDS
       • Can be a bottleneck
     • Just multiplexing events
       • No problems
       • No authorization or registration (fixed configuration)
       • Minimal network usage
       • Usually not a bottleneck
     • ER & EM can be deployed together or separately
     • Asynchronous notification
       • System calls: select (timeout); read, write (blocking)
       • Max. 100,000 events/sec
     • Relatively complicated access
     • Easy access from user space
       • Notifications through signals
       • Possibility for multiple reporters
     [Diagram labels: Metadata Server; proc file system; block device; Local Event Sequencer; Global Event Sequencer; TCP/IP network; Event Reporter; Event Multiplexer; Event Processor; events]
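The "select (timeout)" plus blocking "read" access pattern listed above can be sketched as follows. The slides do not describe the actual device interface or record layout, so everything specific here is an assumption: the file descriptor would normally come from opening the sequencer's device, the fixed-size record format `RECORD` is hypothetical, and `wait_for_event` is an illustrative helper name. The sketch assumes a Unix-like system.

```python
import os
import select
import struct

# Hypothetical record layout: 64-bit sequence number, 32-bit operation code.
RECORD = struct.Struct("<QI")

def wait_for_event(fd, timeout):
    """Wait up to `timeout` seconds for one event record on `fd`.

    Returns (sequence_number, op_code), or None if the timeout expires
    or a complete record could not be read.
    """
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return None                      # timeout: no event pending
    data = os.read(fd, RECORD.size)      # does not block: data is ready
    if len(data) < RECORD.size:
        return None                      # short read or end of stream
    return RECORD.unpack(data)
```

For experimenting without the real event source, a pipe created with `os.pipe()` can stand in for the device: write `RECORD.pack(7, 2)` to the write end and call `wait_for_event` on the read end.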

  8. Accurate sequencing: linearly increasing output. [Plot; axis label: number of local sequencers]

  9. Average sequence performance
     • Server has enough threads: performance OK, constant QoS
     • Server needs more threads: performance drops. Why?
     • ~5000 events/thread
     • "Graceful degradation": linear drop in performance

  10. Resource usage on the global sequencer: at most 2 ms in each second (~0)

  11. How to commit the changes. How to execute "3" if "4" has already happened? Unfortunately, there is no really good solution. [Diagram labels: SFS 1; SFS 2; SFS 3; MDS; OST; Event Reporter; Event Multiplexer; Event Processor; Committer Client; events A, B; sequence numbers 3, 4]

  12. Event sequence error resolution
     • Ostrich policy
       • Drop all events with a conflicting sequence
     • Conflict detection
       • Is the event applicable?
       • In design stage …
     • Replaying the already committed events
       • Currently lacks Lustre support
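The "ostrich policy" can be sketched in a few lines. The slides do not define precisely what counts as a conflicting sequence, so this sketch assumes the simplest reading: an event conflicts if it arrives after an event with a higher sequence number has already been committed. The function name `commit_in_order` and the `(sequence_number, operation)` tuple format are hypothetical.

```python
def commit_in_order(events, last_committed=0):
    """Commit events in arrival order and drop late arrivals.

    `events` is an iterable of (sequence_number, operation) pairs.
    An event is dropped when its sequence number is not greater than
    the highest number committed so far (assumed conflict rule).
    Returns (committed, dropped) lists.
    """
    committed, dropped = [], []
    for seq, op in events:
        if seq > last_committed:
            committed.append((seq, op))
            last_committed = seq
        else:
            dropped.append((seq, op))    # e.g. "3" arriving after "4"
    return committed, dropped

arrivals = [(1, "mkdir d"), (2, "create d/f"), (4, "rename d/f"), (3, "write d/f")]
committed, dropped = commit_in_order(arrivals)
```

Here event 3 arrives after event 4 and is silently discarded, so the replica can diverge from the source; that is the cost of this policy and the motivation for the conflict detection and replay options listed above.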

  13. Questions? Thank you for your attention!
