DiDaS Distributed Data Storage
Explore the distributed data storage project at Masaryk University in collaboration with CESNET. Learn about the motivation, infrastructure, applications, and future extensions of this innovative network storage solution.
Presentation Transcript
DiDaS: Distributed Data Storage • Ludek Matyska • Masaryk University, Institute of Computer Science, and CESNET, z.s.p.o. • Ludek.Matyska@muni.cz • APAN, Logistical Networking WS
Outline • Motivation • Infrastructure • Applications • Future extensions
Motivation • Increased need for network storage • Computational Grids • Data Grids • Temporary Data Deposits • Transient Caches • Video deposits • National Library Requirements • Distribution of digitized content
Requirements • Transparent • Location independent • Good geographical distribution • Support for access quality (e.g. streaming) and reliability (no single point of failure)
Infrastructure • Data depots • Control: personal computer • Storage: RAID of IDE disks • Capacity: 1.5 TB each • Number: 7 (total capacity 10 TB) • Connectivity • Directly to the backbone • 100 Mb/s or 1 Gb/s
Data Layer • IBP (70% capacity) • General use • GridFTP servers (30% capacity) • Grid support • Computer-independent temporary data storage • Comparison with IBP-based solution
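The capacity figures on the Infrastructure and Data Layer slides can be worked through as a small calculation. This is only a sketch: the slides do not say whether the 70/30 split applies per depot or across the whole system, so the function below assumes it applies to the raw total. Note that 7 depots of 1.5 TB give a raw 10.5 TB, slightly above the quoted 10 TB; the slides do not explain the difference, and the names used here are illustrative.

```python
# Figures taken from the slides.
DEPOT_COUNT = 7
DEPOT_CAPACITY_TB = 1.5
IBP_SHARE = 0.70      # general use
GRIDFTP_SHARE = 0.30  # Grid support


def capacity_split(depots=DEPOT_COUNT, per_depot_tb=DEPOT_CAPACITY_TB):
    """Return raw total capacity and its IBP / GridFTP shares in TB.

    Assumption: the percentage split is applied to the raw total.
    """
    raw_total = depots * per_depot_tb
    return {
        "raw_total_tb": raw_total,
        "ibp_tb": raw_total * IBP_SHARE,
        "gridftp_tb": raw_total * GRIDFTP_SHARE,
    }


if __name__ == "__main__":
    for key, value in capacity_split().items():
        print(f"{key}: {value:.2f}")
```

Under these assumptions roughly 7.35 TB would be served through IBP and 3.15 TB through GridFTP.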
Traffic optimisation • Network traffic cost function • Inter-depot topology known • Instrumented clients • Measurement from depot to client • Simultaneous data transfer and measurements • Real-time transfer rate prediction • Choose depot • Decision between point and multipoint transfers
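The depot choice described on this slide could look roughly like the following. The slides do not give the cost function or the prediction method, so everything here is an assumption made for illustration: the predictor is an exponentially weighted moving average of measured rates, the cost is the estimated transfer time, and the multipoint decision uses an idealised model in which per-depot rates add up and each extra stream costs a fixed setup overhead. `DepotStats`, `choose_depot`, and `prefer_multipoint` are hypothetical names, not part of DiDaS.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DepotStats:
    """Client-side view of one depot (hypothetical structure)."""
    name: str
    alpha: float = 0.3                      # smoothing factor, assumed
    predicted_mbps: Optional[float] = None  # None until first measurement

    def observe(self, measured_mbps: float) -> None:
        """Fold a new rate measurement into the running prediction."""
        if self.predicted_mbps is None:
            self.predicted_mbps = measured_mbps
        else:
            self.predicted_mbps = (self.alpha * measured_mbps
                                   + (1 - self.alpha) * self.predicted_mbps)


def transfer_cost_s(size_mb: float, depot: DepotStats) -> float:
    """Assumed cost function: estimated transfer time in seconds."""
    if not depot.predicted_mbps:
        return float("inf")
    return size_mb * 8 / depot.predicted_mbps


def choose_depot(size_mb: float, depots: Sequence[DepotStats]) -> DepotStats:
    """Pick the depot with the lowest estimated cost."""
    best = min(depots, key=lambda d: transfer_cost_s(size_mb, d))
    if transfer_cost_s(size_mb, best) == float("inf"):
        raise ValueError("no depot has a usable rate measurement")
    return best


def prefer_multipoint(size_mb: float, depots: Sequence[DepotStats],
                      per_stream_overhead_s: float = 1.0) -> bool:
    """Decide between a point and a multipoint transfer.

    Idealised assumption: rates from different depots add up, and every
    extra stream costs a fixed setup overhead.
    """
    rates = [d.predicted_mbps for d in depots if d.predicted_mbps]
    if len(rates) < 2:
        return False
    single_s = size_mb * 8 / max(rates)
    multi_s = (size_mb * 8 / sum(rates)
               + per_stream_overhead_s * (len(rates) - 1))
    return multi_s < single_s
```

With this model, large files tend to favour multipoint transfers, while for small files the per-stream overhead dominates and a single depot wins.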
Applications • National Technical Library • Video Streaming • Nonspecific Users
National Technical Library • Requirements • Program of content digitisation • Data stored on the central tape robot • Not optimised for distribution • Danger of overload • Model data: old cartographic maps
National Technical Library • DiDaS role • Cache-like storage • Load balancing optimisation • Data transfer reliability (multistreaming)
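The slide does not define "multistreaming". One common reading, assumed here, is that a file held on several depots is fetched as parallel byte ranges, one stream per depot, so that a slow or failed stream affects only part of the transfer. The sketch below only plans the ranges; `split_ranges` and `plan_multistream` are hypothetical helpers, not DiDaS functions.

```python
def split_ranges(total_bytes, streams):
    """Split [0, total_bytes) into contiguous half-open ranges.

    Earlier ranges get one extra byte when the size does not divide evenly.
    """
    if streams < 1:
        raise ValueError("at least one stream is required")
    base, extra = divmod(total_bytes, streams)
    ranges = []
    start = 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


def plan_multistream(total_bytes, depots):
    """Assign one byte range to each depot assumed to hold a replica."""
    return list(zip(depots, split_ranges(total_bytes, len(depots))))


if __name__ == "__main__":
    for depot, (start, end) in plan_multistream(10, ["depot-a", "depot-b", "depot-c"]):
        print(depot, start, end)
```

A real client would also need to reassign a range when its stream stalls; that part is left out here.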
Video Streaming • Permanent storage • Specific clients • QoS requirements (pre-caching) • Replica management • Not yet implemented
Nonspecific Users • Temporary data deposits • Provide data for load balancing • Transfer outside of DiDaS core • Access reliability • Automatic replica generation • Transparent multi-access • Ability to react to connectivity loss
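Reacting to connectivity loss with replicated data can be sketched as a simple failover loop. This is an illustration under stated assumptions, not the DiDaS client: the replica list is assumed to be known in advance and ordered by preference, and `fetch` is a caller-supplied function (hypothetical) that raises `ConnectionError` when a depot is unreachable.

```python
def fetch_with_failover(replicas, fetch):
    """Try each replica in order and return (depot, data) from the first success.

    `fetch(depot)` is an assumed callable that returns the data or raises
    ConnectionError when the depot cannot be reached.
    """
    errors = []
    for depot in replicas:
        try:
            return depot, fetch(depot)
        except ConnectionError as exc:
            # Remember the failure and move on to the next replica.
            errors.append((depot, str(exc)))
    raise ConnectionError(f"all {len(replicas)} replicas failed: {errors}")
```

For example, if the first depot in the list is down, the caller still receives the data from the second one without handling the error itself, which is one way to read "transparent multi-access".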
Future work • New client development • Support for new application areas • Extended and transparent replica management • Full instrumentation • Data for load balancing, replica creation/deletion, and user access optimisation
Thank you for your interest