
Project Mimir



Presentation Transcript


  1. Project Mimir • A distributed filesystem • Uses rateless erasure codes for reliability • Uses Pastry’s multicast system, Scribe, for resource discovery and utilization

  2. Erasure Codes • An erasure code transforms a message of n blocks into a longer message of more than n blocks, such that the original message can be recovered from a subset of those blocks. The fraction of the blocks required is called the rate, denoted r.

  3. Optimal Erasure Codes • An optimal erasure code can recover the original n blocks from any n of the encoded blocks (Reed–Solomon is the classic example), in contrast to the sub-optimal rateless codes on the next slide.

  4. Rateless Erasure Codes • Also called fountain codes • Rateless because they can produce an effectively infinite stream of encoded blocks • Most rateless codes are sub-optimal: a file of n blocks can be decoded from any m encoded blocks, where m ≥ (1+ε)n.

  5. Luby Transform • Low density • The first rateless erasure code discovered • Encodes a block by randomly selecting a degree d (1 ≤ d ≤ n), then XORing d randomly chosen source blocks • Decodes using the XOR identity: if A XOR B = C, then A XOR C = B and B XOR C = A
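The XOR encode/decode principle on slide 5 can be sketched as a small toy; the function names, the byte-string block representation, and the plain "peeling" decoder below are my own illustration, not Mimir's actual code:

```python
import random

def lt_encode_block(blocks, rng):
    """Encode one LT block: pick a random degree d (1 <= d <= n),
    then XOR together d randomly chosen source blocks."""
    n = len(blocks)
    d = rng.randint(1, n)
    indices = rng.sample(range(n), d)
    encoded = bytes(blocks[indices[0]])
    for i in indices[1:]:
        encoded = bytes(a ^ b for a, b in zip(encoded, blocks[i]))
    return set(indices), encoded

def lt_decode(encoded, n):
    """Peeling decoder: repeatedly find an encoded block with exactly one
    still-unknown source block and recover it via A XOR C = B."""
    recovered = {}
    pending = [(set(idx), bytes(data)) for idx, data in encoded]
    progress = True
    while progress and len(recovered) < n:
        progress = False
        for idx, data in pending:
            unknown = idx - recovered.keys()
            if len(unknown) == 1:
                i = unknown.pop()
                for j in idx - {i}:  # strip out the known blocks
                    data = bytes(a ^ b for a, b in zip(data, recovered[j]))
                if i not in recovered:
                    recovered[i] = data
                    progress = True
    return [recovered.get(i) for i in range(n)]
```

Decoding succeeds once the received set of encoded blocks lets the peeling process reach every source block, which is exactly the probabilistic behavior the later slides discuss.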

  6. Encoded Block Identification • The decoder must be able to identify each encoded block’s degree and which source blocks were used to encode it. Two options: • Pass along the seed used for the random number generator, so the decoder can regenerate an identical encoding • Additional decoder overhead • All encoders and decoders must use the same random number generator • Attach a binary header to each encoded block, where each bit records whether a specific source block was used in the encoding • Additional network overhead • Header bit length equals n
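Both identification options can be shown in a few lines. This is a sketch under my own assumptions: the function names are hypothetical, and a real deployment would have to pin one PRNG implementation across every encoder and decoder, as the slide notes:

```python
import random

def block_indices_from_seed(seed, n):
    """Option 1: regenerate the (degree, source-block) choice from a shared
    seed. Works only if encoder and decoder use the same PRNG."""
    rng = random.Random(seed)
    d = rng.randint(1, n)
    return sorted(rng.sample(range(n), d))

def header_bitmap(indices, n):
    """Option 2: an n-bit header where bit i marks that source block i was
    XORed into this encoded block (n bits of extra network overhead)."""
    bits = 0
    for i in indices:
        bits |= 1 << i
    return bits.to_bytes((n + 7) // 8, "little")
```

Option 1 trades decoder CPU for bandwidth; option 2 does the opposite, costing n header bits per encoded block.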

  7. Common Uses of LT Codes • One-way communication protocols over noisy channels • File streaming (IPTV and other media) • Long-distance communication • Satellite communications • Mobile phones • High-latency networks

  8. LT Codes in Mimir • Encoded blocks are striped evenly across a large network of computers • Generate x·n encoded blocks; decoding is guaranteed to succeed when up to 1 − (2/x) of all network nodes fail • Distributed disk space equals real disk space / x • 50 computers × 100 GB each = 5 TB • x = 4: distributed disk space = 1.25 TB • 100% reliability while at least 25 computers are online
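The capacity and failure-tolerance arithmetic on this slide can be checked with a small helper (a hypothetical function of mine, plugging in the slide's own figures and its 1 − 2/x failure bound):

```python
def mimir_capacity(num_nodes, disk_per_node_gb, x):
    """For expansion factor x: usable space is raw space / x, and the slide's
    stated failure bound is a 1 - 2/x fraction of nodes."""
    raw_gb = num_nodes * disk_per_node_gb
    usable_gb = raw_gb / x
    max_fail_fraction = 1 - 2 / x
    # Minimum nodes that must stay online for guaranteed decoding.
    min_nodes_online = num_nodes - int(num_nodes * max_fail_fraction)
    return usable_gb, min_nodes_online
```

With the slide's numbers (50 nodes, 100 GB each, x = 4) this reproduces 1.25 TB of usable space and the requirement that at least 25 computers stay online.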

  9. Challenges • Cannot encode new data unless the file is reconstructed first; a high-churn network therefore requires a lot of computation • Decoding is still probabilistic, although the probability of success is extremely high: greater than 99.99999% at the failure limit

  10. Modifications to LT Code • Modified the LT code to guarantee each source block is encoded an equal number of times (evenly saturated); RobuStore does this as well • Evenly distribute according to block degree • Modified distribution • spiking: n unique blocks with degree 1 • offset distribution
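One way to realize "even saturation" with a degree-1 spike is sketched below. The slides do not give Mimir's actual distribution, so the parameters and the least-used-first selection rule are my own illustrative assumptions:

```python
import random

def even_saturation_schedule(n, extra, max_degree, seed=0):
    """Plan an encoding: spike with n degree-1 blocks (one per source block),
    then for each extra encoded block pick the d currently least-used source
    blocks so usage counts stay near-equal."""
    rng = random.Random(seed)
    usage = [1] * n                  # the spike uses each source block once
    plan = [[i] for i in range(n)]   # the degree-1 "spiking" blocks
    for _ in range(extra):
        d = rng.randint(2, max_degree)
        # Least-used first; random tiebreak so choices are not biased.
        order = sorted(range(n), key=lambda i: (usage[i], rng.random()))
        chosen = order[:d]
        for i in chosen:
            usage[i] += 1
        plan.append(sorted(chosen))
    return plan, usage
```

Choosing the least-used blocks keeps the per-block usage counts within one of each other, which is the "evenly saturated" property the slide describes.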

  11. Application-Level Anycast/Multicast • Issues with network-level multicast • Uses for multicast • Publish/subscribe architecture • Resource advertisement/discovery • Mass content distribution

  12. Issues with Network-Level Multicast • Difficult to set up • Does not handle large numbers of multicast and anycast groups • Does not handle very dynamic networks • Often will not work across the Internet

  13. Content-Based Publish/Subscribe • Allows for expandable network architectures • Allows for conditional matching and event notification • Allows for fault-tolerant networks • Needs a distributed multidimensional matching algorithm to map the publish/subscribe problem to one of multidimensional indexing

  14. Distributed Multidimensional Matching • Requires • each attribute to have a known domain • a known finest granularity • a known global order of the attributes • The mapping • a d-dimensional space S, where d = number of attributes • every attribute ai maps to a dimension di in S • S is managed by a binary search tree formed by recursively subdividing S into regions through (d−1)-dimensional hyperplanes • each hyperplane divides a region in half • each region r has a corresponding node n(r) in the search tree
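The recursive subdivision can be sketched as computing the bit string (z-code, introduced on the next slide) of the region containing a point. This is a toy of mine: the round-robin dimension order and function shape are assumptions, not necessarily the exact hyperplane schedule of the real DMM:

```python
def z_code(point, domains, depth):
    """Map a point in the d-dimensional attribute space S to the z-code of
    the depth-`depth` region containing it. Each split halves one dimension,
    cycling through dimensions, and appends one bit per split."""
    lo = [d[0] for d in domains]
    hi = [d[1] for d in domains]
    bits = ""
    for level in range(depth):
        dim = level % len(domains)       # which dimension this hyperplane cuts
        mid = (lo[dim] + hi[dim]) / 2
        if point[dim] < mid:
            bits += "0"
            hi[dim] = mid                # keep the lower half
        else:
            bits += "1"
            lo[dim] = mid                # keep the upper half
    return bits
```

Each bit halves one dimension, so a depth-k z-code names one of 2^k regions, matching "each hyperplane divides a region in half."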

  15. DMM Continued • Each region is addressed by a bit string called a z-code and is associated with one node in the tree • A subscription s is stored at all leaf nodes n(ri) in the search tree such that ri intersects s • The information of each tree node is stored at the peer p(r) in the DHT • Subscriptions are sent to the root peer and flow down to the appropriate leaf nodes
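Storing a subscription at every intersecting leaf can be sketched as a pruned recursive walk of the region tree. This is illustrative only: in the real system each region lives on a DHT peer and the flow happens via messages, not in one process:

```python
def leaves_intersecting(sub, domains, depth, prefix="", lo=None, hi=None):
    """Return the z-codes of all depth-`depth` leaf regions that intersect a
    subscription box `sub` (one (min, max) range per attribute) -- the leaves
    where the subscription would be stored."""
    if lo is None:
        lo = [d[0] for d in domains]
        hi = [d[1] for d in domains]
    # Prune: stop if this region misses the subscription in any dimension.
    for k in range(len(domains)):
        if sub[k][1] < lo[k] or sub[k][0] > hi[k]:
            return []
    if len(prefix) == depth:
        return [prefix]
    dim = len(prefix) % len(domains)
    mid = (lo[dim] + hi[dim]) / 2
    left_hi = hi[:]; left_hi[dim] = mid
    right_lo = lo[:]; right_lo[dim] = mid
    return (leaves_intersecting(sub, domains, depth, prefix + "0", lo, left_hi)
            + leaves_intersecting(sub, domains, depth, prefix + "1", right_lo, hi))
```

The pruning mirrors the slide's rule: a subscription only flows down branches whose regions it intersects.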

  16. Resource Discovery (Topic-Based Matching) • Manages dynamic distributed resources • Nodes join groups when they have a desired resource and leave when it is no longer available • Other nodes can request nearby resources by anycasting/multicasting messages to the appropriate group

  17. Implementation • Each group has a key, groupId, which maps into the DHT's id space • Create a group • send a message to that group id • the nearest node becomes the root of the spanning group tree • the root then adds the requesting node • Join • send a message toward the root • if an intermediate node is part of that spanning tree, it adds the requesting node as a child and stops • otherwise it keeps forwarding the message
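The join rule can be simulated in-process. A sketch under stated assumptions: the Node class and the explicit route `path` stand in for Pastry's actual id-based routing, and only the tree-building logic is modeled:

```python
class Node:
    """Minimal Scribe-style node: just a parent pointer and a children table."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.children = set()
        self.parent = None

def join(path, root):
    """`path` is the route from the joining node toward the group root.
    Each hop adopts the previous one as a child; forwarding stops at the
    first hop that is already part of the spanning tree."""
    child = path[0]
    for hop in path[1:]:
        already_in_tree = (hop is root) or (hop.parent is not None) or bool(hop.children)
        hop.children.add(child)
        child.parent = hop
        if already_in_tree:
            return               # reached the existing tree; stop forwarding
        child = hop
```

Stopping at the first on-tree node is what keeps join traffic local: later joins never need to reach the root.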

  18. Continued • Leaving the group • If a node has entries in its children table • mark it as a non-member and stop • otherwise • send a leave message to its parent in the tree • the parent then removes the node from its children table • Anycast • Implemented as a DFS of the group tree • Load-balanced, since different requests start at different nodes
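The leave rule and the anycast DFS might look like this in a single-process simulation; the dict-based nodes are my stand-in for real Scribe messages over Pastry:

```python
def leave(node):
    """Slide 18's rule: a node with children just marks itself a non-member;
    a leaf asks its parent to drop it from the children table."""
    if node["children"]:
        node["member"] = False
    else:
        parent = node["parent"]
        if parent is not None:
            parent["children"].remove(node)
            node["parent"] = None

def anycast(start, wanted):
    """Depth-first search of the group tree from `start`; returns the first
    member satisfying `wanted`. Different start nodes spread the load."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        if node["member"] and wanted(node):
            return node
        stack.extend(node["children"])
        if node["parent"] is not None:
            stack.append(node["parent"])
    return None
```

Because the DFS begins wherever the request enters the tree, two concurrent anycasts from different corners of the tree tend to hit different members first.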

  19. Multicast • Like anycast, but sends to all group members • Uses O(N) bandwidth • Useful for requesting data (values) when most members will have some data to contribute

  20. Pastry/Scribe in Mimir • Pastry's id routing is used to provide a network-independent routing scheme • We have three topics • Metadata Controller topic • provides security and the path-to-fileId mapping • Storage Node topic • provides a way to list available resources (file storage) • Client topic • provides information on client nodes

  21. File Storage Request • Send a multicast request to the MDC to add a file to the system • The MDC sends the new file's id back to the requesting client • The client then multicasts a store-request message • All storage nodes respond with ip/port data • The client then connects directly to each storage node and stripes the encoded blocks over them evenly
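The final striping step can be sketched as round-robin placement over the storage nodes that answered the multicast. An assumption on my part: the slides say "evenly" but do not specify the scheme, so round-robin is used as the simplest even placement:

```python
def stripe_blocks(encoded_blocks, storage_nodes):
    """Assign encoded blocks to storage nodes round-robin, so every node
    receives within one block of every other node."""
    placement = {node: [] for node in storage_nodes}
    for i, block in enumerate(encoded_blocks):
        node = storage_nodes[i % len(storage_nodes)]
        placement[node].append(block)
    return placement
```

Even placement matters for the failure bound on slide 8: if blocks were skewed toward a few nodes, losing those nodes would destroy more than their share of encoded data.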

  22. File Retrieval Request • Multicast a GET FILE <file id> message to the storage nodes • Storage nodes look up what data they hold for that file and send it back to the requesting id • The client rebuilds the file as data is received • If the file cannot be rebuilt, report an error

  23. Advantages of Pastry for Mimir • The P2P network provides a reliable way to handle a dynamic set of nodes and keeps communication open with no central point of communication • Multicast helps us quickly attempt to get and save files
