FutureGrid Status Report
Steven Hand (steven.hand@cl.cam.ac.uk)
Joint project with Jon Crowcroft (CUCL), Tim Harris (CUCL), Ian Pratt (CUCL), Andrew Herbert (MSR), Andy Parker (CeSC)
Grid systems architecture
• Common themes of self-organization and distribution
• Four motivating application areas:
  1. Massively-scalable middleware
  2. Advanced resource location mechanisms
  3. Automatic s/w replication and distribution
  4. Global data storage and publishing
• Experimental test beds (PlanetLab, UK eScience centres / access grid / JANET)
Common Techniques
• P2P (DHT) layer for distribution:
  • Using the Bamboo routing substrate (Intel Research) for passing messages between peers
  • Provides a fault-tolerant and scalable overlay network
  • Can route a message to any node in an n-node network in O(log n) hops, with O(log n) routing state at each node
• Location/Distance Service:
  • Basic idea: a Euclidean co-ordinate space for the Internet
  • Using PCA + lighthouse/virtual landmark techniques (see the sketch below)
  • Running PlanetLab measurements looking at sensitivity to the number of dimensions, etc.
  • Building a "plug-in" service for Bamboo neighbour selection
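The slides give no code for the location service; the following is a minimal sketch of the virtual-landmark idea, assuming each node has already measured RTTs to a common landmark set. The `embed` and `estimated_distance` helpers and the toy RTT matrix are illustrative assumptions, not part of the project.

```python
# Hedged sketch: landmark-based network coordinates with PCA reduction.
# Assumption (not from the slides): RTTs to k landmarks are pre-measured.
import numpy as np

def embed(rtts: np.ndarray, dims: int = 3) -> np.ndarray:
    """Map each node's vector of RTTs to the landmarks into a low-
    dimensional Euclidean coordinate via PCA (virtual landmarks)."""
    centred = rtts - rtts.mean(axis=0)           # centre per landmark
    # Principal axes = top right-singular vectors of the centred matrix.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:dims].T                 # project onto top axes

def estimated_distance(coords: np.ndarray, a: int, b: int) -> float:
    """Predict network distance as Euclidean distance in the embedding."""
    return float(np.linalg.norm(coords[a] - coords[b]))

# Toy usage: 5 nodes x 4 landmarks of measured RTTs (milliseconds).
rtts = np.array([[10, 40, 80, 30],
                 [12, 42, 78, 29],
                 [90, 15, 20, 70],
                 [88, 18, 22, 72],
                 [50, 50, 50, 50]], dtype=float)
coords = embed(rtts, dims=2)
print(estimated_distance(coords, 0, 1))   # nearby nodes -> small value
print(estimated_distance(coords, 0, 2))   # distant nodes -> larger value
```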
1. P2P Group Communication
• Built on Bamboo and the location service:
  • Use the location service (co-ordinates) to pick the best forwarding route
  • Use the RPF (Scribe) algorithm to build the tree (see the sketch below)
  • Tree delay is at most 2x that of the native IP multicast tree
  • Can build per-source trees, or a "centered" tree, based on group density, number of senders, number of receivers, ...
• Current status:
  • General system deployed and under test on PlanetLab
  • Whiteboard demo program works on top of this
• Next steps:
  • IP multicast tunnels across multicast-incapable 'chasms'
  • P2P overlay for vic/rat/access grid anticipated end of '04
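A minimal sketch of the RPF (Scribe-style) tree-building step: each subscriber routes a JOIN toward the group's root key, and every hop on the path adopts the previous hop as a child. The hard-coded node names and ROUTE table below are a hypothetical stand-in for Bamboo's key-based routing.

```python
# Hedged sketch of Scribe-style reverse-path-forwarding tree building.
children: dict[str, set[str]] = {}   # node -> children in the group tree

# Toy 5-node overlay: ROUTE[n] is n's next hop toward the group's root
# (the node whose id is numerically closest to hash(group)); E is root.
ROUTE = {"A": "C", "B": "C", "C": "E", "D": "E", "E": None}

def join(subscriber: str) -> None:
    """Route a JOIN toward the root; every hop on the path adopts the
    previous hop as a child, stopping once it is already in the tree."""
    node, prev = subscriber, None
    while node is not None:
        grafted = node in children          # already part of the tree?
        children.setdefault(node, set())
        if prev is not None:
            children[node].add(prev)
        if grafted:
            return                          # no need to route further
        prev, node = node, ROUTE[node]

def multicast(node: str, msg: str) -> None:
    """Deliver locally, then forward down the tree (a send, in practice)."""
    print(f"{node} received: {msg}")
    for child in children.get(node, ()):
        multicast(child, msg)

join("A"); join("B"); join("D")
multicast("E", "hello group")   # root fan-out follows the built tree
```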
2. Distributed resource location
1. Determine machine locations and resource availability
2. Translate to locations in a multi-dimensional search space
3. Partition/replicate the search space
4. Queries select portions of the search space
(A sketch of these four steps follows.)
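A minimal sketch of the four steps, assuming a uniform grid partition of the search space and hash-based cell ownership; the project's actual spatial index and partitioning scheme are not specified in the slides, so the `CELL`, `owner`, and attribute choices below are illustrative.

```python
# Hedged sketch: map each machine's measured attributes to a point in
# a multi-dimensional space, partition that space into grid cells
# owned by peers, and answer range queries cell by cell.
import hashlib
from itertools import product

CELL = 0.25   # side length of each grid cell in the unit hypercube

def cell_of(point):
    """Step 3: the grid cell containing a point."""
    return tuple(int(x / CELL) for x in point)

def owner(cell):
    """Assign each cell to a peer id by hashing (stand-in for the DHT)."""
    return hashlib.sha1(repr(cell).encode()).hexdigest()[:8]

# Steps 1-2: machines described as (cpu_free, mem_free) in [0, 1).
machines = {"m1": (0.9, 0.1), "m2": (0.3, 0.8), "m3": (0.85, 0.7)}
index = {}
for name, pt in machines.items():
    index.setdefault(cell_of(pt), []).append(name)

def range_query(lo, hi):
    """Step 4: yield machines whose point lies inside the box [lo, hi]."""
    cells = product(*(range(int(l / CELL), int(h / CELL) + 1)
                      for l, h in zip(lo, hi)))
    for cell in cells:
        # In a deployment this visit is an RPC to peer owner(cell).
        for name in index.get(cell, []):
            pt = machines[name]
            if all(l <= x <= h for l, x, h in zip(lo, pt, hi)):
                yield name

# Machines with >= 80% CPU free and >= 50% memory free:
print(list(range_query((0.8, 0.5), (1.0, 1.0))))   # -> ['m3']
```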
Current Focus
• Location-based resource co-allocation
  • Wish to choose a subset of available nodes according to resource availability and location
  • First filter the candidate set, then use a heuristic to solve constrained problems of the form far(near(S1,S2), near(S3,S4), C1)
• System built around a P2P spatial index
• Three-phase algorithm (phase 2 is sketched below):
  1. Find an approximate solution in terms of clusters
  2. Use simulated annealing to minimize the associated cost
  3. Select representative machine(s) for each cluster
• Results close to 'brute force' (average 10% error)
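A minimal sketch of the annealing phase, assuming an illustrative cost function that penalises distance within `near` pairs and rewards it between the two groups; the real constraint costs, cluster model, and move set are the project's own and are not given in the slides.

```python
# Hedged sketch of phase 2: simulated annealing over assignments of
# services S1..S4 to clusters, minimizing an illustrative cost for
# far(near(S1,S2), near(S3,S4)).
import math, random

clusters = {"c1": (0, 0), "c2": (0, 5), "c3": (9, 9)}   # cluster coords

def dist(a, b):
    (xa, ya), (xb, yb) = clusters[a], clusters[b]
    return math.hypot(xa - xb, ya - yb)

def cost(assign):
    """Near pairs pay for distance; the two groups earn credit for it."""
    return (dist(assign["S1"], assign["S2"])
            + dist(assign["S3"], assign["S4"])
            - dist(assign["S1"], assign["S3"]))

def anneal(steps=10_000, temp=10.0, cooling=0.999):
    assign = {s: random.choice(list(clusters))
              for s in ("S1", "S2", "S3", "S4")}
    cur = cost(assign)
    best, best_cost = dict(assign), cur
    for _ in range(steps):
        s = random.choice(list(assign))        # perturb one service
        old = assign[s]
        assign[s] = random.choice(list(clusters))
        new = cost(assign)
        # Accept improvements always; worsenings with probability
        # exp(-delta/temp), so early moves can escape local minima.
        if new < cur or random.random() < math.exp((cur - new) / temp):
            cur = new
            if new < best_cost:
                best, best_cost = dict(assign), new
        else:
            assign[s] = old                    # reject the move
        temp *= cooling
    return best, best_cost

print(anneal())   # e.g. S1,S2 co-located, S3,S4 co-located, groups apart
```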
3. P2P Computing
• Attempt to use P2P communication principles to gain similar benefits for grid computing
• Proceeding on three axes, targeting core computation, bioinformatics, and batch workloads
• Algorithm-specific load tolerance:
  • Want to allow decentralized, independent load shedding
  • Client submits a parallel computation to M > N nodes such that any N results suffice to produce a 'correct' result (see the sketch below)
  • The general case is intractable, so the focus is on algorithm-specific solutions
  • Current focus on matrix operations using erasure codes
  • Also considering sketches as an approximation technique
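A minimal sketch of the any-N-of-M idea for matrix operations, using a single parity block, the simplest erasure code; the slides do not say which code the project actually uses. Because y = Ax is linear, the parity of the input blocks is the parity of the output blocks, so one lost result can be reconstructed.

```python
# Hedged sketch: split A into N row-blocks, add one parity block
# (their sum), farm out all M = N + 1 tasks, and reconstruct the
# answer even if one result never comes back.
import numpy as np

N = 3
rng = np.random.default_rng(0)
A = rng.integers(0, 10, size=(6, 4)).astype(float)
x = rng.integers(0, 10, size=4).astype(float)

blocks = np.split(A, N)                 # N row-blocks of A
tasks = blocks + [sum(blocks)]          # M = N + 1 tasks (last = parity)

# Each node computes its block's product; simulate node 1 being shed.
results = [blk @ x for blk in tasks]
results[1] = None                       # lost / load-shed result

# Any N of the M results suffice: recover the missing block's product
# as the parity result minus the sum of the surviving data results.
missing = next(i for i, r in enumerate(results) if r is None)
if missing < N:
    others = [r for i, r in enumerate(results[:N]) if i != missing]
    results[missing] = results[N] - sum(others)

y = np.concatenate(results[:N])
assert np.allclose(y, A @ x)            # matches the direct product
print(y)
```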
P2P Computing (2)
• Indexing genomic sequences (3 × 10⁹ characters)
  • Based on suffix array indexes; supports string matching, motif detection, sequence alignments, etc. (see the sketch below)
  • Smaller memory requirements than state-of-the-art suffix trees
  • Distributed on-line construction using the P2P overlay
  • Caching issues (memory and swap) need investigation
• Batch-aware 'spread spectrum' storage
  • Observe that many batches share considerable data
  • Want to encourage client-driven distribution of data, but avoid centralized quotas and pathological storage use
  • Use the Palimpsest P2P storage system with 'soft guarantees'
  • Data is discarded under load, so it needs periodic refresh to persist
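A minimal sketch of suffix-array string matching; the naive O(n² log n) construction is fine for illustration, but genome-scale and distributed on-line construction need the specialised algorithms the slide alludes to. Requires Python 3.10+ for `bisect`'s `key` parameter.

```python
# Hedged sketch: suffix array + binary search for pattern matching.
import bisect

def build_suffix_array(text: str) -> list[int]:
    """All suffix start positions, sorted by the suffixes themselves."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def find(text: str, sa: list[int], pattern: str) -> list[int]:
    """Positions where pattern occurs, via binary search over suffixes.
    The '\uffff' sentinel bounds suffixes prefixed by the pattern."""
    lo = bisect.bisect_left(sa, pattern, key=lambda i: text[i:])
    hi = bisect.bisect_right(sa, pattern + "\uffff",
                             key=lambda i: text[i:])
    return sorted(sa[lo:hi])

genome = "GATTACAGATTTACA"
sa = build_suffix_array(genome)
print(find(genome, sa, "ATT"))   # -> [1, 8]
```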
4. Global Data Storage
• Global-scale distributed file system
  • Mutability; shared directories; random access
  • Data permanence, quotas
  • Aggressive, localized caching in proportion to demand, while maintaining coherence
• Storage Nodes
  • Confederated, well connected, relatively stable
  • Each offers multiples of a unit of storage in return for quota it can distribute amongst its users
• Clients
  • Access via the nearest Storage Node
Basic Storage Technique
• Immutable data blocks, mutable index blocks
• Block Id is H(contents), or H(public key) for index blocks
• Insert a block by using Bamboo to route it to the node whose Id is nearest to the Id of the block (see the sketch below)
• Maintain replicas on adjacent nodes for redundancy
• Send storage vouchers to the user's accountant nodes
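A minimal sketch of content-addressed insertion with replication, using successor placement on a sorted Id ring as a stand-in for Bamboo's numeric-closeness routing; the 16-node ring and `REPLICAS` value are assumptions.

```python
# Hedged sketch: Id = H(contents); store at the node whose Id follows
# the block's Id on the ring, plus replicas on adjacent successors.
import bisect, hashlib

NODE_IDS = sorted(int(hashlib.sha1(f"node{i}".encode()).hexdigest(), 16)
                  for i in range(16))
REPLICAS = 3
store: dict[int, dict[int, bytes]] = {n: {} for n in NODE_IDS}

def block_id(contents: bytes) -> int:
    return int(hashlib.sha1(contents).hexdigest(), 16)  # Id = H(contents)

def insert(contents: bytes) -> int:
    bid = block_id(contents)
    # First node at or after the Id (wrapping), then its successors.
    i = bisect.bisect_left(NODE_IDS, bid) % len(NODE_IDS)
    for k in range(REPLICAS):
        node = NODE_IDS[(i + k) % len(NODE_IDS)]
        store[node][bid] = contents
    return bid

bid = insert(b"immutable data block")
print(sum(bid in blocks for blocks in store.values()))   # -> 3 replicas
```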
Content-based chunking
• Clients split each file with a content-based hash
  • Rabin fingerprint over a 48-byte sliding window (see the sketch below)
• Blocks with the same Id are reference counted
• Similar files share blocks
  • Reduces storage requirements
  • Improves caching performance
[Figure: read/write example over blocks B1-B7. A write that changes B2, B3 and B6 inserts new blocks B8 and B9 and withdraws the old ones; unchanged blocks such as B7 stay in place, and reads simply return the stored data]
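A minimal sketch of content-defined chunking with a rolling hash over a 48-byte window. A true Rabin fingerprint works over GF(2), so the polynomial rolling hash, boundary mask, and chunk-size parameters here are illustrative stand-ins; the boundary rule still depends only on content, which is what makes edits local.

```python
# Hedged sketch: declare a chunk boundary wherever the 48-byte window
# hash has its low bits all zero, so boundaries survive insertions.
import hashlib, random

WINDOW, BASE, MOD = 48, 257, (1 << 61) - 1
MASK = (1 << 11) - 1              # ~2 KiB average chunk size (assumed)
POW = pow(BASE, WINDOW - 1, MOD)  # for removing the outgoing byte

def chunks(data: bytes):
    h, start = 0, 0
    for i, byte in enumerate(data):
        h = (h * BASE + byte) % MOD
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * POW * BASE) % MOD  # slide window
        if i + 1 - start >= WINDOW and (h & MASK) == 0:
            yield data[start:i + 1]     # content-defined boundary
            start = i + 1
    if start < len(data):
        yield data[start:]              # trailing chunk

random.seed(1)
data = bytes(random.randrange(256) for _ in range(200_000))
ids = [hashlib.sha1(c).hexdigest() for c in chunks(data)]

# A small local edit: almost all block Ids are shared with the original.
data2 = data[:50_000] + b"EDIT" + data[50_000:]
ids2 = {hashlib.sha1(c).hexdigest() for c in chunks(data2)}
print(len(ids), len(ids2 & set(ids)))
```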
Summary and Future Work
• Attempt to push towards a 'Future GRID'
• Four 'strands' with common themes and (some) common infrastructure:
  • Group communication, resource co-allocation, load-flexible computing, global distributed storage
• All four strands making progress:
  • Early papers / tech reports in all cases
  • Bamboo and the location service deployed and under test
• Next steps include:
  • Move PlanetLab experiments to UK eScience infrastructure
  • Analysis and test of prototype designs/software
Caching
• Data is either returned directly, or via the previous hop if the block is "hot"
• Cached copies are "drawn out" from the primary store toward requestors
• Exploits local route convergence (see the sketch below)
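A minimal sketch of draw-out caching along a converged route: each node on the path counts the hits it serves, and once a block is hot the copy migrates one hop closer to the requestors. The hotness threshold, hit counting, and route representation are all assumptions; the slides do not give the actual policy.

```python
# Hedged sketch of draw-out caching on a converged overlay route.
HOT = 2   # hits before a copy is drawn out one hop (assumed threshold)

class Node:
    def __init__(self, name):
        self.name, self.store, self.hits = name, {}, {}

    def lookup(self, bid, route):
        """`route` is the remaining hops toward the block's home node."""
        if bid in self.store:                # primary or cached copy
            self.hits[bid] = self.hits.get(bid, 0) + 1
            return self.store[bid], self.hits[bid] >= HOT, self.name
        data, hot, src = route[0].lookup(bid, route[1:])
        if hot:
            self.store[bid] = data           # drawn out one hop closer
        return data, False, src

a, b, home = Node("A"), Node("B"), Node("H")
home.store["blk"] = b"payload"
for _ in range(5):
    _, _, served_by = a.lookup("blk", [b, home])
    print(served_by)   # H, H, B, B, A: the copy migrates toward A
```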
Mutable index blocks
• May describe an arbitrary file hierarchy
• Index block has an associated keypair (eFS, dFS)
• Insert an index block using the hash of the public key as its Id
• Authenticate an update by signing the insertion voucher using the private key (see the sketch below)
  • Voucher: H(blk), repl_factor, eFS, dFS
• May link to other index blocks
  • Merge contents
  • Organise according to access/update patterns

Example index block:
  <folder name="tim">
    <file name="hello.txt">
      <blocklist> <block o="234"> ...id... </block> </blocklist>
    </file>
    <folder name="bar">
      <file name="hello2.txt">
        <blocklist> ...id... </blocklist>
      </file>
    </folder>
    <overlay> <index path="."> ...id... </index> </overlay>
  </folder>
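A minimal sketch of authenticated index-block insertion, assuming Ed25519 signatures via the `cryptography` package and an ad-hoc voucher encoding; the slides only say that each index block has an associated keypair (eFS, dFS) and that its Id is the hash of the public key.

```python
# Hedged sketch: Id = H(public key); updates are accepted only when
# the voucher's signature verifies under that public key.
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey)
from cryptography.hazmat.primitives.serialization import (
    Encoding, PublicFormat)

private_key = Ed25519PrivateKey.generate()
public_bytes = private_key.public_key().public_bytes(
    Encoding.Raw, PublicFormat.Raw)

index_id = hashlib.sha1(public_bytes).hexdigest()   # Id = H(public key)

block = b"<folder name='tim'> ... </folder>"
# Illustrative voucher encoding: H(blk), replication factor, public key.
voucher = hashlib.sha1(block).digest() + b"|repl=3|" + public_bytes
signature = private_key.sign(voucher)

# A storage node accepts the update iff the signature verifies under
# the public key whose hash matches the index block's Id.
assert hashlib.sha1(public_bytes).hexdigest() == index_id
private_key.public_key().verify(signature, voucher)  # raises if invalid
print("update accepted for index", index_id[:12])
```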
Shared file spaces
• Users can only update their own index blocks
• Sharing works through overlaying:
  • Import another user's name space, modify it, re-export
  • Copy-on-write overlay (sketched below)
  • Active delete markers
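A minimal sketch of a copy-on-write overlay with active delete markers: reads fall through to the imported name space, writes land only in the overlay, and a tombstone hides a base entry without touching the owner's index blocks. The dict-based name space and tombstone representation are illustrative.

```python
# Hedged sketch of an overlaid, copy-on-write shared file space.
TOMBSTONE = object()   # active delete marker

class Overlay:
    def __init__(self, base: dict):
        self.base, self.delta = base, {}      # base is read-only to us

    def read(self, path):
        if path in self.delta:
            entry = self.delta[path]
            if entry is TOMBSTONE:
                raise KeyError(path)          # deleted in this overlay
            return entry
        return self.base[path]                # fall through to import

    def write(self, path, block_id):
        self.delta[path] = block_id           # copy-on-write: base untouched

    def delete(self, path):
        self.delta[path] = TOMBSTONE          # hide without touching base

tims_space = {"/hello.txt": "id-123", "/bar/hello2.txt": "id-456"}
mine = Overlay(tims_space)                    # import tim's name space
mine.write("/hello.txt", "id-789")            # my modified copy
mine.delete("/bar/hello2.txt")                # hidden only for me
print(mine.read("/hello.txt"))                # -> id-789
print(tims_space["/hello.txt"])               # -> id-123 (unchanged)
```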