## Linda and TupleSpaces


Prabhaker Mateti

**Linda Overview**

- An example of asynchronous message passing
  - send never blocks (i.e., implicit infinite-capacity buffering)
  - ignores the order of send
- Associative abstract distributed shared memory system on heterogeneous networks
- http://lindaspaces.com/

**Tuple Space**

- A tuple is an ordered list of (possibly dissimilar) items
  - (x, y), coordinates in a 2-d plane, both numbers
  - (true, 'a', "hello", (x, y)), a quadruple of dissimilars
  - Instead of ( ) some papers use < >
- Tuple Space is a collection of tuples
  - Consider it a bag, not a set
  - Count of occurrences matters.
  - T # TS stands for the number of occurrences of T in TS
- Tuples are accessed associatively
- Tuples are equally accessible to all processes

**Linda's Primitives**

- Four primitives added to a host programming language
- out(T)
  - output T into TS
  - the number of T's in TS increases by 1
  - atomic
  - no processes are created
- eval(T)
  - creates a process that "evaluates" T
  - residual tuple is output to TS
- in(T)
  - input T from TS
  - the number of T's in TS decreases by 1
  - no processes are created
  - more …
- rd(T), abbreviation of read(T)
  - input T from TS
  - the number of T's in TS does not change
  - no processes are created

**Example: in(T) and inp(T)**

- Suppose multiple processes are attempting in(T)
- Let T # TS stand for the number of occurrences of T in TS
- if T # TS ≥ 1:
  - input the tuple T
  - T # TS decreases by 1
  - atomic operation
- if T # TS = 1:
  - Only one process succeeds
  - Which? Unspecified; nondeterministic
- if T # TS = 0:
  - All processes wait for some process to out(T)
  - may block forever
- inp(T)
  - a "predicated" in(T)
  - if T # TS = 0, inp(T) fails but the process is not blocked
  - if T # TS ≥ 1, inp(T) succeeds
    - effect is identical to in(T)
    - process is not blocked
- rdp(T): the predicated form of rd(T)

**Example: in("hi", ?x, false)**

- x declared to be an int
- the tuple pattern matches any tuple T provided:
  - length of T = 3
  - T.1 = "hi"
  - T.2 is any int
  - T.3 = false
- x is then assigned that int
- Suppose TS = {| ("hi", 2, false), ("hi", 2, false), ("hi", 35, false), ("hi", 7, false), … |}
- in("hi", ?x, false) inputs one of the above
  - which? unspecified
- Tuple patterns may have multiple ? symbols

**in(N, P2, …, Pj)**

- N is an actual argument of type Name
- P2 … Pj are actual/formal parameters
- The values found in the matched tuple are assigned to the formals; the process then continues
- The withdrawal of the matched tuple is atomic.
- If multiple tuples match, the choice is nondeterministic
- If no matching tuple exists, in(…) suspends until one becomes available, and then does the above.

**Example: eval("i", i, sqrt(i))**

- Creates new process(es)
  - to evaluate each field of eval("i", i, sqrt(i))
  - the result is output to TS
- The tuple ("i", i, sqrt(i)) is known as an "active" tuple.
- Suppose i = 4
  - sqrt(i) is computed by the new process.
  - Resulting tuple is ("i", 4, 2.0)
    - known as a passive tuple
    - can also be ("i", 4, -2.0)
  - ("i", 4, 2.0) is output to TS
  - Process(es) terminate(s).
- Bindings are inherited from the eval-executing process only for names cited explicitly.

**Example: eval("Q", f(x,y))**

- Suppose eval("Q", f(x,y)) is being executed by process P0
- P0 creates two new processes, say, P1 and P2.
- P1 evaluates "Q"
- P2 evaluates f(x,y)
- P0 moves on
  - P0 does not wait for P1 to terminate
  - P0 does not wait for P2 to terminate
  - P0 may later on do an in("Q", ?result)
- P2 evaluates f(x,y) in a context where f, x and y have the same values they had in P0
- No bindings are inherited for any variables that happen to be free (i.e., global) in f, unless explicitly in the eval

**Linda Algorithm Design Example**

- Given a finite bag B of numbers, as well as the size nb of the bag B, find the second largest number in B.
- Use p processes
- Assume the TS is preloaded with B:
  - ("bi", bi) for i: 1..nb
  - ("size", nb)
- Each process inputs nb/p numbers of B
  - Is nb % p = 0?
- Each process outputs the largest and the second largest it found
- A selected process considers these 2*p numbers and does as above
- Result Parallel Paradigm

**Linda Algorithm: Second Largest**

```c
#include <limits.h>   /* INT_MIN */

int firstAndSecond(int nx)
{
    int bi, fst, snd;

    in("bi", ?bi);
    fst = bi;
    snd = INT_MIN;               /* sentinel: no second value seen yet */
    for (int i = 1; i < nx; i++) {
        in("bi", ?bi);
        if (bi > fst) {
            snd = fst;
            fst = bi;
        } else if (bi > snd) {   /* falls between the current second and first */
            snd = bi;
        }
    }
    out("first", fst);
    out("second", snd);
    return 0;
}

main(int argc, char *argv[])
{
    /* open a file, read numbers, …
     *   out("bi", bi)
     *   out("nb", nb)
     *   p = …
     */
    int i, nx = nb / p;          /* Is nb % p = 0? */

    for (i = 0; i < p; i++)
        eval(firstAndSecond(nx));

    /* in("first", fst) and
     * in("second", snd) tuples …
     * finish the computation
     */
}
```

**Arrays and Matrices**

- An Array
  - (Array Name, index fields, value)
  - ("V", 14, 123.5)
  - ("A", 12, 18, 5, 123.5)
  - That A is 3-d … you know it from your design; it does not follow from the tuple
- Tuple elements can be tuples
  - ("A", (12, 18, 5), 123.5)

**"Linked" Data Structures in Linda**

- A Binary Tree
  - Number the nodes: 1 ..
  - Number the root with 1
  - Use the number 0 for nil
  - ("node", nodeNumber, nodedata, leftChildNumber, rightChildNumber)
- A Directed Graph
  - Represent it as a collection of directed edges.
  - Number the nodes: 1 ..
  - ("edge", fromNodeNumber, toNodeNumber)

**More on Data Structures in Linda**

- Binary Tree (again)
  - A Lisp-like cons cell
    - ("C", "cons", ["A", "B"])
    - ("B", "cons", [])
  - An atom
    - ("A", "atom", value)
- Undirected Graphs
  - Similar to Directed Graphs
  - How to ignore the "direction" in ("edge", fromNodeNumber, toNodeNumber)?
    - Add ("edge", toNodeNumber, fromNodeNumber)
    - Or, use Set Representation.

**Coordinated Programming**

- Programming = Computation + Coordination
- The term coordination refers to the process of building programs by gluing together processes.
  - Unix glue operation: Pipe
- "Coordination is managing dependencies between activities."
- Barrier Synchronization: Each process within some group must wait until all processes in the group have reached a "barrier"; then all can proceed.
  - Set up barrier: out("barrier", n);
  - Each process does the following: in("barrier", ?val); out("barrier", val-1); rd("barrier", 0);

**RPC Clients and Servers**

```c
serviceARequest()
{
    int ix, cid;
    typeRQ req;
    typeRS response;
    …
    for (;;) {
        in("request", ?cid, ?ix, ?req);
        …
        out("response", cid, ix, response);
    }
}

/* a client process */
int clientid = …, rqix = 0;
typeRQ req;
typeRS response;
…
out("request", clientid, ++rqix, req);
…
in("response", clientid, rqix, ?response);
…
```

**Dining Philosophers, Readers/Writers**

```c
phil(int i)
{
    while (1) {
        think();
        in("room ticket");              /* at most Num-1 philosophers in the room */
        in("chopstick", i);             /* left chopstick */
        in("chopstick", (i+1) % Num);   /* right chopstick */
        eat();
        out("chopstick", i);
        out("chopstick", (i+1) % Num);
        out("room ticket");
    }
}

initialize()
{
    int i;

    for (i = 0; i < Num; i++) {
        out("chopstick", i);
        eval(phil(i));
        if (i < (Num-1))
            out("room ticket");
    }
}

/* usage: startread(); … read; … stopread(); */

startread()
{
    rd("rw-head", incr("rw-tail"));
    rd("writers", 0);
    incr("active-readers");
    incr("rw-head");
}

int incr(char *CounterName)
{
    int value;

    in(CounterName, ?value);
    out(CounterName, value + 1);
    return value;
}
/* complete the rest of the implementation of
 * the readers-writers */
```

**Semaphores in Linda**

- Create a semaphore named xyz whose initial value is 3.
- Solution:
  - Load the tuple space with ("xyz"), ("xyz"), ("xyz")
  - P(nm) { in(nm); }
  - V(nm) { out(nm); }
- Properties:
  - Is it a semaphore satisfying the "weak semaphore assumption"?

**Programming Paradigms**

- Result Parallel
  - focus on the "structure" of the input space.
  - Divide this into many pieces of the same structure.
  - Solve each piece the same way
  - Combine the sub-results into a final result
  - Divide-and-Conquer
  - Hadoop
- Agenda Of Activities
  - A list of things to do and their order
  - Example: Build A House
    - Build Walls
      - Frame the walls
      - Plumbing
      - Electrical Wiring
      - Drywalls
      - Doors, Windows
    - Build a Drive Way
    - Paint the House
- Ensemble Of Specialists
  - Example: Build A House
    - Carpenters
    - Masons
    - Electrician
    - Plumbers
    - Painters
  - Master-slave Architecture
- These paradigms are applicable not only to Linda but also to other languages and systems.

**Result Parallel Generate Primes**

```c
/* From Linda book, Chapter 5 */
int isprime(int me)
{
    int p, limit, ok;

    limit = sqrt((double) me) + 1;
    for (p = 2; p < limit; ++p) {
        rd("primes", p, ?ok);           /* blocks until isprime(p) has finished */
        if (ok && (me % p == 0))
            return 0;
    }
    return 1;
}

real_main()
{
    int count = 0, i, ok;

    for (i = 2; i <= LIMIT; ++i)
        eval("primes", i, isprime(i));
    for (i = 2; i <= LIMIT; ++i) {
        rd("primes", i, ?ok);
        if (ok) {
            ++count;
            printf("prime: %d\n", i);
        }
    }
}
```

**Paradigm: Agenda Parallelism**

```c
/* From Linda book */
real_main(int argc, char *argv[])
{
    int eot, first_num, i, length, new_primes[GRAIN], np2;
    int num, num_primes, num_workers, primes[MAX], p2[MAX];

    num_workers = atoi(argv[1]);
    for (i = 0; i < num_workers; ++i)
        eval("worker", worker());

    num_primes = init_primes(primes, p2);
    first_num = primes[num_primes-1] + 2;
    out("next task", first_num);

    eot = 0;    /* Becomes 1 at end of table */
    for (num = first_num; num < LIMIT; num += GRAIN) {
        in("result", num, ?new_primes:length);
        for (i = 0; i < length; ++i, ++num_primes) {
            primes[num_primes] = new_primes[i];
            if (!eot) {
                np2 = new_primes[i] * new_primes[i];
                if (np2 > LIMIT) {
                    eot = 1;
                    np2 = -1;
                }
                out("primes", num_primes, new_primes[i], np2);
            }
        }
    }
    /* "?int" matches any int and throws out the value */
    for (i = 0; i < num_workers; ++i)
        in("worker", ?int);
    printf("count: %d\n", num_primes);
}

worker()
{
    int count, eot, i, limit, num, num_primes, ok, start;
    int my_primes[GRAIN], primes[MAX], p2[MAX];

    num_primes = init_primes(primes, p2);
    eot = 0;
    while (1) {
        in("next task", ?num);
        if (num == -1) {
            out("next task", -1);
            return;
        }
        limit = num + GRAIN;
        out("next task", (limit > LIMIT) ? -1 : limit);
        if (limit > LIMIT)
            limit = LIMIT;
        start = num;
        for (count = 0; num < limit; num += 2) {
            while (!eot && num > p2[num_primes-1]) {
                rd("primes", num_primes, ?primes[num_primes], ?p2[num_primes]);
                if (p2[num_primes] < 0)
                    eot = 1;
                else
                    ++num_primes;
            }
            for (i = 1, ok = 1; i < num_primes; ++i) {
                if (!(num % primes[i])) {
                    ok = 0;
                    break;
                }
                if (num < p2[i])
                    break;
            }
            if (ok) {
                my_primes[count] = num;
                ++count;
            }
        }
        /* Send the control process any primes found. */
        out("result", start, my_primes:count);
    }
}
```

**Paradigm: Specialist Parallelism**

```c
/* From Linda book */
source()
{
    int i, out_index = 0;

    for (i = 5; i < LIMIT; i += 2)
        out("seg", 3, out_index++, i);
    out("seg", 3, out_index, 0);
}

pipe_seg(prime, next, in_index)
{
    int num, out_index = 0;

    while (1) {
        in("seg", prime, in_index++, ?num);
        if (!num)
            break;
        if (num % prime)
            out("seg", next, out_index++, num);
    }
    out("seg", next, out_index, num);
}

sink()
{
    int in_index = 0, num, pipe_seg(), prime = 3, prime_count = 2;

    while (1) {
        in("seg", prime, in_index++, ?num);
        if (!num)
            break;
        if (num % prime) {
            ++prime_count;
            if (num*num < LIMIT) {
                eval("pipe seg", pipe_seg(prime, num, in_index));
                prime = num;
                in_index = 0;
            }
        }
    }
    printf("count: %d.\n", prime_count);
}

real_main()
{
    eval("source", source());
    eval("sink", sink());
}
```

**Linda Summary**

- out(), in(), rd(), inp(), rdp() are heavier than host language computations.
- eval() is the heaviest of Linda primitives
- Nondeterminism in pattern matching
- Time uncoupling
  - Communication between time-disjoint processes
  - Can even send messages to self
- Distributed sharing
  - Variables shared between disjoint processes
- Many implementations permit multiple tuple spaces
- No Security (no encapsulation)
- Linda is not fault-tolerant
  - Processes are assumed to be fail-safe
- Beginners do this in a loop: { in(?T); if notOK(T) out(T); } There is no guarantee you won't get the same T again.
- The following can sequentialize the processes using this code block: { in(?count); out(count+1); }
- "Where most distributed languages are partially distributed in space and non-distributed in time, Linda is fully distributed in space and distributed in time as well."

**JavaSpaces and TSpaces**

- JavaSpaces is Linda adapted to Java
  - net.jini.space.JavaSpace
  - write(…): into a space
  - take(…): from a space
  - read(…): …
  - notify: notifies a specified object when entries that match the given template are written into a space
  - java.sun.com/developer/technicalArticles/tools/JavaSpaces/
- TSpaces is an IBM adaptation of Linda.
  - "TSpaces is network middleware for the new age of ubiquitous computing."
  - TSpaces = Tuple + Database + Java
  - write(…): into a space
  - take(…): from a space
  - read(…): …
  - Scan and ConsumingScan
  - rendezvous operator, Rhonda.
  - TSpaces Whiteboard
  - www.almaden.ibm.com/cs/TSpaces/

**http://lindaspaces.com/**

- NetWorkSpaces
  - "open-source software package that makes it easy to use clusters from within scripting languages like Matlab, Python, and R."
- Nicholas Carriero and David Gelernter, "How to Write Parallel Programs" book, MIT Press, 1992
- Tutorial on Parallel Programming with Linda

**CEG 730 Preferences**

- Assume TS is preloaded with input data in a form that is helpful.
- At the end of the algorithm,
  - TS should have only the results
  - the preloaded input data is removed
- Any C program can be embedded into C-Linda
  - not acceptable at all
- Use p processes
  - In general, you choose p so that "elapsed" time is minimized, assuming the p processes do time-overlapped parallel computation.
- Is nb % p == 0?
  - pad the input data space with dummy values that preserve the solutions
  - Let some worker processes do more
- Avoid using inp() and/or rdp() because
  - it confuses our thinking
  - we can get better designs without them
  - A badly used inp() can produce a livelock where a plain in() would have caused a block.
- Typically, we can avoid the use of inp().
  - Not always.
  - Problem: Compute the number of elements in a bag B. Assume B is preloaded into TS.
  - Solution needs inp().

**References**

- Sudhir Ahuja, Nicholas Carriero and David Gelernter, "Linda and Friends," IEEE Computer (magazine), Vol. 19, No. 8, 26-34. www.lindaspaces.com/ has an entire book.
- JavaSpaces, en.wikipedia.org/wiki/Tuple_space
- Andrews, Section 10.x on Linda. Yet another prime number generator.
- Jeremy R. Johnson, www.cs.drexel.edu/~jjohnson/2010-11/winter/cs676.htm
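
The bag-counting problem from the CEG 730 Preferences slide can be tried out with a small simulation. What follows is only a sketch under stated assumptions: `TupleSpace` and `count_bag` are hypothetical names for a toy, in-process tuple space written in Python (C-Linda itself needs a Linda compiler, so it cannot be shown as a self-contained program). A Python type such as `int` stands in for a formal like `?x`, the first matching tuple is taken where real Linda leaves the choice unspecified, and `count_bag` assumes no other process is adding or removing ("bi", …) tuples while it runs.

```python
import threading


class TupleSpace:
    """Toy in-process tuple space (a bag of tuples); not a real Linda runtime."""

    def __init__(self):
        self._bag = []                       # a list, so duplicate tuples are kept
        self._cond = threading.Condition()   # guards _bag and wakes blocked callers

    @staticmethod
    def _matches(pattern, tup):
        # A pattern field that is a type (e.g. int) plays the role of a formal "?x";
        # any other field is an actual and must agree in both type and value.
        if len(pattern) != len(tup):
            return False
        for p, v in zip(pattern, tup):
            if isinstance(p, type):
                if type(v) is not p:
                    return False
            elif type(p) is not type(v) or p != v:
                return False
        return True

    def _find(self, pattern):
        for i, tup in enumerate(self._bag):
            if self._matches(pattern, tup):
                return i                     # first match; Linda leaves this unspecified
        return None

    def out(self, *tup):
        with self._cond:
            self._bag.append(tup)
            self._cond.notify_all()

    def inp(self, *pattern):
        """Non-blocking withdraw: a matching tuple, or None if there is none."""
        with self._cond:
            i = self._find(pattern)
            return None if i is None else self._bag.pop(i)

    def rdp(self, *pattern):
        """Non-blocking read: like inp(), but the tuple stays in the space."""
        with self._cond:
            i = self._find(pattern)
            return None if i is None else self._bag[i]

    def in_(self, *pattern):
        """Blocking withdraw: wait until a matching tuple exists, then remove it."""
        with self._cond:
            while (i := self._find(pattern)) is None:
                self._cond.wait()
            return self._bag.pop(i)


def count_bag(ts, name="bi"):
    """Withdraw every (name, int) tuple with inp() and leave ("count", n) behind.

    A plain in_() would block forever once the bag is empty, because the
    size is exactly what is unknown; inp() reports "no more" instead.
    """
    n = 0
    while ts.inp(name, int) is not None:
        n += 1
    ts.out("count", n)
    return n


ts = TupleSpace()
for b in (7, 3, 7, 12):            # a bag: the duplicate 7 is a separate tuple
    ts.out("bi", b)
ts.out("hi", 2, False)             # unrelated tuple, must not be counted

n = count_bag(ts)
print(n)                           # 4
print(ts.rdp("count", int))        # ('count', 4)
print(ts.rdp("bi", int))           # None: the preloaded input has been removed
```

Draining the input also follows the stated preference that only results remain in TS at the end. The blocking `in_` is included only for contrast with `inp`.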