Implementation of Shared Memory

• Considerations
  • Network traffic due to create/read/write
  • Latency of create/read/write
  • Synchronization
• Network traffic vs. latency
  • Copy on Read (no local caching)
    • Local read is fast, no traffic
    • Remote read is slow, generates traffic (2-way)
    • Local write is fast, no traffic
    • Remote write is fast, generates traffic
  • Copy on Write (local caching)
    • Local read is fast, no traffic
    • Remote read is fast, no traffic
    • Local write is fast, generates traffic
    • Remote write is fast, generates traffic
• Central Server and Single Copy are basically the same, except that Central Server does not allow you to optimize the location of the master copy for the process that needs it the most.

CSE 466 – Fall 2000 - Introduction - 1

Choice of Policy

Reads frequent, writes infrequent:
  … Frequently read(X) …
  … Frequently read(X) …
  Infrequently: write(X,1); write(X,2);
  Copy-on-write: local caching has far less traffic

Reads infrequent, writes frequent:
  … Infrequently read(X) …
  … Infrequently read(X) …
  Frequently: write(X,1); write(X,2);
  Copy-on-read: no local caching has less traffic
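The two cases above can be made concrete with a rough message count. The following is only a sketch under a simplified cost model that is not in the slides: a single publisher holds the master copy, every remote write costs one post to the publisher, copy-on-write re-posts each write to every subscriber, and each copy-on-read remote read costs a request plus a reply. The names `workload_t`, `cw_messages`, `cr_messages`, and `prefer_copy_on_write` are hypothetical.

```c
#include <assert.h>

/* Hypothetical traffic model; the per-operation costs are assumptions. */
typedef struct {
    unsigned long remote_reads;   /* reads issued by subscribers */
    unsigned long remote_writes;  /* writes issued by subscribers */
    unsigned long n_subscribers;  /* nodes holding a cached copy under CW */
} workload_t;

/* Copy-on-write: reads are served from the local cache (no traffic).
 * Each write is one post to the publisher plus one re-post per subscriber. */
unsigned long cw_messages(workload_t w)
{
    assert(w.n_subscribers >= 1);
    return w.remote_writes * (1 + w.n_subscribers);
}

/* Copy-on-read: each read is an update request plus a post reply (2-way).
 * Each write is a single post to the publisher. */
unsigned long cr_messages(workload_t w)
{
    return w.remote_reads * 2 + w.remote_writes;
}

/* Returns 1 when copy-on-write generates fewer messages for this workload. */
int prefer_copy_on_write(workload_t w)
{
    return cw_messages(w) < cr_messages(w);
}
```

Under this model, 1000 reads and 2 writes with 2 subscribers cost 6 messages with copy-on-write against 2002 with copy-on-read; swapping the read and write counts reverses the preference.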

Choice of Place to Store

Case 1:
  … Frequently read(X) …
  … Frequently read(X) …
  Infrequently: write(X,1); write(X,2);
  Copy-on-write: local caching has far less traffic
  Put the master copy on the most frequent writer

Case 2:
  … Infrequently read(X) …
  … Infrequently read(X) …
  Frequently: write(X,1); write(X,2);
  Copy-on-read: no local caching has less traffic
  Put the master copy on the most frequent writer

What to do when reading and writing are both frequent?

Implementation
• Message Types
  • Publish: broadcast, tells everyone the location of the master copy
  • Subscribe: to publisher, tells publisher the location of the subscriber
  • Post: to publisher on a subscriber write
  • Post: to all subscribers in response to receiving a post (CW)
  • Update: to publisher, sent by subscriber to request a post (CR). Subscriber blocks the app, waiting for the post.
• Messages are not functions
• It is not the job of the transport layer to guarantee delivery; this is assumed. It is the job of the data link layer.
• What about order of delivery of these messages?
  • Is it important? Whose responsibility is it?
• Is it good to have subscribers known, or should we just broadcast all posts (CW)? What is the trade-off here?
• How would you implement a test-and-set operation (CW, CR)?
• Could there be a race to publish?
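The message types listed above might be represented and handled at the publisher roughly as follows. This is a minimal sketch, not the course's implementation: the field layout of `shm_msg_t`, the `publisher_t` bookkeeping, the `MAX_SUBS` limit, and the `publisher_handle` function are all hypothetical. It assumes the publisher keeps an explicit subscriber list and that, under copy-on-write, a post is re-posted to every subscriber, the writer included.

```c
#include <assert.h>
#include <stddef.h>

/* Message types named on the slide; the field layout is an assumption. */
typedef enum { MSG_PUBLISH, MSG_SUBSCRIBE, MSG_POST, MSG_UPDATE } msg_type_t;

typedef struct {
    msg_type_t type;
    int src;    /* sending node id */
    int dst;    /* receiving node id */
    int addr;   /* shared-variable address */
    int value;  /* payload; meaningful for MSG_POST only */
} shm_msg_t;

#define MAX_SUBS 8

typedef struct {
    int id;              /* node id of the publisher */
    int addr;            /* address of the variable it owns */
    int master;          /* master copy */
    int subs[MAX_SUBS];  /* known subscribers */
    size_t n_subs;
} publisher_t;

/* Handle one incoming message at the publisher. Outgoing messages are
 * written to out[]; the return value is how many were produced. */
size_t publisher_handle(publisher_t *p, shm_msg_t in, int copy_on_write,
                        shm_msg_t out[MAX_SUBS])
{
    assert(p != NULL && out != NULL);
    if (in.addr != p->addr)
        return 0;                        /* not our variable */

    switch (in.type) {
    case MSG_SUBSCRIBE:                  /* remember where the subscriber is */
        if (p->n_subs < MAX_SUBS)
            p->subs[p->n_subs++] = in.src;
        return 0;
    case MSG_POST:                       /* a subscriber wrote the variable */
        p->master = in.value;
        if (!copy_on_write)
            return 0;                    /* CR: readers ask when they need it */
        for (size_t i = 0; i < p->n_subs; i++)
            out[i] = (shm_msg_t){ MSG_POST, p->id, p->subs[i],
                                  p->addr, p->master };
        return p->n_subs;                /* CW: push to every cache */
    case MSG_UPDATE:                     /* CR: reply to the blocked reader */
        out[0] = (shm_msg_t){ MSG_POST, p->id, in.src, p->addr, p->master };
        return 1;
    default:                             /* MSG_PUBLISH is not for us */
        return 0;
    }
}
```

Because every write passes through `publisher_handle`, the publisher is also a natural place to serialize operations such as test-and-set, though that is not shown here.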

Synchronization
• Write atomicity
• Order (which is important)
• Consistency
• Actual time of write

Synchronization

What to compare it to: two tasks running in a shared memory space.

Case 1: no caching!

Task 1:
    for (i = 1; i < N; i++) {
        x = i;
        if (x >= N) error();
    }

Task 2:
    for (i = 1; i < N-1; i++) {
        x = i;
        if (x >= N) error();
    }

How are they different?

Case 2

Task 1:
    for (i = 1; i < N; i++) {
        lock();
        x = i;
        unlock();
        if (x >= N) error();
    }
    …

Task 2:
    for (i = 1; i < N-1; i++) {
        lock();
        x = i;
        unlock();
        if (x >= N) error();
    }
    …

What would we want the system to guarantee? At a minimum: no guarantees about the value of x at the end of the loop, but the tasks will eventually agree on the value.

Compare to copy-on-read (no cache)

Multiprocessor, no shared physical memory.

Case 2

Task 1:
    for (i = 1; i < N; i++) {
        lock();
        x = i;
        unlock();
        if (x >= N) error();
    }

Task 2:
    for (i = 1; i < N-1; i++) {
        lock();
        x = i;
        unlock();
        if (x >= N) error();
    }

Case 1

Assume copy-on-read: read blocks.

Task 1:
    for (i = 1; i < N; i++) {
        write(X, i);
        if (read(X) >= N) error();
    }

Task 2:
    for (i = 1; i < N-1; i++) {
        write(X, i);
        if (read(X) >= N) error();
    }

Now, will they eventually agree on the value of X at the end? Is this different in any way from the true shared memory case?

Network shared memory w/ caching

Case 2

Task 1:
    for (i = 1; i < N; i++) {
        lock();
        x = i;
        unlock();
        if (x >= N) error();
    }

Task 2:
    for (i = 1; i < N-1; i++) {
        lock();
        x = i;
        unlock();
        if (x >= N) error();
    }

Case 1

Assume copy-on-write: read returns the last value received on the network.

Task 1:
    for (i = 1; i < N; i++) {
        write(X, i);
        if (read(X) >= N) error();
    }

Task 2:
    for (i = 1; i < N-1; i++) {
        write(X, i);
        if (read(X) >= N) error();
    }

Corruption is not a problem: the actual local assignment takes place in the OS and is guaranteed atomic.

Local behavior is determined by the order in which “write” messages are received at each task.

Will they eventually agree on the value of X? Is it sufficient for the transport layer just to send all write messages in order?
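One way to explore the ordering question is a small deterministic model. The sketch below is an illustration under assumed conditions, not a result from the slides: it supposes that each sender's messages arrive in the order sent, but that two receivers may see different interleavings of the two senders. The arrival sequences and the names `write_msg_t`, `apply_in_order`, `direct_broadcast_agrees`, and `serialized_agrees` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int writer; int value; } write_msg_t;

/* A cache applies writes in arrival order, so it ends up holding the
 * value carried by the last message it received. */
int apply_in_order(const write_msg_t *arrivals, size_t n)
{
    assert(arrivals != NULL && n > 0);
    int cache = 0;
    for (size_t i = 0; i < n; i++)
        cache = arrivals[i].value;
    return cache;
}

/* Direct broadcast: Task 1 writes 10 then 11, Task 2 writes 20 then 21.
 * Each sender's order is preserved, but receivers A and B see different
 * interleavings. Returns 1 if both caches end with the same value. */
int direct_broadcast_agrees(void)
{
    const write_msg_t seen_by_a[] = { {1, 10}, {2, 20}, {2, 21}, {1, 11} };
    const write_msg_t seen_by_b[] = { {2, 20}, {1, 10}, {1, 11}, {2, 21} };
    return apply_in_order(seen_by_a, 4) == apply_in_order(seen_by_b, 4);
}

/* Serialized: every write goes to the publisher first, which re-posts
 * in the single order it received them, so both receivers see the same
 * sequence. Returns 1 if both caches end with the same value. */
int serialized_agrees(void)
{
    const write_msg_t publisher_order[] = { {1, 10}, {2, 20}, {2, 21}, {1, 11} };
    int cache_a = apply_in_order(publisher_order, 4);
    int cache_b = apply_in_order(publisher_order, 4);
    return cache_a == cache_b;
}
```

In this model per-sender ordering alone leaves receiver A holding 11 and receiver B holding 21, while routing every write through one publisher gives both the same final value.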

Other Ideas
• It is not the same as the shared memory cache coherency problem. We can send messages to each other.
• It is not necessary to lock every write
  • Single writer
  • Order vs. atomicity
• Can use a semaphore to protect critical sections
• Where does the error handling go…do we need ACK/NACK at the transport layer?
• Leases (publisher has to renew periodically)
• Is broadcast worse than sending only to a list of subscribers?

Homework
• Questions for Friday
  • Extend the protocol stack to support signal(var) and wait(var) system calls
    • wait:
      • If the value at the address is > 0, decrement and return true
      • If the value at the address is <= 0, block (on timeout, return false…be careful)
    • signal:
      • Increment var if the signaler is legitimate
  • Propose a scheme to ensure that the data link transmits messages in order with respect to each receiver

Implementation
• Application: responsible for the application semantics: what does the value of the shared variable mean?
• Transport: implements the “shared memory” interface by using the datalink system to send guaranteed messages between transport layers running on different processors. Might also implement FIFOs, semaphores, etc.
  • Exports to application: publish(addr); subscribe(addr); post(addr,var); update(addr,&var);
  • Exports to datalink: transport_recv(message);
• DataLink: guarantees error-free delivery of messages from one transport to another (in order?) using the available physical layer. Can implement a wide variety of retransmit schemes.
  • Exports to transport layer: datalink_send(message);
  • Exports to physical layer: datalink_recv(packet);
• Physical: converts a packet into a set of frames for transmission over the bus. Frames are reconstructed and passed back to the datalink layer at the other end. Knows how to drive the physical bus.
  • Exports to datalink layer: physical_send(packet);
  • Exports to physical layer: ISRs to deal with events on the bus such as start, stop, byte transmission
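The call chain between the layers above can be traced with a loopback sketch in which the "bus" simply hands each packet back to the same node. The function names come from the slide, but everything else is an assumption made for illustration: the argument and return types, the `shm_message_t` and `packet_t` layouts, the `SHM_SLOTS` table, and the simplification that `update` reads the local table rather than sending a request. `publish` and `subscribe` are omitted.

```c
#include <assert.h>
#include <string.h>

#define SHM_SLOTS 16

/* Assumed layouts: a transport message and the packet that carries it. */
typedef struct { int addr; int value; } shm_message_t;
typedef struct { unsigned char bytes[sizeof(shm_message_t)]; } packet_t;

static int shm_table[SHM_SLOTS];  /* receiver-side copies of shared variables */

void transport_recv(shm_message_t m);
void datalink_send(shm_message_t m);
void datalink_recv(packet_t p);
void physical_send(packet_t p);

/* Transport, exported to the application. */
void post(int addr, int var)
{
    shm_message_t m = { addr, var };
    datalink_send(m);
}

void update(int addr, int *var)
{
    assert(var != NULL && addr >= 0 && addr < SHM_SLOTS);
    *var = shm_table[addr];       /* simplification: no request is sent */
}

/* Transport, exported to the datalink: store the received value. */
void transport_recv(shm_message_t m)
{
    if (m.addr >= 0 && m.addr < SHM_SLOTS)
        shm_table[m.addr] = m.value;
}

/* Datalink: wrap the message in a packet on the way down, unwrap on the
 * way up. Retransmission and ordering are not modeled. */
void datalink_send(shm_message_t m)
{
    packet_t p;
    memcpy(p.bytes, &m, sizeof m);
    physical_send(p);
}

void datalink_recv(packet_t p)
{
    shm_message_t m;
    memcpy(&m, p.bytes, sizeof m);
    transport_recv(m);
}

/* Physical: a loopback stands in for framing and the bus ISRs. */
void physical_send(packet_t p)
{
    datalink_recv(p);
}
```

Calling `post(3, 99)` followed by `update(3, &v)` leaves `v` equal to 99 after the value has travelled down through `datalink_send` and `physical_send` and back up through `datalink_recv` and `transport_recv`.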

Example: The Fuel Cell Controller
