
Distributed Shared Memory Systems and Programming



  1. Distributed Shared Memory Systems and Programming By: Kenzie MacNeil Adapted from Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers by Barry Wilkinson and Michael Allen, and Distributed Shared Memory Systems

  2. Distributed Shared Memory Systems • Shared memory programming model on a cluster • Has physically distributed and separate memory • Programming viewpoint: • Memory is grouped together and sharable between processes • Known as Distributed Shared Memory (DSM)

  3. Distributed Shared Memory Systems • Can be achieved by software or hardware • Software: • Easy to use on clusters • Performance is generally inferior to explicit message passing on the same cluster • Utilizes the same techniques as true shared memory systems (Chapter 8)

  4. Distributed Shared Memory • Shared memory programming is generally more convenient than message passing • Data can be accessed by individual processors without explicitly sending data • Shared data has to be controlled • Locks or other means • Both message passing and shared memory often require synchronization

  5. Distributed Shared Memory • Distributed shared memory is a group of interconnected computers appearing to have a single memory with a single address space • Each computer has its own memory, which is physically distributed • Any memory location can be accessed by any processor in the cluster • Regardless of whether the memory resides locally

  6. Distributed Shared Memory

  7. Advantages of DSM • Normal shared memory programming techniques can be used • Easily scalable, compared to traditional bus-connected shared memory multiprocessors • Message passing is hidden from the user • Can handle complex and large databases without replication or sending the data to processes

  8. Disadvantages of DSM • Lower performance than true shared memory multiprocessor systems • Must provide for protection against simultaneous access to shared data • Locks, etc. • Little programmer control over the actual messages being generated • Incurs performance penalties when compared to message passing routines on a cluster

  9. Hardware DSM Systems • Special network interfaces and cache coherence circuits are required • Several interfaces that support shared memory operations • Higher level of performance • More expensive

  10. Software DSM Systems • Requires no hardware changes • Performed by software routines • Software layer added between the operating system and the applications • Kernel may or may not be modified • Software layer can be • Page based • Shared variable based • Object based

  11. Page Based DSM • Existing virtual memory is used to instigate movement of data between computers • Occurs when the page referenced does not reside locally (see the sketch below) • Referred to as a virtual shared memory system • Page based systems include: • The first DSM system by Li (1986), TreadMarks (1996), Locust (1998)
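
To make the virtual memory mechanism concrete, here is a minimal sketch in C, assuming a Unix-like mmap/mprotect/SIGSEGV facility: the DSM region is mapped with no access permissions, so the first touch of a non-resident page raises a fault, and the handler brings the page in before the access completes. The helper fetch_page_from_owner() is hypothetical; a real system such as TreadMarks would issue a network request to the page's owner at that point, and would also handle signal safety and write detection, which this sketch ignores.

```c
/* Minimal sketch of page-fault-driven DSM (hypothetical helper names). */
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static long page_size;
static char *shared_region;          /* local mapping of the DSM region */

/* Hypothetical: a real DSM system would request the page contents
 * from the owning node over the network here. */
static void fetch_page_from_owner(void *page_addr)
{
    memset(page_addr, 0, (size_t)page_size);  /* pretend the owner sent data */
}

static void fault_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    char *page = (char *)((uintptr_t)info->si_addr & ~(uintptr_t)(page_size - 1));

    /* Make the faulting page accessible, then pull in its contents. */
    mprotect(page, (size_t)page_size, PROT_READ | PROT_WRITE);
    fetch_page_from_owner(page);
}

int main(void)
{
    page_size = sysconf(_SC_PAGESIZE);

    /* Map the "shared" region with no access rights so that every
     * first touch faults and enters the handler above. */
    shared_region = mmap(NULL, (size_t)(4 * page_size), PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    shared_region[0] = 42;           /* faults; handler "fetches" the page */
    printf("value = %d\n", shared_region[0]);
    return 0;
}
```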

  12. Page Based DSM System

  13. Page Based DSM Disadvantages • Size of the unit of data, a page, can be too big • More than the specific data required is usually transferred • Leads to longer messages • Not portable, because they are tied to a particular virtual memory hardware and software • False sharing effects appear at the page level • A situation in which different processors require different parts of the same page without actually sharing any data, yet the whole page must still be moved between the processes (see the illustration below)
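
As an illustration of the layout that triggers page-level false sharing (the struct and names below are hypothetical), consider two counters that are logically independent but happen to fall on the same page; a page-based DSM system must then move or invalidate the whole page whenever either processor updates its own counter.

```c
/* Two logically independent counters that happen to share a page:
 * the classic trigger for page-level false sharing. */
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

struct shared_counters {
    int counter_a;   /* imagined to be written only by processor 0 */
    int counter_b;   /* imagined to be written only by processor 1 */
};

int main(void)
{
    static struct shared_counters c;     /* adjacent, so almost always on one page */
    long page = sysconf(_SC_PAGESIZE);
    int same_page = ((uintptr_t)&c.counter_a / (uintptr_t)page) ==
                    ((uintptr_t)&c.counter_b / (uintptr_t)page);

    /* In a page-based DSM system, every write to counter_a by one
     * processor would move or invalidate counter_b's page copy as well,
     * even though no value is ever actually shared. */
    printf("counters %s the same %ld-byte page\n",
           same_page ? "share" : "do not share", page);
    return 0;
}
```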

  14. Shared Variable DSM • Only variables declared as shared are transferred • Transferred on demand • Paging mechanism is not used • Software routines perform the actions • The shared variable DSM approach includes: • Munin (1990), JIAJIA (1999), Adsmith (1996)

  15. Object Based DSM • Shared data is embodied in objects • Includes data items and procedures/methods • Methods are used to access the data • Similar to the shared variable approach, and even considered an extension of it • Easily implemented in object-oriented languages

  16. Managing Shared Data • There are many ways a processor can be given access to shared data • The simplest is the use of a central server • Responsible for all read/write operations on the shared data • Requests are sent to this server and handled sequentially • Implements a single reader/single writer policy

  17. Managing Shared Data • The single reader/single writer policy incurs a bottleneck • Additional servers can be added to relieve this bottleneck by dividing the shared variables among them • However, multiple copies of the data are preferable • Allows simultaneous access to the data by different processors • A coherence policy must be used to keep these copies consistent

  18. Multiple Reader / Single Writer • Allows multiple processors to read shared data • Which can be achieved by replicating data • Allows only one processor, the owner, to alter data at any instant • When an owner alters data two policies are available: • Update policy • Invalidate policy

  19. Multiple Reader/Single Writer Policy • Update policy • The new value is broadcast and all copies are altered to reflect it • Invalidate policy • All other copies of the data are flagged as invalid • Requires a processor to request the data from the owner before its next access • Any copies of the data that are not accessed remain invalid • Both policies must be implemented reliably (a sketch of each follows below)
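
A hedged sketch of what the owner's write might do under each policy; the copy-set bookkeeping and the two send_* routines are stand-ins for real network messages, not the API of any particular DSM system.

```c
/* Owner's write path under the update and invalidate policies (sketch). */
#include <stdio.h>

#define MAX_NODES 8

typedef struct {
    int value;
    int copy_holders[MAX_NODES];  /* nodes currently holding a read copy */
    int n_copies;
} shared_var;

/* Stubs standing in for network messages. */
static void send_update(int node, int new_value)
{ printf("update node %d -> %d\n", node, new_value); }

static void send_invalidate(int node)
{ printf("invalidate node %d\n", node); }

/* Update policy: push the new value to every copy holder. */
void owner_write_update(shared_var *v, int new_value)
{
    v->value = new_value;
    for (int i = 0; i < v->n_copies; i++)
        send_update(v->copy_holders[i], new_value);
}

/* Invalidate policy: mark the other copies stale; holders must
 * re-request the value from the owner on their next access. */
void owner_write_invalidate(shared_var *v, int new_value)
{
    v->value = new_value;
    for (int i = 0; i < v->n_copies; i++)
        send_invalidate(v->copy_holders[i]);
    v->n_copies = 0;              /* owner now holds the only valid copy */
}

int main(void)
{
    shared_var x = { .value = 0, .copy_holders = {1, 2}, .n_copies = 2 };
    owner_write_update(&x, 5);
    owner_write_invalidate(&x, 6);
    return 0;
}
```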

  20. Multiple Reader/Single Writer Policy • Page based approach • The complete page, which holds the variable, is transferred • A variable stored on a page which is not itself shared will still be moved or invalidated with the page • Systems such as TreadMarks offer protocols that allow more than one process to write to a single page

  21. Achieving Consistent Memory in DSM • Memory consistency addresses when the current value of a shared variable is seen by other processors • Various models are available: • Strict Consistency • Sequential Consistency • Relaxed Consistency • Weak consistency • Release Consistency • Lazy Release Consistency

  22. Strict Consistency • The value of a variable is obtained from the most recent write to the shared variable • As soon as a variable is altered, all other processors are informed • Can be done with an update or an invalidate policy • Disadvantages are the large number of messages and that changes are not instantaneous • With relaxed memory consistency, writes are delayed to reduce message passing

  23. Strict Consistency

  24. Sequential and Weak Consistency • Sequential consistency: the result of any execution is the same as some interleaving of the individual programs • Weak consistency: synchronization operations are used by the programmer to enforce sequential consistency • Any accesses to shared data can be controlled with synchronization operations • Locks, etc.

  25. Release Consistency • An extension of weak consistency • Uses specific synchronization operations • Acquire operation, used before a shared variable or variables are to be read • Release operation, used after the shared variable or variables have been altered • Acquire is performed with a lock operation • Release is performed with an unlock operation (see the sketch below)
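
A minimal sketch of the acquire/release pattern, assuming a hypothetical pair of routines dsm_lock()/dsm_unlock() standing in for a DSM system's lock primitives; they are stubbed out here so the fragment compiles on its own. Under release consistency, updates made inside the critical section only need to be made visible by the release; under the lazy variant of the following slides, propagation is deferred to the next acquire.

```c
/* Acquire/release pattern under release consistency (sketch). */
#include <stdio.h>

static int shared_sum;              /* imagine this variable lives in DSM */

static void dsm_lock(int lock_id)   /* acquire: pull in other processors' updates */
{ printf("acquire lock %d\n", lock_id); }

static void dsm_unlock(int lock_id) /* release: make this processor's updates visible */
{ printf("release lock %d\n", lock_id); }

void add_to_shared_sum(int my_contribution)
{
    dsm_lock(0);                    /* acquire before reading/writing shared data */
    shared_sum += my_contribution;
    dsm_unlock(0);                  /* release after the shared data is altered */
}

int main(void)
{
    add_to_shared_sum(7);
    printf("shared_sum = %d\n", shared_sum);
    return 0;
}
```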

  26. Release Consistency

  27. Lazy Release Consistency • A version of release consistency • The update is only done at the time of acquire rather than at release • Generates fewer messages than release consistency

  28. Lazy Release Consistency

  29. Distributed Shared Memory Programming Primitives • Four fundamental and necessary operations of shared memory programming: • Process/thread creation and termination • Shared data creation • Mutual exclusion synchronization, i.e. controlled access to shared data • Process/thread and event synchronization • Typically provided by user-level library calls

  30. Process Creation • A set of routines is defined by DSM systems • Such as Adsmith and TreadMarks • Used to start new processes, if process creation is supported • dsm_spawn(filename, num_processes);

  31. Shared Data Creation • A routine is necessary to declare shared data • dsm_shared(&x); or shared int x; • Dynamically creates memory space for shared data in the manner of a C malloc • Afterwards, the memory space can be discarded (see the skeleton below)
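
Putting slides 29-31 together, here is a sketch of what a complete DSM program skeleton might look like. dsm_spawn() and dsm_shared() follow the generic names used on the slides; dsm_lock(), dsm_unlock(), and dsm_barrier() are hypothetical additions covering the remaining two primitives, and all five are stubbed so the sketch compiles on its own.

```c
/* Skeleton built from the four DSM programming primitives (sketch). */
#include <stdio.h>

static void dsm_spawn(const char *filename, int num_processes)
{ printf("spawn %d copies of %s\n", num_processes, filename); }

static void dsm_shared(void *addr)          /* register data as shared */
{ printf("shared variable registered at %p\n", addr); }

static void dsm_lock(int id)    { printf("lock %d\n", id); }
static void dsm_unlock(int id)  { printf("unlock %d\n", id); }
static void dsm_barrier(void)   { printf("barrier\n"); }

int main(void)
{
    int x = 0;                  /* will be treated as shared data */

    dsm_spawn("worker", 4);     /* 1. process/thread creation          */
    dsm_shared(&x);             /* 2. shared data creation             */

    dsm_lock(0);                /* 3. mutual exclusion around the      */
    x += 1;                     /*    access to the shared data        */
    dsm_unlock(0);

    dsm_barrier();              /* 4. process/event synchronization    */
    printf("x = %d\n", x);
    return 0;
}
```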

  32. Shared Data Access • Various forms of data access are provided, depending on the memory consistency used • Some systems provide efficient routines for different classes of accesses • Adsmith provides three types of accesses: • Ordinary Access • Synchronization Access • Non-Synchronization Access

  33. Synchronization Accesses • Two principal forms: • Global synchronization and process-process pair synchronization • Global synchronization is usually done through barrier routines • Process-process pair synchronization can be done by the same routine or by separate routines using simple synchronous send/receive operations • DSM systems could also provide their own routines (both forms are illustrated below)
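
For illustration, here are the two forms expressed with standard MPI calls, the kind of message passing layer a DSM system's own routines would sit on; the tag value and the choice of ranks 0 and 1 are arbitrary.

```c
/* Global barrier vs. process-process pair synchronization with MPI. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, nprocs, token = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Global synchronization: every process waits at the barrier. */
    MPI_Barrier(MPI_COMM_WORLD);

    /* Process-process pair synchronization: ranks 0 and 1 meet via a
     * simple synchronous send/receive exchange. */
    if (nprocs >= 2 && rank == 0) {
        MPI_Ssend(&token, 1, MPI_INT, 1, 99, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&token, 1, MPI_INT, 0, 99, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    if (rank == 0) printf("both synchronizations completed\n");
    MPI_Finalize();
    return 0;
}
```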

  34. Overlapping Computations with Communications • Can be provided by starting a nonblocking communication before its results are needed • Called a prefetch routine • The program continues execution after the prefetch has been called and while the data is being fetched • Could even be done speculatively • A special mechanism must be in place to handle memory exceptions • Similar to the speculative load mechanism used in advanced processors that overlap memory operations with program execution (see the sketch below)
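
Since prefetch routines differ from system to system, here is the same idea expressed with MPI nonblocking calls as an analogy: start the transfer early, keep computing, and wait only at the point where the data is actually used.

```c
/* Prefetch-style overlap of communication and computation using MPI. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, nprocs, remote_data = 0;
    long local_work = 0;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    if (nprocs >= 2 && rank == 1) {
        int value = 123;                     /* the "remote" shared data */
        MPI_Send(&value, 1, MPI_INT, 0, 7, MPI_COMM_WORLD);
    } else if (nprocs >= 2 && rank == 0) {
        /* "Prefetch": start fetching the remote value now ...          */
        MPI_Irecv(&remote_data, 1, MPI_INT, 1, 7, MPI_COMM_WORLD, &req);

        /* ... and overlap it with computation that does not need it.   */
        for (int i = 0; i < 1000000; i++)
            local_work += i;

        /* Block only at the point where the remote value is used.      */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        printf("remote_data = %d, local_work = %ld\n",
               remote_data, local_work);
    }

    MPI_Finalize();
    return 0;
}
```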

  35. Distributed Shared Memory Programming • DSM programming on a cluster uses the same concepts as shared memory programming on a shared memory multiprocessor system • Uses user level library routines or methods • Message passing is hidden from the user

  36. Basic Shared-Variable Implementation • The simplest DSM implementation is to use a shared variable approach with user-level DSM library routines • Sitting on top of an existing message passing system, such as MPI • The routines can be embodied in classes and methods • The routines could send messages to a central location that is responsible for the shared variables (see the sketch below)
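
A minimal sketch of such a layer over MPI, with rank 0 acting as the central server that owns all shared variables. The message format (operation, variable index, value) and the routine names dsm_read()/dsm_write() are assumptions for illustration, not the API of an existing library; error handling is omitted.

```c
/* Shared-variable DSM layer over MPI with a single central server (sketch). */
#include <mpi.h>
#include <stdio.h>

#define NUM_VARS  16
#define OP_READ   0
#define OP_WRITE  1
#define OP_STOP   2

static int dsm_read(int var)
{
    int msg[3] = { OP_READ, var, 0 }, value;
    MPI_Send(msg, 3, MPI_INT, 0, 0, MPI_COMM_WORLD);
    MPI_Recv(&value, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    return value;
}

static void dsm_write(int var, int value)
{
    int msg[3] = { OP_WRITE, var, value };
    MPI_Send(msg, 3, MPI_INT, 0, 0, MPI_COMM_WORLD);
}

static void server_loop(int nprocs)
{
    int vars[NUM_VARS] = {0}, msg[3], stopped = 0;
    MPI_Status st;

    while (stopped < nprocs - 1) {           /* serve until all clients quit */
        MPI_Recv(msg, 3, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &st);
        if (msg[0] == OP_READ)
            MPI_Send(&vars[msg[1]], 1, MPI_INT, st.MPI_SOURCE, 1,
                     MPI_COMM_WORLD);
        else if (msg[0] == OP_WRITE)
            vars[msg[1]] = msg[2];           /* requests handled one at a time */
        else
            stopped++;
    }
}

int main(int argc, char *argv[])
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    if (rank == 0) {
        server_loop(nprocs);                 /* rank 0 owns all shared variables */
    } else {
        dsm_write(3, rank * 10);             /* each client updates variable 3 */
        printf("rank %d read %d\n", rank, dsm_read(3));
        int stop[3] = { OP_STOP, 0, 0 };
        MPI_Send(stop, 3, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```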

  37. Simple DSM System using a Centralized Server (single reader/single writer protocol)

  38. Basic Shared-Variable Implementation • A simple DSM system using a centralized server can easily result in a bottleneck • One method to reduce this bottleneck is to have multiple servers running on different processors • Each server responsible for specific shared variables • This is a single reader / single writer protocol
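
One simple way for every process to agree on which server holds which variable is a fixed mapping from variable number to server rank; the modulo scheme below is an assumption for illustration, and any deterministic mapping known to all processes would do.

```c
/* Assigning shared variables to multiple servers with a fixed mapping. */
#include <stdio.h>

#define NUM_SERVERS 4        /* e.g. server ranks 0 .. NUM_SERVERS-1 */

static int server_for(int var)
{
    return var % NUM_SERVERS;   /* every process computes the same owner */
}

int main(void)
{
    for (int var = 0; var < 8; var++)
        printf("variable %d is managed by server %d\n", var, server_for(var));
    return 0;
}
```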

  39. Simple DSM System using Multiple Servers

  40. Basic Shared-Variable Implementation • Multiple reader capability can also be provided • A specific server remains responsible for each shared variable • Other local copies are invalidated when the variable is written

  41. Simple DSM System using Multiple Servers and Multiple Reader Policy

  42. Overlapping Data Groups • Data groups can be based on: • The existing interconnection structure • The access patterns of the application • Static overlapping: groups are defined by the programmer prior to execution • Alternatively, shared variables can migrate according to usage

  43. Symmetrical Multiprocessor System with Overlapping Data Regions

  44. Simple DSM System using Multiple Servers and Multiple Reader Policy

  45. Questions or Comments?
