
Distributed Process Management



  1. Distributed Process Management • Chapter 14 - 28 pages

  2. Process Migration • Move an active process from one machine to another • The process migrates to a target machine • Transferring a sufficient amount of the state of a process from one machine to another

  3. Why Migrate? • Load sharing • move processes from heavily loaded to lightly loaded systems • load can be balanced to improve overall performance • Communications performance • processes that interact intensively can be moved to the same node to reduce communications cost • when the data are large, it may be better to move the process to where the data reside

  4. Why Migrate? • Availability • a long-running process may need to move because the machine it is running on will be going down • Utilizing special capabilities • a process can take advantage of unique hardware or software capabilities

  5. Initiation of Migration • Operating system • when goal is load balancing • Process • when goal is to reach a particular resource

  6. What is Migrated? • Must destroy the process on the source system • Process control block and any links must be moved

  7. What is Migrated? • Eager (all): Transfer entire address space • no trace of the process is left behind • if the address space is large and the process does not need most of it, this approach may be unnecessarily expensive

  8. What is Migrated? • Precopy: Process continues to execute on the source node while the address space is copied • pages modified on the source during the precopy operation have to be copied a second time • reduces the time that a process is frozen and cannot execute during migration

  9. What is Migrated? • Eager (dirty): Transfer only the portion of the address space that is in main memory • any additional blocks of the virtual address space are transferred on demand • the source machine is involved throughout the life of the process • good if the process is temporarily going to another machine • good for migrating a thread, since the threads left behind need the same address space

  10. What is Migrated? • Copy-on-reference: Pages are only brought over on reference • variation of eager (dirty) • has lowest initial cost of process migration

  11. What is Migrated? • Flushing: Pages are cleared from main memory by flushing dirty pages to disk • later use copy-on-reference strategy • relieves the source of holding any pages of the migrated process in main memory
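The trade-off among these transfer strategies can be made concrete with a toy cost model. The function and page counts below are illustrative inventions, not from the chapter: they simply count how many pages each strategy moves while the process is frozen versus later on demand.

```python
# Toy cost model (illustrative, not from the chapter): pages moved at
# migration time versus pages faulted in on demand afterwards.

def migration_cost(total_pages, resident_dirty, referenced_later, strategy):
    """Return (pages moved while frozen, pages moved on demand later)."""
    if strategy == "eager-all":          # entire address space up front
        return total_pages, 0
    if strategy == "eager-dirty":        # only resident dirty pages up front
        return resident_dirty, len(referenced_later)
    if strategy == "copy-on-reference":  # nothing up front; fault pages in
        return 0, len(referenced_later)
    raise ValueError(strategy)

# Hypothetical process: 1000-page address space, 120 resident dirty pages,
# 40 pages actually referenced after migration.
for s in ("eager-all", "eager-dirty", "copy-on-reference"):
    print(s, migration_cost(1000, 120, range(40), s))
```

The model makes the slides' point visible: eager (all) maximizes freeze-time cost but leaves no residual dependence on the source, while copy-on-reference minimizes initial cost at the price of ongoing involvement by the source machine.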

  12. Negotiation of Process Migration (figure: machines 0-4, each running a kernel job KJ; process P on source machine S, destination machine D; two Starter processes negotiate the migration) • 1: Will you take P? • 2: Yes, migrate to machine 3 • 3: MigrateOut P • 4: Offer P • 5: Offer P • 6: MigrateIn P • 7: Accept offer

  13. Distributed Global States • The operating system cannot know the current state of all processes in the distributed system • A process can only know the current state of the processes on its local system • Remote processes are known only through state information received in messages • these messages represent the state in the past

  14. Example • Bank account is distributed over two branches • The total amount in the account is the sum at each branch • At 3 PM the account balance is determined • Messages are sent to request the information

  15. Example (figure: timelines for the two branches) • Branch A: SA = $100.00 • Branch B: SB = $0.00 • At 3:00, observed Total = $100.00

  16. Example • If, at the time of balance determination, the amount from branch A is in transit to branch B • the result is a false reading

  17. Example (figure) • At 2:59 Branch A sends msg = "Transfer $100 to Branch B", leaving SA = $0.00 • The message arrives at Branch B at 3:01, so SB = $0.00 at the 3:00 observation • Observed Total = $0.00

  18. Example • All messages in transit must be examined at time of observation • Total consists of balance at both branches and amount in message

  19. Example • If clocks at the two branches are not perfectly synchronized • Transfer amount at 3:01 from branch A • Amount arrives at branch B at 2:59 • At 3:00 the amount is counted twice

  20. Example (figure) • Branch A: SA = $100.00 at 3:00; sends msg = "Transfer $100 to Branch B" at 3:01 • Branch B's clock runs behind: the message arrives at 2:59, so SB = $100.00 at 3:00 • Observed Total = $200.00

  21. Example of a Snapshot
  Process 1 • Outgoing channels: to 2 sent 1,2,3,4,5,6; to 3 sent 1,2,3,4,5,6 • Incoming channels: none
  Process 2 • Outgoing channels: to 3 sent 1,2,3,4; to 4 sent 1,2,3,4 • Incoming channels: from 1 received 1,2,3,4, stored 5,6; from 3 received 1,2,3,4,5,6,7,8
  Process 3 • Outgoing channels: to 2 sent 1,2,3,4,5,6,7,8 • Incoming channels: from 1 received 1,2,3, stored 4,5,6; from 2 received 1,2,3, stored 4; from 4 received 1,2,3
  Process 4 • Outgoing channels: to 3 sent 1,2,3 • Incoming channels: from 2 received 1,2, stored 3,4
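The recorded channel states above can be checked mechanically. The sketch below (a hypothetical helper, not part of the chapter) encodes the slide's sent/received/stored figures and verifies the defining property of a consistent snapshot: on every channel, the messages sent but not yet received are exactly those recorded as in transit.

```python
# Channel states from the snapshot example, keyed (sender, receiver).
sent     = {(1, 2): 6, (1, 3): 6, (2, 3): 4, (2, 4): 4, (3, 2): 8, (4, 3): 3}
received = {(1, 2): 4, (1, 3): 3, (2, 3): 3, (2, 4): 2, (3, 2): 8, (4, 3): 3}
stored   = {(1, 2): [5, 6], (1, 3): [4, 5, 6], (2, 3): [4],
            (2, 4): [3, 4], (3, 2): [], (4, 3): []}

def consistent(sent, received, stored):
    """A snapshot is consistent if, on every channel, the messages sent
    but not yet received are exactly those recorded as stored in transit."""
    for ch, n_sent in sent.items():
        in_transit = list(range(received[ch] + 1, n_sent + 1))
        if stored[ch] != in_transit:
            return False
    return True

print(consistent(sent, received, stored))  # → True
```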

  22. Ordering of Events • Events must be ordered to ensure mutual exclusion and avoid deadlock • Clocks are not synchronized • Communication delays • State information for a process is not up to date

  23. Ordering of Events • Need to consistently say that one event occurs before another • Messages are sent when a process wants to enter a critical section and when it leaves the critical section • Time-stamping • orders events in a distributed system • the system clock is not used

  24. Time-Stamping • Each system on the network maintains a counter which functions as a clock • Each site has a numerical identifier • When a message is received, the receiving system sets its counter to one more than the maximum of its current value and the incoming time-stamp (counter)

  25. Time-Stamping • If two messages have the same time-stamp, they are ordered by their site identifiers • For this method to work, each message is sent from one process to all other processes • ensures all sites have the same ordering of messages • for mutual exclusion and deadlock handling, all processes must be aware of the situation
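A minimal sketch of the counter rule on the two time-stamping slides. The Site class is an illustrative construction; the send-side increment is the conventional companion rule for such logical clocks, not stated explicitly on the slides.

```python
# Minimal sketch (illustrative) of the time-stamping counter rule.
class Site:
    def __init__(self, site_id):
        self.site_id = site_id   # numerical identifier used to break ties
        self.counter = 0         # counter functioning as a clock

    def send(self):
        self.counter += 1
        return (self.counter, self.site_id)   # time-stamp carried by message

    def receive(self, stamp):
        # one more than the max of current value and incoming time-stamp
        self.counter = max(self.counter, stamp[0]) + 1

a, b = Site(1), Site(2)
m = a.send()     # stamped (1, 1)
b.receive(m)     # b's counter becomes max(0, 1) + 1 = 2
# Ties are broken by site identifier: stamp (3, 1) orders before (3, 2).
```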

  26. Token-Passing Algorithm

  Prelude:
    if not token_present then
    begin
      clock := clock + 1;
      broadcast(Request, clock, i);
      wait(access, token);
      token_present := true
    end;
    token_held := true;
    <critical section>

  Postlude:
    token(i) := clock;
    token_held := false;
    for j := i + 1 to n, 1 to i - 1 do
      if (request(j) > token(j)) and token_present then
      begin
        token_present := false;
        send(j, access, token)
      end;

  On receipt of a request:
    when received (Request, k, j) do
    begin
      request(j) := max(request(j), k);
      if token_present and not token_held then
        <text of postlude>
    end;

  Notation:
    send(j, access, token) - send message of type access, with token, to process j
    broadcast(Request, clock, i) - send message from process i of type Request, with time-stamp clock, to all other processes
    received(Request, t, j) - receive message from process j of type Request, with time-stamp t
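The heart of the postlude is the scan that decides where the token goes next. The helper below is a hypothetical illustration of just that test, using 1-indexed dictionaries whose names mirror the pseudocode's request and token arrays; the state values are invented for the example.

```python
def pass_token_to(i, n, request, token):
    """After process i leaves its critical section, scan j = i+1..n, 1..i-1
    and return the first process with an outstanding request
    (request[j] > token[j]), or None if the token stays where it is."""
    for j in list(range(i + 1, n + 1)) + list(range(1, i)):
        if request[j] > token[j]:
            return j
    return None

# Hypothetical state for n = 4 processes:
request = {1: 0, 2: 3, 3: 1, 4: 2}   # timestamp of each process's last request
token   = {1: 0, 2: 3, 3: 0, 4: 2}   # timestamp of each process's last CS exit
print(pass_token_to(1, 4, request, token))  # → 3 (first j with request > token)
```

Process 2's request (time-stamp 3) was already satisfied when the token last recorded 3 for it, so the scan passes over it and hands the token to process 3, whose request is still outstanding.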

  27. Distributed Deadlock Detection Algorithm

  {Transaction Tj receiving an update message}
  begin
    if Wait_for(Tj) ≠ Wait_for(Ti) then Wait_for(Tj) := Wait_for(Ti);
    if Wait_for(Tj) ∩ Request_Q(Tj) = nil
      then update(Wait_for(Ti), Request_Q(Tj))
    else
    begin
      DECLARE DEADLOCK;
      {initiate deadlock resolution as follows}
      {Tj is chosen as the transaction to be aborted}
      {Tj releases all the data objects it holds}
      send clear(Tj, Held_by(Tj));
      allocate each data object Di held by Tj to the first requester Tk in Request_Q(Tj);
      for every transaction Tn in Request_Q(Tj) requesting data object Di held by Tj do
        Enqueue(Tn, Request_Q(Tk))
    end
  end.

  {Data object Dj receiving a lock_request(Ti)}
  begin
    if Locked_by(Dj) = nil then send granted
    else
    begin
      send not granted to Ti;
      send Locked_by(Dj) to Ti
    end
  end.

  {Transaction Ti making a lock request for data object Dj}
  begin
    send lock_request(Ti) to Dj;
    wait for granted/not granted;
    if granted then
    begin
      Locked_by(Dj) := Ti;
      Held_by(Ti) := nil
    end
    else {suppose Dj is being used by transaction Tj}
    begin
      Held_by(Ti) := Tj;
      Enqueue(Ti, Request_Q(Tj));
      if Wait_for(Tj) = nil
        then Wait_for(Ti) := Tj
        else Wait_for(Ti) := Wait_for(Tj);
      update(Wait_for(Ti), Request_Q(Ti))
    end
  end.

  {Transaction Tk receiving a clear(Tj, Tk) message}
  begin
    purge the tuple having Tj as the requesting transaction from Request_Q(Tk)
  end.
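A toy illustration of the detection test at the heart of the algorithm. The transactions, queues, and helper below are invented for the example, with names mirroring the pseudocode's variables: Tj declares deadlock when the transaction at the head of its waiting chain (Wait_for) shows up in its own request queue.

```python
# Invented example state: Wait_for(T) is the transaction at the head of T's
# waiting chain; Request_Q(T) lists transactions queued waiting on T.
wait_for  = {"T1": None, "T2": "T1", "T3": "T1"}
request_q = {"T1": [],   "T2": ["T3"], "T3": []}

def deadlocked(tj):
    """Deadlock test from the update-message handler: is the transaction Tj
    transitively waits for itself queued in Request_Q(Tj)?"""
    return wait_for[tj] in request_q[tj]

# T2 waits (transitively) on T1. If T1 now enqueues behind T2, the chain
# closes into a cycle:
request_q["T2"].append("T1")
print(deadlocked("T2"))   # → True: Wait_for(T2) = T1 is in Request_Q(T2)
```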

  28. Example of Distributed Deadlock Algorithm (figure: before-and-after wait-for graphs over transactions T0-T6)
