
Failure Handling in a Modal Language

Nels Eric Beckman. Research Talk, Institute for Software Research, October 30, 2006. Claims made in this talk: ML5 is an elegant language for programming distributed systems. In the face of node failure, the meaning of ML5 programs becomes unclear.


Presentation Transcript


  1. Failure Handling in a Modal Language. Nels Eric Beckman. Research Talk, Institute for Software Research, October 30, 2006.

  2. Claims Made in this Talk
     • ML5 is an elegant language for programming distributed systems.
     • In the face of node failure, the meaning of ML5 programs becomes unclear.
     • We propose extensions to ML5 that make the meaning of such programs clear.
     • (In reality, this research is a work in progress.)
     Failure Handling in a Modal Language ISR

  3. ML5
     • A Programming Language for Distributed Systems
     • Based on a Modal Logic
       • i.e., a logic with an embedded notion of place
     • Tom Murphy’s Thesis Work
     • Targeted for Grid Programming

  4. ML5, Briefly...
     • Allows Hosts to Send ‘Thunks’ to One Another for Execution
       • In practice, code can be more cleanly decomposed.
     • Has an Advanced Type System
       • Location-specific resources can be typed as such.

  5-14. RPC-Style Distributed Programming (animated diagram, one build step per slide)
     Host a:
        fun a = rpc("b", 19.x.x.x) + r
     Host b:
        fun b = return x;
     Messages: rpc "b" (request, slide 8); ret x (reply, slides 12-14)
     Diagram labels: PC, Host, Active thread, Blocked thread, Message

  15-22. ML5 Illustration (animated diagram, one build step per slide)
     Diagram labels: PC, Host, Location of thread, Migration of thread

  23. Example
     • Remotely Finding List’s Sum (RPC)
     Server Code:
        class ListServ {
          List<Integer> myList = new ...
          List<Integer> getList() { return myList; }
        }

  24. Example
     • Remotely Finding List’s Sum (RPC)
     Client Code:
        class ListClient {
          ListServerStub myServ = new ...
          public void foo() {
            // Fetches the entire list over RPC just to add it up locally.
            List<Integer> list = myServ.getList();
            int count = 0;
            for (Integer item : list) {
              count += item.intValue();
            }
            if (count > 40) ...
          }
        }

  25. Example
     • Remotely Finding List’s Sum (RPC)
     • To avoid shipping the whole list to the client, should we:
       • Add a new server operation that returns true if a list’s sum is greater than 40?
         • Weird if the operation is only used once.
         • We wouldn’t structure the application this way in a centralized setting.
       • Bite the performance bullet and send the whole list?

  26. Example
     • Remotely Finding List’s Sum (ML5)
     Before:
        fun foo remote_host remote_list_ref =
          let
            fun sum a_list = foldl op+ 0 a_list
          in
            if sum ( get[remote_host]( !remote_list_ref ) ) > 40
            then true else false
          end

  27. Example
     • Remotely Finding List’s Sum (ML5)
     After:
        fun foo remote_host remote_list_ref =
          let
            fun sum a_list = foldl op+ 0 a_list
          in
            get[remote_host](
              if sum ( !remote_list_ref ) > 40
              then true else false )
          end
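The difference between the "Before" and "After" versions can be sketched in Java, the language of the RPC slides. This is only an in-process simulation under stated assumptions: `SimulatedHost`, `fetchList`, `evalAtHost`, and the `valuesSent` counter are hypothetical names invented for illustration, and nothing here is ML5 or a real RPC library. The counter stands in for network traffic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class ShipComputationDemo {
    static int sum(List<Integer> xs) {
        int total = 0;
        for (int x : xs) total += x;
        return total;
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(5, 10, 15, 20);

        // "Before": pull the list to the caller, then sum it locally.
        SimulatedHost before = new SimulatedHost(data);
        boolean r1 = sum(before.fetchList()) > 40;

        // "After": send the whole test to the host that owns the list.
        SimulatedHost after = new SimulatedHost(data);
        Function<List<Integer>, Boolean> overForty = xs -> sum(xs) > 40;
        boolean r2 = after.evalAtHost(overForty);

        System.out.println(r1 + " after sending " + before.valuesSent() + " values");
        System.out.println(r2 + " after sending " + after.valuesSent() + " value");
    }
}

// Hypothetical stand-in for a remote host; it lives in the same process.
class SimulatedHost {
    private final List<Integer> list;   // plays the role of remote_list_ref
    private int valuesSent = 0;         // values that crossed the simulated network

    SimulatedHost(List<Integer> list) {
        this.list = new ArrayList<>(list);
    }

    // The whole list travels to the caller.
    List<Integer> fetchList() {
        valuesSent += list.size();
        return new ArrayList<>(list);
    }

    // The computation travels to the data; only its result comes back.
    <R> R evalAtHost(Function<List<Integer>, R> computation) {
        R result = computation.apply(list);
        valuesSent += 1;
        return result;
    }

    int valuesSent() {
        return valuesSent;
    }
}
```

Both calls produce the same answer, but the first moves every list element while the second moves a single boolean, and the client code keeps the shape it would have in a centralized program.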

  28. Types
     • ML5 Type System Embeds a Notion of Place
     • Some values can be used at any place.
       • e.g., primitive data types, structures
     • Some values can only be used at the location where they make sense.
       • e.g., file descriptors, reference cells, printers

  29. Just a Few Types…
     • τ@w – “The type τ is well-typed on host w.”

  30-32. Just a Few Types…
     • get[w’,a]e – “Evaluate e on host w’ and return the result to the current host. Change e’s type from @w’ to @w.”
     • Example:
        fun foo (x: int ref @w’, a: w’ addr @w) =
          get[w’,a]( !x + !x )
     • Annotation on slide 31: typed int@w’. Annotation on slide 32: typed int@w.
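One rough way to get a feel for "τ @ w" and for how get relabels a result is to tag values with a phantom type parameter. The Java sketch below is an assumption-laden analogy, not ML5: `World`, `W1`, `W2`, `Here`, `Addr`, `At`, and `Modal.get` are all hypothetical names, the "hosts" share one process, and Java generics cannot express ML5's restriction that only mobile types such as int may cross hosts.

```java
import java.util.function.Function;

class ModalDemo {
    // Mirrors: fun foo (x: int ref @w', a: w' addr @w) = get[w',a]( !x + !x )
    static At<Integer, W1> foo(Here<W1> w, Addr<W2> a, At<int[], W2> x) {
        return Modal.get(w, a, there -> x.use(there)[0] + x.use(there)[0]);
    }

    public static void main(String[] args) {
        Here<W1> w = new Here<>();
        At<int[], W2> cell = new At<>(new int[] {21});  // one-element array as a reference cell
        At<Integer, W1> result = foo(w, new Addr<W2>(), cell);
        System.out.println(result.use(w));
        // cell.use(w) would not compile: the cell only makes sense at W2.
    }
}

interface World {}
final class W1 implements World {}  // the calling host, w
final class W2 implements World {}  // the remote host, w'

// Evidence that code is currently running at world W (constructor left open for brevity).
final class Here<W extends World> {
    Here() {}
}

// Address of world V, held by some other world.
final class Addr<V extends World> {
    Addr() {}
}

// A value of type T that is only meaningful at world W.
final class At<T, W extends World> {
    private final T value;

    At(T value) {
        this.value = value;
    }

    // Unwrapping requires evidence of being at W.
    T use(Here<W> here) {
        return value;
    }
}

final class Modal {
    // Sketch of get[w', a] e: run the body "at" V, then relabel its result from @V to @W.
    static <T, W extends World, V extends World> At<T, W> get(
            Here<W> caller, Addr<V> addr, Function<Here<V>, T> body) {
        T result = body.apply(new Here<V>());
        return new At<>(result);
    }
}
```

The point of the encoding is that the mistake of dereferencing the remote cell locally becomes a compile-time error, which is the kind of guarantee the slides attribute to ML5's type system.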

  33. Just a Few Types…
     • □τ – “Suspended code that can be evaluated anywhere. Produces a value of type τ.”
     • Example:
        (let fun sum il = foldl op+ 0 il
         in box (sum [1,2,3,4,5]) end): □int @w

  34. Just a Few Types…
     • ◊τ – “A value of type τ that exists at some other location.”
     • Example:
        here (ref 5): ◊(int ref) @w

  35-40. But What About Host Failure?
     • What happens here?
        (* at host 1 *)
        get[w_2, a_2](
          (* at host 2 *)
          !int_ref_at_w_2 +
          get[w_3, a_3](
            (* at host 3 *)
            !int_ref_at_w_3))
     • Host 2 dies! (slide 36)
     • Throw an exception? (slide 37)
     • Continue on from Host 3? (slide 38)
     • Slides 39-40 add a fallback clause to the inner get and mark host 2 as gone:
        (* at host 1 *)
        get[w_2, a_2](
          (* at host 2 WHICH DOESN’T EXIST! *)
          !int_ref_at_w_2 +
          get[w_3, a_3](
            (* at host 3 *)
            !int_ref_at_w_3
            or_if_i_cant_return (...)))

  41-43. What We Want (Intuitively)
        callcc x =>
          (* at host 1 *)
          get[w_2, a_2](
            (* at host 2 *)
            !int_ref_at_w_2 +
            get[w_3, a_3](
              (* at host 3 *)
              !int_ref_at_w_3
              or_if_i_cant_return (throw (raise NetFail) to x)))
     • Don’t actually throw something through the network. (slide 42)
     • Have host one detect the failure. (slide 43)

  44. Isn’t This Just a ‘Timeout’ Exception?
     • A Good Question:
       • “Why not just have the ‘get’ operation throw a timeout exception, like in Java?”
       • e.g.
          get[w_2, a_2] ( !int_on_w2 )
          handle TimeOut => (* do something *)

  45-47. Answers
     • This is actually a little smarter than just ‘timeout.’
     • The ‘Implicit Spawn’ Problem
        get[w_2, a_2] ( (* extremely complicated op *) )
        handle TimeOut => (* do something *)
     • Diagram labels (slide 47): T1, T2
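One reading of the ‘implicit spawn’ problem is that a timeout lets the caller (T1) move on while the remote work (T2) keeps running, so one logical thread quietly becomes two. The Java sketch below illustrates that reading with the standard `java.util.concurrent` API. It is an assumption-based analogy: a single-thread executor stands in for host 2, and `effectSurvivesTimeout` is a hypothetical name.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

class ImplicitSpawnDemo {
    // Returns true if the "remote" effect still happened after the caller timed out.
    static boolean effectSurvivesTimeout() throws Exception {
        ExecutorService remoteHost = Executors.newSingleThreadExecutor();  // stands in for host 2
        AtomicBoolean sideEffect = new AtomicBoolean(false);
        CountDownLatch callerGaveUp = new CountDownLatch(1);

        // T2: the "extremely complicated op" on the other host.
        Future<Integer> reply = remoteHost.submit(() -> {
            callerGaveUp.await();   // still busy when the caller's patience runs out
            sideEffect.set(true);   // the effect lands after T1 has stopped caring
            return 42;
        });

        boolean timedOut = false;
        try {
            reply.get(50, TimeUnit.MILLISECONDS);  // T1 waits, as in: get[...] handle TimeOut
        } catch (TimeoutException e) {
            timedOut = true;        // T1 carries on; nothing has told T2 to stop
        }
        callerGaveUp.countDown();

        remoteHost.shutdown();
        remoteHost.awaitTermination(5, TimeUnit.SECONDS);
        return timedOut && sideEffect.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(effectSurvivesTimeout());
    }
}
```

The timeout handler runs, yet the remote effect still takes place, which is why a bare timeout exception does not settle what the program means after a failure.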

  48. What We Need
     • Share the Fact that Host 1 Has ‘Given Up’
     • Kill the Thread ASAP
     • Make That Thread’s Actions Irrelevant
       • Each host gets a chance to ‘undo’ potential effects.
     • All with ‘Best Effort’
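These requirements can be loosely mimicked inside one JVM with `Future.cancel(true)`, which delivers an interrupt on a best-effort basis. The sketch below is an analogy under stated assumptions, not the talk's proposed ML5 extension: the executor stands in for a remote host, the log entries "effect", "commit", and "undo" are hypothetical, and a real distributed version would have to carry the give-up signal across the network.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class GiveUpDemo {
    static List<String> run() throws Exception {
        ExecutorService remoteHost = Executors.newSingleThreadExecutor();  // stands in for a remote host
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch started = new CountDownLatch(1);

        Future<?> reply = remoteHost.submit(() -> {
            log.add("effect");            // tentative effect on the remote host
            started.countDown();
            try {
                Thread.sleep(10_000);     // long-running remote work
                log.add("commit");
            } catch (InterruptedException e) {
                log.add("undo");          // the host's chance to undo its effect
            }
        });

        started.await();
        try {
            reply.get(50, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Share the fact that the caller has given up; the interrupt is best effort.
            reply.cancel(true);
        }

        remoteHost.shutdown();
        remoteHost.awaitTermination(5, TimeUnit.SECONDS);
        return new ArrayList<>(log);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

Here the worker records an effect, is interrupted once the caller gives up, and undoes the effect instead of committing it. Cooperative interruption only works when the remote code checks for it, which matches the slide's ‘best effort’ caveat.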

  49. One More Wrinkle
     Diagram labels: Grab ‘continuation’, Catom 1, Catom 2

  50. One More Wrinkle
     Diagram labels: Assign ‘Catom1’ to ‘myLeader’, Catom 1, Catom 2
