1 / 43

Remote Procedure Calls

Remote Procedure Calls. EE324. Administrivia. Course feedback Midterm plan Reading material/textbook/slides are updated. Computer Systems: A Programmer's Perspective , by Bryant and O'Hallaron Some reading material (URL) in the slides. (lecture 9). Building up to today.

alexa
Download Presentation

Remote Procedure Calls

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Remote Procedure Calls EE324

  2. Administrivia • Course feedback • Midterm plan • Reading material/textbook/slides are updated. • Computer Systems: A Programmer's Perspective, by Bryant and O'Hallaron • Some reading material (URL) in the slides. (lecture 9)

  3. Building up to today • Abstractions for communication • Internetworking protocol: IP • TCP masks some of the pain of communicating across unreliable IP • Abstractions for computation and I/O • Process: A resource container for execution on a single machine • Thread:pthread and synchronization primitives (sem, mutex, cv) • File

  4. Now back to Distributed Systems: remember? • A distributed system is: “A collection of independent computers that appears to its users as a single coherent system” "A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." – Leslie Lamport

  5. Distributed Systems • The middleware layer extends over multiple machines, and offers each application the same interface.

  6. What does it do? Hide complexity to programmers/users Hide the fact that its processes and resources are physically distributed across multiple machines. Transparency in a Distributed System

  7. How? • The middleware layer extends over multiple machines, and offers each application the same interface.

  8. Starter for Today • Splitting computation across the network What programming abstractions work well to split work among multiple networked computers?

  9. Many ways • Request-reply protocols • Remote procedure calls (RPC) • Remote method invocation (RMI) Recommended reading : Distributed Systems: Concepts and Design 5th edition, by Coulouris, et al. (CDK5) chapter 5

  10. Request-reply protocols Client Server Hey, do something working { Done/Result

  11. Request-reply protocols • eg, your PA1 (binary protocol) structfoomsg { u_int32_t len; } send_foo(char *contents) { intmsglen = sizeof(structfoomsg) + strlen(contents); char buf = malloc(msglen); structfoomsg *fm = (structfoomsg *)buf; fm->len = htonl(strlen(contents)); memcpy(buf + sizeof(structfoomsg), contents, strlen(contents)); write(outsock, buf, msglen); } Then wait for response, handle timeout, etc.

  12. Request-reply protocols: text protocol • HTTP • See the HTTP/1.1 standard: • http://www.w3.org/Protocols/rfc2616/rfc2616.html • Done with your PA2 yet?

  13. Remote Procedure Call (RPC) • A type of client/server communication • Attempts to make remote procedure calls look like local ones { ... foo() } void foo() { invoke_remote_foo() } figure from Microsoft MSDN

  14. RPC Goals • Ease of programming • Hide complexity • Automate a lot of task of implementing • Familiar model for programmers (just make a function call) Historical note: Seems obvious in retrospect, but RPC was only invented in the ‘80s. See Birrell & Nelson, “Implementing Remote Procedure Call” ... or Bruce Nelson, Ph.D. Thesis, Carnegie Mellon University: Remote Procedure Call., 1981 :)

  15. Remote procedure call • A remote procedure call makes a call to a remote service look like a local call • RPC makes transparent whether server is local or remote • RPC allows applications to become distributed transparently • RPC makes architecture of remote machine transparent

  16. RPC • The interaction between client and server in a traditional RPC.

  17. Passing Value Parameters (1) • The steps involved in a doing a remote computation through RPC.

  18. But it’s not always simple • Calling and called procedures run on different machines, with different address spaces • And perhaps different environments .. or operating systems .. • Must convert to local representation of data • Machines and network can fail

  19. Marshaling and Unmarshaling • (From example) hotnl() -- “host to network-byte-order, long”. • network-byte-order (big-endian) standardized to deal with cross-platform variance • Note how we arbitrarily decided to send the string by sending its length followed by L bytes of the string? That’s marshalling, too. • Floating point... • Nested structures? (Design question for the RPC system - do you support them?) • Complex datastructures? (Some RPC systems let you send lists and maps as first-order objects)

  20. “stubs” and IDLs • RPC stubs do the work of marshaling and unmarshaling data • But how do they know how to do it? • Typically: Write a description of the function signature using an IDL -- interface definition language. • Lots of these. Some look like C, some look like XML, ... details don’t matter much.

  21. SunRPC • Venerable, widely-used RPC system • Defines “XDR” (“eXternal Data Representation”) -- C-like language for describing functions -- and provides a compiler that creates stubs structfooargs { string msg<255>; intbaz; }

  22. And describes functions program FOOPROG { version VERSION { void FOO(fooargs) = 1; void BAR(barargs) = 2; } = 1; } = 9999;

  23. More requirements • Provide reliable transmission (or indicate failure) • May have a “runtime” that handles this • Authentication, encryption, etc. • Nice when you can add encryption to your system by changing a few lines in your IDL file • (it’s never really that simple, of course -- identity/key management)

  24. RPC vs. LPC • Memory access • Partial failures • Latency

  25. But it’s not always simple • Properties of distributed computing that make achieving transparency difficult: • Calling and called procedures run on different machines, with different address spaces • Machines and network can fail • Latency

  26. Passing Reference Parameters • Replace with pass by copy/restore • Need to know size of data to copy • Difficult in some programming languages • Solves the problem only partially • What about data structures containing pointers? • Access to memory in general?

  27. Partial failures • In local computing: • if machine fails, application fails • In distributed computing: • if a machine fails, part of application fails • one cannot tell the difference between a machine failure and network failure • How to make partial failures transparent to client?

  28. RPC failures • Request from cli -> srv lost • Reply from srv -> cli lost • Server crashes after receiving request • Client crashes after sending request

  29. Strawman solution • Make remote behavior identical to local behavior: • Every partial failure results in complete failure • You abort and reboot the whole system • You wait patiently until system is repaired • Problems with this solution: • Many catastrophic failures • Clients block for long periods • System might not be able to recover

  30. Real solution: break transparency • Possible semantics for RPC: • Exactly-once • Impossible in practice • At least once: • Only for idempotent operations • At most once • Zero, don’t know, or once • Zero or once • Transactional semantics • At-most-once most practical • But different from LPC

  31. RPC semantics: Exactly-Once? • Sorry - no can do in general. • Imagine that message triggers an external physical thing (say, a robot fires a missle) • The robot could crash immediately before or after firing and lose its state. Don’t know which one happened. Can, however, make this window very small.

  32. RPC semantics • At-least-once semantics • Keep retrying... • At-most-once • Use a sequence # to ensure idempotency against network retransmissions • and remember it at the server

  33. Implementing at-most-once • At-least-once: Just keep retrying on client side until you get a response. • Server just processes requests as normal, doesn’t remember anything. Simple! • At-most-once: Server might get same request twice... • Must re-send previous reply and not process request (implies: keep cache of handled requests/responses) • Must be able to identify requests • Strawman: remember all RPC IDs handled. -> Ugh! Requires infinite memory. • Real: Keep sliding window of valid RPC IDs, have client number them sequentially.

  34. Summary: expose remoteness to client • Expose RPC properties to client, since you cannot hide them • Application writers have to decide how to deal with partial failures • Consider: E-commerce application vs. game

  35. RPC implementation issues

  36. RPC implementation • Stub compiler • Generates stubs for client and server • Language dependent • Compile into machine-independent format • E.g., XDR • Format describes types and values • RPC protocol • RPC transport

  37. Writing a Client and a Server (1) • The steps in writing a client and a server in DCE RPC.

  38. Writing a Client and a Server (2) • Three files output by the IDL compiler: • A header file (e.g., interface.h, in C terms). • The client stub. • The server stub.

  39. RPC protocol • Guarantee at-most-once semantics by tagging requests and response with a nonce • RPC request header: • Request nonce • Service Identifier • Call identifier • Protocol: • Client resends after time out • Server maintains table of nonces and replies

  40. RPC transport • Use reliable transport layer • Flow control • Congestion control • Reliable message transfer • Combine RPC and transport protocol • Reduce number of messages • RPC response can also function as acknowledgement for message transport protocol

  41. Performance • As a general library, performance is often a big concern for RPC systems • Major source of overhead: copies and marshaling/unmarshaling overhead • Zero-copy tricks: • Representation: Send on the wire in native format and indicate that format with a bit/byte beforehand. What does this do? Think about sending uint32 between two little-endian machines • Scatter-gather writes (writev() and friends)

  42. Complex / Pointer Data Structures • Very few low-level RPC systems support • C is messy about things like that -- can’t always understand the structure and know where to stop chasing • Java RMI (and many other higher-level languages) allows sending objects as part of an RPC • But be careful - don’t want to send megabytes of data across network to ask simple question!

  43. Important Lessons • Procedure calls • Simple way to pass control and data • Elegant transparent way to distribute application • Not only way… • Hard to provide true transparency • Failures • Performance • Memory access • Etc. • How to deal with hard problem  give up and let programmer deal with it • “Worse is better”

More Related