Application Layer
Goals:
• conceptual + implementation aspects of network application protocols
• client server paradigm
• service models
• learn about protocols by examining popular application-level protocols
More goals:
• specific protocols: http (our focus), ftp, smtp, pop, dns (done!)
• programming network applications: socket programming (done!)
Applications and application-layer protocols
Application: communicating, distributed processes
• running in network hosts in “user space”
• exchange messages to implement app
• e.g., email, file transfer, the Web
Application-layer protocols
• one “piece” of an app
• define messages exchanged by apps and actions taken
• use services provided by lower layer protocols
[Figure: three protocol stacks, each application / transport / network / data link / physical]
Network applications: some jargon
• A process is a program that is running within a host.
• Within the same host, two processes communicate with interprocess communication defined by the OS.
• Processes running in different hosts communicate with an application-layer protocol.
• A user agent is an interface between the user and the network application.
  • Web: browser
  • E-mail: mail reader
  • streaming audio/video: media player
Client-server paradigm
Typical network app has two pieces: client and server
Client:
• initiates contact with server (“speaks first”)
• typically requests service from server
• for Web, client is implemented in browser; for e-mail, in mail reader, e.g., Outlook
Server:
• provides requested service to client
• e.g., Web server sends requested Web page, mail server delivers e-mail
[Figure: client and server, each with the stack application / transport / network / data link / physical, exchanging request and reply]
Application-layer protocols (cont.)
API: application programming interface
• defines interface between application and transport layer
• socket: Internet API
• two processes communicate by sending data into socket, reading data out of socket
Q: how does a process “identify” the other process with which it wants to communicate?
• IP address of host running other process
• “port number”: allows receiving host to determine to which local process the message should be delivered
… refer to socket programming
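As a rough illustration of the addressing idea on this slide, the Python sketch below runs a one-shot TCP server and a client on the loopback interface. The function name `echo_once` and the use of port 0 (which asks the OS for any free port) are assumptions made for the example; they do not come from the slides.

```python
import socket
import threading

def echo_once(message: bytes) -> bytes:
    """Send `message` to a one-shot loopback echo server and return the reply."""
    # Server socket: bind to loopback; port 0 lets the OS choose a free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    host, port = server.getsockname()  # (IP address, port) identifies the server process

    def serve() -> None:
        conn, _addr = server.accept()
        with conn:
            conn.sendall(conn.recv(1024))  # read from the socket, write the same bytes back

    worker = threading.Thread(target=serve)
    worker.start()

    # Client: names the other process by IP address and port number.
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(message)
        reply = client.recv(1024)

    worker.join()
    server.close()
    return reply
```

Calling `echo_once(b"hello")` sends five bytes into the client socket and reads the echoed bytes back out of it.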
What transport service does an app need?
Data loss
• some apps (e.g., audio) can tolerate some loss
• other apps (e.g., file transfer, telnet) require 100% reliable data transfer
Timing
• some apps (e.g., Internet telephony, interactive games) require low delay to be “effective”
Bandwidth
• some apps (e.g., multimedia) require minimum amount of bandwidth to be “effective”
• other apps (“elastic apps”) make use of whatever bandwidth they get
Transport service requirements of common apps
Application             Data loss       Bandwidth                          Time sensitive
file transfer           no loss         elastic                            no
e-mail                  no loss         elastic                            no
Web documents           loss-tolerant   elastic                            no
real-time audio/video   loss-tolerant   audio: 5Kb-1Mb, video: 10Kb-5Mb    yes, 100’s msec
stored audio/video      loss-tolerant   same as above                      yes, few secs
interactive games       loss-tolerant   few Kbps up                        yes, 100’s msec
financial apps          no loss         elastic                            yes and no
Services provided by Internet transport protocols
TCP service:
• connection-oriented: setup required between client, server
• reliable transport between sending and receiving process
• flow control: sender won’t overwhelm receiver
• congestion control: throttle sender when network overloaded
• does not provide: timing, minimum bandwidth guarantees
UDP service:
• unreliable data transfer between sending and receiving process
• does not provide: connection setup, reliability, flow control, congestion control, timing, or bandwidth guarantee
Q: why bother? Why is there a UDP?
Internet apps: their protocols and transport protocols
Application              Application layer protocol         Underlying transport protocol
e-mail                   smtp [RFC 821]                     TCP
remote terminal access   telnet [RFC 854]                   TCP
Web                      http [RFC 2068]                    TCP
file transfer            ftp [RFC 959]                      TCP
streaming multimedia     proprietary (e.g. RealNetworks)    TCP or UDP
remote file server       NFS                                TCP or UDP
Internet telephony       proprietary (e.g., Vocaltec)       typically UDP
The Web: some jargon
• Web page: consists of “objects”, each addressed by a URL
• Most Web pages consist of: base HTML page, and several referenced objects.
• URL has two components: host name and path name, e.g. www.someSchool.edu/someDept/pic.gif (host name www.someSchool.edu, path name /someDept/pic.gif)
• User agent for Web is called a browser: MS Internet Explorer, Netscape Communicator, Firefox
• Server for Web is called Web server: Apache (open source), MS Internet Information Server
The Web: the http protocol
http: hypertext transfer protocol
• Web’s application layer protocol
• client/server model
  • client: browser that requests, receives, “displays” Web objects
  • server: Web server sends objects in response to requests
• http1.0: RFC 1945
• http1.1: RFC 2068
[Figure: a PC running Explorer and a Mac running Navigator each exchange http request / http response messages with a server running NCSA Web server]
The http protocol: more
http: TCP transport service:
• client initiates TCP connection (creates socket) to server, port 80
• server accepts TCP connection from client
• http messages (application-layer protocol messages) exchanged between browser (http client) and Web server (http server)
• TCP connection closed
http is “stateless”
• server maintains no information about past client requests
aside: Protocols that maintain “state” are complex!
• past history (state) must be maintained
• if server/client crashes, their views of “state” may be inconsistent, must be reconciled
HTTP Usage • HTTP is the protocol that supports communication between web browsers and web servers. • A “Web Server” is an HTTP server • Most clients/servers today speak version 1.1, but 1.0 is also in use.
From the RFC “HTTP is an application-level protocol with the lightness and speed necessary for distributed, hypermedia information systems.”
http 1.0 example
Suppose user enters URL www.someSchool.edu/someDepartment/home.index (contains text, references to 10 jpeg images)
1a. http client initiates TCP connection to http server (process) at www.someSchool.edu. Port 80 is default for http server.
1b. http server at host www.someSchool.edu waiting for TCP connection at port 80. “accepts” connection, notifying client
2. http client sends http request message (containing URL) into TCP connection socket
3. http server receives request message, forms response message containing requested object (someDepartment/home.index), sends message into socket
4. http server closes TCP connection.
5. http client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects
6. Steps 1-5 repeated for each of 10 jpeg objects
Non-persistent and persistent connections
Non-persistent
• HTTP/1.0
• server parses request, responds, and closes TCP connection
• 2 RTTs to fetch each object
• Each object transfer suffers from slow start
• But most 1.0 browsers use parallel TCP connections.
Persistent
• default for HTTP/1.1
• on same TCP connection: server parses request, responds, parses new request, ...
• Client sends requests for all referenced objects as soon as it receives base HTML.
• Fewer RTTs and less slow start.
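A back-of-the-envelope comparison for the page with 10 jpeg images from the earlier example. The model is an assumption made for illustration: it counts only round trips (one RTT to open a TCP connection, one RTT per request/response), ignores transmission time and slow start, fetches objects one after another in the non-persistent case, and assumes the persistent client sends all image requests together so they share one RTT. The function names are hypothetical.

```python
def rtts_non_persistent(num_objects: int) -> int:
    """Serial HTTP/1.0: each object costs 1 RTT (TCP setup) + 1 RTT (request/response)."""
    return 2 * num_objects

def rtts_persistent(num_referenced: int) -> int:
    """One connection: 1 RTT setup + 1 RTT for the base page + 1 RTT for all referenced objects."""
    extra = 1 if num_referenced > 0 else 0
    return 2 + extra

base_plus_images = 1 + 10
print(rtts_non_persistent(base_plus_images))  # 22
print(rtts_persistent(10))                    # 3
```

Parallel TCP connections, as used by most 1.0 browsers, would narrow this gap; the sketch deliberately leaves them out.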
A Typical HTTP Session
• User types “www.seattleu.edu/index.html” into a browser
• Browser translates www.seattleu.edu into an IP address and tries to open a TCP connection with port 80 of that address
• Once a connection is established, the browser sends the following byte stream:
  GET /index.html HTTP/1.1
  Host: www.seattleu.edu
  (plus an empty line below)
• The host responds with
  • a set of headers indicating which protocol is actually being used, whether or not the file requested was found, how many bytes are contained in that file, and what kind of information is contained in the file ("MIME" type)
  • a blank line to indicate the end of the headers
  • the contents of the page
A Typical HTTP Session • If the browser finds images embedded in the page, it starts a separate request for each image … • The TCP connection is kept alive a bit longer, then closes if no further requests are received from the browser
Request - Response • HTTP has a simple structure: • client sends a request • server returns a reply. • HTTP can support multiple request-reply exchanges over a single TCP connection.
Well Known Address • The “well known” TCP port for HTTP servers is port 80. • Other ports can be used as well...
HTTP Versions • The original version now goes by the name “HTTP Version 0.9” • HTTP 0.9 was used for many years. • Starting with HTTP 1.0 the version number is part of every request. • tells the server what version the client can talk (what options are supported, etc).
HTTP 1.0+ Request
• Lines of text (ASCII).
• Lines end with CRLF “\r\n”
• First line is called “Request-Line”
Layout of a request:
  Request-Line
  Headers
  . . .
  blank line
  Content...
Request Line
  Method URI HTTP-Version\r\n
• The request line contains 3 tokens (words).
• space characters “ ” separate the tokens.
• Newline (\n) seems to work by itself (but the protocol requires CRLF)
HTTP Request • Format: • Method URI HttpVersion
Common Usage • GET, HEAD and POST are supported everywhere. • HTTP 1.1 servers often support PUT, DELETE, OPTIONS & TRACE.
URI: Uniform Resource Identifier
• URIs defined in RFC 2396.
• Absolute URI: scheme://hostname[:port]/path, e.g. http://www.cs.rpi.edu:80/blah/foo
• Relative URI: /path, e.g. /blah/foo (no server mentioned)
URI Usage • When dealing with an HTTP 1.1 server, only a path is used (no scheme or hostname). • HTTP 1.1 servers are required to be capable of handling an absolute URI, but there are still some out there that won’t… • When dealing with a proxy HTTP server, an absolute URI is used. • client has to tell the proxy where to get the document! • more on proxy servers in a bit….
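The distinction between the two URI forms can be sketched with Python's standard `urllib.parse.urlsplit`. The helper `request_target` and its `via_proxy` flag are hypothetical names introduced for the example.

```python
from urllib.parse import urlsplit

def request_target(url: str, via_proxy: bool) -> str:
    """Return the URI to place in the request line for `url`."""
    if via_proxy:
        return url  # a proxy needs the absolute URI to know where to get the document
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path is requested as "/"
    if parts.query:
        path += "?" + parts.query
    return path  # an origin server is sent only the path

parts = urlsplit("http://www.cs.rpi.edu:80/blah/foo")
print(parts.scheme, parts.hostname, parts.port, parts.path)
# http www.cs.rpi.edu 80 /blah/foo
```

With these definitions, `request_target("http://www.cs.rpi.edu:80/blah/foo", via_proxy=False)` yields `/blah/foo`, while the proxy case returns the URL unchanged.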
HTTP Version Number “HTTP/1.0” or “HTTP/1.1” HTTP 0.9 did not include a version number in a request line. If a server gets a request line with no HTTP version number, it assumes 0.9
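A minimal sketch of how a server might split a request line into its tokens, including the rule that a missing version means HTTP 0.9. The function name `parse_request_line` is an assumption, and accepting a bare "\n" mirrors the slides' remark that it "seems to work" rather than what the protocol requires.

```python
def parse_request_line(line: bytes) -> tuple:
    """Split a Request-Line into (method, uri, version)."""
    text = line.decode("ascii").rstrip("\r\n")  # tolerate bare "\n" as well as CRLF
    parts = text.split(" ")
    if len(parts) == 2:
        # No version token on the line: assume HTTP 0.9.
        method, uri = parts
        return method, uri, "HTTP/0.9"
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {text!r}")
    method, uri, version = parts
    return method, uri, version
```

For example, `parse_request_line(b"GET /blah/foo HTTP/1.1\r\n")` gives the three tokens `GET`, `/blah/foo` and `HTTP/1.1`.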
The Header Lines • After the Request-Line come a number (possibly zero) of HTTP header lines. • Each header line contains an attribute name followed by a “:” followed by a space and the attribute value. The Name and Value are just text.
Headers • Request Headers provide information to the server about the client • what kind of client • what kind of content will be accepted • who is making the request • There can be 0 headers (HTTP 1.0) • HTTP 1.1 requires a Host: header
HTTP Headers
• Accept: Indicates which data formats are acceptable.
  Accept: text/html, text/plain
• Content-Language: Language of the content
  Content-Language: en
• Content-Length: Size of message body
  Content-Length: 1234
• Content-Type: MIME type of content body
  Content-Type: text/html
• Date: Date of request/response
  Date: Tue, 15 Nov 1994 08:12:31 GMT
• Expires: When content is no longer valid
  Expires: Tue, 15 Nov 1994 08:12:31 GMT
• Host: Machine that request is directed to
  Host: www.cs.uct.ac.za
• Location: Redirection to a different resource
  Location: http://myserver.org/
• Retry-After: Indicates that client must try again in future
  Retry-After: 120
Example HTTP Headers
  Accept: text/html
  Host: www.rpi.edu
  From: firstname.lastname@example.org
  User-Agent: Mozilla/4.0
  Referer: http://foo.com/blah
End of the Headers • Each header ends with a CRLF ( \r\n ) • The end of the header section is marked with a blank line. • just CRLF • For GET and HEAD requests, the end of the headers is the end of the request!
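The framing rules for header lines can be sketched as a pair of helpers: one that serializes a GET request and one that reads header lines back until the blank line. The names `build_get_request` and `parse_headers` are hypothetical. Lower-casing header names in the parser is a design choice of this sketch, relying on the fact that HTTP header names are case-insensitive.

```python
from typing import Dict, Optional

CRLF = "\r\n"

def build_get_request(path: str, host: str,
                      headers: Optional[Dict[str, str]] = None) -> bytes:
    """Serialize a GET request: request line, header lines, then a blank line."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; one extra CRLF (the blank line) ends the headers,
    # which for GET is also the end of the request.
    return (CRLF.join(lines) + CRLF + CRLF).encode("ascii")

def parse_headers(block: bytes) -> Dict[str, str]:
    """Parse header lines (request line already removed) into a dict."""
    headers: Dict[str, str] = {}
    for line in block.decode("ascii").split(CRLF):
        if not line:
            break  # blank line: end of the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers
```

A request built this way always ends in `\r\n\r\n`, which is what a server looks for to know that a GET or HEAD request is complete.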
http message format: request
• two types of http messages: request, response
• http request message: ASCII (human-readable format)
  GET /somedir/page.html HTTP/1.0            (request line: GET, POST, HEAD commands)
  User-agent: Mozilla/4.0                    (header lines)
  Accept: text/html, image/gif, image/jpeg
  Accept-language: fr
  (extra carriage return, line feed)         (carriage return, line feed indicates end of message)
http request message: general format Entity body is empty for “GET”, but not for “POST”
POST • A POST request includes some content (some data) after the headers (after the blank line). • There is no format for the data (just raw bytes). • A POST request must include a Content-Length line in the headers: Content-length: 267
Example GET Request
  GET /~hollingd/testanswers.html HTTP/1.1
  Accept: */*
  Host: www.cs.rpi.edu
  User-Agent: Internet Explorer
  From: email@example.com
  Referer: http://foo.com/
  (There is a blank line here!)
Example POST Request
  POST /~hollingd/changegrade.cgi HTTP/1.1
  Accept: */*
  Host: www.cs.rpi.edu
  User-Agent: SecretAgent V2.3
  Content-Length: 36
  Referer: http://monte.cs.rpi.edu/blah

  stuid=6660182722&item=test1&grade=99
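Because Content-Length must match the body byte for byte, it is safer to compute it than to type it by hand. The sketch below does that for a form submission; the function name `build_post_request` is hypothetical, and the `Content-Type: application/x-www-form-urlencoded` header is an assumption added for the example (it does not appear in the slide's request).

```python
from urllib.parse import urlencode

def build_post_request(path: str, host: str, form: dict) -> bytes:
    """Serialize a form POST; Content-Length is computed from the encoded body."""
    body = urlencode(form).encode("ascii")  # e.g. b"item=test1&grade=99"
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"  # assumed, see lead-in
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # blank line: end of headers, the content follows
    )
    return head.encode("ascii") + body
```

Passing the three form fields from the example request (`stuid`, `item`, `grade`) reproduces the body shown on the slide, with the length header derived from it.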
Typical Method Usage GET used to retrieve an HTML document. HEAD used to find out if a document has changed. POST used to submit a form.
http message format: response
  HTTP/1.0 200 OK                          (status line: protocol, status code, status phrase)
  Date: Thu, 06 Aug 1998 12:00:15 GMT      (header lines)
  Server: Apache/1.3.0 (Unix)
  Last-Modified: Mon, 22 Jun 1998 …...
  Content-Length: 6821
  Content-Type: text/html

  data data data data data ...             (data, e.g., requested html file)
HTTP Response • Format: • HTTP Version Status Code Reason
http response status codes
In first line in server->client response message. A few sample codes:
• 200 OK: request succeeded, requested object later in this message
• 301 Moved Permanently: requested object moved, new location specified later in this message (Location:)
• 400 Bad Request: request message not understood by server
• 404 Not Found: requested document not found on this server
• 505 HTTP Version Not Supported
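A small sketch of reading the status line on the client side. The function names are hypothetical. The grouping of codes by their first digit is not stated on the slide; it follows the HTTP specification's five status classes.

```python
def parse_status_line(line: str) -> tuple:
    """Split a status line into (version, status_code, reason_phrase)."""
    # Split at most twice so the reason phrase may itself contain spaces.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason

def status_class(code: int) -> str:
    """Map a status code to its class, keyed by the first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")
```

For instance, `parse_status_line("HTTP/1.0 200 OK\r\n")` separates the version, the integer code 200 and the phrase `OK`, and `status_class(404)` reports a client error.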
Response Headers • Provide the client with information about the returned entity (document). • what kind of document • how big the document is • how the document is encoded • when the document was last modified • Response headers end with blank line
Response Header Examples
  Date: Wed, 30 Jan 2002 12:48:17 EST
  Server: Apache/1.17
  Content-Type: text/html
  Content-Length: 1756
  Content-Encoding: gzip
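Content • Content can be anything (sequence of raw bytes). • A Content-Length header tells the client how many bytes of content follow; without it, the client must detect the end of the content another way (for example, the server closing the connection). • A Content-Type header should also be sent so the client knows how to interpret the bytes.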
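The role of Content-Length can be sketched with a function that splits an already-received response into its parts. The name `split_response` is hypothetical, and falling back to "everything after the blank line" when the header is absent is an assumption of this sketch, not a rule taken from the slides.

```python
def split_response(raw: bytes) -> tuple:
    """Split a complete response into (status_line, headers, body)."""
    head, _, rest = raw.partition(b"\r\n\r\n")  # the blank line ends the headers
    lines = head.decode("ascii").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Keep exactly Content-Length bytes of content; assumed fallback: take it all.
    length = int(headers.get("content-length", len(rest)))
    return status_line, headers, rest[:length]
```

The body is treated as raw bytes throughout; only the status line and headers are decoded as text.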
Single Request/Reply • The client sends a complete request. • The server sends back the entire reply. • The server closes its socket. • If the client needs another document it must open a new connection. This was the default for HTTP 1.0
Persistent Connections • HTTP 1.1 supports persistent connections (this is the default). • Multiple requests can be handled over a single TCP connection. • The Connection: header is used to exchange information about persistence (HTTP/1.1); Connection: close asks for the connection to be closed after the response. • 1.0 clients asked for persistence with a Connection: Keep-Alive header.
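One way to express the persistence rules as code, assuming the headers have already been parsed into a dict keyed by lower-cased header name. The function name `keep_connection_open` is hypothetical; the logic reflects the common convention that HTTP/1.1 is persistent unless `Connection: close` is sent, while HTTP/1.0 is persistent only when `Connection: Keep-Alive` is sent.

```python
def keep_connection_open(version: str, headers: dict) -> bool:
    """Decide whether the TCP connection stays open after this exchange."""
    # The Connection header may carry several comma-separated tokens.
    raw = headers.get("connection", "")
    tokens = {t.strip().lower() for t in raw.split(",") if t.strip()}
    if version == "HTTP/1.1":
        return "close" not in tokens   # persistent by default
    return "keep-alive" in tokens      # HTTP/1.0: persistent only on request
```

A server loop would call this after each response and either wait for the next request on the same socket or close it.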
Trying out http (client side) for yourself
1. Telnet to your favorite Web server:
   telnet www.amazon.com 80
   Opens TCP connection to port 80 (default http server port) at www.amazon.com. Anything typed in is sent to port 80 at www.amazon.com.
2. Type in a GET http request:
   GET /index.html HTTP/1.0
   By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to http server
3. Look at response message sent by http server!
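The same experiment can be sketched in Python with a plain TCP socket instead of telnet. The function name `http_get_raw` and its defaults are assumptions for the example; it relies on the HTTP/1.0 behaviour that the server closes the connection once the response has been sent.

```python
import socket

def http_get_raw(host: str, port: int = 80, path: str = "/index.html") -> bytes:
    """Send a minimal HTTP/1.0 GET and return everything the server sends back."""
    # Request line followed by a blank line: the two "carriage returns" typed in telnet.
    request = f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: response is complete
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `http_get_raw("www.amazon.com")` would correspond to the telnet session above; what a real server answers to such a bare request (many now redirect to HTTPS or expect a Host: header) is not something this sketch can promise.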
User-server interaction: authentication
Authentication goal: control access to server documents
• stateless: client must present authorization in each request
• authorization: typically name, password
  • Authorization: header line in request
  • if no authorization presented, server refuses access, sends WWW-Authenticate: header line in response
• Browser caches name & password so that user does not have to repeatedly enter it.
Message exchange between client and server, in time order:
1. client->server: usual http request msg
2. server->client: 401: authorization req., WWW-Authenticate:
3. client->server: usual http request msg + Authorization: line
4. server->client: usual http response msg
5. client->server: usual http request msg + Authorization: line
6. server->client: usual http response msg
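The slide does not name an authentication scheme. Assuming the HTTP Basic scheme, in which the name and password are joined with a colon and base64-encoded, the Authorization: line could be built as in the sketch below. The function name is hypothetical.

```python
import base64

def basic_authorization_header(name: str, password: str) -> str:
    """Build an Authorization header line for the Basic scheme (assumed here)."""
    credentials = f"{name}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return "Authorization: Basic " + token

print(basic_authorization_header("Aladdin", "open sesame"))
# Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Base64 is an encoding, not encryption, so with this scheme the name and password are recoverable by anyone who can read the request; the browser resends the same line with every request because the server keeps no state.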