
Petrozavodsk State University, Alex Moschevikin, 2003

Hypertext Transfer Protocol. Temporary location of the course "Net Technologies": http://dims.karelia.ru/~alexmou/. HTTP/1.1 authors: Roy Fielding (UCI), Jim Gettys, editor (Digital ISBU / W3C), Jeff Mogul (Digital / WRL), Henrik Frystyk Nielsen (W3C), Tim Berners-Lee (W3C).


Presentation Transcript


  1. Hypertext Transfer Protocol
     Temporary location of the course "Net Technologies": http://dims.karelia.ru/~alexmou/
     • HTTP/1.1 Authors
       • Roy Fielding (UCI)
       • Jim Gettys, Editor (Digital ISBU / W3C)
       • Jeff Mogul (Digital / WRL)
       • Henrik Frystyk Nielsen (W3C)
       • Tim Berners-Lee (W3C)
     • IETF HTTP Working Group
       • Larry Masinter, Working Group Chair
     Tim Berners-Lee
     Thanks to Jim Gettys, Digital Equipment Corporation, 1996 and James Marshall, 1997.
     Rev. 1.05 / 14.01.2007

  2. HTTP and OSI RM
     Layer     OSI/RM          TCP/IP
     Layer 7   APPLICATION     HTTP
     Layer 6   PRESENTATION    HTTP
     Layer 5   SESSION         HTTP
     Layer 4   TRANSPORT       TCP
     Layer 3   NETWORK         IP
     Layer 2   DATA LINK       Physical
     Layer 1   PHYSICAL        Physical

  3. What is HTTP?
     • Application-level protocol for distributed, collaborative, hypermedia information systems.
     • Transaction-oriented client/server protocol.
     • HTTP uses TCP as its transport.
     • Text-based commands and directives (not binary).
     • HTTP (original version) was a "stateless" protocol: each transaction was treated independently. A typical implementation creates a new TCP connection between client and server for each transaction and terminates it as soon as the transaction completes.
     • Flexible in the formats it can handle.

  4. History of HTTP
     WWW = HTTP
     • HTTP/0.9, 1990. Graphical user interface with hyperlinks to other information (text, graphics, sound, video, etc.), starting at a "homepage".
     • 1994: population explosion on the net, with many countries providing access.
     • HTTP/1.0 (RFC 1945, May 1996): the protocol was improved by allowing messages to be in the format of MIME-like messages, containing metainformation about the data transferred and modifiers on the request/response semantics.
     • However, HTTP/1.0 does not sufficiently take into consideration the effects of hierarchical proxies, caching, the need for persistent connections, or virtual hosts.
     • HTTP/1.1 (RFC 2068, Jan. 1997; RFC 2616, June 1999).
     Screenshot of the first version of Netscape Navigator, 1994

  5. URI, URL, URN: the difference
     • A Uniform Resource Locator (URL) identifies an Internet resource by its location, without the fragment (for example, the # anchor in HTML), and can be specified in a single line of text. More than 30 URI (URL) schemes are registered with IANA.
     • A Uniform Resource Name (URN) identifies an Internet resource by a persistent name rather than by its location, and can be specified in a single line of text ("urn:isbn:n-nn-nnnnnn-n").
     • A Uniform Resource Identifier (URI) is the union of URL and URN.
     Examples:
     URI   http://www.gleaners.org/faq.html#Q04   (#Q04 is not sent to the HTTP server)
     URL   http://www.gleaners.org/faq.html
     URN   urn:ietf:rfc:2141
           urn:ietf:std:50
           urn:ietf:id:ietf-urn-ietf-06
           urn:ietf:mtg:41-urn

  6. URI - Uniform Resource Identifier
     • Identifies the resource on the host machine and the access method for that resource.
     • General form is
          <scheme>:<scheme specific part>
          http://www.gde-to.tut:80/~brewery
          https://www.gde-to.tam/
          ftp://anonymous:anonymous@128.12.63.263/films/a.avi
     • Parts
       • Scheme or protocol
       • User name
       • :Password
       • DNS name of the host
       • TCP port
       • Path to and name of the resource (index.html)
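
The parts listed above can be pulled out of a URI with Python's standard `urllib.parse` module. This is only a sketch: the helper name `uri_parts` and the dictionary keys are illustrative choices, not part of the course material. The two URIs are the ones from the slide.

```python
from urllib.parse import urlsplit


def uri_parts(uri: str) -> dict:
    """Split a URI into the parts named on the slide (hypothetical helper)."""
    p = urlsplit(uri)
    return {
        "scheme": p.scheme,      # scheme or protocol
        "user": p.username,      # user name, or None if absent
        "password": p.password,  # password, or None if absent
        "host": p.hostname,      # DNS name (or address) of the host
        "port": p.port,          # TCP port, or None if the default is implied
        "path": p.path,          # path to and name of the resource
    }


web = uri_parts("http://www.gde-to.tut:80/~brewery")
ftp = uri_parts("ftp://anonymous:anonymous@128.12.63.263/films/a.avi")
```

When `port` is `None`, a client falls back to the scheme's default port (80 for http).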

  7. Understanding HTTP
     • Direct connection: User Agent -> HTTP request -> Web server.
     • With intermediaries, there is no end-to-end TCP connection between the User Agent and the origin server.
     • Proxies save resources (caching).

  8. Three types of intermediate systems
     • Traditional HTTP proxy (firewall + proxy, different versions of HTTP).
     • Gateway, a substitute for the origin server (non-HTTP request following auth. HTTP request).
     • Tunnel (no operations on HTTP requests and responses).

  9. Structure of HTTP transactions
     Like most network protocols, HTTP uses the client-server model:
     1. An HTTP client opens a connection and sends a request message to an HTTP server.
     2. The server then returns a response message with the status code of the request, usually containing the resource that was requested.

  10. Format of request or response
     • Both kinds of messages (request and response) consist of:
       • an initial line,
       • zero or more header lines,
       • a blank line (i.e. a CRLF by itself), and
       • an optional message body (e.g. a file, or query data, or query output).
     • Put another way, the format of an HTTP message is:
          <initial line, different for request vs. response>
          Header1: value1
          Header2: value2
          Header3: value3

          <optional message body goes here, like file contents or query data;
           it can be many lines long, or even binary data $&*%@!^$@>
     Initial lines and headers should end in CRLF (0D 0A).
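
A minimal sketch of splitting a raw message into the pieces listed above (initial line, headers, body). The function name `parse_message` and the sample message are assumptions made for illustration. Header names are lower-cased here for convenient lookup, and repeated headers are not handled.

```python
def parse_message(raw: bytes):
    """Split a raw HTTP message into (initial_line, headers, body)."""
    # The first blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    initial_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Each header line has the form "Header-Name: value".
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return initial_line, headers, body


raw = (b"HTTP/1.0 200 OK\r\n"
       b"Content-Type: text/plain\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"hello")
initial_line, headers, body = parse_message(raw)
```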

  11. Request methods

  12. Initial request line
     • A request line has three parts, separated by spaces:
       • a method name,
       • the local path of the requested resource,
       • and the version of HTTP being used.
     • A typical request line is:
          GET /path/to/file/index.html HTTP/1.0
     • Notes:
       • Method names are always uppercase.
       • The path is the part of the URL after the host name, also called the request URI.
       • The HTTP version always takes the form "HTTP/x.x", uppercase.

  13. Initial response line
     • The initial response line, called the status line, also has three parts separated by spaces:
       • the HTTP version,
       • a response status code that gives the result of the request, and
       • an English reason phrase describing the status code.
     • Typical status lines are:
          HTTP/1.0 200 OK
       or
          HTTP/1.0 404 Not Found
     • Notes:
       • The status code is a three-digit integer, and the first digit identifies the general category of response:
         • 1xx indicates an informational message only
         • 2xx indicates success of some kind
         • 3xx redirects the client to another URL
         • 4xx indicates an error on the client's part
         • 5xx indicates an error on the server's part
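
The two kinds of initial line can be taken apart with plain string splitting. The sketch below is illustrative only: the function names and the `CATEGORIES` table are assumptions, and the category labels paraphrase the notes above.

```python
# First digit of the status code -> general category (labels are illustrative).
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_request_line(line: str):
    """Return (method, path, version) from a request line."""
    method, path, version = line.split(" ")
    return method, path, version


def parse_status_line(line: str):
    """Return (version, code, reason, category) from a status line."""
    # The reason phrase may contain spaces, so split at most twice.
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason, CATEGORIES.get(int(code) // 100, "unknown")


request_parts = parse_request_line("GET /path/to/file/index.html HTTP/1.0")
status_parts = parse_status_line("HTTP/1.0 404 Not Found")
```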

  14. The most common status codes
     The most common status codes are:
     • 200 OK: the request succeeded, and the resulting resource (e.g. file or script output) is returned in the message body.
     • 404 Not Found: the requested resource doesn't exist.
     • 301 Moved Permanently, 302 Moved Temporarily, 303 See Other (HTTP 1.1 only): the resource has moved to another URL (given by the Location: response header) and should be automatically retrieved by the client. This is often used by a CGI script to redirect the browser to an existing file.
     • 500 Server Error: an unexpected server error. The most common cause is a server-side script that has bad syntax, fails, or otherwise can't run correctly.

  15. Header lines
     • One line per header, in the form "Header-Name: value", ending with CRLF (RFC 822 format).
     • HTTP 1.0 defines 16 headers, though none are required. HTTP 1.1 defines 46 headers, and one (Host:) is required in requests.
          Host: dfe3300.karelia.ru
          From: kto-to@mail.ru
          User-agent: my_software/3.0Gold
          Last-Modified: Fri, 31 Dec 1999 23:59:59 GMT
     • If an HTTP message includes a body, there are usually header lines in the message that describe the body. In particular:
       • The Content-Type: header gives the MIME type of the data in the body, such as text/html or image/gif.
       • The Content-Length: header gives the number of bytes in the body.

  16. Sample HTTP exchange (1)
     To retrieve the file from the URL http://www.my_server.com/path/file.html
     1. Open a socket to the host www.my_server.com, port 80 (use the default port of 80 because none is specified in the URL).
     2. Then, send something like the following through the socket:
          GET /path/file.html HTTP/1.1
          Host: www.my_server.com
          From: me@my_mail.ru
          User-Agent: my_soft/3.0
          [blank line here]

  17. Sample HTTP exchange (2)
     The server should respond with something like the following, sent back through the same socket:
          HTTP/1.1 200 OK
          Date: Fri, 31 Dec 1999 23:59:59 GMT
          Content-Type: text/html
          Content-Length: 1354

          <html>
          <body>
          <h1>Happy New Millennium!</h1>
          (more file contents)
          . . .
          </body>
          </html>
     In HTTP/1.0 the server closes the socket after sending the response; HTTP/1.1 connections are persistent by default (slide 24).
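
The exchange on these two slides can be sketched with a raw socket. Everything here is an illustration under stated assumptions: the function names are made up, the host name is the placeholder from the slide (it will not resolve), and a `Connection: close` header is added, which the slide does not show, so that the simple read-until-EOF loop terminates.

```python
import socket


def build_get_request(host: str, path: str, user_agent: str = "my_soft/3.0") -> bytes:
    """Build the request text from slide 16, ending with the blank line."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "Connection: close",  # assumption: ask the server to close when done
        "",                   # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str, port: int = 80) -> bytes:
    """Open a socket, send the request, and read the whole response."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Only the request is built here; calling fetch() needs a reachable server.
request = build_get_request("www.my_server.com", "/path/file.html")
```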

  18. GET and POST (HTTP/1.0)
     GET:
          GET /path/script.cgi?home=Cosby&favorite+flavor=flies HTTP/1.0
          User-Agent: my_soft/1.0
          [blank line here]
     POST:
          POST /path/script.cgi HTTP/1.0
          User-Agent: my_soft/1.0
          Content-Type: application/x-www-form-urlencoded
          Content-Length: 32

          home=Cosby&favorite+flavor=flies
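
The difference between the two submissions is where the encoded form data goes: in the query string for GET, in the message body (with its length announced) for POST. A short sketch, with illustrative function names, that rebuilds both requests from the slide:

```python
from urllib.parse import urlencode


def build_get(path: str, fields: dict) -> bytes:
    """GET: the form data travels in the query string; there is no body."""
    return (f"GET {path}?{urlencode(fields)} HTTP/1.0\r\n"
            "User-Agent: my_soft/1.0\r\n"
            "\r\n").encode("ascii")


def build_post(path: str, fields: dict) -> bytes:
    """POST: the form data travels in the body, described by two headers."""
    body = urlencode(fields).encode("ascii")
    head = (f"POST {path} HTTP/1.0\r\n"
            "User-Agent: my_soft/1.0\r\n"
            "Content-Type: application/x-www-form-urlencoded\r\n"
            f"Content-Length: {len(body)}\r\n"
            "\r\n").encode("ascii")
    return head + body


fields = {"home": "Cosby", "favorite flavor": "flies"}
get_request = build_get("/path/script.cgi", fields)
post_request = build_post("/path/script.cgi", fields)
```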

  19. URL-encoding
     HTML form data is usually URL-encoded to package it in a GET or POST submission (RFC 2396).
     1. Convert all "unsafe" characters in the names and values to "%xx", where "xx" is the ASCII value of the character, in hex. "Unsafe" characters include =, &, %, +, non-printable characters, and any others you want to encode. For simplicity, you might encode all non-alphanumeric characters.
     2. Change all spaces to plus signs.
     3. String the names and values together with = and &, like
          name1=value1&name2=value2&name3=value3
     4. This string is your message body for POST submissions, or the query string for GET submissions.
     For example, if a form (in an HTML document) has a field called "Number" that's set to "B52", and a field called "Text" that's set to "You & me", the URL-encoded form data would be
          Number=B52&Text=You+%26+me
     with a length of 26.
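
The four steps can be written out by hand and compared with the standard library. `url_encode_form` is a hypothetical helper that follows the "encode all non-alphanumeric characters" simplification from step 1; `urllib.parse.urlencode` is the real library call and leaves a few more characters (such as `-` and `_`) unencoded, so the two agree on this example but not on every input.

```python
from urllib.parse import parse_qs, urlencode


def url_encode_form(fields: dict) -> str:
    """Hand-written version of steps 1-3 (illustrative, not a library API)."""
    def enc(text: str) -> str:
        out = []
        for byte in text.encode("utf-8"):
            if byte < 128 and chr(byte).isalnum():
                out.append(chr(byte))      # safe: ASCII letter or digit
            elif byte == 0x20:
                out.append("+")            # step 2: space becomes plus
            else:
                out.append(f"%{byte:02X}") # step 1: %xx in hex
        return "".join(out)

    # Step 3: join names and values with = and &.
    return "&".join(f"{enc(k)}={enc(v)}" for k, v in fields.items())


form = {"Number": "B52", "Text": "You & me"}
by_hand = url_encode_form(form)
by_library = urlencode(form)
decoded = parse_qs(by_library)  # the server-side reverse operation
```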

  20. Features of HTTP/1.1
     • Superset of HTTP 1.0.
     • Improvements:
       • Faster response, by allowing multiple transactions to take place over a single persistent connection.
       • Faster response and great bandwidth savings, by adding cache support.
       • Faster response for dynamically generated pages, by supporting chunked encoding, which allows a response to be sent before its total length is known.
       • Efficient use of IP addresses, by allowing multiple domains to be served from a single IP address (virtual hosts).

  21. HTTP/1.1 clients
     • To comply with HTTP 1.1, clients must:
       • include the Host: header in each request;
       • accept responses with chunked data;
       • either support persistent connections, or include the "Connection: close" header with each request;
       • handle the "100 Continue" response.

  22. Chunked Transfer-encoding
     • If a server wants to start sending a response before knowing its total length (as with long script output), it might use the simple chunked transfer-encoding, which breaks the complete response into smaller chunks and sends them in series.
     • A chunked message body contains a series of chunks, followed by a line with "0" (zero), followed by optional footers (just like headers), and a blank line.
     • Each chunk consists of two parts:
       • a line with the size of the chunk data, in hex, possibly followed by a semicolon and extra parameters you can ignore (none are currently standard), and ending with CRLF;
       • the data itself, followed by CRLF.

  23. Chunked encoding example
          HTTP/1.1 200 OK
          Content-Type: text/plain
          Transfer-Encoding: chunked

          1a; ignore-stuff-here
          abcdefghijklmnopqrstuvwxyz
          10
          1234567890abcdef
          0
          some-footer: some-value
          another-footer: another-value
          [blank line here]
     Note the blank line after the last footer. The length of the text data is 42 bytes (1a + 10, in hex), and the data itself is abcdefghijklmnopqrstuvwxyz1234567890abcdef. The footers should be treated like headers, as if they were at the top of the response.
     No chunked encoding:
          HTTP/1.1 200 OK
          Content-Type: text/plain
          Content-Length: 42
          some-footer: some-value
          another-footer: another-value

          abcdefghijklmnopqrstuvwxyz1234567890abcdef
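
A decoder for the chunked body above can be sketched in a few lines. `decode_chunked` is a hypothetical name; the function assumes the whole body is already in memory and well formed, which a real client reading from a socket cannot assume.

```python
def decode_chunked(body: bytes):
    """Return (data, footers) from a complete chunked message body."""
    data = bytearray()
    pos = 0
    while True:
        # Size line: hex size, optionally followed by ";" and parameters.
        eol = body.index(b"\r\n", pos)
        size_field = body[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:          # the "0" line ends the series of chunks
            break
        data += body[pos:pos + size]
        pos += size + 2        # skip the chunk data and its trailing CRLF
    footers = {}
    while True:
        # Footers look like headers and end with a blank line.
        eol = body.index(b"\r\n", pos)
        line = body[pos:eol].decode("iso-8859-1")
        pos = eol + 2
        if not line:
            break
        name, _, value = line.partition(":")
        footers[name.strip().lower()] = value.strip()
    return bytes(data), footers


chunked_body = (b"1a; ignore-stuff-here\r\n"
                b"abcdefghijklmnopqrstuvwxyz\r\n"
                b"10\r\n"
                b"1234567890abcdef\r\n"
                b"0\r\n"
                b"some-footer: some-value\r\n"
                b"another-footer: another-value\r\n"
                b"\r\n")
payload, footers = decode_chunked(chunked_body)
```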

  24. Persistent connections
     Problem: In practice, most Web pages consist of several files on the same server. In HTTP 1.0 and before, TCP connections are closed after each request and response, so each resource to be retrieved requires its own connection. Opening and closing TCP connections takes a substantial amount of CPU time, bandwidth, and memory.
     Solution: Much can be saved by allowing several requests and responses to be sent through a single persistent connection. Persistent connections are the default in HTTP 1.1, so nothing special is required to use them. Just open a connection and send several requests in series (called pipelining), and read the responses in the same order as the requests were sent.
     If a client includes the "Connection: close" header in the request, then the connection will be closed after the corresponding response.

  25. The "100 Continue" response
     On slow links, the server might respond with an interim "100 Continue" response. This means the server has received the first part of the request.
          HTTP/1.1 100 Continue

          HTTP/1.1 200 OK
          Date: Fri, 31 Dec 1999 23:59:59 GMT
          Content-Type: text/plain
          Content-Length: 42
          some-footer: some-value

          abcdefghijklmnopqrstuvwxyz1234567890abcdef
     To handle this, a simple HTTP 1.1 client might read one response from the socket; if the status code is 100, discard the first response and read the next one instead.
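
The "read one response, discard it if the code is 100, read the next" rule might look like the sketch below. The function names are illustrative. The stream is an in-memory `io.BytesIO` standing in for a socket (with a real socket, `sock.makefile("rb")` gives a similar file-like object), and the body is read by Content-Length only, so chunked responses are out of scope here.

```python
import io


def read_response(stream):
    """Read one response: status line, headers, then Content-Length bytes."""
    status_line = stream.readline().decode("iso-8859-1").rstrip("\r\n")
    code = int(status_line.split(" ", 2)[1])
    headers = {}
    while True:
        line = stream.readline().decode("iso-8859-1").rstrip("\r\n")
        if not line:  # the blank line ends the headers
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Interim 1xx responses never carry a body.
    length = int(headers.get("content-length", 0)) if code >= 200 else 0
    return code, headers, stream.read(length)


def read_final_response(stream):
    """Skip any interim "100 Continue" and return the real response."""
    while True:
        code, headers, body = read_response(stream)
        if code != 100:
            return code, headers, body


stream = io.BytesIO(
    b"HTTP/1.1 100 Continue\r\n"
    b"\r\n"
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 42\r\n"
    b"\r\n"
    b"abcdefghijklmnopqrstuvwxyz1234567890abcdef"
)
code, headers, body = read_final_response(stream)
```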

  26. Web traffic compression
     Several methods of web traffic compression exist (gzip, deflate, compress, etc.). The client asks the HTTP server to use one of the compression algorithms it supports; the server may then send the requested document in compressed form. Decompression can begin as soon as the first bytes of the HTTP response arrive (the whole document need not be received first).
          GET / HTTP/1.1
          Host: www.google.com
          Accept-Encoding: gzip, deflate, compress
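
The point that decompression can start before the whole document has arrived can be shown with Python's `zlib` module, which decodes a gzip stream incrementally. This is a sketch: the generator name, the sample document, and the 64-byte piece size (standing in for network reads) are all assumptions.

```python
import gzip
import zlib


def iter_decompressed(pieces):
    """Yield decompressed data as each compressed piece arrives."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for piece in pieces:
        out = decoder.decompress(piece)
        if out:
            yield out
    tail = decoder.flush()
    if tail:
        yield tail


document = b"<html><body>Hello, compressed world!</body></html>\n" * 200
compressed = gzip.compress(document)

# Pretend the response body arrives in 64-byte pieces.
pieces = [compressed[i:i + 64] for i in range(0, len(compressed), 64)]
restored = b"".join(iter_decompressed(pieces))
```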

  27. If-Modified-Since
     To avoid sending resources that don't need to be sent, thus saving bandwidth, HTTP 1.1 defines the If-Modified-Since: and If-Unmodified-Since: request headers. The former says "only send the resource if it has changed since this date"; the latter says the opposite. Clients aren't required to use them, but HTTP 1.1 servers are required to honor requests that do use them.
     Unfortunately, due to earlier HTTP versions, the date value may be in any of three possible formats (the first is the preferred one):
          If-Modified-Since: Fri, 31 Dec 1999 23:59:59 GMT
          If-Modified-Since: Friday, 31-Dec-99 23:59:59 GMT
          If-Modified-Since: Fri Dec 31 23:59:59 1999
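
A server that honors these headers has to accept all three date formats. One way to sketch that is to try each format in turn with `datetime.strptime`. The names `HTTP_DATE_FORMATS`, `parse_http_date`, and `not_modified` are illustrative, the comparison treats all times as GMT, and `strptime` matches English day and month names only under the default C/English locale.

```python
from datetime import datetime

# The three formats from the slide, in the same order.
HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",   # Fri, 31 Dec 1999 23:59:59 GMT
    "%A, %d-%b-%y %H:%M:%S GMT",   # Friday, 31-Dec-99 23:59:59 GMT
    "%a %b %d %H:%M:%S %Y",        # Fri Dec 31 23:59:59 1999
)


def parse_http_date(value: str) -> datetime:
    """Parse a date header value in any of the three formats (as GMT)."""
    for fmt in HTTP_DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized HTTP date: {value!r}")


def not_modified(if_modified_since: str, last_modified: datetime) -> bool:
    """True if the resource has not changed since the client's date."""
    return last_modified <= parse_http_date(if_modified_since)


parsed = [
    parse_http_date("Fri, 31 Dec 1999 23:59:59 GMT"),
    parse_http_date("Friday, 31-Dec-99 23:59:59 GMT"),
    parse_http_date("Fri Dec 31 23:59:59 1999"),
]
```

When `not_modified` returns true, the server can answer without resending the body.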

  28. Caching documents
     • Server side (for example, in the case of dynamically generated pages)
     • Client side (in local files on hard disk and in memory)
     • Intermediate HTTP proxies
     • Not all transactions can be cached, and a client or server can dictate that a certain transaction may be cached only for a given time limit

  29. Caching in HTTP
     SERVER
     Includes the Date: and Expires: headers, or the max-age directive (server-specified expiration times and validators), in the HTTP response.
     PROXIES and CLIENTS
     How do they know when to discard a cached document, or whether to store it at all?
          Cache-Control: max-age=0
          Cache-Control: no-cache
          Cache-Control: must-revalidate
          Pragma: no-cache (HTTP/1.0)
          …
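
How a proxy or client might act on these directives can be sketched as below. The function names are hypothetical and the decision rule is deliberately simplified: it looks only at `no-cache` and `max-age` and ignores Expires:, validators, and the other directives a real cache must consider.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control value into {directive: argument or True}."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        # Directives without "=" (no-cache, must-revalidate) map to True.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def may_reuse_without_revalidation(directives: dict, age_seconds: int) -> bool:
    """Simplified rule: reuse a cached copy only while younger than max-age."""
    if "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None or max_age is True:
        return False  # no usable lifetime given; be conservative
    return age_seconds < int(max_age)


policy = parse_cache_control("max-age=400, must-revalidate")
```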

  30. Caching in HTTP
          <HTML>
          <HEAD>
          <META HTTP-EQUIV="Cache-Control" CONTENT="max-age=400">
          <META HTTP-EQUIV="pragma" CONTENT="no-cache">
          </HEAD>
          <BODY>
          </BODY>
          </HTML>
     The Web publisher (programmer) recommends that the HTTP server include the corresponding header in the HTTP response sent from SERVER to CLIENT.
     The CLIENT will not send a new request to the SERVER until the time the document has been stored (in seconds) exceeds max-age.
     Problem: dynamic content of Web sites.

  31. Overall scheme
     HTML document in the CLIENT's HTML parser:
          <META HTTP-EQUIV="Cache-Control" CONTENT="no-cache">
          . . .
          <FORM METHOD="GET" action="/cgi-bin/vote.pl">
          <INPUT TYPE="text" NAME="N" VALUE="test">
          <INPUT TYPE="submit" NAME="S" VALUE="Vote">
          </FORM>
          . . .
     Mouse click on "Vote"
     CLIENT HTTP agent, HTTP request to the Web server:
          GET /cgi-bin/vote.pl?N=test&S=Vote HTTP/1.1
          Host: www.server.ru
          User-Agent: Mozilla/IE 6.0
          [blank line]

  32. Overall scheme (continued)
     The Web SERVER passes N=test&S=Vote to the script through the Common Gateway Interface: on STDIN (POST) or in $ENV (GET).
     Perl script vote.pl:
          #!/usr/local/bin/perl
          # Read the form data according to the request method.
          if ($ENV{'REQUEST_METHOD'} eq "POST") {
              # POST: the data arrives on standard input.
              read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
          } elsif ($ENV{'REQUEST_METHOD'} eq "GET") {
              # GET: the data arrives in the query string.
              $buffer = $ENV{'QUERY_STRING'};
          }
          . . .
     The script replies to the Web server through CGI.
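
For comparison, the same method check can be sketched in Python. `read_form_data` is a hypothetical helper, not part of the course's vote.pl. The environment and input stream are passed in as arguments (a real CGI script would use `os.environ` and `sys.stdin`), and the input is read as text, which is adequate only for ASCII form data.

```python
import io
from urllib.parse import parse_qs


def read_form_data(environ: dict, stdin) -> dict:
    """Return the decoded form fields for a GET or POST CGI request."""
    method = environ.get("REQUEST_METHOD", "")
    if method == "POST":
        # POST: read exactly CONTENT_LENGTH characters from standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        buffer = stdin.read(length)
    elif method == "GET":
        # GET: the data is in the QUERY_STRING environment variable.
        buffer = environ.get("QUERY_STRING", "")
    else:
        buffer = ""
    return parse_qs(buffer)


from_get = read_form_data(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "N=test&S=Vote"},
    io.StringIO(""),
)
from_post = read_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "13"},
    io.StringIO("N=test&S=Vote"),
)
```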

  33. Overall scheme (continued)
     HTML document from the script to the Web SERVER's HTTP agent:
          Content-type: text/html

          <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
          <HTML>
          <HEAD>
          <META HTTP-EQUIV="Cache-control" CONTENT="no-cache">
          </HEAD>
          <BODY>Thank you . . .</BODY>
          </HTML>
     Response sent by the SERVER's HTTP agent:
          HTTP/1.1 200 OK
          Cache-control: no-cache
          Content-Length: 1354
          Content-type: text/html

          <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
          <HTML>. . .</HTML>

  34. Overall scheme (continued)
     CLIENT HTTP agent receives:
          <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
          <HTML>. . .</HTML>
     HTML viewer displays:
          Thank you . . .

  35. Secure HTTP (HTTPS)
     HTTPS has the same functionality as HTTP but encrypts the data transferred between client and server. HTTPS is HTTP carried over SSL/TLS (RFC 2818) and uses TCP port 443 by default.
     When a connection to the secure port is established, the following happens automatically:
     • The client authenticates the server using the server's digital certificate.
     • The client and server negotiate a cipher suite (a set of security algorithms) and generate session keys for encrypting and decrypting data.
     • The client and server establish a secure encrypted connection.
     A separate protocol, Secure HTTP (S-HTTP, RFC 2660), has its own headers in its requests and responses and may, for example, encapsulate an HTTP request/response (next slide).
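
The steps that happen "automatically" correspond to the TLS handshake, which Python's standard `ssl` module performs when a socket is wrapped. The sketch below uses illustrative function names and only builds the client context; `open_https` is defined but not called, since it needs a reachable server.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client settings: verify the server certificate and its host name."""
    return ssl.create_default_context()


def open_https(host: str, port: int = 443):
    """Connect to the secure port and run the handshake (not called here)."""
    context = make_client_context()
    raw = socket.create_connection((host, port), timeout=10)
    try:
        # wrap_socket authenticates the server, negotiates the cipher
        # suite, and derives the session keys.
        tls = context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
    # cipher() reports the negotiated suite as (name, protocol, secret bits).
    return tls, tls.cipher()


context = make_client_context()
```

After the handshake, ordinary HTTP requests and responses are written to and read from the wrapped socket.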

  36. Secure HTTP (S-HTTP)
     An appropriate HTTP server response would be:
          HTTP/1.0 200 OK
          Security-Scheme: S-HTTP/1.4
          Content-Type: text/html

          Congratulations, you've won.
          <A href="/prize.html"
           CRYPTOPTS="Key-Assign: Inband,alice1,reply,des-ecb;020406080a0c0e0f;
           SHTTP-Privacy-Enhancements: recv-required=auth">Click here to
          claim your prize</A>
     This HTTP response, encapsulated as an S-HTTP message, becomes:
          Secure * Secure-HTTP/1.4
          Content-Type: message/http
          Prearranged-Key-Info: des-ecb,697fa820df8a6e53,inband:1
          Content-Privacy-Domain: CMS

          MIAGCSqGSIb3DQEHBqCAMIACAQAwgAYJKoZIhvcNAQcBMBEGBSsOAwIHBAifqtdy
          x6uIMYCCARgvFzJtOZBn773DtmXlx037ck3giqnV0WC0QAx5f+fesAiGaxMqWcir
          r9XvT0nT0LgSQ/8tiLCDBEKdyCNgdcJAduy3D0r2sb5sNTT0TyL9uydG3w55vTnW
          aPbCPCWLudArI1UHDZbnoJICrVehxG/sYX069M8v6VO8PsJS7//hh1yM+0nekzQ5
          l1p0j7uWKu4W0csrlGqhLvEJanj6dQAGSTNCOoH3jzEXGQXntgesk8poFPfHdtj0
          5RH4MuJRajDmoEjlrNcnGl/BdHAd2JaCo6uZWGcnGAgVJ/TVfSVSwN5nlCK87tXl
          nL7DJwaPRYwxb3mnPKNq7ATiJPf5u162MbwxrddmiE7e3sST7naSN+GS0ateY5X7
          AAAAAAAAAAA=
