HTTP - Hypertext Transfer Protocol Author: Yigal Eliaspur Date: 28.1.2001
HTTP Overview • The Web’s application-layer protocol • In use by the WWW since 1990 • Client/server paradigm • On the Web: • clients: browsers (Internet Explorer, Netscape ...) • servers: web servers (Apache, IIS ...) • Request/response protocol: the client (C) sends a request and the server (S) returns a response • Web servers usually listen on TCP port 80
HTTP Overview (cont.) • Stateless protocol - the HTTP server keeps no information about the client between requests.
HTTP Versions • HTTP 0.9 • Simple GET protocol for the Web • limits on data transfer (1024 characters) • HTTP 1.0 • Headers give information about the data transferred. • Greater data type/quantity transfer in both directions • HTTP 1.1 • Supports hierarchical proxy servers • caching • persistent connections
HTTP 0.9 GET example • telnet www.cs.huji.ac.il 80 • GET /~dbsi/index.html <CRLF> • output : • <HTML><HEAD>.......</HEAD><BODY>...............</BODY></HTML> • Connection closed by foreign host
HTTP 1.0 • Developed between 1992 and 1996. • Exchanges more than simple text • Headers allowed in both requests and responses • Extends the GET request to allow headers • Adds the HEAD request to get information about a resource • Adds the POST request, which sends information with the request
Header types • General • Date, Pragma ... • Request • Authorization, From, If-Modified-Since, Referer, User-Agent ... • Response • Location, Server, WWW-Authenticate ... • Entity • Allow, Content-Encoding, Content-Length, Content-Type, Expires, Last-Modified, extension-header ...
POST & HEAD messages • POST • Sends information with the request in the Entity Body. • Useful when the user fills out a form. • HEAD • Returns only the result of the request without the data itself (i.e. only the Status line and the Header lines). • Used for debugging HTTP servers and for checking whether a page has been updated.
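The message layout described above (request line, header lines, blank line, optional Entity Body) can be sketched as follows. This is a minimal illustration, not a full client: the helper name `build_request`, the `/cgi-bin/form` path, and the `sketch/0.1` User-Agent value are assumptions made up for the example.

```python
def build_request(method, path, headers=None, body=b""):
    """Serialize an HTTP/1.0 request: request line, headers, blank line, body."""
    headers = dict(headers or {})
    if body:
        # The Entity Body needs a Content-Length so the server knows where it ends.
        headers["Content-Length"] = str(len(body))
    lines = [f"{method} {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# HEAD: same shape as GET, but the server answers with status line and headers only.
head_request = build_request("HEAD", "/~dbsi/index.html", {"User-Agent": "sketch/0.1"})

# POST: the form data travels in the Entity Body (hypothetical path and fields).
post_request = build_request(
    "POST",
    "/cgi-bin/form",
    {"Content-Type": "application/x-www-form-urlencoded"},
    b"name=alice&lang=en",
)
```

Sending either byte string over a TCP connection to port 80 is what the telnet session in the HTTP 0.9 example does by hand.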
Upgrade Header • Allows the client to specify which additional communication protocols it supports • The server may choose to switch protocols, but this is not mandatory. • Example: • Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11
Caching • Why? • Reduces response time • The request is satisfied from the cache closest to the browser • Takes less time to get the page and display it • Reduces traffic • Each page is fetched from the origin server only once • Reduces the bandwidth used by the browser • Saves money if the client pays by traffic • Keeps bandwidth requirements down
Caching (cont.) • Risks? • The cache might not be "semantically transparent" • i.e. the cached response may differ from what the origin server would have returned.
Caching in HTTP/1.0 • Simple caching mechanism: • The origin server may mark a response with an expiration time, using the Expires header • A cache checks the validity of an entry with a "conditional request": a GET whose If-Modified-Since header carries the entry's Last-Modified value. • The server responds with: • 304 (Not Modified), or • 200 (OK) + the new entry.
Caching in HTTP/1.0 (cont.) • The Pragma: no-cache request header indicates that a request should not be satisfied from a cache. • PROBLEM - origin servers/clients can’t give full and explicit instructions to caches (will be explained later)
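The HTTP/1.0 validity check can be sketched from the server's side as below. This is a simplified model under stated assumptions: the function name `conditional_get_status` is hypothetical, the page's modification time is passed in directly, and a real server would also deal with malformed dates.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get_status(if_modified_since, last_modified):
    """Pick the status code for a GET that may carry If-Modified-Since.

    if_modified_since: header value sent by the cache (an HTTP date string), or None.
    last_modified: timezone-aware datetime of the page's last change on the server.
    """
    if if_modified_since is None:
        return 200  # ordinary request: send the full entry
    cached_copy_time = parsedate_to_datetime(if_modified_since)
    # Unchanged since the cache stored it: 304, with no body.
    return 304 if last_modified <= cached_copy_time else 200


# Illustrative values only.
page_changed = datetime(2001, 1, 27, 10, 0, 0, tzinfo=timezone.utc)
validator = format_datetime(page_changed, usegmt=True)  # value a cache would echo back
stale_validator = format_datetime(page_changed - timedelta(days=1), usegmt=True)
```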
Caching in HTTP/1.1 • Retains the basic HTTP/1.0 design • Adds new features • Specifies the existing features more carefully. • A cache entry starts out fresh. • It becomes stale when it reaches its expiration time. • A stale entry must be revalidated with the origin server before it is used.
Caching in HTTP/1.1 (cont.) • Cache validator string: the entity tag. • Two responses for the same resource that carry the same entity tag must be identical. • The tag can include: a fine-grained timestamp, an internal database pointer ... • The If-None-Match header carries one or more entity tags. • Much stronger than If-Modified-Since.
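A minimal sketch of entity-tag validation follows. Deriving the tag from a hash of the body is an assumption chosen for the example (the slide lists timestamps and database pointers as other options), and the names `make_entity_tag` and `revalidate` are hypothetical. Splitting the header on commas is a simplification that assumes tags contain no commas.

```python
import hashlib


def make_entity_tag(body):
    """Build a quoted entity tag from the body (assumed scheme: content hash)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def revalidate(if_none_match, current_body):
    """Answer a conditional request that carries one or more entity tags."""
    current_tag = make_entity_tag(current_body)
    offered = [tag.strip() for tag in if_none_match.split(",")]
    if current_tag in offered:
        return 304, b""          # the cache already holds an identical response
    return 200, current_body     # no match: send the new entity


old_page = b"<HTML>version 1</HTML>"
new_page = b"<HTML>version 2</HTML>"
cached_tag = make_entity_tag(old_page)
```

Unlike a one-second timestamp, a tag of this kind changes whenever the body changes, which is one way to read "much stronger than If-Modified-Since".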
Caching in HTTP/1.1 (cont.) • Cache-Control header • Explicit directives from the server/client to caches • Example directives: • max-age - relative expiration time. • The absolute time in the HTTP/1.0 Expires header can fail when clocks are skewed. • no-transform - prevents proxies from transforming responses. • e.g. reducing image complexity over a slow link (WAP) • private & no-store - prevent the storage of some or all of a response.
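The directives above can be illustrated with a small parser and two checks. This is a sketch under simplifying assumptions: the helper names are hypothetical, `private` is treated as applying to the whole response, and the entry's age is supplied by the caller. Because max-age is relative, the check needs only an age in seconds, not synchronized clocks.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives = {}
    for part in value.split(","):
        name, _, argument = part.strip().partition("=")
        if name:
            directives[name.lower()] = argument.strip('"') or None
    return directives


def may_store(cache_control, shared_cache):
    """no-store forbids storage; private forbids storage in shared caches."""
    directives = parse_cache_control(cache_control)
    if "no-store" in directives:
        return False
    return not ("private" in directives and shared_cache)


def is_fresh(cache_control, age_seconds):
    """An entry is fresh while its age is below max-age."""
    max_age = parse_cache_control(cache_control).get("max-age")
    return max_age is not None and age_seconds < int(max_age)
```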
Caching in HTTP/1.1 (cont.) • Vary header - lists the request headers that, in addition to the URL, identify which response a request should get. • For example: Accept-Language, Accept-Charset ...
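One way a cache could honor Vary is to fold the listed request headers into its lookup key, as sketched below. The function name `cache_key` and the example URL are assumptions; a real cache would also normalize header values.

```python
def cache_key(url, request_headers, vary):
    """Build a cache key from the URL plus the request headers named in Vary."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


vary = "Accept-Language, Accept-Charset"
english = cache_key("http://example.org/", {"Accept-Language": "en"}, vary)
hebrew = cache_key("http://example.org/", {"Accept-Language": "he"}, vary)
```

Here `english` and `hebrew` differ, so the two language variants are cached side by side instead of overwriting each other.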
Cooperative Caching (cont.) • Higher-level cache (e.g. a national cache) • larger user population • higher hit rates. • Multiple Web caches that cooperate => improved overall performance. • Cooperative caches are usually built from clusters • divide the traffic load • increase storage capacity
Cooperative Caching (cont.) • Which of the caches should we ask for a particular document? • Hash routing (of URLs) - an object will not be present in more than one cache. • HTTP/1.1 introduces the concept of hop-by-hop headers: • message headers that apply only to a given connection, and not to the entire path. • This enables much more powerful use of proxies (caches).
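Hash routing of URLs, as mentioned above, can be sketched as a function that maps each URL to exactly one member of the cluster. The cache host names are hypothetical, and plain modulo hashing is an assumption chosen for brevity; it reshuffles most URLs when the cluster size changes.

```python
import hashlib


def pick_cache(url, caches):
    """Map a URL to one cache, so each object lives in only one cache."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(caches)
    return caches[index]


# Hypothetical cluster members.
cluster = ["cache-a.example.net", "cache-b.example.net", "cache-c.example.net"]
chosen = pick_cache("http://www.cs.huji.ac.il/~dbsi/index.html", cluster)
```

Every client or proxy that uses the same function and the same list asks the same cache for a given document, with no lookup traffic between the caches.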
Cooperative Caching (cont.) • HTTP 1.1 hop-by-hop headers: • Connection • options that are desired for that particular connection (e.g. Connection: close) • Public • lists the set of methods supported by the server • Proxy-Authenticate • enables authentication between two hops. • Transfer-Encoding • transformation (such as compression) applied to the message between two hops. • Upgrade • additional communication protocols supported.
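Because hop-by-hop headers apply to a single connection, a proxy should not pass them on to the next hop. The sketch below uses only the header list from the slide above. The function name `forwardable_headers` and the `X-Hop` header are hypothetical, and treating tokens in the Connection header as further header names to drop is included as part of the sketch.

```python
HOP_BY_HOP = {"connection", "public", "proxy-authenticate", "transfer-encoding", "upgrade"}


def forwardable_headers(headers):
    """Return the headers a proxy may pass on to the next hop."""
    drop = set(HOP_BY_HOP)
    for name, value in headers.items():
        if name.lower() == "connection":
            # Tokens listed in Connection name options for this hop only.
            drop.update(token.strip().lower() for token in value.split(","))
    return {name: value for name, value in headers.items() if name.lower() not in drop}


incoming = {
    "Content-Type": "text/html",
    "Connection": "close",
    "Transfer-Encoding": "chunked",
    "Upgrade": "HTTP/2.0",
}
outgoing = forwardable_headers(incoming)
```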
Persistent & Non Persistent Connections. • Nonpersistent connections: • A new TCP connection is opened for each request. • For example: a web page with 10 images needs 11 new TCP connections. • Used in HTTP/1.0 • Persistent connections: • One TCP connection can serve more than one request/response pair. • Less connection-establishment overhead, smaller slow-start delay. • Used by default in HTTP/1.1
Persistent & Non Persistent Connections (cont.) • Persistent connections, two types: • without pipelining • the client issues a new request only when the previous response has arrived. • with pipelining • the client sends a request as soon as it encounters a reference. • Multiple requests/responses can share the same TCP packet, • or travel in back-to-back packets.
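The connection overhead of the three styles can be compared with a back-of-the-envelope round-trip count. This is a deliberately simplified model, not a measurement: it assumes one round trip to open a TCP connection and one per request/response, ignores transmission time and slow start, and assumes requests on separate connections are made one after another. The mode names are hypothetical labels.

```python
def round_trips(num_images, mode):
    """Estimate round trips to fetch a page with num_images inline images."""
    objects = num_images + 1  # the base page plus its images
    if mode == "new_connection_per_request":
        # One TCP connection per object: handshake + request, every time.
        return 2 * objects
    if mode == "reused_connection":
        # One connection for everything, requests sent one at a time.
        return 1 + objects
    if mode == "reused_connection_pipelined":
        # Handshake, base page, then all image requests sent back to back.
        return 1 + 1 + (1 if num_images else 0)
    raise ValueError(f"unknown mode: {mode}")


for mode in ("new_connection_per_request", "reused_connection", "reused_connection_pipelined"):
    print(mode, round_trips(10, mode))
```

For the 10-image page used as the example above, this model gives 22, 12, and 3 round trips respectively.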
Compression • Most image and video formats (GIF, JPEG, MPEG) are precompressed. • Many other data types used on the Web are not. • Compression could save almost 40% of the bytes sent via HTTP • Client and server need to negotiate the use of codings.
Compression (cont.) • Client sends: Accept-Encoding header • indicates which content-codings it can handle, and which ones it prefers. • Server sends: • Content-Encoding header - indicates the end-to-end coding. • Transfer-Encoding header - indicates the hop-by-hop coding. (supported only in HTTP/1.1)
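The negotiation above can be sketched from the server's side: read Accept-Encoding, and compress only if the client accepts a coding the server supports. This sketch assumes the server offers gzip only and that the header is well formed; the function names are hypothetical, and preference ordering by q-value is not modeled beyond treating q=0 as a refusal.

```python
import gzip


def accepted_codings(accept_encoding):
    """Return the set of content-codings the client says it can handle."""
    accepted = set()
    for item in accept_encoding.split(","):
        coding, _, params = item.partition(";")
        params = params.replace(" ", "")
        refused = params.startswith("q=") and float(params[2:]) == 0.0
        if coding.strip() and not refused:
            accepted.add(coding.strip().lower())
    return accepted


def encode_body(body, accept_encoding):
    """Compress the body with gzip when the client accepts it."""
    if "gzip" in accepted_codings(accept_encoding):
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}  # send the body as is


page = b"<HTML><BODY>" + b"hello web " * 200 + b"</BODY></HTML>"
encoded, extra_headers = encode_body(page, "gzip, compress")
```

The client reverses the coding named in Content-Encoding, here with `gzip.decompress(encoded)`.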
W3C Performance Measurements • "Microscape" benchmark, 43 inline images • Scenarios: • HTTP/1.0: using 4 simultaneous connections • HTTP/1.1: using 1 persistent connection • HTTP/1.1 pipeline: using 1 persistent connection • HTTP/1.1 pipeline + compression: using 1 connection
Authentication • Many sites require users to provide a username and password in order to access the documents housed on the server. • Provides a mechanism for keeping track of users (more than a security mechanism). • How does it work? • The client sends • an ordinary request message • The server responds with • a 401 Authorization Required status code • a WWW-Authenticate header, which specifies how to perform authentication
Authentication (cont.) • The client resends • the request message, this time including an Authorization header (e.g. user name & password). • The client continues to add this header to each following request to that server.
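As one concrete case of the Authorization header, the sketch below builds the value for the Basic scheme, in which the user name and password are joined with a colon and base64-encoded. Using the Basic scheme is an assumption of this example (the slides do not name a scheme), and the function name is hypothetical. Base64 is an encoding, not encryption, which fits the remark above that this is more a tracking mechanism than a security mechanism.

```python
import base64


def basic_authorization(user, password):
    """Build an Authorization header value for the Basic scheme."""
    credentials = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Illustrative credentials only.
header_value = basic_authorization("Aladdin", "open sesame")
retry_headers = {"Authorization": header_value}
```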
Cookies • Another mechanism that sites use to keep track of users. • Example: • The client contacts a web site for the first time. • The server responds with a • Set-cookie: 1678453 header • The client stores the cookie value and the server name in a special “cookie file”. • For each further request to that server, the client adds the • Cookie: 1678453 header
Cookies (cont.) • Usage: • The server requires authentication but doesn’t want to hassle the user with a user name and password. • Remembering the user’s preferences for advertising. • Enabling a virtual shopping cart. • Problems • Users who access the same site from different machines.
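The client side of the cookie exchange can be sketched as a small in-memory "cookie file" keyed by server name. It follows the simplified `Set-cookie: 1678453` form used in the example above rather than the full cookie syntax, and the class name `SimpleCookieFile` and the server name are hypothetical.

```python
class SimpleCookieFile:
    """Remember one cookie value per server, as in the slide's example."""

    def __init__(self):
        self._by_server = {}

    def store(self, server, response_headers):
        """Record the cookie if the response carries a Set-cookie header."""
        value = response_headers.get("Set-cookie")
        if value is not None:
            self._by_server[server] = value.strip()

    def headers_for(self, server):
        """Headers to add to the next request sent to this server."""
        value = self._by_server.get(server)
        return {"Cookie": value} if value is not None else {}


cookie_file = SimpleCookieFile()
cookie_file.store("shop.example.com", {"Set-cookie": "1678453"})
next_request_headers = cookie_file.headers_for("shop.example.com")
```

Since the file lives on one machine, a user who visits from a second machine starts with an empty file, which is the problem noted above.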
References • http://www.ietf.org/rfc/rfc2068.txt • http://www.ietf.org/rfc/rfc1945.txt • http://www.w3.org/Protocols/ • http://www8.org/w8-papers/5c-protocols/key/key.html • Computer Networking by James F. Kurose & Keith W. Ross.