Lecture 4 Basic Web Concepts - PowerPoint PPT Presentation

Presentation Transcript

  1. CS 502 Computing Methods for Digital Libraries
     Cornell University – Computer Science
     Herbert Van de Sompel, herbertv@cs.cornell.edu
     Lecture 4: Basic Web Concepts

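  2. HyperText Transfer Protocol (HTTP)
     [Diagram: a web browser (HTTP client) and a web server (HTTP server), each with its own IP address (IP address 1, IP address 2), are connected by a TCP/IP network. The browser sends an HTTP request; the server returns an HTTP response, which the browser renders.]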

  3. Transmission Control Protocol/Internet Protocol (TCP/IP)
     • is the protocol suite that drives the Internet
     • handles network communications between network nodes (computers, printers, webcams, … connected to the Internet)
     • protocol suite:
       • TCP: reliable, connection-oriented communication of data between applications
       • IP: communication of data between nodes
       • UDP: connectionless communication between applications, without delivery guarantees
       • ICMP: error and status messages

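  4. TCP/IP protocol architecture
     [Diagram: the client sends an HTTP request down its protocol stack; the server receives the HTTP request up its own stack. Layers, top to bottom: Application layer; Transport layer (TCP); Internet layer (IP); Network Access layer (Ethernet, …).]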

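  5. Transmission Control Protocol (TCP)
     • breaks message up into chunks
     • each chunk gets a sequence number and the address of the addressee (strictly, TCP adds the port number and the IP layer adds the IP address)
     • opens connection with addressee (handshake)
     • hands chunks over to IP layer
     • guarantees error-free, in-order delivery of chunks at addressee (through the connection: chunks are acknowledged, and lost or corrupted chunks are retransmitted)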
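
The chunking and reordering described on this slide can be sketched in a few lines. This is only an illustration, not how a TCP stack is implemented: the function names `split_into_chunks` and `reassemble` are made up for this example, and real TCP sequence numbers count bytes rather than chunks.

```python
import random

def split_into_chunks(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split a message into (sequence_number, payload) chunks of at most `size` bytes."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]

def reassemble(chunks: list[tuple[int, bytes]]) -> bytes:
    """Sort chunks by sequence number and join their payloads."""
    return b"".join(payload for _, payload in sorted(chunks))

message = b"GET / HTTP/1.1\r\nHost: no.good.com\r\n\r\n"
chunks = split_into_chunks(message, size=8)
random.shuffle(chunks)                  # simulate chunks arriving out of order
print(reassemble(chunks) == message)    # the receiver restores the original order
```

The sequence number is what lets the receiver rebuild the message even though the IP layer may deliver the chunks in any order.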

  6. Internet Protocol (IP)
     • handles the routing of chunks towards addressee (through routers)
     • IP Addressing:
       • each node has an IP address: 157.193.101.6
       • each node can have a readable name: erlserv.rug.ac.be
       • DNS connects IP address and readable name
     • IP Data Transmission:
       • sender delivers chunk to router (via lower level protocol)
       • router delivers chunk to the next router or to the destination host
       • individual chunks can be delivered via different paths
       • each router decides on the next hop (the "path of least resistance")
       • at addressee, IP delivers chunk to TCP layer
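
As a small illustration of IP addressing, the sketch below shows that a dotted address such as 157.193.101.6 is just a readable form of a 32-bit number, and how a readable name can be looked up through the system resolver. The helper names `to_int`, `to_dotted` and `resolve` are invented for this example; `resolve` needs a working DNS setup, so it is defined but not called here.

```python
import ipaddress
import socket

def to_int(dotted: str) -> int:
    """Return the 32-bit integer behind a dotted IPv4 address."""
    return int(ipaddress.IPv4Address(dotted))

def to_dotted(number: int) -> str:
    """Return the dotted form of a 32-bit IPv4 address."""
    return str(ipaddress.IPv4Address(number))

def resolve(name: str) -> str:
    """Ask the system resolver (DNS) for the IPv4 address of a readable name."""
    return socket.gethostbyname(name)

n = to_int("157.193.101.6")
print(n == (157 << 24) | (193 << 16) | (101 << 8) | 6)   # four octets packed into one number
print(to_dotted(n))                                        # back to the dotted form
```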

  7. TCP/IP protocol architecture
     • Application layer: HTTP, FTP, telnet
     • Transport layer: TCP, UDP
     • Internet layer: IP, ICMP
     • Network Access layer: Ethernet, …
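
One way to picture the layered architecture is as nested envelopes: on the way down each layer wraps the data it receives in its own header, and on the way up each layer removes its header again. The sketch below uses placeholder text labels such as `[TCP]` instead of real binary headers, and the function names are hypothetical.

```python
# Placeholder headers, listed from the transport layer down to network access.
LAYER_HEADERS = (b"[TCP]", b"[IP]", b"[ETH]")

def encapsulate(application_data: bytes) -> bytes:
    """Sender side: each lower layer prepends its header to what it is handed."""
    frame = application_data
    for header in LAYER_HEADERS:
        frame = header + frame
    return frame

def decapsulate(frame: bytes) -> bytes:
    """Receiver side: strip the headers in the opposite order."""
    for header in reversed(LAYER_HEADERS):
        if not frame.startswith(header):
            raise ValueError(f"expected header {header!r}")
        frame = frame[len(header):]
    return frame

frame = encapsulate(b"GET / HTTP/1.1")
print(frame)
print(decapsulate(frame))
```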

  8. HTTP request (from the web browser / HTTP client to the web server / HTTP server at no.good.com)
     Parts: method, header, entity-body
       GET / HTTP/1.1
       Date: Wednesday, 02-Feb-99 23:04:12 GMT
       Accept-Language: en-us
       User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
       Host: no.good.com
       Connection: Keep-Alive
       * a blank line *

  9. HTTP request
     method: method URI HTTP-version
       GET - POST - HEAD - PUT - …
       GET / HTTP/1.1
     header
     • general-header: optional, general information
       • Date: Wednesday, 02-Feb-99 23:04:12 GMT
       • Connection: Keep-Alive
     • request-header: about client
       • Accept-Language: en-us
       • User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
     • entity-header: about entity-body
     entity-body: what is sent to the server
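
The three parts of a request (method line, header, entity-body) can be pulled apart with plain string handling. The following is a minimal sketch, assuming well-formed input with CRLF line endings; `parse_request` is a name chosen for this example, and real servers handle many more cases (repeated headers, folded lines, malformed input).

```python
def parse_request(raw: str):
    """Split a raw HTTP request into method, URI, version, headers and entity-body."""
    head, _, body = raw.partition("\r\n\r\n")      # the blank line ends the header
    request_line, *header_lines = head.split("\r\n")
    method, uri, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")       # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers, body

raw = (
    "GET / HTTP/1.1\r\n"
    "Date: Wednesday, 02-Feb-99 23:04:12 GMT\r\n"
    "Accept-Language: en-us\r\n"
    "Host: no.good.com\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)
method, uri, version, headers, body = parse_request(raw)
print(method, uri, version)
print(headers["host"], "|", headers["date"])
```

Header names are lower-cased here because HTTP header names are case-insensitive.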

  10. HTTP response (from the web server / HTTP server at no.good.com to the web browser / HTTP client)
      Parts: status, header, entity-body
        HTTP/1.1 200 OK
        Date: Wednesday, 02-Feb-99 23:04:25 GMT
        Server: Apache/1.3.6 (Unix)
        Last-Modified: Sun, 01 Feb 1999 13:54:26 GMT
        ETag: "2f5cd-964-38js8"
        Content-length: 327
        Connection: close
        Content-Type: text/html
        * a blank line *
        <title>Welcome to nogood</title>
        <img src="/images/nogood-logo.gif">

  11. HTTP response
      status: HTTP-version Status-code Reason-phrase
        HTTP/1.1 200 OK
      header
      • general-header: optional, general information
        • Date: Wednesday, 02-Feb-99 23:04:25 GMT
      • response-header: about server
        • Server: Apache/1.3.6 (Unix)
      • entity-header: about entity-body
        • Content-Type: text/html
        • ETag: "2f5cd-964-38js8"
        • Content-length: 327
      entity-body: what is sent to the client
        <title>Welcome to nogood</title>
        <img src="/images/nogood-logo.gif">
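
A client reads a response in the same order as the slide lists its parts: status line first, then the header, then as many bytes of entity-body as Content-length announces. The sketch below is illustrative only; `parse_response` is a hypothetical helper, it assumes the whole response is already in memory, and it ignores features such as chunked transfer encoding.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into version, status code, reason, headers and entity-body."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Content-length says how many bytes belong to the entity-body.
    length = int(headers.get("content-length", len(rest)))
    return version, int(code), reason, headers, rest[:length]

entity = b'<title>Welcome to nogood</title>\n<img src="/images/nogood-logo.gif">\n'
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: Apache/1.3.6 (Unix)\r\n"
    b"Content-length: " + str(len(entity)).encode() + b"\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n" + entity
)
version, code, reason, headers, body = parse_response(raw)
print(code, reason, headers["content-type"], len(body))
```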

  12. HTTP request (from the web browser / HTTP client to the web server / HTTP server at no.good.com)
        GET /images/nogood-logo.gif HTTP/1.1
        Date: Wednesday, 02-Feb-99 23:04:27 GMT
        Accept-Language: en-us
        User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
        Host: no.good.com
        Connection: Keep-Alive
        * a blank line *

  13. HTTP response (from the web server / HTTP server at no.good.com to the web browser / HTTP client)
        HTTP/1.1 200 OK
        Date: Wednesday, 02-Feb-99 23:04:29 GMT
        Server: Apache/1.3.6 (Unix)
        Last-Modified: Sun, 01 Feb 1999 08:20:00 GMT
        ETag: "2f5cd-964-445e"
        Content-length: 220
        Connection: close
        Content-Type: image/gif
        * a blank line *
        the GIF file
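
Slides 10 to 13 show why one page leads to several requests: the HTML in the first response refers to an image, so the browser issues a second request for it. A rough sketch of that step, using Python's standard HTML parser, could look as follows; `ImageFinder` and `embedded_image_urls` are names invented for this example, and a real browser also follows stylesheets, scripts and other embedded resources.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImageFinder(HTMLParser):
    """Collect the src attribute of every <img> tag."""
    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)

def embedded_image_urls(base_url: str, html: str) -> list[str]:
    """Return absolute URLs of the images the browser still has to request."""
    finder = ImageFinder()
    finder.feed(html)
    finder.close()
    return [urljoin(base_url, src) for src in finder.sources]

html = '<title>Welcome to nogood</title> <img src="/images/nogood-logo.gif">'
for url in embedded_image_urls("http://no.good.com/", html):
    print("next request: GET", url)
```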

  14. HyperText Transfer Protocol (HTTP)
      [Diagram: the web browser (HTTP client) sends an HTTP request to the web server (HTTP server); the HTTP response carries a MIME type plus a file, which the browser renders.]

  15. Browser
      The MIME type of the received file determines which presentation software displays it:
      • built into browser
      • plug-in
      • helper application
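
The choice of presentation software can be thought of as a lookup keyed by MIME type. The tables below are purely hypothetical (which types a given browser handles itself, through a plug-in, or through a helper application depends on the browser and its configuration), as is the fallback of offering to save the file.

```python
# Hypothetical configuration of one browser; real browsers differ.
BUILT_IN = {"text/html", "text/plain", "image/gif", "image/jpeg"}
PLUG_INS = {"application/pdf"}
HELPERS = {"application/msword", "application/postscript"}

def choose_presenter(content_type: str) -> str:
    """Pick the kind of presentation software for a Content-Type header value."""
    # Drop parameters such as "; charset=iso-8859-1" and normalise case.
    mime_type = content_type.split(";", 1)[0].strip().lower()
    if mime_type in BUILT_IN:
        return "built into browser"
    if mime_type in PLUG_INS:
        return "plug-in"
    if mime_type in HELPERS:
        return "helper application"
    return "unknown type: offer to save the file"   # assumed fallback

for value in ("text/html; charset=iso-8859-1", "image/gif", "application/pdf"):
    print(value, "->", choose_presenter(value))
```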

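  16. HTTP Proxies
      • Reduce network traffic: caching (ETag, Last-Modified)
      • IP-based authentication
      [Diagram: an HTTP proxy with a cache sits between the web browser (HTTP client) and the web server (HTTP server at no.good.com); it acts as a server towards the client and as a client towards the server.]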
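
How an ETag lets a proxy cache reduce traffic can be simulated without any network. In this sketch the origin server is a plain function and the proxy revalidates its cached copy on every request; the names `ORIGIN`, `origin_server` and `CachingProxy` are made up, and real proxies additionally use expiry times and the Last-Modified date.

```python
# Simulated origin server content: path -> (ETag, entity-body).
ORIGIN = {"/": ('"2f5cd-964-38js8"', b"<title>Welcome to nogood</title>")}

def origin_server(path, if_none_match=None):
    """Answer 304 (no body) when the client's ETag is still current, else 200 with the body."""
    etag, body = ORIGIN[path]
    if if_none_match == etag:
        return 304, etag, b""
    return 200, etag, body

class CachingProxy:
    def __init__(self):
        self.cache = {}            # path -> (ETag, entity-body)
        self.full_transfers = 0    # how often a complete body crossed the network

    def get(self, path):
        cached = self.cache.get(path)
        known_etag = cached[0] if cached else None
        status, etag, body = origin_server(path, if_none_match=known_etag)
        if status == 304:          # cached copy is still valid
            return cached[1]
        self.full_transfers += 1
        self.cache[path] = (etag, body)
        return body

proxy = CachingProxy()
proxy.get("/")
proxy.get("/")
print(proxy.full_transfers)        # the second request was answered from the cache
```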

  17. HTTP cookies
      • HTTP protocol is stateless: once a server has responded to a client, it forgets about it. No session information.
      • Fake state with cookies:
        • server sends token to client
        • client sends token back to server with subsequent requests
        • server understands the meaning of the token
        • for instance: the server avoids asking for username/password with every request by reading the authorization from the cookie
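
The token exchange can be sketched with Python's standard cookie module. Everything application-specific here is an assumption made for illustration: the cookie name `session`, the in-memory `SESSIONS` table and the helper names. A real site would also set attributes such as an expiry time and would not keep sessions in a global dictionary.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}   # token -> username; the server's memory of what each token means

def login(username: str) -> str:
    """Server side: create a token and return the Set-Cookie header line for the response."""
    token = secrets.token_hex(16)
    SESSIONS[token] = username
    cookie = SimpleCookie()
    cookie["session"] = token
    return cookie.output()              # "Set-Cookie: session=<token>"

def user_for_request(cookie_header: str):
    """Server side: read the token from a request's Cookie header value."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session")
    return SESSIONS.get(morsel.value) if morsel else None

set_cookie_line = login("herbert")
# Client side: the browser stores the cookie and sends it back with later requests.
cookie_value = set_cookie_line.split(": ", 1)[1]
print(user_for_request(cookie_value))
```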

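  18. Dynamic content: Common Gateway Interface (CGI)
      • Client interaction with non-web servers
      [Diagram: the web browser (HTTP client) sends an HTTP request to the web server (HTTP server at no.good.com); the server hands the request to a program via CGI and returns the program's output as the HTTP response.]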

  19. CGI -- HTTP POST request (from the web browser / HTTP client to the web server / HTTP server at no.good.com, which passes it via CGI to the program find)
        POST /cgi-bin/find HTTP/1.1
        Date: Wednesday, 02-Feb-99 23:04:27 GMT
        Accept-Language: en-us
        User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
        Host: no.good.com
        Connection: Keep-Alive
        Content-length: 26
        Content-type: application/x-www-form-urlencoded
        * a blank line *
        search=herbert&type=author

  20. CGI -- HTTP GET request (from the web browser / HTTP client to the web server / HTTP server at no.good.com, which passes it via CGI to the program find)
        GET /cgi-bin/find?search=herbert&type=author HTTP/1.1
        Date: Wednesday, 02-Feb-99 23:04:27 GMT
        Accept-Language: en-us
        User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
        Host: no.good.com
        Connection: Keep-Alive
        * a blank line *
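
The POST and GET variants carry the same form data; only its position differs (entity-body versus query string). The sketch below builds both request heads from the same fields with the standard `urllib.parse` functions. The helper names `build_get` and `build_post` are hypothetical, and only a few headers are included.

```python
from urllib.parse import urlencode, parse_qs

fields = {"search": "herbert", "type": "author"}

def build_get(path: str, host: str, fields: dict) -> str:
    """Form data travels in the URI, after the question mark."""
    return (f"GET {path}?{urlencode(fields)} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "\r\n")

def build_post(path: str, host: str, fields: dict) -> str:
    """Form data travels in the entity-body; Content-length announces its size."""
    body = urlencode(fields)
    return (f"POST {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"Content-length: {len(body.encode('ascii'))}\r\n"
            "Content-type: application/x-www-form-urlencoded\r\n"
            "\r\n"
            f"{body}")

print(build_get("/cgi-bin/find", "no.good.com", fields))
print(build_post("/cgi-bin/find", "no.good.com", fields))
print(parse_qs(urlencode(fields)))      # the receiving side decodes the same string
```

The encoded string `search=herbert&type=author` is 26 characters long, which matches the Content-length shown on the POST slide.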

  21. CGI - the interface
      • the program find receives input from
        • STDIN
        • environment variables (about client, server, request, …)
      Input passed via CGI by the web server (HTTP server at no.good.com) to find:
        search=herbert&type=author
        SERVER_NAME server.good.com
        REMOTE_HOST 157.193.101.6
        …

  22. CGI - the interface
      • the program find outputs to STDOUT:
          Content-type: text/html
          * a blank line *
          <title>Search results</title>
          …
      • the web server (HTTP server at no.good.com) adds header information and sends the response to the client
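
A program on the other side of the interface might look roughly like the sketch below. It is not the lecture's find program: the search itself is left out, and the function names are invented. It relies on the standard CGI environment variables REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH; the environment and STDIN are passed in as arguments so that the example can be run without a web server.

```python
import io
from html import escape
from urllib.parse import parse_qs

def read_form(environ, stdin) -> dict:
    """POST data arrives on STDIN; GET data arrives in the QUERY_STRING variable."""
    if environ.get("REQUEST_METHOD") == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        raw = environ.get("QUERY_STRING", "")
    return {name: values[0] for name, values in parse_qs(raw).items()}

def find(environ, stdin) -> str:
    """Return what the program would write to STDOUT: a header, a blank line, the body."""
    form = read_form(environ, stdin)
    search = escape(form.get("search", ""))
    kind = escape(form.get("type", ""))
    return ("Content-type: text/html\n"
            "\n"
            "<title>Search results</title>\n"
            f"<p>Searching for {kind}: {search}</p>\n")

# Simulate the web server calling the program for a GET and for a POST request.
# Under a real server one would instead call find(os.environ, sys.stdin) and print the result.
body = "search=herbert&type=author"
get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": body}
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
print(find(get_env, io.StringIO("")))
print(find(post_env, io.StringIO(body)) == find(get_env, io.StringIO("")))
```

The program only emits the Content-type line itself; as the slide notes, the web server adds the remaining header information before sending the response to the client.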

  23. Dynamic content: Mobile code - JavaScript
      • Executed by the browser
      • User interface, client-side validation, …
      [Diagram: the web server (HTTP server at no.good.com) sends an HTTP response containing HTML and JavaScript to the web browser (HTTP client).]

  24. Dynamic content: Mobile code – Java applets
      • Executed by virtual machine
      • Interaction with find not via HTTP
      [Diagram: the web server (HTTP server at no.good.com) delivers the Java code to the web browser (HTTP client) in an HTTP response; the applet then interacts with the program find directly.]

  25. Want to read a bit more?
      • on Web Characterization: http://www.w3.org/1999/05/WCA-terms/01
      • on CGI: http://www.ukans.edu/~acs/docs/other/forms-intro.shtml
      • on Web, TCP/IP, CGI: http://www.wdvl.com/Authoring/Tools/Tutorial/index4.html
      • on HTTP: http://www.ietf.org/rfc/rfc1945.txt?number=1945 ; http://www.jmarshall.com/easy/http/