
Distributed Systems: Origin, Communication, and Security

Learn about the origin and overview of the web, distributed system aspects, communication protocols, naming conventions, replication and fault tolerance, and web security.

jvandyke


Presentation Transcript


  1. The Web • Origin and overview of the web • Drill-down on distributed system aspects • Communication • Processes • Naming • Synchronization • Replication (especially caching) • Fault tolerance • Security Distributed Systems - Comp 655

  2. Origin of the web • CERN (European particle physics lab) • Purpose: facilitate document sharing • Large user community • Geographically dispersed • Founder: Tim Berners-Lee • Use exploded in the late 1990s • Graphical user interfaces (Mosaic and descendants) • Huge amounts of content • Search engines • Interactive pages

  3. Definition of the Web • Many standards • HTML • HTTP • DNS • URL, URI, URN • XML • DOM • W3C • IETF

  4. A word about RFCs • Standards track • Proposed standard • Draft standard (at least two independent and interoperable implementations) • Internet standard (also has STD number, for example IP is STD-005 and RFC-0791) • “Off-track” • Experimental • Informational • Historic(al) • See RFC 2026 for details

  5. Yet more words about RFCs • Before using an RFC, check the Obsolete RFC list or find it on the Active RFC list • I use the RFC index at faqs.org because I find it a bit easier to use than the IETF’s list • Remember, if there’s a conflict, the IETF is the authority

  6. Overall structure

  7. What’s in a web page? (diagram label: Client-side script)

  8. Some web pages are XML

  9. XML document type definition

  10. Other document types

  11. CGI – early Web interaction

  12. Problems with CGI • Process per request • Wide variety in server-side runtime environments • Solutions • Server-side scripting (JSP, ASP, PHP) • Servlets

  13. Problems with browsers • Browser-based user interfaces tend to be clunky and limited • Solutions: • Client-side scripting • Applets • More recently, AJAX • An example: http://www.javarss.com/ajax/j2ee-ajax.html • See http://en.wikipedia.org/wiki/AJAX for more information

  14. Server-side scripts and servlets

  15. Nothing’s perfect • What Web technology has big problems with server-side page generation?

  16. Communication on the web: HTTP • TCP-based client/server protocol • Create connection • Send request • Send response • Close connection • HTTP/1.1 reduces connection overhead with persistent connections
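
The four steps on this slide can be sketched with a raw TCP socket. This is a minimal illustration, not part of the original deck: to stay self-contained it starts a throwaway local server from the Python standard library, and the handler class `HelloHandler`, the helper names `fetch` and `demo`, and the response body are all assumptions made for the example. The path `/xyzzy` is borrowed from the request example on slide 19.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(host, port, path):
    """Perform one non-persistent HTTP exchange over a TCP socket."""
    # 1. Create connection
    with socket.create_connection((host, port), timeout=5) as sock:
        # 2. Send request ("Connection: close" asks for a non-persistent exchange)
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        # 3. Read the response until the server closes its side
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # 4. Leaving the "with" block closes the connection
    return b"".join(chunks)


def demo():
    """Start the local server, fetch one page, and shut the server down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_address[1], "/xyzzy")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo().decode("latin-1"))
```

With a persistent connection, steps 2 and 3 would repeat on the same socket before step 4, which is where the saving in connection overhead comes from.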

  17. HTTP connections (diagram: non-persistent vs. persistent)

  18. HTTP request types

  19. HTTP request example (the request line gives type, path, and protocol; headers follow)
      GET /xyzzy HTTP/1.1
      Connection: Keep-Alive
      Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword, application/x-shockwave-flash, */*
      Accept-Language: en-us
      Host: laptop:1215
      If-Modified-Since: Sun, 27 Jun 2004 00:58:28 GMT
      User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
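
To make the parts of this request concrete, here is a small sketch that splits a raw request into type (method), path, protocol, and headers. The function name `parse_request` is an assumption for illustration, the Accept and User-Agent headers are left out for brevity, and a real server would use a full HTTP parser rather than string splitting.

```python
def parse_request(raw):
    """Split a raw HTTP request into (method, path, protocol, headers)."""
    head = raw.split("\r\n\r\n", 1)[0]      # ignore any message body
    lines = head.split("\r\n")
    method, path, protocol = lines[0].split(" ", 2)  # the request line
    headers = {}
    for line in lines[1:]:
        if not line:
            continue
        name, _, value = line.partition(":")  # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return method, path, protocol, headers


RAW = (
    "GET /xyzzy HTTP/1.1\r\n"
    "Connection: Keep-Alive\r\n"
    "Accept-Language: en-us\r\n"
    "Host: laptop:1215\r\n"
    "If-Modified-Since: Sun, 27 Jun 2004 00:58:28 GMT\r\n"
    "\r\n"
)

method, path, protocol, headers = parse_request(RAW)
print(method, path, protocol)
print(headers["host"])
```

Header names are lowercased here because HTTP header names are case-insensitive.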

  20. HTTP header types

  21. Processes • Browsers • Proxies • Apache web server framework

  22. Browser with plug-in

  23. Web proxy • Most browsers today support FTP. However, proxies are still used for shared caching.

  24. Apache modules www.apache.org

  25. Server cluster – simple minded

  26. Server cluster - clever

  27. Web naming • URI • URL • URN

  28. URI examples from RFC 2396
      ftp://ftp.is.co.za/rfc/rfc1808.txt -- ftp scheme for File Transfer Protocol services
      gopher://spinaltap.micro.umn.edu/00/Weather/California/Los%20Angeles -- gopher scheme for Gopher and Gopher+ Protocol services
      http://www.math.uio.no/faq/compression-faq/part1.html -- http scheme for Hypertext Transfer Protocol services
      mailto:mduerst@ifi.unizh.ch -- mailto scheme for electronic mail addresses
      news:comp.infosystems.www.servers.unix -- news scheme for USENET news groups and articles
      telnet://melvyl.ucop.edu/ -- telnet scheme for interactive services via the TELNET Protocol
      More examples on page 670
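
As a quick check on how these URIs decompose, the sketch below runs them through `urllib.parse.urlsplit` from the Python standard library. The helper name `describe` is an assumption for illustration; note that schemes such as mailto and news have no `//host` part, so everything after the colon lands in the path.

```python
from urllib.parse import urlsplit

EXAMPLES = [
    "ftp://ftp.is.co.za/rfc/rfc1808.txt",
    "gopher://spinaltap.micro.umn.edu/00/Weather/California/Los%20Angeles",
    "http://www.math.uio.no/faq/compression-faq/part1.html",
    "mailto:mduerst@ifi.unizh.ch",
    "news:comp.infosystems.www.servers.unix",
    "telnet://melvyl.ucop.edu/",
]


def describe(uri):
    """Return (scheme, host part, path) for a URI."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path


for uri in EXAMPLES:
    print(describe(uri))
```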

  29. Naming – URL – how to access

  30. Naming – URN – true resource identifier • RFC 2648 defines a URN namespace for IETF documents. • RFC 2141 defines URN syntax. • RFC 3406 is a BCP (Best Current Practice) for defining URN namespaces.
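
RFC 2141 gives a URN the overall shape `urn:<NID>:<NSS>`, where the NID is the namespace identifier and the NSS is the namespace-specific string. The sketch below splits a URN along those lines. It is deliberately simplified: the function name `parse_urn` is an assumption, and the NSS is accepted as-is rather than checked against the RFC's character rules. The sample URNs follow the `ietf` namespace from RFC 2648.

```python
import re

# "urn:" prefix and NID are case-insensitive; the NID is 1 to 32 characters
# (letters, digits, hyphen) and may not start with a hyphen.
URN_RE = re.compile(r"^urn:([A-Za-z0-9][A-Za-z0-9-]{0,31}):(.+)$", re.IGNORECASE)


def parse_urn(urn):
    """Return (namespace identifier, namespace-specific string) for a URN."""
    match = URN_RE.match(urn)
    if match is None:
        raise ValueError(f"not a URN: {urn!r}")
    nid, nss = match.groups()
    return nid.lower(), nss  # normalize the case-insensitive NID


print(parse_urn("urn:ietf:rfc:2648"))
print(parse_urn("URN:IETF:rfc:2141"))
```

Unlike a URL, nothing in the result says where or how to fetch the document; a separate resolution step would be needed for that.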

  31. Activity – hitting a web page • Check your understanding: draw a UML sequence diagram showing the interaction of key software elements when a browser hits a web page containing graphics • Assume the web page and the images are on different servers • “Classes” in the diagram should include • Browser • DNS resolver • DNS server • Server for the page • Server for the images

  32. Not much to synchronize … • Generally, web clients don’t exchange information with other clients, and servers don’t exchange with other servers • Most documents have a single author – few write/write conflicts • However, WebDAV provides a simple locking and versioning scheme • Locks are connection-independent • Handling abandoned locks is left to implementation
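
The two lock properties on this slide can be illustrated with a toy lock table. This is a sketch under stated assumptions, not WebDAV itself: the class `LockTable` and its methods are hypothetical, locks are identified by a token rather than by the connection that created them, and abandoned locks are handled by one possible policy, a timeout after which the lock simply expires.

```python
import time
import uuid


class LockTable:
    """Hypothetical in-memory lock table (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._locks = {}  # resource path -> (token, expiry time)

    def lock(self, resource, timeout_s):
        """Return a new lock token, or None if a live lock already exists."""
        now = self._clock()
        held = self._locks.get(resource)
        if held is not None and held[1] > now:
            return None
        token = uuid.uuid4().hex
        self._locks[resource] = (token, now + timeout_s)
        return token

    def can_write(self, resource, token):
        """A write is allowed if there is no live lock or the token matches."""
        now = self._clock()
        held = self._locks.get(resource)
        if held is None or held[1] <= now:
            return True  # unlocked, or an abandoned lock has expired
        return held[0] == token

    def unlock(self, resource, token):
        """Release the lock if the token matches; report whether it did."""
        held = self._locks.get(resource)
        if held is not None and held[0] == token:
            del self._locks[resource]
            return True
        return False


if __name__ == "__main__":
    table = LockTable()
    token = table.lock("/report.html", timeout_s=60)
    print(table.can_write("/report.html", token))
    print(table.can_write("/report.html", "someone-else"))
```

Because the token travels with each request, the holder can disconnect, reconnect later, and still write, which is what connection independence means in practice.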

  33. Replication – client and proxy • Many organizations run proxy servers • Some proxies can cooperate • Virtually all browsers can cache
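
A cached copy is only useful if the cache can tell whether it is still current. One common mechanism is the `If-Modified-Since` header seen in the request example on slide 19: the cache sends the date of its copy, and the server answers 304 (Not Modified) if nothing has changed or 200 with a fresh copy otherwise. The sketch below shows that decision from the server's side; the function name `status_for_conditional_get` and the sample dates are assumptions for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for_conditional_get(if_modified_since, last_modified):
    """Return 304 if the cached copy is still current, else 200.

    if_modified_since: the HTTP date string sent by the browser or proxy cache.
    last_modified: timezone-aware datetime of the resource's last change.
    """
    cached_at = parsedate_to_datetime(if_modified_since)
    return 304 if last_modified <= cached_at else 200


header = "Sun, 27 Jun 2004 00:58:28 GMT"
unchanged = datetime(2004, 6, 1, tzinfo=timezone.utc)  # changed before the cached copy
updated = datetime(2004, 7, 1, tzinfo=timezone.utc)    # changed after the cached copy

print(status_for_conditional_get(header, unchanged))
print(status_for_conditional_get(header, updated))
```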

  34. Replication – server side • Server clusters • Mirror sites • Content delivery networks (CDNs) • For example, Akamai

  35. CDN operation • In Akamai’s CDN, embedded document URLs get resolved to the “closest” CDN server

  36. Security on the Web • Diagram annotation: “If using client authentication” • NOTE: uses both public-key and private-key (shared secret) encryption, for performance reasons • NOTE: client has to use the same server for the entire session
