Web Servers

nadine

Presentation Transcript


  1. Web Servers

  2. Pre-lecture Survey: What is the #1 web server: • Apache • Google • MS IIS HTTP server • nginx • Sun • Other

  3. http://en.wikipedia.org/wiki/Web_servers Generic Overview

  4. Web Servers • A web server can be a: • Computer Program • Responsible for accepting HTTP requests from clients (web browsers) • Returns HTTP responses with optional data contents • Usually web pages • HTML documents • Linked objects (images, etc.). • Computer • Running a computer program which provides the above functionality

  5. Common Features

  6. Common Features • HTTP • Accepts HTTP requests from a client • Provides HTTP responses to the client • The returned "document" is typically one of: • File containing HTML statements • Raw text file • Image • Some other type of document • Defined by MIME types • If an error occurs in a client request or while servicing the request: • Web server sends an error response • May include custom HTML • May have text messages • Better explain the problem to end user
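
The response behavior on this slide can be sketched in a few lines of Python. This is only an illustration, not how any particular server implements it: the function name `build_response` and the fallback type `application/octet-stream` are assumptions, while `mimetypes.guess_type` and `http.HTTPStatus` come from the standard library.

```python
import html
import mimetypes
from http import HTTPStatus


def build_response(path, body):
    """Build a status line, headers, and body for one request.

    `body` holds the resource's bytes, or None if nothing was found at `path`.
    (Hypothetical helper, for illustration only.)
    """
    if body is None:
        status = HTTPStatus.NOT_FOUND
        # Custom HTML that explains the problem to the end user.
        body = (
            f"<html><body><h1>{status.value} {status.phrase}</h1>"
            f"<p>No resource at {html.escape(path)}</p></body></html>"
        ).encode("utf-8")
        content_type = "text/html"
    else:
        status = HTTPStatus.OK
        # The MIME type is guessed from the file extension.
        guessed, _encoding = mimetypes.guess_type(path)
        content_type = guessed or "application/octet-stream"

    status_line = f"HTTP/1.1 {status.value} {status.phrase}"
    headers = {
        "Content-Type": content_type,
        "Content-Length": str(len(body)),
    }
    return status_line, headers, body
```

Calling `build_response("/index.html", b"<p>hi</p>")` yields a `200 OK` status line with a `text/html` content type, while passing `None` as the body produces a `404 Not Found` response carrying a small HTML error page.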

  7. Common Features • Logging • Web servers write detailed information to log files • Client requests • Server responses • Allows the Webmaster to collect data • By running log analyzers
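
A log analyzer can be as small as the sketch below. It assumes the log uses the Common Log Format, which the slides do not specify; the sample lines and the name `count_statuses` are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident user [time] "request" status size
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

# Invented sample entries, not taken from a real server.
SAMPLE = [
    '192.0.2.1 - - [10/Oct/2010:13:55:36 -0700] "GET /path/file.html HTTP/1.1" 200 2326',
    '192.0.2.7 - - [10/Oct/2010:13:55:40 -0700] "GET /missing.html HTTP/1.1" 404 209',
    '192.0.2.1 - - [10/Oct/2010:13:56:02 -0700] "GET /path/file.html HTTP/1.1" 200 2326',
]


def count_statuses(lines):
    """Count how many responses were sent per HTTP status code."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that do not fit the assumed format
            counts[match.group("status")] += 1
    return counts
```

Running `count_statuses(SAMPLE)` tallies two `200` responses and one `404`, the kind of summary a Webmaster would pull from a real log.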

  8. Additional Features • Authentication • Optional authorization request before allowing access to some or all resources • Requires a user name and password • Handles: • Static content • Dynamic content • Supports one or more related interfaces • SSI, CGI, SCGI, FastCGI, JSP, PHP, ASP, ASP.NET, Server API such as NSAPI, ISAPI, etc.

  9. Additional Features • HTTPS support • Via SSL or TLS • Allows secure (encrypted) connections • Uses port 443 instead of port 80 • Content compression • E.g. by gzip encoding • Reduces the size of the responses • Lower bandwidth usage, etc.
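
The effect of gzip content compression can be seen with Python's standard `gzip` module. This is a minimal sketch: the helper name `compress_body` is hypothetical, and the substring check on the `Accept-Encoding` value is a simplification that ignores quality values.

```python
import gzip


def compress_body(body, accept_encoding):
    """Gzip the body only if the client advertised gzip support.

    Returns the (possibly compressed) body and any extra response headers.
    """
    # Simplified check: a real server would parse the header properly.
    if "gzip" in accept_encoding.lower():
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


# Repetitive markup, as found in many web pages, compresses very well.
page = b"<li>item</li>\n" * 200
compressed, extra_headers = compress_body(page, "gzip, deflate")
print(len(page), "bytes before,", len(compressed), "bytes after")
```

The client reverses the step with the equivalent of `gzip.decompress` once it sees the `Content-Encoding: gzip` header.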

  10. Additional Features • Virtual hosting • Serve many web sites using one IP address • Large file support • Serve files greater than 2 GB • Typical 32 bit OS restriction • Bandwidth throttling • Limit the speed of responses • Do not saturate the network • Able to serve more clients
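
Name-based virtual hosting comes down to choosing a document root from the `Host` header of each request. The sketch below assumes made-up host names and directories and a hypothetical helper `document_root`; it does not handle IPv6 literals in the header.

```python
# Hypothetical mapping of site names to document roots on one machine.
VIRTUAL_HOSTS = {
    "www.example.com": "/var/www/example",
    "images.example.com": "/var/www/images",
}
DEFAULT_ROOT = "/var/www/default"


def document_root(host_header):
    """Pick the document root for a request based on its Host header."""
    # Drop an optional ":port" suffix and compare case-insensitively.
    hostname = host_header.split(":", 1)[0].strip().lower()
    return VIRTUAL_HOSTS.get(hostname, DEFAULT_ROOT)
```

Both sites can share one IP address because the server tells them apart only after reading the request, e.g. `document_root("images.example.com:80")` selects `/var/www/images`.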

  11. Where does the requested material come from? Origin of the returned content

  12. Content Origin • Origin of the returned content may be: • Static • Pre-existing data file • Contents loaded on request • Dynamic • Content generated by another program • Script (programming language) • Creates/retrieves the requested information • Static content is usually delivered much faster than dynamic content • 2 to 100 times • Especially if the latter involves data pulled from a database

  13. How does it find it? Path translation

  14. Path translation • Web servers map the path component of a Uniform Resource Locator (URL) into: • Local file system resource • Static requests • Internal or external program name • Dynamic requests • For a static request the URL path specified by the client is relative to the Web server's root directory • This is not the same as the computer's root directory

  15. Path translation • Consider the following URL requested by a client Web Browser: • http://www.example.com/path/file.html • Client's Web browser translates it: • Where • http:// • Use the HTTP protocol • www.example.com • The Web server to connect to • This is translated to an IP address by DNS • Sent to 93.184.216.119:80 • /path/file.html • The resource to access • Generates the following HTTP 1.1 request (two lines) sent to the IP address: • GET /path/file.html HTTP/1.1 • Host: www.example.com
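
The browser-side translation on this slide can be reproduced with `urllib.parse.urlsplit`. This is a sketch that assumes a plain `http://` URL; the function name `build_get_request` is hypothetical, and the DNS lookup and the actual network send are left out.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Split an http:// URL into (host, port, request text)."""
    parts = urlsplit(url)
    port = parts.port or 80          # HTTP defaults to port 80
    target = parts.path or "/"       # an empty path means the site root
    if parts.query:
        target += "?" + parts.query
    # Each header line ends with CRLF; a blank line ends the header section.
    request = f"GET {target} HTTP/1.1\r\nHost: {parts.hostname}\r\n\r\n"
    return parts.hostname, port, request


host, port, request = build_get_request("http://www.example.com/path/file.html")
print(host, port)
print(request)
```

The host name goes to DNS and decides where the connection is opened, while only the path (plus the `Host` header) travels inside the request itself.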

  16. Path translation (cont.) • Web server host (www.example.com) • Sees the request is for port 80 • Sends request to the Web Server software • Appends the given path to the path of the server's Web root directory • On Unix machines typically /var/www/htdocs or /var/www • Result would then be the local file system resource: • /var/www/htdocs/path/file.html • Web server: • Retrieves the file, if it exists • Processes it by the Web server's rules • Sends a response to the client's web browser • Response: • Describes the content of the file • Contains the file requested or a response
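
The server-side half of path translation might look like the sketch below. It uses the `/var/www/htdocs` root from the slide; the function name `translate_path` is hypothetical, and the `..` handling is an added safeguard that the slides do not discuss, included because appending a raw client path to the root would otherwise let requests escape it.

```python
import posixpath
from urllib.parse import unquote

WEB_ROOT = "/var/www/htdocs"  # typical Unix web root, per the slide


def translate_path(request_target, web_root=WEB_ROOT):
    """Map the path of a request onto a file under the web root."""
    # Drop any query string and decode %XX escapes.
    url_path = unquote(request_target.split("?", 1)[0])
    # Normalize relative to "/" so that ".." can never climb above the root.
    normalized = posixpath.normpath("/" + url_path.lstrip("/"))
    return posixpath.join(web_root, normalized.lstrip("/"))


print(translate_path("/path/file.html"))
print(translate_path("/../../etc/passwd"))
```

The first call resolves to `/var/www/htdocs/path/file.html`, matching the slide; the second stays inside the web root rather than reaching the computer's real `/etc/passwd`.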

  17. Performance

  18. Performance • Web servers must: • Serve requests quickly! • From more than one TCP/IP connection at a time • Key performance parameters are: • Number of requests per second • Depends on the type of request, etc. • Latency (response time) in milliseconds • For each new connection or request • Throughput in bytes per second • Depends on • File size • Content cached or not • Available network bandwidth • etc. • Measured under: • Varying load of clients • Varying requests per client
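
The three parameters can be computed from per-request measurements as in this sketch. The function name `summarize` and the sample numbers are invented for illustration; a real benchmark would gather the inputs with a load-testing tool.

```python
def summarize(latencies_ms, bytes_sent, elapsed_seconds):
    """Compute the three key performance parameters for one test run.

    latencies_ms:    response time of each request, in milliseconds
    bytes_sent:      response size of each request, in bytes
    elapsed_seconds: wall-clock duration of the whole run
    """
    request_count = len(latencies_ms)
    return {
        "requests_per_second": request_count / elapsed_seconds,
        "mean_latency_ms": sum(latencies_ms) / request_count,
        "throughput_bytes_per_second": sum(bytes_sent) / elapsed_seconds,
    }


# Made-up run: 3 requests completed in 2 seconds.
print(summarize([10, 20, 30], [1000, 3000, 2000], 2.0))
```

For this made-up run the result is 1.5 requests per second, a mean latency of 20 ms, and a throughput of 3000 bytes per second.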

  19. Performance • Performance parameters may vary noticeably depending on the number of active connections • Concurrency level is therefore a fourth parameter • The level supported by a web server under a specific configuration • The specific server model used to implement a web server program can bias the performance and scalability level that can be reached under heavy load or when using high end hardware • Many CPUs, disks, etc.

  20. Resume after Spring Break

  21. Load limits

  22. Load limits • Web server (program) has defined load limits • Can handle only a limited number of concurrent client connections per IP address (and IP ports) • Usually between 2 and 60,000 • Default between 500 and 1,000 • Can serve only a certain maximum number of requests per second depending on: • Settings • HTTP request type • Content origin • Static • Dynamic • Served content cached or not • Hardware and software limits of the native OS • A web server near or over its limits • Becomes overloaded • Unresponsive
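
A concurrent-connection limit can be modeled with a simple counter. This is a single-threaded sketch under assumed names (`ConnectionLimiter`, `try_accept`, `release`); a real server would need locking or an event loop, and would typically answer rejected clients with a 503 or refuse the connection.

```python
class ConnectionLimiter:
    """Admit connections until a configured maximum is reached."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.active = 0

    def try_accept(self):
        """Return True if a new connection is admitted, False if over the limit."""
        if self.active >= self.max_connections:
            return False  # the caller would reject, e.g. with HTTP 503
        self.active += 1
        return True

    def release(self):
        """Call when a connection closes, freeing its slot."""
        self.active = max(0, self.active - 1)


limiter = ConnectionLimiter(max_connections=2)
print([limiter.try_accept() for _ in range(3)])
```

With a limit of two, the third simultaneous client is turned away until one of the first two connections is released.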

  23. Overload causes

  24. Overload causes • A sample daily graph of a web server's load, indicating a spike in the load early in the day.

  25. Overload causes • At any time web servers can be overloaded because of: • Too much legitimate web traffic • Thousands or even millions of clients hitting the web site in a short interval of time • DDoS • Distributed Denial of Service attacks • Coordinated • Computer worms • Abnormal traffic because of millions of infected computers • Not coordinated • XSS viruses • Millions of infected browsers and/or web servers • Internet web robots • Traffic not filtered / limited on large web sites with very few resources (bandwidth, etc.) • Internet (network) slowdowns • Client requests are served more slowly and the number of connections increases so much that server limits are reached • Web servers (computers) partial unavailability • Required / urgent maintenance or upgrade • HW or SW failures • Back-end (e.g. DB) failures, etc. • Remaining web servers get too much traffic and they become overloaded

  26. Overload symptoms

  27. Overload symptoms • Symptoms of an overloaded web server include: • Requests are served with (possibly long) delays • from 1 second to a few hundred seconds • 500, 502, 503, 504 HTTP errors returned to clients • Sometimes also unrelated 404 error or even 408 error may be returned • TCP connections are refused or reset (interrupted) before any content is sent to clients • In very rare cases, only partial contents are sent • This behavior may well be considered a bug • Even if it stems from unavailable system resources

  28. Anti-overload techniques

  29. Anti-overload techniques • To partially overcome load limits and to prevent overload use techniques like: • Managing network traffic by using: • Firewalls • Block unwanted traffic • Bad IP sources • Bad patterns • HTTP traffic managers • Drop, redirect or rewrite requests having bad HTTP patterns • Bandwidth management and traffic shaping • Smooth down peaks in network usage • Deploying web cache techniques • Use different domains to serve different content (static and dynamic) by separate Web servers, e.g.: • http://images.example.com • Serves static images • http://www.example.com • Serves dynamic data requests
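
One common way to smooth peaks is a token bucket. The slides do not name a specific algorithm, so this choice is an assumption, as are the class name `TokenBucket` and its parameters. Time is passed in explicitly so the behavior is easy to follow and test.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, then at most `rate` requests per second."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity      # start with a full bucket
        self.last = now             # time of the previous refill

    def allow(self, now, cost=1):
        """Return True if a request arriving at time `now` may proceed."""
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=1.0, capacity=2)
# Three requests at t=0, then one at t=1.
print([bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)])
```

The burst of two is served immediately, the third request at the same instant is held back, and one more is admitted a second later once a token has been refilled.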

  30. Anti-overload techniques • Techniques continued: • Use different domain names and/or computers to separate big files from small/medium files • Be able to fully cache small and medium sized files • Efficiently serve big or huge (over 10 - 1000 MB) files by using different settings • Using many Web servers (programs) per computer • Each bound to its own network card and IP address • Use many Web servers that are grouped together • Act or are seen as one big Web server • See Load balancer
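
The idea of many grouped servers acting as one can be illustrated with a round-robin load balancer. Round-robin is only one possible policy, chosen here for brevity; the class name `RoundRobinBalancer` and the backend addresses are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backend servers in a fixed rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = itertools.cycle(backends)

    def next_backend(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)


# Made-up backend addresses behind one public name.
balancer = RoundRobinBalancer(["10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"])
print([balancer.next_backend() for _ in range(4)])
```

Clients see a single web server, while consecutive requests are spread evenly across the three machines and wrap around after the last one.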

  31. Anti-overload techniques • Techniques continued: • Add more hardware resources • RAM, disks, NICs, etc. • Tune OS parameters • Hardware capabilities • Usage • Use more efficient computer programs for web servers, etc. • nginx • Use workarounds • Especially if dynamic content is involved

  32. Historical notes

  33. Historical notes • World's first web server • 1989 - Tim Berners-Lee proposed to CERN a new project • Ease the exchange of information between scientists • Using a hypertext system • 1990 - Berners-Lee wrote two programs: • Browser • WorldWideWeb • Web server • Ran on NeXTSTEP

  34. Historical notes • First web server in USA • Installed December 12, 1991 • Bebo White at SLAC • After returning from a sabbatical at CERN • Between 1991 and 1994: • Simplicity and effectiveness of early technologies used to surf and exchange data through the World Wide Web helped to: • Port them to many different operating systems • Spread their use among lots of different social groups of people • First in scientific organizations • Then in universities • Finally in industry

  35. Historical notes • 1994: Tim Berners-Lee constituted the World Wide Web Consortium (W3C) • Regulate the further development of the many technologies in a standardization process: • HTTP • HTML • etc. • Following years saw an exponential growth of the number of web sites and servers

  36. Software

  37. Software • There are thousands of different web server programs available • Many specialized for very specific purposes • About 50 mainstream • The fact that a web server is not very popular does not necessarily mean it has: • Lots of bugs • Poor performance • See Category:Web server software for a longer list of HTTP server programs.

  38. Statistics

  39. Statistics • Most popular web servers, used for public web sites, are tracked by • Netcraft.com • Details given by • Netcraft Web Server Reports • According to this site: • Apache has been the most popular web server on the Internet since April of 1996 • July 2010 Netcraft Web Server Survey: • 54.90% web sites on the Internet use Apache • 25.87% web sites use IIS

  40. Web Servers

  41. Who’s running the show? What are they? The big two: Popular Web Servers

  42. Apache http://en.wikipedia.org/wiki/Apache_web_server We’re number one!

  43. Apache • Apache HTTP Server, referred to simply as Apache: • A web server • Notable for playing a key role in the initial growth of the World Wide Web • Apache • First viable alternative to Netscape Communications Corporation web server • Currently known as Sun Java System Web Server • Evolved to rival other Unix-based web servers • Functionality and performance • Since April 1996 Apache has been the most popular HTTP server on the World Wide Web • September 2007: Apache served 50% of all websites

  44. Apache • Project's name was chosen for two reasons: • Respect for the Native American Apache tribe • Well-known for their endurance and their skills in warfare • Project's root is a set of patches to the codebase of NCSA HTTPd 1.3 • Making it "a patchy" server • Apache is developed and maintained by • Open community of developers • Under the auspices of the Apache Software Foundation • Available for a wide variety of OSs • Microsoft Windows • Novell NetWare • Unix-like operating systems • e.g. Linux and Mac OS X • z/OS (IBM mainframe) • and more… • Released under the Apache License • Apache is free software / open source software.

  45. Apache History

  46. History • First version of the Apache web server created by Robert McCool • Heavily involved with the National Center for Supercomputing Applications web server • Known simply as NCSA HTTPd • When Rob left NCSA in mid-1994 • Development of httpd stalled • Left a variety of patches for improvements circulating through e-mails • Rob McCool was not alone in his efforts • Several other developers helped form the original "Apache Group": • Brian Behlendorf, Roy T. Fielding, Rob Hartill, David Robinson, Cliff Skolnick, Randy Terbush, Robert S. Thau, Andrew Wilson, Eric Hagberg, Frank Peters, and Nicolas Pioch

  47. History • Version 2 of the Apache server was a substantial re-write of much of the Apache 1.x code • Strong focus on further modularization and the development of a portability layer, the Apache Portable Runtime • Apache 2.x core - several major enhancements over Apache 1.x: • UNIX threading • Better support for non-Unix platforms • New Apache API • IPv6 support • First alpha release of Apache 2 in March 2000 • First general availability release on April 6, 2002 • Version 2.2 introduced a new authorization API that allows for more flexibility • Also features improved cache modules and proxy modules

  48. Features

  49. Features • Apache supports a variety of features • Many implemented as compiled modules • Extend the core functionality • Range from server-side programming language support to authentication schemes: • Common language interfaces support • mod_perl, mod_python, Tcl, and PHP • Popular authentication modules include • mod_access, mod_auth, and mod_digest

  50. Features • Other features include: • SSL and TLS support • mod_ssl • A proxy module • A useful URL rewriter • AKA a rewrite engine, implemented under mod_rewrite • Custom log files • mod_log_config • Filtering support • mod_include • mod_ext_filter • Apache logs can be analyzed via web browsers with free scripts • AWStats/W3Perl • Visitors
