Ch 22: Web Hosting and Internet Servers LaShall Bates
Web Hosting • Providing • Raw web pages (HTML) • FTP • SSL • Streaming audio or video
Linux vs. Windows hosting??? • Linux provides • Maintainability and performance • Multiuser, interactive OS • Multitasking administration • Can be tuned to be faster
The Basics • A web server is a system configured to answer HTTP requests. Browsers contact these remote web servers and make requests on behalf of users. • To convert a generic Linux system, install a daemon that listens for connections on TCP port 80, accepts requests for docs and transmits them to the requesting user.
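As a rough illustration of the idea above, the following sketch uses Python's standard http.server module to play the role of the daemon and http.client to play the browser. The handler name DocHandler and the helper fetch_once are made up for this example, and the server binds to an OS-chosen port rather than TCP port 80 so that it can run without root privileges. It is a toy, not a replacement for a real server such as Apache.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DocHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with one small HTML document."""

    def do_GET(self):
        body = b"<html><body>Hello from a toy web server</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once():
    """Start the server, make one request as a client, and shut down."""
    # Port 0 asks the OS for any free port; a real web server listens on 80.
    server = HTTPServer(("127.0.0.1", 0), DocHandler)
    port = server.server_address[1]
    worker = threading.Thread(target=server.handle_request)
    worker.start()

    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/index.html")
    reply = conn.getresponse()
    status, body = reply.status, reply.read()
    conn.close()

    worker.join()
    server.server_close()
    return status, body
```

Calling fetch_once() returns the status code and the document bytes that the "browser" side received.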
URL • Uniform Resource Locator, the global address of docs and other resources on the WWW. • 5 parts of the address 1. What protocol to use 2. IP address or the domain name where the resource is located. 3. TCP/IP port (opt) 4. Directory (opt) 5. Filename (case sensitive)
URL ex. • An executable file that should be fetched using the FTP protocol • ftp://uark.edu/stuff.exe • A Web page that should be fetched using the HTTP protocol • http://uark.edu/classes/index.html
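The five parts can be pulled out of the example URLs with Python's standard urllib.parse module. This is only a sketch: the helper name url_parts and the dictionary keys are chosen here for illustration, and the split of the path into directory and filename is a simple convention, not something the URL standard defines.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into the five parts listed on the slide."""
    parts = urlsplit(url)
    # Everything before the last "/" is treated as the directory.
    directory, _, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,              # None when the URL omits it
        "directory": directory or "/",   # top level when no directory is given
        "filename": filename,
    }
```

For http://uark.edu/classes/index.html this yields protocol "http", host "uark.edu", no port, directory "/classes" and filename "index.html".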
How HTTP works • HyperText Transfer Protocol, the underlying protocol used by the WWW. • HTTP defines how messages are formatted and transmitted, and what actions Web servers and browsers should take in response to various commands. • For example, when you enter a URL in your browser, this actually sends an HTTP command to the Web server directing it to fetch and transmit the requested Web page.
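To make the message format concrete, here is a small sketch of what a browser might send for a URL and how the first line of a server's reply can be read. The function names are invented for this example, the request uses HTTP/1.1 with only two headers, and real browsers send considerably more.

```python
def build_get_request(host, path):
    """Format a minimal HTTP GET request as the bytes sent on the wire."""
    lines = [
        f"GET {path} HTTP/1.1",   # command, resource, protocol version
        f"Host: {host}",          # which site on the server is wanted
        "Connection: close",
        "",                       # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(raw_response):
    """Return (version, status code, reason) from a raw HTTP response."""
    status_line = raw_response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason
```

For example, build_get_request("uark.edu", "/classes/index.html") produces a request whose first line is "GET /classes/index.html HTTP/1.1".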
HTTP: Stateless Protocol • HTTP is called a stateless protocol because each command is executed independently, without any knowledge of the commands that came before it. This is the main reason that it is difficult to implement Web sites that react intelligently to user input.
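One common workaround for statelessness is a cookie: the server hands the browser a small token, and the browser sends it back with each later request. The sketch below uses Python's standard http.cookies module. The handle function, the cookie name "session" and the in-memory visits table are all assumptions made for this illustration, not part of HTTP itself.

```python
from http.cookies import SimpleCookie

visits = {}  # server-side memory, keyed by a hypothetical session id


def handle(cookie_header):
    """Count visits per session; cookie_header is the request's Cookie value."""
    cookie = SimpleCookie(cookie_header or "")
    if "session" in cookie:
        session = cookie["session"].value      # returning visitor
    else:
        session = f"s{len(visits) + 1}"        # first request: new session id
    visits[session] = visits.get(session, 0) + 1

    reply = SimpleCookie()
    reply["session"] = session
    # The header line the server would add to its response.
    return visits[session], reply.output(header="Set-Cookie:")
```

Without the cookie, every call looks like a first visit; with it, the server can tell that two requests came from the same browser.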
CGI • The shortcoming of HTTP is being addressed by a number of technologies, including CGI, ActiveX, Java, JavaScript and cookies. • CGI is a specification that allows the designer to provide dynamic, changing web content. It allows the HTTP server to exchange info with other programs.
CGI: generating on the fly • CGI scripts can be written in C, C++, Perl, Python or any other language that can perform real-time I/O
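A CGI program receives the request through environment variables (for example QUERY_STRING) and writes a header block, a blank line, and then the page to standard output. The sketch below follows that convention; the function name render_page and the "name" query parameter are made up for the example. User input is escaped before it is echoed back, since unescaped input is one of the ways CGI scripts cause security trouble.

```python
import html
import os
from urllib.parse import parse_qs


def render_page(environ):
    """Build the full CGI output (headers, blank line, body) for one request."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    # Escape user input so it cannot inject markup into the page.
    body = f"<html><body>Hello, {html.escape(name)}!</body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When run by the web server, the request arrives via the environment.
    print(render_page(os.environ), end="")
```

Placed in a server's cgi-bin directory and requested with ?name=LaShall, a script like this would generate a greeting page on the fly.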
CGI: security • CGI poses a security problem for Admins. It can potentially allow anyone to run a program on your server and/or gain access to files.
Load balancing • Hits/page views a server can handle. • Dependent on • Operating system • System tuning • Hardware architecture • Construction of site • More important: Scalability • Load-balancing products spread the work across servers according to a variety of admin-configured parameters, such as individual server response time and availability
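As a simplified sketch of how such a product might choose a server, the function below picks the available server with the lowest measured response time. The field names (name, available, response_ms) and the selection rule are assumptions for illustration; commercial load balancers use more elaborate policies.

```python
def pick_server(servers):
    """Return the name of the available server with the lowest response time."""
    candidates = [s for s in servers if s["available"]]
    if not candidates:
        raise RuntimeError("no server is available")
    return min(candidates, key=lambda s: s["response_ms"])["name"]


# Hypothetical pool: web2 is fastest but down, so web3 should be chosen.
pool = [
    {"name": "web1", "available": True, "response_ms": 120},
    {"name": "web2", "available": False, "response_ms": 15},
    {"name": "web3", "available": True, "response_ms": 40},
]
```

Here pick_server(pool) skips the unavailable web2 and selects web3.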
Choosing a Server • Robustness • Performance • Timeliness of updates and bug fixes • Availability of source code • Cost • Access control and security • Ability to act as a proxy • Ability to handle encryption
Apache • The August 2002 Netcraft Web Server Survey found that 63% of the web sites on the Internet are using Apache, thus making it more widely used than all other web servers combined.
Apache • The name 'Apache' was chosen from respect for the Native American Indian tribe of Apache, well-known for their superior skills in warfare strategy and their inexhaustible endurance. • Secondarily, and more popularly (though incorrectly) accepted, it's considered a cute name which stuck. Apache is "A PAtCHy server". It was based on some existing code and a series of "patch files".
Installing Apache • If you want to compile source code yourself • ./configure --prefix=/etc/httpd/ • Include or remove features • --enable-module=, --disable-module= • For a complete list of modules http://httpd.apache.org/docs/mod/ • Run make, then make install to compile and install appropriate files
Configuring Apache • After installation, configure setup • conf directory (/etc/httpd/conf) • 3 files to configure • httpd.conf • srm.conf • access.conf
Apache Conf files: httpd • httpd.conf • How Apache daemon interacts with system. • Set TCP port • location of log files • various network and performance params • Configure virtual connections
Apache Conf files: srm • srm.conf • Controls resources server needs • DocumentRoot def: defines root of directory tree in which servable docs are located. • Also handling of “special” URLs (ex. http://comp.uark.edu/~crane)
Apache Conf files: access • access.conf • Security concerns • Directives that control access on a per-file or per-dir basis; prevents access to sensitive files via httpd • Use option ExecCGI in srm.conf to enable CGI restrictions • Allows two access controls, one for the entire doc dir and one for cgi-bin.
Running Apache • Start by hand • /usr/sbin/httpd -f /etc/httpd/conf/httpd.conf • or from rc scripts • To run at boot time, make a link in the rc directory that points to the /etc/init.d/httpd file • Start late in the booting sequence, after daemons that manage functions such as routing and time synchronization have started
High performance Hosting • TUX is an architecture for kernel-accelerated network services. • Runs in conjunction with Apache • Serves up static pages without leaving kernel space. Minimizes context switches with a zero-copy architecture. Not recommended for beginners
Virtual Interfaces • Allowing the hosting of more than one web site by associating more than one IP address with a system. • Allows daemon to identify request’s destination IP address. • Single Linux machine responds on the network to more IP addresses than it has physical network interfaces. Each of the resulting “virtual” network interfaces can be associated with a corresponding domain name that users on the Internet might want to connect to.
Needs of Virtual Interfaces • Create the virtual interface at TCP/IP level • ifconfig eth0:0 220.127.116.11 netmask 255.255.255.192 up • To make permanent modify startup • Tell Apache server about virtual interfaces created. • Add VirtualHost clause to httpd.conf file • One for each virtual interface
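An IPv4 netmask has exactly four octets, so it can be worth sanity-checking the address and mask before handing them to ifconfig. The sketch below uses Python's standard ipaddress module and assumes the intended mask is 255.255.255.192 (a /26); the helper name describe_interface is invented for this example.

```python
import ipaddress


def describe_interface(address, netmask):
    """Return (network in CIDR form, number of addresses) for an IPv4 interface.

    Raises ValueError if the address or netmask is malformed.
    """
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    return str(iface.network), iface.network.num_addresses
```

With the slide's address, describe_interface("220.127.116.11", "255.255.255.192") reports the network 220.127.116.0/26 with 64 addresses, while a mask with five octets is rejected with a ValueError.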
Caching and Proxy Servers • Squid Internet Object Cache • Caching and proxy server that runs under Linux and supports several protocols • How it works • Client browser contacts squid to request object • Squid makes request on client’s behalf (or finds cached copy) and returns result to client • Proxy servers can enhance security or filter content
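The request flow above can be sketched in a few lines. This is not how Squid is implemented; it is a toy model in which the class name CachingProxy, the fetch callable and the origin_requests counter are all assumptions for illustration, and cached objects never expire.

```python
class CachingProxy:
    """Toy caching proxy: asks the origin only when no cached copy exists."""

    def __init__(self, fetch):
        self._fetch = fetch        # callable that retrieves a URL from the origin
        self._cache = {}           # URL -> previously fetched object
        self.origin_requests = 0   # how many times the origin was contacted

    def get(self, url):
        if url not in self._cache:
            # Cache miss: make the request on the client's behalf.
            self.origin_requests += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]
```

If two clients request the same URL through one CachingProxy, the origin server is contacted only once; the second client is served from the cache.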
Anonymous FTP server • Lets users without accounts download files you have made available. • Create user • Create the user ftp in /etc/passwd. • Assign it to a miscellaneous group, and let ~ftp be the root of the tree you wish anonymous users to see. • Use an invalid password and a non-login shell for better security: ftp:*:400:400:Anonymous FTP:/home/ftp:/bin/true
Anonymous FTP: create user dir • Home directory: • Owned by root, same group as ftp, so group permissions will apply to anonymous users • Set the permissions for ~ftp to 555 (read, no write, execute). • Create subdirs: bin, etc, lib, and pub • Copy ls program to ~ftp/bin and its shared libraries to ~ftp/lib • Copy /etc/passwd and group files to ~ftp/etc • Should only contain users root, daemon, and ftp • Replace passwords with *’s
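To see what the passwd entry and the 555 mode actually mean, the sketch below splits a passwd line into its seven colon-separated fields and renders a numeric directory mode the way ls -l would show it. The helper names are made up for this example; only Python's standard stat module is used.

```python
import stat


def parse_passwd_entry(line):
    """Split one /etc/passwd line into its seven named fields."""
    name, password, uid, gid, comment, home, shell = line.strip().split(":")
    return {
        "name": name,
        "password": password,   # "*" can never match a typed password
        "uid": int(uid),
        "gid": int(gid),
        "comment": comment,
        "home": home,
        "shell": shell,
    }


def directory_mode_string(mode):
    """Render a numeric permission mode for a directory, as ls -l would."""
    return stat.filemode(stat.S_IFDIR | mode)
```

For the ftp entry, the password field is "*" and the shell is /bin/true, and directory_mode_string(0o555) gives dr-xr-xr-x: everyone may read and enter the directory, and nobody may write to it.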
Anonymous FTP: Use • Put files you want to make available in ~ftp/pub
Questions • References: • http://Apache.org • http://webopedia.com • http://squid.nlanr.net/