
A Director of distributed array of web servers

A Director of distributed array of web servers. TECHNION, Department of Computer Science, The Computer Communication Lab (236340), Summer 2002. Submitted by: David Schwartz, Idan Zak, Yoav Helfman.



  1. A Director of distributed array of web servers. TECHNION, Department of Computer Science, The Computer Communication Lab (236340), Summer 2002. Submitted by: David Schwartz, Idan Zak, Yoav Helfman. Director 1.0

  2. 1. Introduction, 1.1. General
     • Our goal was to develop a Layer-5 director that switches itself into a Layer-4 director after making the "request routing" decision based on the URL. It then assigns a new connection to the requesting client using NAPT (Network Address & Port Translation).
     • The director runs on the Linux operating system.

  3. 2. General Layout
     [Figure: network diagram with the labels Client (CIP), Internet (WAN/LAN), Load Balancer/Director Linux Box (VIP), Real Server1 (RIP1), Real Server2 (RIP2), Real Server3 (RIP3)]
     Legend: CIP: Client IP Address & Port; VIP: Virtual IP Address & Port; RIP: Real Server IP Address & Port.
     • The Director is connected to two networks: the web server farm network and the network representing the outside world (the Internet).

  4. General Layout cont.
     • The Director reads HTTP requests (on port 80) from the global network adapter (at this stage the Director and NAPT work together), processes them (using a hash function to find the server that holds the URL), and sends them to the selected web server through the local network adapter.
     • From this moment on, NAPT "takes the initiative" and performs the translation between the actual physical server and the client; the Layer 5 level has finished its part at this stage.

  5. 3. Modules. The project consists of four main modules:
     • Layer 5 URL Director
     • Layer 3/4 NAPT
     • NAPT timeout manager
     • Debug

  6. Modules cont.
     • Layer 5 URL Director
         • Accept: Examines each "GET" request and makes a new routing decision based on a hash function of the URL.
         • Connect: Initiates a new connection to the selected server.
     • Layer 3/4 NAPT
         • Listener: Receives new packets and classifies them.
         • Connection Establisher: Manages the NAPT table entries.
         • NAPT: Redirects the packet flow to the real web server and back to the client.

  7. Modules cont.
     • NAPT timeout manager
         • Timeout: Terminates inactive client-server connections and removes the NAPT entries of finished connections.
     • Debug
         • Print: Provides a real-time view of the Director tables.

  8. Modules cont.
     [Figure: packet flow diagram with the labels Incoming Packets, Packets Buffer, Header Content Extraction, NAPT Entries, Layer 5 URL Director, Packet Routing (Load Balancing), Forward Packet]

  9. Raw Sockets
     • We used raw sockets to intercept the raw data directly from layer 3.
     • Raw sockets let the user receive packets directly at user level, without passing through all the network layers on the way.
     • The raw socket sends us a copy of each packet, and the real packet continues on its way to the TCP stack.
     • Raw sockets intercept packets before they are processed by TCP/IP, so we can receive and send data even if TCP/IP is blocked.
     • In terms of the data received, using raw sockets is identical to intercepting the packets at kernel level.

  10. Algorithms used
     • Layer 5 Director - Accept
         • Initializes the layer 4 threads and tables.
         • Calls accept() and waits for new connections.
         • When a new connection arrives, creates a new thread which handles the connection to the client.
         • Loops back to accept().

  11. Algorithms cont.
     • Layer 5 Director - Connect
         • Reads the request from the client.
         • Calculates the length of the URL and decides which server to connect to.
         • Calls connect() with the address of the server containing the requested page.
         • Builds a semi-complete NAPT entry and inserts it into the semi-complete table.
         • The thread finishes and exits.
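The Connect step above picks a server from the length of the requested URL. A minimal sketch of that decision follows. The slides do not say how the length is mapped to a server, so the modulo over the number of servers is an assumption, and `extract_url` and `pick_server` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the URL in a request line such as
 * "GET /index.html HTTP/1.0". Returns a pointer to the URL and stores
 * its length in *len, or returns NULL if this is not a GET request. */
static const char *extract_url(const char *request, size_t *len)
{
    if (strncmp(request, "GET ", 4) != 0)
        return NULL;
    const char *url = request + 4;
    *len = strcspn(url, " \r\n");   /* URL ends at the next space or line end */
    return *len > 0 ? url : NULL;
}

/* Assumption: the "hash" is the URL length modulo the number of servers.
 * Returns a server index in [0, num_servers), or -1 on a bad request. */
static int pick_server(const char *request, int num_servers)
{
    size_t len;
    if (num_servers <= 0 || extract_url(request, &len) == NULL)
        return -1;
    return (int)(len % (size_t)num_servers);
}
```

For example, `pick_server("GET /index.html HTTP/1.0\r\n", 3)` sees an 11-character URL and selects server index 2. Because the mapping depends only on the URL, the same page is always requested from the same server.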

  12. Algorithms cont.
     • Layer 3/4 Director - Listener
         • Creates a raw socket and calls recv() on it.
         • After intercepting a packet we categorize it. Only TCP packets are inspected; the protocol field in the IP header tells us which packets are TCP.
             • SYN packets are discarded.
             • SYN-ACK packets are inserted into the SYN-ACK queue.
             • All other packets are inserted into the Data queue.
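The Listener's classification can be sketched as a pure function over the bytes of an IPv4 packet. This is an illustration only: the enum and function names are hypothetical, and the offsets used (protocol at byte 9 of the IP header, flags at byte 13 of the TCP header) are the standard IPv4 and TCP header layouts, not something the slides spell out.

```c
#include <stddef.h>
#include <stdint.h>

enum pkt_class { PKT_IGNORE, PKT_SYN, PKT_SYN_ACK, PKT_DATA };

#define SKETCH_PROTO_TCP 6      /* IP protocol number for TCP */
#define SKETCH_FLAG_SYN  0x02
#define SKETCH_FLAG_ACK  0x10

/* Classify a packet that starts at the IPv4 header (link header removed). */
static enum pkt_class classify_packet(const uint8_t *ip, size_t len)
{
    if (len < 20 || (ip[0] >> 4) != 4)
        return PKT_IGNORE;                      /* not IPv4 */
    size_t ihl = (size_t)(ip[0] & 0x0F) * 4;    /* IP header length in bytes */
    if (ihl < 20 || len < ihl + 20)
        return PKT_IGNORE;                      /* truncated headers */
    if (ip[9] != SKETCH_PROTO_TCP)
        return PKT_IGNORE;                      /* only TCP is inspected */

    uint8_t flags = ip[ihl + 13];               /* TCP flags byte */
    if (flags & SKETCH_FLAG_SYN)
        return (flags & SKETCH_FLAG_ACK) ? PKT_SYN_ACK : PKT_SYN;
    return PKT_DATA;                            /* everything else */
}
```

A caller would drop `PKT_SYN` and `PKT_IGNORE` packets, push `PKT_SYN_ACK` packets onto the SYN-ACK queue, and push `PKT_DATA` packets onto the Data queue.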

  13. Algorithms cont.
     • Layer 3/4 Director - Connection Establisher
         • To extract the sequence numbers, we examine the SYN-ACK packets stored in the SYN-ACK queue.
         • Removes a packet from the queue and searches for the semi-complete entry which has the same port and IP.
         • Updates the sequence numbers according to the direction of the packet (client-server or server-client).
         • Inserts the sequence number into the ACK-3 queue (explained later).
         • If both directions are updated, the entry is removed from the semi-complete table and entered into the NAPT table.
         • Loops back to remove the next SYN-ACK packet.
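The "both directions updated" rule above can be sketched as a small state update on a semi-complete entry. The struct layout, field names, and `record_syn_ack` are hypothetical; the slides only say that each SYN-ACK fills in the sequence number for its direction and that the entry is promoted once both are known.

```c
#include <stdbool.h>
#include <stdint.h>

/* Which of the two handshakes the SYN-ACK belongs to (assumed naming). */
enum side { CLIENT_SIDE, SERVER_SIDE };

struct semi_entry {
    uint32_t client_side_seq;   /* from the client-director handshake */
    uint32_t server_side_seq;   /* from the director-server handshake */
    bool     have_client_side;
    bool     have_server_side;
};

/* Record the sequence number carried by a SYN-ACK. Returns true once both
 * directions are known, i.e. when the caller should move the entry from
 * the semi-complete table to the NAPT table. */
static bool record_syn_ack(struct semi_entry *e, enum side s, uint32_t seq)
{
    if (s == CLIENT_SIDE) {
        e->client_side_seq = seq;
        e->have_client_side = true;
    } else {
        e->server_side_seq = seq;
        e->have_server_side = true;
    }
    return e->have_client_side && e->have_server_side;
}
```

The order in which the two SYN-ACKs arrive does not matter; whichever call completes the pair returns true.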

  14. Algorithms cont.
     • Layer 3/4 Director - NAPT
         • Removes a packet from the Data queue.
         • Checks whether the packet is the ACK packet from one of the handshakes, by comparing its sequence number to the sequence numbers stored in the ACK-3 queue.
         • Searches for an entry in the NAPT table which has the same port and IP.
         • If no entry is found, the packet is discarded.
         • If an entry is found, we fix the source and destination port and IP, the sequence numbers, and the checksums.
         • We update the time field in the NAPT entry.
         • The packet is sent onwards (to the server or to the client).
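Two of the fix-ups named above, the sequence numbers and the checksums, can be illustrated in isolation. This is a sketch under assumptions: `translate_seq` shifts a sequence number from one connection's numbering to the other's using the two initial sequence numbers, which is one common way to splice two TCP connections and is not confirmed by the slides. `inet_checksum` is the standard Internet checksum (one's-complement sum of 16-bit words); a real TCP checksum would also have to cover the TCP pseudo-header, which is omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Map a sequence number from one connection's numbering to the other's.
 * Unsigned arithmetic wraps modulo 2^32, matching TCP sequence space. */
static uint32_t translate_seq(uint32_t seq, uint32_t from_isn, uint32_t to_isn)
{
    return seq - from_isn + to_isn;
}

/* Internet checksum over a byte buffer, treating the data as big-endian
 * 16-bit words. The checksum field inside the buffer must be zeroed first. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)                        /* odd trailing byte, padded with 0 */
        sum += (uint32_t)(data[i] << 8);
    while (sum >> 16)                   /* fold the carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

After rewriting addresses, ports, and sequence numbers in a packet, the IP header checksum and the TCP checksum both have to be recomputed, since each covers fields that changed.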

  15. Algorithms cont.
     • If an entry has received RST (from either direction), the entry is removed.
     • NAPT timeout manager - Timeout
         • Every 10 seconds the thread wakes up and goes over all the entries in the NAPT and semi-complete tables.
         • If an entry has not been used in over 24 hours, it is removed from the tables.
         • If an entry has received both FINs (one from each direction) and at least 60 seconds have passed, the entry is removed.
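The removal rules above can be collected into a single predicate that the timeout thread evaluates for each entry on every 10-second pass. The struct and function names are hypothetical. The slides do not say what the 60 seconds are measured from, so this sketch assumes they are measured from the entry's last-use time stamp.

```c
#include <stdbool.h>
#include <time.h>

#define IDLE_LIMIT_SEC (24 * 60 * 60)   /* unused for over 24 hours */
#define FIN_LINGER_SEC 60               /* wait after both FINs */

struct entry_state {
    time_t last_used;        /* time field updated on every forwarded packet */
    bool   fin_from_client;
    bool   fin_from_server;
    bool   rst_seen;
};

/* Decide whether an entry should be removed at time `now`. */
static bool should_remove(const struct entry_state *e, time_t now)
{
    double idle = difftime(now, e->last_used);

    if (e->rst_seen)
        return true;                                /* RST from either side */
    if (idle > IDLE_LIMIT_SEC)
        return true;                                /* idle for over 24 hours */
    if (e->fin_from_client && e->fin_from_server && idle >= FIN_LINGER_SEC)
        return true;                                /* closed, linger expired */
    return false;
}
```

Keeping the policy in one function means the same check can serve both the NAPT table and the semi-complete table.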

  16. Algorithms cont.
     • Debug - Print
         • At any time we can examine all the tables and queues by typing a number and pressing Enter; a dedicated thread waits for this input and prints their contents.

  17. Tables and Queues
     • NAPT and Semi-Complete tables. Each entry consists of:
         • Source and destination IP
         • Source and destination port
         • Client-director sequence and ack numbers
         • Director-server sequence and ack numbers
         • Time stamp
         • Socket file descriptors (client-director and director-server)
         • Flags indicating whether we have received both FINs

  18. Tables and Queues cont.
     • Functions (the table is implemented as a queue):
         • Enqueue: adds a new entry at the head of the queue.
         • Dequeue: removes an entry from the end of the queue.
         • Find: finds an entry by the given source and destination port and IP.
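The entry fields and the three table functions described above might look like the following in C. This is a sketch, not the project's code: the field names, the singly linked list, and the function names are all assumptions; only the list of fields and the Enqueue/Dequeue/Find behavior come from the slides.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

struct napt_entry {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint32_t client_seq, client_ack;    /* client-director connection */
    uint32_t server_seq, server_ack;    /* director-server connection */
    time_t   time_stamp;
    int      client_fd, server_fd;      /* socket file descriptors */
    unsigned flags;                     /* e.g. which FINs were seen */
    struct napt_entry *next;            /* next (older) entry */
};

struct napt_table { struct napt_entry *head; };

/* Enqueue: add a new entry at the head of the queue. */
static void table_enqueue(struct napt_table *t, struct napt_entry *e)
{
    e->next = t->head;
    t->head = e;
}

/* Dequeue: unlink and return the entry at the end of the queue. */
static struct napt_entry *table_dequeue(struct napt_table *t)
{
    struct napt_entry **link = &t->head;
    if (*link == NULL)
        return NULL;
    while ((*link)->next != NULL)
        link = &(*link)->next;
    struct napt_entry *tail = *link;
    *link = NULL;
    return tail;
}

/* Find: look up an entry by source and destination IP and port. */
static struct napt_entry *table_find(const struct napt_table *t,
                                     uint32_t src_ip, uint16_t src_port,
                                     uint32_t dst_ip, uint16_t dst_port)
{
    for (struct napt_entry *e = t->head; e != NULL; e = e->next)
        if (e->src_ip == src_ip && e->src_port == src_port &&
            e->dst_ip == dst_ip && e->dst_port == dst_port)
            return e;
    return NULL;
}
```

Find is a linear scan here, which matches a table "implemented as a queue". Since several threads touch these tables, a real implementation would also need locking, which this sketch leaves out.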

  19. Tables and Queues cont.
     • Data and SYN-ACK queues
         • These queues hold each packet as received off the raw socket, with the link layer headers removed; only the IP and TCP layer headers are saved.
         • Functions:
             • Enqueue: adds a new entry at the head of the queue.
             • Dequeue: removes an entry from the end of the queue.
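Removing the link layer header before queuing can be sketched as below. The slides do not name the link layer, so Ethernet framing (a 14-byte header with the EtherType in bytes 12 and 13) is an assumption, and `strip_link_header` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_HEADER_LEN 14        /* dst MAC + src MAC + EtherType */
#define SKETCH_ETHERTYPE_IPV4 0x0800

/* Given a raw Ethernet frame, return a pointer to the IP packet inside it
 * and store the packet length in *ip_len. Returns NULL if the frame is too
 * short or does not carry IPv4. */
static const uint8_t *strip_link_header(const uint8_t *frame, size_t frame_len,
                                        size_t *ip_len)
{
    if (frame_len <= SKETCH_ETH_HEADER_LEN)
        return NULL;
    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ethertype != SKETCH_ETHERTYPE_IPV4)
        return NULL;
    *ip_len = frame_len - SKETCH_ETH_HEADER_LEN;
    return frame + SKETCH_ETH_HEADER_LEN;
}
```

The returned pointer refers into the caller's buffer, so a queue that outlives the receive buffer would need to copy the bytes.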

  20. Tables and Queues cont.
     • ACK-3 queue
         • This queue holds the sequence number as received in the SYN-ACK packet. It is used to identify the third packet of the handshake, so that it won't be passed on to the server.
         • Functions:
             • Enqueue: adds a new item at the head of the queue.
             • Remove: finds and removes an item from the queue if it exists.
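Since the ACK-3 queue is only ever searched by value, it can be sketched as a small set of sequence numbers. The fixed capacity, the array storage, and the function names are assumptions made for brevity; the slides only describe Enqueue and a find-and-remove operation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ACK3_CAPACITY 256   /* assumed bound on pending handshakes */

struct ack3_queue {
    uint32_t seq[ACK3_CAPACITY];
    size_t   count;
};

/* Enqueue: remember a sequence number taken from a SYN-ACK packet.
 * Returns false if the queue is full. */
static bool ack3_enqueue(struct ack3_queue *q, uint32_t seq)
{
    if (q->count == ACK3_CAPACITY)
        return false;
    q->seq[q->count++] = seq;
    return true;
}

/* Remove: if seq is present, delete it and return true. A true result
 * means the packet is the third handshake packet and should not be
 * passed on to the server. */
static bool ack3_remove(struct ack3_queue *q, uint32_t seq)
{
    for (size_t i = 0; i < q->count; i++) {
        if (q->seq[i] == seq) {
            q->seq[i] = q->seq[--q->count];   /* fill the hole with the last item */
            return true;
        }
    }
    return false;
}
```

Removal does not preserve insertion order, which is acceptable here because only membership matters.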

  21. Tables and Queues cont.
     • Address table
         • This table stores the addresses of the servers and the clients for use with the raw socket.
         • Each entry consists of the IP and port of the address, and a struct sockaddr_ll.
         • Functions:
             • Enqueue: adds a new address to the table.
             • Remove: finds and removes an address from the table.
             • Find: finds an address by the given port and IP.

  22. Notes
     • To keep the kernel from automatically sending an ACK for every TCP packet received, we used the built-in Linux firewall:
         • After sending the SYN-ACK packet to the client, we insert a rule into the firewall that blocks all TCP traffic to this client (port and IP).
         • After calling connect(), we add a rule that blocks all TCP traffic to the server we just connected to (port and IP).
         • When the entry is removed from the NAPT table, the rule is removed from the firewall too.
     • Although we block all outgoing traffic to some servers, we can still send raw data to those servers using the raw sockets.

  23. Questions?
