Chapter 7

Scaling up from one client at a time

  • All server code in the book, up to now, dealt with one client at a time.

  • Except our last chatroom homework.

  • Options for scaling up:

    • event-driven: see the chatroom example; the problem is its restriction to a single CPU or core

    • multiple threads

    • multiple processes (in Python, this really exercises all CPUs or cores)
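The "multiple threads" option can be sketched in a few lines. This is a hypothetical Python 3 echo server, not the book's lancelot code: each accepted client gets its own thread, so one slow client no longer blocks the rest.

```python
import socket
import threading

def handle(conn):
    # Echo one message back to the client, then hang up.
    conn.sendall(conn.recv(1024))
    conn.close()

def serve_forever(srv):
    # Each accepted client gets its own thread, so a slow client
    # no longer stalls every other client.
    while True:
        conn, _ = srv.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(('127.0.0.1', 0))          # port 0 = pick any free port
srv.listen(5)
port = srv.getsockname()[1]
threading.Thread(target=serve_forever, args=(srv,), daemon=True).start()

client = socket.create_connection(('127.0.0.1', port))
client.sendall(b'ping')
reply = client.recv(1024)
client.close()
```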


Load Balancing I

  • Balance load before it ever reaches your server code, via DNS round-robin:

; zone file fragment
ftp    IN  A  192.168.0.4
ftp    IN  A  192.168.0.5
ftp    IN  A  192.168.0.6
www    IN  A  192.168.0.7
www    IN  A  192.168.0.8

; or use this format, which gives exactly the same result
ftp    IN  A  192.168.0.4
       IN  A  192.168.0.5
       IN  A  192.168.0.6
www    IN  A  192.168.0.7
       IN  A  192.168.0.8
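The client-visible effect of round-robin DNS can be sketched in a few lines of Python 3. The addresses are the ones from the zone fragment; the itertools rotation is a simplified stand-in for the reordering a round-robin DNS server performs.

```python
import itertools

# The three A records that the zone fragment publishes for 'ftp'.
ftp_records = ['192.168.0.4', '192.168.0.5', '192.168.0.6']

# A round-robin DNS server rotates the order of the records it hands
# out, so successive clients end up spread across the pool.
rotation = itertools.cycle(ftp_records)
assigned = [next(rotation) for _ in range(6)]
```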


Load Balancing II

  • Have your own machine front an array of machines with the same service on each and forward service requests in a round-robin fashion.
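A minimal Python 3 sketch of such a front end, assuming a toy one-request protocol; the backend_once and balancer_once names are hypothetical, not from the book. To the client the balancer looks like the server, and to the backend it looks like a client.

```python
import socket
import threading

def backend_once(srv):
    # Toy backend: answer one request, then quit.
    conn, _ = srv.accept()
    conn.sendall(conn.recv(1024) + b':ok')
    conn.close()

def balancer_once(front, backend_addr):
    # Accept one client and relay one request/reply pair to the chosen
    # backend -- looking like a server on one side, a client on the other.
    conn, _ = front.accept()
    upstream = socket.create_connection(backend_addr)
    upstream.sendall(conn.recv(1024))    # client -> backend
    conn.sendall(upstream.recv(1024))    # backend -> client
    upstream.close()
    conn.close()

backend = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
backend.bind(('127.0.0.1', 0))
backend.listen(1)
front = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
front.bind(('127.0.0.1', 0))
front.listen(1)

threading.Thread(target=backend_once, args=(backend,), daemon=True).start()
threading.Thread(target=balancer_once,
                 args=(front, backend.getsockname()), daemon=True).start()

client = socket.create_connection(front.getsockname())
client.sendall(b'ping')
reply = client.recv(1024)
client.close()
```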


Daemons and Logging:

  • “Daemon” means the program is isolated from the terminal in which it was started, so if the terminal is killed the program continues to live.

  • The Python program supervisord does a good job of this isolation and in addition offers the following services:

    • starts and monitors services

    • restarts a service that terminates, and gives up if the service dies several times in a short period of time

    • http://www.supervisord.org

  • supervisord sends stdout and stderr output to a rotating set of log files that cycles through log, log.1, log.2, log.3 and log.4.


Logging continued:

  • A better solution is to import the standard logging module and write to a log that way.

  • logging has the benefit of writing to whatever you want - files, a TCP/IP connection, a printer, whatever.

  • It can also be customized from a configuration file, conventionally called logging.conf, by using the logging.config.fileConfig() function.

import logging

log = logging.getLogger(__name__)
log.error('This is a mistake')
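A short, self-contained Python 3 sketch of the same idea. Here the handler happens to write to an in-memory buffer so the result can be inspected, but it could equally be a file or a TCP/IP connection; the logger name net_py.demo is made up for the example.

```python
import io
import logging

log = logging.getLogger('net_py.demo')   # note: getLogger, two g's
buf = io.StringIO()

# A StreamHandler can write anywhere a file-like object can point:
# here an in-memory buffer stands in for a file or socket.
handler = logging.StreamHandler(buf)
handler.setFormatter(logging.Formatter('%(levelname)s %(name)s %(message)s'))
log.addHandler(handler)
log.setLevel(logging.DEBUG)

log.error('This is a mistake')
record = buf.getvalue().strip()
```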


Sir Launcelot:

  • The following is an importable module:

#!/usr/bin/env python
# Foundations of Python Network Programming - Chapter 7 - lancelot.py
# Constants and routines for supporting a certain network conversation.
import socket, sys

PORT = 1060
qa = (('What is your name?', 'My name is Sir Lancelot of Camelot.'),
      ('What is your quest?', 'To seek the Holy Grail.'),
      ('What is your favorite color?', 'Blue.'))
qadict = dict(qa)

def recv_until(sock, suffix):
    message = ''
    while not message.endswith(suffix):
        data = sock.recv(4096)
        if not data:
            raise EOFError('socket closed before we saw %r' % suffix)
        message += data
    return message
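A Python 3 port of recv_until, exercised over a socketpair() so no real network is needed. The only substantive change is that Python 3 sockets deliver bytes rather than str.

```python
import socket

def recv_until(sock, suffix):
    # Python 3 port of the book's helper: accumulate bytes until the
    # message ends with the expected suffix.
    message = b''
    while not message.endswith(suffix):
        data = sock.recv(4096)
        if not data:
            raise EOFError('socket closed before we saw %r' % suffix)
        message += data
    return message

a, b = socket.socketpair()        # a connected pair, no network needed
a.sendall(b'What is your quest?')
question = recv_until(b, b'?')
a.close()
b.close()
```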


Sir Launcelot II:

  • The following is part of an importable module:

def setup():
    if len(sys.argv) != 2:
        print >>sys.stderr, 'usage: %s interface' % sys.argv[0]
        exit(2)
    interface = sys.argv[1]
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((interface, PORT))
    sock.listen(128)
    print 'Ready and listening at %r port %d' % (interface, PORT)
    return sock


server_simple.py

import lancelot

def handle_client(client_sock):
    try:
        while True:
            question = lancelot.recv_until(client_sock, '?')
            answer = lancelot.qadict[question]
            client_sock.sendall(answer)
    except EOFError:
        client_sock.close()

def server_loop(listen_sock):
    while True:
        client_sock, sockname = listen_sock.accept()
        handle_client(client_sock)

if __name__ == '__main__':
    listen_sock = lancelot.setup()
    server_loop(listen_sock)


Details:

  • The server has two nested infinite loops - an outer one iterating over successive client connections, and an inner one iterating over the exchanges with a single client until that client disconnects.

  • The server is very inefficient; it can only serve one client at a time.

  • If too many clients try to attach, the connection queue will fill up and prospective clients will be dropped; their 3WHS will not even begin, let alone complete.


Elementary Client:

  • This client asks each of the available questions once and only once and then disconnects.

#!/usr/bin/env python
import socket, sys, lancelot

def client(hostname, port):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((hostname, port))
    s.sendall(lancelot.qa[0][0])
    answer1 = lancelot.recv_until(s, '.')  # answers end with '.'
    s.sendall(lancelot.qa[1][0])
    answer2 = lancelot.recv_until(s, '.')
    s.sendall(lancelot.qa[2][0])
    answer3 = lancelot.recv_until(s, '.')
    s.close()
    print answer1
    print answer2
    print answer3


Elementary Client II:

  • The rest of the client:

  • It seems fast, but is it really?

  • To test this for real we need some realistic network latency, so we shouldn't use localhost.

  • We also need to measure microsecond behaviour.

if __name__ == '__main__':
    if not 2 <= len(sys.argv) <= 3:
        print >>sys.stderr, 'usage: client.py hostname [port]'
        sys.exit(2)
    port = int(sys.argv[2]) if len(sys.argv) > 2 else lancelot.PORT
    client(sys.argv[1], port)


Tunnel to another machine:

ssh -L 1061:archimedes.cs.newpaltz.edu:1060 joyous.cs.newpaltz.edu


Tunneling:

  • See page 289 for this feature

  • Alternatively, here is a good explanation of various possible scenarios:

http://www.zulutown.com/blog/2009/02/28/ssh-tunnelling-to-remote-servers-and-with-local-address-binding/

http://toic.org/blog/2009/reverse-ssh-port-forwarding/#.Uzr2zTnfHsY


More on SSHD Port Forwarding:

  • Uses:

    • access a backend database that is only visible on the local subnet:

ssh -L 3306:mysql.mysite.com [email protected]

    • your ISP gives you a shell account but expects emails to be sent from their browser mail client to their server:

ssh -L 8025:smtp.homeisp.net:25 [email protected]

    • reverse port forwarding:

ssh -R 8022:localhost:22 [email protected]

      then

ssh -p 8022 [email protected]


Waiting for Things to Happen:

  • So now we have traffic that takes some time to actually move around.

  • We need to time things.

  • If your function, say foo(), is in a file called myfile.py then the script called my_trace.py will time the running of foo() from myfile.py.


My Experiment

  • Set up VPN from my home so I have a New Paltz IP address

  • Use joyous.cs.newpaltz.edu as my remote machine

  • Have both server and client run on my laptop

[[email protected] 07]$ ssh -L 1061:137.140.108.130:1060 joyous

[[email protected] 07]$ python my_trace.py handle_client server_simple.py ''

python my_trace.py client client.py localhost 1061


my_trace.py

#!/usr/bin/env python
# Foundations of Python Network Programming - Chapter 7 - my_trace.py
# Command-line tool for tracing a single function in a program.
import linecache, sys, time

def make_tracer(funcname):
    def mytrace(frame, event, arg):
        if frame.f_code.co_name == funcname:
            if event == 'line':
                _events.append((time.time(), frame.f_code.co_filename,
                                frame.f_lineno))
            return mytrace
    return mytrace


my_trace.py

if __name__ == '__main__':
    _events = []
    if len(sys.argv) < 3:
        print >>sys.stderr, 'usage: my_trace.py funcname other_script.py ...'
        sys.exit(2)
    sys.settrace(make_tracer(sys.argv[1]))
    del sys.argv[0:2]  # show the script only its own name and arguments
    try:
        execfile(sys.argv[0])
    finally:
        for t, filename, lineno in _events:
            s = linecache.getline(filename, lineno)
            sys.stdout.write('%9.6f %s' % (t % 60.0, s))
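The same sys.settrace() technique, reduced to a self-contained Python 3 miniature that records line numbers instead of timestamps; the target function is made up for the demonstration.

```python
import sys

events = []

def make_tracer(funcname):
    # Record a 'line' event for every line executed inside funcname,
    # mirroring what my_trace.py does with timestamps.
    def mytrace(frame, event, arg):
        if frame.f_code.co_name == funcname:
            if event == 'line':
                events.append(frame.f_lineno)
            return mytrace   # keep tracing inside this frame
    return mytrace

def target():
    x = 1
    y = 2
    return x + y

sys.settrace(make_tracer('target'))
result = target()
sys.settrace(None)   # switch tracing back off
```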


My Output:

43.308772 s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
          Δ = 83 μs
43.308855 s.connect((hostname, port))
          Δ = 644 μs
43.309499 s.sendall(lancelot.qa[0][0])
          Δ = 41 μs
43.309540 answer1 = lancelot.recv_until(s, '.') # answers end with '.'
          Δ = 241 ms
43.523284 while True:
          Δ = 8 μs
43.523292 question = lancelot.recv_until(client_sock, '?')
          Δ = 149 ms
43.672060 answer = lancelot.qadict[question]
          Δ = 9 μs
43.672069 client_sock.sendall(answer)
          Δ = 55 μs
43.672124 while True:
          Δ = 4 μs
43.672128 question = lancelot.recv_until(client_sock, '?')
          Δ = 72 ms
43.744381 s.sendall(lancelot.qa[1][0])

(skip this iteration)


My Output:

43.744381 s.sendall(lancelot.qa[1][0])
          Δ = 80 μs
43.744461 answer2 = lancelot.recv_until(s, '.')
          Δ = 133 ms
43.877629 answer = lancelot.qadict[question]
          Δ = 10 μs
43.877639 client_sock.sendall(answer)
          Δ = 63 μs
43.877702 while True:
          Δ = 6 μs
43.877708 question = lancelot.recv_until(client_sock, '?')
          Δ = 55 ms
43.932345 s.sendall(lancelot.qa[2][0])
          Δ = 86 μs
43.932431 answer3 = lancelot.recv_until(s, '.')
          Δ = 149 ms
44.081574 answer = lancelot.qadict[question]


My Output:

44.081574 answer = lancelot.qadict[question]
          Δ = 8 μs
44.081582 client_sock.sendall(answer)
          Δ = 47 μs
44.081629 while True:
          Δ = 4 μs
44.081633 question = lancelot.recv_until(client_sock, '?')
          Δ = 59 ms
44.140687 s.close()
          Δ = 88 μs
44.140775 print answer1
          Δ = 61 μs
44.140836 print answer2
          Δ = 20 μs
44.140856 print answer3
          Δ = 146 ms
44.287308 except EOFError:
          Δ = 11 μs
44.287317 client_sock.close()


Observations:

  • The server finds the answer in 10 microseconds (the answer = line), so it could theoretically answer 100,000 questions per second.

  • Each sendall() takes ~60 microseconds while each recv_until() takes ~60 milliseconds (1000 times slower).

  • Since receiving takes so long, we can't process more than about 16 questions per second with this iterative server.

  • The OS helps where it can. Notice that sendall() is 1000 times faster than recv_until(). This is because sendall() doesn't actually block until the data is sent and ACKed; it returns as soon as the data is delivered to the TCP layer, and the OS takes care of guaranteeing delivery.


Observations:

  • 219 milliseconds elapse between the moment the client executes connect() and the moment the server executes recv_until(). If all client requests came from the same process, sequentially, we could not expect more than 4 sessions per second.

  • All the while, the server itself is capable of answering 33,000 sessions per second.

  • So communication, and above all sequentiality, really slow things down.

  • So much unutilized server time means there has to be a better way.

  • 15-20 milliseconds for one question to be answered means roughly 40-50 questions per second. Can we do better than this by increasing the number of clients?


Benchmarks:

  • See page 289 for ssh -L feature

  • Funkload: a benchmarking tool written in Python that lets you run more and more copies of something you are testing, to see how it copes with the increased load.


Test Routine:

  • Asks 10 questions instead of 3

#!/usr/bin/env python
from funkload.FunkLoadTestCase import FunkLoadTestCase
import socket, os, unittest, lancelot

SERVER_HOST = os.environ.get('LAUNCELOT_SERVER', 'localhost')

class TestLancelot(FunkLoadTestCase):  # Python syntax for a sub-class
    def test_dialog(self):  # in Java & C++ receiver objects are implicit;
                            # in Python they are explicit (self == this)
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.connect((SERVER_HOST, lancelot.PORT))
        for i in range(10):
            question, answer = lancelot.qa[i % len(lancelot.qa)]
            sock.sendall(question)
            reply = lancelot.recv_until(sock, '.')
            self.assertEqual(reply, answer)
        sock.close()

if __name__ == '__main__':
    unittest.main()


Environment Variables:

  • You can set a variable from the command line and make sure it is inherited by all processes later run from that command line by exporting it.

  • The authors explain why they use an environment variable: “I can not see any way to pass actual arguments through to tests via Funkload command line arguments.”

  • Run server_simple.py on separate machine.

(BenchMark)[[email protected] BenchMark]$ export LAUNCELOT_SERVER=joyous.cs.newpaltz.edu


Config file on laptop:

# TestLauncelot.conf: <test class name>.conf
[main]
title=Load Test For Chapter 7
description=From the Foundations of Python Network Programming
url=http://localhost:1060/   # overridden by the environment variable

[ftest]
log_path = ftest.log
result_path = ftest.xml
sleep_time_min = 0
sleep_time_max = 0

[bench]
log_to = file
log_path = bench.log
result_path = bench.xml
cycles = 1:2:3:5:7:10:13:16:20
duration = 8
startup_delay = 0.1
sleep_time = 0.01
cycle_time = 10
sleep_time_min = 0
sleep_time_max = 0


Testing Funkload:

  • Note the big mixup between “Lancelot” and “Launcelot” in the book's code.

(BenchMark)[[email protected] BenchMark]$ fl-run-test lancelot_tests.py TestLancelot.test_dialog
.
----------------------------------------------------------------------
Ran 1 test in 0.010s

OK


Benchmark run:

  • Typical cycle output

Cycle #7 with 16 virtual users
------------------------------
* setUpCycle hook: ... done.
* Current time: 2013-04-11T13:46:34.536010
* Starting threads: ................ done.
* Logging for 8s (until 2013-04-11T13:46:44.187746): .
........................... done.
* Waiting end of threads: ................ done.

http://www.cs.newpaltz.edu/~pletcha/NET_PY/test_dialog-20130411T134423/index.html


Interpretation:

  • Since we are sending 10 questions per connection (test) we are answering 1320 questions per second.

  • We greatly outdid the original 16 questions per second in the sequential test example.

  • Adding more than 3 or 4 clients really didn't help.

  • Remember we still only have a single-threaded server. The reason for the improvement is that client requests can be “pipelined”, with several clients getting something done at the same time.

  • The only thing that can't happen in parallel is answering the questions.


Performance:

# Clients    # Questions    # Ques/client
3            1320           403
5            1320           264
10           1320           132
15           1320           99
20           1320           66

  • Adding clients drags down performance

  • Insurmountable problem: Server is talking to only one client at a time.




Event-driven Servers:

  • The simple server blocks until data arrives. At that point it can be efficient.

  • What would happen if we never called recv() unless we knew data was already waiting?

  • Meanwhile we could be watching a whole array of connected clients to see which one has sent us something to respond to.


Event-driven Servers:

#!/usr/bin/env python
# Foundations of Python Network Programming - Chapter 7 - server_poll.py
# An event-driven approach to serving several clients with poll().
import lancelot
import select

listen_sock = lancelot.setup()
sockets = { listen_sock.fileno(): listen_sock }
requests = {}
responses = {}

poll = select.poll()
poll.register(listen_sock, select.POLLIN)


Event-driven Servers:

while True:
    for fd, event in poll.poll():
        sock = sockets[fd]

        # Remove closed sockets from our list.
        if event & (select.POLLHUP | select.POLLERR | select.POLLNVAL):
            poll.unregister(fd)
            del sockets[fd]
            requests.pop(sock, None)
            responses.pop(sock, None)

        # Accept connections from new sockets.
        elif sock is listen_sock:
            newsock, sockname = sock.accept()
            newsock.setblocking(False)
            fd = newsock.fileno()
            sockets[fd] = newsock
            poll.register(fd, select.POLLIN)
            requests[newsock] = ''


Event-driven Servers:

        # Collect incoming data until it forms a question.
        elif event & select.POLLIN:
            data = sock.recv(4096)
            if not data:      # end-of-file
                sock.close()  # makes POLLNVAL happen next time
                continue
            requests[sock] += data
            if '?' in requests[sock]:
                question = requests.pop(sock)
                answer = dict(lancelot.qa)[question]
                poll.modify(sock, select.POLLOUT)
                responses[sock] = answer

        # Send out pieces of each reply until they are all sent.
        elif event & select.POLLOUT:
            response = responses.pop(sock)
            n = sock.send(response)
            if n < len(response):
                responses[sock] = response[n:]
            else:
                poll.modify(sock, select.POLLIN)
                requests[sock] = ''


Event-driven Servers:

  • The main loop calls poll(), which blocks until something - anything - is ready.

  • The difference is that recv() waited on a single client, while poll() waits on all clients at once.

  • In the simple server we had one of everything. In this polling server we have an array of everything; one of each thing dedicated to each connection.

  • How poll() works: We tell it what sockets to monitor and what activity we are interested in on each socket – read or write.

  • When one or more sockets are ready with something, poll() returns.
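The poll() mechanics described above can be demonstrated over a socketpair(): register a socket, see no events while nothing is pending, then see a POLLIN event once data arrives. (This is a minimal sketch; select.poll() is Unix-only.)

```python
import select
import socket

a, b = socket.socketpair()            # 'a' stands in for a connected client
poller = select.poll()
poller.register(b, select.POLLIN)     # tell poll() to watch b for reads

ready_before = poller.poll(0)         # nothing sent yet, so no events

a.sendall(b'What is your name?')
ready_after = poller.poll(1000)       # now b is readable
fd, event = ready_after[0]
watched_fd = b.fileno()

data = b''
if event & select.POLLIN:
    data = b.recv(4096)

a.close()
b.close()
```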


Event-driven Servers:

  • The life-span of one client:

1: A client connects and the listening socket is “ready”. poll() returns, and since it is the listening socket, it must be a completed 3WHS. We accept() the connection and tell our poll() function we want to read from this connection. To make sure calls on the new socket never block, we set it non-blocking.

2: When data is available, poll() returns and we read a string and append the string to a dictionary entry for this connection.

3: We know we have an entire question when '?' arrives. At that point we ask poll() to watch for writability on the same connection.

4: Once the socket is ready for writing (poll() has returned) we send as much of the answer as we can, and keep sending until we have sent '.'.

5: Next we swap the client socket back to listening-for-new-data mode.

6: POLLHUP, POLLERR and POLLNVAL events occur on send(), so when recv() returns 0 bytes we close() the socket to make the error show up on our next poll().


server_poll.py benchmark:

  • So we see some performance degradation.

http://www.cs.newpaltz.edu/~pletcha/NET_PY/test_dialog-20130412T081140/index.html


We Got Errors

  • Some connections ended in errors – check out listen().

  • TCP man page:

socket.listen(backlog)
    Listen for connections made to the socket. The backlog argument
    specifies the maximum number of queued connections and should be at
    least 0; the maximum value is system-dependent (usually 5), the
    minimum value is forced to 0.

tcp_max_syn_backlog (integer; default: see below; since Linux 2.2)
    The maximum number of queued connection requests which have still
    not received an acknowledgement from the connecting client. If this
    number is exceeded, the kernel will begin dropping requests. The
    default value of 256 is increased to 1024 when the memory present in
    the system is adequate or greater (>= 128Mb), and reduced to 128 for
    those systems with very low memory (<= 32Mb). It is recommended that
    if this needs to be increased above 1024, TCP_SYNQ_HSIZE in
    include/net/tcp.h be modified to keep
    TCP_SYNQ_HSIZE*16 <= tcp_max_syn_backlog, and the kernel be
    recompiled.


Poll vs Select

  • poll() code is cleaner but select(), which does the same thing, is available on Windows.

  • The author's suggestion: Don't write this kind of code; use an event-driven framework instead.


Non-blocking Semantics

  • In non-blocking mode, recv() acts as follows:

    • If data is ready, it is returned

    • If no data has arrived, socket.error is raised

    • If the connection is closed, '' is returned

  • Why does a closed connection return data ('') while no data raises an error?

  • Think about the blocking situation.

    • The first and last cases can happen under blocking I/O and behave exactly as above. The second situation won't happen - recv() simply blocks.

    • So the second situation had to do something different.
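The three recv() cases can be demonstrated directly in Python 3, where the "socket.error" raised for a not-ready socket is the BlockingIOError subclass of OSError; a socketpair() stands in for a real connection.

```python
import socket

a, b = socket.socketpair()
b.setblocking(False)

# Case 2: no data has arrived yet -- recv() raises instead of blocking.
try:
    b.recv(4096)
    no_data_outcome = 'returned'
except BlockingIOError:
    no_data_outcome = 'raised'

# Case 1: data is ready -- it is returned. Retry in case the bytes
# are still in flight through the kernel.
a.sendall(b'hello')
while True:
    try:
        ready_data = b.recv(4096)
        break
    except BlockingIOError:
        pass

# Case 3: the peer closed -- recv() returns empty (b'' in Python 3).
a.close()
while True:
    try:
        eof_data = b.recv(4096)
        break
    except BlockingIOError:
        pass
b.close()
```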


Non-blocking Semantics:

  • send() semantics:

    • if data is sent, its length is returned

    • socket buffers full: socket.error raised

    • connection closed: socket.error raised

  • Last case is interesting. Suppose poll() says a socket is ready to write but before we call send(), the client sends a FIN. Listing 7-7 doesn't code for this situation.


Event-driven Servers:

  • They are “blocking” and “synchronous”.

  • poll(), after all, blocks. This is not essential: you can have poll() time out and go through a housekeeping loop. Such a server is not entirely event-driven.

  • Since the server reads a question or sends back a reply as soon as it is ready, it is not really asynchronous.

  • “Non-blocking” here means non-blocking on any particular client.

  • This is the only alternative to a “busy-loop”, since such a program would grab 100% of the CPU. Instead we go into the blocked-on-I/O state of the scheduling state diagram.

  • The author says to leave the term “asynchronous” for hardware interrupts, signals, etc.


Alternatives to Listing 7-7

  • Twisted Python is a framework for implementing event-driven servers.

#!/usr/bin/env python
from twisted.internet.protocol import Protocol, ServerFactory
from twisted.internet import reactor
import lancelot

class Lancelot(Protocol):
    def connectionMade(self):  # onConnect()
        self.question = ''

    def dataReceived(self, data):  # onDataReceived()
        self.question += data
        if self.question.endswith('?'):
            self.transport.write(dict(lancelot.qa)[self.question])
            self.question = ''

factory = ServerFactory()
factory.protocol = Lancelot
reactor.listenTCP(1060, factory)
reactor.run()


Homework:

  • Rewrite server_twisted.py (Listing 7-8) to handle the situation in which the client does not wait for the answer to a question before sending out a new question.

  • Rewrite the TestLauncelot.test_dialog() method (Listing 7-5) to send out all questions before it processes any of the answers.

  • Run the funkload test with the same configuration file.

  • In order to do this you will need to set up your own virtual environment and install both funkload (page 106) and twisted.

mkvirtualenv BenchMark
cd to project directory
mkproject BenchMark
workon BenchMark
cp Chapter07code/* .
pip install funkload
pip install twisted


More on Twisted Python

  • Twisted Python handles all the polling, etc of our Listing 7-7.

  • In addition, Twisted can work with actual services that take some time to execute.

  • Hence Twisted can deal with services that require more than the 10 microseconds it takes the program to look up the answer to the question in the dictionary.

  • It does so by allowing you to register deferred methods. These are like callbacks (or a callback chain, actually) that are registered and fire off when the needed data is ready. They have to terminate with a “send reply” event.


Load Balancing and Proxies:

  • Our server in Listing 7-7 is a single thread of control. So is Twisted (except for its deferred methods).

  • Single threaded approach has a clear upper limit – 100% CPU time.

  • Solution – run several instances of server (on different machines) and distribute clients among them.

  • Requires a load balancer that runs on the server port and distributes client requests to different server instances.

  • It is thus a proxy since to the clients it looks like the server and to any server instance it looks like a client.


Load Balancers:

  • Some are built into network hardware.

  • HAProxy is a TCP/HTTP load balancer.

  • Firewall rules on Linux let you simulate a load balancer.

  • Traditional method is to use DNS – one domain name with several IP addresses.

  • Problem with DNS solution is that once assigned a server, the client is stuck if server goes down.

  • Modern load balancers can recover from a server crash by moving live connections to different server instance.

  • DNS is still used for reasons of geography: in Canada, enter www.google.com and you'll get www.google.ca.


Load Balancers:

  • The authors feel it is the only approach to server saturation that scales up.

  • Intended to distribute work across different physical machines, but it could also be used to distribute work among server instances on the same physical machine.

  • Threading and forking are special cases of load balancing: if several threads or processes share the same listening file descriptor and all execute accept(), the OS will load-balance among them.

  • Load balancing is an alternative to writing multi-threaded code.

Open a listening port and then fork the program, or create a thread that shares the same listening port; then all such file descriptors can execute accept() and wait for a 3WHS completion.


Threading and Multi-processing:

  • Take a simple, one-client-at-a-time server and run several instances of it spawned from same process.

  • Our event-driven example is responsible for deciding which client is ready for service. With threading, the OS does that work. Each server connection is blocking on recv() and send() but using very few server resources while doing so.

  • The OS wakes them up (removes them from IOBlocking state) when traffic arrives.

  • Apache offers two solutions – prefork and worker.

  • prefork: forks off several instances of httpd

  • worker: runs multiple threads of control in a single httpd instance.


Worker Example:

[[email protected] 07]$ cat server_multi.py
#!/usr/bin/env python
# Foundations of Python Network Programming - Chapter 7 - server_multi.py
# Using multiple threads or processes to serve several clients in parallel.
import sys, time, lancelot
from multiprocessing import Process
from server_simple import server_loop  # actual server code
from threading import Thread

WORKER_CLASSES = {'thread': Thread, 'process': Process}
WORKER_MAX = 10

def start_worker(Worker, listen_sock):
    worker = Worker(target=server_loop, args=(listen_sock,))
    worker.daemon = True  # exit when the main process does
    worker.start()        # thread or process starts running server_loop
    return worker


Worker Example:

if __name__ == '__main__':
    if len(sys.argv) != 3 or sys.argv[2] not in WORKER_CLASSES:
        print >>sys.stderr, 'usage: server_multi.py interface thread|process'
        sys.exit(2)
    Worker = WORKER_CLASSES[sys.argv.pop()]  # setup() wants len(argv)==2

    # Every worker will accept() forever on the same listening socket.
    listen_sock = lancelot.setup()
    workers = []
    for i in range(WORKER_MAX):
        workers.append(start_worker(Worker, listen_sock))


Worker Example:

    # Check every two seconds for dead workers, and replace them.
    while True:
        time.sleep(2)
        for worker in workers:
            if not worker.is_alive():
                print worker.name, "died; starting replacement worker"
                workers.remove(worker)
                workers.append(start_worker(Worker, listen_sock))


Observations:

  • Now we have multiple instances of the simple server loop from Listing 7-2.

  • The OS is allowing different threads of execution (threads or processes) to listen on the same socket.

  • Our thanks to POSIX.

  • Let's run this server against our benchmark software.


server_simple.py

import lancelot

def handle_client(client_sock):
    try:
        while True:
            question = lancelot.recv_until(client_sock, '?')
            answer = lancelot.qadict[question]
            client_sock.sendall(answer)
    except EOFError:
        client_sock.close()

def server_loop(listen_sock):
    while True:
        client_sock, sockname = listen_sock.accept()
        handle_client(client_sock)

if __name__ == '__main__':
    listen_sock = lancelot.setup()
    server_loop(listen_sock)


ps -ef:

  • 10 worker processes and 1 “watcher” (the parent)

[[email protected] ~]$ ps -ef | grep 'python [s]'
pletcha 14587 29926 0 08:31 pts/2 00:00:00 python server_multi.py process
pletcha 14588 14587 0 08:31 pts/2 00:00:05 python server_multi.py process
pletcha 14589 14587 0 08:31 pts/2 00:00:05 python server_multi.py process
pletcha 14590 14587 0 08:31 pts/2 00:00:05 python server_multi.py process
pletcha 14591 14587 0 08:31 pts/2 00:00:05 python server_multi.py process
pletcha 14592 14587 0 08:31 pts/2 00:00:05 python server_multi.py process
pletcha 14593 14587 0 08:31 pts/2 00:00:05 python server_multi.py process
pletcha 14594 14587 0 08:31 pts/2 00:00:05 python server_multi.py process
pletcha 14595 14587 0 08:31 pts/2 00:00:05 python server_multi.py process
pletcha 14596 14587 0 08:31 pts/2 00:00:05 python server_multi.py process
pletcha 14597 14587 0 08:31 pts/2 00:00:05 python server_multi.py process

All workers (14588-14597) have the same parent (14587).


Threading:

  • Please note that standard CPython greatly restricts the parallelism of our “threads”, since all of them run inside a single Python interpreter.

  • Other implementations of Python (Jython or IronPython) handle this better (finer-grained locking of shared Python data structures).

  • You need to experiment to find the right number of server children to run. Issues such as the number of server cores, the speed of the clients and the speed of the server's RAM bus can all affect the optimum number of servers.

http://www.cs.newpaltz.edu/~pletcha/

NET_PY/test_dialog-20130416T084007/index.html

http://www.cs.newpaltz.edu/~pletcha/

NET_PY/test_dialog-20130416T085520/index.html


Handling multiple, related processes:

  • How do we kill things so that all processes die?

  • multiprocessing cleans up your workers if you interrupt the parent with Ctrl-C or it exits normally.

  • But if you kill the parent outright, the child processes become orphaned and you will have to kill them too, individually.


Killing processes:

[[email protected] ~]$ ps -ef | grep 'python [s]'

pletcha 15107 29926 0 08:54 pts/2 00:00:00 python server_multi.py process

pletcha 15108 15107 0 08:54 pts/2 00:00:05 python server_multi.py process

pletcha 15109 15107 0 08:54 pts/2 00:00:05 python server_multi.py process

pletcha 15110 15107 0 08:54 pts/2 00:00:05 python server_multi.py process

pletcha 15111 15107 0 08:54 pts/2 00:00:05 python server_multi.py process

[[email protected] ~]$ kill 15107

[[email protected] ~]$ ps -ef | grep 'python [s]'

pletcha 15108 1 0 08:54 pts/2 00:00:05 python server_multi.py process

pletcha 15109 1 0 08:54 pts/2 00:00:05 python server_multi.py process

pletcha 15110 1 0 08:54 pts/2 00:00:05 python server_multi.py process

pletcha 15111 1 0 08:54 pts/2 00:00:05 python server_multi.py process

$ ps -ef | grep 'python [s]' | awk '{print $2}'

15108

15109

15110

15111

$ kill $(ps -ef | grep 'python [s]' | awk '{print $2}')


Threading and multi-process frameworks

net_py

Threading and Multi-process Frameworks:

  • Let somebody else do the work.

  • Worker pools (either threads or processes)

  • Module multiprocessing has a Pool object for distributing work across multiple child processes. A server can either have every worker wait on the single listener socket, or have a master thread call accept() and hand each new connection to a worker.

  • Module SocketServer in the standard Python library follows the latter approach: its main loop accepts connections and spawns a thread or process for each.


Multiprocessing

net_py

multiprocessing:

[Diagram: several pre-forked children, each calling ssock.accept() and processing data on connections arriving at a single shared listener socket]

NOTE: When a process fork()s, it spawns a child copy of itself with copies of all its current data variables and values; i.e., an exact copy of the same [code, globals, heap, stack] memory configuration. Hence, if you start a server process, create a socket and listen() on it, then fork() repeatedly and each time, after forking, call accept() in the child process, you succeed in creating many copies of the same socket, identified by the same file descriptor, all of them accepting on the same server-side socket.
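The note above can be demonstrated in a few lines: the child inherits the parent's listener descriptor across fork() and can accept() on it (POSIX only; the addresses and message are illustrative):

```python
import os, socket

# Parent creates the listening socket BEFORE forking, so the child
# inherits the very same file descriptor.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))        # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

pid = os.fork()                        # POSIX only
if pid == 0:
    # Child: accept on the inherited listener socket.
    conn, addr = listener.accept()
    conn.sendall(b'hello from child')
    conn.close()
    os._exit(0)

# Parent: play the role of a remote client.
client = socket.create_connection(('127.0.0.1', port))
print(client.recv(1024).decode())      # hello from child
client.close()
os.waitpid(pid, 0)
```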


Pseudocode

net_py

pseudocode:

[Diagram: fork() gives parent and child each their own copy of the process's code, globals, heap, and stack]

s = socket.socket(...)
s.bind(...)
s.listen(...)
for i in range(10):
    if fork():
        pass               # parent: fork() returned the child's pid; keep forking
    else:
        childFunction(s)   # child: fork() returned 0; accept() on s here


Serversocket

net_py

ServerSocket:

[Diagram: a single listener socket (*); upon each successful accept() the server spawns a child, which handles that connection's data]


Pseudocode

net_py

pseudocode:

[Diagram: fork() gives parent and child each their own copy of the process's code, globals, heap, and stack]

s = socket.socket(...)
s.bind(...)
s.listen(...)
while True:
    conn = s.accept()
    if fork():
        pass                  # parent: fork() returned the child's pid; loop for the next client
    else:
        childFunction(conn)   # child: fork() returned 0; serve this connection
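A runnable sketch of this accept-then-fork pattern (POSIX only; a second forked process stands in for the remote client, and the ping/pong messages are illustrative):

```python
import os, socket, sys

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
port = listener.getsockname()[1]

# One forked process plays the remote client.
client_pid = os.fork()
if client_pid == 0:
    c = socket.create_connection(('127.0.0.1', port))
    c.sendall(b'ping')
    reply = c.recv(1024)
    c.close()
    sys.stdout.write(reply.decode() + '\n')
    sys.stdout.flush()        # flush: os._exit() skips normal cleanup
    os._exit(0)

# The server: accept FIRST, then fork a child to handle the connection.
conn, addr = listener.accept()
handler_pid = os.fork()
if handler_pid == 0:
    data = conn.recv(1024)
    conn.sendall(data + b' pong')
    conn.close()
    os._exit(0)
conn.close()                  # parent drops its copy and could loop for more
os.waitpid(handler_pid, 0)
os.waitpid(client_pid, 0)
```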


Listing 7-10

net_py

Listing 7-10

#!/usr/bin/env python
# Foundations of Python Network Programming - Chapter 7 - server_SocketServer.py
# Answering Lancelot requests with a SocketServer.

from SocketServer import ThreadingMixIn, TCPServer, BaseRequestHandler
import lancelot, server_simple, socket

class MyHandler(BaseRequestHandler):
    def handle(self):
        server_simple.handle_client(self.request)

class MyServer(ThreadingMixIn, TCPServer):
    allow_reuse_address = 1
    # address_family = socket.AF_INET6  # if you need IPv6

server = MyServer(('', lancelot.PORT), MyHandler)
server.serve_forever()
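In Python 3 the module was renamed socketserver, but the shape of the code is the same. A self-contained sketch with a toy uppercasing handler standing in for server_simple.handle_client() (the handler and test message are illustrative, not the Lancelot protocol):

```python
import socket, socketserver, threading

class UpperHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # toy stand-in for the book's per-client protocol code
        data = self.request.recv(1024)
        self.request.sendall(data.upper())

class MyServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    allow_reuse_address = True

# port 0: let the OS pick a free port; serve_forever runs in a thread
server = MyServer(('127.0.0.1', 0), UpperHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

with socket.create_connection(server.server_address) as conn:
    conn.sendall(b'what is your quest?')
    print(conn.recv(1024).decode())    # WHAT IS YOUR QUEST?

server.shutdown()
```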


Socketserver

net_py

SocketServer:

  • Issues include the latency of starting a child for each connection: no data can be processed until the new thread or process has been created.

  • What if a burst of connections arrives and the server spawns a large number of children? The host could run out of resources.

  • multiprocessing, on the other hand, caps the number of workers, so a burst of incoming connections simply has to wait in line.

  • Figure 7-5 shows the benchmark of Listing 7-10. It is less successful than multiprocessing (650 tests/second compared to 750 tests/second when serving 15 clients).


Process and thread coordination

net_py

Process and Thread Coordination:

  • Assumption, so far, is that threads or processes serving one data connection need not share data with threads or processes serving a different data connection.

  • For example, if you connect to the same database from different worker threads, you should be using a thread-safe database API; the same care applies whenever different workers share a resource (writing to the same log file, for example).

  • If this is not so, then you need to be skilled in concurrent programming.

  • Our author's advice:


In memory sharing

net_py

In-memory Sharing:

[Diagram: multiple workers, each calling ssock.accept() and processing data on connections from the shared socket]

NOTE: Share data through synchronized queues (threads) or via IPC queues (processes).
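A minimal sketch of the thread version: workers pull jobs from one synchronized queue.Queue and push results onto another (the doubling job and the None sentinel shutdown are illustrative):

```python
import queue, threading

jobs = queue.Queue()       # thread-safe: workers share it directly
results = queue.Queue()

def worker():
    while True:
        item = jobs.get()
        if item is None:               # sentinel: time to quit
            break
        results.put(item * 2)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for n in range(5):
    jobs.put(n)
for t in threads:
    jobs.put(None)                     # one sentinel per worker
for t in threads:
    t.join()

print(sorted(results.get() for _ in range(5)))  # [0, 2, 4, 6, 8]
```

Because the queue is FIFO, all five real jobs are consumed before any worker sees its sentinel.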


Database sharing

net_py

Database Sharing:

[Diagram: multiple workers, each calling ssock.accept() and processing data on connections from the shared socket]

NOTE: Share data through a 3rd-party data store such as a database or memcached (see next chapter).


Process and thread coordination1

net_py

Process and Thread Coordination:

  • 1: Make sure you understand the difference between threads and processes. Threads continue to share all data after creation, but processes share only at creation and can then diverge. Threads are “convenient” while processes are “safe”.

  • 2: Use high-level data structures provided by python. For example, there exist thread-safe queue data structures and special queues designed to move data across processes.

  • 3: Limit shared data.

  • 4: Check out the Standard Library and Packages index for something created by others.

  • 5: Follow the path taken by web programmers where shared data is kept by the database.
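Point 2 in action: a multiprocessing.Queue moves data between processes over an IPC channel, with the same blocking get()/put() interface as the thread-safe queue (the doubling worker is illustrative):

```python
from multiprocessing import Process, Queue

def worker(inbox, outbox):
    # runs in a separate process; the queues do the IPC behind the scenes
    outbox.put(inbox.get() * 2)

if __name__ == '__main__':
    inbox, outbox = Queue(), Queue()
    p = Process(target=worker, args=(inbox, outbox))
    p.start()
    inbox.put(21)
    print(outbox.get())   # 42
    p.join()
```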


Inetd

net_py

Inetd:

  • An old-timer that saves on daemons: just one daemon listens on several well-known ports simultaneously.

  • Written with select() or poll().

  • The programs to launch upon accept() are listed in /etc/inetd.conf.

  • Problem: How to give each new process the correct file descriptor of the successfully connected client socket?

  • Answer: Duplicate the newly connected socket onto the new process's stdin and stdout (file descriptors 0 and 1). Then write your connection-service program to read stdin and write stdout only; it won't even know it is connected to a remote client.
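The dup2() trick can be sketched with a socketpair standing in for an accepted connection; the forked "service" reads fd 0 and writes fd 1, never touching the socket API (POSIX only; the message is illustrative):

```python
import os, socket

# A connected socket pair stands in for an accepted client connection.
parent_sock, child_sock = socket.socketpair()

pid = os.fork()                        # POSIX only
if pid == 0:
    parent_sock.close()
    # inetd's trick: duplicate the connected socket onto fds 0 and 1,
    # so the service sees it as plain stdin/stdout.
    os.dup2(child_sock.fileno(), 0)
    os.dup2(child_sock.fileno(), 1)
    child_sock.close()
    # The "service" below knows nothing about sockets:
    request = os.read(0, 1024)         # read "stdin"
    os.write(1, request.upper())       # write "stdout"
    os._exit(0)

child_sock.close()
parent_sock.sendall(b'ni!')
print(parent_sock.recv(1024).decode())  # NI!
parent_sock.close()
os.waitpid(pid, 0)
```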


Event-driven servers

net_py

Event-driven Servers:

  • See page 289 for this feature

