
I/O Multiplexing

I/O Multiplexing. Computer Network Programming. Input from multiple sources (file, keyboard, screen, sockets, other terminal devices). A process may have multiple sources of input and may be sending output to multiple destinations.

lluvia

Presentation Transcript


  1. I/O Multiplexing Computer Network Programming

  2. Input from multiple sources
     [Diagram: a process connected to a file, the keyboard, the screen, sockets, and other terminal devices.]
     A process may have multiple sources of input and may be sending output to multiple destinations. I/O multiplexing is used to multiplex the input from multiple sources into a single process.

  3. Where do we use it?
     • When a client handles multiple descriptors (stdin, a network socket, …)
     • When a client handles multiple sockets at the same time (web clients)
     • When a TCP server handles a listening socket and connected sockets at the same time
     • When a server handles both TCP and UDP
     • When a server handles multiple services (inetd, for example)

  4. I/O Models
     • Blocking I/O
     • Nonblocking I/O
     • I/O multiplexing (select() and poll())
     • Signal-driven I/O (SIGIO signal)
     • Asynchronous I/O (aio_ functions)
     Two phases for an input operation:
     • waiting for the data to be ready in the kernel
     • copying the data from the kernel to the process

  5. Blocking I/O Model
     Assume we want to read a UDP datagram from a UDP socket with the recvfrom function (system call).
     • Application: calls recvfrom (system call); the process blocks in the call to recvfrom.
     • Kernel: no datagram ready; wait for data.
     • Kernel: datagram ready; copy the datagram from kernel to user.
     • Kernel: copy complete; recvfrom returns OK.
     • Application: process the datagram.

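  6. Non-blocking I/O Model
     The process repeatedly calls recvfrom, waiting for an OK return (polling).
     • Application: calls recvfrom (system call). Kernel: no datagram ready; returns EWOULDBLOCK.
     • Application: calls recvfrom again. Kernel: still waiting for data; returns EWOULDBLOCK.
     • Application: calls recvfrom again. Kernel: still waiting for data; returns EWOULDBLOCK.
     • Application: calls recvfrom again. Kernel: datagram ready; copy the datagram from kernel to user.
     • Kernel: copy complete; recvfrom returns OK.
     • Application: process the datagram.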
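The polling behaviour on the nonblocking slide can be observed with a small sketch. It assumes a POSIX system and uses a local `AF_UNIX` datagram socket pair instead of a UDP socket so that it needs no network; `set_nonblocking` and `nonblocking_demo` are hypothetical names, not part of the slides.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to nonblocking mode. Returns 0 on success. */
int set_nonblocking(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Returns 0 if the first recvfrom fails with EWOULDBLOCK/EAGAIN (no datagram
 * ready) and the second one, issued after a datagram was sent, returns its
 * length. Returns -1 on setup failure, 1 if the behaviour differs. */
int nonblocking_demo(void)
{
    int sv[2];
    char buf[16];

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    if (set_nonblocking(sv[0]) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* No datagram ready: the call returns immediately instead of blocking. */
    ssize_t first = recvfrom(sv[0], buf, sizeof(buf), 0, NULL, NULL);
    int would_block = (first < 0 && (errno == EWOULDBLOCK || errno == EAGAIN));

    /* Datagram ready: the next poll copies it to the user buffer. */
    ssize_t sent = send(sv[1], "ping", 4, 0);
    ssize_t second = recvfrom(sv[0], buf, sizeof(buf), 0, NULL, NULL);

    close(sv[0]);
    close(sv[1]);
    return (would_block && sent == 4 && second == 4) ? 0 : 1;
}
```

A real polling loop would call recvfrom repeatedly and do other work between attempts; the sketch only shows the two possible outcomes of a single attempt.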

  7. I/O Multiplexing Model
     • Application: calls select (system call); the process blocks in the call to select, waiting for one of possibly many sockets to become readable.
     • Kernel: no datagram ready; wait for data.
     • Kernel: datagram ready; select returns "readable".
     • Application: calls recvfrom (system call); the process blocks while the data is copied into the application buffer.
     • Kernel: copy the datagram from kernel to user; copy complete; recvfrom returns OK.
     • Application: process the datagram.

  8. Signal-driven I/O
     • Application: establishes the SIGIO signal handler with sigaction (system call); the call returns and the process continues executing.
     • Kernel: wait for data.
     • Kernel: datagram ready; delivers SIGIO to the signal handler.
     • Application: the signal handler calls recvfrom (system call); the process blocks while the data is copied into the application buffer.
     • Kernel: copy the datagram from kernel to user; copy complete; recvfrom returns OK.
     • Application: process the datagram.

  9. Asynchronous I/O Model
     • Application: calls aio_read (system call). Kernel: no datagram ready; the call returns and the process continues executing.
     • Kernel: wait for data.
     • Kernel: datagram ready; copy the datagram from kernel to user.
     • Kernel: copy complete; delivers the signal specified in aio_read.
     • Application: the signal handler processes the datagram.

  10. Comparison of I/O Models
      Model               Phase 1: wait for data                 Phase 2: copy data from kernel to user
      Blocking            initiate; blocked                      blocked; complete
      Nonblocking         check, check, check, … (polling)       blocked; complete
      I/O multiplexing    check; blocked until ready             initiate; blocked; complete
      Signal-driven I/O   not blocked; notification              initiate; blocked; complete
      Asynchronous I/O    initiate; not blocked                  not blocked; notification

  11. Synchronous versus asynchronous I/O
      • A synchronous I/O operation causes the requesting process to be blocked until that I/O operation completes: blocking, nonblocking, I/O multiplexing, and signal-driven I/O.
      • An asynchronous I/O operation does not cause the requesting process to be blocked: asynchronous I/O.

  12. select() function
      • The process instructs the kernel to wait for any one of multiple events to occur and to wake up the process only when one or more of these events occurs or when a specified amount of time has passed.
      • select returns when:
        • a descriptor is ready for reading
        • a descriptor is ready for writing
        • a descriptor has an exception condition pending
        • a certain amount of time has passed
      int select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, const struct timeval *timeout);
      Returns: positive count of ready descriptors, 0 on timeout, -1 on error.

  13. select()
      • Wait forever: timeout = NULL;
      • Wait up to the amount of time specified by timeout.
      • Do not wait at all: timeout->tv_sec = 0; timeout->tv_usec = 0;
      struct timeval { long tv_sec; /* seconds */ long tv_usec; /* microseconds */ };
      void FD_ZERO(fd_set *fdset);          /* clear all bits */
      void FD_SET(int fd, fd_set *fdset);   /* turn on the bit for fd */
      void FD_CLR(int fd, fd_set *fdset);   /* turn off the bit for fd */
      int FD_ISSET(int fd, fd_set *fdset);  /* is the bit for fd on? */
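As a minimal sketch of how the timeout and the FD_ macros fit together, the following waits for a single descriptor to become readable. It assumes a POSIX system; `wait_readable` and `select_demo` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to `ms` milliseconds for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long ms)
{
    fd_set rset;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE && ms >= 0);

    FD_ZERO(&rset);              /* clear all bits */
    FD_SET(fd, &rset);           /* turn on the bit for fd */
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;

    /* First argument is the highest descriptor number plus one. */
    int n = select(fd + 1, &rset, NULL, NULL, &tv);
    if (n <= 0)
        return n;                /* 0 on timeout, -1 on error */
    return FD_ISSET(fd, &rset) ? 1 : 0;
}

/* Returns 0 if an empty pipe times out and a pipe holding one byte is
 * reported readable; -1 on setup failure, 1 otherwise. */
int select_demo(void)
{
    int p[2];

    if (pipe(p) < 0)
        return -1;
    int before = wait_readable(p[0], 50);     /* nothing written yet */
    ssize_t w = write(p[1], "x", 1);
    int after = wait_readable(p[0], 50);      /* one byte is waiting */
    close(p[0]);
    close(p[1]);
    return (before == 0 && w == 1 && after == 1) ? 0 : 1;
}
```

The descriptor set and the timeout are rebuilt on every call because select may modify both.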

  14. When is a descriptor ready for reading?
      • The number of bytes in the socket receive buffer is greater than or equal to the current size of the low-water mark for the socket receive buffer. The low-water mark defaults to 1 and can be set using the SO_RCVLOWAT socket option.
      • The read-half of the connection is closed (TCP received a FIN). A read returns zero.
      • The socket is a listening socket and the number of completed connections for the socket is non-zero.
      • A socket error is pending.

  15. When is a descriptor ready for writing?
      • The number of bytes of available space in the socket send buffer is greater than or equal to the current size of the low-water mark for the socket send buffer
        • for TCP, the socket must also be connected
      • The write-half of the TCP connection is closed. A write on the socket generates SIGPIPE.
      • A socket error is pending.

  16. Example: str_cli function
      We call select for readability on either standard input or the socket.
      [Diagram: the client watches stdin (data or EOF) and the socket (error or EOF); the TCP layer below the socket receives data, FIN, or RST.]
      The TCP layer can receive a data segment, a FIN segment, or an RST segment from the peer.

  17. str_cli function at the client

      #include "unp.h"   /* UNP wrappers: Select, Readline, Writen, Fgets, Fputs */

      void str_cli(FILE *fp, int sockfd)
      {
          int     maxfdp1;
          fd_set  rset;
          char    sendline[MAXLINE], recvline[MAXLINE];

          FD_ZERO(&rset);
          for ( ; ; ) {
              /* select modifies rset, so set both bits again on every pass */
              FD_SET(fileno(fp), &rset);
              FD_SET(sockfd, &rset);
              maxfdp1 = max(fileno(fp), sockfd) + 1;
              Select(maxfdp1, &rset, NULL, NULL, NULL);

              if (FD_ISSET(sockfd, &rset)) {          /* socket is readable */
                  if (Readline(sockfd, recvline, MAXLINE) == 0)
                      err_quit("str_cli: server terminated prematurely");
                  Fputs(recvline, stdout);
              }

              if (FD_ISSET(fileno(fp), &rset)) {      /* input is readable */
                  if (Fgets(sendline, MAXLINE, fp) == NULL)
                      return;                         /* all done */
                  Writen(sockfd, sendline, strlen(sendline));
              }
          }
      }

  18. shutdown function
      • close()
        • decrements the reference count of a socket and closes it only when the count reaches zero
        • terminates both directions of data transfer
      • shutdown()
        • can close a socket immediately, regardless of the reference count
        • can close only the read-half or the write-half of a connection
      int shutdown(int sockfd, int howto);
      howto:
        SHUT_RD: the read-half of the connection is closed
        SHUT_WR: the write-half of the connection is closed
        SHUT_RDWR: both the read-half and the write-half of the connection are closed
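A half-close can be tried locally with a short sketch. It assumes a POSIX system and uses an `AF_UNIX` stream socket pair in place of a TCP connection; `finish_and_read` and `half_close_demo` are hypothetical names.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: close the write-half of fd, then read what the peer
 * has sent. Returns the byte count from read, or -1 on error. */
ssize_t finish_and_read(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    if (shutdown(fd, SHUT_WR) < 0)   /* peer will see end-of-file */
        return -1;
    return read(fd, buf, cap);       /* read-half is still open */
}

/* Returns 0 if, after SHUT_WR on one end, that end can still read the peer's
 * data while the peer reads end-of-file (0 bytes). */
int half_close_demo(void)
{
    int sv[2];
    char buf[8];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    ssize_t sent = write(sv[1], "ok", 2);            /* peer data already queued */
    ssize_t got = finish_and_read(sv[0], buf, sizeof(buf));
    ssize_t eof = read(sv[1], buf, sizeof(buf));     /* 0: write-half was closed */

    close(sv[0]);
    close(sv[1]);
    return (sent == 2 && got == 2 && eof == 0) ? 0 : 1;
}
```

With close() instead of shutdown(), the first end could not have read the peer's reply afterwards, because close terminates both directions.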

  19. TCP Echo server
      • We have written a concurrent TCP echo server using child processes created with fork.
      • We can also write a single-process concurrent TCP server using select().
      • We will use select to handle any number of clients concurrently.
      [Diagram: two clients connected to the server, which also holds the listening socket.]
      Data structures at the server:
        client[]:  [0] = 4, [1] = 5, unused entries = -1
        rset:      fd0 fd1 fd2 fd3 fd4 fd5
                    0   0   0   1   1   1
        maxfd + 1 = 6

  20. TCP echo server

      #include "unp.h"

      int main(int argc, char **argv)
      {
          int                 i, maxi, maxfd, listenfd, connfd, sockfd;
          int                 nready, client[FD_SETSIZE];
          ssize_t             n;
          fd_set              rset, allset;
          char                line[MAXLINE];
          socklen_t           clilen;
          struct sockaddr_in  cliaddr, servaddr;

          listenfd = Socket(AF_INET, SOCK_STREAM, 0);

          bzero(&servaddr, sizeof(servaddr));
          servaddr.sin_family      = AF_INET;
          servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
          servaddr.sin_port        = htons(SERV_PORT);

          Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));
          Listen(listenfd, LISTENQ);

  21.
          maxfd = listenfd;                   /* initialize */
          maxi = -1;                          /* index into client[] array */
          for (i = 0; i < FD_SETSIZE; i++)
              client[i] = -1;                 /* -1 indicates available entry */
          FD_ZERO(&allset);
          FD_SET(listenfd, &allset);

          for ( ; ; ) {
              rset = allset;                  /* structure assignment */
              nready = Select(maxfd+1, &rset, NULL, NULL, NULL);

              if (FD_ISSET(listenfd, &rset)) {    /* new client connection */
                  clilen = sizeof(cliaddr);
                  connfd = Accept(listenfd, (SA *) &cliaddr, &clilen);
      #ifdef NOTDEF
                  printf("new client: %s, port %d\n",
                         Inet_ntop(AF_INET, &cliaddr.sin_addr, 4, NULL),
                         ntohs(cliaddr.sin_port));
      #endif

                  for (i = 0; i < FD_SETSIZE; i++)
                      if (client[i] < 0) {
                          client[i] = connfd;     /* save descriptor */
                          break;
                      }

  22.
                  if (i == FD_SETSIZE)
                      err_quit("too many clients");

                  FD_SET(connfd, &allset);        /* add new descriptor to set */
                  if (connfd > maxfd)
                      maxfd = connfd;             /* for select */
                  if (i > maxi)
                      maxi = i;                   /* max index in client[] array */

                  if (--nready <= 0)
                      continue;                   /* no more readable descriptors */
              }

              for (i = 0; i <= maxi; i++) {       /* check all clients for data */
                  if ( (sockfd = client[i]) < 0)
                      continue;
                  if (FD_ISSET(sockfd, &rset)) {
                      if ( (n = Readline(sockfd, line, MAXLINE)) == 0) {
                          /* connection closed by client */
                          Close(sockfd);
                          FD_CLR(sockfd, &allset);
                          client[i] = -1;
                      } else
                          Writen(sockfd, line, n);

                      if (--nready <= 0)
                          break;                  /* no more readable descriptors */
                  }
              }
          }
      }

  23. Denial of Service Attack
      • The TCP server should be designed so that it does not block on a read operation indefinitely
        • otherwise a malicious client can make the server block indefinitely, making the server unavailable to other clients
        • in the previous example, readline may block forever if a malicious client never sends an end-of-line character
      • Therefore the server should do one of the following:
        • use nonblocking I/O
        • have each client served by a separate thread of control
        • place a timeout on the I/O operation

  24. Socket Options

  25. There are options that affect the operation of the socket. • There are functions to get and set the values of these options • getsockopt() and setsockopt() • fcntl() • ioctl() (we will see this later)

  26. getsockopt(), setsockopt()
      int getsockopt(int sockfd, int level, int optname, void *optval, socklen_t *optlen);
      int setsockopt(int sockfd, int level, int optname, const void *optval, socklen_t optlen);
      • sockfd should refer to an open socket descriptor.
      • optval is a pointer to a variable that holds the value of the option.
      • level specifies the code in the system that interprets the option: SOL_SOCKET, IPPROTO_IP, IPPROTO_TCP.
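The calling pattern for integer-valued options can be sketched as follows. This is an illustration under the assumption of a POSIX system where an `AF_INET` stream socket can be created; `get_int_option`, `set_int_option`, and `sockopt_demo` are hypothetical names.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: fetch an int-valued option. Returns 0 on success. */
int get_int_option(int fd, int level, int optname, int *value)
{
    socklen_t len = sizeof(*value);   /* in: buffer size, out: bytes written */
    assert(value != NULL);
    return getsockopt(fd, level, optname, value, &len);
}

/* Hypothetical helper: set an int-valued option. Returns 0 on success. */
int set_int_option(int fd, int level, int optname, int value)
{
    return setsockopt(fd, level, optname, &value, sizeof(value));
}

/* Returns 0 if SO_TYPE reports SOCK_STREAM and SO_REUSEADDR reads back as
 * enabled after being set; -1 if no socket could be created, 1 otherwise. */
int sockopt_demo(void)
{
    int type = 0, reuse = 0;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    int ok = get_int_option(fd, SOL_SOCKET, SO_TYPE, &type) == 0
          && type == SOCK_STREAM
          && set_int_option(fd, SOL_SOCKET, SO_REUSEADDR, 1) == 0
          && get_int_option(fd, SOL_SOCKET, SO_REUSEADDR, &reuse) == 0
          && reuse != 0;
    close(fd);
    return ok ? 0 : 1;
}
```

Flag options such as SO_REUSEADDR are read back as zero or nonzero, so the check compares against zero rather than against 1.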

  27. Socket options
      level        optname        get  set  flag  datatype
      SOL_SOCKET   SO_BROADCAST   x    x    x     int
                   SO_DEBUG       x    x    x     int
                   SO_DONTROUTE   x    x    x     int
                   SO_ERROR       x               int
                   SO_KEEPALIVE   x    x    x     int
                   SO_LINGER      x    x          linger{}
                   SO_RCVBUF      x    x          int
                   SO_SNDBUF      x    x          int
                   SO_RCVLOWAT    x    x          int
                   SO_SNDLOWAT    x    x          int
                   SO_RCVTIMEO    x    x          timeval{}
                   SO_SNDTIMEO    x    x          timeval{}
                   SO_REUSEADDR   x    x    x     int
                   SO_TYPE        x               int
      IPPROTO_IP   IP_HDRINCL     x    x    x     int
                   IP_OPTIONS     x    x    x     int
                   IP_TOS         x    x          int
                   IP_TTL         x    x          int
      IPPROTO_TCP  TCP_KEEPALIVE  x    x          int
                   TCP_MAXRT      x    x          int
                   TCP_MAXSEG     x    x          int
                   TCP_NODELAY    x    x          int

  28. Default Values for Socket Options
      aspendos{korpe}:> checkopts
      SO_BROADCAST: default = off
      SO_DEBUG: default = off
      SO_DONTROUTE: default = off
      SO_ERROR: default = 0
      SO_KEEPALIVE: default = off
      SO_LINGER: default = l_onoff = 0, l_linger = 0
      SO_OOBINLINE: default = off
      SO_RCVBUF: default = 8192
      SO_SNDBUF: default = 8192
      SO_RCVLOWAT: getsockopt error: Option not supported by protocol
      SO_SNDLOWAT: getsockopt error: Option not supported by protocol
      SO_RCVTIMEO: getsockopt error: Option not supported by protocol
      SO_SNDTIMEO: getsockopt error: Option not supported by protocol
      SO_REUSEADDR: default = off
      SO_REUSEPORT: (undefined)
      SO_TYPE: default = 2
      SO_USELOOPBACK: default = off
      IP_TOS: default = 0
      IP_TTL: default = 255
      TCP_MAXSEG: default = 536
      TCP_NODELAY: default = off

  29. Generic Socket Options
      • SO_BROADCAST
        • enables or disables the ability of a process to send broadcast messages
        • only supported for datagram sockets
        • broadcasting is only supported on broadcast media (Ethernet, token ring, wireless LAN, …)
      • SO_DEBUG
        • the kernel keeps track of every packet sent and received over the socket

  30. SO_ERROR
      • Obtains the value of the so_error variable and resets it to zero.
      • When a socket error occurs, the so_error variable is set to one of the E… error values.
      • When such a pending error exists on a socket:
        • select returns
        • SIGIO is generated for the process (if the process is using signal-driven I/O)

  31. SO_KEEPALIVE
      • When no data has been transferred over the TCP connection for 2 hours, a server (or a client) issues probe segments if this option is set.
        • the peer can respond with an ACK
        • the peer can respond with an RST (ECONNRESET)
        • there can be no response at all (ETIMEDOUT, EHOSTUNREACH)
      • SO_LINGER
        • specifies how the close function operates for a connection-oriented protocol

  32. SO_LINGER
      • By default, close returns immediately. The remaining data in the socket send buffer is sent by TCP to the peer.
      struct linger { int l_onoff; int l_linger; };
      • If l_onoff is 0, the option is turned off.
      • If l_onoff is nonzero and l_linger is zero, TCP aborts the connection when close is issued: all remaining data is discarded and an RST is sent to the peer.
      • If l_onoff is nonzero and l_linger is nonzero, close blocks until all data is transmitted and acknowledged or the linger time expires.
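Setting the option amounts to filling a struct linger and passing it to setsockopt. The sketch below assumes a POSIX system where an `AF_INET` stream socket can be created; `set_linger` and `linger_demo` are hypothetical names.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: configure SO_LINGER on fd.
 *   onoff == 0               : default behaviour, close returns immediately
 *   onoff != 0, seconds == 0 : close aborts the connection (RST)
 *   onoff != 0, seconds  > 0 : close blocks up to `seconds` for data to drain
 * Returns 0 on success, -1 on error. */
int set_linger(int fd, int onoff, int seconds)
{
    struct linger lg;

    assert(seconds >= 0);
    lg.l_onoff = onoff;
    lg.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* Returns 0 if a 5-second linger setting reads back from the socket. */
int linger_demo(void)
{
    struct linger lg;
    socklen_t len = sizeof(lg);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    int ok = set_linger(fd, 1, 5) == 0
          && getsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, &len) == 0
          && lg.l_onoff != 0
          && lg.l_linger == 5;
    close(fd);   /* socket was never connected, so there is nothing to linger on */
    return ok ? 0 : 1;
}
```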

  33. How to know the destination process received our data
      • close() returns immediately: it gives no clue whether the destination application has received our data
      • close() with SO_LINGER set lingers until the ACK of our FIN is received: this makes sure that the destination TCP has received all the data (but perhaps not the destination process)
      • shutdown followed by a read waits until we receive the peer's FIN: we can then be sure that the receiving process has received all the data and closed the socket
      • use application-level acknowledgements

  34. SO_RCVBUF and SO_SNDBUF
      • Receive buffers hold received data until it is read by the application.
      • The receive buffer size should be set before calling connect (client) or listen (server),
        • because the window scale option is sent in SYN segments.
      • For optimum performance, socket buffer sizes should be related to the bandwidth x delay product:
        • we should have at least that much buffer space.
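The bandwidth x delay rule is simple arithmetic, sketched here with integer math. `bdp_bytes` and `request_rcvbuf` are hypothetical names, and the two link figures in the comment are illustrative values, not measurements from the slides.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical helper: bandwidth-delay product in bytes for a link of
 * `bits_per_sec` and a round-trip time of `rtt_ms` milliseconds.
 *   100 Mbit/s x 40 ms   -> 500000 bytes
 *   1.544 Mbit/s x 60 ms -> 11580 bytes */
uint64_t bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_ms)
{
    return bits_per_sec / 8 * rtt_ms / 1000;
}

/* Hypothetical helper: ask for a receive buffer of `bytes`. Call it before
 * connect or listen; the kernel may round or cap the value it applies. */
int request_rcvbuf(int fd, int bytes)
{
    assert(bytes > 0);
    return setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes));
}
```

A caller would typically compute bdp_bytes for the expected path and pass the result to request_rcvbuf, then read the option back with getsockopt to see what the system actually granted.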

  35. SO_RCVLOWAT, SO_SNDLOWAT
      • used by the select function to determine when select returns readable or writable for a socket
      • the receive low-water mark defaults to 1; the send low-water mark defaults to 2048
      SO_RCVTIMEO, SO_SNDTIMEO
      • place a timeout on socket receives and sends
      • affect the following functions:
        • read, readv, recv, recvfrom, recvmsg
        • write, writev, send, sendto, sendmsg
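A receive timeout can be sketched as below. The example assumes a system that supports SO_RCVTIMEO (not every system does) and uses a local `AF_UNIX` stream socket pair; `set_recv_timeout` and `recv_timeout_demo` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: make receives on fd give up after `ms` milliseconds.
 * A value of 0 disables the timeout. Returns 0 on success, -1 on error. */
int set_recv_timeout(int fd, long ms)
{
    struct timeval tv;

    assert(ms >= 0);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Returns 0 if recv on an idle socket fails with EAGAIN/EWOULDBLOCK once the
 * 50 ms timeout expires, instead of blocking forever. */
int recv_timeout_demo(void)
{
    int sv[2];
    char buf[8];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (set_recv_timeout(sv[0], 50) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    ssize_t n = recv(sv[0], buf, sizeof(buf), 0);   /* peer never sends */
    int timed_out = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(sv[0]);
    close(sv[1]);
    return timed_out ? 0 : 1;
}
```

This is one way for a server to place a timeout on an I/O operation so that a silent client cannot hold it in a read indefinitely.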

  36. SO_REUSEADDR
      • When we set this option, the listening server can be restarted even though a child of the previous instance is still running and using the same port number.
      • Allows multiple instances of the same server to be started on the same port, as long as each uses a different IP address.
        • With TCP, however, we cannot start servers that use the same local IP address and local port pair, no matter what we do.
      • Allows a process to bind the same port number to multiple sockets, as long as they have different local IP addresses.
      • With UDP, allows completely duplicate bindings: the same IP address and port number can be assigned to multiple sockets.
        • This is used for multicasting.
        • We will see multicasting later.

  37. IPv4 Socket Options
      • IP_HDRINCL
        • if this is set for a raw socket, we must build our own IP header
      • IP_OPTIONS
        • setting this option allows us to set IP options in the IP header
      • IP_RECVDSTADDR
        • causes the destination IP address of a received UDP datagram to be returned as ancillary data by recvmsg
      • IP_RECVIF
        • causes the index of the network interface on which a UDP datagram was received to be returned as ancillary data by recvmsg
      • IP_TOS, IP_TTL
        • set the type-of-service field and the TTL field in outgoing IP datagrams from this socket

  38. TCP Socket Options
      • TCP_KEEPALIVE
        • specifies the idle time in seconds for a connection before TCP starts sending keepalive probes (if the SO_KEEPALIVE option is set)
      • TCP_MAXRT
        • specifies the amount of time before a connection is broken once TCP starts retransmitting data
      • TCP_MAXSEG
        • allows us to fetch and set the MSS value for a TCP connection (we cannot generate segments larger than the value specified with this option)
      • TCP_NODELAY
        • disables Nagle's algorithm
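As a small sketch of a TCP-level option, the following turns TCP_NODELAY on and reads it back. It assumes a POSIX system where an `AF_INET` stream socket can be created and where the option defaults to off; `set_nodelay`, `get_nodelay`, and `nodelay_demo` are hypothetical names.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: on = 1 disables Nagle's algorithm, on = 0 restores it.
 * Returns 0 on success, -1 on error. */
int set_nodelay(int fd, int on)
{
    assert(on == 0 || on == 1);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: returns 1 if TCP_NODELAY is set, 0 if not, -1 on error. */
int get_nodelay(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) < 0)
        return -1;
    return value != 0;
}

/* Returns 0 if the option starts off and reads back as on after being set. */
int nodelay_demo(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    int before = get_nodelay(fd);
    int rc = set_nodelay(fd, 1);
    int after = get_nodelay(fd);
    close(fd);
    return (before == 0 && rc == 0 && after == 1) ? 0 : 1;
}
```

Note that the level argument is IPPROTO_TCP rather than SOL_SOCKET, since the option is interpreted by the TCP code.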

  39. Nagle’s Algorithm
      • Designed to reduce the number of small packets on a wide area network.
      • Says that if a connection has outstanding data (data sent but not yet acknowledged), we cannot send small packets on the connection (small means < MSS).
      • Relevant to interactive applications such as rlogin and telnet.
      • We have a chance of sending more than one character in a single TCP segment.
      [Diagram: the characters h, e, l, l, o, ! are typed one at a time and leave in the segments "h", "el", "lo", "!".]

  40. fcntl function
      • Stands for file control and performs various descriptor control operations.
      int fcntl(int fd, int cmd, … /* int arg */);
      • Set a descriptor for nonblocking I/O
        • use the F_SETFL cmd with the O_NONBLOCK flag
      • Set a socket for signal-driven I/O
        • use the F_SETFL cmd with the O_ASYNC flag
      • Set the socket owner
        • use the F_SETOWN cmd
        • the owner thereby receives the SIGIO (data available) and SIGURG (urgent data available) signals

  41. Example

      int flags;
      int sockfd;
      ….
      sockfd = socket(AF_INET, SOCK_STREAM, 0);

      /* fetch the current file status flags */
      if ((flags = fcntl(sockfd, F_GETFL, 0)) < 0)
          err_sys("F_GETFL error");

      /* add O_NONBLOCK and write the flags back */
      flags = flags | O_NONBLOCK;
      if (fcntl(sockfd, F_SETFL, flags) < 0)
          err_sys("F_SETFL error");
