Transmission Control Protocol (TCP) NETS3303/3603 Week 8
Outcomes • Learn the mechanisms of TCP • How they operate • How they are implemented • What are the limitations of TCP • Retransmission and congestion control
Intro • TCP - Transmission Control Protocol • reliable, connection-oriented stream (point to point) protocol • if UDP is like Aus Post • TCP is like a one-to-one phone call (cannot broadcast/multicast)
Intro • Defined in RFC 793, with host requirements in RFC 1122 • TCP has its own jargon: • socket: a communication endpoint • segment: a TCP packet • MSS: maximum segment size, the largest segment one TCP side can send to the other, negotiated at connection time • port: application identifier
TCP Properties • stream orientation. stream of bytes passed between send/recv • connection is full duplex • think of it as two independent streams joined with piggybacking mechanism • piggybacking - one data stream has control info for the other data stream • unstructured stream • doesn’t show packet boundaries to applications
TCP Properties • virtual circuit connection • client connects and server listens/accepts • i/o transfers don’t have remote peer address • Buffered Transfer • Send and receive buffers for flow control • Reliability!
Providing Reliability • Traditional technique: Positive Acknowledgement with Retransmission (PAR) • Receiver sends acknowledgement when data arrives • Sender starts timer whenever transmitting • Sender retransmits if timer expires before acknowledgement arrives
The Problem With Simplistic PAR It wastes substantial network bandwidth because the sender must delay sending a new packet until it receives an ack for the previous packet
Solving The Problem • Allow multiple packets to be outstanding at any time • Still require acknowledgements and retransmission • For reliability • Known as sliding window
Why Sliding Window Works • Because a well-tuned sliding window protocol keeps the network completely saturated with packets • it obtains substantially higher throughput than a simple positive ack protocol
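The throughput gap between simple PAR and a sliding window can be estimated with a little arithmetic. The sketch below is a rough model only: it assumes a fixed RTT, no loss, and equal-sized segments, and the function names and example numbers are illustrative, not from the slides.

```python
def stop_and_wait_throughput(segment_bytes, rtt_s, link_bps):
    """Bytes/second when only one segment may be outstanding (simple PAR)."""
    transmit_s = segment_bytes * 8 / link_bps      # time to put one segment on the wire
    return segment_bytes / (transmit_s + rtt_s)    # one segment per send-and-ack cycle


def sliding_window_throughput(segment_bytes, window_segments, rtt_s, link_bps):
    """Bytes/second with up to window_segments outstanding, capped by the link rate."""
    transmit_s = segment_bytes * 8 / link_bps
    per_cycle = window_segments * segment_bytes / (transmit_s + rtt_s)
    return min(per_cycle, link_bps / 8)            # cannot exceed the link capacity


# Assumed example: 1000-byte segments, 100 ms RTT, 10 Mbit/s link.
par = stop_and_wait_throughput(1000, 0.1, 10_000_000)
small = sliding_window_throughput(1000, 50, 0.1, 10_000_000)
tuned = sliding_window_throughput(1000, 200, 0.1, 10_000_000)
print(round(par), round(small), round(tuned))
```

With these assumed numbers, simple PAR uses under 1% of the link, while a window of 200 segments is large enough to keep the link saturated.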
Sliding Window Used By TCP • Measured in byte positions • Bytes up to and including 2 are acknowledged • Bytes 3 through 6 not yet acknowledged • Bytes 7 through 9 waiting to be sent • Bytes above 9 lie outside the window and cannot be sent
Sliding Window • TCP can use cumulative ACK, e.g., ACK up to #7 • TCP uses bytes, not packets, for sequencing • Receive side controls the sliding window • Based on its available buffer space • Receiver can stop the sender by advertising a window size of 0 in an ACK; this is flow control
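A cumulative ACK names the next byte the receiver expects, so it only advances over data that has arrived without gaps. The following is a minimal sketch of that rule; the function name and the `(first_byte, length)` segment representation are assumptions made for illustration.

```python
def cumulative_ack(next_expected, received_segments):
    """Return the next byte the receiver expects after merging arrived segments.

    received_segments: iterable of (first_byte, length) pairs, in any order.
    """
    for first, length in sorted(received_segments):
        if first > next_expected:
            break                      # gap: later data cannot be acknowledged yet
        next_expected = max(next_expected, first + length)
    return next_expected


# Bytes 1-3 and 4-6 arrived; bytes 10-11 arrived out of order behind a gap.
print(cumulative_ack(1, [(10, 2), (1, 3), (4, 3)]))   # ACK number stays at the gap
```

The out-of-order segment is held by the receiver but does not move the ACK number until the missing bytes arrive.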
TCP Flow Control • Differs from the mechanism used in LLC, HDLC and other data link protocols: • Decouples ack of received data units from granting permission to send more • TCP’s flow control is known as a credit allocation scheme: • Each transmitted octet has a sequence number, and credit is granted in octets
Flow Control And TCP Window • Receiver controls flow by telling sender size of currently available buffer measured in bytes • Called window advertisement • Each segment, including data segments, specifies size of window beyond acknowledged byte • Window size may be 0 (receiver cannot accept additional data at present) • Receiver can send additional ack later when buffer space becomes available
TCP Header Fields for Flow Control • Sequence number (SN) of first octet/byte in data segment • Acknowledgement (ACK) number (AN) next octet to receive, (if any) • Window (W) • If ACK contains AN = i, W = j: • Octets through SN = i - 1 acknowledged • Permission is granted to send W = j more octets, i.e., from octets i through i + j - 1
Credit Allocation is Flexible Suppose last message B issued was AN = i, W = j: • To increase credit to k (k > j) when no new data, B issues AN = i, W = k • To acknowledge a segment containing m octets (m < j) without allocating more credit, B issues AN = i + m, W = j – m
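The AN/W rules above reduce to simple arithmetic on octet numbers. This is a small sketch using hypothetical helper names; it only restates the two cases from the slide and the range formula i through i + j - 1.

```python
def sendable_range(an, w):
    """Octets the sender may transmit given AN = an, W = w, as (first, last) inclusive."""
    if w == 0:
        return None                    # zero window: nothing may be sent
    return (an, an + w - 1)


def increase_credit(an, w, k):
    """No new data acknowledged, credit raised to k (k > w): issue AN = an, W = k."""
    return an, k


def ack_without_new_credit(an, w, m):
    """Acknowledge m octets (m < w) without granting more: AN = an + m, W = w - m."""
    return an + m, w - m


an, w = 1001, 1400
print(sendable_range(an, w))                       # window before the ack
an2, w2 = ack_without_new_credit(an, w, 200)
print(sendable_range(an2, w2))                     # right edge of the window is unchanged
```

Acknowledging data without allocating credit moves the left edge of the window forward while the right edge stays put, which is exactly what decoupling acks from permission to send means.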
Silly Window Syndrome • Problem when the sending application creates data slowly or the receiving app consumes slowly • Significantly reduces network efficiency • Small windows are advertised and small segments are sent • E.g.: a 1-byte data segment would carry 54 bytes of overhead, i.e. about 98% of the bytes sent
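The 98% figure can be checked directly. The sketch below assumes the 54 bytes are made up of a 14-byte Ethernet header, a 20-byte IP header and a 20-byte TCP header (no options); that breakdown is an assumption, since the slide gives only the total.

```python
def header_overhead_fraction(payload_bytes, header_bytes=54):
    """Fraction of transmitted bytes that is header rather than application data."""
    return header_bytes / (header_bytes + payload_bytes)


print(f"{header_overhead_fraction(1):.1%}")      # 1-byte payload
print(f"{header_overhead_fraction(1460):.1%}")   # a full-sized payload for comparison
```

The comparison with a 1460-byte payload (an assumed typical MSS on Ethernet) shows why filling segments matters: the same headers then cost only a few percent.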
Solution for SWS by Sender • Serving application creates data slowly, may be 1 byte at a time • Solutions using Nagle’s algorithm: • Sending TCP sends first data piece immediately • Subsequently, it accumulates data until receiving ack or enough data to fill MSS. Then, it sends it.
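The sender-side rule can be sketched as a small state machine. This is a simplified model, not a real TCP implementation: it assumes every ack covers all outstanding data, and the class and attribute names are made up for illustration.

```python
class NagleSender:
    """Toy model of Nagle's algorithm for a single connection."""

    def __init__(self, mss):
        self.mss = mss
        self.buffer = bytearray()   # data written by the application, not yet sent
        self.unacked = False        # True while sent data is awaiting an ack
        self.sent = []              # segments handed to the network, in order

    def write(self, data):
        self.buffer += data
        self._try_send()

    def on_ack(self):
        self.unacked = False        # simplification: the ack covers everything sent
        self._try_send()

    def _try_send(self):
        # A full MSS worth of data is always sent immediately.
        while len(self.buffer) >= self.mss:
            self._emit(self.mss)
        # A small remainder goes out only when nothing is outstanding.
        if self.buffer and not self.unacked:
            self._emit(len(self.buffer))

    def _emit(self, n):
        self.sent.append(bytes(self.buffer[:n]))
        del self.buffer[:n]
        self.unacked = True


s = NagleSender(mss=4)
for byte in (b"a", b"b", b"c"):
    s.write(byte)                   # "a" goes at once; "b" and "c" accumulate
s.on_ack()                          # ack arrives: accumulated "bc" is sent as one segment
print(s.sent)
```

The first byte is sent immediately, and the bytes that arrive while it is unacknowledged are coalesced into one segment instead of two.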
Solution for SWS by Receiver • Serving an app that consumes slowly • Buffer fills quickly, so a 0 window is advertised, followed by small window advertisements for a long time • Solution: Delayed Acknowledgement • When a segment arrives, don’t ack immediately • Wait until there is decent buffer space, but no more than 500 ms, before acking
TCP Header • [Figure: TCP header layout, including the Hlen field]
Header Explained • header sent in every TCP packet • Sometimes may just be control message (SYN/FIN/ACK) with no data • view TCP as 2 sender/recv data streams with control information sent back the other way (piggybacking)
Header • source port: 16 bits, the TCP source port • destination port: 16 bits, note ports in 1st 8 bytes • sequence number: 1st data octet in this segment (from send to recv): 32 bit space (ISN) • ack: if ACK flag set, next expected sequence number (piggybacking; i.e., we are talking about the flow the other way)
Header • hlen: # of 4-byte words in header (e.g. 5 means 20 bytes) • Reserved bits: not used • flags • URG: - urgent pointer field significant • ACK:- ack field significant (this pkt is an ACK!) • PSH: - push function (mostly ignored) • RST: - reset (give up on) the connection (error) • SYN: - initial synchronization packet (start connect) • FIN: - final hangup packet (end connect)
Header • window: window size, the number of bytes, starting at the byte named in the ACK field, that the recv-side will accept (piggyback) • checksum: 16 bits • Includes 12-byte IP pseudo-header, TCP header, and data • urgent pointer: offset from sequence number, points to data following urgent data, URG flag must be set • options - e.g., Max Segment Size (MSS), timestamp
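The fixed 20-byte part of the header described above can be decoded with Python's `struct` module. This is a sketch for illustration: the function name and the returned dictionary keys are assumptions, options are ignored, and the checksum is not verified.

```python
import struct

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]   # bit 0 upward


def parse_tcp_header(segment):
    """Decode the fixed 20-byte TCP header from the start of a raw segment."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    hlen_words = off_flags >> 12                  # top 4 bits: header length in 4-byte words
    flag_bits = off_flags & 0x3F                  # low 6 bits: URG ACK PSH RST SYN FIN
    flags = [n for i, n in enumerate(FLAG_NAMES) if flag_bits & (1 << i)]
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "header_bytes": hlen_words * 4,
        "flags": flags,
        "window": window, "checksum": checksum, "urgent": urgent,
    }


# A hand-built SYN segment: hlen = 5 words, only the SYN bit set.
syn = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(syn))
```

The network byte order prefix `!` matters: every multi-byte TCP header field is big-endian on the wire.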
TCP Ports, Connections, And Endpoints • Endpoint of communication is application program • TCP uses protocol port number to identify application • TCP connection between two endpoints identified by four items • Sender’s IP address • Sender’s protocol port number • Receiver’s IP address • Receiver’s protocol port number
TCP Open / Close • Two sides of a connection • One side waits for contact • A server program • Uses TCP’s passive open • One side initiates contact • A client program • Uses TCP’s active open
Closing TCP connection • connections are full duplex and it is possible to shutdown one side at a time • close closes everything • involves two 2-way handshakes (send FIN, recv replies with ACK per channel) • Modified three-way handshake • interesting problem: how do you make sure last ACK got there (can’t ACK it...)?
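Passive open, active open, and closing one direction at a time all map onto the standard sockets API. The sketch below runs both ends in one process over the loopback interface, which is an assumption made to keep it self-contained; the function name, payloads, and timeout are illustrative.

```python
import socket


def half_close_demo():
    """Open a loopback TCP connection, half-close it from the client, return the data seen."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)
    server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    server.listen(1)                       # passive open: wait for contact
    port = server.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5)
    client.connect(("127.0.0.1", port))    # active open: starts the 3-way handshake
    conn, _ = server.accept()
    conn.settimeout(5)

    client.sendall(b"request")
    client.shutdown(socket.SHUT_WR)        # sends FIN for this direction only

    received = b""
    while chunk := conn.recv(1024):        # recv returns b"" once the client's FIN arrives
        received += chunk

    conn.sendall(b"reply")                 # the other direction is still open
    conn.close()                           # now the server side sends its FIN

    reply = b""
    while chunk := client.recv(1024):
        reply += chunk

    client.close()
    server.close()
    return received, reply


print(half_close_demo())
```

After `shutdown(SHUT_WR)` the client can no longer send but still receives the reply, which shows the two directions being closed independently.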
First FIN cost • both apps could close first and send FIN, hence left side is more complex • but state machine supports async close (right side) • TIME_WAIT state is used to deal with unreliable delivery, must wait 2 MSL (maximum segment lifetime), typically 1 or 2 minutes • Wait for any duplicate segments to arrive before closing
Some TCP Protocol Mechanisms • flow control • adaptive retransmission + backoff • congestion control
TCP Retransmission • Designed for Internet environment • Delays on one connection vary over time • Delays vary widely between connections • Fixed value for timeout will fail • Waiting too long introduces unnecessary delay • Not waiting long enough wastes network bandwidth with unnecessary retransmission • Retransmission strategy must be adaptive
Adaptive Retransmission • TCP keeps estimate of round-trip time (RTT) on each connection • RTT derived from observed delay between sending segment and receiving ack • Timeout for retransmission based on current round-trip estimate
Difficulties With Adaptive Retransmission • The problem is knowing when to retransmit • Segments or ACKs can be lost or delayed, making RTT estimation difficult or inaccurate • RTTs vary over several orders of magnitude between different connections • Traffic is bursty, so RTTs fluctuate wildly on a single connection!
Solution: Smoothing • Adaptive retransmission schemes keep a statistically smoothed round-trip estimate • Smoothing keeps running average from fluctuating wildly • keeps TCP from overreacting to change • Difficulty: choice of smoothing scheme!
Retransmission Timer Management Three Techniques to calculate RTO: • RTT Variance Estimation • Exponential RTO Backoff • Karn’s Algorithm
RTT Variance Estimation (Jacobson’s Algorithm) 3 sources of high variance in RTT: • If the data rate is relatively low, then transmission delay (T) will be relatively large, with larger variance due to variance in packet size • Load may change abruptly due to other sources • Peer may not acknowledge segments immediately • So, using only smoothed RTT is insufficient • need to consider delay variance too
Jacobson’s Algorithm • Initial RTO value typically reflects Ethernet delay • Update RTO using returning acks, based on average and variance: • SRTT(K + 1) = (1 – g) × SRTT(K) + g × RTT(K + 1) • SERR(K + 1) = RTT(K + 1) – SRTT(K) • SDEV(K + 1) = (1 – h) × SDEV(K) + h × |SERR(K + 1)| • RTO(K + 1) = SRTT(K + 1) + 4 × SDEV(K + 1) • SDEV is an RTT variability factor • g = 0.125, h = 0.25
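The four update equations translate line for line into code. This is a minimal sketch: the function name and the starting values in the example are assumptions, while the formulas and the constants g and h are the ones on the slide.

```python
def jacobson_update(srtt, sdev, rtt_sample, g=0.125, h=0.25):
    """Apply one RTT sample; return the new (SRTT, SDEV, RTO)."""
    serr = rtt_sample - srtt                       # SERR(K+1) = RTT(K+1) - SRTT(K)
    new_srtt = (1 - g) * srtt + g * rtt_sample     # smoothed round-trip time
    new_sdev = (1 - h) * sdev + h * abs(serr)      # smoothed deviation
    rto = new_srtt + 4 * new_sdev                  # timeout = average + 4 x deviation
    return new_srtt, new_sdev, rto


# Assumed starting state: SRTT = 1.0 s, SDEV = 0.0 s, then one slow sample of 2.0 s.
srtt, sdev, rto = jacobson_update(1.0, 0.0, 2.0)
print(srtt, sdev, rto)
```

Note that the error term uses the old SRTT, so it must be computed before SRTT is overwritten. A single outlier moves SRTT only slightly, but it raises SDEV and therefore widens the timeout margin.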
Two Other Factors Jacobson’s algorithm can significantly improve TCP performance, but: • What RTO to use for retransmitted segments? ANSWER: exponential RTO backoff algorithm • Which round-trip samples to use as input to Jacobson’s algorithm? ANSWER: Karn’s algorithm
Exponential RTO Backoff • Loss indicates congestion; multiple losses indicate more severe congestion! • So, it’s a form of congestion control • Increase RTO each time the same segment is retransmitted (the backoff process) • Multiply RTO by a constant: RTO = q × RTO • When q = 2, this is called binary exponential backoff (similar to Ethernet backoff)
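The backoff rule RTO = q × RTO produces a geometric sequence of timeouts. The sketch below shows it; the function name is hypothetical, and the optional `max_rto` cap is an added assumption (the slide does not mention an upper bound).

```python
def backoff_schedule(initial_rto, retransmissions, q=2.0, max_rto=None):
    """Return the RTO in effect after each successive retransmission of one segment."""
    rto = initial_rto
    schedule = []
    for _ in range(retransmissions):
        rto *= q                           # RTO = q x RTO
        if max_rto is not None:
            rto = min(rto, max_rto)        # assumed cap, not part of the slide
        schedule.append(rto)
    return schedule


print(backoff_schedule(1.0, 4))                 # binary exponential backoff, q = 2
print(backoff_schedule(1.0, 4, max_rto=5.0))    # same, with an assumed 5 s ceiling
```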
Which RTT Samples to Consider? • If an ack is received for retransmitted segment, there are 2 possibilities: • Ack is for first transmission • Ack is for second transmission • TCP source cannot distinguish these 2 cases • No valid way to calculate RTT: • From first transmission to ack, or • From second transmission to ack?
Karn’s Algorithm • Do not use measured RTT of retransmitted segments to update SRTT and SDEV • Calculate backoff RTO when a retransmission occurs • Use backoff RTO for segments until an ACK arrives for a segment that has not been retransmitted • Then Jacobson’s algorithm is reactivated to calculate RTO
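Karn's rule, the backoff, and Jacobson's update fit together in one small estimator. This is a simplified sketch with assumed class and method names; it tracks a single RTO per connection and leaves out details such as clock granularity and minimum or maximum RTO bounds.

```python
class RtoEstimator:
    """Toy RTO calculator combining Jacobson's update, backoff, and Karn's rule."""

    def __init__(self, srtt, sdev, g=0.125, h=0.25, q=2.0):
        self.srtt, self.sdev = srtt, sdev
        self.g, self.h, self.q = g, h, q
        self.rto = srtt + 4 * sdev

    def on_timeout(self):
        """A segment timed out and is being retransmitted: back off the RTO."""
        self.rto *= self.q

    def on_ack(self, rtt_sample, was_retransmitted):
        """Process an ack; the sample is ignored if the segment was retransmitted."""
        if was_retransmitted:
            return                         # Karn: ambiguous sample, keep the backed-off RTO
        serr = rtt_sample - self.srtt
        self.srtt = (1 - self.g) * self.srtt + self.g * rtt_sample
        self.sdev = (1 - self.h) * self.sdev + self.h * abs(serr)
        self.rto = self.srtt + 4 * self.sdev   # Jacobson's algorithm is reactivated


est = RtoEstimator(srtt=1.0, sdev=0.0)
est.on_timeout()                               # RTO backs off
est.on_ack(5.0, was_retransmitted=True)        # ambiguous ack: estimates untouched
backed_off = est.rto
est.on_ack(2.0, was_retransmitted=False)       # clean sample: estimates updated
print(backed_off, est.srtt, est.sdev, est.rto)
```

The ack for the retransmitted segment leaves both the estimates and the backed-off RTO alone; only the later ack for a segment sent once feeds the estimator again.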
Summary Of TCP • Major transport service in the Internet (85% of traffic) • Connection oriented • Provides end-to-end reliability • Uses adaptive retransmission • Includes facilities for flow control and congestion avoidance • Uses 3-way handshake for connection startup and a modified 3-way handshake (FIN and ACK in each direction) for shutdown