Network Application Performance

Deke Kassabian and Shumon Huque ISC Networking & Telecommunications February 2002 - Super Users Group. Network Application Performance. What this talk is all about Network performance on the local area network and around campus

Deke Kassabian and Shumon Huque

ISC Networking & Telecommunications

February 2002 - Super Users Group

Network Application Performance


What this talk is all about

Network performance on the local area network and around campus

Network performance in the wide area and for advanced applications

Goal: acceptable performance, positive user experience


Who needs to be involved?

  • End Users

  • Researchers

  • Local Support Providers

  • Application Developers

  • System Programmers/Administrators

  • Network Engineers

What is performance

“Performance” might mean …

Elapsed time for file transfers

Packet loss over a period of time

Percentage of data needing retransmission

Drop outs in video or audio

Subjective “feeling” that feedback is “on time”


Throughput is the amount of data that arrives per unit time.

“Goodput” is the amount of data that arrives per unit time, minus the amount of that data that was retransmitted.
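
As a rough illustration of the two definitions above, the sketch below computes both rates from byte counts. The function names and the sample transfer are made up for the example; they are not from the talk.

```python
def throughput_bps(bytes_arrived: int, seconds: float) -> float:
    """Bits that arrived per second, retransmitted data included."""
    return bytes_arrived * 8 / seconds


def goodput_bps(bytes_arrived: int, bytes_retransmitted: int, seconds: float) -> float:
    """Bits that arrived per second, not counting retransmitted data."""
    return (bytes_arrived - bytes_retransmitted) * 8 / seconds


# Hypothetical transfer: 10 MB arrive in 10 s, and 1 MB of that was retransmitted.
print(throughput_bps(10_000_000, 10.0))          # 8000000.0
print(goodput_bps(10_000_000, 1_000_000, 10.0))  # 7200000.0
```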



Delay is a time measurement for data transfer

One way network delay for a bit in transit

Delay for a total transfer

Time from mouse click to screen message that the “operation is complete”



Stack to Stack

Eyeball to eyeball


Jitter: variation in delay over time

Non-issue for non-realtime applications

May be problematic for some applications with real-time interactive requirements, such as video conferencing

E2E delay of 70 ms +/- 5 ms -> low jitter

E2E delay of 35 ms +/- 20 ms -> higher jitter
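
One simple way to put numbers on the two cases above is to take a set of delay samples and report the average plus the largest deviation from it. This is only a sketch: the sample values are invented, and reading "+/-" as the maximum deviation is an assumption (other jitter definitions exist).

```python
from statistics import mean


def delay_spread_ms(samples_ms):
    """Return (average delay, largest deviation from the average) in ms."""
    avg = mean(samples_ms)
    return avg, max(abs(s - avg) for s in samples_ms)


path_a = [65, 70, 75, 70]  # hypothetical samples: longer delay, little variation
path_b = [15, 35, 55, 35]  # hypothetical samples: shorter delay, more variation

print(delay_spread_ms(path_a))  # average 70 ms, spread 5 ms
print(delay_spread_ms(path_b))  # average 35 ms, spread 20 ms
```

Note that the path with the lower average delay is the one with the higher jitter.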


Some contributors to delay

Slow networks

Slow computers

Poor TCP/IP stacks on end-stations

Poorly written applications

Analysis of Delay



(2) Propagation Delay

Insertion time

(3) Processing Delay

Analysis of Delay

From: Deke

To: Ira

Date: Mon Feb 12, 2002, 11:00AM EST

Subject: Lunch

Hey Ira,

Meet you at the food trucks at noon!


Send 1,000 bits from A to B,

With an acknowledgement,

Over 100 meters of fiber



(2) Propagation Delay: 0.0000004 sec

Insertion time: 0.0001 sec

(3) Processing Delay: 0.01 sec

Analysis of Delay

Send 1,000 bits from A to B,

With an acknowledgement,

Over 100 meters of fiber



Total Elapsed Time: 0.0211008 seconds

Analysis of Delay


Add 2 switches and a router to the path





Add 0.00002 sec

Add 0.00002 sec

Add 0.002 sec

New Total Elapsed Time: 0.0231408 seconds
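
The new total is simply the earlier total of 0.0211008 seconds plus the three additions listed on this slide. A quick check, using Decimal so the small values add exactly:

```python
from decimal import Decimal

base_total = Decimal("0.0211008")  # total elapsed time before adding devices
added = [Decimal("0.00002"), Decimal("0.00002"), Decimal("0.002")]  # the three "Add" values

new_total = base_total + sum(added)
print(new_total)  # 0.0231408
```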

Summary of delay analysis

Propagation delay is of little consequence in LANs, more of an issue for high bandwidth WANs.

Queueing delays are rarely major contributors.

Processing delay is almost always an issue.

Retransmission delays can be major contributors to poor network performance.

Speaker Change

What I’m going to talk about

  • More on delay contributors, their causes and how to minimize them

  • Protocol Stack behavior & tuning

  • Quality of Service (QoS)

  • Performance measurement tools

  • Operating System tuning examples

  • General comments about things you can do

Recap: Delay Contributors

  • Processing Delay

  • Retransmission Delay

  • Queueing Delay

  • Propagation Delay

Processing Delay

  • Time it takes to process a packet at an end-station or network node. Depends on:

    • Network protocol complexity, application code, computational power at node, NIC efficiency etc

  • Endstation Tuning

  • Application Tuning

Endstation Tuning

  • Good network hardware/NICs

  • Correct speed/duplex settings

    • Auto-negotiation problems

  • Sufficient CPU

  • Sufficient Memory

  • Network Protocol Stack tuning

    • Path MTU discovery, Jumbo Frames, TCP Window Scaling, SACK etc

Ethernet Bandwidth/Duplex mode

  • Ethernet bandwidth: 10, 100, 1000

    • 10 Gigabit Ethernet soon

  • Duplex modes: half-duplex, full-duplex

  • Auto-Negotiation

  • Mismatch Detection:

    • CRC/Alignment errors

    • Late Collisions

Application Tuning

  • Optimize access to host resources

  • Pay attention to Disk I/O issues

  • Pay attention to Bus and Memory issues

  • Know what concurrent activity may be interfering with performance of app

  • Tuning application send/receive buffers

  • Efficient application protocol design

  • Positive end user feedback

    • Subjective perception of performance
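
For the send/receive buffer item above, an application can ask the operating system for larger socket buffers. The sketch below uses the standard SO_SNDBUF and SO_RCVBUF socket options; the helper name and the 256 KB figure are placeholders, and the OS is free to adjust or cap whatever is requested.

```python
import socket


def make_tuned_socket(buffer_bytes: int) -> socket.socket:
    """Create a TCP socket and request larger send/receive buffers."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the sizes before connect() or listen() so they can influence
    # the window that TCP advertises for the connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    return sock


sock = make_tuned_socket(256 * 1024)  # 256 KB is an arbitrary example value
granted_send = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
granted_recv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
sock.close()
print(granted_send, granted_recv)  # what the OS actually granted
```

Reading the values back matters because the granted size is often not the requested size.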

Retransmission Delay

  • Causes

    • Packet loss

      • Bad hardware: NICs, switches, routers, transmission lines

    • Congestion and Queue drops

    • Out of order packet delivery

      • May be considered packet loss from application’s perspective if it can’t re-order packets

    • Untimely delivery (delay)

      • Some apps may consider a packet to be lost if they don’t receive it in a timely fashion

Retransmission Delay (cont)

  • Mitigating retransmission delay

    • Ensure working equipment

      • Some packet loss is unavoidable, though; e.g. most transmission lines have a nonzero BER (Bit Error Rate)

    • Reduce time to recover from packet loss

      • Eg. Highly tuned network stack with more aggressive retransmission and recovery behavior

    • Forward Error Correction (FEC)

      • Very useful for time/delay sensitive applications

      • Also, for cases when it’s expensive to retransmit data

Bit Errors on WAN paths

  • Bit Error Rate (BER) specs for networking interfaces/circuits may not be low enough:

    • 1 bit-error in 10 billion bits

    • Assuming 1500 byte packets

    • Packet error rate: 1 in 1 million

    • 10 hops => 1 in 100,000 packet drop rate
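
The figures above are rounded. A short calculation shows where they come from, under the simplifying assumptions that bit errors are independent, that any corrupted packet is dropped, and that every hop has the same BER. The function names are illustrative.

```python
def packet_error_rate(ber: float, packet_bytes: int) -> float:
    """Probability that at least one bit of the packet is corrupted."""
    bits = packet_bytes * 8
    return 1 - (1 - ber) ** bits


def path_loss_rate(per_hop_rate: float, hops: int) -> float:
    """Probability that a packet is lost on at least one of the hops."""
    return 1 - (1 - per_hop_rate) ** hops


per_hop = packet_error_rate(1e-10, 1500)  # 1 bit error in 10 billion bits
end_to_end = path_loss_rate(per_hop, 10)

print(f"per hop:  about 1 in {1 / per_hop:,.0f} packets")
print(f"10 hops:  about 1 in {1 / end_to_end:,.0f} packets")
```

The exact results are somewhat worse than 1 in 1 million and 1 in 100,000, but of the same order.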

Queueing Delay

  • Long queueing delays could be caused by lame hardware (switches/routers)

    • Head of line blocking

    • Insufficient switching fabric

    • Insufficient horsepower

  • Unfavorable QoS treatment

Queueing Delay (cont)

  • How to reduce

    • Use good network hardware

    • Improved network architecture

      • Reduce number of switching/routing elements on the network path

      • Richer network topology, more interconnections

      • End user may not have influence over architecture

    • Employ preferential queue scheduling algorithms

      • Will discuss later in QoS section of talk

Propagation Delay

  • Restricted by speed of light through transmission medium

    • Can’t be changed, but rarely a concern in the campus/LAN environment

    • A concern in long distance paths (WAN), but

      • Some steps can be taken to increase performance (throughput) on such paths

Other delays and bottlenecks

  • Intermediary systems

    • DNS

    • Routing issues

      • Route availability, asymmetric routing, routing protocol stability and convergence time

    • Firewalls

    • Tunnels (IPSec VPNs, IP in IP tunnels etc)

      • Router hardware poor at encap/decap



  • Influenced by a number of variables:

    • All the delay factors we discussed

    • Window size (for TCP)

    • Bottleneck link capacity

    • End station processing and buffering capacity

What I’m going to talk about next

  • Brief description of TCP/IP protocol

  • How to improve TCP/IP performance

Transport: TCP vs UDP

  • Network apps use 2 main transport protocols:

  • TCP (Transmission Control Protocol)

    • Connection oriented (telephone like service)

    • Reliable: guarantees delivery of data

    • Flow control

    • Examples: Web (HTTP), Email (SMTP, IMAP)

  • UDP (User Datagram Protocol)

    • Connectionless (postal system like)

    • Unreliable: no guarantees of delivery

    • Examples: DNS, various types of streaming media

    When to use TCP or UDP?

    • Many common apps use TCP because it’s convenient

      • TCP handles reliable delivery, retransmissions of lost packets, re-ordering, flow control etc

  • You may want to use UDP if:

    • Delays introduced by ACKs are unacceptable

    • TCP congestion avoidance and flow control measures are unsuitable for your application

    • You want more control of how your data is transported over the network

    • Highly delay/jitter sensitive apps often use UDP

      • Audio-video conferencing etc

    Network Stack Tuning

    • Jumbo Frames

    • Path MTU Discovery

    • TCP Extensions:

      • Window Scaling - RFC 1323

      • Fast Retransmit Fast Recovery

      • Selective Acknowledgements

    Jumbo Frames

    • Increase MTU used at link layer, allowing larger maximum sized frames

    • Increases Network Throughput

    • Fewer larger frames means:

      • Fewer CPU interrupts and less processing overhead for a given data transfer size

  • Some studies have shown Gigabit Ethernet using 9000 byte jumbo frames provided 50% more throughput and used 50% less CPU!

    • (default Ethernet MTU is 1500 bytes)
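
To see why larger frames mean less per-frame work, the sketch below counts the frames needed to carry a transfer at each MTU. It assumes 40 bytes of IPv4 and TCP headers per frame with no options, and the 1 GB transfer size is an arbitrary example.

```python
def frames_needed(transfer_bytes: int, mtu: int, header_bytes: int = 40) -> int:
    """Frames required to carry transfer_bytes of application data."""
    payload = mtu - header_bytes           # data bytes carried per frame
    return -(-transfer_bytes // payload)   # integer ceiling division


standard = frames_needed(1_000_000_000, 1500)
jumbo = frames_needed(1_000_000_000, 9000)

print(standard)  # 684932
print(jumbo)     # 111608
```

Under these assumptions the jumbo-frame transfer needs roughly one sixth as many frames, so the host handles correspondingly fewer interrupts and headers.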

    Jumbo Frames (cont)

    • Pitfalls:

      • Not widely deployed yet

        • Many network devices may not be capable of jumbo frames (they’ll look like bad frames)

    • May cause excessive IP fragmentation

    • BER may have more impact on jumbo frames

      • Eg. A single bit-error can cause a large amount of data to be lost and retransmitted

    • May have negative impact on host processing requirements:

      • More memory for buffering, newer NICs

    Path MTU Discovery

    • MTU (Max Transmission Unit)

      • Max sized frame allowed on the link

  • Path MTU

    • Min MTU on any network in the path between 2 hosts

  • IP Fragmentation & Reassembly

  • Path MTU Discovery

  • MSS (Max Segment Size)

  • What happens without PMTU discovery?

    • Might select wrong MTU and cause fragmentation

    • Suboptimal selection of TCP MSS (536 default?)
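
The relationship between these terms can be sketched in a few lines. The link MTUs are hypothetical, and the MSS calculation assumes 20-byte IPv4 and TCP headers with no options.

```python
def path_mtu(link_mtus):
    """The path MTU is the smallest MTU of any link on the path."""
    return min(link_mtus)


def tcp_mss(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """Largest TCP payload that fits in one unfragmented packet."""
    return mtu - ip_header - tcp_header


links = [9000, 1500, 4352, 1500]  # hypothetical MTUs of the links between two hosts

print(path_mtu(links))           # 1500
print(tcp_mss(path_mtu(links)))  # 1460
print(tcp_mss(576))              # 536, the conservative default
```

A sender that falls back to the 536-byte default on a path that could carry 1460-byte segments needs almost three times as many packets for the same data.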

    Path MTU Discovery (cont)



    IP fragmentation may occur







    Path MTU is 1500


    TCP Sliding Window

    • TCP uses a flow control method called “Sliding Window”

      • Allows sender to send multiple segments before it has to wait for an ACK

      • Results in a faster transfer rate because the sender doesn’t have to wait for an ACK each time a packet is sent

      • Receiver advertises a window size that tells the sender how much data it can send without waiting for ACK

    TCP Sliding Window (cont)

    Slow Start

    • In actuality, TCP starts with a small window and slowly ramps it up (up to rwin)

    • Congestion Window (cwnd)

      • controls startup and limits throughput in the face of congestion

      • cwnd initialized to 1 segment

      • cwnd gets larger after every new ACK

      • cwnd gets smaller when packet loss is detected

    • Slow Start is actually exponential
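
Why "exponential"? Each new ACK grows cwnd by one segment, so a full window of ACKs roughly doubles it every round trip. A minimal model of that, assuming no packet loss and one ACK per segment (the function name is illustrative):

```python
def slow_start(rwin_segments: int, initial_cwnd: int = 1):
    """Yield cwnd (in segments) at the start of each round trip."""
    cwnd = initial_cwnd
    while cwnd < rwin_segments:
        yield cwnd
        # One ACK per segment sent, each adding one segment to cwnd,
        # doubles the window; it never exceeds the receiver's window.
        cwnd = min(cwnd * 2, rwin_segments)
    yield cwnd


print(list(slow_start(64)))  # [1, 2, 4, 8, 16, 32, 64]
```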

    Congestion Avoidance

    • Assumption: packet loss is caused by congestion

    • When congestion occurs, slow down transmission rate

      • Reset cwnd to 1 if timeout

      • Use slowstart until we reach the half way point where congestion occurred.

      • Then use linear increase

        • Increase cwnd by ~ 1 segment/RTT
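
The recovery steps above can be modelled per round trip. This is a simplified sketch, not a real TCP implementation: it counts in whole segments, ignores the receiver window, and assumes no further loss. The function name is illustrative.

```python
def cwnd_after_timeout(cwnd_at_loss: int, round_trips: int):
    """Return cwnd (in segments) for each round trip after a timeout."""
    threshold = max(cwnd_at_loss // 2, 2)  # halfway point where congestion occurred
    cwnd = 1                               # reset to 1 segment on timeout
    history = []
    for _ in range(round_trips):
        history.append(cwnd)
        if cwnd < threshold:
            cwnd = min(cwnd * 2, threshold)  # slow start: exponential increase
        else:
            cwnd += 1                        # congestion avoidance: ~1 segment/RTT
    return history


print(cwnd_after_timeout(32, 8))  # [1, 2, 4, 8, 16, 17, 18, 19]
```

The window gets back to half its old size in a few round trips, but climbing the rest of the way takes one round trip per segment.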

    TCP Behavior

    [Graph labels: slow start: exponential increase; congestion avoidance: linear increase; packet loss, D-ACK; retransmit: slow start again]

    • Recovery after a loss can be very slow on today’s high delay/bandwidth links

      • (graph from Peter O’Neill, NCAR)

    TCP Throughput Acceleration

    (From Phil Dykstra)

    TCP Window Size Tuning

    • TCP performance depends on:

      • Transfer rate (bandwidth)

      • Round trip time

    • BW*Delay product

    • TCP Window should be sized to be at least as large as the BW*Delay product

    BW*Delay Product

    • BW*Delay product measures:

      • Amount of data that would fill the network pipe

      • Buffer space required at sender and receiver to achieve the max possible TCP throughput

      • Amount of unacknowledged data that TCP must handle in order to keep pipe full

    BW*Delay example

    • A path from Penn to Stanford has:

      • Round trip time: 60 ms

      • Bandwidth: 120 Mbps

    • BW * Delay =

      • 60/1000 sec * 120 * 1000000 bits/sec

      • = 7200000 bits = 7200 Kbits

      • = 900 Kbytes

    • So TCP window should be at least 900KB
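
The same arithmetic as a small helper, handy for trying other paths. Integer units (bits per second, milliseconds) are used to avoid rounding surprises; the function name is illustrative.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth*delay product in bytes."""
    bits_in_flight = bandwidth_bps * rtt_ms // 1000  # bits that fit in one RTT
    return bits_in_flight // 8


# Penn to Stanford example: 120 Mbps, 60 ms round trip time
print(bdp_bytes(120_000_000, 60))  # 900000 bytes, i.e. 900 Kbytes
```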

    TCP Window Scaling

    • RFC 1323: TCP Extensions for High Performance

    • Allows scaling of TCP window size beyond 64KB (16 bit window field)

      • Introduces new TCP option

    • Note: In previous example, TCP needs to support Window Scaling to use 900KB window
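
To make the note concrete: without scaling, the 16-bit field caps the window at 65535 bytes, and at most one window can be delivered per round trip. The sketch below shows the resulting throughput ceiling on the 60 ms path and the smallest scale shift that allows a 900 KB window. The helper names are illustrative; the 14-bit maximum shift is the limit set in RFC 1323.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit window field
MAX_SHIFT = 14               # largest shift count RFC 1323 allows


def window_limited_mbps(window_bytes: int, rtt_ms: int) -> float:
    """Throughput ceiling when at most one window is sent per round trip."""
    return window_bytes * 8 * 1000 / rtt_ms / 1e6


def min_window_shift(desired_window_bytes: int) -> int:
    """Smallest scale shift whose maximum window covers the desired size."""
    for shift in range(MAX_SHIFT + 1):
        if MAX_UNSCALED_WINDOW << shift >= desired_window_bytes:
            return shift
    raise ValueError("window too large for RFC 1323 window scaling")


print(f"{window_limited_mbps(MAX_UNSCALED_WINDOW, 60):.1f} Mbps")  # 8.7 Mbps
print(min_window_shift(900_000))                                   # 4
```

So an unscaled window would hold the 120 Mbps path to under 9 Mbps, while a shift of 4 (windows up to about 1 MB) is enough to cover 900 KB.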

    Window Scaling Pitfalls

    • Why not use large windows always?

      • Might consume large memory resources

      • May not be useful for all applications

      • Isn’t useful in the campus/LAN environment

    Fast Retransmit Fast Recovery

    • TCP is required to send an immediate duplicate ACK (D-ACK) when an out-of-order packet is received

    • After 3 D-ACKs, the sending TCP retransmits only one segment

    • The sender then performs congestion avoidance, but not slow start
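
The sender-side trigger can be sketched as a counter over incoming ACK numbers. This is a toy model with a made-up ACK sequence: it covers only the decision to retransmit, not the window adjustments that follow.

```python
def fast_retransmit_points(acks, dup_threshold=3):
    """Return the ACK numbers at which a fast retransmit would fire."""
    retransmits = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                # Third duplicate: resend the one segment starting at `ack`
                # without waiting for the retransmission timer.
                retransmits.append(ack)
        else:
            last_ack, dup_count = ack, 0
    return retransmits


# The receiver keeps asking for byte 3000 while later segments arrive out of order.
acks = [1000, 2000, 3000, 3000, 3000, 3000, 7000]
print(fast_retransmit_points(acks))  # [3000]
```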








    Packet loss, causing D-ACK

    TCP Selective Acks (SACK)

    • RFC 2018

    • Allows TCP to efficiently recover from multiple segment losses within a window

    • Without retransmitting entire window

    Enough about TCP

    Performance depends on App

    • So, understand application’s requirements (high throughput, low latency, low jitter), eg:

      • File Transfer using TCP

        • Needs high throughput

        • Intolerant of packet loss

        • May be more tolerant of delay

      • Interactive Video Conferencing application

        • Tolerant of some loss

        • More intolerant of delay and jitter

    Quality of Service (QoS)

    • A method to selectively allocate scarce network resources

    • A mechanism to offer varying degrees of service to varying classes of traffic

    • Service: delay, jitter, proportion of link bandwidth etc

    Quality of Service (QoS) cont

    • Requires deployed QoS infrastructure

      • Might require

        • Traffic marking capabilities in hosts and network hardware

        • Traffic classification and identification capabilities

        • Multiple traffic queues with different service characteristics

        • Different queue servicing algorithms

        • Mechanisms to specify and enforce QoS policy

        • Signalling mechanisms

    • IEEE 802.1p, IP precedence, IntServ/RSVP, DiffServ, MPLS

    Performance Measurement Tools

    • To measure “real” performance of an app, you need to instrument the app with measurement code!

    • However, independent measurement of some common network perf metrics can be done

    • Two kinds:

      • Active and Passive measurement

    Active Measurement

    • Ping

    • Traceroute

    • Netperf

    • Iperf

    • Pathchar

    • Pathrate

    • Mping

    Passive Measurement

    • OCxMON/PCMon

    • Router/switch stats collected via

      • SNMP

      • Netflow, etc

    • tcpdump, snoop, etherfind

    Some tuning examples

    • Microsoft Windows

      • Newer versions: Win98, Win2K, WinXP support many of the features (window scaling, PMTU discovery, SACK etc)

      • May require registry tweaks to turn some of them on

      • TCPTune: A TCP Stack Tuner for Windows


    More tuning examples

    • MacOS X

      • [need to find out more, who knows?]

      • Supports window scaling:

        • $ sysctl net.inet.tcp.rfc1323

        • net.inet.tcp.rfc1323: 1

      • Socket buffer raising:

        • Kernel tunable kern.ipc.maxsockbuf

      • TCP send/receive buffer tuning:

        • Tunables supported:

          • net.inet.tcp.sendspace

          • net.inet.tcp.recvspace

    More tuning examples

    • Linux

      • In /proc/sys/net/core/ set:

        • rmem_default

        • rmem_max

        • wmem_default

        • wmem_max

      • In /proc/sys/net/ipv4 set:

        • tcp_window_scaling

        • tcp_sack

    More tuning examples

    • Solaris 2.x - 8

      • ndd -set /dev/tcp tcp_max_buf xxx

      • ndd -set /dev/tcp tcp_xmit_hiwat xxx

      • ndd -set /dev/tcp tcp_recv_hiwat xxx

      • ndd -set /dev/ip ip_path_mtu_discovery 1

      • ndd -set /dev/tcp tcp_sack_permitted 2

    Web100 Project


    • Enhance TCP capabilities with:

      • Better (finer grain) kernel instrumentation

      • Automatic controls

    • Availability:

      • Today: Linux (patches for 2.4.16 kernel)

      • Being ported to other operating systems.

    Things you can do (WAN)

    • Make sure app offers adequately sized receive windows and send buffers

    • But don’t run your system out of memory

    • Find out your path RTT with ping

    • Check your path with traceroute

    • Determine bottleneck capacity and available bandwidth on path

    • Make sure your OS uses Path MTU discovery

    • Make sure your OS uses TCP Large Windows, Fast Retransmit, SACK

    Things you can do (Campus)

    • Check your host

      • (80% of the problems)

    • Check your host

      • Bandwidth/Duplex problems

      • Network stack tuning

      • Application tuning

    • Talk to campus networking folks



    • Understand performance requirements of your application

    • What are the issues?

      • Campus/LAN environment

      • WAN environment

    • What can you do to ask for help?

    Any Questions?

    • Deke Kassabian


    • Shumon Huque

