Expose VoIP Problems With Wireshark June 15, 2010 Sean Walberg Vantage Media SHARKFEST '10 Stanford University June 14-17, 2010
The Agenda • About VoIP • Capturing VoIP • Analyzing Signaling • Analyzing RTP
The old way Local Loop
The old way Dialtone Off Hook
The old way Dialing Digits
The old way RING – 90v@20Hz
The VoIP way I’m calling x1234
The VoIP way Hey, 1234, you’re being called
The VoIP way Use x.x.x.x:xxxx Use y.y.y.y:yyyy
The VoIP way ZZZZZZ
So there are two parts to VoIP • Signaling: SIP, H.323, MGCP, SCCP, proprietary • Voice (bearer): RTP (G.711, G.722, G.729a, …)
(two and a half, really) • Touch Tones are a problem unto themselves • 3212333222333 3212333322321
Jitter != Delay Loss Jitter Delay (This is from a program called smokeping)
10, 10, 10, 10 Latency, no jitter 10, 11, 12, 11, 9, 10 Latency and jitter
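The two series above can be put into numbers. This is a minimal sketch, assuming the values are one-way latency samples in milliseconds and taking jitter as the average change between consecutive samples; the function name is made up for illustration, and RTP tools use a smoothed estimator rather than this simple average.

```python
from statistics import mean

def latency_and_jitter(samples_ms):
    """Return (mean latency, mean absolute change between consecutive samples)."""
    deltas = [abs(b - a) for a, b in zip(samples_ms, samples_ms[1:])]
    jitter = mean(deltas) if deltas else 0.0
    return mean(samples_ms), jitter

steady = latency_and_jitter([10, 10, 10, 10])          # latency, no jitter
varying = latency_and_jitter([10, 11, 12, 11, 9, 10])  # latency and jitter
print(steady)
print(varying)
```

Both series have roughly the same average latency; only the second one varies from packet to packet.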
Same conversation, different perspectives Here you see inbound latency and jitter, but nothing on the outbound Here you see inbound latency and jitter, but nothing on the outbound
NAT changes the address Src=C Dst=D Src=A Dst=B The address changes within the cloud!
By the way… If the signaling or the voice is encrypted, you won’t be able to decode it. Sorry.
Add a column for DSCP Signaling Tagged RTP Untagged RTP Edit -> Preferences, User Interface -> Columns
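What the DSCP column shows comes from the top six bits of the IP TOS (Traffic Class) byte. A small sketch of that decoding follows; the name table is an assumption about common conventions (EF for voice, CS3 or AF31 for signaling), and your network may mark differently.

```python
# Hypothetical lookup table: common markings only, not a complete list.
DSCP_NAMES = {0: "Best effort", 24: "CS3", 26: "AF31", 46: "EF"}

def dscp_from_tos(tos_byte):
    """DSCP is the upper six bits of the TOS byte; the low two bits are ECN."""
    return (tos_byte & 0xFF) >> 2

def describe(tos_byte):
    dscp = dscp_from_tos(tos_byte)
    return f"DSCP {dscp} ({DSCP_NAMES.get(dscp, 'other')})"

print(describe(0xB8))  # a TOS byte typically seen on tagged RTP
print(describe(0x00))  # an untagged packet
```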
Are you running a proprietary PBX? Edit -> Preferences, Protocols -> RTP (enable "Try to decode RTP outside of conversations")
The Role of Signaling • Indicate to the remote end that a call is coming • Establish the codec to be used for voice • Establish the addresses of the endpoints • Get out of the way • Tear down the connection once it’s done
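As an illustration of "establish the codec" and "establish the addresses", here is a rough sketch that pulls those fields out of an SDP body, assuming SIP signaling. The sample body, its addresses, and the `parse_sdp` helper are all made up for the example; real SDP has more cases (per-media `c=` lines, multiple streams) that this ignores.

```python
SAMPLE_SDP = """\
v=0
o=alice 2890844526 2890844526 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0 8 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
"""

def parse_sdp(body):
    """Extract the media address, audio port, and offered codecs from an SDP body."""
    info = {"address": None, "port": None, "payload_types": [], "codecs": {}}
    for line in body.splitlines():
        line = line.strip()
        if line.startswith("c="):
            # c=<nettype> <addrtype> <address>
            info["address"] = line.split()[-1]
        elif line.startswith("m=audio"):
            # m=audio <port> <proto> <payload types...>
            fields = line.split()
            info["port"] = int(fields[1])
            info["payload_types"] = [int(pt) for pt in fields[3:]]
        elif line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            info["codecs"][int(pt)] = encoding
    return info

offer = parse_sdp(SAMPLE_SDP)
print(f"Send RTP to {offer['address']}:{offer['port']}")
print(offer["codecs"])
```

The address and port found here are the ones to filter on when looking for the matching RTP stream in the capture.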
Back to Loss, Delay, and Jitter (for ordinary data traffic) • Jitter is usually a non-issue • Delay, within reason, is OK (clustering and other specific applications notwithstanding) • Loss isn't great, but it is recoverable: TCP retransmits at layer 4; UDP applications retry at layer 7
The properties of RTP • RTP carries the real-time voice that would normally travel over a wire • 4 kHz voice bandwidth requires an 8 kHz sampling rate (Nyquist) • 8 bits/sample * 8 kHz = 64,000 bps (DS0) • A codec (G.711 µ-law/A-law, G.729, G.726, etc.) encodes the samples • Most codecs send 20 ms of voice per packet = 50 pps • Even with compression, the packet rate stays fairly consistent; only the packet size changes
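The arithmetic on this slide can be worked through for any codec bit rate. The sketch below assumes 20 ms packets and 40 bytes of IPv4 + UDP + RTP headers (20 + 8 + 12) with no layer 2 overhead; the function name is made up for illustration.

```python
def rtp_stream(codec_bps, packet_ms=20, header_bytes=40):
    """Return (packets per second, payload bytes per packet, IP-level bits per second)."""
    pps = 1000 // packet_ms
    payload_bytes = codec_bps * packet_ms // 8000
    ip_bps = (payload_bytes + header_bytes) * 8 * pps
    return pps, payload_bytes, ip_bps

g711 = rtp_stream(64_000)  # G.711: 8 bits/sample * 8 kHz
g729 = rtp_stream(8_000)   # G.729: 8 kbps
print(g711)
print(g729)
```

Both codecs come out at 50 pps; the compressed one just has a smaller payload, so the headers make up a larger share of each packet.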
DTMF • Compressing DTMF is bad: lossy voice codecs can distort the tones • There are many different ways to carry the digits out of band; look for them in traces (see demo)
Three factors that affect voice quality Latency <= 150ms (one way) Jitter <= 20ms Packet loss <= 0.1%
Latency <= 150ms (one way) • Sources: jitter buffer, transcoding delay (at each end), path delay, serialization delay • What it sounds like: "Hi, how are you?" "Hello?" "Oops, sorry, go ahead" "Fine, I oh hello, go ahead"
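The delay sources add up, so it helps to total them against the 150 ms target. In this sketch only the serialization formula (bits divided by link speed) is general; the packet size, link speed, and other component values are made-up numbers for illustration.

```python
def serialization_delay_ms(packet_bytes, link_bps):
    """Time to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 * 1000 / link_bps

# Hypothetical one-way budget, in milliseconds.
budget_ms = {
    "jitter_buffer": 40.0,
    "transcoding": 10.0,
    "path": 45.0,
    "serialization": serialization_delay_ms(200, 128_000),  # 200-byte packet, 128 kbps link
}

total_ms = sum(budget_ms.values())
print(f"one-way latency: {total_ms} ms, within target: {total_ms <= 150}")
```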
Packet Loss <= 0.1% Hi Bo *POP* How *POP*e you? Hi Bo How you?
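Loss shows up as gaps in the RTP sequence numbers. A minimal sketch of estimating it follows, assuming packets are listed in arrival order with no reordering or duplicates; the function name is hypothetical, and the sequence number is treated as a 16-bit counter that wraps.

```python
def loss_percent(seq_numbers):
    """Estimate loss from RTP sequence numbers, allowing for 16-bit wraparound."""
    if len(seq_numbers) < 2:
        return 0.0
    expected = 1
    for prev, cur in zip(seq_numbers, seq_numbers[1:]):
        # The modulo handles the wrap from 65535 back to 0.
        expected += (cur - prev) % 65536
    lost = expected - len(seq_numbers)
    return 100.0 * lost / expected

print(loss_percent([65533, 65534, 0, 1]))  # packet 65535 never arrived
print(loss_percent([1, 2, 3]))
```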
Jitter <= 20ms Better late than never? No. May as well be lost.
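Why a late packet may as well be lost: the receiver plays packets out on a fixed schedule, and a packet that misses its slot has nothing left to do. This is a simplified sketch, assuming a fixed (non-adaptive) jitter buffer, 20 ms packets, and that every packet eventually arrives; the function name and arrival times are made up for illustration.

```python
def late_packets(arrival_ms, packet_ms=20, buffer_ms=20):
    """Return indexes of packets that arrive after their playout slot.

    arrival_ms[i] is the arrival time of packet i. Playout starts buffer_ms
    after the first packet arrives, then advances one packet every packet_ms.
    """
    if not arrival_ms:
        return []
    playout_start = arrival_ms[0] + buffer_ms
    return [i for i, t in enumerate(arrival_ms)
            if t > playout_start + i * packet_ms]

arrivals = [0, 21, 45, 60, 105, 110]
print(late_packets(arrivals))
```

Packet 4 was due for playout at 100 ms but arrived at 105 ms, so the listener hears a gap even though nothing was dropped on the network.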