Revealing Skype Traffic: When Randomness Plays with You

Revealing Skype Traffic: When Randomness Plays with You. D. Bonfiglio (1), M. Mellia (1), M. Meo (1), D. Rossi (2), P. Tofanelli (3). (1) Dipartimento di Elettronica, Politecnico di Torino; (2) ENST Télécom Paris; (3) Motorola Inc. ACM SIGCOMM 2007. Presented by Te-Yuan Huang.



  1. Revealing Skype Traffic: When Randomness Plays with You • D. Bonfiglio (1), M. Mellia (1), M. Meo (1), D. Rossi (2), P. Tofanelli (3) • (1) Dipartimento di Elettronica, Politecnico di Torino; (2) ENST Télécom Paris; (3) Motorola Inc. • ACM SIGCOMM 2007 • Presented by Te-Yuan Huang

  2. Outline • Goal • Contribution • Know More about Skype • Classifiers • Experiments • Conclusions

  3. Outline • Goal • Contribution • Know More about Skype • Classifiers • Experiments • Conclusions

  4. Goal • Identify Skype traffic within • Aggregated traffic • Direct sessions • Either UDP or TCP • The algorithm should • Work in real time • Be reliable • Be able to detect short flows (lasting only a few seconds)

  5. Outline • Goal • Contribution • Know More about Skype • Classifiers • Experiments • Conclusions

  6. Importance of Skype Traffic Identification • Of interest to network operators • Network design & provisioning • Traffic and performance monitoring • Tariff policies • Traffic differentiation

  7. Difference from Related Work • K.T. Chen et al. “Quantifying Skype USI” • Identifies only UDP traffic • Requires the Skype login phase to be monitored • Fails on backbone links • Fails if the Skype login procedure is modified • K. Suh et al. “Characterizing and Detecting Relayed Traffic: A Case Study Using Skype” • Identifies only relayed Skype traffic

  8. Outline • Goal • Contribution • Know More about Skype • Classifiers • Experiments • Conclusions

  9. Let’s get our hands dirty: know more about Skype traffic sources • A Skype message

  10. Skype Parameters • Rate • Codec rate • Delta T • Skype message framing time • The time between two consecutive Skype messages • RF (Redundancy Factor) • The number of past blocks that Skype retransmits

  11. Parameters Change with Network Conditions

  12. Skype Communication Modes • End-to-End (E2E) • A Skype user calls another Skype user • End-to-Out (E2O) • Skype-in/Skype-out • PSTN involved • Only voice data • No video / file transfer / IM

  13. Skype Codec • Codecs • Automatically selected • ISAC • The preferred codec for E2E • G.729 • The preferred codec for E2O

  14. More on Skype Messages • Skype encrypts its messages • TCP: • Reliable transport • Packets are received in the correct sequence (from the application layer's point of view) • The whole content of the message is encrypted • UDP: • Unreliable • Packets may arrive out of order • An application-layer header is needed • to restore the correct order • The header can only be obfuscated • Only part of the message is encrypted

  15. TCP E2E Message • Layout: Frame (from byte 1 onward) • All ciphered

  16. UDP E2E Message • Layout: ID (bytes 1-2), Fun (byte 3), Frame (byte 4 onward) • Identified fields • ID: 16-bit identifier • Randomly selected • Fun: 5-bit field, extracted with the mask 0x8f • States the payload type • 0x02, 0x03, 0x07, 0x0f: signaling message • 0x0d: data message (all 4 types of data) • Not random, but obfuscated (mixed) • Frame: ciphered information
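The Fun-field check described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_udp_e2e` and the returned labels are hypothetical, and Fun is assumed to sit in the third payload byte, as the slide layout suggests.

```python
# Values taken from the slide; everything else is illustrative.
FUN_MASK = 0x8F                        # mask selecting the 5 Fun bits
SIGNALING_FUN = {0x02, 0x03, 0x07, 0x0F}
DATA_FUN = 0x0D


def classify_udp_e2e(payload: bytes) -> str:
    """Label a UDP payload by its masked Fun field (hypothetical helper)."""
    if len(payload) < 3:
        return "too_short"             # no third byte to inspect
    # Bytes 1-2 hold the random 16-bit ID and carry no usable pattern.
    fun = payload[2] & FUN_MASK        # assumption: Fun lives in byte 3
    if fun in SIGNALING_FUN:
        return "signaling"
    if fun == DATA_FUN:
        return "data"
    return "unknown"
```

For example, a payload whose third byte is 0x7D masks to 0x0D and is labeled as a data message. A single matching packet proves little; a match only becomes meaningful when it holds across many packets of a flow.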

  17. E2O Message • Layout: CID (bytes 1-4), Frame (byte 5 onward) • Identified field • CID: 4 bytes • Connection Identifier (CID) of the PSTN gateway • Deterministic • After initial signaling

  18. Outline • Goal • Contribution • Know More about Skype • Classifiers • Experiments • Conclusions

  19. How to Identify Skype Traffic? • Chi-Square Classifier (CSC) • Uses knowledge of the ciphering mechanism • Naïve Bayes Classifier (NBC) • Uses the general characteristics of VoIP traffic • Payload-Based Classifier (PBC) • Looks into the non-ciphered SoM (start of message) • Only used for UDP traffic

  20. Chi-Square Classifier (CSC) • Purpose • To determine which portions of a message are encrypted • Rationale: given a message, • If only the third byte is not random • Probably an E2E Skype flow over UDP • If the first four bytes are deterministic and the rest are ciphered • Probably an E2O Skype flow over UDP • If the whole message is ciphered • Probably a Skype flow over TCP

  21. Chi-Square Classifier (CSC) – Cont. • Chi-Square distr. • Observe the object's output nTOT times • There are n possible outputs • The i-th output is expected to occur Ei times among nTOT, and is observed to occur Oi times • Then $\chi^2 = \sum_{i=1}^{n} (O_i - E_i)^2 / E_i$ follows a Chi-Square distr. with n-1 degrees of freedom

  22. Chi-Square Classifier (CSC) – Cont. • For each flow, take the first G groups of b bits • For each group g, there are 2^b possible outputs • If the content of the flow is random, then Ei for each group is nTOT / 2^b • Layout: the message is split into consecutive b-bit groups numbered 1, 2, 3, …, G, …
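The per-group test can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `chi_square_per_group` is hypothetical, groups are assumed to be read most-significant-bit first, and messages too short to hold G groups are skipped.

```python
def chi_square_per_group(payloads, G=16, b=4):
    """Return one chi-square statistic per b-bit group, tested against
    the uniform distribution (hypothetical helper)."""
    n_values = 1 << b                       # 2^b possible outputs per group
    n_bytes = (G * b + 7) // 8              # bytes needed to cover G groups
    counts = [[0] * n_values for _ in range(G)]
    n_tot = 0
    for payload in payloads:
        if len(payload) < n_bytes:
            continue                        # assumption: skip short messages
        n_tot += 1
        bits = int.from_bytes(payload[:n_bytes], "big")
        for g in range(G):
            # Group 0 holds the most significant b bits of the first byte.
            shift = n_bytes * 8 - (g + 1) * b
            counts[g][(bits >> shift) & (n_values - 1)] += 1
    if n_tot == 0:
        raise ValueError("no payload long enough to test")
    expected = n_tot / n_values             # E_i = nTOT / 2^b
    return [
        sum((observed - expected) ** 2 / expected for observed in row)
        for row in counts
    ]
```

A ciphered group should yield a small statistic (near its 2^b - 1 degrees of freedom), while a deterministic or partially fixed group yields a value that keeps growing as more messages are observed.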

  23. Chi-Square Classifier (CSC) – Cont. • Evaluate the test statistic as: • Define the thresholds by

  24. Chi-Square Classifier (CSC) – Cont. • G = 16 groups of b = 4 bits are used (the first 8 bytes of each message) • E2E over UDP • Block g = 5 or 6 is mixed • The others are random • Classification criteria

  25. Chi-Square Classifier (CSC) – Cont. • E2O over UDP • E2E or E2O over TCP • Not Skype • Otherwise

  26. Chi-Square Classifier (CSC) – Cont. • Test statistic of a deterministic block • Grows linearly with nTOT
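The linear growth can be checked with a short worked derivation. It assumes the block always takes the same value, so one of the 2^b outputs is observed nTOT times and the remaining 2^b - 1 outputs are never observed; the slide's own formula is not preserved in the transcript, so this is a reconstruction from the definition of the statistic.

```latex
E = \frac{n_{TOT}}{2^b}, \qquad
\chi^2 = \frac{(n_{TOT} - E)^2}{E} + (2^b - 1)\,\frac{(0 - E)^2}{E}
       = \left(2^b n_{TOT} - 2 n_{TOT} + E\right) + \left(n_{TOT} - E\right)
       = n_{TOT}\,(2^b - 1)
```

With b = 4 this gives 15 nTOT, for example 1500 after nTOT = 100 messages.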

  27. Chi-Square Classifier (CSC) – Cont. • Mixed block: • One bit is fixed and the others are random • The test statistic increases linearly with nTOT
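A rough worked estimate shows where the linear growth comes from. It is an approximation under stated assumptions: one bit of the b-bit block is fixed, so half of the 2^b outputs never occur, and each of the other half is replaced by its expected count 2E (random fluctuations around that count are ignored).

```latex
E = \frac{n_{TOT}}{2^b}, \qquad
\chi^2 \approx 2^{b-1}\,\frac{(0 - E)^2}{E} + 2^{b-1}\,\frac{(2E - E)^2}{E}
       = 2^{b}\,E = n_{TOT}
```

The statistic therefore grows roughly as nTOT, which is slower than the nTOT (2^b - 1) of a fully deterministic block but still unbounded, unlike the statistic of a random block.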

  28. Chi-Square Classifier (CSC) – Cont.

  29. Chi-Square Classifier (CSC) – Cont. • The Chi-Square test works only if the number of observations is large enough, that is, Ei = nTOT / 2^b >= 5 • With b = 4, this means nTOT >= 5 × 2^4 = 80 • Choose nTOT = 100 • Also, set

  30. Naïve Bayes Classifier • Feature vector x = [xi] • P{C|x}: the probability that the object belongs to class C, given that feature x is observed • P{x|C}: the probability that feature x is observed, given that the object belongs to class C • Bayes rule • P{C|x} = P{x|C}P{C} / P{x}

  31. Naïve Bayes Classifier – cont. • Naïve: features are assumed independent, so P{x|C} is the product of the P{xi|C} • P{x|C} is called the belief
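The independence assumption turns the belief into a product of per-feature likelihoods, which is usually computed as a sum of logarithms. The sketch below is illustrative only: it assumes each feature follows a Gaussian distribution, and the function names, class names, and parameter values are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log-density of a Gaussian with the given mean and standard deviation."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def log_belief(features, params):
    """log P{x|C}: sum of per-feature log-likelihoods (naive independence)."""
    return sum(gaussian_log_pdf(x, m, s) for x, (m, s) in zip(features, params))


def log_posterior_scores(features, classes):
    """Unnormalized log P{C|x} = log P{C} + log P{x|C} for every class.

    `classes` maps a class name to (prior, [(mean, std) per feature]).
    """
    return {
        name: math.log(prior) + log_belief(features, params)
        for name, (prior, params) in classes.items()
    }


# Hypothetical example: feature 0 is message size (bytes),
# feature 1 is average inter-packet gap (seconds).
example_classes = {
    "voip": (0.5, [(60.0, 10.0), (0.030, 0.010)]),
    "bulk": (0.5, [(1400.0, 100.0), (0.001, 0.001)]),
}
scores = log_posterior_scores([62.0, 0.028], example_classes)
best = max(scores, key=scores.get)
```

The common factor P{x} is dropped because it is the same for every class and does not affect which class scores highest.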

  32. NBC – Feature Selection • VoIP • Small message size • Less bursty than data traffic • Features • Message size • Observe a window of w messages at a time: x = [s1, s2, …, sw] • Average Inter-Packet Gap (average-IPG)

  33. NBC – Feature Selection • Belief • How to determine • P{si|C} &

  34. NBC – Feature Characterization • For each codec, the message size is determined by • Rate • Header length • Redundancy factor (RF) • Message framing time (delta T) • The message size can be modeled by a Gaussian distribution

  35. NBC – Feature Characterization • Map each codec to a Gaussian distr. • Model the average-IPG as a Gaussian distr. • For constant bit rate codecs • For variable bit rate codecs

  36. NBC – Derive Beliefs

  37. NBC – Make Decision • Let • Define a threshold Bmin • If B > Bmin • Valid Skype flow • Otherwise • Not Skype flow

  38. Payload-Based Classifier (PBC) • Used as a cross-check for the previous two classifiers • Only useful for UDP traffic • Two parts • Per-flow identification • Per-host identification

  39. PBC - Per-flow Identification • Uses the knowledge about the UDP E2E message • Layout: ID (bytes 1-2), Fun (byte 3), Frame (byte 4 onward) • Fun: 5-bit field, extracted with the mask 0x8f • States the payload type • 0x02, 0x03, 0x07, 0x0f: signaling message • 0x0d: data message (all 4 types of data)

  40. PBC - Per-flow Identification • Terminology • nTOT: the total number of packets in the flow • nsig: the number of Skype signaling messages • nE2E: the number of Skype E2E data/video/chat/voice messages • nE2O: the number of Skype E2O voice messages

  41. PBC - Per-flow Identification • Criteria

  42. PBC - Per-host Identification • Known: a Skype client always uses the same UDP port to send and receive traffic • Before a conversation starts, • Signaling messages are exchanged between the two clients • This makes it possible to identify a Skype client running at a specific IP address and port

  43. PBC - Per-host Identification • Criteria to identify the Skype client IP/port

  44. Experiment • Two data sets • Campus: 95 hours, taken on 2006/5/29 • No P2P traffic is allowed • Most traffic is TCP data flows • ISP: one day, taken on 2006/5/15 • All traffic is allowed • More heterogeneous • Little Skype traffic is expected

  45. Measurement Result

  46. Measurement Result – UDP, Campus

  47. Measurement Result – UDP, ISP

  48. Measurement Result - TCP

  49. Parameter Tuning - Bmin

  50. Parameter Tuning – χ²(Thr)
