DYSWIS

DYSWIS: Do You See What I See? Kyung-Hwa Kim, Henning Schulzrinne. 12/09/2008. Internet Real-Time Lab, Columbia University. Outline: Overview, Fault Detection, Peer Selection, Probing, Problem, Implementation, Demo.

Presentation Transcript


  1. DYSWIS • Kyung-Hwa Kim, Henning Schulzrinne • 12/09/2008 • Internet Real-Time Lab, Columbia University

  2. Do You See What I See? [Figure: three end users connected through the Internet; caption: "Do you see what I see?"]

  3. Outline • Overview • Fault Detection • Peer Selection • Probing • Problem • Implementation • Demo

  4. Overview
     • DYSWIS: Do You See What I See
       • Distributed network fault detection and analysis system
     • Motivation
       • A particular network fault can have different causes
       • Diagnosing the fault needs a different 'view' from other sources
       • End-to-end diagnosis
       • Need for a user-friendly interface
     • Current problem
       • Centralized management schemes
       • Complexity in the user network and devices
       • Failure to solve the service quality problem
     • Approach
       • Collaborate with other end users
       • P2P based
       • Remote probing

  5. For Quick Understanding • DHT for looking up remote nodes • XML-RPC for remote function calls • [Figure: nodes across the Internet, each running Detect, Diagnosis, and Probe modules]

  6. Fault Detection
     • Automatic fault detection
       • Capture raw network packets
       • Analyze network packets and protocols
     • Raw packet capturing
       • Check error responses
       • Check timeouts
       • Check TCP congestion
         • Monitor TCP sequence numbers
     • Define fault cases
       • Automatic vs. manual
       • FSM approach
         • Pre-defined
         • Learned
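The FSM approach with pre-defined fault cases can be illustrated with a small example. This is only a minimal sketch under stated assumptions: the state names, event names, transition table, and the 2-second timeout are hypothetical and are not taken from the slides or the cited paper.

```python
from dataclasses import dataclass, field

# Hypothetical pre-defined transition table: (state, event) -> next state.
TRANSITIONS = {
    ("IDLE", "request"): "WAITING",
    ("WAITING", "response_ok"): "DONE",
    ("WAITING", "response_error"): "FAULT",
}


@dataclass
class ProtocolFSM:
    """Tracks one request/response exchange and records detected faults."""
    timeout: float = 2.0            # assumed timeout, in seconds
    state: str = "IDLE"
    last_event_time: float = 0.0
    faults: list = field(default_factory=list)

    def feed(self, event: str, timestamp: float) -> str:
        """Advance the machine with one observed packet event."""
        # A reply that arrives later than the timeout counts as a fault.
        if self.state == "WAITING" and timestamp - self.last_event_time > self.timeout:
            self.state = "FAULT"
            self.faults.append("timeout")
            return self.state
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            # No transition defined: the packet violates the expected protocol flow.
            self.faults.append(f"unexpected {event} in {self.state}")
            next_state = "FAULT"
        elif next_state == "FAULT":
            self.faults.append(event)
        self.state = next_state
        self.last_event_time = timestamp
        return self.state
```

Feeding the machine `("request", 0.0)` followed by `("response_ok", 5.0)` ends in the `FAULT` state with a recorded `timeout`, which corresponds to the "check timeout" case on the slide.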

  7. FSM - Approach * Automatic Protocol Failure Detection Using Finite State Machines. Zhifeng Wang, Kai X. Miao, Tao Zuo, Henning Schulzrinne, Kyung Hwa Kim, Vishal Kumar Singh

  8. FSM - Approach * Automatic Protocol Failure Detection Using Finite State Machines. Zhifeng Wang, Kai X. Miao, Tao Zuo, Henning Schulzrinne, Kyung Hwa Kim, Vishal Kumar Singh

  9. Peer Selection
     • DHT or database
     • Register myself in the DHT network
       • AS number, subnet, first hop, AP
     • Search for probing nodes
       • Inner nodes and outer nodes
     [Figure: node A asks the DHT, "I need some nodes that can help me. Who is in the same subnet as me?" The DHT answers, "You can contact B. Its IP address is 218.59.21.16 and port number is 9090."]

  10. Peer Selection - DHT (key, value)
      [Figure: nodes A and B and the DHT. A asks: "I need some nodes that can help me. Who is in the same subnet as me?"]

      First (key, value) pair:
        <key>
          <type>node</type>
          <asn>14</asn>
          <subnet>128.59.0.0/16</subnet>
        </key>
        <value>
          <type>node</type>
          <ip>128.59.21.15</ip>
          <port>9090</port>
          <protocol>udp</protocol>
        </value>

      Second (key, value) pair:
        <key>
          <type>node</type>
          <asn>9880</asn>
          <subnet>45.45.45.0/24</subnet>
          <firewall>no</firewall>
          <nat>no</nat>
        </key>
        <value>
          <type>node</type>
          <ip>128.59.21.15</ip>
          <hostname>kkh.cs.columbia.edu</hostname>
          <port>9090</port>
          <protocol>tcp</protocol>
        </value>
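The register-and-search pattern behind these (key, value) pairs can be sketched in a few lines. This is a minimal sketch that assumes an in-memory dictionary as a stand-in for the real DHT; the class name `PeerDirectory` and its methods are hypothetical, and only the fields (AS number, subnet, IP, port, protocol) come from the slide.

```python
import ipaddress
from collections import defaultdict


class PeerDirectory:
    """In-memory stand-in for the DHT: maps (asn, subnet) keys to node records."""

    def __init__(self):
        self._table = defaultdict(list)

    @staticmethod
    def _key(asn, subnet):
        # Normalize the subnet so equivalent spellings map to the same key.
        return (asn, str(ipaddress.ip_network(subnet)))

    def register(self, asn, subnet, ip, port, protocol="udp"):
        """Publish a node under its AS number and subnet."""
        record = {"ip": ip, "port": port, "protocol": protocol}
        self._table[self._key(asn, subnet)].append(record)

    def same_subnet(self, asn, subnet):
        """Inner nodes: peers registered under the same key as the caller."""
        return list(self._table.get(self._key(asn, subnet), []))

    def other_subnets(self, asn, subnet):
        """Outer nodes: peers registered under any other key."""
        own = self._key(asn, subnet)
        return [node for key, nodes in self._table.items()
                if key != own for node in nodes]
```

A node would call `register` once at startup and later call `same_subnet` and `other_subnets` to pick inner and outer probing peers. A real DHT would replace the dictionary with distributed put and get operations.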

  11. Remote Probing
      • Distributing modules
        • Detection and probing modules need to be added and updated
        • Dynamic class loading
        • Dynamic module distribution
        • Modules can be created and updated separately
      • XML-RPC
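To make the XML-RPC part concrete, the example below exposes one probe as a remote function. It is a minimal sketch using Python's standard `xmlrpc` modules, not the project's actual implementation; the function name `probe_tcp`, the helper `start_probe_server`, and the loopback address are assumptions for illustration, and dynamic module loading is not covered.

```python
import socket
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def probe_tcp(host, port, timeout=2.0):
    """Probe run on the helper node: can a TCP connection to host:port be opened?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def start_probe_server(bind_host="127.0.0.1"):
    """Start an XML-RPC server on a free port that exposes probe_tcp to peers."""
    server = SimpleXMLRPCServer((bind_host, 0), logRequests=False)
    server.register_function(probe_tcp, "probe_tcp")
    # Serve requests in a background thread so the caller is not blocked.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask_peer_to_probe(peer_host, peer_port, target_host, target_port):
    """Run on the requesting node: ask a peer to try the connection for us."""
    proxy = ServerProxy(f"http://{peer_host}:{peer_port}")
    return proxy.probe_tcp(target_host, target_port)
```

The requesting node only needs the peer's IP address and port, which is exactly the information stored in the DHT value.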

  12. Probing Scenarios
      • HTTP
        • Causes: dead web server, page moved, low bandwidth, …
        • Check the DNS query
        • Check the TCP connection
        • Ask another node to try the same query
        • Check TCP congestion
        • …
      • DNS
        • Causes: dead DNS server, resolution failed, UDP not working, …
        • Check another DNS server
        • Ask another node to try to connect to my DNS server
        • Ask another node to query the same host on another DNS server
      • SIP/RTP
        • Causes: NAT, DNS, proxy server, authentication
        • Proxy connectivity test
        • Ask another node to try the same action
        • …
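One way to combine the three DNS probes into a diagnosis is sketched below. The decision rules and the result labels are assumptions made for illustration; the slides list the probes and the possible causes but do not say how the results are combined.

```python
def classify_dns_fault(other_server_ok, peer_reaches_my_server, peer_resolves_elsewhere):
    """Classify a DNS failure from three probe results (all booleans).

    Called after a query to my own DNS server has failed.
      other_server_ok         -- I can resolve the name through another DNS server
      peer_reaches_my_server  -- another node can connect to my DNS server
      peer_resolves_elsewhere -- another node resolves the same host on another server
    """
    name_is_resolvable = other_server_ok or peer_resolves_elsewhere
    if not name_is_resolvable:
        # Nobody can resolve the name, so the name itself is the likely problem.
        return "resolution_failed"
    if peer_reaches_my_server:
        # The server works for others, so suspect my own path (e.g. UDP blocked).
        return "local_path"
    # The name resolves elsewhere and nobody reaches my server.
    return "dns_server_down"
```

The three return values map loosely onto the causes named on the slide: resolution failed, UDP not working, and dead DNS server.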

  13. Probing Scenarios
      • Connection problem
        • Causes: dead server, firewall, wrong port number, …
        • Traceroute: check routers
        • Ask another node to try to connect to the server
        • Ask another node to check my port
        • …
      • TCP congestion
        • Causes: queuing delay, dead routers
        • Traceroute, ping
        • Try to find the bottleneck
        • …

  14. Probing Scenarios [Figure: nodes A and B]

  15. Data Gathering
      • Problem
        • We have resources: other machines
        • But how do we use them efficiently?
        • We need real data
      • Approach
        • Collecting data
        • Collecting scenarios
        • Implementing a prototype

  16. Implementation • Architecture http://wiki.cs.columbia.edu/display/res/DYSWIS

  17. For details, visit: http://wiki.cs.columbia.edu/display/res/DYSWIS

  18. Demo

  19. Future work
      • Implementation
        • http://www.cs.columbia.edu/~khkim/project/dyswis
        • Coming soon: Mac & Linux
      • Testbed: PlanetLab
      • More mature research on analysis
      • Support for real-time protocols
      • How to find solutions for end users

  20. Backup
      • Check the local network.
      • Select two nodes: one from the same subnet and one from an outer subnet.
      • Let both nodes try to connect to the server.
      • If both nodes fail to connect to the server, log the fault as ‘server failure’.
      • If only the internal node fails, run traceroute to check where the packets are blocked.
      • If the internal node succeeds, the problem may be caused by a local firewall or something else.
      • Check the incoming/outgoing port: let other nodes open the same port and try to connect to it. Check whether the remote node received the packet, and whether the ACK from the remote node came back.
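The decision steps on this slide can be written as a short function. This is a minimal sketch: the function name, the probe callables, and the returned strings are hypothetical, and the probes are passed in as parameters so that any remote-call mechanism can supply them.

```python
from typing import Callable

# A probe takes (server, port) and reports whether the connection succeeded.
Probe = Callable[[str, int], bool]


def diagnose_connection(server: str, port: int,
                        inner_probe: Probe, outer_probe: Probe) -> str:
    """Apply the slide's rules to the results from one inner and one outer node."""
    inner_ok = inner_probe(server, port)   # node in the same subnet
    outer_ok = outer_probe(server, port)   # node in an outer subnet

    if not inner_ok and not outer_ok:
        # Nobody can reach the server.
        return "server failure"
    if not inner_ok:
        # Only the internal node failed: look for where packets are blocked.
        return "run traceroute"
    # The internal node succeeded, so the fault is likely local to this host.
    return "possible local firewall or other local cause"
```

With stub probes, `diagnose_connection("example.org", 80, lambda s, p: False, lambda s, p: False)` returns `"server failure"`. The port check in the last bullet would be a separate follow-up step.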
