
A Study of Applications for Optical Circuit-Switched Networks


Presentation Transcript


  1. Supported by NSF ITR-0312376, NSF EIN-0335190, and DOE DE-FG02-04ER25640 grants A Study of Applications for Optical Circuit-Switched Networks Xiuduan Fang May 1, 2006

  2. Outline • Introduction • CHEETAH Background • CHEETAH concept and network • CHEETAH end-host software • Analytical Models of GMPLS Networks • Application (App) I: Web Transfer App • App II: Parallel File Transfers • Summary and Conclusions

  3. Introduction • Many optical connection-oriented (CO) testbeds • E.g., CANARIE's CA*net 4, UKLight, and CHEETAH • Primarily designed for e-Science apps • Use Generalized Multiprotocol Label Switching (GMPLS) • Immediate-request calls, with call blocking • Motivation: extend these GMPLS networks to millions of users • Problem Statement • Which apps are well served by GMPLS networks? • How should apps be designed to use GMPLS networks efficiently?

  4. Circuit-switched High-speed End-to-End Transport ArcHitecture (CHEETAH) • Designed as an “add-on” service that leverages the existing services of the packet-switched Internet [Diagram: CHEETAH concept — each end host has two NICs: NIC I connects to the packet-switched Internet through IP routers; NIC II connects to the optical circuit-switched CHEETAH network through an Ethernet-SONET gateway]

  5. CHEETAH Network [Diagram: CHEETAH network topology — Sycamore SN16000 switches at MCNC (NC) and Atlanta (GA) joined by an OC-192 lambda; end hosts zelda1-zelda5, wukong, and mvstu6 attach over direct fibers, VLANs, and MPLS tunnels; labeled elements include the Abilene T640 (Atlanta), an NCSU M20, ORNL (TN), a Catalyst 7600 at MCNC, the Centaur FastIron FESX448, a UVa Catalyst 4948, 1G links, and HOPI Force10 switches at NYC and WASH serving CUNY and UVa hosts]

  6. CHEETAH End-host Software • OCS: Optical Connectivity Service • RD: Routing Decision • RSVP-TE: Resource ReSerVation Protocol with Traffic Engineering extensions • C-TCP: Circuit-TCP

  7. Outline • Introduction • CHEETAH Background • CHEETAH concept and network • CHEETAH end-host software • Analytical Models of GMPLS Networks • Application (App) I: Web Transfer App • App II: Parallel File Transfers • Summary and Conclusions

  8. Analytical Models of GMPLS Networks • Problem: which apps are suitable for GMPLS networks? • App properties: • Per-circuit BW • Call-holding time, τ • Measures of suitability: • Call-blocking probability, Pb • Link utilization, U • Assumptions: • Call arrival rate λ (Poisson process) • Single link • Single class: all apps are of the same type • A link of capacity C; m circuits; per-circuit BW = C/m • m is a measure of high-throughput vs. moderate-throughput apps • For high-throughput apps (e.g., e-Science apps), m is small

  9. BW-Sharing Models [Diagram: N users, each passing through a routing decision (RD) module, share a link L of capacity C] • Two kinds of apps, depending on whether the call-holding time τ depends on the per-circuit rate: • τ is independent of the per-circuit rate • τ is dependent on the per-circuit rate (file transfers: τ = file size / circuit rate) • File size distribution: Pareto, with shape α and scale k • Call blocking is given by the Erlang-B formula (both formulas are written out below) • x*: crossover file size
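
For reference, the two formulas named on this slide, reconstructed in standard notation (symbols as defined above; writing the offered load as A = λτ is my assumption, following standard Erlang-B usage):

```latex
% Erlang-B blocking probability for m circuits at offered load A = \lambda\tau
P_b = \frac{A^{m}/m!}{\sum_{j=0}^{m} A^{j}/j!}

% Pareto file-size distribution with shape \alpha and scale k
\Pr[X > x] = \left(\frac{k}{x}\right)^{\alpha}, \qquad x \ge k
```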

  10. Numerical Results: τ is independent of the per-circuit rate • Two equations, four variables (U, m, Pb, λ) • Fix U and m, then compute Pb and λ (a numerical sketch follows below)
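
A minimal numerical sketch of that procedure (my reconstruction in C, not the author's code): compute Pb from the Erlang-B recursion, use U = A(1 - Pb)/m for the carried utilization, and scan the offered load A until the target U is reached.

```c
#include <stdio.h>

/* Erlang-B blocking probability via the standard recursion:
 * B(0, A) = 1;  B(j, A) = A*B(j-1, A) / (j + A*B(j-1, A)) */
static double erlang_b(int m, double A) {
    double B = 1.0;
    for (int j = 1; j <= m; j++)
        B = A * B / (j + A * B);
    return B;
}

int main(void) {
    int m = 100;             /* number of circuits (per-circuit BW = C/m) */
    double target_U = 0.80;  /* desired link utilization */

    /* Scan offered load A until carried load A*(1-Pb) yields U >= target. */
    for (double A = 1.0; A < 10.0 * m; A += 0.1) {
        double Pb = erlang_b(m, A);
        double U  = A * (1.0 - Pb) / m;
        if (U >= target_U) {
            printf("m=%d: A=%.1f Erlangs gives Pb=%.4f, U=%.4f\n", m, A, Pb, U);
            break;
        }
    }
    return 0;
}
```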

  11. Numerical Results: τ is independent of the per-circuit rate [Plot: Pb vs. load for several m; e.g., m=10 gives Pb=23.62%] Conclusions: to get high U • Small m (~10): high Pb, so book-ahead or call queuing is needed • Large m (~1000): high call arrival rate λ, thus a large number of users N • Intermediate m (~100): a long call-holding time τ is preferred

  12. Numerical Results: τ is dependent on the per-circuit rate (for file transfers, τ = file size / (C/m)) Conclusions: to get high U • Small m (~10): high Pb, so book-ahead or call queuing is needed • As m increases, N does not increase • For m=100, to get U > 80% with Pb < 5%, the crossover file size x* falls between 6 MB and 29 MB

  13. Conclusions for Analysis • Ideal apps require BW on the order of one-hundredth of the link capacity as the per-circuit rate • Apps where τ is independent of the per-circuit rate: a long call-holding time is preferred • Apps where τ is dependent on the per-circuit rate: a short call-holding time is needed

  14. Outline • Introduction • CHEETAH Background • CHEETAH concept and network • CHEETAH end-host software • Analytical Models of GMPLS Networks • Application (App) I: Web Transfer App • App II: Parallel File Transfers • Summary and Conclusions

  15. APP I: Web Transfer App on CHEETAH • Why web transfer? • Web-based apps are ubiquitous • Based on the previous analysis, m=100 is suitable for CHEETAH • Consists of a software package, WebFT • Leverages CGI to deploy without modifying web client or web server software • Integrates with the CHEETAH end-host software APIs to allow use of the CHEETAH network in a mode transparent to users

  16. WebFT Architecture [Diagram: a web browser (e.g., Mozilla) sends a URL to a web server (e.g., Apache); CGI scripts (download.cgi and redirection.cgi) launch the WebFT sender, which transfers data to the WebFT receiver on the client host; both ends use the CHEETAH end-host software daemons and APIs (OCS, RD, RSVP-TE, C-TCP); control messages travel over the Internet, while data transfers use a dedicated circuit]
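
The CGI source isn't in the slides; here is a minimal sketch of what download.cgi might do (hypothetical: the wft_send path, its arguments, and the hand-off mechanism are my assumptions, not the actual WebFT code):

```c
/* download.cgi (hypothetical sketch): invoked by Apache when the client
 * requests a WebFT download; hands the transfer off to the WebFT sender. */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    const char *client = getenv("REMOTE_ADDR");   /* client host, from CGI env */
    const char *file   = getenv("QUERY_STRING");  /* requested file name */

    if (!client || !file) {
        printf("Status: 400 Bad Request\r\n\r\n");
        return 1;
    }

    /* Tell the browser the request was accepted; the data itself moves
     * over the CHEETAH circuit, not over this HTTP response. */
    printf("Content-Type: text/html\r\n\r\n");
    printf("<html><body>Transfer of %s started via CHEETAH.</body></html>\n", file);
    fflush(stdout);

    /* Hand off to the WebFT sender (assumed command-line interface). */
    char cmd[512];
    snprintf(cmd, sizeof cmd, "/usr/local/webft/wft_send %s %s &", client, file);
    return system(cmd);
}
```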

  17. Experimental Testbed for WebFT • zelda3 and wukong: Dell machines running Linux FC3 with ext2/3 file systems and RAID-0 SCSI disks • RTT between them: 24.7 ms on the Internet path and 8.6 ms over the CHEETAH circuit • Apache HTTP Server 2.0 is installed on zelda3 [Diagram: zelda3 (MCNC/NCSU, NC) and wukong (Atlanta, GA) each connect through NIC I to the Internet via IP routers and through NIC II to the CHEETAH network via Sycamore SN16000 switches]

  18. Experimental Results for WebFT • Test parameters: Test.rm, 1.6 GB; circuit rate, 1 Gbps • Test results: throughput 680 Mbps, transfer delay 19 s [Screenshot: the web page used to test WebFT]
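
As a sanity check (my arithmetic, not from the slides), the reported 19 s delay is consistent with the reported 680 Mbps throughput:

```latex
t = \frac{1.6\ \text{GB} \times 8\ \text{bits/byte}}{680\ \text{Mb/s}}
  = \frac{12.8 \times 10^{9}\ \text{bits}}{6.8 \times 10^{8}\ \text{bits/s}}
  \approx 18.8\ \text{s} \approx 19\ \text{s}
```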

  19. Outline • Introduction • CHEETAH Background • CHEETAH concept and network • CHEETAH end-host software • Analytical Models of GMPLS Networks • Application (App) I: Web Transfer App • App II: Parallel File Transfers • Summary and Conclusions

  20. APP II: Parallel File Transfers on CHEETAH • Motivation: E-Science projects need to share large volumes of data (TB or PB) • Goal: achieve multi-Gb/s throughput • Two factors limit throughput • TCP’s congestion-control algorithm • End-host limitations • Solutions to relieve end-host limitations • Single-host solution • Cluster solution, which has two variations • General case: non-split source file • Special case: split source file

  21. General-Case Cluster Solution [Diagram: the source file is split across hosts 1…n at the original source; each host i transfers its portion to a corresponding host i'; the portions are assembled at the original sink]

  22. Software Tools: GridFTP and PVFS2 • GridFTP: a data-transfer protocol for the Grid • Extends FTP with features for partial file transfer, multi-streaming, and striping • We mainly use the GridFTP striped-transfer feature • PVFS: Parallel Virtual File System • An open-source implementation of a parallel file system • Stripes a file across multiple I/O servers, like RAID-0 • A second version: PVFS2

  23. GridFTP Striped Transfer [Diagram: globus-url-copy coordinates a sending and a receiving GridFTP server (front ends); the receiving front end returns a list of host-port pairs in its SPAS response; the sending front end forwards the list via SPOR; sending data nodes S1…Sn then initiate data connections to receiving data nodes R1…Rn; on each side a parallel file system stripes blocks 1…n+1 across the data nodes]
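
Roughly, the control-channel exchange behind this diagram looks like the following (my illustration; the 229 reply code and the h1,h2,h3,h4,p1,p2 address format follow the GridFTP protocol extensions as I recall them, and the addresses are made up):

```
Client -> receiving front end:   SPAS
Receiving front end -> client:   229-Entering Striped Passive Mode
                                  10,0,0,1,195,81     (data node R1, port 50001)
                                  10,0,0,2,195,82     (data node R2, port 50002)
                                 229 End
Client -> sending front end:     SPOR 10,0,0,1,195,81 10,0,0,2,195,82
Sending data nodes then initiate the data connections to the listed receivers.
```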

  24. General-Case Cluster Solution: Design

  25. General-Case Cluster Solution: Implementation • To get high throughput, each data node must be responsible for the data blocks on its own local disks • So PVFS2 and GridFTP must use the same stripe pattern • Problems: • PVFS2 1.0.1 does not provide a utility to inspect the data distribution • Data connections between sending and receiving nodes are formed randomly [Diagram: blocks 1…n+1 striped by PVFS2 across sending data nodes S1…Sn and receiving data nodes R1…Rn]

  26.-27. Random Data Connections [Diagrams: two examples of random matchings between sending data nodes S1…Sn and receiving data nodes R1…Rn under PVFS2; because the connections form randomly, a sending node may connect to a receiver that does not hold the corresponding blocks, forcing blocks across the network and disks unnecessarily]

  28. Implementation - Modifications to PVFS2 • Goal: know a priori how a file is striped in PVFS2 • Used the strace command to trace the system calls made by pvfs2-cp • pvfs2-fs-dump reports the (non-deterministic) I/O-server order of the file distribution • pvfs2-cp ignores the -s option for configuring the stripe size • Modified the PVFS2 code (illustrated below): • For load balancing, PVFS2 stripes files starting at a random server: jitter = (rand() % num_io_servers); • Set jitter = -1 to get a fixed data-distribution order • Changed the default stripe size (originally 64 KB)
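
A simplified sketch of the behavior being modified (my illustration of round-robin striping with a start-server offset; the real PVFS2 code differs):

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative round-robin striping: PVFS2 balances load by starting each
 * file's stripe at a randomly chosen I/O server. Fixing jitter makes the
 * server order deterministic, so GridFTP striping can be matched to it. */
#define NUM_IO_SERVERS 5
#define STRIPE_SIZE    (64 * 1024)   /* default 64 KB; changed in the thesis */

int main(void) {
    /* Original behavior: random start server for load balance. */
    int jitter = rand() % NUM_IO_SERVERS;

    /* Modification: jitter = -1, so every file starts at server 0. */
    jitter = -1;

    long file_size = 10 * STRIPE_SIZE;
    for (long off = 0, i = 0; off < file_size; off += STRIPE_SIZE, i++) {
        int server = (int)((jitter + 1 + i) % NUM_IO_SERVERS);
        printf("stripe %ld (offset %ld) -> I/O server %d\n", i, off, server);
    }
    return 0;
}
```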

  29. Implementation - Modifications to GridFTP • Goal: use a deterministic matching sequence between sending and receiving data nodes • Method: modify the implementation of the SPAS and SPOR commands • SPAS: sort the list of the receiving data nodes' host-port pairs by IP address • SPOR: have the sending data nodes initiate data connections to the receiving data nodes in sequence (a sketch of the sort follows)
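
A minimal sketch of the SPAS-side sort (my reconstruction; struct hostport and the surrounding scaffolding are illustrative, not GridFTP's internal types):

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <arpa/inet.h>

/* One entry from the SPAS response: a receiving data node's address. */
struct hostport {
    char ip[16];          /* dotted-quad IPv4 address */
    unsigned short port;
};

/* Compare by numeric IPv4 address so the resulting order is deterministic. */
static int cmp_by_ip(const void *a, const void *b) {
    const struct hostport *x = a, *y = b;
    uint32_t xa = ntohl(inet_addr(x->ip));
    uint32_t ya = ntohl(inet_addr(y->ip));
    return (xa > ya) - (xa < ya);
}

int main(void) {
    struct hostport nodes[] = {
        { "10.0.0.3", 50003 }, { "10.0.0.1", 50001 }, { "10.0.0.2", 50002 },
    };
    size_t n = sizeof nodes / sizeof nodes[0];

    qsort(nodes, n, sizeof nodes[0], cmp_by_ip);  /* deterministic order */

    /* Sending node i now always connects to receiving node i. */
    for (size_t i = 0; i < n; i++)
        printf("sender %zu -> %s:%hu\n", i, nodes[i].ip, nodes[i].port);
    return 0;
}
```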

  30. Experimental Results • Conducted on a 22-node cluster (sunfire) • The modifications reduced network-and-disk contention • The I/O performance of the PVFS2 implementation itself was poor

  31. Summary and Conclusions • Analytical models of GMPLS networks • Ideal apps require BW on the order of one-hundredth of the link capacity as the per-circuit rate • Application I: Web Transfer Application • Provided deterministic data services to CHEETAH clients over dedicated end-to-end circuits • Required no modifications to web client or web server software, by leveraging CGI • Application II: Parallel File Transfers • Implemented a general-case cluster solution using PVFS2 and the GridFTP striped transfer • Modified PVFS2 and GridFTP code to reduce network-and-disk contention

  32. Publications • M. Veeraraghavan, X. Fang, and X. Zheng, “On the suitability of applications for GMPLS networks,” submitted to IEEE Globecom 2006 • X. Fang, X. Zheng, and M. Veeraraghavan, “Improving web performance through new networking technologies,” IEEE ICIW'06, February 23-25, 2006, Guadeloupe, French Caribbean

  33. Future Work • Analytical models of GMPLS networks • Multi-class models • Multiple links and network-wide models • Application I: Web Transfer Application • Design a partial-CO web transfer to enable non-CHEETAH hosts to use CHEETAH • Connect multiple CO networks to further reduce RTT • Application II: Parallel File Transfers • Test the general-case cluster solution on CHEETAH • Improve PVFS2, or try GPFS, to get higher I/O throughput

  34. A Classification of Networks that Reflects Sharing Modes

  35. The flow chart for the WebFT sender

  36. The WebFT Receiver • Integrates with the CHEETAH end-host software modules in the same way as the WebFT sender • Runs as a daemon in the background on the client host, so no manual intervention is needed (see the sketch below) • Also provides the WebFT sender with the desired circuit rate
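
The slides don't show the daemon code, but the standard Unix daemonization idiom they imply looks like this (a minimal sketch; the receive loop body is a placeholder, not the WebFT source):

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Detach from the controlling terminal so the WebFT receiver can wait
 * for incoming circuit requests without manual intervention. */
static void daemonize(void) {
    pid_t pid = fork();
    if (pid < 0) exit(EXIT_FAILURE);
    if (pid > 0) exit(EXIT_SUCCESS);       /* parent exits; child continues */

    if (setsid() < 0) exit(EXIT_FAILURE);  /* new session, no terminal */
    umask(0);
    if (chdir("/") < 0) exit(EXIT_FAILURE);

    /* Close stdio; a real daemon would reopen these on /dev/null or a log. */
    close(STDIN_FILENO);
    close(STDOUT_FILENO);
    close(STDERR_FILENO);
}

int main(void) {
    daemonize();
    for (;;) {
        /* Placeholder for the receiver's real work: accept a circuit,
         * report the desired rate to the sender, receive data via C-TCP. */
        sleep(1);
    }
}
```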

  37. Experimental Results for WebFT

  38. PVFS2 Architecture

  39. Experimental Configuration • Configuration of PVFS2 I/O servers • The 1st PVFS2: sunfire1 through sunfire5 • The 2nd PVFS2: sunfire10 and sunfire6 through sunfire9 • Configuration of GridFTP servers • Sending front end: sunfire1, with data nodes sunfire1 through sunfire5 • Receiving front end: sunfire10, with data nodes sunfire10 and sunfire6 through sunfire9 • GridFTP striped transfer: globus-url-copy -vb -dbg -stripe ftp://sunfire1:50001/pvfs2/test_1G ftp://sunfire10:50002/pvfs2/test_1G1 2>dbg1.txt

  40. Four Conditions to Avoid Unnecessary Network-and-disk Contention • Know a priori how data are striped in PVFS2 • PVFS2 I/O servers and GridFTP servers run on the same hosts • GridFTP stripes data across data nodes in the same sequence as PVFS2 does across PVFS2 I/O servers • GridFTP and PVFS2 have the same stripe size
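
For the last condition, the matching settings might look like this (hypothetical fragments: the <Distribution> section reflects PVFS2's fs.conf syntax as I recall it, and the -sbs striped-block-size flag of globus-url-copy should be verified against the installed version):

```
# PVFS2 fs.conf: stripe files in 64 KB strips across the I/O servers
<Distribution>
    Name simple_stripe
    Param strip_size
    Value 65536
</Distribution>

# GridFTP: use the same 64 KB block size for the striped transfer
globus-url-copy -stripe -sbs 65536 \
    ftp://sunfire1:50001/pvfs2/test_1G ftp://sunfire10:50002/pvfs2/test_1G1
```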

  41. The Specific Cluster Solution for TSI

  42. Numerical Results for τ dependent on the per-circuit rate Conclusions: • Large m (~1000): increasing m does not increase N
