
Emulab and its lessons and value for A Distributed Testbed

Presentation Transcript


  1. Emulab and its lessons and value for A Distributed Testbed. Jay Lepreau, University of Utah, March 18, 2002

  2. What? • A configurable Internet emulator in a room • Today: 168+160 nodes, 1646 cables, 4x BFS (switch) • Virtualizable topology, links, software • Bare hardware with lots of tools: management software • An instrument for experimental CS research • Universally available to any remote experimenter • Simple to use

  3. Points • Programmable, automated mgmt, complete virtualization: • Qualitatively new environment • Most of it will work in wide area

  4. New Stuff • Integrated event system • Underlying pub/sub system • Integrated into ‘ns’ (statically scheduled) • Start/stop programs • Replayable • Dynamic events • User-accessible • Traffic generation • Automatic, from ns script • New generators: • TG (tcp, udp) • ‘nse’ with udp, tcp, ftp, telnet
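A minimal sketch of how the automatic traffic generation and statically scheduled events might appear in an Emulab ns script; the node, agent, and timing values here are illustrative and not taken from the talk:

    # Fragment of an ns experiment file (illustrative values).
    set ns [new Simulator]
    source tb_compat.tcl              ;# Emulab's tb-* extensions

    set node0 [$ns node]
    set node1 [$ns node]
    $ns duplex-link $node0 $node1 100Mb 10ms DropTail

    # UDP constant-bit-rate traffic, driven by the testbed's generators.
    set udp0 [new Agent/UDP]
    $ns attach-agent $node0 $udp0
    set null0 [new Agent/Null]
    $ns attach-agent $node1 $null0
    $ns connect $udp0 $null0
    set cbr0 [new Application/Traffic/CBR]
    $cbr0 attach-agent $udp0

    # Statically scheduled, replayable events handled by the event system.
    $ns at 10.0 "$cbr0 start"
    $ns at 60.0 "$cbr0 stop"

    $ns run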

  5. New Stuff (cont’d) • 4 node types: • Real, running in the local rack, controlled env. • Real, running ‘nse’ • [Simulated] • [Real, in wide-area] • Link configuration and monitoring • Latency, bw, plr, RED, queue size • Link monitoring and capture • GUI network config applet • Full-day SIGCOMM tutorial Aug’02
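As a sketch of how those link parameters are expressed, shaping characteristics can be declared directly on an ns link; the bandwidth, latency, loss, and queue values below are illustrative:

    # Link shaping in an ns experiment file (illustrative values).
    set ns [new Simulator]
    source tb_compat.tcl

    set n0 [$ns node]
    set n1 [$ns node]
    set link0 [$ns duplex-link $n0 $n1 1.5Mb 30ms RED]   ;# bandwidth, latency, RED queueing
    tb-set-link-loss $link0 0.02                          ;# 2% packet loss rate

    $ns run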

  6. [Architecture diagram: 168 PCs and 160 Sharks wired through the switching fabric as a "programmable patch panel", with a separate control switch/router, web/DB/SNMP management servers, power controllers, and serial console lines reachable by users over the Internet.]

  7. Fundamental Leverage: • Extremely Configurable • Easy to Use • Power • Performance • Virtualization

  8. Key Design Aspects • Allow experimenter complete control • Configurable link bandwidth, latency, and loss rates, via transparently interposed “traffic shaping” nodes that provide WAN emulation • … but provide fast tools for common cases • OS’s, state mgmt tools, IP, batch, ... • Disk loading – 6GB disk image FreeBSD+Linux • Unicast tool: 88 seconds to load • Multicast tool: 40 nodes simultaneously in < 5 minutes • Virtualization • of all experimenter-visible resources • node names, network interface names, network addrs • Allows swapin/swapout, easily scriptable

  9. Key Design Aspects (cont’d) • Flexible, extensible, powerful allocation algorithm • Matches desired “virtual” topology to currently available physical resources • Persistent state maintenance: • none on nodes, all in database • work from known state at boot time • Familiar, powerful, extensible configuration language: ns • Separate, isolated control network
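For a concrete (and hypothetical) illustration, the virtual topology handed to the mapper can carry per-node constraints, such as a hardware type or OS image, that must be satisfied from the currently free physical nodes; the type and image names below are made up for the example:

    # Constraints in the virtual topology that the resource mapper must honor.
    set ns [new Simulator]
    source tb_compat.tcl

    set server [$ns node]
    tb-set-hardware $server pc850        ;# request a particular physical node type (illustrative)
    tb-set-node-os  $server FBSD-STD     ;# OS image to load at swap-in (illustrative)

    set client [$ns node]
    $ns duplex-link $server $client 100Mb 2ms DropTail

    $ns run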

  10. Lessons for wide area testbed • Central control: at this scale (1000s) it’s easy • Database! • Control node for each site: great benefits, cheap marginal cost • Trusted, firewall, local disk cache, power control, console line • Ease of use is dominant driver

  11. Lessons… • Generalized resource alloc/mapping algorithm is great (e.g., vs. Grid) • Get it going quickly, keep it going while adding new stuff • Like a startup • Use feedback and demand • 2.5 years in • Simple authorization model • Most of our model and code will work in wide-area

  12. Lessons… • Freedom for users is freedom for the management software and people • “You’ve got root, use it.” • Over-provision • FreeBSD Jail, or Eclipse/BSD, or VMware, or …

  13. Testing is tricky • Have real hardware that we can’t virtualize • Test suite part of build • Clone DB works somewhat… • 8-node minibed • Nightly regression testing • Schema evolution script/diff/check • Developers use/test 3 diff. browsers

  14. Code Base Today (lines of code) • 24,100 Web front end • 23,900 Back end • 2000 ns front end • 4200 Resource mapping • 4900 Disk image compression/casting/load • 8400 Scripts/daemons from nodes to DB • 5000 Event system • 6200 Remote console interaction/logging • 3300 Regression testing harness and tests • 700 Node health monitoring • 3700 Documentation of internals

  15. More stats • 21 “programs” • 318 “scripts” (including 90 php scripts, 71 small boot-time scripts) • 35% Perl • 32% C • 19% php • 12% html, Java, tcl, other

  16. The Database Today • Started with ~18 tables • 54 tables, 413 columns • General categories • Physical world: 11 tables, 65 cols • Virtual world: 7 tables, 83 cols • Operational state: 22 tables, 180 cols • Admin data: 14 tables, 85 cols • Note how much operational state there is → shows how much work needs to be done

  17. Testbed Users • 30 active projects • more registered • 25 External • About 40/30/30% dist sys / active nets / traditional networking • ~110 users • 990 “experiments” in last 8 months • 7.5/day recently • 40% testbed development

  18. More Sites • More Emulabs under construction: • Kentucky • UMass • Duke, CMU, Cornell, Stuttgart • Others stated intent: MIT, WUSTL, Princeton, HP Labs, Intel/UCB, Mt. Holyoke, …

  19. Ongoing and Future Work • Federation: heterogeneous sites, resource allocation • Wireless nodes, mobile nodes • IXP1200 nodes, tools, code fragments • Routers, high-capacity shapers • Simulation/emulation transparency • Event system • Scheduling system • Topology generation tools and GUI • Data capture, logging, visualization tools • Microsoft OSs, high-speed links, more nodes!

  20. A Global-scale Testbed • Federation key • Bottom-up “organic” growth • Local autonomy and priority • Existing hardware resources • Provides diverse hardware • PCs • Wireless, mobile • Real routers, switches (Wisconsin, …) • Network processors (IXP’s) • Research switches (WUSTL) • But, top-down is much easier: a good start

  21. NSF ITR Proposal (Nov 01) • Global-scale testbed • Utah primary • Research emphasis: software component for heterogeneity; resource allocation/mapping • Collaborators: • Brown, co-PI (resource allocation) • MIT (RON overlay, wireless) • Duke (ModelNet muxing, early adopter) • Mt. Holyoke (education)

  22. Types of Sites • High-end facilities • Generic clusters • Generic labs • “Virtual machines” • Internet2 links between some sites

  23. Result… • Loosely coupled distributed system • Controlled isolation • “Internet Petri Dish”

  24. New Stuff: Extending to Wireless and Mobile • Problems with existing approaches: • Same problems as wired domain • But worse (simulation scaling, ...) • And more (no models for new technologies, ...)

  25. Wireless Virtual to Physical Mapping

  26. Available for universities, labs, and companies, for research and teaching, at: www.emulab.net

  27. A Few Research Issues and Challenges • Network management of unknown and untrusted entities • Security (root!) • Scheduling of experiments • Calibration, validation, and scaling • Artifact detection and control • NP-hard virtual --> physical mapping problem • Providing a reasonable user interface • ….

  28. How To Use It ... • Submit ns script (or use the GUI) via web form • Behind the scenes: • Generates config from script & stores in DB • Maps specified virtual topology to physical nodes • Allocates resources • Provides user accounts for node access • Assigns IP addresses and host names • Configures VLANs • Loads disks, reboots nodes, configures OSs • Starts event system, traffic generators, link monitoring/control • Yet more odds and ends ... • User does his/her experiment • [Reports results if batch] • Takes ~3 min to set up 25 nodes, 5 secs/node
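To make the flow concrete, below is a sketch of the kind of complete, minimal ns file a user might submit through the web form; the image, type, and parameter names are illustrative, and everything after submission (mapping, allocation, VLANs, disk loading, boot) is handled by the steps listed above:

    # A minimal complete Emulab experiment file (illustrative names throughout).
    set ns [new Simulator]
    source tb_compat.tcl

    set nodeA [$ns node]
    set nodeB [$ns node]
    tb-set-node-os $nodeA FBSD-STD       ;# illustrative OS image names
    tb-set-node-os $nodeB RHL-STD

    set link0 [$ns duplex-link $nodeA $nodeB 10Mb 20ms DropTail]
    tb-set-link-loss $link0 0.01

    $ns rtproto Static
    $ns run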

  29. An “Experiment” • Emulab’s central operational entity • Directly generated by an ns script, • … then represented entirely by database state • Steps: Web, compile ns script, map, allocate, provide access, assign IP addrs, host names, configure VLANs, load disks, reboot, configure OSs, run, report
