Reducing Allocation Errors in Network Testbeds

Jelena Mirkovic, Hao Shi, Alefiya Hussain
USC/ISI
GSS 2012
IMC 2012, Boston, USA
National Science Foundation Grant No. 1049758

Overview
• Scenario
• What is a testbed and how do people use it?
• Problem Statement
• Emulab-based practice
• Allocation Errors
• A large portion can be avoided
• Improvement
• Deterministic-search-based method

Scenario – a use case
• How people launch multiple experiment instances in a testbed

Scenario – features of resources
• Limited quantities (until Jan 2011)
• Heterogeneity: no resource type has an absolute advantage
• Network Testbed Mapping Problem
  • How to allocate resources efficiently?

Problem Statement – Illustration
• Network Testbed Mapping Problem

Problem Statement – Goals/Challenges
• Economize inter-switch bandwidth
• Accommodate heterogeneous nodes
• Maximize possibility for future mappings
• Generate one solution in a timely fashion

Current Practice – Emulab’s Algorithm (assign)
• Simulated annealing
• A heuristic that performs a cost-function-guided exploration
• Starts from a random solution and scores it using a cost function
• Perturbs the solution using a generation function to find the next one
  • If better: accept
  • If worse: accept with a small probability controlled by the temperature
• The cooling schedule converges the algorithm to a single “best” solution
• No guarantee that the best solution will be found
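The loop described on this slide can be sketched in a few lines. This is only a toy illustration of simulated annealing, not Emulab's `assign`: the cost function (total bandwidth of virtual links that cross switches), the swap-based generation function, the geometric cooling schedule, and every name and parameter value below are assumptions made for the example.

```python
import math
import random

def inter_switch_cost(mapping, links, switch_of):
    """Toy cost: total bandwidth of virtual links whose endpoints sit on different switches."""
    return sum(bw for a, b, bw in links
               if switch_of[mapping[a]] != switch_of[mapping[b]])

def anneal(vnodes, links, pnodes, switch_of, seed=0,
           t_start=10.0, t_end=0.01, alpha=0.95, steps_per_temp=50):
    """Hypothetical annealer; all parameter defaults are arbitrary illustration values."""
    rng = random.Random(seed)
    # Start from a random one-to-one mapping and score it.
    current = dict(zip(vnodes, rng.sample(pnodes, len(vnodes))))
    cost = inter_switch_cost(current, links, switch_of)
    best, best_cost = dict(current), cost
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            # Generation function: move one virtual node to another physical
            # node, swapping with whoever currently occupies it.
            v, p = rng.choice(vnodes), rng.choice(pnodes)
            candidate = dict(current)
            for other, q in current.items():
                if q == p:
                    candidate[other] = current[v]
            candidate[v] = p
            delta = inter_switch_cost(candidate, links, switch_of) - cost
            # Accept if not worse; if worse, accept with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, cost = candidate, cost + delta
                if cost < best_cost:
                    best, best_cost = dict(current), cost
        t *= alpha  # cooling schedule (geometric, an assumption)
    return best, best_cost

# Made-up example: four virtual nodes, two switches with two machines each.
vnodes = ["a", "b", "c", "d"]
links = [("a", "b", 100), ("c", "d", 100)]
pnodes = ["p1", "p2", "p3", "p4"]
switch_of = {"p1": "sw1", "p2": "sw1", "p3": "sw2", "p4": "sw2"}
best, best_cost = anneal(vnodes, links, pnodes, switch_of)
```

Because acceptance of worse solutions is random, two runs with different seeds may return different mappings, which is the "no guarantee" point on the slide.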

Current Practice – Performance
• Allocation errors
• 11,176 TEMP errors out of a total of 24,206 errors (about 46%)
• Substantial room for improvement!

Our Strategy – assign+
• Deterministic fashion
• Explore 5 possible solution spaces using expert knowledge of possible network testbed architectures
  • 1) PART: minimizes partitions in the virtual topology
  • 2) SCORE: minimizes the score of the allocation strategy
  • 3) ISW: prefers physical machine classes (pclasses) that have high-bandwidth inter-switch links
  • 4) PREF: prefers pclasses that share a switch with the pclasses hosting neighbors of the node being allocated
  • 5) FRAG: tries to use the smallest number of pclasses
• Choose the solution with the lowest inter-switch bandwidth as the best
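The final selection step can be illustrated as follows. This sketch does not implement the five strategies; it assumes each one has already produced a candidate mapping (or `None` if it found none) and only shows picking the candidate with the lowest inter-switch bandwidth. The function names, the data layout, and the example mappings are hypothetical, and ties are broken by strategy order purely as a convenience of the sketch.

```python
def inter_switch_bandwidth(mapping, links, switch_of):
    """Total bandwidth of virtual links whose endpoints are mapped to different switches."""
    return sum(bw for a, b, bw in links
               if switch_of[mapping[a]] != switch_of[mapping[b]])

def choose_best(candidates, links, switch_of):
    """candidates: dict of strategy name -> mapping, or None if that strategy failed.

    Returns (name, mapping) for the feasible candidate with the lowest
    inter-switch bandwidth, or None if no strategy produced a mapping.
    """
    feasible = {name: m for name, m in candidates.items() if m is not None}
    if not feasible:
        return None
    name = min(feasible,
               key=lambda n: inter_switch_bandwidth(feasible[n], links, switch_of))
    return name, feasible[name]

# Made-up inputs: three virtual nodes, two switches with two machines each.
links = [("a", "b", 100), ("b", "c", 1000)]
switch_of = {"p1": "sw1", "p2": "sw1", "p3": "sw2", "p4": "sw2"}
candidates = {
    "PART":  {"a": "p1", "b": "p3", "c": "p4"},  # only a-b crosses switches
    "SCORE": {"a": "p1", "b": "p2", "c": "p3"},  # only b-c crosses switches
    "ISW":   None,                               # strategy found no mapping
    "PREF":  {"a": "p3", "b": "p1", "c": "p2"},  # only a-b crosses switches
    "FRAG":  {"a": "p1", "b": "p3", "c": "p2"},  # both links cross switches
}
winner = choose_best(candidates, links, switch_of)
```

Since every step here is deterministic, the same inputs always yield the same winner, in contrast to the randomized search used by simulated annealing.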

Our Strategy – Evaluation
• Reconstruct DeterLab state on Jan 1, 2011
• Use virtual topology and state snapshot data from the file system
  • hardware types, OS supported, switch connectivity, …
• 255 available machines in the pool
• Replay all successful and failed allocations in 2011
  • start time, end time, experiment size, …
• Failed allocations: generate their durations from the distribution of past successful allocations
• Keep only the first instance if instances overlap
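The trace preparation in the last two bullets might look roughly like the sketch below. It rests on two assumptions that the slide does not spell out: durations for failed allocations are drawn uniformly from the observed durations of successful ones (an empirical distribution), and "overlapping" refers to instances of the same experiment whose time intervals intersect. The record layout and the function name are hypothetical.

```python
import random

def prepare_replay(successes, failures, seed=0):
    """Build a replay trace.

    successes: list of (exp_id, start, end) for allocations that succeeded.
    failures:  list of (exp_id, start) for allocations that failed.
    """
    rng = random.Random(seed)
    durations = [end - start for _, start, end in successes]
    events = list(successes)
    for exp_id, start in failures:
        # Assumption: sample a duration from the successful allocations.
        events.append((exp_id, start, start + rng.choice(durations)))
    events.sort(key=lambda e: e[1])  # replay in start-time order

    kept, last_end = [], {}
    for exp_id, start, end in events:
        # Assumption: drop an instance that starts before the previously kept
        # instance of the same experiment has ended.
        if exp_id in last_end and start < last_end[exp_id]:
            continue
        kept.append((exp_id, start, end))
        last_end[exp_id] = end
    return kept

# Made-up trace: every successful instance lasts 10 time units.
successes = [("e1", 0, 10), ("e2", 5, 15)]
failures = [("e1", 3), ("e1", 20)]
trace = prepare_replay(successes, failures)
```

In this example the failed `e1` request at time 3 overlaps the running `e1` instance and is dropped, while the one at time 20 is kept with a sampled duration.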

Our Strategy – Performance
• Allocation failure rates and running time

Other key components in the paper
• Relaxing virtual topology requirements can yield better results
  • OS, node type, hardware, …
• Most testbed usage patterns show heavy-tailed distributions
  • experiment sizes, duration, …
  • due to human dynamics based on priorities
• Potential improvements for allocation policy
  • Take-a-Break: release a long-running instance and queue it
  • Borrow-and-Return: borrow from a long-running instance for 4 hours
• For more details:
  • http://www-net.cs.umass.edu/imc2012/papers/p495.pdf
  • http://steel.isi.edu/TestbedUsageData
