1 / 9

Administration in the Cloud

Administration in the Cloud. Ashraf Aboulnaga University of Waterloo. Software Services in the Cloud. How is administration different? More resources available for administration Number of instances is small Many users per instance Many administrators per instance

navid
Download Presentation

Administration in the Cloud

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Administration in the Cloud Ashraf AboulnagaUniversity of Waterloo

  2. Software Services in the Cloud • How is administration different? • More resources available for administration • Number of instances is small • Many users per instance • Many administrators per instance • But still many users per administrator • Administration mistakes can have a huge impact • Complex distributed system • Many machines per administrator • Failure of an instance is very visible

  3. Software Services in the Cloud • More frequent software releases • Easier to make changes and fix bugs • Easier iterative improvement of administration • Administrators close to developers • Better cooperation on administrative goals • Easier problem determination • How does this change administration? • Monitoring and dashboards are important and non-trivial • Automation is essential • Experimentation to collect data and test administrative changes • Iterative administrative changes as the workload evolves • Other???

  4. Learning from Each Other • Database Systems • Working on data without loading it first • Scalability and fault tolerance • MapReduce • Specialized algorithms depending on data • Increase in the number of parameters is expected • Make parameters meaningful and robust • Do not force user to tune opaque low-level parameters • Automatically understand workload (experimentation) • Auto-tune of parameters based on workload

  5. MapReduce Tuning • Parameters that are easy for a user to set • Locations of different files • Level of detail in logged execution data • Degree of replication for fault tolerance • Global resource allocation (e.g., number of machines, memory per machine, disk space budget) • Parameters that can be auto-tuned • Number of threads of different types • Number of nodes per job • Local resource allocation

  6. Example: Hadoop Scheduling • Given a batch of Hadoop jobs • How many nodes to give to each job? • How to schedule the jobs to minimize total completion time? • Solution approach: • Experimentation to learn workload characteristics • Formulate scheduling problem as an optimization problem and solve using standard combinatorial search techniques • Saved up to 30% of total completion time compared to sequential execution on entire cluster

  7. Workload Characteristics

  8. Scheduling Results W1: 2 sort, 2 k-means, and 1 inverted index W2: 2 sort, 1 k-means, 1 inverted index, and 1 doc sim W3: 3 sort, 2 k-means, 1 inverted index, and 1 doc sim

More Related