
A new breed of distributed, petabyte-scale file systems uses many Object Storage Devices (OSDs)

Load Balancing in File Systems. Nadine Amsel, Dr. Carlos Maltzahn. Storage Systems Research Center (SSRC) at UCSC, http://ssrc.cse.ucsc.edu


Presentation Transcript


Load Balancing in File Systems
Nadine Amsel, Dr. Carlos Maltzahn
Storage Systems Research Center (SSRC) at UCSC, http://ssrc.cse.ucsc.edu

Introduction
• A new breed of distributed, petabyte-scale file systems uses many Object Storage Devices (OSDs).
• Search in such file systems requires OSDs to store large indices and cope with ever-changing hot spots due to a diverse query stream.
• What is the extent of query hot spots? How long do they persist?

Methods
• Time-stamped queries by 500,000 AOL users over 3 months were used to determine overload patterns.
• Each term in a query maps to one OSD (i.e., assuming a term-distributed index).
• The OSD address is determined by taking the hash of the term and extracting the last n bits (where n is determined by the number of OSDs).
• An OSD's load is determined by the number of queries it receives per minute.
• Two questions to answer:
  • How many OSDs are overloaded?
  • How long does an OSD stay overloaded?
• Query traces were analyzed using different numbers of OSDs and overload thresholds:
  • 128, 1K, and 64K OSDs
  • 10, 30, and 50 queries/minute overload thresholds

Results
Will more hardware prevent overload?
Overload occurs all the time. Just one overloaded OSD can slow down the whole storage system. The query workload leads to overload even if distributed over a large number of nodes. Increasing the number of nodes is not a solution.

What is the length of each period of overload time?
Most overload periods last only a few minutes. The distribution of period lengths follows a heavy-tailed power law, so the variance is infinite (there is no stable average). The median overload length is ~4 minutes for 128 OSDs and ~2 minutes for 1K OSDs. In 99% of all cases, the overload period lasts no longer than an hour.

Conclusion
• Index query workloads cannot be effectively addressed by increasing the number of OSDs.
• A load-balancing mechanism needs to adapt on a minute-by-minute basis; any mechanism that takes longer than an hour to adapt will not be able to keep up with 99% of the workload changes.

This work was completed as part of UCSC's SURF-IT summer undergraduate research program, an NSF CISE REU Site. This material is based upon work supported by the National Science Foundation under Grant No. CCF-0552688.
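The method described in the poster (hash each query term, keep the last n bits as the OSD address, count queries per OSD per minute, and measure how long each OSD stays over a threshold) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the use of MD5 as the hash, whitespace tokenization, counting a query once per OSD it touches, and treating "overloaded" as strictly more than the threshold are all assumptions made here for illustration.

```python
import hashlib
from collections import Counter


def osd_for_term(term, num_osds):
    """Map a term to an OSD address using the last n bits of its hash.

    num_osds must be a power of two (2**n). MD5 is an assumed hash choice;
    the poster does not say which hash function was used.
    """
    if num_osds <= 0 or num_osds & (num_osds - 1):
        raise ValueError("num_osds must be a power of two")
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") & (num_osds - 1)


def per_minute_load(queries, num_osds):
    """Count queries per (osd, minute).

    queries is an iterable of (timestamp_in_seconds, query_string) pairs.
    A query is counted once for each distinct OSD its terms map to
    (an assumption about how multi-term queries are charged).
    """
    load = Counter()
    for timestamp, query in queries:
        minute = int(timestamp // 60)
        osds = {osd_for_term(term, num_osds) for term in query.lower().split()}
        for osd in osds:
            load[(osd, minute)] += 1
    return load


def overload_periods(load, threshold):
    """Return the lengths (in minutes) of consecutive overloaded runs.

    An OSD is treated as overloaded in a minute when its count exceeds
    the threshold (strictly greater is an assumption).
    """
    minutes_by_osd = {}
    for (osd, minute), count in load.items():
        if count > threshold:
            minutes_by_osd.setdefault(osd, []).append(minute)

    lengths = []
    for minutes in minutes_by_osd.values():
        minutes.sort()
        run = 1
        for prev, cur in zip(minutes, minutes[1:]):
            if cur == prev + 1:
                run += 1
            else:
                lengths.append(run)
                run = 1
        lengths.append(run)
    return lengths
```

With a real trace, the list returned by overload_periods could be summarized by its median and 99th percentile for each combination of OSD count (128, 1K, 64K) and threshold (10, 30, 50 queries/minute), which are the quantities the poster reports.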
