
FAWN: Fast Array of Wimpy Nodes

FAWN: Fast Array of Wimpy Nodes. A technical paper presentation in fulfillment of the requirements of CIS 570 – Advanced Computer Systems – Fall 2013 Scott R. Sideleau ssideleau@umassd.edu 14-Nov-2013. Overview. Identify the problem space FAWN as a solution


Presentation Transcript


  1. FAWN: Fast Array of Wimpy Nodes A technical paper presentation in fulfillment of the requirements of CIS 570 – Advanced Computer Systems – Fall 2013 Scott R. Sideleau ssideleau@umassd.edu 14-Nov-2013

  2. Overview • Identify the problem space • FAWN as a solution • Architecture principles • Unique key-value storage • Evaluate and benchmark a 21-node FAWN cluster • Identify when FAWN makes sense

  3. Theoretical Problem Space • CPU-I/O gap • Modern processors are so fast that they spend much of their time idle, waiting on I/O • CPU power consumption grows super-linearly with speed • Larger caches, needed to keep superscalar pipelines fed, are one driver • Dynamic voltage and frequency scaling (DVFS) is inefficient • Example: Intel SpeedStep technology • The CPU still generally draws about 50% of its peak power

  4. What’s the real problem? • Electricity is expensive! • Home usage is measured in kW, data center usage in MW • Facebook uses up to $1 million a month in electricity • Only three data centers! • Oregon, USA • Virginia, USA • Sweden

  5. Facebook’s Not Playing Around • Fourth data center to be powered by renewable wind energy • Iowa, USA • Source: http://goo.gl/sFmmxz (dated 14-Nov-2013)

  6. Proposed Solution • Fast Array of Wimpy Nodes (FAWN) • Bridge the I/O gap • Use slower CPUs and faster flash storage • Reduce power consumption per node • Embedded CPUs consume significantly less power • Address distributed storage for the new architecture • New key-value storage system (FAWN-KV) • Complementary per-node data store (FAWN-DS)

  7. System Architecture

  8. Basic Functions

  9. Replication & Consistency

  10. Understanding Flash Storage • Fast random reads • Up to 175x faster than HDDs • Performance varies widely between makes and models • Efficient I/O • Very low power • High queries-per-Joule rate vs. HDDs • Slow random writes • Expensive erase/write cycle • Motivation for log-structured (i.e. sequential) data storage
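The motivation for log-structured storage on this slide can be illustrated with a minimal sketch. It assumes a simple record layout (4-byte key length, 4-byte value length, key, value) and an in-memory dictionary as the index; the class name `LogStore` and the layout are hypothetical and are not taken from FAWN-DS. Every write is a sequential append, and every read is a single random read at a known offset.

```python
import io


class LogStore:
    """Append-only key-value log with an in-memory offset index (illustrative only)."""

    def __init__(self, log=None):
        # Any seekable binary file works; BytesIO keeps the sketch self-contained.
        self.log = log if log is not None else io.BytesIO()
        self.index = {}  # key -> (offset of value bytes, value length)

    def put(self, key: bytes, value: bytes) -> None:
        # Writes never overwrite in place: always append at the end of the log.
        self.log.seek(0, io.SEEK_END)
        offset = self.log.tell()
        header = len(key).to_bytes(4, "big") + len(value).to_bytes(4, "big")
        self.log.write(header + key + value)
        # Point the index at the newest copy; older copies become garbage.
        self.index[key] = (offset + len(header) + len(key), len(value))

    def get(self, key: bytes):
        entry = self.index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self.log.seek(offset)  # one random read, which flash handles well
        return self.log.read(length)
```

Updating a key leaves the old record in the log, which is why a cleanup pass such as compaction is needed later.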

  11. Optimized Maintenance Functions • Split • Used when adding a node to the cluster • Sequentially reads the data store and writes each entry to one of two new data stores, according to its key range • Merge • Used when removing a node from the cluster • The stores cover mutually exclusive key ranges, so one data store is appended to the other • Compact • Cleans up entries in a data store • Skips orphaned, out-of-range, and deleted entries and writes the rest to a new data store
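The three maintenance functions can be sketched over a toy log, assuming a data store is a Python list of `(key, value)` entries in write order, keys are integers, and a value of `None` marks a deletion. These representations and function signatures are assumptions made for illustration; they are not the FAWN-DS on-flash format.

```python
TOMBSTONE = None  # assumed marker for a deleted key


def split(log, mid):
    """One sequential pass: route each entry to the low or high key range."""
    low, high = [], []
    for key, value in log:
        (low if key < mid else high).append((key, value))
    return low, high


def merge(first, second):
    """Key ranges are disjoint, so appending one log to the other is enough."""
    return first + second


def compact(log, lo, hi):
    """Write a new log holding only the latest live entry per key in [lo, hi)."""
    latest = {}
    for key, value in log:  # later entries supersede (orphan) earlier ones
        latest[key] = value
    return [(k, v) for k, v in latest.items()
            if v is not TOMBSTONE and lo <= k < hi]
```

All three functions read their input front to back and only append to their output, which matches the sequential access pattern that flash favors.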

  12. Optimized Sequential Reads & Writes

  13. Front-end Consistent Hashing
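As a rough illustration of consistent hashing, the sketch below places nodes and keys on a ring by hashing their names and assigns each key to the first node at or after its position. The `Ring` class, the use of SHA-1 via `hashlib`, and the node names are assumptions for this example; the slide does not give these details.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a node ID or key to a point on the ring (SHA-1 is an assumed choice)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class Ring:
    def __init__(self, nodes=()):
        self._points = []  # sorted list of (position, node)
        for node in nodes:
            self.join(node)

    def join(self, node: str) -> None:
        bisect.insort(self._points, (ring_position(node), node))

    def leave(self, node: str) -> None:
        self._points.remove((ring_position(node), node))

    def owner(self, key: str) -> str:
        """Return the successor node: the first node at or after the key's position."""
        i = bisect.bisect_left(self._points, (ring_position(key), ""))
        return self._points[i % len(self._points)][1]  # wrap around the ring
```

The useful property is that a join or leave only moves the keys in one neighboring range; every other key keeps its owner.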

  14. Node Join

  15. Node Leave • Rather than split the data stores, nodes merge them • In reality, this means… • Add a new replica into each chain the departing node belonged to • So, the processing is the same as a join event

  16. Failure Detection • Nodes are assumed to be fail-stop • Front-end and back-end nodes gossip at a known rate • On a timeout, the front-end initiates a leave operation for the failed node • Current design only copes with node failures • Coping with network failures requires future work
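The timeout rule above can be sketched as a small bookkeeping class. This is a simplified illustration, assuming the front-end records the time of the last message from each back-end and treats silence longer than a fixed timeout as a failure; the class and method names are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
class FailureDetector:
    """Tracks the last time each node was heard from (illustrative only)."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_seen = {}  # node -> timestamp of the most recent message

    def heartbeat(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def expired(self, now: float):
        """Nodes silent for longer than the timeout; candidates for a leave."""
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.timeout)
```

Under the fail-stop assumption, a front-end would start the leave operation for each node returned by `expired`. A network partition looks identical to this detector, which is the limitation the slide points out.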

  17. Single Node Evaluation • Performance almost entirely dependent on flash media

  18. 21-Node Evaluation • In general, the back-ends prove to be well-matched

  19. 21-Node Evaluation • Relatively responsive through maintenance operations

  20. 21-Node Evaluation • Slightly slower than production key-value systems • Worst-case response times are on par

  21. 21-Node Evaluation • Power draw is low and consistent across operations

  22. 21-Node Evaluation • Power draw is low and consistent across operations • Queries per Joule is an order of magnitude higher than in traditional production distributed systems • 1 billion instructions per Joule • 1/3 the frequency • 1/10 (or less) the power
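Queries per Joule follows directly from throughput and power, since one watt is one Joule per second. The sketch below shows the arithmetic; the function name is hypothetical and the numbers in the usage lines are made-up placeholders, not measurements from the evaluation.

```python
def queries_per_joule(queries_per_second: float, watts: float) -> float:
    """Energy efficiency: (queries / s) / (Joules / s) = queries / Joule."""
    if watts <= 0:
        raise ValueError("power draw must be positive")
    return queries_per_second / watts


# Placeholder inputs chosen only to show the calculation.
wimpy = queries_per_joule(queries_per_second=1_000, watts=4)
brawny = queries_per_joule(queries_per_second=5_000, watts=250)
ratio = wimpy / brawny  # how many times more efficient the low-power node is
```

With these placeholder inputs the slower node answers fewer queries per second but does more work per Joule, which is the trade-off the slide describes.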

  23. When does FAWN matter? • It depends on the workload…

  24. Thanks very much! Questions?
