
Usenet Training



Presentation Transcript


  1. Usenet Training, May 4-7, 2004, Orlando, FL, at Disney’s Coronado Springs Resort

  2. Day Two Building an Enterprise Usenet Environment

  3. Questions to ask
  • How many concurrent NNTP reading sessions?
    • "I don't know" is the usual answer
  • What is the size of the user base?
    • 5-10% or more may access news
  • How much retention?
  • What content?
    • Full feed, internal discussion, partial feed
    • Binaries/text
    • 97-99% of spool size comes from binaries
    • A full feed is 1.1 TB/day and doubles every 9 months (projected in the sketch below)
  • Redundancy desired
    • Full data replication
    • Disaster recovery
    • Failover
  • Feeding architecture to provide/receive
  • Authentication and classes of service
  • Performance
    • Honda, Audi, Porsche
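Those feed-size figures drive most of the numbers in the rest of the day. A minimal back-of-envelope sketch in Python, assuming the slide's 1.1 TB/day full feed and 9-month doubling time; the exponential model and the function name are illustrative, not from the deck:

```python
# Back-of-envelope projection of full-feed volume, using the slide's figures:
# 1.1 TB/day today, doubling every 9 months. The exponential model and the
# function name are illustrative assumptions, not part of the presentation.

def full_feed_tb_per_day(months_out: float,
                         today_tb_per_day: float = 1.1,
                         doubling_months: float = 9.0) -> float:
    """Estimated daily full-feed volume (TB/day) `months_out` months from now."""
    return today_tb_per_day * 2 ** (months_out / doubling_months)

if __name__ == "__main__":
    for months in (0, 9, 18, 36):
        print(f"{months:2d} months out: ~{full_feed_tb_per_day(months):.1f} TB/day")
```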

  4. Sample Architecture Criteria
  • 8,000 concurrent connections
  • 14 days binary retention, 180 days text
  • Two data centers
  • Full redundancy: during an outage of one DC, the surviving DC can support 4,000 connections
  • Sell 10 full feeds, peer with 50 providers
  • Authentication by IP address

  5. Calculations
  • A note about these calculations:
    • They change over time
    • They are not hard and fast
    • They change with experience and new architectures

  6. Storage
  • 14 days binary and 180 days text
    • Roughly 17 TB (derived in the sketch below)
    • x2 with full redundancy
  • Adaptive spooling
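One reading of where the ~17 TB comes from, combining the retention policy above with the feed figures from the "Questions to ask" slide (1.1 TB/day, 97-99% binaries). The exact binary/text split used here is an assumption; the slide only gives a range:

```python
# Sketch of the retention math: 14 days of binaries plus 180 days of text,
# assuming a 1.1 TB/day full feed of which ~99% by volume is binaries
# (the slide gives 97-99%; the high end reproduces the ~17 TB figure).

FEED_TB_PER_DAY = 1.1
BINARY_FRACTION = 0.99

binary_tb = 14 * FEED_TB_PER_DAY * BINARY_FRACTION        # 14 days of binaries
text_tb = 180 * FEED_TB_PER_DAY * (1 - BINARY_FRACTION)   # 180 days of text
total_tb = binary_tb + text_tb

print(f"binaries: {binary_tb:.1f} TB, text: {text_tb:.1f} TB, "
      f"total: {total_tb:.1f} TB (x2 redundant: {2 * total_tb:.1f} TB)")
```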

  7. Machine Class Choices
  • Problems we're solving:
    • Totally random I/O
    • Significant network volume
  • We're not NASA
    • No complex number theory modelling here
    • WOPR is misplaced
  • Disk mechanics
    • The slowest-evolving technology
    • Physical limitations
  • Cost
  • Administration
  • Deployment

  8. Machine Classes
  • 2000 connections per host - 4 hosts
  • 1000 connections per host - 8 hosts
  • 400 connections per host - 20 hosts
  • Machine class dictates OS, or vice versa
  • Summary
    • We're looking for an efficient I/O mover
    • Many paths, independent busses
    • Carl Lewis, not Andre the Giant

  9. Architecture Diagramming
  • Start with Tornado Front Ends
  • At least 100 GB storage
  • At least 1 "fast" spindle per 400 connections
  • At least 2 "slow" spindles per 300 connections
  • Sun 280Rs, 2 CPUs, 4 GB RAM, Sun 3510 attached array (shared)
  • Each module supports 2000 connections (spindle counts sized in the sketch below)
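A small helper applying the spindle rules above to a 2000-connection front-end module. Only the two ratios come from the slide; the function itself is an illustrative assumption:

```python
# Apply the slide's rules of thumb to a front-end module: at least 1 "fast"
# spindle per 400 connections and at least 2 "slow" spindles per 300
# connections. The helper is an illustrative sketch.
import math

def fe_spindles(connections: int) -> tuple[int, int]:
    fast = math.ceil(connections / 400)        # >= 1 fast spindle per 400 conns
    slow = 2 * math.ceil(connections / 300)    # >= 2 slow spindles per 300 conns
    return fast, slow

fast, slow = fe_spindles(2000)
print(f"2000 connections -> at least {fast} fast and {slow} slow spindles")
```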

  10. Tornado Back Ends
  • 4000 connections accessing articles
  • 1 BE with one array can be enough
  • BEs can be smaller than an FE
  • Direct spool sharing only
  • Not quite a Commodore 64… but close

  11. Tornado Back Ends
  • Storage array
    • Very rough estimate: 1 drive per 75 connections
    • Optimize for random I/O
  • Benchmark precautions
    • Vendor benchmarks are usually sequential I/O; push for random-access data
    • Plan on real-world performance of 10-20% of the spec benchmark, or even less
  • 48 x 72 GB 10,000 RPM drives (3456 GB)
  • FCAL switching (the arithmetic is reproduced in the sketch below)
    • Writes: approx 120 Mbps ~ 13 MB/sec
    • Reads:
      • 4000 connections
      • Average? ~1 Mbps each
      • Many "constant downloaders"
      • 0.11 MB/sec x 4000 ≈ 450 MB/sec
      • 450 MB/sec x ~6 (truly random adjustment) ≈ 2.6 GB/sec switching spec
    • Start with one 2 GB/sec switch, with plans to add another
  • We only have 3 TB and we need 17?!
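The switching-spec arithmetic above, written out. All constants are the slide's; the small rounding differences (440 vs. 450 MB/sec, 2.6 vs. 2.7 GB/sec) are in the same spirit as the deck's rough numbers:

```python
# Reproduce the back-end read estimate: 4000 connections averaging ~1 Mbps
# (~0.11 MB/sec each, per the slide), scaled by the ~6x "truly random I/O"
# adjustment. Constants are the slide's; the script is just the arithmetic.

CONNECTIONS = 4000
MB_PER_SEC_PER_CONN = 0.11   # slide's figure for a ~1 Mbps average reader
RANDOM_IO_FACTOR = 6         # slide's "truly random adjustment"

aggregate_reads_mb_s = CONNECTIONS * MB_PER_SEC_PER_CONN
switch_spec_gb_s = aggregate_reads_mb_s * RANDOM_IO_FACTOR / 1000
drives = -(-CONNECTIONS // 75)   # rough rule: 1 drive per 75 connections

print(f"aggregate reads: ~{aggregate_reads_mb_s:.0f} MB/sec")
print(f"FCAL switching spec: ~{switch_spec_gb_s:.1f} GB/sec")
print(f"1-drive-per-75-connections rule: ~{drives} drives "
      f"(the example array uses 48 x 72 GB drives)")
```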

  12. Enter Storm Cellar
  • 17 TB total, 3 TB deployed - need 14 TB more
  • 3 Storm Cellars with 6 TB usable each (count checked below)
  • Split cascade feed from the Tornado Back Ends
  • 4U NAS appliance, 24 x 300 GB ATA drives (7.03 TB)
  • Linux 2.6.2+
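Checking the Storm Cellar count against the gap above; the 6 TB usable per box is taken as given from the slide (raw capacity is 24 x 300 GB before filesystem and redundancy overhead):

```python
# How many Storm Cellars cover the gap: 17 TB required, 3 TB already on the
# Tornado Back End arrays, and ~6 TB usable per NAS appliance.
import math

required_tb = 17
deployed_tb = 3
usable_per_cellar_tb = 6

shortfall_tb = required_tb - deployed_tb
cellars = math.ceil(shortfall_tb / usable_per_cellar_tb)
print(f"shortfall: {shortfall_tb} TB -> {cellars} Storm Cellars "
      f"({cellars * usable_per_cellar_tb} TB usable)")
```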

  13. Feeding
  • 10 full feeds @ 120 Mbps ~ 3-5 feeds per box (checked below)
  • 2 Peering Cyclones per site
  • 1 Master Cyclone
  • 1 Hurricane
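A quick sanity check on the feeder count, using the slide's 3-5 full feeds per box:

```python
# 10 outbound full feeds at roughly 3-5 feeds per box needs 2-4 peering
# boxes, which lines up with 2 Peering Cyclones per site across two sites.
import math

full_feeds = 10
fewest_boxes = math.ceil(full_feeds / 5)   # best case: 5 feeds per box
most_boxes = math.ceil(full_feeds / 3)     # worst case: 3 feeds per box
print(f"peering boxes needed: {fewest_boxes}-{most_boxes}")
```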

  14. Data Center Architecture

  15. Full Architecture

  16. What could we do with Linux?
  • Features we can use:
    • FE adaptive caching
    • Cyclone split feeding
    • Newest Linux kernels
  • FCAL sharing is probably not feasible
    • Use IP spool sharing with adaptive caching instead
  • Hardware choices
    • 1 day's worth of disk on each FE for cache
    • BEs have 4+ TB each

  17. Linux Architecture
  • A local spool cache on the Tornado FEs serves recent articles that are actually read (8 copies, 16 for 2 data centers)
  • Spools are aggregated across all FEs
  • Back Ends contain only partial data, spreading I/O over many BEs (one way to do this is sketched below)
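The slides do not say how articles get spread across Back Ends in the Linux layout; one common way to get the "partial data on each BE" behaviour is to hash each article's Message-ID onto a back end. The sketch below, including the host names and the choice of MD5, is purely an illustrative assumption, not Tornado's design:

```python
# Illustrative only: deterministically map an article to a Back End by
# hashing its Message-ID. Host names and the MD5 choice are assumptions.
import hashlib

BACK_ENDS = ["be1", "be2", "be3", "be4"]   # hypothetical back-end hosts

def back_end_for(message_id: str) -> str:
    """Pick a back end from the article's Message-ID."""
    digest = hashlib.md5(message_id.encode("utf-8")).digest()
    return BACK_ENDS[int.from_bytes(digest[:4], "big") % len(BACK_ENDS)]

print(back_end_for("<example.article.1234@news.example.com>"))
```

A plain modulo hash reshuffles most articles whenever a back end is added or removed; consistent hashing avoids that, but either way each BE ends up holding only a slice of the spool, which is the point the slide is making.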

  18. Quiz
  • Trace the path of locally posted articles.
  • Trace the path of articles received from peers.
  • Where and how are articles numbered?
  • If a Front End is down, what happens?
  • If a Back End is down, what happens?
  • If the storage array is out, what happens?
  • What do we do if the Master Cyclone goes down?
  • What do we do if the Hurricane goes down?
  • How do we add a feed from another data center to only fill in articles we’re missing?
  • How do we add a feed that only turns on if our Master Cyclone goes down?
  • Who should we peer with?

  19. Questions
