Data Center (and Network), Lecture 4 Seungwon Shin, CNS Group, EE, KAIST
Data Center A data center is a facility used to house computer systems and associated components, such as telecommunications and storage systems. - from Wikipedia
Why do we need it? • How to manage data • we now have huge amounts of data • How to compute something • we now have really complicated applications • Can we handle these issues by ourselves?
Top of Rack (ToR) A ToR switch sits at the top of each rack and connects that rack's servers to the rest of the network.
Data Center Switch Products - Cisco • Core: Nexus 7000 (100G) • Access or Aggregation: Nexus 6000 (40G) • ToR: Nexus 3000 (10G)
Data Center Cost • Servers: 45% • CPU, memory, disk • Infrastructure: 25% • UPS, cooling, power distribution • Power draw: 15% • Electrical utility costs • Network: 15% • Switches, links, transit
Data Center Challenges • Traffic load balance • Support for VM migration • Achieving bisection bandwidth • Power savings / Cooling • Network management (provisioning) • Security (dealing with multiple tenants)
Problems • Single point of failure • Oversubscription of links higher up in the topology • Tradeoff between cost and provisioning
Background • Current data center architecture • as recommended by Cisco
Background • Data center design requirements • Data centers typically run two types of applications • outward facing (e.g., serving web pages to users) • internal computations (e.g., MapReduce for web indexing) • Workloads are often unpredictable: • Multiple services run concurrently within a DC • Demand for new services may spike unexpectedly • A spike in demand for a new service means success! • But this is when success spells trouble (if not prepared)! • Server failures are the norm • Recall that GFS, MapReduce, etc., resort to dynamic re-assignment of chunkservers and jobs/tasks (worker servers) to deal with failures; data is often replicated across racks, … • The "traffic matrix" between servers is constantly changing
Background • Data center cost • Total cost varies • upwards of $250M for a mega data center • server costs dominate • network costs are significant • Long provisioning timescales: • new servers are purchased quarterly at best
Background • Networking issues in data centers • Uniform high capacity • Capacity between servers limited only by their NICs • No need to consider topology when adding servers • In other words, high capacity between any two servers no matter which racks they are located in! • Performance isolation • Traffic of one service should be unaffected by others • Ease of management: "Plug-&-Play" (layer-2 semantics) • Flat addressing, so any server can have any IP address • Server configuration is the same as in a LAN • Legacy applications depending on broadcast must work
A Scalable, Commodity Data Center Network Architecture. UCSD, SIGCOMM 2008. Some slides from Prof. Amin Vahdat and Prof. Zhi-Li Zhang.
Problem Domain • Single point of failure • Core routers are the bottleneck • Require high-end routers • High-end routers are very expensive • Switching hardware cost to interconnect 20,000 hosts with full bandwidth (a rough breakdown is sketched below): • $7,000 for each 48-port GigE switch at the edge • $700,000 for 128-port 10 GigE switches in the aggregation and core layers • approximately $37M in total
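To see how per-device prices on this scale reach tens of millions of dollars, here is an illustrative back-of-envelope in Python. The slide gives only the unit prices and the ~$37M total; the switch counts below are assumptions chosen for a 20,000-host network, not figures from the paper.

```python
# Illustrative back-of-envelope only: unit prices are from the slide, but
# the switch counts are assumptions (the slide does not give them).

hosts = 20_000
host_ports_per_edge = 40                  # assume 40 of 48 ports face hosts
edge_switches = hosts // host_ports_per_edge       # 500 edge switches
edge_cost = edge_switches * 7_000                  # $3.5M

aggr_core_switches = 48                   # assumed number of 128-port boxes
aggr_core_cost = aggr_core_switches * 700_000      # $33.6M

total = edge_cost + aggr_core_cost
print(f"~${total / 1e6:.1f}M")            # ~$37.1M, the order the slide cites
```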
Main Goal • Main Goal: addressing the limitations of today's data center network architecture • single point of failure • oversubscription of links higher up in the topology • trade-offs between cost and provisioning
Considerations • Backwards compatible with existing infrastructure • No changes in application • Support of layer 2 (Ethernet) and IP • Cost effective • Low power consumption & heat emission • Cheap infrastructure • Scalable interconnection bandwidth • an arbitrary host can communicate with any other host at the full bandwidth of its local network interface.
Fat-Tree [figure: a k = 4 fat-tree topology, pods 0-3]
Fat-Tree • Fat-Tree • a special type of Clos network (after C. Clos) • k-ary fat tree: three-layer topology (edge, aggregation, and core) • each pod consists of (k/2)^2 servers & 2 layers of k/2 k-port switches • each edge switch connects to k/2 servers & k/2 aggr. switches • each aggr. switch connects to k/2 edge & k/2 core switches • (k/2)^2 core switches: each connects to k pods • (element counts for a given k are worked out in the sketch below)
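As a quick sanity check of these formulas, here is a minimal Python sketch (the function and field names are mine, not the paper's) deriving element counts for a k-ary fat-tree:

```python
# Derive k-ary fat-tree element counts from the formulas on this slide.

def fat_tree_counts(k: int) -> dict:
    assert k % 2 == 0, "k must be even"
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,    # k/2 edge switches per pod
        "aggr_switches": k * half,    # k/2 aggregation switches per pod
        "core_switches": half ** 2,   # each connects to all k pods
        "hosts": k * half ** 2,       # (k/2)^2 servers per pod = k^3 / 4
    }

print(fat_tree_counts(4))   # the k = 4 example topology in the figure
print(fat_tree_counts(48))  # 48-port switches -> 27,648 hosts
```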
Fat-Tree • Why Fat-Tree? • A fat tree has identical bandwidth at any bisection • Each layer has the same aggregate bandwidth • Can be built using cheap devices with uniform capacity • Each port supports the same speed as the end hosts • All devices can transmit at line speed if packets are distributed uniformly along the available paths • Great scalability
Addressing in Fat-Tree • Use the 10.0.0.0/8 private addressing block • Pod switches have address 10.pod.switch.1 • Core switches have address 10.k.j.i • "j" and "i" denote the core switch's position in the (k/2)^2 core grid • Hosts have address 10.pod.switch.ID • ID is the host's position in the switch subnet ([2, (k/2) + 1]) • For k < 256, this scheme has no scalability issue • (a sketch of the scheme follows)
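A minimal sketch of this addressing scheme in Python; the helper names are mine, not the paper's:

```python
# Generate Fat-Tree addresses per the scheme above (helper names are mine).

def pod_switch_addr(pod: int, switch: int) -> str:
    return f"10.{pod}.{switch}.1"         # pod switches: 10.pod.switch.1

def core_switch_addr(k: int, j: int, i: int) -> str:
    return f"10.{k}.{j}.{i}"              # core switches: 10.k.j.i

def host_addr(pod: int, switch: int, host_id: int) -> str:
    assert host_id >= 2                   # host IDs live in [2, k/2 + 1]
    return f"10.{pod}.{switch}.{host_id}"

print(host_addr(0, 1, 2))         # "10.0.1.2" (used in the routing example)
print(core_switch_addr(4, 1, 1))  # "10.4.1.1" for k = 4
```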
Lookup in Fat-Tree • First level is a prefix lookup • Used to route down the topology to the end host • Second level is a suffix lookup • Used to route up towards the core • Diffuses and spreads out traffic • Maintains packet ordering by using the same port for the same end host
Lookup in Fat-Tree • Two-level table lookup
Routing • Pod switches • prefix: /24 matching (but no prefixes on the lower-level pod switches) • suffix: /8 matching • Core switches • prefix: /16 matching • e.g., • 10.0.0.0/16 - port 0 • 10.1.0.0/16 - port 1 • 10.2.0.0/16 - port 2 • 10.3.0.0/16 - port 3 • (a toy two-level lookup is sketched below)
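Here is a toy version of the two-level lookup in Python, written for an aggregation switch in pod 2 of a k = 4 fat-tree; the table contents and port numbers are illustrative, not taken from the paper:

```python
# Toy two-level (prefix, then suffix) lookup for a pod-2 aggregation
# switch in a k = 4 fat-tree. Table contents are illustrative.
import ipaddress

# First level: prefix entries route *down* to this pod's edge subnets.
prefix_table = {
    "10.2.0.0/24": 0,   # edge switch 0's subnet -> port 0
    "10.2.1.0/24": 1,   # edge switch 1's subnet -> port 1
}

# Second level: suffix (host ID) entries route *up*, spreading traffic
# over the uplinks while keeping each host's flows on one port.
suffix_table = {2: 2, 3: 3}   # hosts ending in .2 -> port 2, .3 -> port 3

def lookup(dst: str) -> int:
    addr = ipaddress.ip_address(dst)
    for prefix, port in prefix_table.items():   # 1) prefix match
        if addr in ipaddress.ip_network(prefix):
            return port
    host_id = int(dst.split(".")[-1])           # 2) fall through to suffix
    return suffix_table[host_id]

print(lookup("10.2.0.3"))  # destination in this pod: prefix match -> port 0
print(lookup("10.3.0.2"))  # other pod: suffix match -> uplink port 2
```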
Routing Example: 10.0.1.2 -> 10.2.0.3 [figure: the packet travels from a host in pod 0 up through edge, aggregation, and core switches, then down into pod 2]
More Functions • Flow Classification • dynamic port assignment • guards against packet reordering • ensures fair distribution across links • Flow Scheduling • edge switches detect large flows • a central scheduler assigns them paths • (a toy sketch follows)
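To make the idea concrete, here is a minimal sketch of switch-local flow classification, assuming a least-loaded placement policy and periodic rebalancing; this is an illustration in the spirit of the mechanism, not the paper's actual algorithm:

```python
# Minimal sketch of flow classification on a pod switch: sticky per-flow
# port assignment (avoids reordering) plus periodic rebalancing.
# Illustrative only; not the paper's algorithm.
from collections import defaultdict

UPLINKS = [2, 3]               # uplink ports of a k = 4 pod switch
flow_to_port = {}              # sticky assignment keeps a flow in order
port_bytes = defaultdict(int)  # measured bytes sent per uplink

def assign(flow, nbytes):
    if flow not in flow_to_port:
        # new flows go to the currently least-loaded uplink
        flow_to_port[flow] = min(UPLINKS, key=lambda p: port_bytes[p])
    port = flow_to_port[flow]
    port_bytes[port] += nbytes
    return port

def rebalance():
    # periodically move one flow from the hottest to the coolest uplink
    hot = max(UPLINKS, key=lambda p: port_bytes[p])
    cold = min(UPLINKS, key=lambda p: port_bytes[p])
    for flow, port in flow_to_port.items():
        if port == hot:
            flow_to_port[flow] = cold   # brief reordering risk accepted
            break

assign(("10.0.1.2", "10.2.0.3", 80), 1500)  # (src, dst, dst-port) flow key
```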
Evaluation • Cost of maintenance
Evaluation • Cost
Critiques • Scalability issues • what if k > 256? • What kind of routing protocol is needed?
Jellyfish: Networking Data Centers Randomly. UIUC and UC Berkeley, NSDI 2012.
Problem Domain • Structured DC networks • structure constrains expansion • how to maintain the structure: fixed topology, fixed connections, …
Solution • Then, how? • Forget about structure • let's have no structure at all! • a random graph (see the sketch below)
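A minimal sketch of this idea using networkx: wire the switches as a uniformly random regular graph. The parameter names are mine, and the real Jellyfish design also handles incremental growth and switches with uneven port counts:

```python
# Jellyfish-style fabric as a random regular graph (illustrative sketch).
import networkx as nx

n_switches = 20   # number of top-of-rack switches
net_ports = 4     # ports per switch used for switch-to-switch links

g = nx.random_regular_graph(net_ports, n_switches, seed=1)
while not nx.is_connected(g):                # retry until connected
    g = nx.random_regular_graph(net_ports, n_switches)

# Random graphs have short average paths; that is where the surprising
# throughput in the simulation comes from.
print(nx.average_shortest_path_length(g))
print(nx.diameter(g))
```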
Throughput Simulation Can you believe this?
Critiques • OK, it seems fine • throughput • easy to build (scalability) • But can we realize this in a real environment? • how can you connect switches like a random graph?