Computer architecture II

Network topologies


Plan for today: Scalable interconnection networks

  • Basic concepts, definitions

  • Topologies

  • Switching

  • Routing

  • Performance



Formalism

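  • Graph G = (V, E)

    V: switches and nodes

    E: communication channels (edges), E ⊆ V × V

  • Route: path (v0, ..., vk) of length k between nodes v0 and vk, where (vi, vi+1) ∈ E

  • Routing distance: number of channels (hops) on a route

  • Diameter: the maximal length of a shortest route between any two nodes

  • Average distance: mean routing distance over all pairs of nodes

  • Degree: number of input (output) channels of a node

  • Bisection width: minimal number of channels that must be cut to split the network into two equal halves; it bounds the number of connections that can run in parallel between the halves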
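
The definitions above can be checked on small graphs. The following is a minimal sketch, assuming the network is connected and given as an adjacency dictionary; the helper names (`hop_distances`, `metrics`, `ring`) are illustrative and not part of the slides.

```python
from collections import deque

def hop_distances(adj, src):
    """Breadth-first search: hop count from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def metrics(adj):
    """Return (max degree, diameter, average distance) of a connected graph."""
    degree = max(len(neighbours) for neighbours in adj.values())
    # Shortest-route lengths over all ordered pairs of distinct nodes.
    all_d = [d for s in adj for t, d in hop_distances(adj, s).items() if t != s]
    return degree, max(all_d), sum(all_d) / len(all_d)

def ring(n):
    """Hypothetical test topology: node i is linked with i-1 and i+1 modulo n."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

print(metrics(ring(8)))
```

For an 8-node ring this gives degree 2 and diameter 4, matching the ring values quoted later in the slides.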


What characterizes a network?

  • Bandwidth (offered bandwidth): b = w · f

    • where w is the channel width (in bytes) and f = 1/t is the signaling rate (in Hz)

  • Latency

    • Time a message travels between two nodes

  • Throughput (delivered bandwidth)

    • How much of the offered bandwidth is effectively used


What characterizes a network?

  • Topology

    • physical interconnection structure of the network graph

  • Routing Algorithm

    • restricts the set of paths that messages may follow

    • many algorithms with different properties

  • Switching Strategy

    • how data in a message traverses a route

    • circuit switching vs. packet switching

  • Flow Control Mechanism

    • when does a message, or a portion of it, traverse a route, and what happens when traffic is encountered?


Goals

  • Latency as small as possible

  • High Throughput

  • As many concurrent transfers as possible

    • Bisection width gives the potential number of parallel connections

  • Cost as low as possible


Bus (e.g. Ethernet)

  • degree = 1

  • diameter = 1

    • No routing necessary

  • bisection width = 1

    • CSMA/CD protocol, limited bus length

Simplest and cheapest dynamic network


Complete graph

  • degree = n-1

    • too expensive for big networks

  • diameter = 1

  • bisection width = ⌊n/2⌋ · ⌈n/2⌉

    • When the network is cut into two halves, each node in one half is connected to all n/2 nodes in the other half, and there are n/2 such nodes.

Static network: there is a connection between each pair of nodes.
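
The bisection width formula can be verified by brute force on small networks. This is a sketch under the assumption that bisection width means the minimum number of edges crossing any split into halves of size ⌊n/2⌋ and ⌈n/2⌉; the function names are hypothetical, and the exhaustive search is only practical for a handful of nodes.

```python
from itertools import combinations

def bisection_width(nodes, edges):
    """Minimum number of edges crossing any split into two (near-)equal halves."""
    nodes = list(nodes)
    best = None
    for half in combinations(nodes, len(nodes) // 2):
        half = set(half)
        # An edge crosses the cut if exactly one endpoint lies in `half`.
        cut = sum((u in half) != (v in half) for u, v in edges)
        best = cut if best is None else min(best, cut)
    return best

def complete_graph(n):
    """Nodes 0..n-1 with an edge between each pair of nodes."""
    return range(n), list(combinations(range(n), 2))

for n in (4, 5, 6):
    print(n, bisection_width(*complete_graph(n)), (n // 2) * ((n + 1) // 2))
```

Both printed columns agree for each n, since every balanced split of a complete graph cuts the same number of edges.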


Ring

  • degree = 2

  • diameter = ⌊n/2⌋

    • slow for big networks

  • bisection width = 2

Static network: node i is linked with nodes i+1 and i-1 modulo n.

  • Examples: FDDI, SCI, Fibre Channel Arbitrated Loop, KSR1


d-dimensional grid

For n nodes in d dimensions (n^(1/d) nodes per dimension)

  • degree = 2d (interior nodes)

  • diameter = d (n^(1/d) - 1)

  • bisection width = (n^(1/d))^(d-1)

  • Examples: Cray T3D and T3E

Static network


Crossbar

  • fast and expensive (n^2 switches)

  • Mostly used to connect processors with memory modules

  • degree = 1

  • diameter = 2

  • bisection width = n/2

    • Examples: 4x4, 8x8, 16x16

Dynamic network


Hypercube (1)

Hamming distance = number of bits in which the binary representations of two numbers differ

Two nodes are connected if their Hamming distance is 1

Routing from x to y proceeds by decreasing the Hamming distance with each hop

Static network
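
Routing by decreasing Hamming distance can be sketched as follows. The order in which differing bits are corrected (lowest bit first) is an assumption made here for illustration; the slides do not fix an order, and the function names are hypothetical.

```python
def hamming_distance(x, y):
    """Number of bit positions in which x and y differ."""
    return bin(x ^ y).count("1")

def hypercube_route(src, dst):
    """Flip the differing bits one at a time; each flip is one hop."""
    path = [src]
    cur = src
    diff = src ^ dst
    bit = 0
    while diff:
        if diff & 1:
            cur ^= 1 << bit      # move to the neighbour that differs in this bit
            path.append(cur)
        diff >>= 1
        bit += 1
    return path

print([format(v, "04b") for v in hypercube_route(0b0000, 0b0111)])
# ['0000', '0001', '0011', '0111']
```

The number of hops equals the Hamming distance between source and destination, so no route is longer than the number of dimensions.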


Hypercube (2)

k dimensions, n = 2^k nodes

  • degree = k

  • diameter = k

  • bisection width = n/2

    • Two (k-1)-hypercubes are linked through n/2 edges to form a k-hypercube

  • Examples: Intel iPSC/860, SGI Origin 2000


Omega-Network (1)

  • Building block: 2x2 shuffle

  • Perfect shuffle: target = cyclic left shift of the source address

[Figure: perfect shuffle connecting inputs 000-111 to outputs 000-111]


Omega-Network (2)

  • log2 n levels of 2x2 shuffle building blocks

  • dynamic network

Level i looks at bit i of the destination address: if it is 0, the message goes up; if it is 1, it goes down.

Example: 100 sending to 110.
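
The routing rule can be traced in a few lines. This is a sketch assuming that level i inspects destination bit i counted from the most significant bit, and that "up" and "down" select the even and odd output line of a 2x2 block; both conventions and the function names are assumptions, not taken from the slides.

```python
def perfect_shuffle(addr, k):
    """Cyclic left shift of a k-bit address."""
    msb = (addr >> (k - 1)) & 1
    return ((addr << 1) & ((1 << k) - 1)) | msb

def omega_route(src, dst, k):
    """Return the line position after each of the k levels."""
    pos = src
    trace = []
    for i in range(k):
        pos = perfect_shuffle(pos, k)      # wiring in front of level i
        bit = (dst >> (k - 1 - i)) & 1     # destination bit i, MSB first
        pos = (pos & ~1) | bit             # 0: upper output, 1: lower output
        trace.append(pos)
    return trace

print([format(p, "03b") for p in omega_route(0b100, 0b110, 3)])
# ['001', '011', '110']
```

After log2 n levels every bit of the position has been replaced by a destination bit, so the message arrives at the destination regardless of where it started.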


Omega-Network (3)

n nodes, (n/2) log2 n building blocks

  • degree = 2 for nodes, 4 for building blocks

  • diameter = log2 n

  • bisection width = n/2

    • for a random permutation, n/2 messages are expected to cross the network in parallel

    • Extremes

      • If all nodes want to send to node 0, only one message gets through at a time

      • If each node sends a message to itself, n messages cross in parallel


Fat Tree /Clos-Network (1)

  • Nodes = leaves of a tree

  • Tree has the diameter 2 log2 n

    • from the leftmost leaf over the root to the rightmost leaf

  • Simple tree has bisection width = 1

    • bottleneck

  • Fat Tree:

    • Edges at level i have twice the capacity of edges at level i-1

    • At level i, expensive switches with 2^i inputs and 2^i outputs

    • Known as Clos networks


Fat Tree/Clos-Network (2)

  • Routing:

    • Direct way over the lowest common parent

    • When alternative exists, choose randomly.

    • Tolerance to node failure

  • diameter 2 log2 n, bisection width: n

  • Example: CM-5
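
Routing over the lowest common parent can be illustrated with a hop count. This is a minimal sketch assuming a complete binary tree whose n = 2^k leaves are numbered 0..n-1 from left to right; real fat trees may have a higher branching factor, and `tree_hops` is a hypothetical helper.

```python
def tree_hops(a, b):
    """Hops between leaves a and b when routing via the lowest common parent."""
    # The highest differing address bit determines how many levels the
    # message must climb before the two subtrees meet.
    levels_up = (a ^ b).bit_length()
    return 2 * levels_up                 # climb up, then descend the same number of levels

n = 8
print(tree_hops(2, 3))        # siblings: 2 hops
print(tree_hops(0, n - 1))    # leftmost to rightmost leaf: 6 hops = 2 * log2(8)
```

The worst case is reached between the leftmost and the rightmost leaf, which reproduces the diameter 2 log2 n.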


    Switching

    • How a message traverses the network from one node to the other

    • Circuit switching

      • One path from source to destination is established

      • All packets take that path

      • Like the telephone system

    • Packet switching

      • A message is broken into a sequence of packets, which can be sent across different routes

      • Better utilization of network resources


    Packet Routing

    • There are two basic approaches to routing packets, based on what a switch does when the packet begins arriving

    • Store-and-forward

    • Cut-through

      • Virtual cut-through

      • Wormhole


    Packet routing: Store-and-Forward

    • A packet is completely stored at a switch before being forwarded

    • The packet resides on at most two nodes at any time

    • Problem: switches need a lot of memory for storing the incoming packets

    • Switching takes place step by step, so the danger of blocking is small


    Packet routing: Cut through

    • A packet may come partially into the switch and leave its tail on other nodes

      • It may reside on more than 2 switches

    • The decision to forward the packet may be taken right away

    • What to do with the rest of the packet if the head blocks?

      • Virtual cut-through: gather the tail where the head is

        • It degenerates into store-and-forward under high contention

      • Wormhole: if the head blocks, the whole “worm” blocks in place


    Store-and-Forward vs. Cut-Through Routing

    h(n/b + D) vs. n/b + hD

    h: number of hops

    n: message size

    b: bandwidth

    D: routing delay per hop
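
The two expressions can be compared numerically. The sketch below uses made-up values for n, b, h and D, in arbitrary but consistent units; the function names are illustrative.

```python
def store_and_forward(n, b, h, delay):
    """Every hop receives the whole message before forwarding it: h(n/b + D)."""
    return h * (n / b + delay)

def cut_through(n, b, h, delay):
    """The message is pipelined across the hops: n/b + hD."""
    return n / b + h * delay

# Hypothetical values: 1000-byte message, 100 bytes per time unit,
# 5 hops, 2 time units of routing delay per hop.
print(store_and_forward(1000, 100, 5, 2))   # 60.0
print(cut_through(1000, 100, 5, 2))         # 20.0
```

With store-and-forward the transfer time n/b is paid once per hop, whereas with cut-through it is paid only once, so the gap grows with the number of hops and the message size.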


    Routing Algorithm

    • How do I know where a packet should go?

      • Topology does NOT determine routing

    • Routing algorithms

      • Arithmetic

      • Source-based

      • Table lookup

      • Adaptive—route based on network state (e.g., contention)


    (1) Arithmetic Routing

    • For regular topology, use simple arithmetic to determine route

    • E.g., 3D torus xy-routing

      • Packet header contains a signed offset to the destination (per dimension)

      • At each hop, the switch increments or decrements the offset to reduce it in one dimension

      • When the offsets reach x == 0 and y == 0 (and z == 0 in 3D), the packet is at the correct processor

    • Drawbacks

      • Requires ALU in switch

      • Must re-compute CRC at each hop
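
The offset-based scheme can be sketched as follows. The code assumes a torus with k nodes per dimension, resolves the dimensions in a fixed order (x first), and picks the shorter way around each ring; these choices and all names are assumptions for illustration only.

```python
def signed_offset(src, dst, k):
    """Shortest signed offset from src to dst on a ring of k nodes."""
    d = (dst - src) % k
    return d - k if d > k // 2 else d

def route(src, dst, k):
    """List of visited coordinates; the header holds one signed offset per dimension."""
    pos = list(src)
    header = [signed_offset(s, t, k) for s, t in zip(src, dst)]
    path = [tuple(pos)]
    for dim in range(len(header)):
        while header[dim] != 0:
            step = 1 if header[dim] > 0 else -1
            pos[dim] = (pos[dim] + step) % k   # move one hop in this dimension
            header[dim] -= step                # the switch rewrites the header
            path.append(tuple(pos))
    return path

print(route((0, 0), (3, 1), 4))   # [(0, 0), (3, 0), (3, 1)]
```

Because the header changes at every hop, any checksum covering it has to be recomputed each time, which is the CRC drawback listed above.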



    (2) Source Based & (3) Table Lookup Routing

    Source Based

    • Source specifies output port for each switch in route

    • Very simple switches

      • No control state

      • Strip output port off header

    • Myrinet uses this

    • Can’t be made adaptive

    Table Lookup

    • Very small header: contains a field that is an index into a table for the output port

    • Big tables, must be kept up to date


    Deterministic vs. Adaptive Routing

    • Deterministic—follows a pre-specified route

      • k-ary d-cube: dimension-order routing

        • (x1, y1) → (x2, y2)

        • First Δx = x2 - x1,

        • Then Δy = y2 - y1

      • Tree: common ancestor

    • Adaptive—route determined by contention for output port


    (4) Adaptive Routing

    • Essential for fault tolerance

      • At least multipath

    • Can improve utilization of the network

    • Simple deterministic algorithms easily run into bad permutations


    Contention

    • Two packets trying to use the same link at same time

      • limited buffering

      • drop?

    • Most parallel machine networks block in place

      • Traffic may back up toward the source

      • Tree saturation: the backlog spreads from the congested link all the way back toward the sources

        • Discard packets and inform the source about it


    Communication Perf: Latency

    • Time(n) from source to destination = overhead + routing delay + channel occupancy + contention delay

      • Overhead: time necessary for initiating the sending and reception of a message

      • occupancy = (n + ne) / b

        • n: data (payload) size

        • ne: packet envelope size

      • Routing delay

      • Contention
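
The latency model above can be written as a small function. This is a sketch with hypothetical parameter names and made-up example values; all times are assumed to be in the same unit and sizes in bytes.

```python
def message_latency(n, n_e, b, overhead, routing_delay, contention=0.0):
    """Time(n) = overhead + routing delay + channel occupancy + contention delay."""
    occupancy = (n + n_e) / b          # payload plus envelope over the channel bandwidth
    return overhead + routing_delay + occupancy + contention

# Hypothetical example: 1000-byte payload, 24-byte envelope, 128 bytes per time unit.
print(message_latency(n=1000, n_e=24, b=128, overhead=2.0, routing_delay=1.0))   # 11.0
```

Contention defaults to zero here only because it depends on the traffic pattern and cannot be derived from the message alone.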


    Bandwidth

    • What affects local bandwidth?

      • packet density: b x n / (n + ne)

      • routing delay: b x n / (n + ne + wD)

        D: number of cycles spent waiting for a routing decision

        w: width of the channel

      • contention

        • endpoints

        • within the network

    • Aggregate bandwidth

      • bisection bandwidth

        • sum of the bandwidths of the smallest set of links that, when cut, partition the network into two halves

        • A poor measure if communication is not uniformly distributed

      • total bandwidth of all the channels
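
The two local bandwidth expressions can be evaluated directly. The sketch below uses made-up numbers and a hypothetical function name; setting the routing delay to zero yields the packet density term alone.

```python
def effective_bandwidth(b, n, n_e, w=0, delay_cycles=0):
    """b x n / (n + ne + wD): bandwidth left for payload after envelope and routing delay."""
    return b * n / (n + n_e + w * delay_cycles)

# Hypothetical link: raw bandwidth 100, 900-byte payload, 100-byte envelope.
print(effective_bandwidth(100, 900, 100))                       # 90.0 (packet density only)
print(effective_bandwidth(100, 900, 100, w=2, delay_cycles=50))  # lower: routing delay included
```

Larger payloads amortize both the envelope and the routing delay, which is why small messages see much less than the raw link bandwidth.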


    Interconnects


    Myrinet

    • Offered bandwidth 2+2 Gbit/s, full duplex

    • 5-7 µs latency

    • Arbitrary Topology, Fat Tree/Clos-Network preferable

    • Routing: Wormhole, Source Routing

    • Cable (8+1 Bit parallel) or fiber optics

    • Flow-control on each link

    • Adaptor

      • programmable RISC processor, 333 MHz

      • PCI/PCI-X connection, up to 133 MHz, 64-bit

      • 8 Gbit/s over the PCI-X bus, unidirectional

      • 2 MB SRAM


    Myrinet Fat Tree (128 nodes), built from 16x16 crossbars


    Myrinet PCI-Bus-Adaptor

    • Block diagram: cable connector, network interface, Net-DMA, 2 MB SRAM, Host-DMA, PCI bridge, LANai CPU

    • 2 MB SRAM

    • PCI(-X) bridge, 64-bit, 66-133 MHz

    • LANai RISC, 333 MHz

    • 2 fiber-optic connectors, both duplex


    Myrinet 16x16 crossbar

    • 8 computers connected on the front side (2 channels)

    • On the back side, 8 outputs (2 channels) toward the next level of the Clos network

    • 32x32, two


    128-node Clos network

    Building block from earlier


    Myrinet 256+256-Clos-Network

    • Routing network with bisection width 256

    • Front side: 256 computer connections

    • Back side: 256 connections to the next-level routing units


    Clos-Network with full bisection width: 64 nodes and 32 nodes
