Computer Architecture II



Network topologies


Plan for today: Scalable interconnection networks

  • Basic concepts, definitions

  • Topologies

  • Switching

  • Routing

  • Performance


Outline

  • Basic concepts, definitions

  • Topologies

  • Switching

  • Routing

  • Performance


Formalism

  • Graph G=(V,E)

    V : switches and nodes

    E: communication channels (edges), E ⊆ V × V

  • Route: a path (v_0, ..., v_k) of length k between nodes v_0 and v_k, where (v_i, v_{i+1}) ∈ E

  • Routing distance

  • Diameter: the maximum, over all pairs of nodes, of the shortest route length between them

  • Average distance

  • Degree: number of input (output) channels of a node

  • Bisection width: minimum number of channels that must be cut to split the network into two equal halves; it limits how many connections can run in parallel between the halves
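The definitions above can be tried out on a small graph. The following is a minimal sketch, assuming an undirected network stored as an adjacency dictionary; the function names `bfs_distances` and `network_metrics` and the 6-node ring are illustrative choices, not part of the slides.

```python
from collections import deque

def bfs_distances(adj, start):
    """Hop counts of the shortest routes from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def network_metrics(adj):
    """Return (diameter, average distance, maximum degree) of a connected graph."""
    nodes = list(adj)
    # Routing distances between all ordered pairs of distinct nodes
    all_dist = [d for u in nodes
                for v, d in bfs_distances(adj, u).items() if v != u]
    diameter = max(all_dist)
    average = sum(all_dist) / len(all_dist)
    degree = max(len(adj[u]) for u in nodes)
    return diameter, average, degree

# Hypothetical example network: a ring of 6 nodes
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(network_metrics(ring))
```

For the 6-node ring this reports a diameter of 3, an average distance of 1.8 and a degree of 2.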


What characterizes a network?

  • Bandwidth (offered bandwidth): b = w · f

    • where w is the channel width (in bytes) and f = 1/t is the signaling rate (in Hz)

  • Latency

    • Time a message takes to travel between two nodes

  • Throughput (delivered bandwidth)

    • How much of the offered bandwidth is effectively used


What characterizes a network?

  • Topology

    • physical interconnection structure of the network graph

  • Routing Algorithm

    • restricts the set of paths that messages may follow

    • many algorithms with different properties

  • Switching Strategy

    • how data in a message traverses a route

    • circuit switching vs. packet switching

  • Flow Control Mechanism

    • what happens when a message, or portions of it, encounters traffic along its route?


Goals

  • Latency as small as possible

  • High Throughput

  • As many concurrent transfers as possible

    • Bisection width gives the potential number of parallel connections

  • Cost as low as possible


Bus (e.g. Ethernet)

  • Degree = 1

  • diameter = 1

    • No routing necessary

  • bisection width = 1

    CSMA/CD protocol limits the bus length

Simplest and cheapest dynamic network


Complete graph

  • degree = n-1

    too expensive for big networks

  • diameter = 1

  • bisection width = ⌊n/2⌋ · ⌈n/2⌉

Static network: a connection between each pair of nodes.

When the network is cut into two halves, each node in one half is connected to the n/2 nodes in the other half, and there are n/2 such nodes.
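For small networks the bisection width can be checked by brute force. The sketch below is only an illustration under the assumption that the two halves have sizes ⌊n/2⌋ and ⌈n/2⌉; the function name `bisection_width` is hypothetical, and the exhaustive search is only practical for a handful of nodes.

```python
from itertools import combinations

def bisection_width(nodes, edges):
    """Minimum number of edges crossing any split into halves of
    size floor(n/2) and ceil(n/2), found by exhaustive search."""
    nodes = list(nodes)
    best = None
    for half in combinations(nodes, len(nodes) // 2):
        half = set(half)
        # An edge crosses the cut if exactly one endpoint is in `half`
        crossing = sum(1 for u, v in edges if (u in half) != (v in half))
        best = crossing if best is None else min(best, crossing)
    return best

n = 5
complete = list(combinations(range(n), 2))   # every pair of nodes is linked
print(bisection_width(range(n), complete))   # floor(5/2) * ceil(5/2) = 6
```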


Ring

  • degree= 2

  • diameter = n/2

    slow for big networks

  • bisection width = 2

Static network: node i is linked with nodes i+1 and i-1 modulo n.

  • Examples: FDDI, SCI, Fibre Channel Arbitrated Loop, KSR1
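The modulo rule for ring neighbours and the resulting diameter can be sketched as follows. The function names are illustrative, and the network size n = 8 is an arbitrary assumption for the example.

```python
def ring_neighbors(i, n):
    """Node i is linked with nodes i-1 and i+1 modulo n."""
    return ((i - 1) % n, (i + 1) % n)

def ring_distance(i, j, n):
    """Shortest number of hops between i and j, going either way round."""
    clockwise = (j - i) % n
    return min(clockwise, n - clockwise)

n = 8
diameter = max(ring_distance(0, j, n) for j in range(n))
print(ring_neighbors(0, n), diameter)
```

With 8 nodes, node 0 has neighbours 7 and 1, and the farthest node is 4 hops away, which matches n/2.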


d-dimensional grid

Examples: Cray T3D and T3E

[Figure: 3 x 3 grid with nodes labelled (1,1) to (3,3)]

For d dimensions

  • degree = 2d for interior nodes (two links per dimension)

  • diameter = d (dn –1)

  • bisection width = (dn) d–1


Static network
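The grid formulas are easier to read when written in terms of the number of nodes per dimension. The sketch below assumes a grid without wrap-around links and k nodes along each of the d dimensions, so that n = k^d; the parameter name `k` and the function name are introduced here for illustration only.

```python
def grid_properties(k, d):
    """Properties of a d-dimensional grid with k nodes per dimension
    (no wrap-around links), so the network has n = k**d nodes."""
    n = k ** d
    diameter = d * (k - 1)            # from one corner to the opposite corner
    bisection_width = k ** (d - 1)    # links cut by a plane through the middle (k even)
    return n, diameter, bisection_width

print(grid_properties(4, 2))   # 4 x 4 grid
print(grid_properties(2, 3))   # 2 x 2 x 2 cube
```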


Crossbar

  • fast and expensive (n² switches)

  • Typical use: connecting processors to memory modules

  • degree= 1

  • diameter = 2

  • bisection width = n/2

    Ex: 4x4, 8x8, 16x16


Dynamic network


Hypercube (1)

Hamming distance = number of bits in which the binary representations of two numbers differ

Two nodes are connected if their Hamming distance is 1

Routing from x to y proceeds by decreasing the Hamming distance at each hop


Static network
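Routing by decreasing the Hamming distance can be sketched as below. Correcting the differing bits from the lowest to the highest dimension is one possible order, chosen here as an assumption; the slides do not prescribe one. The function names are illustrative.

```python
def hamming_distance(x, y):
    """Number of bit positions in which x and y differ."""
    return bin(x ^ y).count("1")

def hypercube_route(src, dst, k):
    """Route in a k-dimensional hypercube by flipping one differing
    bit per hop, from the lowest dimension to the highest."""
    path = [src]
    current = src
    for bit in range(k):
        mask = 1 << bit
        if (current ^ dst) & mask:
            current ^= mask          # move along this dimension
            path.append(current)
    return path

route = hypercube_route(0b0000, 0b0110, 4)
print([format(v, "04b") for v in route])
```

Each hop reduces the Hamming distance to the destination by one, so the number of hops equals the Hamming distance between source and destination.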


Hypercube (2)

k dimensions, n = 2^k nodes

  • degree = k

  • diameter = k

  • bisection width = n/2

    Two (k-1)-hypercubes are linked through n/2 edges to form a k-hypercube

Examples: Intel iPSC/860, SGI Origin 2000


Omega-Network (1)

  • Building block: 2x2 Shuffle

  • Perfect shuffle: target = cyclic left shift of the source address
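The perfect shuffle as a cyclic left shift can be written in a few lines. This is a minimal sketch for k-bit addresses; the function name `perfect_shuffle` is an illustrative choice.

```python
def perfect_shuffle(address, k):
    """Cyclic left shift of a k-bit address: the most significant
    bit wraps around to become the least significant bit."""
    msb = (address >> (k - 1)) & 1
    return ((address << 1) & ((1 << k) - 1)) | msb

# Shuffle targets for the 8 inputs of a 3-bit network
for a in range(8):
    print(format(a, "03b"), "->", format(perfect_shuffle(a, 3), "03b"))
```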


Omega-Network (2)

  • log₂ n levels of 2x2 shuffle building blocks

  • dynamic network

Level i looks at bit i of the destination address: if the bit is 0, the message goes up; if it is 1, it goes down. See the example of node 100 sending to node 110.
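The up/down rule can be traced for the 100 to 110 example. The sketch below assumes that each level first applies a perfect shuffle (cyclic left shift) to the line number and that bits of the destination are examined from the most significant one; the bit order and the function name `omega_route` are assumptions, since the figure is not available in the transcript.

```python
def omega_route(src, dst, k):
    """Trace a message through the k levels of an omega network with
    2**k inputs. Returns (level, direction, line) for every level."""
    mask = (1 << k) - 1
    position = src
    trace = []
    for level in range(k):
        # Perfect shuffle in front of the level: cyclic left shift
        position = ((position << 1) & mask) | (position >> (k - 1))
        # The 2x2 block picks its upper (0) or lower (1) output
        bit = (dst >> (k - 1 - level)) & 1
        position = (position & ~1) | bit
        trace.append((level, "down" if bit else "up", position))
    return trace

for level, direction, line in omega_route(0b100, 0b110, 3):
    print(level, direction, format(line, "03b"))
```

After the last level the line number equals the destination address, whatever the source was.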


Omega-Network (3)

n nodes, (n/2) log₂ n building blocks

  • degree = 2 for nodes, 4 for building blocks

  • diameter = log₂ n

  • bisection width = n/2

    • for a random permutation, n/2 messages are expected to cross the network in parallel

    • Extremes

      • If all nodes want to send to node 0, only one message is delivered at a time

      • If each node sends a message to itself, n messages travel in parallel


Fat Tree/Clos-Network (1)

  • Nodes = leaves of a tree

  • The tree has diameter 2 log₂ n

    (from the leftmost leaf over the root to the rightmost leaf)

  • A simple tree has bisection width = 1

    The root is a bottleneck

  • Fat Tree:

    • Edges at level i have twice the capacity of edges at level i-1

    • At level i, expensive switches with 2^i inputs and 2^i outputs

    • Known as Clos networks


Fat Tree/Clos-Network (2)

  • Routing:

    • Direct way over the lowest common parent

    • When alternative exists, choose randomly.

    • Tolerance to node failure

  • diameter: 2 log₂ n, bisection width: n

  • Example: CM-5
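Routing over the lowest common parent can be illustrated with a hop count. The sketch assumes a binary tree whose n = 2^k leaves are numbered 0 to n-1 from left to right; this numbering and the function name `tree_hops` are assumptions made for the example.

```python
def tree_hops(a, b):
    """Hops between leaves a and b of a binary tree when the message
    climbs to the lowest common parent and descends again."""
    # The highest differing address bit tells how many levels to climb
    levels_up = (a ^ b).bit_length()
    return 2 * levels_up

n = 8
print(tree_hops(0, 1))       # siblings: up one level, down one level
print(tree_hops(0, n - 1))   # leftmost to rightmost leaf, over the root
```

For n = 8 the leftmost and rightmost leaves are 6 hops apart, in line with the diameter 2 log₂ n.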


    Switching

    • How a message traverses the network from one node to the other

    • Circuit switching

      • One path from source to destination is established

      • All packets take that path

      • Like the telephone system

    • Packet switching

      • A message broken into a sequence of packets which can be sent across different routes

      • Better utilization of network resources


    Packet Routing

    • There are two basic approaches to routing packets, based on what a switch does when the packet begins arriving

    • Store-and-forward

    • Cut-through

      • Virtual cut-through

      • Wormhole



    Packet routing: Store-and-Forward

    • A packet is completely stored at a switch before being forwarded

    • The packet is always on at most two nodes

    • Problem: switches need a lot of memory to store incoming packets

    • Switching takes place step by step, so the danger of blocking is small


    Packet routing: Cut-through

    • A packet may come partially into the switch and leave its tail on other nodes

      • It may reside on more than 2 switches

    • The decision to forward the packet may be taken right away

    • What to do with the rest of the packet if the head blocks?

      • Virtual cut-through: gather the tail where the head is

        • It degenerates into store-and-forward under high contention

      • Wormhole: if the head blocks, the whole “worm” blocks


    Store-and-Forward vs. Cut-Through Routing

    Latency: h (n/b + D) for store-and-forward vs. n/b + h D for cut-through

    • h: number of hops

    • n: message size

    • b: bandwidth

    • D: routing delay per hop
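The two latency expressions can be compared numerically. The values below (1000-byte message, 100 bytes per time unit, 5 hops, 2 time units of routing delay per hop) are made-up figures for illustration, and the function names are hypothetical.

```python
def store_and_forward_time(n, b, h, delay):
    """Every hop receives the whole message before forwarding: h (n/b + D)."""
    return h * (n / b + delay)

def cut_through_time(n, b, h, delay):
    """The message is pipelined across the hops: n/b + h D."""
    return n / b + h * delay

n, b, h, delay = 1000, 100, 5, 2   # illustrative values only
print(store_and_forward_time(n, b, h, delay))   # 60.0
print(cut_through_time(n, b, h, delay))         # 20.0
```

With these numbers the transfer time n/b is paid once under cut-through but once per hop under store-and-forward, which is where the gap comes from.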


    Routing Algorithm

    • How do I know where a packet should go?

      • Topology does NOT determine routing

    • Routing algorithms

      • Arithmetic

      • Source-based

      • Table lookup

      • Adaptive—route based on network state (e.g., contention)



    (1) Arithmetic Routing

    • For regular topology, use simple arithmetic to determine route

    • E.g., 3D Torus xy-routing

      • Packet header contains signed offset to destination (per dimension)

      • At each hop, the switch increments or decrements (+/-) the offset to reduce it in one dimension

      • When all offsets are 0 (e.g. x == 0 and y == 0), the packet is at the correct processor

    • Drawbacks

      • Requires ALU in switch

      • Must re-compute CRC at each hop
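The offset-based scheme can be sketched for a torus as follows. This is a simplified model under stated assumptions: every dimension has the same size, the header carries one signed offset per dimension, and dimensions are corrected one after another. The function names are illustrative.

```python
def signed_offset(src, dst, size):
    """Shortest signed offset from src to dst on a ring of `size` nodes
    (negative values mean travelling in the decreasing direction)."""
    diff = (dst - src) % size
    return diff if diff <= size // 2 else diff - size

def route_by_offsets(src, dst, size):
    """Simulate the hops: each switch moves the packet one step and
    adjusts the header offset toward zero."""
    offsets = [signed_offset(s, d, size) for s, d in zip(src, dst)]
    position = list(src)
    hops = 0
    for dim in range(len(offsets)):
        while offsets[dim] != 0:
            step = 1 if offsets[dim] > 0 else -1
            position[dim] = (position[dim] + step) % size
            offsets[dim] -= step      # header is rewritten at every hop
            hops += 1
    return tuple(position), hops

print(route_by_offsets((0, 0, 0), (3, 1, 0), size=4))
```

In a 4 x 4 x 4 torus the x offset from 0 to 3 is -1 thanks to the wrap-around link, so the packet arrives after 2 hops. The header changing at every hop is also why the CRC has to be recomputed.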


    (2) Source Based & (3) Table Lookup Routing

    Source Based

    • Source specifies output port for each switch in route

    • Very simple switches

      • No control state

      • Strip output port off header

    • Myrinet uses this

    • Can’t be made adaptive

    Table Lookup

    • Very small header: contains a field that is an index into a table giving the output port

    • Big tables, must be kept up-to-date
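The source-based variant can be sketched with a toy network. The switch names, port numbers and the `links` dictionary below are entirely hypothetical; the point is only that each switch reads the first port in the header and strips it off.

```python
# Hypothetical network: links[switch][output_port] -> next switch or host
links = {
    "S1": {0: "hostA", 1: "S2"},
    "S2": {0: "S1", 1: "hostB"},
}

def deliver(first_switch, ports, payload):
    """Source routing: the sender lists one output port per switch.
    A switch keeps no routing state; it only strips its port."""
    node = first_switch
    header = list(ports)
    while node in links:              # still inside the switch fabric
        port = header.pop(0)          # strip output port off the header
        node = links[node][port]
    return node, header, payload

print(deliver("S1", [1, 1], "hello"))
```

The header is empty on arrival because the route listed exactly one port per switch traversed. A table-lookup switch would instead keep a mapping such as `links` extended with destination indices, which is the state that has to be kept up to date.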



    Deterministic vs. Adaptive Routing

    • Deterministic—follows a pre-specified route

      • k-ary d-cube: dimension-order routing

        • (x1, y1) → (x2, y2)

        • First correct Δx = x2 - x1

        • Then correct Δy = y2 - y1

      • Tree: common ancestor

    • Adaptive—route determined by contention for output port
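Deterministic dimension-order routing can be sketched for a 2D mesh without wrap-around links. The function name and the coordinates are illustrative; the path is fully determined by source and destination, which is what distinguishes it from an adaptive scheme.

```python
def dimension_order_route(src, dst):
    """Visit nodes from src to dst on a 2D mesh: first remove the
    x difference completely, then the y difference."""
    (x, y), (x2, y2) = src, dst
    path = [(x, y)]
    while x != x2:                    # first dimension
        x += 1 if x2 > x else -1
        path.append((x, y))
    while y != y2:                    # second dimension
        y += 1 if y2 > y else -1
        path.append((x, y))
    return path

print(dimension_order_route((0, 0), (2, 1)))
```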



    (4) Adaptive Routing

    • Essential for fault tolerance

      • Requires at least multiple paths between source and destination

    • Can improve utilization of the network

    • Simple deterministic algorithms easily run into bad permutations


    Contention

    • Two packets trying to use the same link at same time

      • limited buffering

      • drop?

    • Most parallel-machine networks block in place

      • Traffic may back up toward the source

      • Tree saturation: traffic backs up along all the paths that lead toward a destination

        • Discard packets and inform the source


    Communication Perf: Latency

    • Time(n) from source s to destination d = overhead + routing delay + channel occupancy + contention delay

      • Overhead: time necessary for initiating the sending and reception of a message

      • Channel occupancy = (n + n_e) / b

        • n: data (payload) size

        • n_e: packet envelope size

      • Routing delay

      • Contention
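The latency sum can be evaluated directly. The numbers below are invented for illustration (payload and envelope in bytes, bandwidth in bytes per time unit, delays in the same time unit), and the function name `message_time` is hypothetical.

```python
def message_time(n, n_e, b, overhead, routing_delay, contention_delay=0.0):
    """Time(n) = overhead + routing delay + channel occupancy + contention delay."""
    occupancy = (n + n_e) / b         # payload plus envelope over bandwidth
    return overhead + routing_delay + occupancy + contention_delay

# 1000-byte payload, 24-byte envelope, 128 bytes per time unit
print(message_time(n=1000, n_e=24, b=128, overhead=5, routing_delay=2))   # 15.0
```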


    Bandwidth

    • What affects local bandwidth?

      • packet density: b × n / (n + n_e)

      • routing delay: b × n / (n + n_e + wD)

        D: number of cycles spent waiting for a routing decision

        w: width of the channel

      • contention

        • endpoints

        • within the network

    • Aggregate bandwidth

      • bisection bandwidth

        • sum of the bandwidths of the smallest set of links that partitions the network

        • A poor indicator if communication is not uniformly distributed

      • total bandwidth of all the channels
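The effect of packet density and routing delay on local bandwidth can be computed with one function. The values are illustrative assumptions, and the function name `effective_bandwidth` is not from the slides.

```python
def effective_bandwidth(b, n, n_e, w=0, delay_cycles=0):
    """Payload bandwidth b * n / (n + n_e + w * D). With w * D = 0 this
    is the packet-density term alone."""
    return b * n / (n + n_e + w * delay_cycles)

b, n, n_e = 100, 96, 4                 # illustrative values only
print(effective_bandwidth(b, n, n_e))                         # 96.0
print(effective_bandwidth(b, n, n_e, w=2, delay_cycles=10))   # 80.0
```

The routing delay acts like w · D extra bytes of envelope: the channel could have carried that much data while the packet waited for a routing decision.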


    Interconnects


    Myrinet

    • Offered bandwidth 2+2 Gbit/s, full duplex

    • 5-7 µs latency

    • Arbitrary Topology, Fat Tree/Clos-Network preferable

    • Routing: Wormhole, Source Routing

    • Cable (8+1 Bit parallel) or fiber optics

    • Flow-control on each link

    • Adaptor

      • programmable RISC processor, 333 MHz

      • PCI/PCI-X connection, up to 133 MHz, 64 bit

      • 8 Gbit/s over the PCI-X bus, unidirectional

      • 2 MB SRAM


    Myrinet Fat Tree (128 nodes)

    [Figure: 128-node fat tree; the labelled element is a 16x16 crossbar]


    Myrinet PCI-Bus-Adaptor

    [Figure: block diagram of the adaptor with cable connector, network interface, network DMA, 2 MB SRAM, host DMA, PCI bridge and LanAI CPU]

    • 2 MB SRAM

    • PCI(-X) bridge, 64 bit, 66-133 MHz

    • LanAI RISC processor, 333 MHz

    • 2 fiber-optic (LWL) connectors, both duplex


    Myrinet 16x16 crossbar

    • 8 computers connected on the front side (2 channels)

    • 8 outputs on the back side (2 channels) toward the next level of the Clos network

    • 32x32, two


    128-node Clos

    [Figure: 128-node Clos network assembled from the building block shown earlier]


    Myrinet 256+256-Clos-Network

    • Routing network with bisection width 256

    • Front side: 256 computer connections

    • Back side: 256 connections to the next-level routing units


    Clos-Network with full bisection width: 64 nodes and 32 nodes
