
Other Architectures & Examples

Multithreaded architectures

Dataflow architectures

Multiprocessor examples

1st May, 2006

Anshul Kumar, CSE IITD


Context switching

  • Delays and poor resource utilization due to -

    • Data/control hazards

    • cache misses

    • waiting for some event

  • Solution –

    • context switch to another thread

  • Context switch mechanism –

    • operating system - slow

    • hardware - fast
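The effect of switching to another thread on a stall can be illustrated with a toy simulation. This is only a sketch under assumed numbers: the function name `utilization`, the burst lengths and the switch costs are all hypothetical and do not come from the slides.

```python
def utilization(num_threads, run=4, stall=12, switch_cost=1, bursts=100):
    """Return useful cycles / total cycles for a toy switch-on-stall processor.

    Every thread repeats `bursts` times: `run` useful cycles followed by a
    stall of `stall` cycles (for example a cache miss). Changing to a
    different thread costs `switch_cost` cycles.
    """
    remaining = [bursts] * num_threads   # bursts left per thread
    ready_at = [0] * num_threads         # cycle at which each stall ends
    clock = useful = 0
    current = None                       # thread that ran most recently
    while any(remaining):
        ready = [t for t in range(num_threads)
                 if remaining[t] and ready_at[t] <= clock]
        if not ready:
            # Every unfinished thread is stalled: idle until one wakes up.
            clock = min(ready_at[t] for t in range(num_threads) if remaining[t])
            continue
        thread = ready[0]
        if current is not None and thread != current:
            clock += switch_cost         # pay for the context switch
        current = thread
        clock += run
        useful += run
        remaining[thread] -= 1
        ready_at[thread] = clock + stall
    return useful / clock


if __name__ == "__main__":
    print("1 thread:", round(utilization(1), 3))
    print("4 threads, free switch:", round(utilization(4, switch_cost=0), 3))
    print("4 threads, costly switch:", round(utilization(4, switch_cost=50), 3))
```

With these assumed numbers a single thread leaves the processor idle most of the time, four threads with a free (hardware-like) switch hide the stalls completely, and a very expensive (OS-like) switch makes things worse than not switching at all.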


Multithreaded architecture

  • Hardware context switching

  • Models

    • control flow or hybrid (control flow, data flow)

  • Granularity

    • fine grain or coarse grain

  • Memory organization

    • shared?, distributed?, cache coherent?

  • No. of threads

    • small, medium, large


ILP and Multithreading

[Figure: comparison of ILP, coarse-grained MT, fine-grained MT and SMT. Source: Hennessy and Patterson]


Chip level multithreading

Executing instructions from multiple threads within one processor chip at the same time.

  • Multithreading: Interleaved issue of multiple instructions from different threads

  • Simultaneous multithreading (SMT): Issue multiple instructions from multiple threads in one cycle.

  • Chip-level multiprocessing (CMP or multicore): integrate two or more superscalar processors into one chip, each executing one thread independently

  • Any combination of multithreading/SMT/CMP
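The difference between interleaved issue and SMT can be sketched by counting filled issue slots. This is a deliberately crude model, not a description of any real processor: the function name `issued_instructions`, the issue width and the per-cycle readiness table are hypothetical, and instructions that are not issued are simply dropped rather than carried over.

```python
def issued_instructions(ready, width, policy):
    """Count instructions issued over all cycles.

    ready[c][t] is the number of instructions thread t could issue in
    cycle c; `width` is the number of issue slots per cycle.
    """
    total = 0
    for cycle, per_thread in enumerate(ready):
        if policy == "interleaved":
            # One thread owns the whole cycle, chosen round-robin.
            thread = cycle % len(per_thread)
            total += min(width, per_thread[thread])
        elif policy == "smt":
            # All threads may share the slots of the same cycle.
            total += min(width, sum(per_thread))
        else:
            raise ValueError(f"unknown policy: {policy}")
    return total


if __name__ == "__main__":
    ready = [[2, 1], [0, 3], [1, 1]]   # 3 cycles, 2 threads
    print(issued_instructions(ready, 4, "interleaved"))
    print(issued_instructions(ready, 4, "smt"))
```

In this model SMT never fills fewer slots than interleaved issue, because a cycle is no longer limited to the instructions of a single thread.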

Wikipedia


Historical Examples

Machine             Granularity  Procs            Threads/proc        Memory               Year

HEP from Denelcor   fine         max 16           8 active, 64 max    shared, centralized  1978

Tera                fine         max 256          128                 distributed shared   1990

Alewife (MIT)       coarse       max 512 Sparcle  1 active, 3 loaded  CC                   1990


Modern examples

  • Pentium 4: Hyperthreading

  • MIPS MT: 8 cores with 4 threads each

  • IBM Power 5: dual core, 2 threads each

  • Ultrasparc T1: fine-grained multithreading


HEP

[Figure: HEP block diagram. Labels: control loop, 8-stage pipeline, scheduler function unit (SFU), PSW queue, program memory, matching unit, increment control, registers, operand fetch, function units FU1, FU2, ..., FUn, to/from data memory]


Control Flow & Data Flow models

  • Control Flow (von Neumann)

    • control flows through a sequence of instructions, branches can alter the flow

    • instructions get data from or put data in memory

    • explicit parallelism through control operators – fork/join

  • Data Flow

    • instructions are triggered by availability of data

    • data flows from instruction to instruction

    • parallelism is explicit in the data dependence graph


Dataflow Model

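[Figure: dataflow graph. Inputs A and B feed a subtract node producing A-B; B and the constant 1 feed an add node producing B+1; both results feed a multiply node, which yields:]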

R=(A-B)*(B+1)
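The data-driven firing rule can be sketched for this expression as follows. The graph encoding, the arc names and the function `run_dataflow` are illustrative assumptions; the code only shows that a node fires once all of its input tokens are present, and that nodes enabled in the same step are independent of each other.

```python
import operator

# Each node: (operation, names of its two input arcs, name of its output arc).
GRAPH = [
    (operator.sub, ("A", "B"), "A-B"),
    (operator.add, ("B", "1"), "B+1"),
    (operator.mul, ("A-B", "B+1"), "R"),
]


def run_dataflow(inputs):
    """Evaluate GRAPH by data-driven firing; return (R, firing waves)."""
    tokens = dict(inputs)      # arc name -> value, for arcs holding a token
    tokens["1"] = 1            # the constant operand
    pending = list(GRAPH)
    waves = []                 # nodes fired together in each step
    while pending:
        enabled = [n for n in pending if all(arc in tokens for arc in n[1])]
        if not enabled:
            raise ValueError("no node can fire: an input token is missing")
        # Compute the whole wave from the current tokens, then publish it;
        # the nodes in one wave do not depend on each other.
        results = {out: op(*(tokens[arc] for arc in ins))
                   for op, ins, out in enabled}
        tokens.update(results)
        waves.append(sorted(results))
        pending = [n for n in pending if n not in enabled]
    return tokens["R"], waves


if __name__ == "__main__":
    print(run_dataflow({"A": 7, "B": 3}))
```

For A=7 and B=3 the subtract and add nodes fire in the first wave and the multiply node in the second, giving R=16.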


Dataflow Program

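[Figure: dataflow program. L1: Compute B, result sent to L2/2 and L3/1. L2: - with operands A and B, result A-B sent to L4/1. L3: + with operands B and 1, result B+1 sent to L4/2. L4: * with operands A-B and B+1, result sent to L6/1.]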

R=(A-B)*(B+1)
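A minimal sketch of how such a program might execute, assuming the figure's labels read as follows: L1 computes B and sends it to L2/2 and L3/1, L2 subtracts (A-B, result to L4/1), L3 adds (B+1, result to L4/2), and L4 multiplies (result to L6/1). The template layout, the `L6` sink entry and the names `run_program` and `update` are hypothetical.

```python
import operator
from collections import deque


def run_program(a, b):
    """Run the labelled program; `b` stands for the value computed by L1."""
    # Instruction templates: opcode, operand slots, destinations (label, slot).
    store = {
        "L2": {"op": operator.sub, "slots": [a, None], "dest": [("L4", 1)]},
        "L3": {"op": operator.add, "slots": [None, 1], "dest": [("L4", 2)]},
        "L4": {"op": operator.mul, "slots": [None, None], "dest": [("L6", 1)]},
        "L6": {"op": None, "slots": [None], "dest": []},  # assumed sink for R
    }
    queue = deque()   # labels of instructions whose operands are all present

    def update(label, slot, value):
        """Store a result in a destination slot; enqueue it once enabled."""
        template = store[label]
        template["slots"][slot - 1] = value
        if template["op"] is not None and all(
                s is not None for s in template["slots"]):
            queue.append(label)

    # L1: Compute B, with destinations L2/2 and L3/1.
    update("L2", 2, b)
    update("L3", 1, b)

    trace = []
    while queue:
        label = queue.popleft()
        template = store[label]
        result = template["op"](*template["slots"])
        trace.append(label)
        for dest_label, dest_slot in template["dest"]:
            update(dest_label, dest_slot, result)
    return store["L6"]["slots"][0], trace


if __name__ == "__main__":
    print(run_program(7, 3))
```

Note that L2 and L3 become enabled as soon as B arrives, while L4 has to wait for both of its operand slots to be filled.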


Static Dataflow Architecture

[Figure: static dataflow processing element. Blocks: activity store, fetch unit, function units FU1, FU2, ..., FUn, instruction queue, update unit, with links to/from other PEs]


Tagged-token dataflow architecture

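[Figure: tagged-token dataflow processing element. Blocks: matching unit with matching store, instruction/data memory, fetch unit, function units FU1, FU2, ..., FUn, token queue, form-token unit, with links to/from other PEs]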
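The role of the matching unit and matching store can be sketched as below. The token fields (`context`, `instr`, `port`, `value`) and the class `MatchingUnit` are assumptions made for illustration; the sketch handles two-operand instructions only. A token waits in the matching store until its partner with the same tag arrives, and the matched pair is then ready to be passed on for fetching and execution.

```python
from collections import namedtuple

# The tag is (context, instr); port says which operand the token carries.
Token = namedtuple("Token", "context instr port value")


class MatchingUnit:
    """Pairs up tokens that carry the same tag (two-operand instructions)."""

    def __init__(self):
        self.store = {}   # matching store: tag -> token waiting for a partner

    def accept(self, token):
        """Return (context, instr, operand1, operand2) on a match, else None."""
        tag = (token.context, token.instr)
        partner = self.store.pop(tag, None)
        if partner is None:
            self.store[tag] = token          # first arrival: wait in the store
            return None
        first, second = sorted((partner, token), key=lambda t: t.port)
        return (token.context, token.instr, first.value, second.value)


if __name__ == "__main__":
    unit = MatchingUnit()
    # Tokens of two different contexts (e.g. loop iterations) arrive interleaved.
    for tok in [Token(0, "sub", 1, 7), Token(1, "sub", 2, 10),
                Token(0, "sub", 2, 3), Token(1, "sub", 1, 20)]:
        print(tok, "->", unit.accept(tok))
```

Because the context is part of the tag, tokens belonging to different activations of the same instruction are never paired with each other.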


UMA Examples

  • Earlier approach: large number of processors (e.g. Denelcor HEP, NYU Ultracomputer)

  • Now realized: good only for a small number of processors (e.g. Encore Multimax, 1980s; SGI Power Challenge, 1990s)


SGI Power Challenge

  • 18 MIPS R8000 processors

  • 16 GB RAM, 8-way interleaved

  • 4 POWERchannel-2 I/O buses, 320 MB/s each

  • POWERpath-2: split-transaction shared bus (256-bit data, 40-bit address)

  • Snoopy cache coherence protocol


NUMA Examples

  • BBN TC2000

  • IBM RP3

  • Hector

  • Cray T3D


Hector

  • Hierarchical Structure

    global ring

    local rings

    stations

    Proc module (P+C+M)

    I/O module


Hector

[Figure: Hector organization. Stations are attached to local rings, and the local rings are joined by a global ring. Inside a station, a station bus connects the station controller, several processor modules and an I/O module.]


Cray T3D

  • Alpha 21064 processors

  • Cray Y-MP host

  • up to 128 GB memory

  • 4x4x4 3D torus, configurable up to 8x8x8

  • 2 PEs in each node
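The torus figures above can be made concrete with a short sketch. The function names `torus_neighbors` and `pe_count` are hypothetical, and the PE totals simply multiply out the dimensions and the 2 PEs per node stated on the slide.

```python
def torus_neighbors(node, dims):
    """Return the distinct neighbours of `node` in a torus of size `dims`."""
    neighbours = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            coords = list(node)
            coords[axis] = (coords[axis] + step) % size   # wrap-around link
            neighbours.add(tuple(coords))
    neighbours.discard(tuple(node))   # an axis of size 1 would map to itself
    return sorted(neighbours)


def pe_count(dims, pes_per_node=2):
    """Total processing elements for a torus with `pes_per_node` PEs per node."""
    nodes = 1
    for size in dims:
        nodes *= size
    return nodes * pes_per_node


if __name__ == "__main__":
    print(torus_neighbors((0, 0, 0), (4, 4, 4)))
    print(pe_count((4, 4, 4)), pe_count((8, 8, 8)))
```

Every node of a 3D torus with at least three nodes per axis has six neighbours, including the wrap-around links at the edges; under the slide's figures a 4x4x4 configuration holds 128 PEs and an 8x8x8 configuration 1024.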


CC-NUMA examples

Machine              Nodes                          Mem             Cache               Net

Wisconsin Multicube  single proc                    per column bus  snoopy              bus grid

Aquarius Multimulti  single proc                    per node        snoopy + directory  bus grid

Stanford Dash        cluster: 4 R3000 + FPU on bus  per cluster     snoopy + directory  pair of meshes

Stanford Flash       single proc: T5 + magic chip   per node        directory           2D mesh

Convex Exemplar      hyper node: 8 PA-RISC          per hyper node  SCI                 crossbar (hyper node), multi rings

Magic chip: memory + I/O + network controller


COMA examples

  • DDM (Data Diffusion Machine)

    • single bus (split transaction)

    • can be made hierarchical

  • KSR 1

    • hierarchical rings

    • distributed directory is a matrix :

      rows for pages, columns for caches


Distr Mem Arch Examples

Machine: Comp. proc, Comm. proc, Vec. proc, Switch, Topology (empty cells omitted)

  • nCUBE2: custom, custom, hypercube

  • iPSC2: i386, yes, yes, hypercube

  • Intel Paragon: i860, i860, custom, 2D mesh

  • Genesis: i870, i870, custom, 2-level crossbar

  • Manna: i860, i860, 16x16 crossbar, hierarchical

  • Parsytec: PowerPC 601, T805, C004, 3D mesh

  • Transtech Paramid: i860, T805, C004, variable

  • IBM SP2: Power2, i860, custom, fat tree

  • Meiko C32: SPARC, custom, Fujitsu, custom, fat tree

  • Parsys SN9800: T900, T900, C104, hierarchical switch
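For the hypercube topology listed for nCUBE2 and iPSC2, node addressing can be sketched in a few lines. The function names are hypothetical, and the dimension-order routing shown is a standard textbook scheme used here for illustration; the slides do not say which routing these machines use.

```python
def hypercube_neighbors(node, dim):
    """Neighbours of `node` in a `dim`-dimensional hypercube.

    Nodes are numbered 0 .. 2**dim - 1; two nodes are linked when their
    binary addresses differ in exactly one bit.
    """
    return [node ^ (1 << k) for k in range(dim)]


def dimension_order_route(src, dst):
    """Path from src to dst, correcting differing bits from lowest to highest."""
    path = [src]
    current = src
    diff = src ^ dst
    bit = 0
    while diff:
        if diff & 1:
            current ^= 1 << bit      # cross the link in this dimension
            path.append(current)
        diff >>= 1
        bit += 1
    return path


if __name__ == "__main__":
    print(hypercube_neighbors(0, 3))
    print(dimension_order_route(0b000, 0b101))
```

The number of hops on such a path equals the number of bits in which the source and destination addresses differ, so it never exceeds the dimension of the cube.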


References

  • D. Sima, T. Fountain, P. Kacsuk, "Advanced Computer Architectures : A Design Space Approach", Addison Wesley, 1997.
