Parallelism (Multi-threaded)


Outline
  • A touch of theory
    • Amdahl’s Law
    • Metrics (speedup, efficiency, redundancy, utilization)
  • Trade-offs
    • Tightly coupled vs. Loosely coupled
      • Shared Distributed vs Shared Centralized
    • CMP vs SMT (and while we are at it: SSMT)
    • Heterogeneous vs. Homogeneous
    • Interconnection networks
  • Other important issues
    • Cache Coherency
    • Memory Consistency
Metrics
  • Speed-up: S(n) = T(1) / T(n)
  • Efficiency: E(n) = S(n) / n = T(1) / (n · T(n))
  • Utilization: U(n) = O(n) / (n · T(n))
  • Redundancy: R(n) = O(n) / O(1)

(T(n) is the execution time on n processors; O(n) is the total number of operations performed by n processors.)
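The four metrics, plus Amdahl's Law from the outline, can be sketched as a few one-line functions. This is an illustrative sketch using the standard definitions, not code from the slides; the example numbers are made up.

```python
# Standard parallel-performance metrics, assuming T(n) is wall-clock
# time on n processors and O(n) is the total operation count, with
# O(1) = T(1) when operations take unit time.

def speedup(t1, tn):
    return t1 / tn

def efficiency(t1, tn, n):
    return speedup(t1, tn) / n

def redundancy(o1, on):
    return on / o1

def utilization(on, tn, n):
    return on / (n * tn)

def amdahl(f, n):
    """Amdahl's Law: best speedup when a fraction f of the work parallelizes."""
    return 1.0 / ((1.0 - f) + f / n)

# Example: 100 time units on 1 processor, 25 on 8, 120 total operations.
t1, tn, n, on = 100, 25, 8, 120
print(speedup(t1, tn))        # 4.0
print(efficiency(t1, tn, n))  # 0.5
print(utilization(on, tn, n)) # 0.6
```

Note that utilization equals redundancy times efficiency (0.6 = 1.2 × 0.5 here), which is a quick consistency check on measured numbers.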
Tightly-coupled vs Loosely-coupled
  • Tightly coupled (i.e., Multiprocessor)
    • Shared memory
    • Each processor capable of doing work on its own
    • Easier for the software
    • Hardware has to worry about cache coherency, memory contention
  • Loosely-coupled (i.e., Multicomputer Network)
    • Message passing
    • Easier for the hardware
    • Programmer’s job is tougher
  • An early distributed-shared-memory example: Cm*
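The two programming models can be contrasted in a few lines. This is a minimal sketch, not from the slides: threads standing in for processors, a shared counter standing in for shared memory, and a queue standing in for a message-passing network.

```python
import threading
import queue

# Tightly coupled style: workers update one shared location. The
# software only needs a lock; keeping the shared location coherent
# is the hardware's problem.
total = 0
lock = threading.Lock()

def shared_worker(values):
    global total
    for v in values:
        with lock:
            total += v

# Loosely coupled style: workers own their data and communicate only
# by sending messages; nothing is shared, but the programmer must
# design the message protocol.
def message_worker(values, outbox):
    outbox.put(sum(values))

data = [list(range(10)), list(range(10, 20))]

threads = [threading.Thread(target=shared_worker, args=(d,)) for d in data]
for t in threads: t.start()
for t in threads: t.join()

outbox = queue.Queue()
threads = [threading.Thread(target=message_worker, args=(d, outbox)) for d in data]
for t in threads: t.start()
for t in threads: t.join()
msg_total = outbox.get() + outbox.get()

print(total, msg_total)  # both 190
```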
CMP vs SMT
  • CMP (chip multiprocessor)
    • aka Multi-core
  • SMT (simultaneous multithreading)
    • One core with a program counter (PC) for each thread, sharing the pipeline
    • Multithreading introduced by Burton Smith, HEP ~1975
    • “Simultaneous” first proposed by H. Hirata et al., ISCA 1992
    • Carried forward by Nemirovsky, HICSS 1994
    • Popularized by Tullsen, Eggers, ISCA 1995
Heterogeneous vs Homogeneous

[Figure: three single-chip floorplans compared. The “Tile-Large” approach tiles the die with a few large cores; the “Niagara” approach tiles it with many small Niagara-like cores; the ACMP approach is asymmetric, pairing one large core with many Niagara-like cores.]
Large core vs. Small Core
  • Large core
    • Out-of-order
    • Wide fetch, e.g. 4-wide
    • Deeper pipeline
    • Aggressive branch predictor (e.g. hybrid)
    • Many functional units
    • Trace cache
    • Memory dependence speculation
  • Small core
    • In-order
    • Narrow fetch, e.g. 2-wide
    • Shallow pipeline
    • Simple branch predictor (e.g. Gshare)
    • Few functional units
Interconnection Networks
  • Basic Topologies
    • Bus
    • Crossbar
    • Omega Network
    • Hypercube
    • Tree
    • Mesh
    • Ring
  • Parameters: Cost, Latency, Contention
  • Example: Omega network, N = 64, k = 2
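The slide's example can be worked out from the standard Omega-network structure: an N-input network built from k×k switches has log_k N stages of N/k switches each, so cost grows as (N/k)·log_k N and latency as log_k N switch delays. A quick check (a sketch, not from the slides):

```python
import math

def omega_cost(n, k):
    """Stages and total switch count for an N-input Omega network of k-by-k switches."""
    stages = round(math.log(n, k))  # log_k N stages
    per_stage = n // k              # N/k switches per stage
    return stages, stages * per_stage

stages, switches = omega_cost(64, 2)
print(stages, switches)  # 6 stages, 192 2x2 switches
```

Contention remains an issue because the Omega network is blocking: two requests can need the same switch output even when their destinations differ.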
Three issues
  • Interconnection networks
    • Cost
    • Latency
    • Contention
  • Cache Coherency
    • Snoopy
    • Directory
  • Memory Consistency
    • Sequential Consistency and Mutual Exclusion
Sequential Consistency
  • Definition: the execution behaves as if all loads/stores were interleaved into a single sequential order, with each processor’s operations appearing in program order; in short, memory sees loads/stores in program order
  • Examples
  • Why it is important: it guarantees that simple flag-based mutual exclusion works
Two Processors and a Critical Section

  Processor 1             Processor 2
  L1 = 0 (initially)      L2 = 0 (initially)
  A: L1 = 1               X: L2 = 1
  B: if L2 == 0           Y: if L1 == 0
     {critical section}      {critical section}
  C: L1 = 0               Z: L2 = 0

What can happen?
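One way to answer is by brute force. The sketch below (not from the slides) enumerates every sequentially consistent interleaving of the first two operations on each processor (A/X: write own flag, B/Y: read the other's flag) and checks whether both processors can ever read 0, i.e. both enter the critical section.

```python
from itertools import permutations

# A: L1 = 1, B: read L2, X: L2 = 1, Y: read L1
OPS = ("A", "B", "X", "Y")

both_entered = 0
for order in set(permutations(OPS)):
    # Sequential consistency: each processor's program order is preserved.
    if order.index("A") > order.index("B") or order.index("X") > order.index("Y"):
        continue
    l1 = l2 = 0
    reads = {}
    for op in order:
        if op == "A":   l1 = 1
        elif op == "X": l2 = 1
        elif op == "B": reads["B"] = l2
        elif op == "Y": reads["Y"] = l1
    if reads["B"] == 0 and reads["Y"] == 0:
        both_entered += 1

print(both_entered)  # 0
```

Under sequential consistency at least one processor always sees the other's flag set, so mutual exclusion holds; on hardware that reorders the store past the load (as store buffers do), both reads can return 0 and both processors can enter the critical section.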

Cache Coherence
  • Definition: if a cache line is present in more than one cache, it has the same values in all caches
  • Why is it important?
  • Snoopy schemes
    • All caches snoop the common bus
  • Directory schemes
    • Each cache line has a directory entry in memory
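The snoopy idea can be sketched as a per-line state machine. Below is a minimal MSI protocol (Modified/Shared/Invalid), a common snoopy scheme assumed here for illustration rather than taken from the slides, written as a transition table over processor-side and snooped bus-side events.

```python
# Minimal MSI snoopy-coherence sketch. Each cache line is M (modified),
# S (shared), or I (invalid). Processor events: PrRd, PrWr. Snooped bus
# events: BusRd (another cache reads), BusRdX (another cache wants
# exclusive access in order to write).
TRANSITIONS = {
    ("I", "PrRd"):   "S",  # read miss: fetch line, issue BusRd
    ("I", "PrWr"):   "M",  # write miss: fetch exclusive, issue BusRdX
    ("S", "PrRd"):   "S",  # read hit
    ("S", "PrWr"):   "M",  # upgrade: issue BusRdX to invalidate others
    ("S", "BusRd"):  "S",  # another reader: line stays shareable
    ("S", "BusRdX"): "I",  # another writer: invalidate our copy
    ("M", "PrRd"):   "M",  # hit
    ("M", "PrWr"):   "M",  # hit
    ("M", "BusRd"):  "S",  # another reader: flush dirty data, downgrade
    ("M", "BusRdX"): "I",  # another writer: flush and invalidate
    ("I", "BusRd"):  "I",  # snoops on an invalid line change nothing
    ("I", "BusRdX"): "I",
}

def step(state, event):
    return TRANSITIONS[(state, event)]

# One line seen by two caches: cache 0 writes, then cache 1 reads.
c0, c1 = "I", "I"
c0 = step(c0, "PrWr")    # c0: I -> M (puts BusRdX on the bus)
c1 = step(c1, "BusRdX")  # c1 snoops it: stays I
c1 = step(c1, "PrRd")    # c1: I -> S (puts BusRd on the bus)
c0 = step(c0, "BusRd")   # c0 snoops it: M -> S, flushes the dirty line
print(c0, c1)  # S S
```

Because every cache watches the same bus, invalidations and downgrades happen as a side effect of normal traffic; a directory scheme achieves the same transitions with point-to-point messages driven by the line's directory entry.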