Parallelism (Multi-threaded)




Outline

  • A touch of theory

    • Amdahl’s Law

    • Metrics (speedup, efficiency, redundancy, utilization)

  • Trade-offs

    • Tightly coupled vs. Loosely coupled

      • Shared Distributed vs Shared Centralized

    • CMP vs SMT (and while we are at it: SSMT)

    • Heterogeneous vs. Homogeneous

    • Interconnection networks

  • Other important issues

    • Cache Coherency

    • Memory Consistency


Metrics

  • Speed-up: S(n) = T(1) / T(n), the single-processor time over the time on n processors

  • Efficiency: E(n) = S(n) / n, the speed-up per processor

  • Utilization: U(n) = R(n) · E(n), the fraction of processor cycles spent doing work

  • Redundancy: R(n) = O(n) / O(1), the total operation count on n processors relative to one processor
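These metrics, plus the Amdahl's Law bound from the outline, can be sketched in Python. The timing and operation counts below are made-up illustration values, not measurements from the slides.

```python
def speedup(t1, tn):
    """S(n) = T(1) / T(n): single-processor time over n-processor time."""
    return t1 / tn

def efficiency(t1, tn, n):
    """E(n) = S(n) / n: speed-up per processor."""
    return speedup(t1, tn) / n

def redundancy(ops_n, ops_1):
    """R(n) = O(n) / O(1): total operations on n processors vs. one."""
    return ops_n / ops_1

def utilization(ops_n, ops_1, t1, tn, n):
    """U(n) = R(n) * E(n): fraction of cycles spent doing useful work."""
    return redundancy(ops_n, ops_1) * efficiency(t1, tn, n)

def amdahl_speedup(parallel_frac, n):
    """Amdahl's Law: speed-up is capped by the serial fraction of the work."""
    return 1.0 / ((1.0 - parallel_frac) + parallel_frac / n)

# Hypothetical run: 100 s serially, 20 s on 8 processors, 1.2x the operations.
print(speedup(100, 20))                   # 5.0
print(efficiency(100, 20, 8))             # 0.625
print(utilization(120, 100, 100, 20, 8))  # 0.75
```

Note how the parallel version runs 5x faster but each processor is only 62.5% efficient, and redundancy (the extra 20% of operations) pushes utilization above efficiency.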



Tightly-coupled vs Loosely-coupled

  • Tightly coupled (i.e., Multiprocessor)

    • Shared memory

    • Each processor capable of doing work on its own

    • Easier for the software

    • Hardware has to worry about cache coherency, memory contention

  • Loosely-coupled (i.e., Multicomputer Network)

    • Message passing

    • Easier for the hardware

    • Programmer’s job is tougher

  • An early distributed shared-memory example: Cm*
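The software-side contrast can be sketched in Python (a toy illustration, not from the slides): in the tightly-coupled style, threads touch one shared variable and the programmer only adds a lock; in the loosely-coupled style, workers never share state and communicate purely by passing messages.

```python
import threading
import queue

# Tightly coupled style: threads share one memory; a lock guards the counter.
counter = 0
lock = threading.Lock()

def shared_memory_worker():
    global counter
    with lock:           # the hardware keeps the shared view coherent
        counter += 1

# Loosely coupled style: no shared state; workers mail values to a receiver.
mailbox = queue.Queue()

def message_passing_worker():
    mailbox.put(1)       # explicit message instead of a shared write

threads = ([threading.Thread(target=shared_memory_worker) for _ in range(4)] +
           [threading.Thread(target=message_passing_worker) for _ in range(4)])
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(mailbox.get() for _ in range(4))
print(counter, total)    # 4 4
```

Both styles compute the same sum of four increments; the message-passing version needs more programmer effort to move data explicitly, which is the trade-off the slide names.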


CMP vs SMT

  • CMP (chip multiprocessor)

    • aka Multi-core

  • SMT (simultaneous multithreading)

    • One core, with a program counter (PC) per thread, interleaved in a pipelined fashion

    • Multithreading introduced by Burton Smith, HEP ~1975

    • “Simultaneous” first proposed by H. Hirata et al., ISCA 1992

    • Carried forward by Nemirovsky, HICSS 1994

    • Popularized by Tullsen, Eggers, ISCA 1995


Heterogeneous vs Homogeneous

[Figure: three chip floorplans compared. The “Tile-Large” approach tiles the die with a few large cores; the “Niagara” approach tiles it with many small Niagara-like cores; the ACMP approach is heterogeneous, combining one large core with many Niagara-like cores.]

Large Core vs. Small Core

Large Core:

  • Out-of-order

  • Wide fetch (e.g., 4-wide)

  • Deeper pipeline

  • Aggressive branch predictor (e.g., hybrid)

  • Many functional units

  • Trace cache

  • Memory dependence speculation

Small Core:

  • In-order

  • Narrow fetch (e.g., 2-wide)

  • Shallow pipeline

  • Simple branch predictor (e.g., gshare)

  • Few functional units



Interconnection Networks

  • Basic Topologies

    • Bus

    • Crossbar

    • Omega Network

    • Hypercube

    • Tree

    • Mesh

    • Ring

  • Parameters: Cost, Latency, Contention

  • Example: Omega network, N=64, k=2
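The slide's example works out as follows: an Omega network on N = 64 ports built from k = 2 (2x2) switches has log2 64 = 6 stages of N/2 = 32 switches each, and routes by destination tag, with each stage consuming one bit of the destination address. A small sketch (helper names are mine, not from the slides):

```python
import math

def omega_stats(n_ports, k=2):
    """Stage count and total switch count for an Omega network of k x k switches."""
    stages = round(math.log(n_ports, k))
    switches_per_stage = n_ports // k
    return stages, stages * switches_per_stage

def omega_route(dst, n_ports):
    """Destination-tag routing for 2x2 switches: at each stage the next
    destination bit (MSB first) selects the upper (0) or lower (1) output."""
    stages = round(math.log2(n_ports))
    return [(dst >> (stages - 1 - s)) & 1 for s in range(stages)]

# Slide example: N = 64, k = 2  ->  6 stages of 32 switches, 192 total.
print(omega_stats(64, 2))   # (6, 192)
print(omega_route(44, 64))  # 44 = 0b101100 -> [1, 0, 1, 1, 0, 0]
```

Cost grows as (N/2) log2 N switches, latency as log2 N stage crossings, and contention arises because distinct source/destination pairs can demand the same internal link, which connects back to the three parameters above.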


Three Issues

  • Interconnection networks

    • Cost

    • Latency

    • Contention

  • Cache Coherency

    • Snoopy

    • Directory

  • Memory Consistency

    • Sequential Consistency and Mutual Exclusion


Sequential Consistency

  • Definition:

    Memory sees each processor’s loads/stores in program order, interleaved into one global sequential order

  • Examples

  • Why it is important: guarantees mutual exclusion


Two Processors and a Critical Section

    Processor 1                 Processor 2
    L1 = 0 initially            L2 = 0 initially

    A: L1 = 1                   X: L2 = 1
    B: if L2 == 0               Y: if L1 == 0
         {critical section}          {critical section}
    C: L1 = 0                   Z: L2 = 0

What can happen?
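One way to answer is brute force: enumerate, under sequential consistency, every interleaving of A, B, X, Y that respects each processor's program order, and see who gets into the critical section. A small sketch (labels follow the slide):

```python
from itertools import permutations

def run(order):
    """Execute one interleaving under sequential consistency.
    Ops: A sets L1=1, B is P1's test of L2; X sets L2=1, Y is P2's test of L1."""
    mem = {"L1": 0, "L2": 0}
    in_cs = set()
    for op in order:
        if op == "A":
            mem["L1"] = 1
        elif op == "X":
            mem["L2"] = 1
        elif op == "B" and mem["L2"] == 0:
            in_cs.add("P1")
        elif op == "Y" and mem["L1"] == 0:
            in_cs.add("P2")
    return in_cs

# Keep only interleavings preserving program order: A before B, X before Y.
orders = [p for p in permutations("ABXY")
          if p.index("A") < p.index("B") and p.index("X") < p.index("Y")]
results = [run(o) for o in orders]

assert all(len(r) < 2 for r in results)   # never both in the critical section
print(any(len(r) == 0 for r in results))  # True: both can also be locked out
```

Under sequential consistency at least one test always observes the other processor's store, so mutual exclusion holds, though in some interleavings neither processor enters. If the hardware reorders the stores and loads, both tests can read 0 and both processors enter, which is why the consistency model matters.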



Cache Coherence

  • Definition:

    If a cache line is present in more than one cache, it has the same values in all caches

  • Why is it important?

  • Snoopy schemes

    • All caches snoop the common bus

  • Directory schemes

    • Each cache line has a directory entry in memory
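The two schemes differ mainly in how a write finds the other copies. A toy MSI-style invalidation protocol in the snoopy flavor might look like this (hypothetical sketch; the loop over all caches stands in for the shared bus that every cache watches):

```python
# Toy snoopy invalidation protocol (MSI-style) for a single cache line.
# States: 'M' (modified), 'S' (shared), 'I' (invalid).

class Cache:
    def __init__(self):
        self.state = "I"   # the line starts out invalid everywhere

def bus_read(caches, reader):
    """A read miss broadcast on the bus; every other cache snoops it."""
    for c in caches:
        if c is not reader and c.state == "M":
            c.state = "S"  # owner writes back and downgrades to Shared
    reader.state = "S"

def bus_write(caches, writer):
    """A write broadcast on the bus; every other copy is invalidated."""
    for c in caches:
        if c is not writer:
            c.state = "I"
    writer.state = "M"     # at most one Modified copy can ever exist

c0, c1 = Cache(), Cache()
bus_read([c0, c1], c0)     # c0 holds the line Shared
bus_write([c0, c1], c1)    # c1 takes it Modified; c0's copy is invalidated
print(c0.state, c1.state)  # I M
```

A directory scheme enforces the same invariant (one writer or many readers) but replaces the broadcast loop with a lookup in the line's directory entry, which records exactly which caches hold a copy.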



