Parallelism: A Serious Goal or a Silly Mantra (some half-thought-out ideas)

Presentation Transcript


  1. Parallelism: A Serious Goal or a Silly Mantra (some half-thought-out ideas)

  2. Random thoughts on Parallelism • Why the sudden preoccupation with parallelism? • The Silliness (or what I call Meganonsense) • Break the problem → Use half the energy • 1000 mickey mouse cores • Hardware is sequential • Server throughput (how many pins?) • What about GPUs and Data Base? • Current bugs to exploiting parallelism (or are they?) • Dark silicon • Amdahl’s Law • The Cloud • The answer • The fundamental concept vis-à-vis parallelism • What it means re: the transformation hierarchy

  4. It starts with the raw material (Moore’s Law) • The first microprocessor (Intel 4004), 1971: 2300 transistors, 106 KHz • The Pentium chip, 1992: 3.1 million transistors, 66 MHz • Today: more than one billion transistors, frequencies in excess of 5 GHz • Tomorrow?

  5. And what we have done with this raw material

  6. Too many people do not realize: Parallelism did not start with Multi-core • Pipelining • Out-of-order Execution • Multiple operations in a single microinstruction • VLIW (horizontal microcode exposed to the software)

  8. One thousand mickey mouse cores • Why not a million? Why not ten million? • Let’s start with 16 • What if we could replace 4 with one more powerful core? • …and we learned: one more powerful core is not enough; sometimes we need several • MorphCore was born • BUT not all MorphCore (fixed function vs. flexibility)

  9. The Asymmetric Chip Multiprocessor (ACMP) • [Figure: three chip layouts compared] • “Tile-Large” approach: 4 large cores • “Niagara” approach: 16 Niagara-like cores • ACMP approach: 1 large core plus 12 Niagara-like cores

  10. Large core vs. Small Core • Large core: out-of-order; wide fetch (e.g. 4-wide); deeper pipeline; aggressive branch predictor (e.g. hybrid); many functional units; trace cache; memory dependence speculation • Small core: in-order; narrow fetch (e.g. 2-wide); shallow pipeline; simple branch predictor (e.g. gshare); few functional units

  11. Throughput vs. Serial Performance

  12. Server throughput • The Good News: Not a software problem • Each core runs its own problem • The Bad News: How many pins? • Memory bandwidth • More Bad News: How much energy? • Each core runs its own problem
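
The "how many pins?" point can be made concrete with a two-bound model. The sketch below is an illustration only, not something from the slides: the function name, the per-core request rate, the bytes moved per request, and the memory bandwidth are all hypothetical values chosen to show the shape of the curve. It assumes every core runs an independent request and that off-chip bandwidth is shared by all cores.

```python
def server_throughput(n_cores, per_core_rate, bytes_per_request, memory_bandwidth):
    """Requests per second under a simple two-bound model (an assumption).

    compute bound   = n_cores * per_core_rate
    bandwidth bound = memory_bandwidth / bytes_per_request
    The chip sustains whichever bound is lower.
    """
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    if min(per_core_rate, bytes_per_request, memory_bandwidth) <= 0:
        raise ValueError("rates, sizes and bandwidth must be positive")
    compute_bound = n_cores * per_core_rate
    bandwidth_bound = memory_bandwidth / bytes_per_request
    return min(compute_bound, bandwidth_bound)


if __name__ == "__main__":
    # Hypothetical numbers: 1,000 requests/s per core, 1 MB of memory
    # traffic per request, 50 GB/s of off-chip bandwidth.
    for cores in (16, 64, 1000):
        print(cores, server_throughput(cores, 1_000.0, 1e6, 50e9))
```

Under these made-up numbers the chip stops scaling at 50 cores: going from 64 to 1000 cores adds no throughput, because the pins, not the cores, are the limit.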

  13. What about GPUs and Data Base • In theory, absolutely! • GPUs (SMT + SIMD + Predication) • Provided there are no conditional branches (Divergence) • Provided memory accesses line up nicely (Coalescing) • Data Bases • Provided there are no critical sections
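
The two GPU provisos (no divergence, coalesced accesses) can be sketched as simple counting models. This is a toy model in plain Python, not GPU code and not any vendor's API: the 32-lane group size and the 128-byte memory segment are assumed values for illustration, and all function names are hypothetical.

```python
def warp_addresses(base, stride_bytes, lanes=32):
    """Byte address touched by each lane of one SIMD group (lanes=32 is assumed)."""
    return [base + lane * stride_bytes for lane in range(lanes)]


def memory_transactions(addresses, segment_bytes=128):
    """Count the distinct aligned segments the group touches.

    One segment means the accesses "line up nicely" (coalesced);
    one segment per lane is the worst case.
    """
    return len({addr // segment_bytes for addr in addresses})


def branch_passes(lane_conditions):
    """Passes needed for an if/else under predication.

    If every lane agrees, one side of the branch is executed.
    If lanes disagree (divergence), both sides are executed with
    some lanes masked off, so the group makes two passes.
    """
    return len(set(bool(c) for c in lane_conditions))


if __name__ == "__main__":
    print(memory_transactions(warp_addresses(0, 4)))    # adjacent 4-byte words
    print(memory_transactions(warp_addresses(0, 128)))  # one segment per lane
    print(branch_passes([True] * 32))                   # uniform branch
    print(branch_passes([i % 2 == 0 for i in range(32)]))  # divergent branch
```

With adjacent 4-byte words the whole group is served by one segment; with a 128-byte stride each lane needs its own, a 32x increase in memory traffic for the same amount of useful data.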

  15. Dark Silicon • Too many transistors: we cannot power them all • All those cores powered down • All that parallelism wasted • Not really: The Refrigerator! (aka: Accelerators) • Fork (in parallel) • Although not all at the same time!

  16. Amdahl’s Law • The serial bottleneck always limits performance • Heterogeneous cores AND control over them can minimize the effect
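
A minimal sketch of the arithmetic behind this slide. `amdahl_speedup` is the standard form of Amdahl's Law. `asymmetric_speedup` is a simplified model assumed here for illustration, not taken from the slides: the serial part runs on one large core that is `big_core_speed` times faster than a small core, the parallel part runs on the small cores only, and the baseline is a single small core. The 2x figure for the large core is hypothetical.

```python
def amdahl_speedup(serial_fraction, n_cores):
    """Speedup over one core when only the parallel part scales."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)


def asymmetric_speedup(serial_fraction, n_small, big_core_speed):
    """Simplified heterogeneous model (an assumption, see lead-in).

    Serial time shrinks by big_core_speed; parallel time shrinks by
    n_small. The large core is left idle during the parallel phase.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if n_small < 1 or big_core_speed <= 0:
        raise ValueError("need at least one small core and a positive speed")
    serial_time = serial_fraction / big_core_speed
    parallel_time = (1.0 - serial_fraction) / n_small
    return 1.0 / (serial_time + parallel_time)


if __name__ == "__main__":
    # 10% serial code: 16 small cores vs. 1 large + 12 small cores.
    print(amdahl_speedup(0.10, 16))
    print(asymmetric_speedup(0.10, 12, 2.0))
```

With 10% serial code, 16 identical small cores give a speedup of about 6.4, while trading four of them for one core that is twice as fast gives about 8.0 under this model. Fewer cores win because the serial bottleneck, not the core count, dominates.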

  17. The Cloud • It is behind the curtain: how do we manage it? • Answer: the on-chip run-time system • Answer: Pragmas beyond the Cloud

  19. The fundamental concept: Synchronization

  20. Problem → Algorithm → Program → ISA (Instruction Set Architecture) → Microarchitecture → Circuits → Electrons

  21. At every layer we synchronize • Algorithm: task dependencies • ISA: sequential control flow (implicit) • Microarchitecture: ready bits • Circuit: clock cycle (implicit)
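
The "ready bits" entry can be illustrated with a toy data-flow scheduler. This is a sketch under strong assumptions that are not in the slides: every operation takes one cycle, there are unlimited functional units, each destination name is written once, and the `schedule` function and its tuple format are hypothetical. The program is written in sequential order, yet each operation issues as soon as the ready bits of its sources are set.

```python
def schedule(instructions, initially_ready):
    """Group a sequential program into cycles by operand readiness.

    instructions: list of (dest, sources) tuples in program order.
    initially_ready: names whose values are available at cycle 0.
    Returns a list of cycles; each cycle lists the dests issued in it.
    """
    ready = set(initially_ready)   # the "ready bits"
    pending = list(instructions)
    cycles = []
    while pending:
        # An instruction may issue only if every source is already ready.
        issued = [ins for ins in pending if all(s in ready for s in ins[1])]
        if not issued:
            raise ValueError("a source is never produced")
        # Results become visible at the end of the cycle.
        for dest, _ in issued:
            ready.add(dest)
        pending = [ins for ins in pending if ins not in issued]
        cycles.append([dest for dest, _ in issued])
    return cycles


if __name__ == "__main__":
    program = [
        ("a", ("x", "y")),
        ("b", ("x",)),
        ("c", ("a", "b")),
        ("d", ("c",)),
        ("e", ("y",)),
    ]
    print(schedule(program, {"x", "y"}))
    # prints [['a', 'b', 'e'], ['c'], ['d']]
```

Five sequentially written operations finish in three cycles. The only synchronization is the data dependence itself, and the same pattern reappears one layer up as task dependencies between larger units of work.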

  22. Who understands this? • Should this be part of students’ parallelism education? • Where should it come in the curriculum? • Can students even understand these different layers?

  23. Parallel to Sequential to Parallel • Guri says: think sequential, execute parallel • i.e. don’t throw away 60 years of computing experience • The original HPS model of out-of-order execution • Synchronization is obvious: restricted data flow • At the higher level, parallel at larger granularity • Pragmas in Java? Who would have thought! • Dave Kuck’s CEDAR project, vintage 1985 • Synchronization is necessary: coarse-grain data flow

  24. Can we do more? • The run-time system – part of the chip design • The chip knows the chip resources • On-chip monitoring can supply information • The run-time system can direct the use of those resources • The Cloud – the other extreme, and today’s be-all • How do we harness its capability? • What is needed from the hierarchy to make it work?

  25. My message • Parallelism is a serious goal IF we want to solve the most challenging problems (Cure cancer, predict tsunamis) • Telling people to think parallel is nice, but often silly • Examining the transformation hierarchy and seeing where we can leverage seems to me a sounder approach
