
The Future of Parallel Computing

The Future of Parallel Computing: Special Purpose Mesh Architectures (SA, ISA, PIPS, RM, OH). Heiko Schröder, 1998. Contents: Why meshes? Application-specific parallel mesh architectures, beginning with systolic arrays. Fine grain: 1983. Coarse grain: 1997.


Presentation Transcript


  1. The Future of Parallel Computing • Special Purpose Mesh Architectures: SA, ISA, PIPS, RM, OH • Heiko Schröder, 1998

  2. Contents • Why meshes? • Application-specific parallel mesh architectures • - Systolic Arrays • - Instruction Systolic Arrays • - PIPS • - Reconfigurable mesh • - Optical Highway • Fine grain: 1983 • Coarse grain: 1997

  3. Physical limits • c = 300 000 km/sec • 10^12 OPS -- 0.3 mm/OP • 1000 PEs with 10^9 OPS each -- 30 cm/OP • massive parallelism • distributed memory
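The distances on this slide follow from dividing the speed of light by the operation rate. A minimal sketch, assuming rates of 10^12 and 10^9 operations per second (chosen because they reproduce the 0.3 mm and 30 cm figures); the function name is hypothetical:

```python
C_KM_PER_S = 300_000  # speed of light as given on the slide


def distance_per_op_m(ops_per_second: float) -> float:
    """Distance in metres that light travels during one operation."""
    return C_KM_PER_S * 1_000 / ops_per_second


# Assumed rates: one processor at 10^12 OPS versus 1000 PEs at 10^9 OPS each.
single = distance_per_op_m(1e12)
per_pe = distance_per_op_m(1e9)

print(f"{single * 1e3:.1f} mm per operation")  # 0.3 mm per operation
print(f"{per_pe * 1e2:.0f} cm per operation")  # 30 cm per operation
```

Spreading the same total rate over 1000 slower PEs gives each signal a thousand times more room per operation, which is the argument for massive parallelism with distributed memory.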

  4. Processor power [figure: performance (MIPS, logarithmic scale from 0.01 to 1000) versus year (1970 to 1995) for the 4004, 8080, 80286, 80386, 80486, Pentium and Pentium 2]

  5. Scaling • Factor 2: • 1/2 width • 1/2 height • 1/2 switching time • 0.5 µ to 0.25 µ: 8 x performance!
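The factor of 8 can be read as the product of the three halvings: four times as many transistors per unit area, each switching twice as fast. A small sketch of that arithmetic, with a hypothetical function name:

```python
def performance_gain(shrink: float) -> float:
    """Gain when width, height and switching time all shrink by `shrink`."""
    density_gain = shrink * shrink  # 1/shrink width times 1/shrink height
    speed_gain = shrink             # 1/shrink switching time
    return density_gain * speed_gain


print(performance_gain(0.5 / 0.25))  # 8.0
```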

  6. CMOS transistors [figure: size of minimal transistor (logarithmic scale from 10 µm down to 0.01 µm, with ca. 0.03 µm marked) versus year (1960 to 2030)]

  7. Mesh/Torus • 2D mesh • diameter • bisection width

  8. Hypercube • diameter log n • bisection width n/2 [figure: 0-D to 4-D hypercubes with binary node labels 0, 1; 00, 01, 10, 11; 000 to 111]
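The two measures compared on these slides can be computed directly for each topology. The sketch below uses the standard textbook formulas for n nodes, which the transcript itself does not spell out; the function names and the choice n = 65536 are assumptions, picked because they reproduce the labels 256, 16 and 32K that appear on the following slides.

```python
import math


def _side(n: int) -> int:
    """Side length k of a k x k grid with n nodes."""
    k = math.isqrt(n)
    if k * k != n:
        raise ValueError("n must be a perfect square")
    return k


def mesh_2d(n: int) -> dict:
    k = _side(n)
    # Farthest pair: opposite corners; cutting the grid in half severs k links.
    return {"diameter": 2 * (k - 1), "bisection_width": k}


def torus_2d(n: int) -> dict:
    k = _side(n)
    if k % 2:
        raise ValueError("this sketch assumes an even side length")
    # Wrap-around links halve the distances and double the cut.
    return {"diameter": k, "bisection_width": 2 * k}


def hypercube(n: int) -> dict:
    d = n.bit_length() - 1
    if n < 1 or 1 << d != n:
        raise ValueError("n must be a power of two")
    # One link per dimension to traverse; half the nodes link across the cut.
    return {"diameter": d, "bisection_width": n // 2}


n = 65536
print(mesh_2d(n))    # {'diameter': 510, 'bisection_width': 256}
print(torus_2d(n))   # {'diameter': 256, 'bisection_width': 512}
print(hypercube(n))  # {'diameter': 16, 'bisection_width': 32768}
```

The hypercube wins on both measures, but its bisection width grows linearly with n, which is what makes it expensive to lay out.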

  9. VLSI Very Large Scale Integration • simple cells • few types • regular architecture • short connections mesh -- torus

  10. Pin limitations [figure labels: diameter 256; diameter 16; 16 pins; 16x12 pins; 16x16 pins]

  11. Bisection width [figure labels: bisection width 256, 25 cm; bisection width 32K, 32 m]

  12. Programming • SA --- Systolic Array • SIMD ---Single Instruction Multiple Data • ISA ---Instruction Systolic Array • MIMD ---Multiple Instruction Multiple Data

  13. Parallel merge • initial situation: x1, x2, ..., x18 and y1, y2, ..., y18 placed on the mesh • 1.) sort columns (odd-even transposition sort) • 2.) sort rows (odd-even transposition sort) • sorted !!!!
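The two steps can be sketched sequentially as follows. This is an illustration, not the author's implementation: it assumes the first sequence is written in ascending order and the second in descending order, row by row, as in the 4 x 4 example on the later systolic merge slides. The names `odd_even_transposition_sort` and `mesh_merge` are hypothetical.

```python
def odd_even_transposition_sort(values):
    """Sort a list with n rounds of neighbour compare-exchange steps."""
    a = list(values)
    n = len(a)
    for step in range(n):
        # Even rounds compare pairs (0,1), (2,3), ...; odd rounds (1,2), (3,4), ...
        for i in range(step % 2, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


def mesh_merge(xs, ys, width):
    """Merge two ascending lists on a grid with `width` columns."""
    data = list(xs) + list(reversed(ys))  # ascending run, then descending run
    if len(data) % width:
        raise ValueError("total length must be a multiple of width")
    rows = [data[i:i + width] for i in range(0, len(data), width)]
    # 1.) sort every column
    cols = [odd_even_transposition_sort(col) for col in zip(*rows)]
    rows = [list(r) for r in zip(*cols)]
    # 2.) sort every row
    rows = [odd_even_transposition_sort(r) for r in rows]
    return [v for r in rows for v in r]  # read the grid row by row


xs = [1, 3, 3, 4, 5, 5, 6, 7]
ys = [2, 3, 4, 4, 7, 8, 8, 9]
print(mesh_merge(xs, ys, width=4))
# [1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 7, 7, 8, 8, 9]
```

On a mesh, all columns are sorted at the same time, and then all rows; the loops here only serialise what the processing elements would do in parallel.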

  14. 0-1 principle • The 0-1 principle states that if a sorter sorts all sequences of 0s and 1s properly, then it is a correct sorter. • The sorter must be based on moving data. [figure: regions of 0s and 1s initially, after vertical sort, and after horizontal sort]
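In practice the principle turns a correctness proof into a finite check: a compare-exchange network on n inputs only has to be tested on the 2^n sequences of 0s and 1s. A minimal sketch, using odd-even transposition sort as the network under test; all function names are hypothetical.

```python
from itertools import product


def odd_even_transposition_network(n):
    """Comparator pairs (i, i+1) of an n-round odd-even transposition sort."""
    return [(i, i + 1) for step in range(n) for i in range(step % 2, n - 1, 2)]


def apply_network(network, values):
    """Run the comparators in order; each one only moves data."""
    a = list(values)
    for i, j in network:
        if a[i] > a[j]:
            a[i], a[j] = a[j], a[i]
    return a


def sorts_all_zero_one_inputs(network, n):
    """True if the network sorts every 0-1 sequence of length n."""
    return all(apply_network(network, bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))


net = odd_even_transposition_network(8)
print(sorts_all_zero_one_inputs(net, 8))      # True
print(sorts_all_zero_one_inputs(net[:7], 8))  # False: only the first two rounds
```

The check applies because the comparator positions are fixed in advance and do not depend on the data, which is the sense in which the sorter is "based on moving data".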

  15. MIMD-mesh (clocked) min max Time: 2n

  16. systolic merge: 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2

  17. systolic merge: 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2

  18. systolic merge: 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2

  19. systolic merge: 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2

  20. systolic merge: 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2 -> 1 3 3 4 / 5 5 6 7 / 4 4 3 2 / 9 8 8 7

  21. systolic merge: 1 3 3 4 / 5 5 6 7 / 4 4 3 2 / 9 8 8 7

  22. systolic merge: 1 3 3 4 / 4 4 3 2 / 5 5 6 7 / 9 8 8 7

  23. systolic merge: 1 3 3 4 / 4 4 3 2 / 5 5 6 7 / 9 8 8 7

  24. systolic merge: 1 3 3 2 / 4 4 3 4 / 5 5 6 7 / 9 8 8 7

  25. systolic merge: 1 3 3 2 / 4 4 3 4 / 5 5 6 7 / 9 8 8 7

  26. systolic merge: 1 3 3 2 / 4 4 3 4 / 5 5 6 7 / 9 8 8 7

  27. systolic merge: 1 3 2 3 / 4 3 4 4 / 5 5 6 7 / 9 8 8 7

  28. systolic merge: 1 3 2 3 / 4 3 4 4 / 5 5 6 7 / 9 8 8 7

  29. systolic merge: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 9 8 8 7

  30. systolic merge: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 9 8 8 7

  31. systolic merge: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 9 8 8 7

  32. systolic merge: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 8 9 7 8

  33. systolic merge: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 8 9 7 8

  34. systolic merge: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 8 7 9 8

  35. systolic merge: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 8 7 9 8

  36. systolic merge: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 7 8 8 9 • sorted !!!

  37. Characteristics of SAs • Extremely high cost-performance • no flexibility -- long development time • Suitable for special signal processing tasks ???

  38. Systolic architectures I

  39. Systolic architectures II

  40. ISA merge • C := min{C, CE} • C := max{C, CW} • 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2

  41. ISA merge: 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2

  42. ISA merge: 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2

  43. ISA merge: 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2

  44. ISA merge: 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2

  45. ISA merge: 1 3 3 4 / 5 5 6 7 / 4 8 8 7 / 9 4 3 2

  46. ISA merge: 1 3 3 4 / 5 5 6 7 / 4 8 8 7 / 9 4 3 2

  47. ISA merge: 1 3 3 4 / 4 5 6 7 / 5 4 8 7 / 9 8 3 2

  48. ISA merge: 1 3 3 4 / 4 5 6 7 / 5 4 8 7 / 9 8 3 2

  49. ISA merge: 1 3 3 4 / 4 4 6 7 / 5 5 3 7 / 9 8 8 2

  50. ISA merge: 1 3 3 4 / 4 4 6 7 / 5 5 3 7 / 9 8 8 2
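The instruction pair shown on slide 40 can be read as one compare-exchange between two horizontally adjacent cells. The sketch below assumes that CE and CW denote the values held by the east and west neighbours, which the transcript does not state; it models only the instruction pair, not how an ISA pumps instructions through the array. The function name is hypothetical.

```python
def compare_exchange_row(row, start):
    """One step on a row: pairs (start, start+1), (start+2, start+3), ...

    The left cell of each pair executes C := min{C, CE}, the right cell
    executes C := max{C, CW}; both read their neighbour's old value.
    """
    old = list(row)
    new = list(row)
    for i in range(start, len(row) - 1, 2):
        new[i] = min(old[i], old[i + 1])      # C := min{C, CE}
        new[i + 1] = max(old[i + 1], old[i])  # C := max{C, CW}
    return new


row = [9, 8, 8, 7]
row = compare_exchange_row(row, 0)  # [8, 9, 7, 8]
row = compare_exchange_row(row, 1)  # [8, 7, 9, 8]
row = compare_exchange_row(row, 0)  # [7, 8, 8, 9]
print(row)
```

Reading the old values before writing matters: both cells of a pair act in the same step, so each must see what its neighbour held before the exchange.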
