Parallel Processing


I’ve gotta spend at least 10 hours studying for the IT 344 final!

I’m going to study with 9 friends… we’ll be done in an hour.

Next up: TIPS
• Mega- = 10^6, Giga- = 10^9, Tera- = 10^12, Peta- = 10^15
• BOPS, anyone?
• Light travels about 1 ft per nanosecond (10^-9 s) in free space.
• A Tera-Hertz uniprocessor could have no clock-to-clock path longer than 300 microns…
• We already know of problems that require more than a TIP (simulations of weather, weapons, brains)
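
The 300-micron limit follows from distance = speed of light / clock frequency. A minimal sketch of that arithmetic, assuming a rounded speed of light of 3 x 10^8 m/s; the function name `max_path_microns` is made up for illustration:

```python
# Speed of light in free space, rounded (assumption: 3.0e8 m/s).
SPEED_OF_LIGHT_M_PER_S = 3.0e8


def max_path_microns(clock_hz: float) -> float:
    """Distance light covers during one clock period, in microns."""
    period_s = 1.0 / clock_hz
    return SPEED_OF_LIGHT_M_PER_S * period_s * 1e6


for label, hz in [("1 GHz", 1e9), ("1 THz", 1e12)]:
    print(f"{label}: {max_path_microns(hz):,.0f} microns per clock")
```

Real signals in silicon travel slower than light in free space, so the practical limit would be tighter still.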
Solution: Parallelism
• Pipelining – reasonable for a small number of stages (5-10); beyond that, bypassing and stalls become unmanageable.
• Superscalar – replicate data paths and design control logic to discover parallelism in traditional programs.
• Explicit parallelism – must learn how to write programs that run on multiple CPUs.
Superscalar – How far can it go?
• Multiple functional units (ALUs, Addr, Floating point, etc.)
• Instruction dispatch
• Dynamic scheduling
• Pipelines
• Speculative execution
Explicit Parallelism
• Distributed
• Transaction-oriented
• Geographically dispersed locations
• E.g. SETI@home
• Parallel
• Single goal computing
• Computing intense and/or data-intense
• High-speed data exchange
• Often on custom hardware
• E.g. Geochemical surveys
Challenges
• For distributed processing, parallelism is given and usually cannot easily change. Programming is relatively easy.
• For parallel processing, the programmer defines parallelism by partitioning the serial program(s). Parallel programming in general is more difficult than transaction applications.
Other vocabulary
• Decomposition
• The way that a program can be broken up for parallel processing
• Coarse-grain
• Breaks into big chunks (fewer processors)
• SMP
• Distributed (often)
• Fine-grain
• Breaks into small chunks (more processors)
• Image processing
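
Coarse-grain and fine-grain decomposition differ only in how many pieces the work is cut into. A rough sketch, assuming the work is a flat list that can be split into contiguous chunks; `decompose` and the sample data are hypothetical:

```python
def decompose(items, n_chunks):
    """Split items into n_chunks contiguous pieces of near-equal size."""
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


pixels = list(range(12))        # stand-in for image rows or pixels
coarse = decompose(pixels, 2)   # few big chunks, e.g. a 2-CPU SMP
fine = decompose(pixels, 6)     # many small chunks, one per processor
print(coarse)
print(fine)
```

Finer chunks expose more parallelism but usually mean more communication between processors.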
Inter-processor communications

[Figure: systems placed on a coupling spectrum from loosely-coupled to tightly-coupled. Labels: custom supercomputers, distributed processors, Beowulf clusters]

More Terminology
• SIMD (Single Instruction Multiple Data)
• MIMD (Multiple Instruction Multiple Data)
• MISD (Multiple Instruction Single Data; pipeline)
SIMD
• Same instruction executed in multiple units, on different data
• Examples: Vector processors, AltiVec
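
The SIMD idea can be modeled in a few lines: one instruction, several data lanes. This is only a conceptual sketch; plain Python runs the lanes one after another, whereas vector hardware would process them in a single step. `simd_apply` and `double` are illustrative names:

```python
def simd_apply(instruction, data_lanes):
    """Apply the same instruction to every data lane."""
    return [instruction(d) for d in data_lanes]


def double(x):
    return 2 * x


lanes = [1, 2, 3, 4]                 # four data items
print(simd_apply(double, lanes))
```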

[Figure: a single instruction I applied to four data items, D1 through D4]

MIMD
• Each unit executes its own instruction on its own data
• Examples: Mercury, Beowulf, etc.

[Figure: four instructions, I1 through I4, each applied to its own data item, D1 through D4]
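
In MIMD, each unit runs a different instruction stream on different data. A small sketch using Python's standard `concurrent.futures` thread pool as a stand-in for separate processors; the task list and `run_mimd` are hypothetical, and CPython threads illustrate the structure rather than true multi-CPU speedup:

```python
from concurrent.futures import ThreadPoolExecutor

# Each pair is (instruction, data): a different operation on different data.
tasks = [
    (sum, [1, 2, 3]),
    (max, [4, 9, 2]),
    (len, "hello"),
    (sorted, [3, 1, 2]),
]


def run_mimd(tasks):
    """Run every (instruction, data) pair on its own worker."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(instr, data) for instr, data in tasks]
        # Collect results in submission order.
        return [f.result() for f in futures]


print(run_mimd(tasks))
```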
MISD (pipeline)

[Figure: data items D1 through D4 flowing in sequence through instruction stages I1, I2, I3, I4]
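
A pipeline can be simulated one clock tick at a time: every item advances one stage per tick, so m items through n stages take m + n - 1 ticks. A sketch under that assumption; the four stage functions are arbitrary examples, and `run_pipeline` is an illustrative name:

```python
EMPTY = object()  # marker for a stage that holds no item


def run_pipeline(stages, items):
    """Simulate a linear pipeline; return (results, ticks)."""
    slots = [EMPTY] * len(stages)   # output currently held by each stage
    pending = list(items)
    results, ticks = [], 0
    while pending or any(s is not EMPTY for s in slots):
        # Shift from the back so each item moves exactly one stage per tick.
        for i in range(len(stages) - 1, 0, -1):
            prev = slots[i - 1]
            slots[i] = stages[i](prev) if prev is not EMPTY else EMPTY
        slots[0] = stages[0](pending.pop(0)) if pending else EMPTY
        ticks += 1
        if slots[-1] is not EMPTY:  # an item has finished the last stage
            results.append(slots[-1])
            slots[-1] = EMPTY
    return results, ticks


stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x ** 2]
results, ticks = run_pipeline(stages, [1, 2, 3, 4])
print(results, ticks)
```

Run serially, the same work would take 4 x 4 = 16 stage-steps; the pipeline overlaps them into 7 ticks.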

Distributed Programming Tools
• C/C++ with TCP/IP
• Perl with TCP/IP
• Java
• Corba
• ASP
• .Net
Parallel Programming Tools
• PVM
• MPI
• Synergy
• Others (proprietary hardware)
Parallel Programming Difficulties
• Program partition and allocation
• Data partition and allocation
• Program (process) synchronization
• Data access mutual exclusion
• Dependencies
• Process(or) failures
• Scalability…
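
Mutual exclusion and synchronization from the list above can be illustrated with a shared counter. A minimal sketch using Python's standard `threading` module; the `Counter` class and `hammer` function are made up for the example:

```python
import threading


class Counter:
    """Shared counter whose updates are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run this read-modify-write.
        with self._lock:
            self.value += 1


def hammer(counter, n):
    for _ in range(n):
        counter.increment()


counter = Counter()
threads = [threading.Thread(target=hammer, args=(counter, 10_000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()   # synchronization: wait for every worker to finish
print(counter.value)
```

Without the lock, the increments from different threads could interleave and some updates could be lost.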
Software techniques
• Shared Memory Buffers - Areas of memory that any node can read or write.
• Sockets - Provide full-duplex message passing between processes.
• Semaphores and Spinlocks - Provide locking and synchronization functions.
• Mailbox Interrupts - Provide an interrupt-driven communication mechanism.
• Direct Memory Access - Provides asynchronous shared memory buffer I/O.
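
Full-duplex message passing over sockets can be tried without a network by using a connected socket pair. A sketch with Python's standard `socket.socketpair`, using a thread as a stand-in for the peer process; `echo_upper` is a hypothetical helper, and a single `recv` is assumed to be enough for such a short message:

```python
import socket
import threading


def echo_upper(conn):
    """Peer stand-in: read one message and reply in upper case."""
    with conn:
        msg = conn.recv(1024)
        conn.sendall(msg.upper())


# Two connected endpoints; each one can both send and receive.
left, right = socket.socketpair()
worker = threading.Thread(target=echo_upper, args=(right,))
worker.start()

with left:
    left.sendall(b"ping")
    reply = left.recv(1024)

worker.join()
print(reply)
```

Between machines, the same send/receive pattern would run over TCP/IP sockets instead of a local pair.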
What it really looks like

Note: this computer would rank well on www.top500.org

Summary
• Prospects for future CPU architectures:
• Pipelining - Well understood, but mined-out
• Superscalar - Nearing its practical limits
• SIMD - Limited use for special applications
• VLIW - Returns control to software. The future?
• Prospects for future Computer System architectures:
• SMP - Limited scalability. Harder than it appears.
• MIMD/message-passing - It’s been the future for over 20 years now. How to program?