CS 179: Lecture 4 Lab Review 2



## PowerPoint Slideshow about ' CS 179: Lecture 4 Lab Review 2' - vangie


### CS 179: Lecture 4 Lab Review 2

Groups of Threads (Hierarchy)

(largest to smallest)

• “Grid”: all of the threads
  • Size: (number of threads per block) * (number of blocks)
• “Block”:
  • Size: user-specified
  • Should at least be a multiple of 32 (often, higher is better)
  • Upper limit given by hardware (512 in Tesla, 1024 in Fermi)
  • Features:
    • Shared memory
    • Synchronization
• “Warp”: a group of 32 threads
  • Execute in lockstep (same instructions)
  • Susceptible to divergence!
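The size relationships in the hierarchy above can be checked with a little arithmetic. The following is a minimal Python sketch, not CUDA code; the function names `grid_size`, `warps_per_block`, and `locate` are hypothetical helpers introduced here for illustration, and the warp size of 32 is assumed throughout.

```python
WARP_SIZE = 32  # threads per warp (assumed, matches the "multiple of 32" advice)

def grid_size(threads_per_block, num_blocks):
    """Total number of threads in the grid."""
    return threads_per_block * num_blocks

def warps_per_block(threads_per_block):
    """Warps needed to hold one block; a partly filled warp still counts as one."""
    return (threads_per_block + WARP_SIZE - 1) // WARP_SIZE

def locate(global_index, threads_per_block):
    """Map a global thread index to (block, warp within block, lane within warp)."""
    block, thread = divmod(global_index, threads_per_block)
    warp, lane = divmod(thread, WARP_SIZE)
    return block, warp, lane

print(grid_size(512, 4))     # 2048
print(warps_per_block(512))  # 16
print(warps_per_block(100))  # 4
print(locate(1000, 512))     # (1, 15, 8)
```

A block of 100 threads still occupies 4 warps (128 lanes), so 28 lanes sit idle. That is one reason to pick a block size that is a multiple of 32.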
Divergence

“Two roads diverged in a wood…

…and I took both”

Divergence
• What happens:
• Executes normally until if-statement
• Branches to calculate Branch A (blue threads)
• Goes back (!) and branches to calculate Branch B (red threads)
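The "goes back and takes the other branch" behavior can be imitated on the CPU. This is a rough Python sketch under simplifying assumptions: a warp is modeled as a list of 32 lane values, and `run_warp` (a hypothetical name) counts how many branch bodies the whole warp has to step through. Real hardware details such as reconvergence are not modeled.

```python
WARP_SIZE = 32

def run_warp(values, condition, branch_a, branch_b):
    """Simulate one warp executing `if condition: A else: B` in lockstep.

    Returns (results, passes), where passes is the number of branch
    bodies the warp had to execute (1 if uniform, 2 if divergent).
    """
    takes_a = [condition(v) for v in values]
    results = list(values)
    passes = 0
    if any(takes_a):                      # Branch A: lanes not taking it are masked off
        passes += 1
        for lane, v in enumerate(values):
            if takes_a[lane]:
                results[lane] = branch_a(v)
    if not all(takes_a):                  # Go back and run Branch B for the rest
        passes += 1
        for lane, v in enumerate(values):
            if not takes_a[lane]:
                results[lane] = branch_b(v)
    return results, passes

lanes = list(range(WARP_SIZE))
_, divergent_passes = run_warp(lanes, lambda v: v % 2 == 0,
                               lambda v: v * 2, lambda v: v + 100)
_, uniform_passes = run_warp(lanes, lambda v: v >= 0,
                             lambda v: v * 2, lambda v: v + 100)
print(divergent_passes)  # 2
print(uniform_passes)    # 1
```

When every lane agrees on the condition, the warp pays for one branch; when lanes disagree, it pays for both.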
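“Divergent tree”

[Figure: divergent reduction tree. Accumulating threads at successive levels: …, 506, 508, 510 (every 2nd thread); …, 500, 504, 508 (every 4th); …, 488, 496, 504 (every 8th); …, 464, 480, 496 (every 16th).]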

“Divergent tree”

Assumes block size is power of 2…

    // Let our shared memory block be partial_outputs[]...

    set offset to 1
    while ((offset * 2) <= block dimension):
        if (thread index % (offset * 2) is 0):
            add partial_outputs[thread index + offset] to partial_outputs[thread index]
        double the offset
        synchronize threads
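One way to read the divergent-tree pseudocode is as the sequential Python simulation below. It is a sketch, not a kernel: it assumes the body of the `if` accumulates `partial_outputs[thread index + offset]` into `partial_outputs[thread index]`, and that all threads synchronize between iterations (modeled by computing every update before writing any). The function name `divergent_tree_sum` is hypothetical.

```python
def divergent_tree_sum(partial_outputs):
    """Sum a block's partial outputs with the 'divergent tree' pattern.

    Assumes len(partial_outputs) is a power of 2 (the block dimension).
    """
    data = list(partial_outputs)
    block_dim = len(data)
    offset = 1
    while offset * 2 <= block_dim:
        # Every thread evaluates the condition; only some accumulate.
        updates = {}
        for tid in range(block_dim):
            if tid % (offset * 2) == 0:
                updates[tid] = data[tid] + data[tid + offset]
        # Stand-in for the barrier: apply writes after all reads are done.
        for tid, value in updates.items():
            data[tid] = value
        offset *= 2
    return data[0]

print(divergent_tree_sum(range(512)))  # 130816
```

The accumulating threads (0, 2, 4, … in the first iteration) are spread across every warp, which is what the later analysis slides examine.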

“Non-divergent tree”

Assumes block size is power of 2…

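    // Let our shared memory block be partial_outputs[]...

    set offset to highest power of 2 that’s less than the block dimension
    while (offset >= 1):
        if (thread index < offset):
            add partial_outputs[thread index + offset] to partial_outputs[thread index]
        halve the offset
        synchronize threads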
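The non-divergent variant can be simulated the same way. This Python sketch assumes the accumulate condition is `thread index < offset` (inferred from the analysis slides, where threads 0-255 accumulate in the first iteration of a 512-thread block) and again models the barrier by separating reads from writes. The name `non_divergent_tree_sum` is hypothetical.

```python
def non_divergent_tree_sum(partial_outputs):
    """Sum a block's partial outputs with the 'non-divergent tree' pattern.

    Assumes len(partial_outputs) is a power of 2 (the block dimension).
    """
    data = list(partial_outputs)
    block_dim = len(data)
    # For a power-of-2 block, the highest power of 2 below it is half of it.
    offset = block_dim // 2
    while offset >= 1:
        # Accumulating threads are the contiguous range 0 .. offset-1.
        updates = {tid: data[tid] + data[tid + offset] for tid in range(offset)}
        # Stand-in for the barrier: apply writes after all reads are done.
        for tid, value in updates.items():
            data[tid] = value
        offset //= 2
    return data[0]

print(non_divergent_tree_sum(range(512)))  # 130816
```

Both patterns compute the same sum; the difference is which threads do the work, and therefore which warps have to execute at all.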

“Divergent tree” – Where is the divergence?
• Two branches:
• Accumulate
• Do nothing
• If the second branch does nothing, then where is the performance loss?
“Divergent tree” – Analysis
• First iteration (reduce 512 -> 256):
• Warp of threads 0-31 (after calculating the polynomial):
  • Even-indexed threads: Accumulate
  • Odd-indexed threads: Do nothing
• Warp of threads 32-63: (same thing!)
• … (up to) warp of threads 480-511
• Number of executing warps: 512 / 32 = 16
“Divergent tree” – Analysis
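• Second iteration (reduce 256 -> 128):
• Warp of threads 0-31 (after calculating the polynomial):
  • Threads with index divisible by 4: Accumulate
  • All other threads: Do nothing
• Warp of threads 32-63: (same thing!)
• … (up to) warp of threads 480-511
• Number of executing warps: 16 (again!)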
“Divergent tree” – Analysis
• (Process continues, until offset is large enough to separate warps)
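To see when the offset becomes "large enough to separate warps", the number of warps containing at least one accumulating thread can be counted per iteration. This is a Python sketch assuming a warp size of 32 and the `thread index % (offset * 2) is 0` condition from the pseudocode; `executing_warps_divergent` is a hypothetical helper name.

```python
WARP_SIZE = 32

def executing_warps_divergent(block_dim):
    """Return [(offset, warps with at least one accumulating thread), ...]."""
    rows = []
    offset = 1
    while offset * 2 <= block_dim:
        # Accumulating threads are those whose index is a multiple of offset * 2.
        active_warps = {tid // WARP_SIZE for tid in range(0, block_dim, offset * 2)}
        rows.append((offset, len(active_warps)))
        offset *= 2
    return rows

for offset, warps in executing_warps_divergent(512):
    print(offset, warps)
# 1 16
# 2 16
# 4 16
# 8 16
# 16 16
# 32 8
# 64 4
# 128 2
# 256 1
```

For a 512-thread block, all 16 warps keep executing for the first five iterations. Only once the stride `offset * 2` exceeds the warp size do some warps contain no accumulating thread.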
“Non-divergent tree” – Analysis
• First iteration (reduce 512 -> 256), part 1:
• Warp of threads 0-31: Accumulate
• Warp of threads 32-63: Accumulate
• … (up to) warp of threads 224-255
• Then what?
“Non-divergent tree” – Analysis
• First iteration (reduce 512 -> 256), part 2:
• Warp of threads 256-287: Do nothing!
• … (up to) warp of threads 480-511
• Number of executing warps: 256 / 32 = 8 (was 16 previously!)
“Non-divergent tree” – Analysis
• Second iteration (reduce 256 -> 128):
• Warps of threads 0-31, …, 96-127: Accumulate
• Warps of threads 128-159, …, 480-511: Do nothing!
• Number of executing warps: 128 / 32 = 4 (was 16 previously!)
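The same count for the non-divergent tree reproduces the 8 and 4 from the analysis above. This Python sketch assumes a warp size of 32 and that threads with `thread index < offset` accumulate; `executing_warps_non_divergent` is a hypothetical helper name.

```python
WARP_SIZE = 32

def executing_warps_non_divergent(block_dim):
    """Return [(offset, warps with at least one accumulating thread), ...]."""
    rows = []
    offset = block_dim // 2  # highest power of 2 below a power-of-2 block size
    while offset >= 1:
        # Accumulating threads are the contiguous range 0 .. offset-1.
        active_warps = {tid // WARP_SIZE for tid in range(offset)}
        rows.append((offset, len(active_warps)))
        offset //= 2
    return rows

rows = executing_warps_non_divergent(512)
print(rows[:3])                           # [(256, 8), (128, 4), (64, 2)]
print(sum(warps for _, warps in rows))    # 20
```

Over the whole reduction of a 512-thread block, this pattern needs 20 warp executions in total, compared with 95 for the divergent tree under the same counting.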
What happened?
• “Implicit divergence”
Why did we do this?
• Performance improvements
• Reveals GPU internals!
Final Puzzle
• What happens when the polynomial order increases?
• All these threads that we think are competing… are they?
In medicine…
• More sensitive devices -> more data!
• More intensive algorithms
• Real-time imaging and analysis
• Most are parallelizable problems!

http://www.varian.com

MRI
• “k-space” – Inverse FFT
• Real-time and high-resolution imaging

http://oregonstate.edu

CT, PET
• Low-dose techniques
• Safety!
• 4D CT imaging
• X-ray CT vs. PET CT
• Texture memory!

http://www.upmccancercenter.com/