Parallel beam back projection implementation



Parallel Beam Back Projection: Implementation

Srdjan Coric

Miriam Leeser

Eric Miller



Outline

  • Annapolis Wildstar

  • “Simple Architecture”

    • algorithm

    • datapath

    • Performance

    • Results

  • Parallelism extraction

  • “Advanced Architecture 4x”

    • datapath

    • Performance

    • Results

    • Implementation issues

  • Future directions


Data Flow

  • Sinogram data address generation

  • Sinogram data retrieval

  • Sinogram data prefetch

  • Linear interpolation

  • Data accumulation

  • Data read

  • Data write
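The stages named on this slide can be read as the steps of a pixel-driven back projection loop. The following is a minimal software sketch of that idea, not the hardware design from the slides: the function name `back_project`, the centered detector geometry, and the default of 3 fractional bits for the interpolation factor (taken from the parameters quoted on the performance slide) are illustrative assumptions.

```python
import math

def back_project(sinogram, size, frac_bits=3):
    """Pixel-driven parallel-beam back projection (illustrative sketch).

    sinogram: list of projections, each a list of integer samples.
    size: the output image is size x size pixels.
    frac_bits: bits kept for the interpolation factor (assumed value).
    Returns integer sums scaled by 2**frac_bits.
    """
    n_proj = len(sinogram)
    n_samp = len(sinogram[0])
    scale = 1 << frac_bits
    image = [[0] * size for _ in range(size)]
    for p, proj in enumerate(sinogram):
        theta = math.pi * p / n_proj
        c, s = math.cos(theta), math.sin(theta)
        for row in range(size):
            y = row - (size - 1) / 2
            for col in range(size):
                x = col - (size - 1) / 2
                # Address generation: detector coordinate hit by this pixel.
                t = x * c + y * s + (n_samp - 1) / 2
                i = math.floor(t)
                if i < 0 or i + 1 >= n_samp:
                    continue  # pixel projects outside the detector
                # Interpolation factor truncated to frac_bits bits.
                f = int((t - i) * scale)
                # Data retrieval: the two neighbouring samples.
                a, b = proj[i], proj[i + 1]
                # Linear interpolation in integer arithmetic.
                value = a * scale + (b - a) * f
                # Data read, accumulation, and write on the image memory.
                image[row][col] += value
    return image
```

With a constant sinogram every pixel receives the same sum, which makes a convenient sanity check before trying real data.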


Critical error-accumulation path (figure)

Error sources labelled in the figure:

  • LUT1 starting position

  • Corner starting position

  • LUT1 quantization error

  • LUT2 quantization error

  • LUT3 quantization error

  • Bit reduction error

  • Interpolation factor error

Bit-width labels for LUT1, LUT2, and LUT3 as extracted: 5, 10, 15, 1, 15, 2
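A small experiment can illustrate why an error-accumulation path matters. The sketch below assumes, without confirmation from the slide, that a position is built by repeatedly adding a quantized per-pixel increment; the bit widths, the angle, and the function names are hypothetical choices made only for illustration.

```python
import math

def quantize(value, frac_bits):
    """Round value to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale

def worst_position_error(theta, steps, frac_bits):
    """Largest gap between exact and incrementally accumulated positions."""
    exact_step = math.cos(theta)
    q_step = quantize(exact_step, frac_bits)  # stands in for a LUT entry
    position = 0.0
    worst = 0.0
    for n in range(1, steps + 1):
        position += q_step  # error of q_step is added again at every step
        worst = max(worst, abs(position - n * exact_step))
    return worst

# Compare a few hypothetical fractional widths over 512 accumulation steps.
for bits in (8, 12, 16):
    print(bits, worst_position_error(math.radians(33.0), 512, bits))
```

Because the same quantization error is added at every step, the worst-case gap grows linearly with the number of steps and is bounded by `steps * 2**-(frac_bits + 1)` under round-to-nearest.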




Performance Results: Software vs. FPGA Hardware

  • Software - Floating point - 450 MHz Pentium : ~ 240 s

  • Software - Floating point - 1 GHz Dual Pentium : ~ 94 s

  • Software - Fixed point - 450 MHz Pentium : ~ 50 s

  • Software - Fixed point - 1 GHz Dual Pentium : ~ 28 s

  • Hardware - 50 MHz : ~ 5.4 s

Parameters:

  • 1024 projections

  • 1024 samples per projection

  • 512 × 512 pixel image

  • 9-bit sinogram data

  • 3-bit interpolation factor
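As a plausibility check on the hardware figure, the runtime can be estimated from the parameters alone. The sketch below assumes the design performs one pixel update per clock cycle, which the slide does not state; `ideal_runtime_s` is a hypothetical helper name.

```python
def ideal_runtime_s(projections, image_side, clock_hz, updates_per_cycle=1):
    """Runtime if every clock cycle completes a fixed number of pixel updates."""
    updates = projections * image_side * image_side
    return updates / (clock_hz * updates_per_cycle)

# 1024 projections onto a 512 x 512 image at 50 MHz.
t = ideal_runtime_s(1024, 512, 50e6)
print(f"{t:.2f} s")  # prints 5.37 s
```

Under that assumption the estimate lands close to the reported ~5.4 s.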


Original image vs. hardware output image (heart features in focus)

  • Zoom: ~200%

  • Grayscale range < pixel value range

Original image vs. hardware output image (lung features in focus)

  • Zoom: ~200%

  • Grayscale range < pixel value range



Parallelism Issues

Memory bandwidth requirements at 50 MHz (for data accumulation), against a memory bandwidth limit of 1.2 GB/s:

  • Case 1: no parallelism extracted. 0.4 GB/s, T ~ k1*V1

  • Case 2: pixel-level parallelism extracted. 1.6 GB/s, T ~ k1*V2

  • Case 3: projection-level parallelism extracted. 0.4 GB/s, T ~ k2*V3

k1 < k2, V2 = V3 = V1/4, T = execution time

Figure axes: projections, image columns, image rows; regions V1, V2, V3.
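The three bandwidth figures can be reproduced with a simple model. The sketch below assumes each pixel update reads a partial sum and writes it back, and that a partial sum occupies 4 bytes; the 4-byte width is not given on the slide and is chosen only because it reproduces the 0.4 GB/s figure. The function name is hypothetical.

```python
def accumulation_bandwidth_gb_s(clock_hz, pixels_per_cycle, bytes_per_pixel=4):
    """Bandwidth for one read plus one write per pixel touched each cycle."""
    return clock_hz * pixels_per_cycle * 2 * bytes_per_pixel / 1e9

LIMIT_GB_S = 1.2  # memory bandwidth limit quoted on the slide

cases = {
    "Case 1 (no parallelism)": 1,           # one pixel per cycle
    "Case 2 (4 pixels in parallel)": 4,     # four pixels per cycle
    "Case 3 (4 projections in parallel)": 1,  # still one pixel per cycle
}
for name, pixels in cases.items():
    bw = accumulation_bandwidth_gb_s(50e6, pixels)
    print(name, bw, "GB/s", "fits" if bw <= LIMIT_GB_S else "exceeds limit")
```

In this model only Case 2 exceeds the limit, because projection-level parallelism combines several contributions to the same pixel before the accumulator memory is touched.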


Advanced Architecture - Data Path

Figure labels: Simple Architecture; projection parallelism extracted.
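One way to picture projection-level parallelism is to combine the contributions of several projections to a pixel before touching the accumulator memory. The sketch below is a software analogy under that assumption, not the authors' data path; `accumulate` and its `group` parameter are hypothetical names.

```python
def accumulate(contribs, group):
    """Accumulate per-projection pixel contributions.

    contribs[p][k]: already interpolated contribution of projection p to pixel k.
    group: number of projections combined per accumulator access.
    Returns the image and the number of accumulator memory accesses.
    """
    n_proj = len(contribs)
    n_pix = len(contribs[0])
    image = [0] * n_pix
    accesses = 0
    for p0 in range(0, n_proj, group):
        for k in range(n_pix):
            # Sum the group's contributions before touching memory.
            partial = sum(contribs[p][k]
                          for p in range(p0, min(p0 + group, n_proj)))
            image[k] += partial  # one read and one write
            accesses += 2
    return image, accesses

contribs = [[p + k for k in range(3)] for p in range(8)]
print(accumulate(contribs, 1))
print(accumulate(contribs, 4))
```

Both calls produce the same image, while grouping four projections cuts the accumulator accesses to a quarter.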


Performance Results: Software vs. FPGA Hardware

  • Software - Floating point - 450 MHz Pentium : ~ 240 s

  • Software - Floating point - 1 GHz Dual Pentium : ~ 94 s

  • Software - Fixed point - 450 MHz Pentium : ~ 50 s

  • Software - Fixed point - 1 GHz Dual Pentium : ~ 28 s

  • Hardware - 50 MHz : ~ 5.4 s

  • Hardware (Advanced Architecture) - 50 MHz : ~ 1.3 s

Parameters:

  • 1024 projections

  • 1024 samples per projection

  • 512 × 512 pixel image

  • 9-bit sinogram data

  • 3-bit interpolation factor
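The relative gains are easier to compare as speedup ratios. The sketch below only divides the approximate times listed on this slide; the dictionary keys and the helper name `speedup_over` are illustrative.

```python
# Approximate run times in seconds, as listed on the slide.
times_s = {
    "software float, 450 MHz Pentium": 240.0,
    "software float, 1 GHz dual Pentium": 94.0,
    "software fixed, 450 MHz Pentium": 50.0,
    "software fixed, 1 GHz dual Pentium": 28.0,
    "hardware, simple architecture, 50 MHz": 5.4,
    "hardware, advanced architecture, 50 MHz": 1.3,
}

def speedup_over(times, reference):
    """How many times faster `reference` is than each other entry."""
    ref = times[reference]
    return {name: t / ref for name, t in times.items() if name != reference}

for name, s in speedup_over(times_s, "hardware, advanced architecture, 50 MHz").items():
    print(f"{name}: {s:.1f}x")
```

Since the listed times are approximate, the ratios should be read as rough indications only.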


Implementation Issues: fanout

  • prj_num(3): fanout = 1565

  • Routing delay = 7.913 ns (~39.99%)
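The routing figure can be related to the 50 MHz clock target. The sketch below assumes the percentage is the routing share of the total delay on that path, which the slide does not state explicitly; `path_delay_ns` is a hypothetical helper.

```python
def path_delay_ns(routing_delay_ns, routing_share):
    """Total path delay implied by a routing delay and its share of the path."""
    return routing_delay_ns / routing_share

total_ns = path_delay_ns(7.913, 0.3999)  # assumed meaning of "~39.99%"
period_ns = 1e9 / 50e6                   # clock period at 50 MHz

print(f"path delay ~ {total_ns:.2f} ns, clock period = {period_ns:.0f} ns")
print("meets 50 MHz" if total_ns <= period_ns else "misses 50 MHz")
```

Under that reading the path sits just under the 20 ns period, which would explain why a high-fanout net is singled out as an implementation issue.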


Implementation Issues: fanout

  • odd_2_A_4[4]: fanout = 144


Memory Bridges Stuff

3 architectures implemented:

  • A. “Simple Architecture” = non-parallel (on slide 6)

  • B. “Advanced Architecture” = 4-way parallel (slide 12)

  • C. “Bridge Free Advanced Arch” = same as B, but without the memory bridges from the PCI bus to the memory banks that are required for host-memory communication (all design buffers are in BlockRAMs). The bridges are a separate design that is downloaded before design C, so that input data can be stored to the memories on the WildStar board, and again after design C, so that output data can be read from them.

Virtex1000 resource utilization:

  • A: 11% logic, 90% BlockRAMs (with bridges)

  • B: 39% logic, 100% BlockRAMs

  • C: 21% logic, 100% BlockRAMs


Floorplan of the “Bridge Free Advanced Architecture”

(design C on the previous slide)


Future Directions

  • Graduate