Using FPGAs to Supplement Ray-Tracing Computations on the Cray XD-1

Charles B. Cameron

United States Naval Academy

Department of Electrical Engineering

United States Naval Academy

105 Maryland Avenue, Stop 14B

Annapolis, Maryland 21402-5025

- Research supported by:
- NASA Goddard Space Flight Center (Code 586)
- NRL Applied Optics Branch (Code 5630)
- DoD High Performance Computing Modernization Program at NRL (Code 5593)
- United States Naval Academy
- Xilinx, Inc.

- Ray tracing
- Conventional parallel processing
- Modulo scheduling
- Coordination of sequential and parallel processing
- Expected Performance

- MODIS
- Moderate-resolution Imaging Spectroradiometer

- The Intersection Problem
- Finding the Perpendicular
- Refraction
- Reflection

- 485 pinholes
- 400 rays per pinhole
- 241 × 121 rays reflected from the diffuser
- 5.66 × 10^9 rays
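
As a quick check, the total appears to be the product of the counts listed on this slide. That reading is an assumption about how the figure was obtained; the variable names are illustrative only.

```python
# Assumption: total rays = pinholes x rays per pinhole x diffuser grid.
pinholes = 485
rays_per_pinhole = 400
diffuser_rays = 241 * 121  # rays reflected from the diffuser

total_rays = pinholes * rays_per_pinhole * diffuser_rays
print(f"{total_rays:.3e}")  # about 5.66e9
```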

- MODIS
- Moderate-resolution Imaging Spectroradiometer

- The Intersection Problem
- Finding the Perpendicular
- Refraction
- Reflection
- Coordinate Transformation

- MODIS
- Moderate-resolution Imaging Spectroradiometer

- The Intersection Problem
- Finding the Perpendicular
- Refraction
- Reflection
- Coordinate Transformation
(Hard to visualize this!)

Conventional parallel processing

*

99.998 %

5,857 %

* Rate based on a linear regression of results obtained using varying numbers of processors.

Modulo scheduling

Not too many of these

Lots of these

Latency

Critical Path

(Data-Flow Limit)

88 cycles

Equal to the Data-Flow Limit

One collective computation

Multipliers are 100 % utilized

No schedule conflicts

Two multipliers with two multiplications each

One adder with two additions

Two cycles

Maximum efficiency
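
The two-cycle figure can be reproduced with the usual resource-constrained bound from the modulo-scheduling literature: the interval between successive iterations cannot be smaller than the number of operations of a kind divided by the number of units of that kind. The sketch below assumes every unit accepts one new operation per cycle; the function names and dictionary keys are hypothetical, not taken from the presentation.

```python
from math import ceil

def resource_min_interval(op_counts, unit_counts):
    """Smallest initiation interval (in cycles) the resources allow.

    Assumes each functional unit can start one operation per cycle.
    """
    return max(ceil(op_counts[r] / unit_counts[r]) for r in op_counts)

def utilization(op_counts, unit_counts, interval):
    """Fraction of issue slots each resource type uses per interval."""
    return {r: op_counts[r] / (unit_counts[r] * interval) for r in op_counts}

# Slide example: two multipliers with two multiplications each,
# one adder with two additions.
ops = {"multiplier": 4, "adder": 2}
units = {"multiplier": 2, "adder": 1}

interval = resource_min_interval(ops, units)
usage = utilization(ops, units, interval)
print(interval, usage)
```

With these counts the bound is two cycles and both resource types are fully occupied, which matches the "Two cycles" and "Maximum efficiency" labels.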

Improved efficiency:

Up from 25 %

Less than the Data-Flow Limit

Less than the Data-Flow Limit, but double the throughput.

Coordination of sequential and parallel processing

- MPI (Message Passing Interface)
- Master node
- Reads file
- Distributes file
- Collates results

- OpenMP (Open Multi-Processing)
- 144 of 220 nodes have a Xilinx Virtex II Pro FPGA
- Opteron processors
- Sequential program
- Depth first

- FPGA
- Pipelined hardware
- Breadth first
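
One way to read the depth-first and breadth-first labels is sketched below. It is an interpretation, not the presenter's code: a sequential program follows one ray through every surface before starting the next, while pipelined hardware pushes the whole batch of rays through one stage at a time. The stage functions are hypothetical stand-ins for surface computations.

```python
def trace_depth_first(rays, stages):
    """Sequential style: finish each ray before starting the next."""
    results = []
    for ray in rays:
        for stage in stages:
            ray = stage(ray)
        results.append(ray)
    return results

def trace_breadth_first(rays, stages):
    """Pipeline style: apply one stage to every ray, then the next stage."""
    batch = list(rays)
    for stage in stages:
        batch = [stage(ray) for ray in batch]
    return batch

# Hypothetical stages operating on plain numbers in place of real rays.
stages = [lambda r: r + 1, lambda r: r * 2]
rays = [0, 1, 2]

depth_result = trace_depth_first(rays, stages)
breadth_result = trace_breadth_first(rays, stages)
print(depth_result, breadth_result)
```

Both orders produce the same results; the difference is which loop is outermost, and therefore how well a deep pipeline stays filled.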

Expected Performance

- Modulo scheduling achieves 100 % utilization of critical resources.
- Sequential processors get a boost from supplemental FPGA processing.
- Deep pipelines are efficient only if filled much of the time.
- FPGAs beat ASICs only if they can take advantage of special problem knowledge.
- Opteron uses 55 W.
- Virtex II Pro FPGA uses 4 W to 45 W.

- Intersection of a Ray with a Plane
- Intersection of a Ray with a Sphere
- Intersection of a Ray with a Conicoid
- Finding the Perpendicular
- Interaction of a Ray with an Optical Surface
- Coordinate Transformations

Point in the plane

Initial direction

Final point

Initial point

Normal to the plane

List of equations
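
The slide's equations did not survive extraction. As a stand-in, here is a minimal sketch of the standard ray-plane intersection using the quantities labeled on the slide (initial point, initial direction, a point in the plane, normal to the plane). The function and parameter names are assumptions, not the presenter's notation.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def intersect_ray_plane(p0, d, q, n, eps=1e-12):
    """Final point where the ray p0 + t*d meets the plane through q with normal n.

    Returns None when the ray is parallel to the plane. A negative t
    (plane behind the ray) is not filtered out in this sketch.
    """
    denom = dot(n, d)
    if abs(denom) < eps:
        return None
    t = dot(n, [qi - pi for qi, pi in zip(q, p0)]) / denom
    return tuple(pi + t * di for pi, di in zip(p0, d))

hit = intersect_ray_plane((0, 0, 0), (0, 0, 1), (0, 0, 5), (0, 0, 1))
print(hit)
```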

Initial direction

Final point

Initial point

List of equations
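
With the original equations missing, the following sketch shows the textbook ray-sphere intersection: substitute the ray into the sphere equation and solve the resulting quadratic. The sphere's center and radius, the choice of the nearest forward root, and all names are assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def intersect_ray_sphere(p0, d, center, radius):
    """Nearest forward intersection of the ray p0 + t*d with a sphere, or None."""
    oc = [p - c for p, c in zip(p0, center)]
    a = dot(d, d)
    b = 2 * dot(oc, d)
    c = dot(oc, oc) - radius ** 2
    disc = b * b - 4 * a * c
    if disc < 0:
        return None  # the ray misses the sphere
    root = math.sqrt(disc)
    for t in ((-b - root) / (2 * a), (-b + root) / (2 * a)):
        if t > 0:
            return tuple(p + t * q for p, q in zip(p0, d))
    return None  # the sphere lies behind the ray

hit = intersect_ray_sphere((0, 0, -5), (0, 0, 1), (0, 0, 0), 1)
print(hit)
```

A real optical trace may need the far root instead, depending on which side of the surface is in use.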

Final point

Initial point

Initial direction

List of equations
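
The conicoid equations are also missing. A minimal sketch follows, assuming the common optical-design form c(x² + y² + (1 + k)z²) − 2z = 0, with the vertex at the origin, the axis along z, curvature c, and conic constant k. That form, the rule of picking the hit closest to the vertex plane, and all names are assumptions; the presentation may use a different parameterization.

```python
import math

def intersect_ray_conicoid(p0, d, c, k, eps=1e-12):
    """Intersect the ray p0 + t*d with c*(x^2 + y^2 + (1+k)*z^2) - 2*z = 0."""
    x0, y0, z0 = p0
    dx, dy, dz = d
    s = 1 + k
    qa = c * (dx * dx + dy * dy + s * dz * dz)
    qb = 2 * c * (x0 * dx + y0 * dy + s * z0 * dz) - 2 * dz
    qc = c * (x0 * x0 + y0 * y0 + s * z0 * z0) - 2 * z0
    if abs(qa) < eps:
        # Degenerate case, e.g. a paraboloid with an axis-parallel ray.
        if abs(qb) < eps:
            return None
        roots = [-qc / qb]
    else:
        disc = qb * qb - 4 * qa * qc
        if disc < 0:
            return None
        root = math.sqrt(disc)
        roots = [(-qb - root) / (2 * qa), (-qb + root) / (2 * qa)]
    hits = [tuple(p + t * q for p, q in zip(p0, d)) for t in roots if t > 0]
    if not hits:
        return None
    # Assumed rule: keep the hit on the sheet nearest the vertex plane z = 0.
    return min(hits, key=lambda h: abs(h[2]))

vertex_hit = intersect_ray_conicoid((0, 0, -1), (0, 0, 1), c=1, k=0)
parabola_hit = intersect_ray_conicoid((2, 0, -1), (0, 0, 1), c=0.5, k=-1)
print(vertex_hit, parabola_hit)
```

With k = 0 the surface reduces to a sphere of radius 1/c, which gives a convenient cross-check against the sphere case.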

Unit Vector Normal to a Sphere

Unit Vector Normal to a Conicoid

List of equations
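
A sketch of the two unit normals, since the slide's formulas were lost. The sphere normal is the radial direction; the conicoid normal is taken as the normalized gradient of the assumed implicit form c(x² + y² + (1 + k)z²) − 2z = 0. The sign convention (which way the normal points) and all names are assumptions.

```python
import math

def unit(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)

def sphere_normal(p, center):
    """Unit vector from the sphere's center through the surface point p."""
    return unit([pi - ci for pi, ci in zip(p, center)])

def conicoid_normal(p, c, k):
    """Unit gradient of c*(x^2 + y^2 + (1+k)*z^2) - 2*z at point p."""
    x, y, z = p
    return unit((c * x, c * y, c * (1 + k) * z - 1))

n_sphere = sphere_normal((0, 0, 0), (0, 0, 1))
n_conic = conicoid_normal((0, 0, 0), c=1, k=0)
print(n_sphere, n_conic)
```

The two calls describe the same surface point (the vertex of a unit sphere centered at z = 1), so the normals agree.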

Refraction

Reflection

Initial index of refraction

Final index of refraction

Normal to the plane

Initial direction

Final direction

List of equations
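
In place of the lost equations, here is a minimal sketch of the standard vector forms of the reflection law and Snell's law, using the slide's quantities (initial and final index of refraction, normal, initial and final direction). It assumes unit-length direction and normal vectors; the names are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(d, n):
    """Final direction after mirror reflection of unit direction d about unit normal n."""
    k = 2 * dot(d, n)
    return tuple(di - k * ni for di, ni in zip(d, n))

def refract(d, n, n1, n2):
    """Final direction after refraction from index n1 into index n2.

    Returns None on total internal reflection.
    """
    cos_i = -dot(d, n)
    if cos_i < 0:
        # The normal points along the ray; flip it to face the incoming ray.
        n = tuple(-x for x in n)
        cos_i = -cos_i
    eta = n1 / n2
    k = 1 - eta * eta * (1 - cos_i * cos_i)
    if k < 0:
        return None
    f = eta * cos_i - math.sqrt(k)
    return tuple(eta * di + f * ni for di, ni in zip(d, n))

mirrored = reflect((0, 0, -1), (0, 0, 1))
straight = refract((0, 0, -1), (0, 0, 1), 1.0, 1.5)
angle = math.radians(30)
bent = refract((math.sin(angle), 0, -math.cos(angle)), (0, 0, 1), 1.0, 1.5)
print(mirrored, straight, bent)
```

The tangential component of the refracted direction is the incident one scaled by n1/n2, which is Snell's law in vector form.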

Position in Frame of Reference k

Position in Frame of Reference k+1

Rotation and Translation

Rotation

Direction in Frame of Reference k+1

Rotation Matrix

Direction in Frame of Reference k

Translation Vector

List of equations
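
The labels indicate that positions undergo both rotation and translation while directions undergo rotation only. The sketch below assumes the convention p(k+1) = R (p(k) − T); the presentation may instead use R p(k) + T, and the function names are hypothetical.

```python
def mat_vec(rotation, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return tuple(sum(r * x for r, x in zip(row, v)) for row in rotation)

def transform_position(rotation, translation, p):
    """Position in frame k+1 from position p in frame k (assumed convention)."""
    shifted = [pi - ti for pi, ti in zip(p, translation)]
    return mat_vec(rotation, shifted)

def transform_direction(rotation, d):
    """Direction in frame k+1: rotation only, no translation."""
    return mat_vec(rotation, d)

# Frame k+1 is frame k rotated 90 degrees about z, with its origin at (1, 0, 0).
R = [[0, 1, 0],
     [-1, 0, 0],
     [0, 0, 1]]
T = (1, 0, 0)

new_position = transform_position(R, T, (2, 0, 0))
new_direction = transform_direction(R, (1, 0, 0))
print(new_position, new_direction)
```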