Hardware-Assisted Visibility Sorting for Tetrahedral Volume Rendering


Steven Callahan, Milan Ikits, João Comba, Cláudio Silva

- Introduction
- Previous Work
- Hardware-Assisted Visibility Sorting
- Results
- Future Work
- Conclusion

- Real-time volume rendering
- Scalable (machine performance)
- Data of arbitrary size
- Simple and robust implementations

[Figure: regular versus irregular grids.]

Unstructured grids are the preferred data type in scientific computations

Level-Of-Detail (LOD) techniques intrinsically need unstructured grids

[Image credit: El-Sana et al., Ben-Gurion]

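Absorption plus emission

[Figure: light along a ray through segments 0, 1, and 2 with intensities I0, I1, and I2, composited front to back: I0 and I1 are merged into I01, which is then combined with I2.]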
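As a minimal sketch of front-to-back compositing, the snippet below merges three ray segments in the order suggested by the figure labels (0 and 1 first, then 2). The premultiplied (color, alpha) representation, the single color channel, and the sample values are assumptions made for illustration; they are not taken from the slides.

```python
def over(front, back):
    """Composite a premultiplied (color, alpha) 'front' sample over 'back'."""
    cf, af = front
    cb, ab = back
    # The back sample is attenuated by the transparency of the front sample.
    return (cf + (1.0 - af) * cb, af + (1.0 - af) * ab)


# Hypothetical segment contributions, ordered front to back.
I0 = (0.2, 0.5)
I1 = (0.3, 0.4)
I2 = (0.1, 0.2)

I01 = over(I0, I1)    # merge the two front segments first
I012 = over(I01, I2)  # then add the segment behind them
```

Because the operator is associative, merging segments 1 and 2 first and then placing segment 0 in front gives the same result, as long as the front-to-back order is respected.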

Class 1 (+, +, +, -)

Class 2 (+, +, -, -)

Projected Tetrahedra [Shirley-Tuchman 1990]
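The sign patterns above can be read as the facing of the four faces of a tetrahedron relative to the viewer. The sketch below assumes that "+" means front-facing and "-" means back-facing, and that `view_dir` points from the eye into the scene; the helper names and the example tetrahedron are hypothetical, and degenerate (edge-on) faces are simply reported as "other".

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def face_signs(verts, view_dir):
    """Return '+', '-' or '0' for each face (face i is opposite vertex i)."""
    signs = []
    for i in range(4):
        o0, o1, o2 = [verts[j] for j in range(4) if j != i]
        n = cross(sub(o1, o0), sub(o2, o0))
        # Flip the normal so that it points away from the opposite vertex.
        if dot(n, sub(verts[i], o0)) > 0:
            n = (-n[0], -n[1], -n[2])
        d = dot(n, view_dir)
        signs.append('+' if d < 0 else '-' if d > 0 else '0')
    return signs


def classify(verts, view_dir):
    signs = face_signs(verts, view_dir)
    if '0' in signs:
        return "other"
    plus = signs.count('+')
    return "class 1" if plus in (1, 3) else "class 2"


tet = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0.25, 0.25, 1)]
seen_from_above = classify(tet, (0, 0, -1))
seen_obliquely = classify(tet, (1, -2, -1))
```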

[Figure: rendering pipeline (Application, Object-Space, Rasterization, Image-Space, Display) with sorting performed in object space, i.e., let's sort the geometry!]

[Figure: a mesh with cells numbered 1 to 7 and a viewpoint p; for two cells A and B, the position of p gives either A < B or B < A.]

Idea: Define ordering relations by looking at shared faces.

[Figure: cells A to F and the viewing direction, with the relations A < C, B < E, C < E, C < D, E < F, D < F.]
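Once pairwise relations between cells are known, a visibility order can be obtained with a topological sort. The sketch below uses the relations listed in the figure and assumes that "X < Y" means X must be rendered before Y; the function name is hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for whatever sorting routine a real renderer would use.

```python
from graphlib import TopologicalSorter

# Ordering relations taken from the figure; (front, back) means front < back.
relations = [("A", "C"), ("B", "E"), ("C", "E"),
             ("C", "D"), ("E", "F"), ("D", "F")]


def visibility_order(relations):
    """Return one cell order that respects every (front, back) relation."""
    sorter = TopologicalSorter()
    for front, back in relations:
        # 'back' may only be emitted after 'front'.
        sorter.add(back, front)
    return list(sorter.static_order())


order = visibility_order(relations)
```

If the relations contain a cycle, `static_order()` raises `graphlib.CycleError`, so this sketch can only produce an order for acyclic relations.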

Missing relations!

Idea: Use ray-shooting queries to complement the ordering relations.

[Figure: cells A, B, C, and D and the viewing direction, labeled with the relations A < C, B < D, and A < B.]

[Figure: rendering pipeline (Application, Object-Space, Rasterization, Image-Space, Display) with sorting performed in image space, i.e., let's sort the pixels!]

- Idea: Keep a list of intersections for each pixel [Carpenter 1984].

[Figure: per-pixel fragment lists, first not sorted, then sorted.]

Number of intersections: O(cn^2), for n x n pixels and c cells

- Problems
- Time: sorting takes too long
- Memory: storage too high

[Figure: rendering pipeline (Application, Object-Space, Rasterization, Image-Space, Display) with sorting performed in both object space and image space.]

[Figure: fragments numbered 1 to 7 accumulating in the buffer step by step.]

A Solution: Use an insertion-sort A-buffer!

What about the space problem?

Use a conservative bound on the intersections
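An insertion-sort A-buffer can be pictured as a per-pixel list that stays sorted as fragments arrive. The following is a CPU-side sketch only: the class name, the `(depth, value)` fragment layout, and the use of `bisect.insort` are assumptions made for illustration, and the list is unbounded, which is exactly the space problem raised above.

```python
import bisect
from collections import defaultdict


class ABuffer:
    """Per-pixel fragment lists kept sorted by depth on insertion."""

    def __init__(self):
        self.pixels = defaultdict(list)

    def insert(self, x, y, depth, value):
        # insort places the fragment at its sorted position in the list.
        bisect.insort(self.pixels[(x, y)], (depth, value))

    def fragments(self, x, y):
        return list(self.pixels.get((x, y), []))


abuf = ABuffer()
for depth, value in [(3.0, 0.7), (1.0, 0.2), (2.0, 0.5)]:
    abuf.insert(10, 20, depth, value)
sorted_fragments = abuf.fragments(10, 20)
```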

- Sort in image-space and object-space
- Do an approximate object-space sorting of the cells on the CPU (i.e. sort by face centroid)
- Complete the sort in image-space by using a fixed depth A-buffer (called a k-buffer) implemented on the GPU
- Can handle non-convex meshes, has a low memory overhead, and requires minimal pre-processing of data
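The approximate object-space step can be sketched as a sort of faces by the distance from their centroid to the eye. The data layout (a vertex list plus faces as index triples), the function names, and the use of Euclidean distance are assumptions for illustration; the result is only approximately ordered, which is why the image-space pass is still needed.

```python
import math


def centroid(face, vertices):
    """Average of the face's vertex positions."""
    pts = [vertices[i] for i in face]
    return tuple(sum(p[axis] for p in pts) / len(pts) for axis in range(3))


def approximate_sort(faces, vertices, eye):
    """Order faces front to back by centroid distance to the eye."""
    return sorted(faces, key=lambda f: math.dist(centroid(f, vertices), eye))


vertices = [(0, 0, 1), (1, 0, 1), (0, 1, 1),
            (0, 0, 5), (1, 0, 5), (0, 1, 5)]
faces = [(3, 4, 5), (0, 1, 2)]
ordered = approximate_sort(faces, vertices, eye=(0, 0, 0))
```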

- Fixed size A-buffer of depth k
- Fragment stream sorter
- Stores k entries for each pixel. Each entry consists of the fragment’s scalar value and its distance to the viewpoint
- An incoming fragment replaces the entry that is closest to the eye; that entry is composited front to back
- Given a sequence of fragments such that each fragment is within k positions of its position in sorted order, it will output the fragments in sorted order
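The sorting behaviour of a fixed-depth buffer can be simulated on the CPU for a single pixel. This is a sketch under stated assumptions, not the GPU implementation: fragments are reduced to plain distances, the function name is hypothetical, and the buffer is a Python list rather than render-target channels.

```python
def k_buffer_sort(distances, k):
    """Stream distances through a buffer of k entries, emitting the nearest."""
    buffer = []
    for d in distances:
        buffer.append(d)
        if len(buffer) > k:
            # k stored entries plus the incoming one: emit the nearest.
            nearest = min(buffer)
            buffer.remove(nearest)
            yield nearest
    # Flush the entries still held once the stream ends.
    yield from sorted(buffer)


stream = [2.0, 3.0, 1.0]  # the last fragment belongs two positions earlier
with_k1 = list(k_buffer_sort(stream, 1))
with_k2 = list(k_buffer_sort(stream, 2))
```

In this example a depth of k = 1 emits 2.0 before 1.0 has arrived, so the output is out of order, while k = 2 holds enough entries to recover the sorted sequence.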

[Figure: buffer layout. One buffer holds the accumulated r, g, b, and a components; the k-buffer entries are stored as scalar/distance pairs (v1, d1) through (v6, d6).]

- Use multiple render target capability of ATI graphics cards (ATI_draw_buffers in OpenGL)
- Use P-buffer to accumulate color and opacity and three Aux buffers for the k-buffer entries

[Figure: render targets: P-buffer, Aux 0, Aux 1, Aux 2.]

- Fix incorrect screen-space texture coordinates caused by perspective-correct interpolation

[Figure: projecting vertices to find texture coordinates (subject to perspective interpolation) versus projecting texture coordinates in the shader.]
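A small numeric example may help show why per-vertex screen-space coordinates go wrong. The sketch assumes a single edge whose endpoints have screen x of 0 and 1 and clip-space w of 1 and 3; the function name and the values are hypothetical. Interpolating the already projected coordinate with perspective correction gives the wrong position, while interpolating the clip-space values and dividing per fragment recovers the true one.

```python
def perspective_interp(a0, a1, w0, w1, t):
    """Perspective-correct interpolation at screen-space parameter t."""
    numerator = (1 - t) * a0 / w0 + t * a1 / w1
    denominator = (1 - t) / w0 + t / w1
    return numerator / denominator


x0, w0 = 0.0, 1.0   # screen x and clip w at the first vertex
x1, w1 = 1.0, 3.0   # screen x and clip w at the second vertex
t = 0.5             # fragment halfway along the edge on screen (true x = 0.5)

# Projecting at the vertices, then interpolating with perspective correction.
per_vertex = perspective_interp(x0, x1, w0, w1, t)

# Interpolating clip-space x and w, then projecting in the fragment stage.
clip_x = perspective_interp(x0 * w0, x1 * w1, w0, w1, t)
clip_w = perspective_interp(w0, w1, w0, w1, t)
per_fragment = clip_x / clip_w
```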

- Simultaneously reading and writing to a buffer is undefined when fragments are rasterized in parallel

- The buffers are initialized and flushed using k screen-aligned rectangles with negative scalar values
- Handling non-convex objects requires tagging the exterior faces with a negative distance d and using the sign of the scalar value v to track whether we are inside or outside the mesh

- Early ray termination reads accumulated opacity and kills fragment if it is over a given threshold. Early z-test is currently not available on ATI 9800 when using multiple rendering targets
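Early ray termination can be sketched as a check of the accumulated opacity before each fragment is composited. The threshold value, the premultiplied single-channel fragments, and the function name below are assumptions for illustration; on the GPU the skipped work would be a killed fragment rather than a loop exit.

```python
def composite_with_termination(fragments, threshold=0.95):
    """Composite front to back, skipping fragments once opacity is high."""
    color, alpha, used = 0.0, 0.0, 0
    for c, a in fragments:
        if alpha > threshold:
            break  # remaining fragments would be killed
        color += (1.0 - alpha) * c
        alpha += (1.0 - alpha) * a
        used += 1
    return color, alpha, used


# Five identical, fairly opaque fragments (hypothetical values).
color, alpha, used = composite_with_termination([(0.5, 0.9)] * 5)
```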

- Previous Work
- Volume density optical model
- Williams and Max 1992

- Pre-integration on GPU
- Roettger et al. 2000
- 5 s to update a 128x128x128 table

- Incremental pre-integration on CPU
- Weiler et al. 2003
- 1.5 s to update a 128x128x128 table

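[Figures: the volume density optical model and the pre-integration table (labeled T, 3D) indexed by front scalar Sf, back scalar Sb, and length l, shown for Williams and Max, Roettger et al., and Weiler et al.; the Weiler et al. figure also labels an intermediate scalar Sp and a length l'.]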

- Our Approach
- Incremental pre-integration of the 3D transfer function completely on the GPU
- Compute base slice using [Roettger et al.]
- Compute the other slices using the base slice and the previously computed slice [Weiler et al.]

- 0.067 s to update a 128x128x128 table
- This allows interactive updates to the colormap and transfer function opacity
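The idea of building a longer slice from a base slice and the previous slice can be illustrated for opacity alone. The sketch below is not the GPU implementation: it assumes a scalar that varies linearly along the ray segment, uses a hypothetical analytic extinction function `tau` in place of table lookups, and ignores color. Under those assumptions, a segment of length l splits at an intermediate scalar sp into a segment of length l - base and a base-length segment, and their opacities composite to the direct result.

```python
import math


def tau(s):
    """Hypothetical extinction coefficient as a function of the scalar."""
    return 4.0 * s


def tau_integral(s):
    """Antiderivative of tau, used for the exact optical depth."""
    return 2.0 * s * s


def alpha(sf, sb, l):
    """Opacity of a segment of length l with scalar going from sf to sb."""
    if sf == sb:
        depth = l * tau(sf)
    else:
        depth = l * (tau_integral(sb) - tau_integral(sf)) / (sb - sf)
    return 1.0 - math.exp(-depth)


def alpha_incremental(sf, sb, l, base=1.0):
    """Same opacity, built from a shorter segment plus a base segment."""
    if l <= base:
        return alpha(sf, sb, l)
    sp = sf + (sb - sf) * (l - base) / l      # scalar at the split point
    a_prev = alpha_incremental(sf, sp, l - base, base)
    a_base = alpha(sp, sb, base)
    return a_prev + (1.0 - a_prev) * a_base   # front-to-back compositing


direct = alpha(0.2, 0.8, 3.0)
incremental = alpha_incremental(0.2, 0.8, 3.0)
```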


- Environment
- 3.0 GHz Pentium 4
- 1024 MB RAM
- Windows XP
- ATI Radeon 9800 Pro

- Results
- k-buffer analysis
- Performance results

- Accuracy analysis
- Analysis of k depth required to correctly render datasets
- Max values from 14 fixed viewpoints

- Distribution analysis
- Shows actual pixels that require large k depths to render correctly for each viewpoint

k <= 2 (green), 2 < k <= 6 (yellow), k > 6 (red)
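One way to estimate the k depth a pixel needs, following the "within k positions of its sorted position" criterion, is to measure how far each fragment arrives from its sorted position. The sketch below is an assumption about how such an analysis could be done, not the procedure used for the reported results; the function names and the sample fragment lists are hypothetical, and the color bins follow the legend above.

```python
def required_k(distances):
    """Largest offset between a fragment's arrival and sorted positions."""
    arrival_by_rank = sorted(range(len(distances)), key=lambda i: distances[i])
    return max((abs(arrival - rank)
                for rank, arrival in enumerate(arrival_by_rank)), default=0)


def color_class(k):
    """Map a required depth to the legend colors."""
    if k <= 2:
        return "green"
    if k <= 6:
        return "yellow"
    return "red"


pixel_fragments = {
    (0, 0): [1.0, 2.0, 3.0],               # already sorted
    (0, 1): [2.0, 3.0, 1.0],               # nearest fragment arrives last
    (1, 0): [4.0, 5.0, 6.0, 1.0, 2.0, 3.0],
}
classes = {p: color_class(required_k(d)) for p, d in pixel_fragments.items()}
```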

- Performance
- Average values from 14 fixed viewpoints
- Does not include partial sort on CPU
- 512 x 512 viewport with a 128 x 128 x 128 pre-integrated transfer function

- Optimize partial sort on CPU
- Develop techniques to refine datasets to respect a given k (subdivide degenerate tets)
- Incorporate isosurface rendering
- Parallel techniques
- Proper hole handling
- Dynamic data
- Use early z-test

- Renders up to 6 million Tets/sec when using a linear transfer function
- Handles arbitrary non-convex meshes
- Requires minimal pre-processing of data
- Maximum data size is bounded by main memory
- Uses simple vertex and fragment shaders