Efficient GPU Mesh Refinement for Real-Time Graphics Rendering

Generic Mesh Refinement on GPU Tamy Boubekeur & Christophe Schlick LaBRI – INRIA – CNRS University of Bordeaux

Today's GPU Features • Combination of Vertex and Fragment Shaders • Dedicated hardware primitives (VBO, FBO) • C-like programming languages (GLSL, Cg, HLSL) > Most of the rendering work is done on the GPU

Today's GPU Rendering • Per-pixel illumination • Hard and Soft Shadows • HDR images [ATI 2003] • Global Illumination Effects [NVIDIA 2005] > Near-realistic real-time rendering

Realistic Image • Means realistic lighting of course, but also: • Complex objects (many polygons) • Details (at the pixel resolution) • Usual solution: • Shape complexity: smooth and/or detailed surfaces • Appearance complexity: high-resolution multi-texturing + fragment shaders

Fair Surface Representation • Multi-scale representation for real-time graphics: • Light representation for animation/physics. • On-the-fly geometric refinement for silhouette and contours. • Details at rasterization time. • One acceptable solution: • CPU: Low resolution “clever” mesh, carefully designed. • CPU/GPU: Medium resolution displacement map. • GPU: High resolution appearance map (normal, color).

Previous Work • Subdivision Surfaces • Many schemes [Catmull 1978][Loop 1987][Zorin 2000][Stam 2003] • Local control [Biermann 2001] • Hardware evaluation [Bischoff 2000][Bolz 2002] • GPU implementation [Bolz 2003][Shiue 2005] • Local Refinement • Curved PN Triangles [Vlachos 2001][Chung 2003] • Scalar Tagged Meshes [Boubekeur 2005]

Dynamic Refinement • Frame rendering on non-programmable graphics hardware [Diagram: refinement pipeline split between CPU and GPU across the graphics bus; stages: Coarse Mesh, Tessellator, Tessellated Mesh, Displacement, Refined Mesh, Rendering]

Dynamic Refinement • Frame rendering on programmable graphics hardware [Diagram: refinement pipeline split between CPU and GPU across the graphics bus; stages: Coarse Mesh, Tessellator, Tessellated Mesh, Displacement, Refined Mesh, Rendering]

Our goal • A virtual GPU Tessellator Unit on today's GPUs [Diagram: refinement pipeline split between CPU and GPU across the graphics bus; stages: Coarse Mesh, Tessellator, Tessellated Mesh, Displacement, Refined Mesh, Rendering] • Dynamic coarse meshes on CPU • Per-polygon tessellation on GPU • Local computation of refined vertices • A solution general enough for various kinds of local displacement

Our Approach • Storing only ONE fully tessellated triangle, the Refinement Pattern (RP), directly in GPU memory • “Configuring” the GPU with the coarse triangle parameters • Rendering the RP instead of each coarse triangle • “Conforming” the RP to the current GPU “configuration”

GPU Refinement Pattern • The only primitive ever drawn • A collection of triangle strips, pre-generated and stored on the GPU • Problem: • How to efficiently transform the RP onto the current coarse triangle? • How to solve this problem in a GPU-friendly fashion (i.e. for each vertex of the RP independently)?

GPU Refinement Pattern • Solution: • Encoding the barycentric coordinates {u, v, w} as the positions of the RP vertices. • Using this parameterization to interpolate the parameters (position, normal, color, texture coordinates, etc.) of the current triangle in the vertex shader
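The barycentric encoding described on this slide can be illustrated on the CPU. The following is a minimal sketch, not the authors' implementation: the function names `refinement_pattern` and `interpolate` are hypothetical, the pattern is assumed to be a uniform grid with `n` segments per edge, and it is stored as an indexed triangle list rather than the triangle strips used in the talk.

```python
def refinement_pattern(n):
    """Build a uniform pattern with n segments per edge (assumed layout).

    Returns (verts, tris): verts holds barycentric triples (u, v, w),
    tris holds index triples into verts.
    """
    index = {}
    verts = []
    for i in range(n + 1):
        for j in range(n + 1 - i):
            index[(i, j)] = len(verts)
            verts.append((i / n, j / n, (n - i - j) / n))
    tris = []
    for i in range(n):
        for j in range(n - i):
            # "upward" triangle of the grid cell
            tris.append((index[(i, j)], index[(i + 1, j)], index[(i, j + 1)]))
            # "downward" triangle, absent on the last diagonal
            if i + j < n - 1:
                tris.append((index[(i + 1, j)], index[(i + 1, j + 1)],
                             index[(i, j + 1)]))
    return verts, tris


def interpolate(bary, a0, a1, a2):
    """Blend one per-corner attribute (position, normal, color...)."""
    u, v, w = bary
    return tuple(u * x + v * y + w * z for x, y, z in zip(a0, a1, a2))


if __name__ == "__main__":
    verts, tris = refinement_pattern(4)
    corners = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    refined = [interpolate(b, *corners) for b in verts]
    print(len(verts), "vertices,", len(tris), "triangles")
```

Because the pattern only stores (u, v, w), the same buffer can be reused for every coarse triangle; only the three corner attributes change between draws.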

Displacement computation • Performed by the Vertex Shader for each refined vertex. • 3D-based displacement with interpolated position. • 2D-based displacement with interpolated texture coordinates > “texturing meshes with meshes”.

Level Of Detail • Discrete LOD by simply storing a set of RPs at different resolutions in GPU memory • Switching on a per-object basis between the different levels of detail • Continuous LOD by interpolation between 2 consecutive levels of detail.

Drawing algorithm
Draw (CoarseMesh M, level L)
    GPU::RP::bindLOD (L);
    for all Triangle T of M do
        GPU::placeAttributeOnGPU (T);
        GPU::RP::draw ();
Refinement Shader 1: Attribute Interpolation 2: Displacement Computation 3: Output refined vertex
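The drawing loop and the three shader steps can be emulated on the CPU to make the control flow concrete. This is only a sketch under stated assumptions: `EmulatedGPU`, `make_refinement_shader` and `draw` are hypothetical names, level L is assumed to mean 2^L segments per edge, and a coarse triangle is assumed to be a dict with "positions" and "normals" entries.

```python
def barycentric_grid(n):
    """Barycentric coordinates (u, v, w) of a pattern with n segments per edge."""
    return [(i / n, j / n, (n - i - j) / n)
            for i in range(n + 1) for j in range(n + 1 - i)]


class EmulatedGPU:
    """Stand-in for GPU state: stored patterns, bound LOD, current uniforms."""

    def __init__(self, max_level):
        # Assumption: level L uses 2**L segments per edge.
        self.patterns = {L: barycentric_grid(2 ** L) for L in range(max_level + 1)}
        self.bound = None
        self.uniforms = None
        self.draw_calls = 0
        self.output = []

    def bind_lod(self, level):           # GPU::RP::bindLOD (L)
        self.bound = self.patterns[level]

    def place_attributes(self, tri):     # GPU::placeAttributeOnGPU (T)
        self.uniforms = tri

    def draw(self, shader):              # GPU::RP::draw ()
        self.draw_calls += 1
        for bary in self.bound:
            self.output.append(shader(bary, self.uniforms))


def make_refinement_shader(displace):
    """Build a per-vertex function following the three steps of the slide."""
    def shader(bary, tri):
        u, v, w = bary
        # 1: attribute interpolation
        pos = tuple(u * a + v * b + w * c for a, b, c in zip(*tri["positions"]))
        nrm = tuple(u * a + v * b + w * c for a, b, c in zip(*tri["normals"]))
        # 2: displacement computation (user-supplied function)
        pos = displace(pos, nrm)
        # 3: output refined vertex
        return pos
    return shader


def draw(mesh, level, gpu, shader):
    gpu.bind_lod(level)
    for tri in mesh:
        gpu.place_attributes(tri)
        gpu.draw(shader)
```

A caller would build a list of triangle dicts, create `EmulatedGPU(max_level)`, and call `draw(mesh, level, gpu, make_refinement_shader(f))`; the emulator issues one draw per coarse triangle, matching the loop above.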

Implementation • Requires only Vertex Shader 1 compatible GPUs (all programmable GPUs) • No use of Fragment Shaders, no multi-pass, no render-to-texture, etc. • Small vertex shader code • RP stored as triangle strips in a VBO

Workload balance • Software side/CPU workload: • Initialization of the RP (set of RPs for LOD). • High-level modification of the coarse mesh. • Real-time graphics bus load: • Upload of the coarse per-triangle parameters through uniform variables (positions, normals, colors, additional local parameters…) • Draw call of the RP (VBO) at the chosen LOD Note: number of draw calls = number of coarse triangles

Workload balance • GPU additional workload: • At the beginning of the Vertex Program, for the RP-based interpolation (i.e. the “tessellation”) • Essentially, the procedural displacement computation or query (displacement map with Vertex Shaders 3.0) Note: the RP-based interpolation masks the memory latency for texture access from the vertex shader [NVIDIA 2004]

Procedural Displacement • 12 coarse triangles + frequency control; 786,432 triangles on the GPU • Concept: Object = Simple Base Mesh + Procedural Function(s) • Deep refinement required for high frequencies, e.g. a·sin(f|P|) • Full use of GPU (GPU-limited for deep refinements)
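A small sketch of the example function on this slide. The slide gives only the scalar a·sin(f|P|); applying it along the interpolated normal is an assumption made here, as are the names `displace` and `refined_triangle_count`. The triangle count on the slide is consistent with 256 segments per edge (12 × 256² = 786,432), which is also an inference rather than something the slide states.

```python
import math


def displace(p, n, a, f):
    """Offset point p along normal n by a*sin(f*|p|) (direction assumed)."""
    r = math.sqrt(sum(c * c for c in p))
    d = a * math.sin(f * r)
    return tuple(pc + d * nc for pc, nc in zip(p, n))


def refined_triangle_count(coarse_triangles, segments_per_edge):
    """A uniform pattern with n segments per edge holds n*n triangles."""
    return coarse_triangles * segments_per_edge * segments_per_edge


if __name__ == "__main__":
    print(refined_triangle_count(12, 256))
    print(displace((1.0, 0.0, 0.0), (1.0, 0.0, 0.0), a=0.1, f=20.0))
```

Higher values of `f` need a denser pattern to be sampled without aliasing, which is why the slide ties high frequencies to deep refinement.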

Curved PN Triangles • Local triangle smoothing with cubic Bezier patches • Point-Normal Construction • Linear or quadratic normal field [ATI/Vlachos et al. 2004] • Interpolatory refinement • No dynamic topology to manage • Still an unordered triangle stream • Usually enough for real-time rendering

Curved PN Triangles • With our approach: • Each coarse triangle is sent to the GPU with its Bezier coefficients: {b300, b030, b003, b210, b201, b021, b120, b102, b012, b111} • After tessellation, the displacement is computed from these coefficients in the vertex program Note: coefficients can be computed on the GPU and stored as textures.
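For reference, here is a sketch of how the ten coefficients can be built and evaluated, assuming the standard point-normal construction of Curved PN Triangles [Vlachos 2001]. The function names are hypothetical, coefficient b_ijk is stored under the key (i, j, k), and the weights `t` are the barycentric weights of corners P[0], P[1], P[2] in that order.

```python
from math import factorial


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(a, s):
    return tuple(x * s for x in a)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pn_coefficients(P, N):
    """Ten cubic Bezier control points from 3 corners P and unit normals N."""
    def edge(i, j):
        # Point at 1/3 of edge P[i]->P[j], projected onto the tangent plane at P[i].
        w = dot(sub(P[j], P[i]), N[i])
        return scale(sub(add(scale(P[i], 2.0), P[j]), scale(N[i], w)), 1.0 / 3.0)

    b = {(3, 0, 0): P[0], (0, 3, 0): P[1], (0, 0, 3): P[2],
         (2, 1, 0): edge(0, 1), (1, 2, 0): edge(1, 0),
         (0, 2, 1): edge(1, 2), (0, 1, 2): edge(2, 1),
         (1, 0, 2): edge(2, 0), (2, 0, 1): edge(0, 2)}
    E = (0.0, 0.0, 0.0)
    for key, point in b.items():
        if max(key) == 2:            # the six edge control points
            E = add(E, point)
    E = scale(E, 1.0 / 6.0)
    V = scale(add(add(P[0], P[1]), P[2]), 1.0 / 3.0)
    b[(1, 1, 1)] = add(E, scale(sub(E, V), 0.5))
    return b


def evaluate(b, t):
    """Evaluate the cubic Bezier triangle at barycentric weights t."""
    point = (0.0, 0.0, 0.0)
    for (i, j, k), c in b.items():
        weight = (6 / (factorial(i) * factorial(j) * factorial(k))
                  * t[0] ** i * t[1] ** j * t[2] ** k)
        point = add(point, scale(c, weight))
    return point
```

In the setting of the talk, `pn_coefficients` would correspond to the per-triangle data uploaded before each draw, and `evaluate` to the displacement step run for every RP vertex.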

Scalar Tagged Mesh [EG 2005] • Enriched Meshes for real-time applications • PN Triangles + Scalar tag control on a per-vertex basis • Surface refinement driven by scalar tags

Scalar Tagged Mesh [EG 2005] • Shape factors: sharpness, tension, bias, … • Sharp creases locally defined • Bezier coefficients: classification around each vertex • Intuitive behavior

Scalar Tagged Mesh [EG 2005] • Procedural Normal Map • Procedural Displacement Map • With our approach: still coarse triangle/coefficient transmission Note: branching needed in our implementation

Performance Simple tessellation of triangular meshes. • no full resolution topology generation/storage • For dynamic objects, between 1 and 3 orders of magnitude faster than a CPU refinement

Advantages • Negligible graphics bus bandwidth • Full use of GPU for deep refinement • Easy to implement • Easy to integrate as API extension (OpenGL driver implementation) • Drawing of objects that would not fit in memory

Drawbacks • Attribute transfer is not optimal (uniform variables) • Middleware support needed for full GPU utilization at low tessellation rates • Currently most interesting for deep refinement (>6x6)

Alternative • Non triangular RP • Non polygonal rendering (point-based surfaces) • Non regular RP • Alternative RP parameterization • Alternative interpolation

Conclusion • A low-cost Hardware Tessellator Unit • Generic method for hardware surface tessellation and displacement • A hardware implementation of Curved PN Triangles for ALL programmable GPUs • Very deep tessellation rates for procedural displacements • Easy to implement • Works on today's GPUs/no additional hardware

Conclusion • Refinement Shader: • RP-based interpolation = Tessellation • Function evaluation/query = Displacement • Output

Current and Future Work • Symmetry analysis for rough view-dependent LOD • Refinement with high-order continuity using constrained meshes (vertex degree) [Bolz 2002] • Semi-automatic conversion of large meshes to GPU Fair Representation • New local smoothing algorithms (ST-Meshes)

Thank you for your attention! Demo: http://www.labri.fr/~boubek
