Afrigraph Tutorial B: Interactive Ray-Tracing

Ingo Wald, Philipp Slusallek. Saarland University, Computer Graphics Group, http://graphics.cs.uni-sb.de. For almost 20 years, researchers have argued that eventually, Ray-Tracing will become faster than rasterization.

Presentation Transcript


  1. Afrigraph Tutorial B: Interactive Ray-Tracing • Ingo Wald, Philipp Slusallek • Saarland University, Computer Graphics Group • http://graphics.cs.uni-sb.de

  2. For almost 20 years, researchers have argued that eventually, Ray-Tracing will become faster than rasterization • Tutorial on Interactive Raytracing

  3. For almost 20 years, researchers have argued that eventually, Ray-Tracing will become faster than rasterization • And nothing happened... Well, almost...

  4. UNC Powerplant (12.5 Mtris, >10 fps)

  5. Four Power Plants (50 Mtris)

  6. Tutorial Overview • Introduction • Introduction to Ray-Tracing • Discussion: Ray-Tracing versus Rasterization • Previous Work • Approximating Ray-Tracing • Accelerated Ray-Tracing • Interactive Ray-Tracing on PCs • Coherent Ray-Tracing Implementation • Comparisons (SW / HW) • Distributed RT of Massive Models • Outlook: Hardware Architectures for Ray-Tracing • Future Research and Conclusions

  8. Introduction to Ray-Tracing • In principle: Very simple algorithm • For each pixel • Create ray through that pixel • Cast ray into scene and find closest intersection • “Shade” ray at intersection point • Can also shoot new rays during shading: • Determine visibility of point lights by “shadow rays” • Compute reflected/refracted light by recursively tracing reflection/refraction rays • Basically, that's all…
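The per-pixel loop on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the tutorial's implementation: the names `Vec3`, `Ray`, `Sphere`, `closestHit` and `render` are hypothetical, the scene is a plain list of spheres searched exhaustively, the camera is a pinhole at the origin looking down the negative z axis, and "shading" is reduced to returning the index of the hit object (or -1 for a miss). Shadow, reflection and refraction rays would be spawned where the object index is stored.

```cpp
#include <cmath>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin, dir; };            // dir does not need to be normalized
struct Sphere { Vec3 center; float radius; };

// Finds the closest sphere hit by the ray (exhaustive search over the scene).
// Returns the sphere index, or -1 if nothing is hit; tHit receives the distance.
// Only the nearer root is tested, so rays starting inside a sphere report a miss.
int closestHit(const Ray& ray, const std::vector<Sphere>& scene, float& tHit) {
    int hit = -1;
    tHit = std::numeric_limits<float>::infinity();
    for (int i = 0; i < static_cast<int>(scene.size()); ++i) {
        const Vec3 oc = ray.origin - scene[i].center;
        const float a = dot(ray.dir, ray.dir);
        const float b = 2.0f * dot(oc, ray.dir);
        const float c = dot(oc, oc) - scene[i].radius * scene[i].radius;
        const float disc = b * b - 4.0f * a * c;
        if (disc < 0.0f) continue;                             // ray misses this sphere
        const float t = (-b - std::sqrt(disc)) / (2.0f * a);   // nearer root
        if (t > 1e-4f && t < tHit) { tHit = t; hit = i; }
    }
    return hit;
}

// For each pixel: create a ray, find the closest intersection, "shade" it.
std::vector<int> render(int width, int height, const std::vector<Sphere>& scene) {
    std::vector<int> image(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Image plane at z = -1, spanning [-1, 1] in both directions.
            const float u = 2.0f * (x + 0.5f) / width - 1.0f;
            const float v = 1.0f - 2.0f * (y + 0.5f) / height;
            const Ray ray{{0.0f, 0.0f, 0.0f}, {u, v, -1.0f}};
            float t;
            image[y * width + x] = closestHit(ray, scene, t);  // stand-in for shading
        }
    }
    return image;
}
```

The exhaustive loop in `closestHit` is exactly the part that the later slides replace with a spatial index structure.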

  9. Ray-Tracing Algorithm

  10. Introduction to Ray-Tracing • Only three main components: • Generating rays • Finding the closest intersection of a ray • Ray traversal • Ray-object intersection • Shading

  11. Ray-Generation • Generate initial ray for each pixel • Other camera models are trivial… • Fisheye lens • Non-linear distortions/Lens effects • Motion blur, depth of field • … • Options • More samples for anti-aliasing • Adaptive Sampling • Combine with IBR • E.g. “RenderCache”: Reuse samples by reprojection

  12. Ray-Traversal • Need to find objects quickly • “Exhaustive” search infeasible • Build spatial index structure • Grid, octree, BSP-tree, BVH, ... • Advantages • Logarithmic complexity • Occlusion culling • “Early ray termination” • Problems • Multiple intersection computations (objects often in multiple voxels) • Dynamic scenes? • [Figures: Grid (2D), Octree (2D)]

  13. Ray-Object-Intersection • Need to compute intersections fast • Requires many floating point operations • But typically dominated by traversal (2:1) • Plenty of algorithms • Plenty of primitives • Even for triangles • Optimizations • Use SIMD CPU-extensions (SSE, AltiVec, 3DNow!) • Data parallel execution • Proper caching of data
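As one concrete instance of the "plenty of algorithms" for triangles, the sketch below uses the widely known Möller-Trumbore test. The slides do not say which triangle test the authors use, so this choice is an assumption made for illustration, as are the names `Vec3`, `sub`, `dot`, `cross` and `intersectTriangle`.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Moeller-Trumbore ray/triangle test.
// On a hit, t is the distance along dir and (u, v) are barycentric coordinates
// of the hit point with respect to vertices b and c.
bool intersectTriangle(Vec3 orig, Vec3 dir, Vec3 a, Vec3 b, Vec3 c,
                       float& t, float& u, float& v) {
    const float eps = 1e-7f;
    const Vec3 e1 = sub(b, a);
    const Vec3 e2 = sub(c, a);
    const Vec3 p = cross(dir, e2);
    const float det = dot(e1, p);
    if (std::fabs(det) < eps) return false;      // ray parallel to triangle plane
    const float inv = 1.0f / det;
    const Vec3 s = sub(orig, a);
    u = dot(s, p) * inv;
    if (u < 0.0f || u > 1.0f) return false;      // outside along first edge
    const Vec3 q = cross(s, e1);
    v = dot(dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return false;  // outside along second edge
    t = dot(e2, q) * inv;
    return t > eps;                              // reject hits behind the origin
}
```

The early returns are convenient in scalar code, but they are the kind of conditional that the later SIMD slides try to remove from inner loops.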

  14. Shading • Lots of reflection models possible • Phong, Cook-Torrance, Ward, … • Direct use of Shading Languages (RenderMan) • Shading after visibility has been computed • No overhead due to overdraw • Every ray is shaded exactly once • Can generate new rays • Shadow, reflection, transmission, ... • Need to deal with recursion • Rendering cost linear in #rays traced

  15. Introduction to Ray-Tracing • Only three main components: • Generating rays • Finding the closest intersection of a ray • Ray traversal • Ray-object intersection • Shading • Problem: • “Find closest intersection” is very expensive • And: Lots of rays per image …

  16. In Contrast: Rasterization • Efficient HW implementation • Use of object coherence • Many new features • Rendering is driven by App. • Application submits geometry • Visibility determined at end • Z-buffer fragment test • [Figure: Rasterization Pipeline: Application → T&L, Vertex Ops → Rasterization → Texturing → Fragment Ops → Fragment Tests → Framebuffer]

  17. Rasterization Drawbacks • Drawbacks of this approach • Use of object coherence • Only if triangle is large • Rendering is driven by App. • Application has to know what is visible… • Efficient occlusion culling is hard • Visibility determined at end • Overdraw: Discard all but one fragment • High depth complexity: very inefficient

  18. Ray-Tracing versus Rasterization • Flexibility • Handling unstructured groups of rays • Image-based rendering, reflections, shadows … • Generality • Ray-Tracing is the basis for many algorithms • Global illumination, visibility, … • Used in many disciplines • Physics, Biology, Chemistry, Telecom, …

  19. Ray-Tracing versus Rasterization • Simple and Efficient Shading • Shading happens after visibility computation • Direct use of Shading Languages • Correctness & Image Quality • Rasterization inherently relies on approximations • Environment maps, shadow maps, ... • Ray-traced images are “correct” by default • 'True' reflections and shadows… • Use of approximations is optional

  20. Ray-Tracing versus Rasterization • Parallel Scalability • Ray-Tracing is “embarrassingly parallel” (e.g. each pixel independent of all others) • Scales well with the available hardware • Needs fast access to scene database

  21. Ray-Tracing versus Rasterization • Scalability with Scene Size: Occlusion Culling & Logarithmic Complexity • RT never even looks at invisible geometry • RT traversal allows for efficient searching: O(log N) • Rasterization shows linear behavior: O(N) → RT wins for complex scenes • But rasterization is improving

  22. Ray-Tracing versus Rasterization • Coherence • Key to efficient rendering • Rasterization: Object coherence • Allows for efficient HW implementation • But only really efficient for large triangles • Ray-Tracing: Ray coherence • Improved caching & reduced bandwidth • Allows for data parallel computation • RT has much more coherence than assumed • But harder to exploit…

  23. Ray-Tracing versus Rasterization • Conclusion of that Comparison • Ray Tracing has many advantages • These advantages become ever more pronounced • Not only quality, also efficiency… • But: Ray-Tracing is (still) costly • Have to make it faster!

  24. Tutorial Overview • Introduction • Introduction to Ray-Tracing • Discussion: Ray-Tracing versus Rasterization • Previous Work • Approximating Ray-Tracing • Accelerated Ray-Tracing • Interactive Ray-Tracing on PCs • Coherent Ray-Tracing Implementation • Comparisons (SW / HW) • Distributed RT of Massive Models • Outlook: Hardware Architectures for Ray-Tracing • Future Research and Conclusions

  25. Previous and Related Work • Two ways to achieve ray-tracing-like quality interactively: • Trace fewer rays per frame: “Approximative ray-tracing” • Rasterization hardware • Image-based techniques • Interpolation of ray-traced results • Trace more rays/sec: “Accelerated ray-tracing” • Better data structures • Better algorithms • Better implementations • Parallel processing

  27. Approximated Ray-Tracing: Rasterization Hardware • “HW-Accelerated” vista/shadow buffers • Compute visible geometry in HW • Lookup of geometry in frame buffer • Only works for primary rays and point lights • Creates artifacts (e.g. shadow buffer resolution) • Augmenting hardware with RT effects • Selective ray-tracing • Integrate ray-tracing with OpenGL rendering • Rasterization for diffuse objects • Textures or splatting [Stamminger/Haber 00/01] for ray-traced samples

  28. Approximated Ray-Tracing: Corrective Textures

  29. Approximated Ray-Tracing: Image-Based Techniques • RenderCache [Walter et al. 99] • Store ray samples per pixel (color, depth, ...) • Reproject samples for next frame • Detect and fill holes by sending few new rays • Heuristic algorithms based on neighborhood • Locate and correct errors (shadow, etc.) • Pseudo-randomly sample a few other pixels • Adaptively sample near error regions • But: Reprojection and Heuristics are expensive • Pays off (only) when pixels are very expensive to compute directly (e.g. global illumination) • Scales badly with #CPUs

  30. Approximated Ray-Tracing: Image-Based Techniques • Holodeck [Ward 98] • Similar to RenderCache, but • Long term storage of ray samples on disk • Fast access to samples based on grid structure • Builds light-field-like data representation

  31. Approximated Ray-Tracing: Image-Based Techniques • Interpolation in the image plane • Pixel-selected ray-tracing [Akimoto, 89] • Coarse sampling grid • Adaptive refinement based on error criteria • Linear interpolation between samples • General ray interpolation [Bala, 99] • Object-/Ray-/Image-Space • Time • Error bounded

  32. Previous and Related Work • Two ways to achieve ray-tracing-like quality interactively: • Trace fewer rays per frame: “Approximative ray-tracing” • Rasterization hardware • Image-based techniques • Interpolation of ray-traced results • Trace more rays/sec: “Accelerated ray-tracing” • Better data structures • Better algorithms • Better implementations • Parallel processing

  33. Accelerated Ray-Tracing: Better Data Structures/Algorithms • 'Best' data structure (Grid vs BSP vs …)? • Always scene and implementation dependent • In practice, most do about equally well… • Well-researched topic → 'New' data structures are unlikely to be found • But: Potential for better algorithms: • Can we better exploit coherence? • Can we build data structures faster? • Can we build data structures fully automatically? • Also: Need for dynamic data structures

  34. Accelerated Ray-Tracing: Parallelization on Supercomputers • RT of large CSG models [Muuss 95] • Motivation: Interactively render complex data sets • Idea: Use ray-tracing • Flexibility: Avoid tessellation of CSG-models • Take advantage of logarithmic complexity of RT • Exploit parallelism • Implementation • Optimized, general RT algorithm • 96 CPUs, SGI PowerChallenge, shared memory • Results • 1-2 frames per second @ video resolution (in '95!!!)

  35. Accelerated Ray-Tracing: Parallelization on Supercomputers • Utah Parallel RT System [Parker 99] • Similar approach to Muuss • Parallelization on shared memory machine • Supports general primitives and volume data sets • Results • Has shown scalability up to 128 CPUs • Importance of caching analysis • New goal: interactive visual cues for visualization (Same information at less cost)

  36. Tutorial Overview • Introduction • Introduction to Ray-Tracing • Discussion: Ray-Tracing versus Rasterization • Previous Work • Approximating Ray-Tracing • Accelerated Ray-Tracing • Interactive Ray-Tracing on PCs • Coherent Ray-Tracing Implementation • Comparisons (SW / HW) • Distributed RT of Massive Models • Outlook: Hardware Architectures for Ray-Tracing • Future Research and Conclusions

  37. IRT on PCs: What to keep in mind • PC hardware has changed dramatically • Processors become much faster • But increase in ray-tracing speed is gradual • Increasing gap between speed of CPU and memory • But ray-tracing algorithm did not change • SIMD extensions • Flops become increasingly cheap • But difficult to take advantage of in ray-tracing • Fast (and cheap) networking & network of PCs • But good performance on non-shared-memory is hard • Small clusters are around everywhere…

  38. IRT on PCs: What to keep in mind • PC hardware has changed dramatically • Have to adapt our algorithms! • Special emphasis on • Keeping the CPU busy • Memory & Caching (1 cache miss can cost several triangle intersections) • SIMD • Not so important any more: • Instruction count, avoiding float ops

  39. General Optimizations: Cache • Main memory is too slow for CPU (1:10) (bandwidth and latency) • Keep relevant data in caches • Design algorithms for cache reuse → coherence • Align data to cache lines (32 bytes) • Separate data according to usage • Separate volatile from non-volatile data • Store intersection data separate from shading data (e.g. shading normals not needed for intersection) • Prefetch data • Design algorithms to enable data access prediction

  40. General Optimizations: Cache • Cache Reuse Example: Triangle Data Structure • Variant 1: struct Triangle { Vec3f *a, *b, *c; }; • Intersect() routine works on this structure • Prefetching hard (2 levels of indirection) • Data stored in 4 different memory regions (1 struct + 3 vectors) • Worst case: 8 cache misses (if each of the 4 data items straddles a cache-line border)

  41. General Optimizations: Cache • Cache Reuse Example: Triangle Data Structure • Variant 2: With preprocessed intersection data • All necessary data packed into 48 aligned bytes (see paper) • Con: Additional data to store (48 bytes/triangle) • But several advantages: • At most 2 cache misses • 1 contiguous memory region → Trivial to prefetch
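The difference between the two variants can be made concrete with a small layout sketch. The contents of the authors' 48-byte record are described in their paper and are not reproduced here; the `TrianglePacked` struct below simply stores the three vertices by value plus padding, which is an assumption chosen only to show the size and alignment. `TriangleIndirect`, `TrianglePacked` and `cacheLinesTouched` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

struct Vec3f { float x, y, z; };

// Variant 1: three pointers. The struct and each vertex may live in four
// unrelated memory regions, and the vertices are only reachable after the
// pointers themselves have been loaded.
struct TriangleIndirect { Vec3f *a, *b, *c; };

// Variant 2 (illustrative contents only): everything the intersection test
// needs sits in one 48-byte record aligned to 16 bytes.
struct alignas(16) TrianglePacked {
    Vec3f a, b, c;   // 36 bytes of vertex data, stored by value
    float pad[3];    // 12 bytes of padding to reach 48 bytes
};
static_assert(sizeof(TrianglePacked) == 48, "record should occupy 48 bytes");
static_assert(alignof(TrianglePacked) == 16, "record should be 16-byte aligned");

// Number of cache lines covered by the byte range [addr, addr + size).
inline std::size_t cacheLinesTouched(std::uintptr_t addr, std::size_t size,
                                     std::size_t lineSize) {
    return (addr + size - 1) / lineSize - addr / lineSize + 1;
}
```

With the 32-byte cache lines mentioned on slide 39, a 16-byte-aligned 48-byte record always covers two lines, which matches the "at most 2 cache misses" bullet. A 12-byte `Vec3f` that happens to start at byte offset 28 straddles two lines on its own, which is how Variant 1 reaches its worst case.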

  42. General Optimizations: Cache • This was only one example: Similarly for • BSP Nodes (even more important) • Triangle lists • Materials • Shading Data • …

  43. General Optimizations: Simplification • Today's CPUs have very long pipelines • Simplify the code to avoid pipeline stalls • Choose simple algorithms • “KISS” wins… (KISS = keep it simple and stupid) • E.g. BSP-tree traversal simpler than grids • Easier to maintain and optimize (e.g. prefetching) • Write tight inner loops • E.g. better caching and handling of branches • Avoid conditionals/relative jumps in inner loops • E.g. support only triangles • Avoid memory-access stalls → Caching, caching, caching!!!

  44. Optimization: SIMD Extensions • Most CPUs provide SIMD extensions • Intel: SSE (Others: 3DNow!, AltiVec, ...) • Use SIMD: higher speed & lower bandwidth • Up to four parallel floating point operations → For the cost of 1! • Fetch data once to reduce bandwidth to cache • Amortize loading cost over 4 operations → Factor 4 in bandwidth reduction • Overhead due to restricted instruction set • E.g. no 'SSE dot product' • Con: Programming in assembly language

  45. Optimization: SIMD Extensions • How to use SIMD Extensions? • Either: Instruction-parallel • Combine 4 computations in 'normal' algorithm • E.g. the 4 mults in a dot product • Or: Data-parallel • Run algorithm on 4 different data in parallel • E.g. 4 independent dot products
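The two ways of using a 4-wide unit can be contrasted in portable C++. This is a sketch under assumptions: a `std::array<float, 4>` stands in for one SSE register, so nothing here is guaranteed to compile to SIMD instructions, and the names `Lane4`, `Vec3x4`, `dot4` and `dotHorizontal` are hypothetical.

```cpp
#include <array>

using Lane4 = std::array<float, 4>;  // stands in for one 4-wide SIMD register

// Structure-of-arrays layout: the x components of four different vectors share
// one "register", likewise y and z.
struct Vec3x4 { Lane4 x, y, z; };

// Data-parallel: four independent dot products. Every lane runs the ordinary
// scalar algorithm; no communication between lanes is needed.
inline Lane4 dot4(const Vec3x4& a, const Vec3x4& b) {
    Lane4 r;
    for (int i = 0; i < 4; ++i)
        r[i] = a.x[i] * b.x[i] + a.y[i] * b.y[i] + a.z[i] * b.z[i];
    return r;
}

// Instruction-parallel: a single dot product of two 4-component vectors.
// The four multiplications map to one SIMD multiply, but the final sum has to
// combine values across lanes, which costs extra shuffle/add steps.
inline float dotHorizontal(const Lane4& a, const Lane4& b) {
    Lane4 prod;
    for (int i = 0; i < 4; ++i) prod[i] = a[i] * b[i];
    return (prod[0] + prod[1]) + (prod[2] + prod[3]);
}
```

The horizontal sum in `dotHorizontal` is the overhead behind the "no SSE dot product" remark on slide 44; the data-parallel form avoids it entirely.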

  46. SIMD: Intersection • SIMD best used in data parallel fashion • Little instruction-level parallelism (in RT) → Just doesn't work… • Data parallel: 1 ray against 4 triangles • Hard to always have four triangles ready • Data parallel traversal for 1 ray? • Data parallel: 4 rays against 1 triangle • Must traverse rays in parallel → ray packets • Standard intersection code • Overhead for terminated rays (e.g. 1 ray hits, 3 rays miss)
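The "4 rays against 1 triangle" scheme can be sketched with a hit mask instead of early exits. The code below is an illustration under assumptions, not the authors' SSE routine: it reuses the Möller-Trumbore test as a stand-in for their intersection code, models each SIMD register as a `std::array<float, 4>`, and uses the hypothetical names `Lane4`, `RayPacket4` and `intersect4`.

```cpp
#include <array>
#include <cmath>

using Lane4 = std::array<float, 4>;  // stands in for one 4-wide SIMD register
struct Vec3 { float x, y, z; };

// Four rays in structure-of-arrays form.
struct RayPacket4 {
    Lane4 ox, oy, oz;  // origins
    Lane4 dx, dy, dz;  // directions
};

// Tests all four rays against one triangle. Returns a bit mask (bit i set means
// ray i hits) and writes the per-ray distance to t. There is no per-ray early
// exit: a lane that misses still performs every operation, which is the
// overhead for terminated rays mentioned on the slide.
inline unsigned intersect4(const RayPacket4& r, Vec3 a, Vec3 b, Vec3 c, Lane4& t) {
    const float eps = 1e-7f;
    const Vec3 e1{b.x - a.x, b.y - a.y, b.z - a.z};
    const Vec3 e2{c.x - a.x, c.y - a.y, c.z - a.z};
    unsigned mask = 0;
    for (int i = 0; i < 4; ++i) {
        // p = dir x e2
        const float px = r.dy[i] * e2.z - r.dz[i] * e2.y;
        const float py = r.dz[i] * e2.x - r.dx[i] * e2.z;
        const float pz = r.dx[i] * e2.y - r.dy[i] * e2.x;
        const float det = e1.x * px + e1.y * py + e1.z * pz;
        const bool valid = std::fabs(det) > eps;      // false for parallel rays
        const float inv = 1.0f / (valid ? det : 1.0f);
        const float sx = r.ox[i] - a.x, sy = r.oy[i] - a.y, sz = r.oz[i] - a.z;
        const float u = (sx * px + sy * py + sz * pz) * inv;
        // q = s x e1
        const float qx = sy * e1.z - sz * e1.y;
        const float qy = sz * e1.x - sx * e1.z;
        const float qz = sx * e1.y - sy * e1.x;
        const float v = (r.dx[i] * qx + r.dy[i] * qy + r.dz[i] * qz) * inv;
        t[i] = (e2.x * qx + e2.y * qy + e2.z * qz) * inv;
        const bool hit = valid && u >= 0.0f && v >= 0.0f && u + v <= 1.0f && t[i] > eps;
        if (hit) mask |= 1u << i;
    }
    return mask;
}
```

In an SSE version the per-lane comparisons would produce the mask directly, and the loop body would run once for all four lanes.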

  47. SIMD: Intersection • Performance Results • Comparison against already optimized C code • Amortized cost for SSE code → 20-36 million intersections/sec! (P-III, 800 MHz)

  48. SIMD: BSP-Traversal • Recursive Traversal Algorithm

  49. SIMD: BSP-Traversal • SIMD-Traversal • Traverse four rays in parallel • Intersection with split plane & traversal decision • Combine decision flags • All rays must perform the same traversal • Make sure order is consistent • Easy to guarantee: Same ray origin or same signs of direction vector • Avoid recursive function calls • Maintain stack manually • Worst case: as bad as before…
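The packet traversal described on this slide can be sketched as follows. This is a simplified illustration under assumptions, not the authors' code: the tree is an axis-aligned BSP (kd-tree) stored in a flat array, each SIMD register is modeled as a `std::array<float, 4>`, leaves are only recorded rather than intersected, early ray termination is omitted, and the names `KdNode`, `Packet4`, `StackEntry` and `traversePacket` are hypothetical. The code also assumes that, along every split axis, all four direction components are nonzero and share one sign.

```cpp
#include <algorithm>
#include <array>
#include <vector>

using Lane4 = std::array<float, 4>;  // stands in for one 4-wide SIMD register

struct KdNode {
    int axis;         // 0, 1, 2 = split axis; -1 marks a leaf
    float split;      // split plane position along that axis
    int left, right;  // child indices (left = lower coordinates); unused in leaves
};

struct Packet4 {
    Lane4 org[3], dir[3];  // structure-of-arrays: org[axis][ray]
    Lane4 tNear, tFar;     // current parametric interval of each ray
};

struct StackEntry { int node; Lane4 tNear, tFar; };

// Returns the leaves in the order in which the packet visits them.
std::vector<int> traversePacket(const std::vector<KdNode>& nodes, Packet4 p) {
    std::vector<int> visited;
    std::vector<StackEntry> stack;  // manual stack replaces recursive calls
    int node = 0;
    for (;;) {
        while (nodes[node].axis >= 0) {
            const KdNode& n = nodes[node];
            const int k = n.axis;
            const bool leftFirst = p.dir[k][0] > 0.0f;  // same for all four rays
            const int nearChild = leftFirst ? n.left : n.right;
            const int farChild = leftFirst ? n.right : n.left;
            Lane4 d;                                    // distance to split plane
            bool anyNear = false, anyFar = false;       // combined decision flags
            for (int i = 0; i < 4; ++i) {
                d[i] = (n.split - p.org[k][i]) / p.dir[k][i];
                if (p.tNear[i] > p.tFar[i]) continue;   // ray inactive in this node
                if (d[i] > p.tNear[i]) anyNear = true;  // ray overlaps near side
                if (d[i] < p.tFar[i]) anyFar = true;    // ray overlaps far side
            }
            if (!anyFar) { node = nearChild; continue; }
            if (!anyNear) { node = farChild; continue; }
            // At least one ray needs each side: defer the far child.
            StackEntry deferred{farChild, p.tNear, p.tFar};
            for (int i = 0; i < 4; ++i) {
                deferred.tNear[i] = std::max(d[i], p.tNear[i]);
                p.tFar[i] = std::min(d[i], p.tFar[i]);
            }
            stack.push_back(deferred);
            node = nearChild;
        }
        visited.push_back(node);  // a real tracer would intersect triangles here
        if (stack.empty()) return visited;
        node = stack.back().node;
        p.tNear = stack.back().tNear;
        p.tFar = stack.back().tFar;
        stack.pop_back();
    }
}
```

A child is visited as soon as any one ray needs it, so a packet whose rays diverge visits the union of what the individual rays would visit. That is the "worst case: as bad as before" bullet.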

  50. SIMD: BSP-Traversal • Overhead of SIMD-Traversal (in %) • Fixed resolution at 1024² (left chart), fixed 2x2 packet (right chart) • Traversal still dominates rendering cost • Overall speedup factor: 2 to 2.3
