
3D Graphics Processor Architecture

3D Graphics Processor Architecture. Victor Moya. PhD Project. Research on architecture improvements for future Graphics Processing Units (GPUs). Design and implement a GPU simulator for 3D graphics. Goal: Real-Time radiosity on GPU. Outline. Rendering. Global Illumination. Ray Tracing.


Presentation Transcript


  1. 3D Graphics Processor Architecture Victor Moya

  2. PhD Project • Research on architecture improvements for future Graphics Processing Units (GPUs). • Design and implement a GPU simulator for 3D graphics. • Goal: • Real-time radiosity on GPU.

  3. Outline • Rendering. • Global Illumination. • Ray Tracing. • Radiosity. • Status.

  4. Outline • Rendering. • Global Illumination. • Ray Tracing. • Radiosity. • Status.

  5. Rendering • Display a database of 3D objects on a screen, in a picture (file) or in a movie (file). • Rendering methods for 3D graphics: • Rasterization. • Reyes. • Raytracing. • Radiosity.

  6. Rasterization • Project 3D polygons onto a view plane. • Rasterize those polygons into fragments. • Shade the generated fragments. • Apply and combine textures to calculate the fragment color. • Objective: • Real-time. • Make it look as realistic as possible. • Avoid simulation of physical light behavior. • With the help of vertex and fragment shaders, it can render realistic images.

  7. [Figure: rasterization view volume. Labels: project polygons, projection (near) plane, rasterize polygons, far plane.]
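The first two rasterization steps (project, then rasterize into fragments) can be sketched as below. This is a minimal illustration, not the pipeline of any particular GPU: the function names `project`, `edge` and `rasterize`, the camera looking down +z, and the edge-function coverage test are all assumptions chosen for brevity.

```python
def project(point, near=1.0):
    """Perspective-project a camera-space point (x, y, z), z > 0, onto the plane z = near."""
    x, y, z = point
    return (near * x / z, near * y / z)


def edge(a, b, p):
    """Edge function: positive when p lies to the left of the directed edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(tri2d, width, height):
    """Return the (col, row) pixels whose centers fall inside a 2D triangle.

    tri2d holds three (x, y) vertices in [-1, 1] x [-1, 1] view-plane coordinates
    (an assumed convention for this sketch).
    """
    a, b, c = tri2d
    area = edge(a, b, c)
    if area == 0:
        return []  # degenerate triangle covers no pixels
    fragments = []
    for row in range(height):
        for col in range(width):
            # Map the pixel center to view-plane coordinates.
            p = (2.0 * (col + 0.5) / width - 1.0,
                 2.0 * (row + 0.5) / height - 1.0)
            weights = (edge(b, c, p), edge(c, a, p), edge(a, b, p))
            # Inside when every edge function shares the sign of the triangle area.
            if all(w * area >= 0 for w in weights):
                fragments.append((col, row))
    return fragments


# One triangle at depth z = 2, projected and rasterized onto a 4x4 pixel grid.
triangle = [(-2.0, -2.0, 2.0), (2.0, -2.0, 2.0), (0.0, 2.0, 2.0)]
projected = [project(v) for v in triangle]
fragments = rasterize(projected, 4, 4)
```

Shading and texturing would then run once per entry in `fragments`; a real rasterizer also interpolates depth and vertex attributes, which this sketch leaves out.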

  8. Reyes • Reyes (RenderMan) is a rendering architecture designed for realistic offline rendering. • The 3D objects are reduced to a number of micropolygons. • Micropolygon: polygon smaller than a pixel. • The micropolygons are then shaded and later sampled and written to the framebuffer. • Can be combined with raytracing, radiosity or other global illumination techniques.

  9. [Figure: Reyes pipeline stages: Model, Dice, Shade, Sample, Visibility/Filter, Image.]

  10. Raytracing • Cast a ray from the camera (framebuffer) to the objects in the scene. • Secondary rays may be created as reflections and refractions of the primary rays or of other secondary rays. • Rays may be sent from the light sources to create caustic light effects. • Good simulation of reflection and transparency (refraction).
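As a rough illustration of primary and secondary rays, the sketch below builds an eye ray through a pixel, intersects it with a sphere, and computes the direction of a reflected ray. The camera placement (origin, looking down -z), the sphere as the only primitive, and the names `primary_ray`, `hit_sphere` and `reflect` are assumptions made for the example.

```python
import math


def primary_ray(col, row, width, height):
    """Unit direction of the eye ray through pixel (col, row); camera at the origin, looking down -z."""
    x = 2.0 * (col + 0.5) / width - 1.0
    y = 1.0 - 2.0 * (row + 0.5) / height
    length = math.sqrt(x * x + y * y + 1.0)
    return (x / length, y / length, -1.0 / length)


def hit_sphere(origin, direction, center, radius):
    """Return the nearest positive ray parameter t where the ray hits the sphere, or None."""
    oc = tuple(o - c for o, c in zip(origin, center))
    # direction has unit length, so the quadratic is t^2 + 2*b*t + c = 0.
    b = sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None  # the ray misses the sphere
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-9:
            return t
    return None  # the sphere is entirely behind the ray origin


def reflect(direction, normal):
    """Mirror a direction about a unit normal: the direction of a reflected secondary ray."""
    k = 2.0 * sum(d * n for d, n in zip(direction, normal))
    return tuple(d - k * n for d, n in zip(direction, normal))


# Eye ray through the single pixel of a 1x1 image, aimed at a unit sphere 5 units away.
eye_dir = primary_ray(0, 0, 1, 1)
t_hit = hit_sphere((0.0, 0.0, 0.0), eye_dir, (0.0, 0.0, -5.0), 1.0)
bounce = reflect(eye_dir, (0.0, 0.0, 1.0))
```

A full ray tracer would repeat the intersection test for every object (or traverse an acceleration structure) and recurse on the reflected and refracted rays.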

  11. Radiosity • Simulates the physical behavior of light. • Define the emission, reflection, refraction, absorption and scattering properties of the scene surfaces. • Mathematical formulation: system of linear equations. • Iteratively build an approximation to the illumination solution. • Used to implement global illumination: • Diffuse lighting. • Indirect lighting.

  12. [Figure: radiosity solution after the 1st, 2nd, 3rd, 4th and 16th passes.]

  13. Rasterization vs Reyes • Rasterization: • Real-time. • Implemented on current hardware. • Optimized for large polygons and a small number of polygons. • Global illumination emulated using: • Shaders. • Shadow maps. • Stencil shadows. • Reyes: • Off-line rendering. • Implemented by software renderers. • Optimized for a large number of polygons. • Global illumination: • Shaders. • Raytracing. • Radiosity.

  14. Rasterization vs Raytracing • Rasterization: • Real-time. • Implemented on current hardware. • Optimized for large polygons and a small number of polygons. • Global illumination emulated using: • Shaders. • Shadow maps. • Stencil shadows. • Raytracing: • Off-line rendering. • Implemented by software renderers. • Some hardware implementations. • Optimized for large polygon numbers and small polygons. • Global illumination: • Whitted ray tracer. • Photon mapping. • Monte Carlo. • Path tracing.

  15. Rasterization vs Radiosity • Rasterization: • Real-time. • Implemented on current hardware. • Optimized for large polygons and a small number of polygons. • Global illumination emulated using: • Shaders. • Shadow maps. • Stencil shadows. • Radiosity: • Off-line rendering. • Implemented by software renderers. • Optimized for large polygon numbers and small polygons. • Global illumination: • Inherent to the algorithm.

  16. Outline • Rendering. • Global Illumination. • Ray Tracing. • Radiosity. • Status.

  17. Global Illumination • Illumination and lighting depend on all the objects and lights in the scene. • BRDF: • Function that defines how light is reflected or refracted over a surface. • Soft shadows: • Umbra and penumbra effects. • Physically real reflections and refractions. • Indirect illumination: • Color blending. • Caustics.

  18. Why Global Illumination • Realism. • Single algorithm for the full illumination problem: • Direct illumination. • Indirect illumination. • Shadows.

  19. Global Illumination vs Real-Time • Full scene global illumination algorithms are expensive. • Introduce an illumination algorithm LOD (level of detail). • Not all scenes may require a full global illumination implementation. • Not all parts of the scene may require a full global illumination implementation. • Combine normal rasterization algorithms and techniques with global illumination techniques. • Reyes architecture.

  20. Outline • Rendering. • Global Illumination. • Raytracing. • Radiosity. • Status.

  21. Raytracing • Highly parallel task. • Raytracing algorithms: • Raycasting. • Shadow casting. • Whitted raytracing. • Photon mapping. • Monte Carlo. • Path tracing.

  22. Rays • Types: • Eye rays. • Shadow rays. • Reflected rays. • Refracted rays. • Raytracing recursion depth: • Static. • Adaptive.
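The difference between static and adaptive recursion depth can be sketched as follows. The slide does not say which adaptive criterion is meant; this example assumes a common one, where a path stops once its accumulated reflection weight falls below a threshold. The function name `recursion_depth` and the parameters `reflectivity`, `max_depth` and `min_weight` are hypothetical.

```python
def recursion_depth(reflectivity, max_depth, min_weight=None):
    """Count the reflected rays traced along a single path.

    Static control (min_weight is None) always stops at max_depth.
    Adaptive control also stops as soon as the accumulated weight of the
    next bounce would drop below min_weight.
    """
    depth = 0
    weight = 1.0
    while depth < max_depth:
        weight *= reflectivity  # contribution of the next bounce to the pixel
        if min_weight is not None and weight < min_weight:
            break
        depth += 1
    return depth


static_depth = recursion_depth(0.5, max_depth=10)
adaptive_depth = recursion_depth(0.5, max_depth=10, min_weight=0.1)
```

With a reflectivity of 0.5, the static path traces all 10 bounces, while the adaptive path stops after 3 because the fourth bounce would contribute only 0.0625 of the pixel color.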

  23. Raytracing on Current GPUs • Limitations: • Integer arithmetic and addressing not supported in current shader models. • No generalized output buffers for fragment shader programs. • No branching, looping or function calls. • No stream buffer or conditional stream support. • Underutilization of the vertex shaders (1 quad per pass). • Vertex shaders represent ~30% of the computing resources in current GPUs.

  24. Outline • Rendering. • Global Illumination. • Raytracing. • Radiosity. • Status.

  25. Radiosity • Light energy per unit surface area leaving a surface in the scene. • Highly parallel. • Scene is divided into patches. • Form factor: • Fraction of light that reaches a surface i from a surface j.

  26. Radiosity • Radiosity implementations: • Light maps and volumes. • Static (offline) radiosity. • Used with rasterization as textures. • Cube maps and spherical harmonics. • Fast implementation on current hardware. • Photon mapping. • Implemented using raytracing. • System of linear equations. • Matrix resolution or approximation. • Iterative resolution.
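The iterative resolution of the linear system can be sketched with the classical radiosity equation B_i = E_i + rho_i * sum_j F_ij * B_j, where B is radiosity, E emission, rho diffuse reflectance and F the form-factor matrix. The sketch uses a Jacobi-style gathering pass, which is only one of several iterative schemes and is not necessarily the one the author has in mind. It assumes F[i][j] is the fraction of energy leaving patch i that reaches patch j, and the two-patch scene and all its values are made up for illustration.

```python
def solve_radiosity(emission, reflectance, form_factors, passes):
    """Jacobi-style gathering: repeat B_i = E_i + rho_i * sum_j F_ij * B_j for `passes` passes."""
    n = len(emission)
    radiosity = list(emission)  # before the first pass only the emitters are lit
    for _ in range(passes):
        radiosity = [
            emission[i]
            + reflectance[i] * sum(form_factors[i][j] * radiosity[j] for j in range(n))
            for i in range(n)
        ]
    return radiosity


# Two facing patches of equal area: patch 0 emits light, patch 1 only reflects it.
emission = [1.0, 0.0]
reflectance = [0.5, 0.5]
form_factors = [[0.0, 0.5],
                [0.5, 0.0]]

one_pass = solve_radiosity(emission, reflectance, form_factors, 1)
converged = solve_radiosity(emission, reflectance, form_factors, 50)
```

After one pass, patch 1 has received direct light only (radiosity 0.25). Later passes add the light bounced back and forth, and the values approach the exact solution 16/15 and 4/15. Each patch's update reads only the previous pass, so all patches can be updated in parallel.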

  27. Photon Mapping on Current GPUs • Limitations: • No integer ALU and addressing modes. • No support for large 1D texture addressing (CPU loads). • No scatter capability at the shaders (CPU stores).

  28. Outline • Rendering. • Global Illumination. • Ray Tracing. • Radiosity. • Status.

  29. Research Topics • Evaluate radiosity on Atila. • Propose software and hardware changes to make radiosity real-time.

  30. Immediate changes • Unify shader model. • Single shader model for vertex and fragment shaders. • Generalize shader model. • Integer operations. • Branches and function calls. • Looping. • Memory load (different from texture load). • Texture write and memory store (scatter).

  31. New architecture proposals • Reconfigurable shader architecture. • Streaming. • Deferred rendering. • Embedded DRAM. • Virtualization.

  32. Reconfigurable Architecture • Static: • Variable rendering configuration for each algorithm: • 2:6:16. • 0:8:16. • 0:0:24. • Dynamic. • Work balancing. • Streaming between shader units.

  33. [Figure: shader configurations. Labels: Surface Shaders, Vertex Shaders, Fragment Shaders, Ray Shaders.]

  34. [Figure: interconnection network. Caption: dynamically reconfigurable shader network.]

  35. Streaming • Streaming on-chip buffers between shader units. • Conditional streams. • Any shader can: • Stream in from memory. • Stream out to memory. • Stream in from another shader. • Stream out to another shader.

  36. [Figure: streaming architecture. Labels: Interconnection Network, MC.]

  37. Deferred rendering • Store the whole scene in local video memory before rendering. • Rasterization: • After the geometry stage. • Reduces the overdraw overhead. • Raytracing: • Before any processing. • Build acceleration structure for dynamic scenes.
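The overdraw reduction for rasterization can be illustrated with a toy count of shading operations. The model is an assumption, not the design described in the slides: "immediate" shading runs on every fragment that passes the depth test in submission order, while "deferred" shading first resolves visibility for the whole stored scene and then shades one fragment per covered pixel. The function names and the fragment list are hypothetical.

```python
def shade_count_immediate(fragments):
    """Shade every fragment that passes the depth test, in submission order."""
    depth = {}
    shaded = 0
    for pixel, z in fragments:
        if pixel not in depth or z < depth[pixel]:
            depth[pixel] = z
            shaded += 1  # shaded now, possibly overwritten later (overdraw)
    return shaded


def shade_count_deferred(fragments):
    """Resolve visibility for the whole scene first, then shade once per covered pixel."""
    nearest = {}
    for pixel, z in fragments:
        if pixel not in nearest or z < nearest[pixel]:
            nearest[pixel] = z
    return len(nearest)


# Three surfaces covering the same two pixels, submitted back to front (worst case).
fragments = [((0, 0), 3.0), ((1, 0), 3.0),
             ((0, 0), 2.0), ((1, 0), 2.0),
             ((0, 0), 1.0), ((1, 0), 1.0)]

immediate = shade_count_immediate(fragments)
deferred = shade_count_deferred(fragments)
```

In this worst case the immediate path shades 6 fragments for 2 visible pixels, while the deferred path shades 2. With front-to-back submission the two counts would match, so the saving depends on draw order and depth complexity.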

  38. [Figure: deferred rendering architecture. Labels: MC, Interconnection Network.]

  39. Embedded DRAM • Large on-chip embedded DRAM buffer. • Store stream buffers between shaders. • Store framebuffer and Z buffer for rasterization: • Reduced overhead from overdraw. • With deferred tiled rendering: fast supersampling antialiasing with low external bandwidth. • Store acceleration structures for raytracing and radiosity.

  40. [Figure: memory organization. Labels: System Memory, GPU Memory, eDRAM, L2, Interconnection Network.]

  41. Virtualization • Virtualize GPU resources: • Number of shader processors. • Size of on-chip stream buffers. • Virtualize shader resources: • Scratch RAM for register spills. • Instruction cache rather than instruction memory for unlimited program length.

  42. Virtualization • Virtualize GPU memory: • Memory hierarchy: • On chip caches (L1 or L2). • On chip embedded DRAM buffers. • GPU video memory (L2 or L3). • System memory. • Disk.

  43. Specific vs Programmable • Specific purpose hardware for rasterization: • Hierarchical Z Buffer. • Z and Stencil Buffer. • Rasterizer: • Triangle Setup. • Fragment Generation. • Interpolation.

  44. Specific vs Programmable • Future GPUs may replace specific purpose units with programmable units (shaders). • Example: • Triangle setup using the homogeneous coordinates setup algorithm [Olano & Greer] can be efficiently implemented by current shaders. • Only use specific purpose units for those tasks that are more efficient on specific hardware.

  45. Specific purpose hardware • Acceleration hardware for ray-triangle intersection. • Acceleration hardware for scene traversal. • Acceleration hardware for photon maps.

  46. General Purpose GPUs • A generalized GPU can be used as a highly parallel coprocessor for highly parallelizable computation tasks. • Simulations: • Collision. • Fluid simulation. • FFT. • Matrix resolution. • Grid computing.
