Stencil Routed A-Buffer

Kevin Myers and Louis Bavoil

NVIDIA

What is it?
  • A-Buffer
    • Simply a list of fragments per-pixel
      • “The A-buffer, an antialiased hidden surface method” [Carpenter 84]
  • Related Work
    • Depth Peeling [Mammen 89] [Everitt 01]
    • k-Buffer [Bavoil et al. 07]
Why do I need this?
  • Often want more than the nearest layer
    • Alpha blending
    • Volume rendering
    • Collision detection
    • Refraction and caustics
    • Global illumination
Why is it hard?
  • GPUs are optimized to capture the nearest layer
    • Z buffering and early z test
    • Fine for most real-time lighting models
    • Wasteful if not rendering front to back
Things that don’t work
  • Blending: can’t just turn off z-buffering
    • Most operations non-commutative
  • MRT
    • Can’t direct output
  • Reading what you’re writing
    • Hazardous
      • “Multi-Layer Depth Peeling via Fragment Sort” [Liu et al. 06]
      • k-Buffer [Bavoil et al. 07]
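
The non-commutativity point can be made concrete with the standard "over" operator on premultiplied colors. The sketch below is only an illustration; the `Rgba` type and `over` function are hypothetical helpers, not part of the presentation.

```cpp
// Premultiplied-alpha color (hypothetical helper type for this sketch).
struct Rgba {
    float r, g, b, a;
};

// Standard "over" operator: composites src on top of dst.
Rgba over(const Rgba& src, const Rgba& dst) {
    const float k = 1.0f - src.a;  // how much of dst shows through src
    return { src.r + k * dst.r,
             src.g + k * dst.g,
             src.b + k * dst.b,
             src.a + k * dst.a };
}
```

Compositing 50% red over 50% blue gives a different color than 50% blue over 50% red, so the result depends on the order in which fragments arrive unless they are sorted first.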
A-Buffer
  • “A list of fragments per-pixel”
    • Anything on the GPU that resembles this?
  • MSAA
    • “A list of samples per-pixel”
    • Samples store coverage
MSAA in review
  • Multisampled Antialiasing
    • Fragments are rasterized at a higher res
      • 8xMSAA == 8 x aliased resolution
    • Pixel shader is run once per-pixel
    • Frame buffer storage is at sample resolution
Say What?
  • MSAA samples == A-Buffer pixels??
  • MSAA sample patterns don’t help
  • Need all MSAA samples at pixel center
Line up your Sub-samples
  • Turn off multisampling
    • Still render to an MSAA buffer
    • Pixel shader output bloats to all sub-samples
    • BOOL D3D10_RASTERIZER_DESC::MultisampleEnable
  • Now writing 8 samples per pixel
    • All have the same value!!
Bloating Your Pixel
  • Applause?
  • Meets the definition
    • “List of fragments per-pixel”
  • Not exactly what we want
    • Each item contains same value
    • Next fragment will clobber the entire list
    • Need to update one entry in the list
      • Once and only once
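
The clobbering problem can be illustrated with a small CPU-side model of one 8xMSAA pixel. This is a sketch under stated assumptions: `PixelSamples` and `writeBroadcast` are hypothetical names, and the model covers only the case described above (multisampling disabled, no stencil test yet).

```cpp
#include <array>
#include <cstdint>

constexpr int kSamples = 8;  // 8xMSAA: eight stored samples per pixel

// One pixel of the MSAA render target; storage is per sample.
using PixelSamples = std::array<std::uint32_t, kSamples>;

// Models a pixel shader write with multisampling disabled and no stencil
// test: the single shader output is replicated into every sample.
void writeBroadcast(PixelSamples& px, std::uint32_t shaderOutput) {
    px.fill(shaderOutput);
}
```

Writing a second fragment with `writeBroadcast` leaves all eight samples holding the second value, which is why each fragment must be restricted to exactly one sample.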
Stencil Routing

Stencil always increments

Stencil passes when 4

Stencil Routing
  • First introduced by [Purcell et al. 03]
    • Did not work for general rasterization
      • Tile aligned points
    • Fat point is spread across four pixels
      • Four pixels get same value
      • Stencil allows one pixel to update
Stencil Routing and MSAA
  • Stencil always operates at sample res
    • Regardless of MultisampleEnable state
    • DX10 Spec
  • Use sub-samples to route
    • Allows any pixel shader output to be routed
      • Arbitrary primitives
A Stencil Test That Works
  • StencilFunc
    • D3D10_COMPARISON_EQUAL
  • StencilRef
    • 2
      • More on this later
  • StencilPassOp and StencilFailOp
    • D3D10_STENCIL_OP_DECR_SAT
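
The effect of this stencil state on a single pixel can be sketched as a CPU-side model. It is not D3D10 code: `Pixel`, `decrSat`, and `routeFragment` are hypothetical names, the depth test is assumed to be disabled, and sample k is assumed to start with stencil value 2 + k (the slides state only that sample 0 starts at 2).

```cpp
#include <array>
#include <cstdint>

constexpr int kSamples = 8;              // 8xMSAA
constexpr std::uint8_t kStencilRef = 2;  // StencilRef from the slide

// One pixel: per-sample stencil and per-sample stored fragment value.
struct Pixel {
    std::array<std::uint8_t, kSamples> stencil;
    std::array<float, kSamples> value;
};

// Saturating decrement, modeling D3D10_STENCIL_OP_DECR_SAT.
std::uint8_t decrSat(std::uint8_t s) {
    return s > 0 ? static_cast<std::uint8_t>(s - 1) : std::uint8_t{0};
}

// Applies one fragment to every sample of the pixel.
// Returns the index of the sample that stored it, or -1 if none did.
int routeFragment(Pixel& px, float shaderOutput) {
    int written = -1;
    for (int s = 0; s < kSamples; ++s) {
        if (px.stencil[s] == kStencilRef) {  // D3D10_COMPARISON_EQUAL
            px.value[s] = shaderOutput;      // only this sample is updated
            written = s;
        }
        // Pass op and fail op are both DECR_SAT, so every sample counts down.
        px.stencil[s] = decrSat(px.stencil[s]);
    }
    return written;
}
```

With the assumed starting values 2, 3, ..., 9, the first fragment lands in sample 0, the second in sample 1, and so on, because exactly one sample holds the reference value at any time.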
Initializing Stencil
  • Clear stencil buffer to pass value (2)
    • Initializes sample 0 to 2
  • Use SampleMask to selectively update
    • Stencil set to replace with reference value
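
The initialization can be modeled on the CPU as a clear followed by one masked replace pass per remaining sample. This is a sketch under an assumption: the slides give only the value for sample 0, so the choice of 2 + k for sample k is inferred from the decrement-and-test scheme, and `replaceMasked` and `initStencil` are hypothetical names.

```cpp
#include <array>
#include <cstdint>

constexpr int kSamples = 8;             // 8xMSAA
constexpr std::uint8_t kPassValue = 2;  // value the stencil test passes on

using StencilSamples = std::array<std::uint8_t, kSamples>;

// Models one pass with the stencil op set to replace, restricted to the
// samples selected by sampleMask (bit s selects sample s).
void replaceMasked(StencilSamples& st, std::uint32_t sampleMask, std::uint8_t ref) {
    for (int s = 0; s < kSamples; ++s) {
        if (sampleMask & (1u << s)) st[s] = ref;
    }
}

// Clear to the pass value, then give each later sample a larger count.
StencilSamples initStencil() {
    StencilSamples st;
    st.fill(kPassValue);  // the clear: sample 0 is now ready to pass
    for (int s = 1; s < kSamples; ++s) {
        // Assumed value: sample s must wait for s fragments before passing.
        replaceMasked(st, 1u << s, static_cast<std::uint8_t>(kPassValue + s));
    }
    return st;
}
```

Under this assumption an 8-sample pixel starts with stencil values 2 through 9.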
Why start at 2?
  • When all sub-samples are written
    • Most stencil values will be 0
      • Except the last one written
    • Last sample written stencil == 1
  • When overflow occurs
    • All stencil values will be 0
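
The counts above follow from a short calculation, assuming sample k starts at 2 + k and every fragment applies a saturating decrement to all samples. Here s_k(n) denotes the stencil value of sample k after n fragments (notation introduced for this note only).

```latex
s_k(n) = \max(0,\; k + 2 - n), \qquad k = 0, \dots, 7

n = 8:\quad s_7(8) = 1, \qquad s_k(8) = 0 \ \text{for } k \le 6

n \ge 9:\quad s_k(n) = 0 \ \text{for all } k
```

Starting at 2 rather than 1 is what leaves the last sample at 1 when the list is exactly full, so a value of 0 in the last sample unambiguously signals overflow.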
Occlusion Query Test

Pixel did not overflow

Pixel overflowed

Handling Overflow
  • Set sample mask to last sample updated
  • Draw full screen quad
    • Issue an occlusion query
    • Set stencil to pass if stencil == 0
  • Check occlusion query
    • Sample pass count == overflow count
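
What the occlusion query counts can be sketched with a CPU-side model. The names `stencilAfter` and `countOverflowedPixels` are hypothetical, and the model assumes sample k starts at stencil value 2 + k with a saturating decrement per fragment.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kSamples = 8;  // 8xMSAA
using StencilSamples = std::array<std::uint8_t, kSamples>;

// Stencil state of one pixel after `fragments` fragments have been routed,
// assuming sample k started at 2 + k.
StencilSamples stencilAfter(int fragments) {
    StencilSamples st{};
    for (int s = 0; s < kSamples; ++s) {
        const int v = s + 2 - fragments;
        st[s] = static_cast<std::uint8_t>(v > 0 ? v : 0);
    }
    return st;
}

// Models the full-screen quad drawn under the query: the sample mask
// selects only the last sample, and the stencil test passes when it is 0.
std::size_t countOverflowedPixels(const std::vector<StencilSamples>& pixels) {
    std::size_t passed = 0;
    for (const StencilSamples& st : pixels) {
        if (st[kSamples - 1] == 0) ++passed;
    }
    return passed;
}
```

For pixels that received 3, 8, 9, and 12 fragments, the model counts two overflowed pixels: the ones with 9 and 12 fragments.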
Handling Overflow
  • Occlusion query
    • Good
      • Very fast
      • Allows for dynamic A-Buffer sizing
    • Bad
      • Requires some CPU intervention
        • Ideally A-Buffer size is fixed
Secrets of the Dragon
  • Single A-Buffer
    • RG32F
      • R is packed color
      • G is depth
    • Saves on texture loads
  • Post process sort
    • 8-fragment per-pixel bitonic sort
      • Additional fragments: insertion sort
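
The 8-fragment sort can be sketched on the CPU as a standard bitonic sorting network. This is an illustration, not the shader from the talk: the `Fragment` layout mirrors the RG32F description but keeps the packed color as raw bits (the packing scheme is not given in the slides), the nearest-first order is an arbitrary choice, and the insertion sort for extra fragments is not shown.

```cpp
#include <array>
#include <cstdint>
#include <utility>

constexpr int kFragments = 8;  // one fragment per MSAA sample

// One A-buffer entry: packed color bits (R channel) and depth (G channel).
struct Fragment {
    std::uint32_t packedColor;
    float depth;
};

// Sorts eight fragments by ascending depth with a bitonic network.
// The compare-and-swap pattern is fixed in advance, independent of the data.
void bitonicSortByDepth(std::array<Fragment, kFragments>& f) {
    for (int k = 2; k <= kFragments; k *= 2) {    // size of bitonic runs
        for (int j = k / 2; j > 0; j /= 2) {      // compare distance
            for (int i = 0; i < kFragments; ++i) {
                const int partner = i ^ j;
                if (partner <= i) continue;       // handle each pair once
                const bool ascending = (i & k) == 0;
                const bool outOfOrder = ascending
                    ? f[i].depth > f[partner].depth
                    : f[i].depth < f[partner].depth;
                if (outOfOrder) std::swap(f[i], f[partner]);
            }
        }
    }
}
```

A fixed network like this maps well to a post-process pixel shader because the sequence of comparisons does not branch on the data.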
8800 GTX Performance

Alpha Blended Stanford Dragon

Limits…DOH!
  • 254 layers of depth max
    • 8-bit stencil (255 - 1 for overflow bit)
    • If you do this, call us, because that’s crazy
  • Fragments at same depth
    • Must be handled in post-process
  • MSAA
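
One way to read the 254-layer figure, assuming layer k starts with stencil value 2 + k (an inference; the slides give only the value for the first sample): the largest starting value must fit in an 8-bit stencil.

```latex
k_{\max} + 2 \le 255 \;\Rightarrow\; k_{\max} = 253
\;\Rightarrow\; \text{layers} = k_{\max} + 1 = 254 = 255 - 1
```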
Summary
  • Stencil Routed A-Buffer
    • Ideally suited for complex geometries
      • Much faster than depth peeling
  • A-buffer can be dynamically resized
    • Use an occlusion query
    • Best to pre-determine size
Future Work
  • Render target arrays
    • Each target has its own stencil buffer
    • Target replaces sub-sample
      • Or augments sub-sample
    • #arrays * MSAA level in one “CPU pass”
      • With DX10, saturates 254 layers
    • Use instancing for additional “GPU passes”
Thanks for all the fish
  • Claudio Silva, Steven Callahan, Joao Comba, Aaron Lefohn, Cass Everitt, Peach Myers