Direct3D 9



  1. Direct3D 9: Or why programmable hardware kicks ass • Matthew M Trentacoste

  2. Introduction • Direct3D API has changed fundamentally to meet changes in hardware • API has been adjusted to fit the paradigm shift that has occurred in real-time graphics • Adapted to the fact that someone programming real-time graphics is writing code for 2 asymmetric processors, the GPU and CPU

  3. Differences • OpenGL is not so much bad as outdated • OpenGL was very well designed, but that was 15 years ago • Everything that has happened in real-time graphics since then has been stapled on • Direct3D is more of a pain for beginners learning to program graphics • But it is much more elegant once you are experienced enough to fully utilize the functionality provided • Much less of a state machine than OpenGL

  4. Differences (2) • Has no immediate mode: you can’t just specify vertices, colors, etc. directly from code • Built around a stream-based model of data • All data must be put into a buffer of elements to be loaded onto the hardware • Tries to gracefully give control of the flow of data between CPU and GPU while still being efficient • The API is getting there: streamlined functionality means fewer objects are needed to accomplish all tasks

  5. Direct3D 9 API Object List • IDirect3DSwapChain9 (back buffers) • IDirect3DTexture9 (textures) • IDirect3DVolume9 (volume textures) • IDirect3DVertexBuffer9 (vertex lists) • IDirect3DIndexBuffer9 (index lists) • IDirect3DSurface9 (render targets) • IDirect3DStateBlock9 (render state container) • IDirect3DVertexDeclaration9 (vertex format, a.k.a. IDirect3DVDecl9) • IDirect3DVertexShader9 (vertex shader) • IDirect3DPixelShader9 (pixel shader)

  6. Another Reason D3D Rocks • D3DX!!!! • All the math you could possibly need for graphics, already written • Vectors, matrices, quaternions, textures, models, etc. • Optimized code using the special instruction sets (3DNow!, SSE2, and so on) • Best solution for almost anything you could want to do, barring some crazy special case

  7. Cool Shit • Still with me? • High-order primitives • Adaptive tessellation • Displacement maps • And pretty pictures of them

  8. Higher Order Primitives • Current primitives are not ideal for representing smooth surfaces • Direct3D 9 supports points, lines, triangles, and grid primitives • Higher-order interpolation methods, such as cubic polynomials, allow more accurate calculations in rendering curved shapes • The application need only provide a desired level of tessellation • Transmit the data using standard triangle syntax that includes normal vectors

  9. Adaptive Tessellation • Adaptively tessellates a patch based on the depth value of the control vertex in eye space • Tessellation level is computed per vertex: the API-supplied value scaled by 1.0 / Zeye • The surface is then tessellated accordingly • The API takes triangles, defines high-order surfaces from them, and then tessellates those surfaces as needed • Meaning: more detail the closer you get
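
The per-vertex rule on this slide can be sketched on the CPU. This is only an illustration, not the runtime's actual code: the function name `adaptiveTessLevel`, the clamping range, and the handling of non-positive depth are assumptions; only the `1.0 / Zeye` scaling comes from the slide.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: per-vertex tessellation level computed from an
// API-supplied base level scaled by 1 / Zeye (eye-space depth).
// The min/max clamp is an assumption added to keep the result bounded.
float adaptiveTessLevel(float baseLevel, float zEye,
                        float minLevel = 1.0f, float maxLevel = 16.0f) {
    if (zEye <= 0.0f) return maxLevel;        // at or behind the eye: use full detail
    float level = baseLevel * (1.0f / zEye);  // closer vertices get a larger level
    return std::max(minLevel, std::min(level, maxLevel));
}
```

With a base level of 8, a vertex at depth 2 gets level 4, while a vertex at depth 100 falls back to the minimum: more detail the closer you get.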

  10. Demo Time #1

  11. Displacement Mapping • Adaptive tessellation enables us to use a texture to deform a surface • A texture of a height field is spread across a high-order surface • Tessellates surface until the detail of the geometry is high enough to represent height field • Changes shape of surface to match displacement as opposed to merely modifying the surface normal vector to appear like a deformed surface • What bump maps wish they were
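
The difference from bump mapping can be sketched in a few lines: the vertex position itself moves along the normal by the sampled height. The `HeightField` type, nearest-texel lookup, and `scale` parameter below are illustrative assumptions, not part of the Direct3D API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical height field: a row-major grid of heights in [0, 1].
struct HeightField {
    std::size_t width, height;
    std::vector<float> texels;

    // Nearest-texel lookup at normalized coordinates (u, v) in [0, 1].
    float sample(float u, float v) const {
        std::size_t x = static_cast<std::size_t>(u * (width - 1) + 0.5f);
        std::size_t y = static_cast<std::size_t>(v * (height - 1) + 0.5f);
        return texels[y * width + x];
    }
};

// Moves a tessellated vertex along its unit normal by the sampled height.
// A bump map would leave `position` alone and only perturb the normal.
Vec3 displace(Vec3 position, Vec3 normal, float u, float v,
              const HeightField& map, float scale) {
    float h = map.sample(u, v) * scale;
    return { position.x + normal.x * h,
             position.y + normal.y * h,
             position.z + normal.z * h };
}
```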

  12. Demo Time #2

  13. DirectX Graphics Architecture (diagram) • Vector data: Vec0 … Vec3, held in the VB • Vertex Shader (geometry ops) • Vertex components: Pos, Color, TC1, TC2 • Primitive ops • Pixel Shader (pixel ops), reading Tex0 … Tex2 through samplers • Output pixels written to the image surface

  14. Pipeline Overview • Create VertexBuffer (where model goes) • Set up Vertex Stream (put model there) • Define VertexDecl (what data means) • Vertex Shader Object (operate on model) • Pixel Shader Object (render model) • FrameBuffer blender (add image of model to scene)

  15. Vertex Declaration Object • New syntax for describing vertex formats for DMA engine and tessellator behavior • New object IDirect3DVertexDeclaration9 (IDirect3DVDecl9 for short) • Separately creatable • CreateVertexDeclaration() • Separately settable • SetVertexDeclaration() • Settable independent of vertex shader

  16. Default Semantics • VertexDecl now supports a “usage” field • Position, Normal, Tangent, Binormal, etc. • Provided to enable default semantics • Allows the implementation to connect shaders together without requiring a fixed register convention • Acts as a symbol table for run-time linking of shaders to the core API and therefore the hardware • No additional policy is imposed over DirectX 8 • Default semantics can be overridden • Deals with concepts, not memory addresses
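
The "symbol table" idea can be sketched as a lookup by usage rather than by register number. The types below are loosely modeled on the fields of `D3DVERTEXELEMENT9` (stream, offset, type, usage, usage index), but `Usage`, `VertexElement`, and `fetchBySemantic` are hypothetical stand-ins written for this illustration, not Direct3D definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for the "usage" field; not the real D3D enums.
enum class Usage { Position, Normal, Diffuse, TexCoord };

// One entry of a vertex declaration: where a component lives and what it means.
struct VertexElement {
    std::size_t stream;      // which vertex stream holds the data
    std::size_t offset;      // byte offset inside one vertex of that stream
    std::size_t floatCount;  // number of floats in the component
    Usage usage;             // semantic, e.g. Position or Normal
    int usageIndex;          // distinguishes texcoord0, texcoord1, ...
};

// Finds a component by semantic and copies it out of one vertex of raw
// stream data. Returns false when no element carries that semantic.
bool fetchBySemantic(const std::vector<VertexElement>& decl,
                     const std::vector<const unsigned char*>& vertexInStream,
                     Usage usage, int usageIndex, float* out) {
    for (const VertexElement& e : decl) {
        if (e.usage == usage && e.usageIndex == usageIndex) {
            std::memcpy(out, vertexInStream[e.stream] + e.offset,
                        sizeof(float) * e.floatCount);
            return true;
        }
    }
    return false;
}
```

The caller asks for "the normal" and never mentions a register or a memory address, which is the point of the last bullet on the slide.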

  17. DirectX 8 Vertex Declaration (diagram) • Streams: Strm0, Strm1 • Vertex layout and declaration: v0, skip, v1 • Shader program: vs 1.1 mov r0, v0 … • The shader handle covers both the declaration and the shader program

  18. New Vertex Declaration (diagram) • Streams: Strm0, Strm1 • Vertex layout: pos, norm, diff • Declaration: pos, norm, diff • Shader program (shader handle): vs 1.1 dcl_position v0 dcl_diffuse v1 mov r0, v0 …

  19. Vertex Shader Architecture (diagram) • Inputs: Vec0 … Vec15 • Address register: A0 • Temporary registers: R0 … R11 • Constants: Const0 … Const95 • Vertex ALU • Outputs: Hpos, Color0, Color1, TC0 … TC3

  20. Vertex Shaders: Vertex Shader 2.0 Register Reference (register table not included in the transcript) • * an: can only be written to by mov; the result is used as an integer offset in relative addressing • Note: Port Count = number of times a different register of that class can be used in a single instruction

  21. Math Instructions • Parallel ops (componentwise): • add, sub, mul, mad, frc, cmp • Vector ops • dp3, dp4 • Scalar ops: • rcp, rsq, exp2, log2 • Macros • LRP, NRM3, POW, CRS, SINCOS, SGN, ABS
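
To illustrate what "macro" means here, the sketch below emulates two of them on the CPU in terms of the basic ops from the same slide. The `Float4` type and function names are assumptions for this example, and scaling `w` in `nrm3` is also an assumption; check the instruction reference for the exact hardware behavior.

```cpp
#include <cassert>
#include <cmath>

struct Float4 { float x, y, z, w; };

// Componentwise multiply-add, like `mad dst, a, b, c`.
Float4 mad(Float4 a, Float4 b, Float4 c) {
    return { a.x * b.x + c.x, a.y * b.y + c.y,
             a.z * b.z + c.z, a.w * b.w + c.w };
}

// Three-component dot product, like `dp3`.
float dp3(Float4 a, Float4 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Reciprocal square root, like the scalar `rsq`.
float rsq(float v) { return 1.0f / std::sqrt(v); }

// LRP as one subtract and one mad: t * (a - b) + b.
Float4 lrp(Float4 t, Float4 a, Float4 b) {
    Float4 diff = { a.x - b.x, a.y - b.y, a.z - b.z, a.w - b.w };
    return mad(t, diff, b);
}

// NRM3 as dp3 + rsq + mul: scale the vector by 1 / length(xyz).
Float4 nrm3(Float4 v) {
    float inv = rsq(dp3(v, v));
    return { v.x * inv, v.y * inv, v.z * inv, v.w * inv };
}
```

Each macro costs more than one instruction slot because it expands into several of the simpler operations.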

  22. Vertex Shaders: Instruction Reference (table not included in the transcript)

  23. Vertex Shader Flow Control • DirectX 9 vertex shaders (vs_2_0) support flow control • Result is a “Structured Assembly” language • Control logic is based on constants only • Required by ISVs to solve: • Enabling/disabling environment mapping, etc. • The “varying # of lights” problem • Brings support equal to the nonprogrammable pipeline • Ideally a better skinning approach • The “varying # of bones” problem
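
A CPU-side sketch of constant-driven flow control: one "shader" handles any number of lights because the loop count and the branch come from constants set by the application, never from per-vertex data. `ShaderConstants`, `shadeVertex`, the limit of 8 lights, and the 0.25 environment term are all hypothetical values chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>

struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical constant block, set by the application before the draw call.
struct ShaderConstants {
    int  lightCount;     // integer constant driving the loop
    bool envMapEnabled;  // boolean constant driving a static branch
    Vec3 lightDirs[8];   // unit vectors pointing toward each light
};

// The same code path serves 1 light or 8: only the constants change.
float shadeVertex(Vec3 normal, const ShaderConstants& c) {
    float intensity = 0.0f;
    for (int i = 0; i < c.lightCount; ++i) {
        // Clamp with max rather than a data-dependent branch.
        intensity += std::max(dot(normal, c.lightDirs[i]), 0.0f);
    }
    if (c.envMapEnabled) intensity += 0.25f;  // stand-in for an env-map term
    return intensity;
}
```

Without flow control, the application would need a separate shader for every light count and feature combination.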

  24. Instruction Counts vs. Slots • Flow control means slots != counts • The instruction store holds 256 slots, but loops and calls let more instructions be executed than are stored • The executed instruction count limit is higher • Recommended not to exceed 1024 executed instructions

  25. Sampler State Separation • TextureStageState (TSS) has been split • One category for Texture Sampler data • One category for Texture Iterator control • Why? • Sampler State has 16 elements as 16 textures may be sampled in one pass • Other state has only 8 elements • Much of this state is for legacy pipelines • All enum indices remain the same • DDI impact is minimal

  26. Pixel Shaders • Float data precision supported • Enables photoreal rendering of high-dynamic-range scenes (cf. Debevec) • Pixel shader ALU must support • At least s10e5 precision for color data • At least s17e6 precision for all other data • Any input data in 32-bit float, such as texture iterators or reads of 32-bit float texture formats • _pp modifier supported on any instruction • Highlights operations where reduced precision is acceptable for performance
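
To get a feel for what the mantissa widths on this slide mean, the sketch below rounds a value to a given number of mantissa bits. It reads "s10e5" as 1 sign bit, 10 mantissa bits, and 5 exponent bits; `quantizeMantissa` is a hypothetical helper that models only the mantissa and ignores exponent range, denormals, and special values.

```cpp
#include <cassert>
#include <cmath>

// Rounds `value` to `mantissaBits` explicit mantissa bits (plus the implicit
// leading 1). Exponent range is not modeled, so this only shows the loss of
// precision, not overflow or underflow.
float quantizeMantissa(float value, int mantissaBits) {
    if (value == 0.0f) return 0.0f;
    int exponent = 0;
    // value = mantissa * 2^exponent with |mantissa| in [0.5, 1).
    float mantissa = std::frexp(value, &exponent);
    // Keep the leading bit plus `mantissaBits` bits after it.
    float scale = std::ldexp(1.0f, mantissaBits + 1);
    float rounded = std::round(mantissa * scale) / scale;
    return std::ldexp(rounded, exponent);
}
```

At 10 mantissa bits a value near 1000 can only move in steps of 0.5, which is usually fine for color data but not for texture coordinates or positions; that is the kind of operation the `_pp` modifier is meant to flag.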

  27. Demo Time #3

  28. Pixel Shader 2.0 Architecture (diagram) • Inputs: v0, v1, t0 … t7 • Temporary registers: r0 … r11 • Constants: c0 … c31 • Pixel ALU • Outputs: oC0 … oC3

  29. Pixel Shaders: Pixel Shader 2.0 Register Reference (register table not included in the transcript) • Port Count = # of times different registers of the same class can be used in one instruction

  30. Texture Load Instructions • 3 instructions provided in ps_2_0 • Standard texture load: texld r0, t1, s3 • Texture with per-pixel LOD bias: texldb r0, t0, s2 • Bias value stored in t0.w • Projected texture load: texldp r1, t2, s0 • Does perspective divide before lookup
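
The two non-standard loads differ from `texld` only in what they do to the coordinate register first. A minimal CPU sketch of that preprocessing, with hypothetical names (`projectedCoord`, `biasedLod`) and with `computedLod` standing in for whatever level of detail the hardware would otherwise pick:

```cpp
#include <cassert>

struct Float4 { float x, y, z, w; };
struct Float2 { float u, v; };

// texldp: divide the coordinate by its w component before the lookup.
Float2 projectedCoord(Float4 t) { return { t.x / t.w, t.y / t.w }; }

// texldb: the coordinate's w component is added to the level of detail.
// `computedLod` is an assumed input representing the hardware's own estimate.
float biasedLod(float computedLod, Float4 t) { return computedLod + t.w; }
```

A negative bias selects a sharper mip level, a positive bias a blurrier one.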

  31. Dependent Reads • Can be serialized, but only to a max depth of 4: • dcl t0.xy; dcl_2d s0; dcl_2d s1; texld r0, t0, s0; texld r1, r0, s1; texld r2, r1, s1; texld r3, r2, s1; • Is legal

  32. Dependent Reads Rock • What’s so great? • Textures become functional maps • Any continuous function that takes up to 3 inputs and produces up to 4 outputs can be stored as a texture • Pre-compute results and store in texture • Load texture at coordinates of input • Returns output as value at that point

  33. Dependent Reads Rock • Allows results far too complicated to calculate in real time to be used on the GPU at minimal cost • Stop thinking of textures as mere images; think of them as stores of data • Lookup tables, noise generators, and most arbitrary functions can all be emulated quickly on current hardware
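
The "texture as function" idea can be sketched on the CPU with a 1D table: bake the function once, then replace every evaluation with a filtered lookup. `bakeFunction` and `sampleLinear` are hypothetical helpers for this illustration; a real implementation would create a texture and let the sampler do the filtering.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Bakes f over [0, 1] into a 1D "texture" with `size` texels (size >= 2).
template <typename F>
std::vector<float> bakeFunction(F f, std::size_t size) {
    std::vector<float> texels(size);
    for (std::size_t i = 0; i < size; ++i)
        texels[i] = f(static_cast<float>(i) / static_cast<float>(size - 1));
    return texels;
}

// Linear-filtered lookup at u in [0, 1], like a bilinear sampler in one dimension.
float sampleLinear(const std::vector<float>& texels, float u) {
    float pos = u * static_cast<float>(texels.size() - 1);
    std::size_t i0 = static_cast<std::size_t>(pos);
    std::size_t i1 = (i0 + 1 < texels.size()) ? i0 + 1 : i0;
    float frac = pos - static_cast<float>(i0);
    return texels[i0] + (texels[i1] - texels[i0]) * frac;
}
```

A dependent read is then just feeding one lookup into the next, for example `sampleLinear(tableB, sampleLinear(tableA, u))`, with each table holding a precomputed function.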

  34. Multi-Render Target (MRT) • Step towards rationalizing textures and vertex buffers • Allows writing out multiple values from a single pixel shader pass • Up to 4 color elements plus Z/depth • Facilitates multipass algorithms • A pixel shader can output 4 vector-4s + depth for each pixel • That is 17 pieces of floating-point data that can be stored
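
The count of 17 falls directly out of the output layout. The sketch below models one pixel's worth of MRT output as a plain struct; `PixelOutput` and `shadePixel` are hypothetical, and the choice of what goes in each target (a normal and a depth value) is just an example.

```cpp
#include <cassert>
#include <cstddef>

struct Float4 { float r, g, b, a; };

// Everything one pixel shader pass may write under the slide's limits:
// four color outputs plus one depth value.
struct PixelOutput {
    Float4 color[4];
    float  depth;
};

// Hypothetical shader: a normal in target 0, depth replicated in target 1,
// targets 2 and 3 left at zero.
PixelOutput shadePixel(Float4 normal, float eyeDepth) {
    PixelOutput out = {};
    out.color[0] = normal;
    out.color[1] = { eyeDepth, eyeDepth, eyeDepth, 1.0f };
    out.depth = eyeDepth;
    return out;
}

// 4 targets * 4 components + 1 depth = 17 floats per pixel.
constexpr std::size_t kFloatsPerPixel = sizeof(PixelOutput) / sizeof(float);
```

A later pass can bind those targets as textures and read the stored values back, which is what makes multipass algorithms cheaper.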

  35. MRT Example: Depth of Field • Panels: Original, Alpha of Original, Blurred, Result • The image on the left is the original; the center is the alpha map • Black is in focus, white is out of focus • We can move the focal plane anywhere we like
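
One plausible way to combine the panels is a per-pixel blend between the sharp and blurred images, weighted by the alpha map. The slide does not give the exact formula, so both functions below are assumptions: `focusAlpha` derives alpha from distance to a focal plane, and `depthOfField` does a linear blend.

```cpp
#include <cassert>
#include <cmath>

struct Color { float r, g, b; };

// Hypothetical alpha map value: distance from the focal plane, normalized
// by `range` and clamped to [0, 1]. 0 = in focus (black), 1 = out of focus (white).
float focusAlpha(float depth, float focalDepth, float range) {
    float a = std::fabs(depth - focalDepth) / range;
    return a < 1.0f ? a : 1.0f;
}

// alpha = 0 keeps the sharp original; alpha = 1 uses the blurred copy.
Color depthOfField(Color original, Color blurred, float alpha) {
    return { original.r + (blurred.r - original.r) * alpha,
             original.g + (blurred.g - original.g) * alpha,
             original.b + (blurred.b - original.b) * alpha };
}
```

Moving the focal plane only changes `focalDepth`; the original and blurred images stay the same.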

  36. MRT Example: Edge Detection • Panels: World Space Normals, Eye Space Depth, Edge Detect, Outlines • Images courtesy of ATI Technologies, Inc.

  37. MRT Example: Edge Detection • Composite the outlines to get a cel-shaded effect • Images courtesy of ATI
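
A much simplified sketch of the idea behind these images: with normals and depth stored per pixel, an outline appears wherever neighboring pixels disagree strongly in either one. This is not ATI's filter; `GBufferSample`, `isEdge`, and both thresholds are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical per-pixel data read back from the render targets.
struct GBufferSample {
    float nx, ny, nz;  // unit normal
    float depth;       // eye-space depth
};

// Two neighboring pixels straddle an edge when their normals diverge
// (dot product below `normalThreshold`) or their depths differ by more
// than `depthThreshold`.
bool isEdge(const GBufferSample& a, const GBufferSample& b,
            float normalThreshold, float depthThreshold) {
    float nDot = a.nx * b.nx + a.ny * b.ny + a.nz * b.nz;
    return nDot < normalThreshold ||
           std::fabs(a.depth - b.depth) > depthThreshold;
}
```

Normals catch creases inside an object, while depth catches silhouettes where one object sits in front of another.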

  38. High Level Shader Language • Why? • Because assembly sucks • Allows all the things that make C so much better than machine code • Can separate pixel and vertex shader code from data • No longer have to map elements of a stream to registers, done semantically

  39. DirectX® 8 Assembly tex t0 ; base texture tex t1 ; environment map add r0, t0, t1 ; apply reflection

  40. DirectX 9 HLSL Syntax • outColor = tex2D( baseTexture, baseTextureCoord ) + texCUBE( Environment, EnvironmentMapCoord ); • Maybe more characters, but it makes much more sense

  41. Datatypes • Ints, bools, floats, etc. • All the things you know and love • Plus things that make graphics easy, like vectors and matrices • 1x1 up to 4x4 first-order floating-point data • matrix4x4 not matrix[4][4] • All operations are designed to operate on up to 4x4 data types natively

  42. DirectX 8 Vertex Declaration (again) (diagram) • Streams: Strm0, Strm1 • Vertex layout and declaration: v0, skip, v1 • Shader program: vs 1.1 mov r0, v0 … • The shader handle covers both the declaration and the shader program

  43. New Vertex Declaration (again) (diagram) • Streams: Strm0, Strm1 • Vertex layout: pos, norm, diff • Declaration: pos, norm, diff • Shader program (shader handle): vs 1.1 dcl_position v0 dcl_diffuse v1 mov r0, v0 …

  44. Vertex Shader Input Semantics • position[n]: untransformed position • blendweight[n]: skinning blending weight • blendindices[n]: skinning blending indices • normal[n]: normal vector • psize[n]: point size (particle system) • diffuse[n]: diffuse (matte) color • specular[n]: specular (shiny) color • texcoord[n]: texture coordinates • tangent[n], binormal[n]: these two, together with the normal vector, make a 3D coordinate system

  45. VS Output / PS Input Semantics • Position: transformed position • Psize: point size • Fog: fog blending value • color[n]: computed colors • texcoord[n]: texture coordinates

  46. Uses for Semantics • A data binding protocol: • Between vertex data and shaders • Between pixel and vertex shaders • Between pixel shaders and hardware • Between shader fragments • One smooth process for describing the flow of data in and out of the various elements of the render process

  47. So… • Yeah, we got all this programmable hardware • What does it really give us? OPTIONS!!! • You are finally able to compute what you want • No longer the fixed function pipeline’s bitch • Can render Pong, even Wolfenstein, on the GPU • Think of the GPU as a signal processor for vertex and pixel data, not merely something that renders pictures

  48. Finally • All graphics that use the fixed function pipeline, i.e. the Standard Lighting Equation, fundamentally look the same • Many hacks exist to work around it • But you are still stuck with: ambient + diffuse + specular • Programmable hardware allows graphics programmers to tailor the look of their work to fit the content
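
For reference, the "ambient + diffuse + specular" sum can be written out as a small function. The exact fixed-function equation has more terms (attenuation, materials, multiple lights); the version below is a common textbook form using a half vector, with hypothetical parameter names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Scalar intensity for one light. n = unit normal, l = unit vector toward
// the light, h = unit half vector between light and view directions.
float standardLighting(Vec3 n, Vec3 l, Vec3 h,
                       float ambient, float kd, float ks, float shininess) {
    float diffuse  = kd * std::max(dot(n, l), 0.0f);
    float specular = ks * std::pow(std::max(dot(n, h), 0.0f), shininess);
    return ambient + diffuse + specular;
}
```

With programmable shaders this function is just one option among many: any of its three terms can be replaced, dropped, or driven by a texture lookup.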

  49. Choose Your Look • Pick a unique “Look” and do it • Toon: several methods • Cheesy: unlit or flat shaded • Retro: standard FF pipelines • Radiosity: soft lighting only • Shadows: horror movie, Doom III • Gritty: ultra realistic • And many more

  50. Time for Hands On
