
Image Synthesis

Image Synthesis. GP-GPU. Graphics hardware. Current performance: PlayStation 3. CPU: Cell processor (3.2 GHz), 512 kB L2 cache, ~200 GFLOP/s. GPU (Graphics Processing Unit): Nvidia RSX Reality Synthesizer (550 MHz, ~300 M transistors), ~1.8 TFLOP/s, ~20 GPixels/s, ~2 GTriangles/s.

landis

Presentation Transcript


  1. Image Synthesis GP-GPU

  2. Graphics hardware • Current performance: PlayStation 3 • CPU: Cell processor (3.2 GHz) • 512 kB L2 cache • ~200 GFLOP/s • GPU (Graphics Processing Unit) • Nvidia RSX Reality Synthesizer (550 MHz, ~300 M transistors) • ~1.8 TFLOP/s • ~20 GPixels/s • ~2 GTriangles/s

  3. Graphics hardware - history • 80: simple rasterization • Windows, lines, polygons, text fonts • 90-95: "geometry engines" only on high-end workstations • (e.g. SGI O2 vs. Indigo2) • 95: new rasterization functionality • Realism by texturing, e.g. SGI Infinite Reality • 98: geometry processor (T&L) on PC graphics • 2000: PC graphics achieves performance similar to high-end workstations • 3D is becoming standard in Aldi PCs • 2001: PC graphics offers new functionality • Multitextures, vertex and pixel shaders • 2002: DirectX Level 9.0 hardware • High-level shader languages • 2006: DirectX Level 10.0 hardware • Geometry shader

  4. Trends in graphics hardware • Number of transistors doubles every 6 months • Advances in performance and functionality • [Chart: transistors (millions, axis 0 to 300) over time (month/year, 9/97 to 9/02); labeled chips: Riva 128 (3M), GeForce3 (57M), R200 (60M), GeForceFX / ATI Radeon 9800, ATI R520]

  5. Trends in graphics hardware • Grows faster than Moore's law predicts • [Chart: performance over time, with curves for graphics, CPU, and network]

  6. Parallel graphics hardware • Graphics hardware has always been parallel • Internally, on chip or board • Multiple rasterizers serve one frame buffer • Multi-pipe • Multiple graphics cards in one system for one or multiple displays • Multiple geometry engines • Distributed graphics • Multiple nodes in a connected cluster, each with one or multiple cards, serve one or multiple displays driven by one application

  7. Graphics architectures • State-of-the-art GPUs • Highly parallel stream architecture • Stream of vertices/fragments is processed • Pipelined and SIMD parallel processing • SIMD: single set of instructions on multiple stream elements • Specifies new rendering pipeline • Additional stages a vertex or a fragment passes through • Specifies new (vendor-specific) OpenGL extensions • Allows for new classes of algorithms • Eventually makes programs platform dependent
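
The SIMD stream model named on this slide can be imitated on the CPU. The following is a minimal Python sketch, not GPU code: `run_kernel`, `shade`, and the fragment tuples are hypothetical names and data chosen for illustration. On a GPU the loop would be spread across parallel units instead of running sequentially.

```python
def run_kernel(kernel, stream):
    """Apply one kernel (a single set of instructions) to every stream element."""
    # Elements do not depend on each other, so a GPU is free to process
    # many of them at the same time.
    return [kernel(element) for element in stream]


def shade(fragment):
    """Hypothetical fragment kernel: brighten a grey value and clamp it to 1.0."""
    x, y, grey = fragment
    return (x, y, min(1.0, grey * 1.5))


# Illustrative stream of fragments: (x, y, grey value)
fragments = [(0, 0, 0.2), (1, 0, 0.4), (0, 1, 0.8)]
shaded = run_kernel(shade, fragments)
```

Because the kernel sees only its own element, the order of execution does not affect the result, which is what makes the data-parallel execution possible.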

  8. Graphics architectures State-of-the-Art GPUs (G80)

  9. Graphics architectures • State-of-the-art GPUs • Multiple (texture) render targets • Up to 2 GB video memory • Floating-point textures (4 x 32 bit) • Internal computations in float/double precision • Z-cull: discards fragments that will fail the depth test (before they enter the pixel pipelines) • Dynamic flow control: per-vertex/geometry/fragment specific operations (if-then-else) • PCIe: serial, point-to-point protocol, dual channels to allow for bandwidth in both directions (upload/download) • Fixed fragment-to-pixel binding, i.e. a fragment (X, Y) cannot be written to a different pixel (X′, Y′) • No scattering (at least not in DX/GL), only gathering
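
The gather-only restriction means an algorithm that wants to write to a computed address has to be turned around so that each output position computes where to read from. A minimal Python sketch of that idea follows; it assumes the scatter pattern is a permutation whose inverse can be computed beforehand, and all function names are illustrative.

```python
def scatter(values, dest):
    """Scatter: element i chooses WHERE to write (not available to a fragment)."""
    out = [None] * len(values)
    for i, v in enumerate(values):
        out[dest[i]] = v
    return out


def gather(values, src):
    """Gather: output position p is fixed; it chooses WHERE to read from."""
    return [values[src[p]] for p in range(len(src))]


def invert(dest):
    """Turn a scatter index (assumed to be a permutation) into a gather index."""
    src = [0] * len(dest)
    for i, d in enumerate(dest):
        src[d] = i
    return src


values = ["a", "b", "c", "d"]
dest = [2, 0, 3, 1]
scattered = scatter(values, dest)
gathered = gather(values, invert(dest))
```

Both calls produce the same list; only the gather form maps onto a fragment shader, where the output pixel is fixed and only texture reads are free.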

  10. Graphics architectures State-of-the-Art programmable GPUs

  11. Graphics architectures State-of-the-Art programmable GPUs

  12. GP-GPU Water

  13. Programmable graphics hardware • Displacement mapping • [Diagram: Simulation generates height field texture -> Displacer (second input: static grid) -> water surface -> Rendering]
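
The displacement step of this slide can be sketched on the CPU as follows. This is an illustration under assumptions: the grid has one vertex per texel, heights are applied along z, and `make_grid` and `displace` are hypothetical helper names.

```python
def make_grid(nx, ny):
    """Static grid: one vertex per texel, positions in [0, 1] x [0, 1], z = 0."""
    return [[(i / (nx - 1), j / (ny - 1), 0.0) for i in range(nx)]
            for j in range(ny)]


def displace(grid, height):
    """Displacer: move every vertex along z by the height stored for it."""
    return [
        [(x, y, z + height[j][i]) for i, (x, y, z) in enumerate(row)]
        for j, row in enumerate(grid)
    ]


grid = make_grid(3, 2)
# Height field as the simulation would deliver it (made-up values).
height = [[0.0, 0.5, 0.0],
          [0.1, 0.0, 0.2]]
surface = displace(grid, height)
```

The grid itself never changes; only the height field is regenerated each frame, so the geometry can stay resident on the GPU.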

  14. Programmable graphics hardware • GPU memory objects • Semantics can be specified for a chunk of memory • A memory object can be a texture, a vertex array, or a frame buffer object • What was a texture render target in the current pass becomes a vertex array in the upcoming pass • Texture elements can be interpreted as vertex attributes without any copy operations (not in OpenGL) • The same effect can be achieved with vertex texture fetch, but this fetch slows down performance

  15. Programmable graphics hardware • Example • Computation of height values u at vertices of a 2D grid • Starting with an initial distribution, compute evolution over time t • [Diagram: 3 x 3 stencil of grid points P(i-1..i+1, j-1..j+1) around P(i,j) on x/y axes, grid spacing h]

  16. Programmable graphics hardware Algorithm: • Load initial height values (Nx x Ny) as 2D texture (sGridPrev, sGrid) • Upload fragment shader (render to sGridNew):
      void PerPixelSim(float2 fragpos : TEXCOORD0, out float4 height : COLOR0)
      {
          // height at this grid point in the previous time step
          float4 centerPrev = tex2D(sGridPrev, fragpos);
          // offset to the left neighbor, one texel wide
          float2 leftIndex = float2(-1.0 / TexSize, 0.0);
          float4 left = tex2D(sGrid, fragpos + leftIndex);
          // same for right, upper, lower, center
          height = f(left, right, upper, lower, center, centerPrev);
      }
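
The slides leave the update function f unspecified. The sketch below is a CPU stand-in for one simulation pass in Python, assuming an explicit discrete wave equation for f and clamp-to-edge texture lookups; both are assumptions for illustration, as is the constant `k`, which stands for (c·dt/h)².

```python
def f(left, right, upper, lower, center, center_prev, k=0.25):
    """Assumed update rule: explicit discrete wave equation (not from the slides)."""
    laplacian = left + right + upper + lower - 4.0 * center
    return 2.0 * center - center_prev + k * laplacian


def sim_step(grid_prev, grid):
    """CPU stand-in for one render pass of PerPixelSim over all fragments."""
    ny, nx = len(grid), len(grid[0])

    def tex(g, i, j):
        # Clamp-to-edge lookup, an assumed texture addressing mode.
        return g[min(max(j, 0), ny - 1)][min(max(i, 0), nx - 1)]

    return [
        [
            f(tex(grid, i - 1, j), tex(grid, i + 1, j),
              tex(grid, i, j + 1), tex(grid, i, j - 1),
              tex(grid, i, j), tex(grid_prev, i, j))
            for i in range(nx)
        ]
        for j in range(ny)
    ]


# A single bump in the middle of a 3 x 3 grid that was at rest.
grid_prev = [[0.0] * 3 for _ in range(3)]
grid_prev[1][1] = 1.0
grid = [row[:] for row in grid_prev]
grid_new = sim_step(grid_prev, grid)
```

Each output value depends only on reads from the two input grids, so every grid point can be computed independently, which matches the one-fragment-per-grid-point execution on the GPU.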

  17. Programmable graphics hardware Algorithm contd.: • Simulation: • Render a quad that covers Nx x Ny pixels with appropriate texture coords. • Nx x Ny fragments will be generated • Data-parallel execution of fragments • Swizzle (rotate) texture identifiers • tmp = sGridPrev; sGridPrev = sGrid; sGrid = sGridNew; sGridNew = tmp • Display height field in texture sGrid • [Diagram: quad with texture coordinates (0,0), (1,0), (1,1), (0,1) at its corners]
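
Rotating the three texture identifiers after each pass requires that the old sGridPrev handle is still available when it is handed to sGridNew, for example through a temporary or a simultaneous assignment. A minimal Python sketch, with strings standing in for texture handles and `rotate` as a hypothetical helper:

```python
def rotate(s_grid_prev, s_grid, s_grid_new):
    """Rotate the three texture identifiers after a simulation pass.

    Returns (new sGridPrev, new sGrid, new sGridNew). Only the handles
    move; no texel data is copied.
    """
    return s_grid, s_grid_new, s_grid_prev


# Strings stand in for texture handles.
prev, cur, new = "texA", "texB", "texC"
prev, cur, new = rotate(prev, cur, new)
after_one_pass = (prev, cur, new)

# Two more passes bring every handle back to its starting role.
for _ in range(2):
    prev, cur, new = rotate(prev, cur, new)
after_three_passes = (prev, cur, new)
```

The oldest texture becomes the next render target, so three textures are enough for a two-step time integration.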

  18. Programmable graphics hardware Algorithm contd.: • Display: • Upload fragment shader (render to color buffer):
      void PerPixelRefract(float2 fragpos : TEXCOORD0, out float4 color : COLOR0)
      {
          // rightIndex, upperIndex: one-texel offsets, analogous to leftIndex in PerPixelSim
          // finite differences of the height field give tangent and binormal
          float3 tangent = float3(1.0, 0.0, tex2D(sGrid, fragpos + rightIndex).r - tex2D(sGrid, fragpos).r);
          float3 binormal = float3(0.0, 1.0, tex2D(sGrid, fragpos + upperIndex).r - tex2D(sGrid, fragpos).r);
          float3 normal = normalize(cross(tangent, binormal));
          // texture-coordinate offset caused by refraction
          float2 refract = f(normal, refractionIndex);
          color = tex2D(sBackground, fragpos + refract);
      }
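
The normal computation in the display shader can be checked on the CPU. The Python sketch below mirrors the forward differences and the cross product; `refract_offset` is a hypothetical stand-in for the unspecified f(normal, refractionIndex), and its `strength` parameter is an assumption, not a physical refraction model.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def surface_normal(height, i, j):
    """Normal from forward differences of the height field at texel (i, j)."""
    tangent = (1.0, 0.0, height[j][i + 1] - height[j][i])
    binormal = (0.0, 1.0, height[j + 1][i] - height[j][i])
    return normalize(cross(tangent, binormal))


def refract_offset(normal, strength=0.1):
    """Hypothetical stand-in for f(normal, refractionIndex)."""
    return (-normal[0] * strength, -normal[1] * strength)


flat = [[0.0, 0.0], [0.0, 0.0]]
sloped = [[0.0, 1.0], [0.0, 1.0]]  # height rises along x
n_flat = surface_normal(flat, 0, 0)
n_sloped = surface_normal(sloped, 0, 0)
```

A flat surface yields a normal pointing straight up and therefore no offset into the background texture; a slope tilts the normal against the direction of the rise and shifts the lookup accordingly.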

  19. GPGPU Particle Tracing

  20. GPU Particle Tracing

  21. GPU Particle Tracing • [Diagram: pipeline stages Input Assembler, Vertex Shader, Rasterizer, Pixel Shader, Output Merger, with an input stream and an output stream]

  22. Programmable graphics hardware Demonstration
