

DDDDRRaW: A Prototype Toolkit for Distributed Real-Time Rendering on Commodity Clusters

Thu D. Nguyen and Christopher Peery

Department of Computer Science

Rutgers University

John Zahorjan

Department of Computer Science & Engineering

University of Washington


Overview

  • Improve real-time rendering performance using distributed rendering on commodity clusters

    • Real-time rendering -> interactive rendering applications

    • Improve performance -> Render more complex scenes at interactive rates

  • Why real-time rendering?

    • A critical component of an increasing number of continuous media applications

      • Virtual reality, data visualization, CAD, flight simulators, etc.

    • Rendering performance will continue to be a bottleneck

      • Model complexity is increasing as fast as (or faster than) hardware performance

      • Part of the challenge is to leverage increasingly powerful hardware accelerators


Challenges

  • How to structure the distributed renderer to leverage hardware-assisted rendering

    • Information that is useful for work partitioning and assignment may be hidden in the hardware rendering pipeline

  • How to minimize non-parallelizable overheads (limiting the effect of Amdahl’s Law)

  • How to decouple bandwidth requirement from the complexity of the scene and the cluster size
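The concern about non-parallelizable overhead can be made concrete with Amdahl's Law. The sketch below is illustrative only: the function name and the sample serial fractions are assumptions, not measurements from DDDDRRaW.

```python
def amdahl_speedup(serial_fraction: float, nodes: int) -> float:
    """Upper bound on speedup when `serial_fraction` of the per-frame
    work cannot be spread across rendering nodes (Amdahl's Law)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)


if __name__ == "__main__":
    # Hypothetical serial fractions, chosen only to show the trend.
    for s in (0.05, 0.10, 0.25):
        bounds = [round(amdahl_speedup(s, n), 2) for n in (4, 8, 16, 32)]
        print(f"serial fraction {s:.2f}: speedup bounds {bounds}")
```

Even a small serial fraction caps the achievable speedup, which is why work such as partitioning and compositing on the display node has to stay cheap as nodes are added.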


Image Layer Decomposition (ILD)

  • Per-frame rendering load is partitioned using ILD

    • Presented at IPDPS 2000

  • Briefly review ILD because it affects DDDDRRaW’s architecture and performance

  • Basic idea: assign scene objects such that sets of objects assigned to different nodes are not mutually occlusive

  • Advantages of using ILD

    • Do not need the 2D (screen-space) positions of polygons

      • This information may be hidden inside the graphics pipeline

    • Do not need Z-buffer information

      • This reduces the required bandwidth by at least 50%
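The bandwidth claim follows from simple per-pixel arithmetic. The sketch below assumes each rendering node would otherwise ship one color value and one depth value per pixel; the byte widths and frame size are hypothetical examples, not figures from the slides.

```python
def layer_bytes(width: int, height: int, color_bytes: int, depth_bytes: int = 0) -> int:
    """Bytes needed to ship one uncompressed image layer."""
    return width * height * (color_bytes + depth_bytes)


def saving_without_depth(color_bytes: int, depth_bytes: int) -> float:
    """Fraction of per-pixel traffic removed by not sending depth values."""
    return depth_bytes / (color_bytes + depth_bytes)


if __name__ == "__main__":
    # Hypothetical 640x480 layer with 24-bit color and a 24-bit or 32-bit depth value.
    with_z = layer_bytes(640, 480, color_bytes=3, depth_bytes=3)
    without_z = layer_bytes(640, 480, color_bytes=3)
    print(with_z, without_z)
    print(saving_without_depth(3, 3), saving_without_depth(3, 4))
```

Under this assumption the saving is at least 50% whenever a depth entry is at least as wide as a color entry.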


Image Layer Decomposition (ILD)

[Figure: spatial partitioning of the scene into regions numbered 1-6]


ILD: Work Assignment

  • Non-mutually occlusive assignment -> legal for back-to-front compositing

  • Use heuristic-based algorithm to

    • Balance load across cluster

    • Minimize the screen real-estate covered by each assignment

[Figure: example of a legal assignment]
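Because a legal assignment lets the layers be ordered back to front, the display node can merge them without any depth values. The following is a minimal sketch of that compositing step, assuming each layer is a grid in which `None` marks pixels the rendering node did not draw; the data layout and function name are hypothetical, not DDDDRRaW's actual interface.

```python
from typing import List, Optional

Pixel = Optional[int]          # None = nothing drawn at this pixel
Layer = List[List[Pixel]]


def composite_back_to_front(layers: List[Layer]) -> Layer:
    """Merge layers given in back-to-front order: nearer layers simply
    overwrite farther ones wherever they drew a pixel."""
    if not layers:
        raise ValueError("need at least one layer")
    height, width = len(layers[0]), len(layers[0][0])
    frame: Layer = [[None] * width for _ in range(height)]
    for layer in layers:                      # farthest layer first
        for y in range(height):
            for x in range(width):
                if layer[y][x] is not None:
                    frame[y][x] = layer[y][x]
    return frame


if __name__ == "__main__":
    far = [[1, 1, None]]
    near = [[None, 2, 2]]
    print(composite_back_to_front([far, near]))  # [[1, 2, 2]]
```

The overwrite is only correct when no object in an earlier layer occludes an object in a later one, which is exactly the property the assignment is meant to guarantee.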


Implementation: Architecture

[Figure: architecture diagram]

  • Display node

    • Application (App.): VRML scene, display window, display viewpoint

    • DDDDRRaW library: partitioning, assignment, decompress, compositing

  • Display node -> rendering nodes: work assignment

  • Rendering nodes (each running the DDDDRRaW library)

    • Rendering

    • Compress

  • Rendering nodes -> display node: partial image


Implementation Details

  • Implemented an optimization to ILD: dynamic selection of octants to be rendered

    • Minimize overhead of geometric transformation due to polygon splitting (in scene decomposition)

  • Compression of image layers before communication

    • Reduce bandwidth requirement to accommodate slower networks (e.g., 100 Mb/s LANs)

  • Use dynamic clipping to enforce octant boundaries for scenes with smooth shading and/or texturing

    • Simplification to ease implementation of prototype – this clipping could/should be done statically

    • Adds 20-25 percent overhead for 5 of our 6 test scenes; this overhead would not be present in a production system
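The slides do not say which compression scheme is applied to image layers. As an illustration only, the sketch below uses run-length encoding, an assumption chosen because a layer that covers a small part of the screen contains long runs of background pixels. The function names and the sample row are hypothetical.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (pixel value, repeat count)


def rle_encode(pixels: List[int]) -> List[Run]:
    """Collapse consecutive equal pixels into (value, count) runs."""
    runs: List[Run] = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


def rle_decode(runs: List[Run]) -> List[int]:
    """Expand (value, count) runs back into the original pixel row."""
    pixels: List[int] = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels


if __name__ == "__main__":
    # One scanline of a sparse layer: 0 stands for background.
    row = [0] * 6 + [7, 7, 9] + [0] * 5
    encoded = rle_encode(row)
    print(encoded)
    assert rle_decode(encoded) == row
```

The sparser a node's layer, the fewer runs it produces, so less screen area per assignment also means less data on the wire.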


Performance Measurement

  • Application: VRML viewer

    • VRweb – http://www.iicm.edu/vrwave

  • Collected 6 VRML scenes from the web

    • Use fixed paths through the scenes to measure performance in terms of average frame rate (frames/sec)

  • Two clusters representing different points in the technology spectrum

    • Cluster of 5 SGI O2s

      • 180 MHz MIPS R5000, 256 MB memory, SGI Graphics Accelerator, 100 Mb/s switched Ethernet LAN

      • IRIX 6.5.7

    • Cluster of 13 PCs

      • Pentium III 800 MHz, 512 MB memory, Giganet 1 Gb/s cLAN

      • Red Hat Linux (kernel 2.2.14), Mesa 3D library version 3.2








Conclusions

  • Can build an ILD-based distributed renderer to significantly improve real-time rendering performance on commodity hardware

  • DDDDRRaW currently scales only to modestly sized clusters

    • This limitation is due to the non-optimal hardware configurations we used

    • It is NOT because more suitable hardware is unavailable!

    • Expect good scalability to clusters of 16-32 nodes

  • Overlapping communication with computation increases average frame rate but ONLY at the expense of increasing frame latency

    • Problem is CPU contention for rendering & communication

    • Either need dedicated hardware, or can apply this optimization only after reaching 10-15 fps, the nominal interactive frame rate

  • Project URL: www.cs.washington.edu/research/ddddrraw/


Overlapping Communication & Computation

  • Communication and compression are significant sources of overhead

  • Apply standard parallel optimization technique: overlap communication of rendered image layers for one frame with rendering of the next

  • Requires pipelining of DDDDRRaW
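The trade-off between frame rate and latency under overlap can be sketched with a simple two-stage timing model. This is a back-of-the-envelope illustration, not the authors' analysis: the `contention` factor, which stretches both stages when rendering and communication share one CPU, and the sample timings are assumptions.

```python
from typing import Tuple


def sequential(render_s: float, comm_s: float) -> Tuple[float, float]:
    """No overlap: each frame is rendered, then communicated.
    Returns (frames per second, per-frame latency in seconds)."""
    frame_time = render_s + comm_s
    return 1.0 / frame_time, frame_time


def pipelined(render_s: float, comm_s: float, contention: float = 1.0) -> Tuple[float, float]:
    """Overlap communication of frame n with rendering of frame n+1.
    contention >= 1 models both stages slowing down on a shared CPU."""
    if contention < 1.0:
        raise ValueError("contention must be >= 1")
    render = render_s * contention
    comm = comm_s * contention
    fps = 1.0 / max(render, comm)      # throughput set by the slower stage
    latency = render + comm            # a frame still passes through both stages
    return fps, latency


if __name__ == "__main__":
    # Hypothetical timings: 60 ms rendering, 40 ms compression + communication.
    print("sequential:        ", sequential(0.06, 0.04))
    print("pipelined, shared: ", pipelined(0.06, 0.04, contention=1.3))
    print("pipelined, offload:", pipelined(0.06, 0.04, contention=1.0))
```

In this model, overlap on a shared CPU raises the frame rate while lengthening each frame's latency. Only with no contention, for example with dedicated communication hardware, does the frame rate improve at unchanged latency.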


The DDDDRRaW Pipeline

[Figure: pipeline diagram with three stages (Stage 1, Stage 2, Stage 3)]

  • Display node: ILD, Send, Receive, Decompress, Composite & Display

  • Rendering nodes: Receive, Render, Compress, Send



