Cg: A system programming graphics hardware in a C-like language


William R. Mark (The University of Texas at Austin), R. Steven Glanville (NVIDIA Corporation), Kurt Akeley (NVIDIA Corporation), Mark J. Kilgard (NVIDIA Corporation)


SIGGRAPH 2003

The Graphics Pipeline

[Programming Graphics Hardware]

Outline
  • Introduction
  • Background
  • Design Goals
  • Key Design Decisions
  • Cg Language Summary
  • Design Issues
  • CgFX
  • System Experiences
  • Conclusion
Introduction
  • Graphics architectures are now highly programmable and support application-specified assembly programs for both vertex processing and fragment processing
  • The most effective tool for programming these architectures is a high-level language
    • program portability, improved programmer productivity, and easier incremental and interactive program development
    • particularly valuable for shader programs
Introduction
  • This work presents a system for programming graphics hardware that supports programs written in a new C-like language named Cg
The Evolution of GPU Programming Languages
  • C (AT&T, 1970s)
  • IRIS GL (SGI, 1982)
  • C++ (AT&T, 1980s)
  • RenderMan (Pixar, 1988)
  • OpenGL (ARB, 1992)
  • Java (Sun, 1990s)
  • Reality Lab (RenderMorphics, 1994)
  • Direct3D (Microsoft, 1995)
  • PixelFlow Shading Language (UNC, 1998)
  • Real-Time Shading Language (Stanford, 2001)
  • HLSL (Microsoft, 2002)
  • Cg (NVIDIA, 2002)
  • GLSL (ARB, 2003)

[NVIDIA]

Background
  • In real-time rendering systems, support for user programmability has evolved with the underlying graphics hardware
  • For many years, mainstream commercial graphics hardware was configurable, but not user programmable
    • multipass rendering techniques: SGI’s OpenGL shader system [2000] and Quake III’s shading language [1999]
Background
  • In response to this trend, graphics architects began to incorporate programmable processors into both the vertex-processing and fragment-processing stages of single-chip graphics architectures [2001]
  • The most recent generation of PC graphics hardware (DirectX 9 or DX9 hardware [2002]) continues the trend of adding programmable functionality to both the fragment and the vertex processors
DX9-class Architectures
  • Vertex processor
    • adds conditional branching functionality
  • Fragment processor
    • adds flexible support for floating-point arithmetic and computed texture coordinates
Design Goals
  • Ease of programming
    • programming in assembly language is slow and painful
    • easy reuse of code
  • Portability
    • hardware from different companies
    • hardware generations (DX8-class hardware or better)
    • operating systems (Windows, Linux, MacOS)
    • major 3D APIs (OpenGL, DirectX)
Design Goals
  • Complete support for hardware functionality
  • Performance
  • Minimal interference with application data
  • Ease of adoption
  • Extensibility for future hardware
  • Support for non-shading uses of GPU
  • (some of these goals are in partial conflict with each other)
Key Design Decisions
  • A “general-purpose language”, not a domain-specific “shading language”
  • A program for each pipeline stage
  • Permit subsetting of language
  • Modular system architecture
Domain-specific vs. General-purpose Language
  • Domain-specific languages
    • shading computation
  • General-purpose languages
    • expose the fundamental capabilities of programmable graphics architectures
“General-purpose Language”
  • Considered together, our design goals led us to develop a hardware-focused general-purpose language
    • high performance
    • minimal management of application data
    • support for non-shading uses of GPUs
Cg follows C's philosophy
  • The C language succeeded in achieving goals for performance, portability, and generality of CPU programs that were very similar to our goals for a GPU language
  • Extend and modify C to support GPU architectures effectively → Cg
    • the language follows the syntax and philosophy of C
    • reserves all C and C++ keywords
    • selectively uses ideas from C++, Java, RenderMan, RTSL
    • also draws on concurrent work by 3Dlabs and the OpenGL ARB (GLSL) and by Microsoft (HLSL)
Programming Model
  • Choosing a programming model to layer on top of the stream-processing architecture
    • RTSL, RenderMan: single program
    • OpenGL, Direct3D: two separate programs
    • each program consumes an element of data from one stream and writes an element of data to another stream
  • The single-program model is not a natural match for the underlying dual-processor architecture
A Program for Each Pipeline Stage
  • The user-programmable processors in today's graphics architectures use a stream-processing model
  • Vertex program: executed once per vertex
  • Fragment program: executed once per fragment

[Programming Graphics Hardware]
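The stream-processing model can be pictured as applying one kernel independently to every element of a stream. The following is only a rough Python sketch of that idea, not Cg; `run_kernel`, `vertex_program`, and `fragment_program` are hypothetical names chosen for illustration.

```python
# Hypothetical stand-ins for the two user-programmable stages.

def run_kernel(kernel, stream):
    """Apply a kernel independently to each element of an input stream."""
    return [kernel(element) for element in stream]

def vertex_program(vertex):
    # Executed once per vertex: here, translate x by one unit.
    x, y = vertex
    return (x + 1.0, y)

def fragment_program(fragment):
    # Executed once per fragment: here, halve a grey-level intensity.
    return fragment * 0.5

transformed = run_kernel(vertex_program, [(0.0, 0.0), (1.0, 2.0)])
shaded = run_kernel(fragment_program, [1.0, 0.5, 0.0])
```

Because each call sees only its own element, the elements could in principle be processed in any order or in parallel.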

A Language for Expressing Stream Kernels
  • A single language specification for writing a stream kernel (i.e. a vertex program or a fragment program)
    • simplifies and generalizes the language by eliminating most of the distinctions between vertex and fragment programs
  • Particular processors are then allowed to omit support for some capabilities of the language
    • e.g. texture lookups: today’s vertex processors do not support them
A Language for Expressing Stream Kernels
  • The current Cg system can be thought of as a specialized stream-processing system
  • The Cg system relies on the established graphics pipeline dataflow of GPUs
    • it provides no mechanism of its own for connecting stream-processing kernels together
  • Cg focuses on kernel programming
    • specialized for stream-kernel programming
    • could be extended to support other parallel programming models
A Data-flow Interface for Program Inputs and Outputs
  • Should the system allow any vertex program to communicate with any fragment program?
    • via the rasterizer / interpolator
  • How should the vertex program outputs and fragment program inputs be defined to ensure compatibility?
A Data-flow Interface for Program Inputs and Outputs
  • When programming GPUs at the assembly level
    • the interface between fragment programs and vertex programs is established at the register level
    • for example, the user can establish a convention for the use of the TEXCOORD3 I/O register
  • The binding names must be chosen from a predefined namespace with predefined data types
A Data-flow Interface for Program Inputs and Outputs
  • Cg and HLSL: modified bind-by-name scheme
    • a predefined namespace is used instead of the user-defined identifier name
    • provides maximum control over the generated code
  • Cg also supports a bind-by-position scheme
    • requires that data be organized in an ordered list
    • a function-parameter list or a list of structure members
  • GLSL: pure bind-by-name
    • not supported by either Cg or HLSL
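To make the difference between the two binding schemes concrete, here is a small Python sketch. It is an illustration only: the identifiers (`oColor`, `decal`, `tint`, and so on) and both helper functions are made up, and vertex outputs and fragment inputs are modelled simply as ordered lists of (identifier, semantic) pairs.

```python
# Illustrative data: ordered lists of (identifier, semantic) pairs.
vertex_outputs = [("oColor", "COLOR"), ("oDecalCoord", "TEXCOORD0")]
fragment_inputs = [("decal", "TEXCOORD0"), ("tint", "COLOR")]

def bind_by_semantic(outputs, inputs):
    """Modified bind-by-name: match on the predefined semantic name."""
    by_semantic = {semantic: name for name, semantic in outputs}
    return {name: by_semantic[semantic] for name, semantic in inputs}

def bind_by_position(outputs, inputs):
    """Bind-by-position: the i-th output feeds the i-th input."""
    return {inp[0]: out[0] for out, inp in zip(outputs, inputs)}

semantic_binding = bind_by_semantic(vertex_outputs, fragment_inputs)
position_binding = bind_by_position(vertex_outputs, fragment_inputs)
```

With these lists the two schemes connect the same programs differently, which is why the declaration order matters only under bind-by-position.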
Permit Subsetting of Language
  • Conflicting goals: portability and comprehensive hardware support
  • There are major differences in functionality between the different graphics architectures that Cg supports
    • e.g. DX9: floating-point fragment arithmetic
  • We considered a variety of possible approaches to hiding or exposing these differences
    • minor architectural differences can be efficiently hidden by the compiler, and Cg does so
    • major architectural differences cannot be hidden by a compiler without sacrificing performance
Permit Subsetting of Language
  • Cg needed both
    • to support the existing installed base of DX8-class hardware
    • to provide access to the capabilities of the latest hardware
  • Cg:
    • exposes major architectural differences as differences in language capabilities
    • to minimize the impact on portability, Cg exposes the differences using a subsetting mechanism
    • each processor is defined by a profile
      • specifies which subset of the full Cg specification is supported on that processor
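A profile can be thought of as a set of supported capabilities that a program is checked against. The Python sketch below illustrates that check; the profile names appear elsewhere in these slides, but the capability sets are simplified assumptions made for the example, and `unsupported_features` is a hypothetical helper, not part of the Cg toolchain.

```python
# Simplified, assumed capability sets for three profiles.
PROFILES = {
    "arbvp1": {"float_arithmetic"},                    # vertex profile: no texture lookups
    "ps_1_1": {"texture_lookup"},                      # DX8-class fragment profile
    "ps_2_0": {"texture_lookup", "float_arithmetic"},  # DX9-class fragment profile
}

def unsupported_features(profile, features_used):
    """Return the features a program uses that the profile omits."""
    return sorted(set(features_used) - PROFILES[profile])

program_features = ["texture_lookup", "float_arithmetic"]
missing_on_vertex = unsupported_features("arbvp1", program_features)
missing_on_dx9 = unsupported_features("ps_2_0", program_features)
```

A compiler following this approach would reject the program for the first profile and accept it for the second, rather than silently emulating the missing feature.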
No Mandatory Virtualization
  • Should the system automatically virtualize hardware resources using software-based multi-pass techniques?
  • The Cg language specification does not require it, and the current release of Cg does not support it
    • effective virtualization of current hardware is either impossible
    • or too slow to be useful in a real-time application
    • it also conflicts with our design goals (virtualization on current hardware requires global management of application data and hardware resources)
Layered Above an Assembly Language Interface
  • Should machine / assembly language be exposed as an additional interface for system users?
  • By providing access to the assembly code, the system allows users
    • to tune their code by studying the compiler output
    • to manually edit the compiler output
    • even to write programs entirely in assembly language
    • all of which help to maximize performance
Explicit Program Parameters
  • All input parameters to a Cg program must be explicitly declared, either
    • using non-static global variables, or
    • by including the parameters in the entry function’s parameter list
  • Cg also provides a set of runtime API routines that allow parameters to be passed using their true names and types
Explicit Program Parameters
  • The Cg compiler prepends a header to its assembly code output to describe the mapping between program parameters and registers

#profile arbvp1
#program simpleTransform
#semantic simpleTransform.brightness
#semantic simpleTransform.modelViewProjection
#var float4 objectPosition : $vin.POSITION : POSITION : 0 : 1
#var float color : $vin.COLOR : COLOR : 1 : 1
….
#var float brightness :: c[0] : 8 : 1
#var float4x4 modelViewProjection :: c[1], 4 : 9 : 1
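An application or tool could recover the parameter-to-register mapping by splitting each `#var` line on its colons. The Python sketch below does this for the lines shown above. The field names (`semantic`, `resource`, `position`, `used`) are my reading of the five colon-separated columns and should be treated as an assumption; `parse_var_line` is a hypothetical helper.

```python
def parse_var_line(line):
    """Split one '#var' header line into its colon-separated fields."""
    body = line[len("#var"):].strip()
    decl, semantic, resource, position, used = [f.strip() for f in body.split(":")]
    type_name, param_name = decl.split()
    return {
        "type": type_name,
        "name": param_name,
        "semantic": semantic,       # empty for uniform parameters in the listing
        "resource": resource,       # register(s) the parameter is mapped to
        "position": int(position),  # assumed: ordinal of the parameter
        "used": used == "1",        # assumed: 1 means the program reads it
    }

brightness = parse_var_line("#var float brightness :: c[0] : 8 : 1")
position = parse_var_line(
    "#var float4 objectPosition : $vin.POSITION : POSITION : 0 : 1")
```

Here `brightness["resource"]` is `"c[0]"`, which is the constant register the application would load with the uniform value.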

Example Program
  • Example Cg program for the vertex processor (float4 is a vector of four floats)

void simpleTransform(float4 objectPosition : POSITION,
                     float4 color : COLOR,
                     float4 decalCoord : TEXCOORD0,
                     out float4 clipPosition : POSITION,
                     out float4 oColor : COLOR,
                     out float4 oDecalCoord : TEXCOORD0,
                     uniform float brightness,
                     uniform float4x4 modelViewProjection)
{
    clipPosition = mul(modelViewProjection, objectPosition);
    oColor = brightness * color;
    oDecalCoord = decalCoord;
}
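For readers without a Cg toolchain, the arithmetic of `simpleTransform` can be traced on the CPU. The Python sketch below mirrors the three assignments for a single vertex; it is a model of the computation only, with 4-vectors as lists and the matrix as a list of rows, and the input values are arbitrary.

```python
def mul(matrix, vector):
    """Matrix-vector product, playing the role of mul(M, v) in the listing."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def simple_transform(object_position, color, decal_coord,
                     brightness, model_view_projection):
    # Varying inputs come first; the last two arguments model the uniforms.
    clip_position = mul(model_view_projection, object_position)
    o_color = [brightness * c for c in color]
    o_decal_coord = list(decal_coord)
    return clip_position, o_color, o_decal_coord

identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
clip, out_color, out_coord = simple_transform(
    [1.0, 2.0, 3.0, 1.0], [1.0, 0.5, 0.25, 1.0], [0.5, 0.5, 0.0, 1.0],
    brightness=0.5, model_view_projection=identity)
```

With an identity matrix the position passes through unchanged, while every colour component is scaled by `brightness`.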

Other Cg Functionality
  • Provides structures and arrays; arithmetic operators (+, *, /, etc.); a boolean type and boolean operators (||, &&, !, etc.); increment and decrement (++/--); the conditional operator (?:); and assignment operators (+=, etc.)
  • Supports programmer-defined functions (recursive functions are not allowed)
  • Provides only a subset of C’s control-flow constructs (do, while, for, if, break, continue); goto and switch are not supported
  • Does not support pointers or bitwise operations
  • Supports #include, #define, #ifdef, etc. (matching the C preprocessor)
Design Issues
  • Support for hardware
  • User-defined interfaces between modules
  • Other language design decisions
  • Runtime API
Support for Hardware
  • The discussion below is organized around the characteristics of GPU hardware
    • Stream processor
    • Data types
    • Indirect addressing
    • Interaction with the rest of the graphics pipeline
    • Shading-specific hardware functionality
Stream Processor
  • A GPU program is executed many times, once for each vertex or fragment
    • for efficiency, inputs that change with each execution and inputs that stay unchanged reside in different register sets
  • A GPU language compiler must know the category to which an input belongs before it can generate assembly code
Stream Processor
  • Terminology for the two kinds of input
    • varying input: changes with each vertex or fragment
    • uniform input: stays the same across vertices or fragments
  • Cg uses the uniform qualifier
  • Computations that depend only on uniform parameters
    • do not need to be redone for every vertex or fragment
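The saving from uniform-only computations can be shown by hoisting them out of the per-element loop. The Python sketch below contrasts the two forms; `shade_naive`, `shade_hoisted`, and the cosine term are hypothetical and stand in for any expression that depends only on uniform inputs.

```python
import math

def shade_naive(varying_values, light_angle):
    # Recomputes the uniform-only term for every element.
    return [v * math.cos(light_angle) for v in varying_values]

def shade_hoisted(varying_values, light_angle):
    # The term depends only on the uniform input, so compute it once.
    scale = math.cos(light_angle)
    return [v * scale for v in varying_values]

values = [1.0, 2.0, 4.0]
naive = shade_naive(values, 0.3)
hoisted = shade_hoisted(values, 0.3)
```

Both functions return the same results; the second evaluates the cosine once instead of once per element.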
Data Types
  • Multiple numeric data types
    • float (32-bit), half (16-bit), fixed (12-bit)
  • Vector data types and operators
  • Matrix data types and operations
  • Does not support integer data types
  • Adds a bool data type for conditional operations
Indirect Addressing
  • Current graphics processors have very limited indirect addressing capability (uniform, sampler)
  • An array assignment in Cg performs a copy of the entire array
  • Cg currently forbids the use of pointers
  • Cg currently forbids recursive function calls
  • Supports call-by-value-result semantics
    • expressed with the in and out parameter modifiers
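Call-by-value-result means arguments are copied into the callee on entry and results are copied back to the caller on return, so no pointers are needed. The Python sketch below models that copy-in and copy-out behaviour; `call_value_result`, the mode strings, and `scale_vector` are illustrative inventions, not Cg syntax.

```python
import copy

def call_value_result(func, args, modes):
    """Copy 'in' arguments into private locals, run func, copy 'out' values back."""
    local = {name: copy.deepcopy(value) if "in" in modes[name] else None
             for name, value in args.items()}
    func(local)                      # the callee only ever touches its copies
    for name in args:
        if "out" in modes[name]:     # matches both "out" and "inout"
            args[name] = local[name]

def scale_vector(p):
    p["result"] = [p["scale"] * x for x in p["v"]]
    p["scale"] = 0.0                 # change to an in-only parameter is discarded

caller_vars = {"v": [1.0, 2.0], "scale": 2.0, "result": None}
call_value_result(scale_vector, caller_vars,
                  {"v": "in", "scale": "in", "result": "out"})
```

After the call only `result` has changed in the caller; the callee's write to `scale` never escapes because that parameter is not marked as an output.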
Interaction with the Rest of the Graphics Pipeline
  • Some of the I/O registers are used to control the non-programmable parts of the graphics pipeline, rather than to pass general-purpose data
  • The Cg specification mandates that certain register identifiers (e.g. POSITION) be supported as an output by all vertex profiles, and that certain other identifiers be supported by all fragment profiles
Shading-specific Hardware Functionality
  • The latest generation of graphics hardware includes a variety of capabilities specialized for shading
  • We chose to expose these capabilities via Cg’s standard library functions
    • this maintains the general-purpose nature of the language
  • The Cg standard library supports a variety of mathematical, geometric, and specialized functions
User-defined Interfaces Between Modules
  • The general-purpose solution we chose is adopted from Java and C#
  • The programmer may define an interface, which specifies one or more function prototypes
  • The programmer implements the interface by defining a struct (i.e. class) that contains definitions for the interface’s functions
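The interface mechanism resembles abstract base classes in other languages. As a rough analogy in Python rather than Cg, the sketch below declares an interface with one function prototype and two implementations; the `Light` example and all names in it are hypothetical.

```python
from abc import ABC, abstractmethod

class Light(ABC):
    """Interface: declares a function prototype without a body."""
    @abstractmethod
    def illuminate(self, distance):
        ...

class PointLight(Light):
    def __init__(self, intensity):
        self.intensity = intensity

    def illuminate(self, distance):
        # Inverse-square falloff with distance.
        return self.intensity / (distance * distance)

class AmbientLight(Light):
    def __init__(self, intensity):
        self.intensity = intensity

    def illuminate(self, distance):
        return self.intensity

def total_light(lights, distance):
    # Written against the interface only; any implementation can be plugged in.
    return sum(light.illuminate(distance) for light in lights)

brightness_at_2 = total_light([PointLight(4.0), AmbientLight(0.5)], 2.0)
```

The shading code in `total_light` never needs to know which concrete light types exist, which is the modularity the interface feature is meant to provide.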
Other Language Design Decisions
  • Function overloading by types and by profiles
  • Constants are typeless
  • No type checking for textures
Function Overloading by Types and by Profile
  • Supports function overloading by data type
    • the mechanism is similar to that of C++, but less complex
  • Also permits overloading by profile
    • it is possible to write multiple versions of a function that are optimized for different architectures
    • the compiler automatically chooses the version for the current profile
Overloading
  • Function overloading by hardware profile

// For ps_1_1 profile, use cubemap to normalize
ps_1_1 float3 mynormalize(float3 v)
{
    return texCUBE(norm_cubemap, v.xyz).xyz;
}

// For ps_2_0 profile, use stdlib routine to normalize
ps_2_0 float3 mynormalize(float3 v)
{
    return normalize(v);
}
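Profile-based overload resolution amounts to keeping several implementations under one name and selecting by the active profile. The Python sketch below imitates this with a registry; the `for_profile` decorator and `call` helper are hypothetical, and the coarse rounding in the `ps_1_1` version is only a stand-in for the limited precision of a cube-map lookup.

```python
import math

_VERSIONS = {}  # (function name, profile) -> implementation

def for_profile(profile):
    """Register the decorated function as the version for one profile."""
    def register(func):
        _VERSIONS[(func.__name__, profile)] = func
        return func
    return register

def call(name, profile, *args):
    """Pick the implementation registered for the current profile."""
    return _VERSIONS[(name, profile)](*args)

@for_profile("ps_1_1")
def mynormalize(v):
    # Stand-in for a table lookup: a coarsely quantized result.
    length = math.sqrt(sum(x * x for x in v))
    return [round(x / length, 2) for x in v]

@for_profile("ps_2_0")
def mynormalize(v):
    # Direct computation, standing in for the library normalize().
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

exact = call("mynormalize", "ps_2_0", [3.0, 0.0, 4.0])
coarse = call("mynormalize", "ps_1_1", [1.0, 1.0, 1.0])
```

The caller writes `mynormalize` once; which body runs depends only on the profile passed to `call`.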

Constants are Typeless
  • Changes the type promotion rules for constants
    • C: float x; 2.0*x → double precision
    • Cg: half y; 2.0*y → half precision
  • Internally, the new constant promotion rules are implemented by assigning a different type (cfloat or cint) to constants that do not have an explicit type suffix
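The effect of the constant rule can be sketched as a type-promotion table in which an unsuffixed constant has the lowest rank and therefore never widens the other operand. This Python model is an assumption-laden illustration: the rank values and both helper functions are invented for the example, only the type names `cfloat`, `half`, and `float` come from the slides, and `fixed` is left out.

```python
# Assumed ranks: an unsuffixed constant (cfloat) ranks below every real type.
CG_RANK = {"cfloat": 0, "half": 1, "float": 2}
# In C an unsuffixed floating constant such as 2.0 has type double.
C_RANK = {"float": 0, "double": 1}

def cg_result_type(a, b):
    """Result type of a binary operation under the Cg-style rule."""
    return max(a, b, key=CG_RANK.get)

def c_result_type(a, b):
    """Result type of a binary operation under the usual C conversions."""
    return max(a, b, key=C_RANK.get)

cg_type = cg_result_type("cfloat", "half")   # models 2.0 * y with half y
c_type = c_result_type("double", "float")    # models 2.0 * x with float x
```

Under this model the Cg expression stays in half precision, whereas the C expression is promoted to double.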
No Type Checking for Textures
  • The Cg system leaves the responsibility for most texture management (e.g. loading textures, specifying texture formats, etc.) with the underlying 3D API
  • Thus, the Cg system has very little information about the texture types
  • Stronger type checking would be possible by integrating the Cg system more tightly with the 3D API
Runtime API
  • The Cg runtime API is composed of two parts
    • one part is independent of the 3D API and provides a procedural interface to the compiler and its output
    • the other is layered on top of the 3D API and is used to load and bind Cg programs, to pass uniform and varying parameters to them, and to perform miscellaneous housekeeping tasks
Compound Types are Exploded to Cross the API
  • Cg programs may declare uniform parameters with compound types such as structures and arrays
  • The application passes the values of these parameters to the Cg program by using the Cg runtime API
  • The runtime explodes compound data structures into their constituent parts to pass them across the API
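Exploding a compound parameter means flattening it into individually named leaves. The Python sketch below shows one way this could work, with dicts standing in for structs, lists for arrays, and tuples for vector leaves; `explode` and the naming scheme are assumptions for illustration, not the actual runtime behaviour.

```python
def explode(name, value):
    """Flatten a struct (dict) or array (list) into (leaf name, value) pairs."""
    if isinstance(value, dict):
        members = [(f"{name}.{field}", v) for field, v in value.items()]
    elif isinstance(value, list):
        members = [(f"{name}[{i}]", v) for i, v in enumerate(value)]
    else:
        return [(name, value)]  # scalar or vector leaf: passed as one unit
    return [leaf for child, v in members for leaf in explode(child, v)]

lights = [
    {"color": (1.0, 1.0, 1.0), "intensity": 2.0},
    {"color": (1.0, 0.0, 0.0), "intensity": 0.5},
]
leaves = explode("lights", lights)
```

Each resulting pair, such as `("lights[1].intensity", 0.5)`, is simple enough to be passed across the API as a single value.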
Cg System can Shadow Parameter Values
  • The Cg runtime can manage many Cg programs at once, each with its own uniform parameters
  • GPU hardware can only hold a limited number of programs and parameters at a time
  • The Cg runtime can be configured to shadow a program’s parameters, so that the parameter values persist when the program is changed
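Shadowing can be understood as keeping a CPU-side copy of each program's parameters and re-uploading it whenever that program is bound again. The Python sketch below models this with a single register set that is overwritten on every bind; `ShadowingRuntime` and its methods are hypothetical and do not correspond to actual Cg runtime calls.

```python
class ShadowingRuntime:
    """Toy model: the 'GPU' holds the parameters of one program at a time."""

    def __init__(self):
        self.shadow = {}         # program name -> {parameter: value}
        self.gpu_registers = {}  # parameters of the currently bound program
        self.bound = None

    def set_parameter(self, program, name, value):
        # Always record the value in the CPU-side shadow copy.
        self.shadow.setdefault(program, {})[name] = value
        if program == self.bound:
            self.gpu_registers[name] = value

    def bind(self, program):
        # Binding evicts the previous program's values and restores
        # this program's values from its shadow copy.
        self.bound = program
        self.gpu_registers = dict(self.shadow.get(program, {}))

runtime = ShadowingRuntime()
runtime.set_parameter("A", "brightness", 0.5)
runtime.bind("A")
registers_first = dict(runtime.gpu_registers)
runtime.bind("B")
registers_other = dict(runtime.gpu_registers)
runtime.bind("A")
registers_again = dict(runtime.gpu_registers)
```

Program A's `brightness` survives the switch to program B and back without the application setting it again.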
CgFX
  • CgFX can represent and manage:
    • Functions that execute on the CPU
    • Multi-pass rendering effects
    • Configurable graphics state
    • Assembly-language GPU programs
    • Multiple implementations of a single shading effect
Conclusion
  • Cg system:
    • A system for programming GPUs
  • Cg language:
    • Extends and restricts C as needed for GPUs
    • Expresses stream kernels
    • Hardware-oriented language
  • Designed to age well
    • By reintroducing missing C features as GPU hardware becomes more capable