Graphics acceleration

PowerPoint Slideshow about 'lesson12' - oshin



An example of line-drawing by the ATI Radeon’s 2D graphics engine


Bresenham’s algorithm

  • Recall this iterative algorithm for doing a ‘scanline conversion’ for a straight line

  • It required five parameters:

    • The starting endpoint coordinates: (X0,Y0)

    • The ending endpoint coordinates: (X1,Y1)

    • The foreground color for the solid-color line

  • It begins by initializing a decision-variable, where deltaX = X1 - X0 and deltaY = Y1 - Y0 (this form assumes 0 <= deltaY <= deltaX)

    • errorTerm = 2*deltaY - deltaX;


Algorithm’s main loop

for (int y = Y0, x = X0; x <= X1; x++)
{
    drawPixel( x, y, color );
    if ( errorTerm < 0 ) { errorTerm += 2*deltaY; }
    else { y += 1; errorTerm += 2*(deltaY - deltaX); }
}
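The loop can be tried out without any graphics hardware. The following is a minimal sketch, assuming the line runs left to right with a slope between 0 and 1; the function name `bresenhamLine` and the idea of collecting pixels in a vector (instead of calling drawPixel) are illustrative choices, not part of the original demo.

```cpp
#include <utility>
#include <vector>

// Collects the pixels of a line from (x0,y0) to (x1,y1).
// Assumes x0 <= x1 and 0 <= (y1 - y0) <= (x1 - x0).
std::vector<std::pair<int, int>> bresenhamLine(int x0, int y0, int x1, int y1)
{
    const int deltaX = x1 - x0;
    const int deltaY = y1 - y0;
    int errorTerm = 2 * deltaY - deltaX;   // the decision-variable

    std::vector<std::pair<int, int>> pixels;
    for (int y = y0, x = x0; x <= x1; x++)
    {
        pixels.emplace_back(x, y);         // stands in for drawPixel(x, y, color)
        if (errorTerm < 0) { errorTerm += 2 * deltaY; }
        else { y += 1; errorTerm += 2 * (deltaY - deltaX); }
    }
    return pixels;
}
```

Calling `bresenhamLine(0, 0, 1023, 767)` yields one pixel per horizontal position, so the loop body runs 1024 times for a full-screen diagonal.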


How much work for CPU?

  • Example: Drawing the longest visible line (in 1024x768 graphics mode) requires approximately 10,000 CPU instructions

  • The loop gets executed once for each of the 1024 horizontal pixels, and each pass through that loop requires about ten CPU operations: moves, compares, branches, adds and subtracts, plus the function-calls


Is acceleration possible?

  • The IBM 8514/A appeared in the late 1980s

  • It could do line-drawing (and some other common graphics operations) if just a few parameters were supplied

  • So instead of requiring the CPU to do ten thousand operations, the CPU could do maybe ten operations, then let the 8514/A graphics engine do the rest of the work!


8514/A Block Diagram

[Block diagram: CPU, ROM, PC Bus and PC Bus Interface; Graphics processor, Drawing engine, Display processor and CRT controller; VRAM memory; RAMDAC (LUT and DAC); Display Monitor]


ATI improved on IBM’s 8514/A

  • Various OEM vendors soon introduced their own graphics accelerator designs

  • Because IBM had not released details of its design, others had to create their own programming interfaces – all are different

  • Early PC graphics software was therefore NOT portable between hardware platforms


How does X300 draw lines?

  • To demonstrate the line-drawing ability of our classroom’s Radeon X300 graphics processors, we wrote the ‘drawline.cpp’ demo

  • We did not have access to ATI’s official Radeon programming manual, but we had several such manuals from other vendors, and we found ‘clues’ in source-code files for the Linux Radeon device-driver


Programming concepts

  • Our demo-program must first verify that it is running on a Radeon-equipped machine

  • It must determine how it can communicate with the Radeon’s graphics accelerator

  • Normal VGA registers are at ‘standard’ I/O port-addresses, but the graphics engine is outside the scope of established standards


Peripheral Component Interconnect

  • An industry committee (led by Intel) has established a standard mechanism that PC device-drivers can use to identify the peripheral devices that a workstation has, and their mechanisms for communication

  • To simplify the Pre-Boot Execution code, modern PCs provide ROM-BIOS routines that can be called to identify peripherals


PCI Configuration Space

Each peripheral device has a set of nonvolatile memory-locations which store information about that device using a standard layout

[Diagram: a 64-byte PCI CONFIGURATION HEADER followed by 192 bytes of ADDITIONAL PCI CONFIGURATION DATA, 256 bytes in all]

This device-information is accessed via I/O Port-Addresses 0x0CF8-0x0CFF
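In the usual port-based scheme (PCI "configuration mechanism #1"), a 32-bit selector is written to port 0x0CF8 and the chosen dword is then read or written at port 0x0CFC. The sketch below only builds the selector value; the names `PCI_CONFIG_ADDRESS`, `PCI_CONFIG_DATA` and `pciConfigAddress` are illustrative, and the actual port I/O is left out because it needs privileged instructions.

```cpp
#include <cstdint>

// I/O ports of PCI configuration mechanism #1.
constexpr std::uint16_t PCI_CONFIG_ADDRESS = 0x0CF8;
constexpr std::uint16_t PCI_CONFIG_DATA    = 0x0CFC;

// Builds the 32-bit value to write to PCI_CONFIG_ADDRESS in order to select
// one dword of one device's configuration space.
constexpr std::uint32_t pciConfigAddress(std::uint32_t bus, std::uint32_t device,
                                         std::uint32_t function, std::uint32_t offset)
{
    return 0x80000000u                 // bit 31: enable configuration access
         | ((bus      & 0xFFu) << 16)  // bits 23..16: bus number
         | ((device   & 0x1Fu) << 11)  // bits 15..11: device number
         | ((function & 0x07u) << 8)   // bits 10..8: function number
         | (offset    & 0xFCu);        // bits 7..2: dword-aligned register offset
}
```

For example, `pciConfigAddress(1, 0, 0, 0x14)` selects the dword at offset 0x14 of device 0 on bus 1; the bus and device numbers here are made up for illustration.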


PCI Configuration Header

Sixteen longword entries (64 bytes)

[Diagram: the first longword holds the DEVICE ID (upper 16 bits) and the VENDOR ID (lower 16 bits); later longwords hold the BASE-ADDRESS entries for RESOURCE 0, RESOURCE 1, RESOURCE 2 and RESOURCE 3]

VENDOR-ID = 0x1002: ATI Technologies Inc.

DEVICE-ID = 0x5B60: ATI Radeon X300 graphics processor

BASE-ADDRESS for RESOURCE 1 is the 2D engine’s I/O port

Our ‘findsvga.cpp’ utility will show you the PCI Configuration Space for any peripheral devices of Class 0x030000 (i.e., VGA-compatible graphics cards)
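The header fields can be picked apart with a few shifts and masks. This is a sketch under the standard PCI header layout (vendor and device IDs in the dword at offset 0x00, base-address entries starting at offset 0x10, 32-bit entries assumed); the helper names are hypothetical and do not come from ‘findsvga.cpp’.

```cpp
#include <cstdint>

struct PciId { std::uint16_t vendor; std::uint16_t device; };

// The dword at offset 0x00: VENDOR ID in the low half, DEVICE ID in the high half.
constexpr PciId decodeId(std::uint32_t dword0)
{
    return PciId{ static_cast<std::uint16_t>(dword0 & 0xFFFFu),
                  static_cast<std::uint16_t>(dword0 >> 16) };
}

// Base-address entries start at offset 0x10, one dword each.
constexpr std::uint32_t barOffset(unsigned resource) { return 0x10u + 4u * resource; }

// Bit 0 of a base-address entry is set when it describes an I/O port range.
constexpr bool barIsIoPort(std::uint32_t bar) { return (bar & 1u) != 0; }

// For an I/O entry the two low bits are flags; x86 port numbers fit in 16 bits.
constexpr std::uint16_t barIoBase(std::uint32_t bar)
{
    return static_cast<std::uint16_t>(bar & 0xFFFCu);
}
```

With these helpers, a first dword of 0x5B601002 decodes to vendor 0x1002 and device 0x5B60, and `barOffset(1)` gives 0x14, the offset of the entry for RESOURCE 1.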


Interface to PCI BIOS

  • Our ‘dosio.c’ device-driver (and ‘int86.cpp’ companion code) allow us access to BIOS

  • The PCI BIOS services are accessible (in the Pentium’s virtual-8086 mode) using function 0xB1 of software interrupt 0x1A

  • There are several subfunctions – you can find documentation online – for example, Professor Ralf Brown’s Interrupt List


return_radeon_port_address();

  • Our demo invokes these PCI ROM-BIOS subfunctions to discover which I/O Port our Radeon’s 2D graphics engine uses

    • Subfunction 1: Detect BIOS presence

    • Subfunction 3: Find Device in a Class

    • Subfunction A: Read Configuration Dword

  • Configuration Dword at offset 0x14 holds I/O Port-Address for 2D graphics engine
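The sequence of BIOS calls might look roughly like the sketch below. It is not the code from the demo: the `Regs` structure, the `Int1A` hook and the name `findRadeonIoPort` are hypothetical stand-ins for whatever ‘int86.cpp’ provides, and the register usage (AX = 0xB1xx, class code in ECX, device index in SI, bus/device/function in BX, offset in DI, result in ECX, status in AH) follows the PCI BIOS conventions as documented in Ralf Brown's Interrupt List.

```cpp
#include <cstdint>
#include <functional>

// Register image for one BIOS call (hypothetical helper type).
struct Regs {
    std::uint32_t eax = 0, ebx = 0, ecx = 0, edi = 0, esi = 0;
    bool carry = false;                // carry flag, set by the BIOS on failure
};

// Hypothetical hook that executes "int 0x1A" with the given registers.
using Int1A = std::function<void(Regs&)>;

// True when the call succeeded: carry clear and status byte AH equal to zero.
static bool ok(const Regs& r) { return !r.carry && ((r.eax >> 8) & 0xFFu) == 0; }

// Returns the 2D engine's I/O port, or -1 if no Radeon could be found.
int findRadeonIoPort(const Int1A& int1a)
{
    Regs r;
    r.eax = 0xB101;                    // subfunction 1: detect BIOS presence
    int1a(r);
    if (!ok(r)) return -1;

    r = Regs{};
    r.eax = 0xB103;                    // subfunction 3: find device in a class
    r.ecx = 0x030000;                  // class code for VGA-compatible devices
    r.esi = 0;                         // index 0 selects the first match
    int1a(r);
    if (!ok(r)) return -1;
    const std::uint32_t busDevFn = r.ebx & 0xFFFFu;   // BH = bus, BL = device/function

    // Subfunction A: read a configuration dword at the given offset.
    auto readDword = [&](std::uint32_t offset, std::uint32_t& value) {
        Regs q;
        q.eax = 0xB10A;
        q.ebx = busDevFn;
        q.edi = offset;
        int1a(q);
        value = q.ecx;
        return ok(q);
    };

    std::uint32_t id = 0, bar1 = 0;
    if (!readDword(0x00, id) || (id & 0xFFFFu) != 0x1002u) return -1;  // not ATI
    if (!readDword(0x14, bar1) || (bar1 & 1u) == 0) return -1;         // not an I/O port
    return static_cast<int>(bar1 & 0xFFFCu);                           // drop flag bits
}
```

Passing the interrupt call in as a function object keeps the logic testable on a machine without a PCI BIOS: a fake hook can return canned register values.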


The ATI I/O Port Interface

iobase + 0: MM_INDEX

iobase + 4: MM_DATA

You output a register’s index to the iobase + 0 address

Then you have read or write access to that register at the iobase + 4 address
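The index/data pairing can be wrapped in two small helpers. The sketch below assumes some object that offers 32-bit port output and input; `FakeRadeonPorts` is a hypothetical in-memory stand-in so the helpers can be exercised without hardware, and `writeReg` and `readReg` are illustrative names rather than functions from the demo.

```cpp
#include <cstdint>
#include <map>

// Stand-in for 32-bit port I/O (hypothetical; a real program would use its
// platform's privileged port instructions). Models MM_INDEX/MM_DATA behaviour.
class FakeRadeonPorts {
public:
    explicit FakeRadeonPorts(std::uint16_t iobase) : iobase_(iobase) {}
    void out32(std::uint16_t port, std::uint32_t value) {
        if (port == iobase_ + 0) index_ = value;               // MM_INDEX
        else if (port == iobase_ + 4) regs_[index_] = value;   // MM_DATA
    }
    std::uint32_t in32(std::uint16_t port) {
        if (port == iobase_ + 0) return index_;
        if (port == iobase_ + 4) return regs_[index_];
        return 0xFFFFFFFFu;                                    // unmapped port
    }
private:
    std::uint16_t iobase_;
    std::uint32_t index_ = 0;
    std::map<std::uint32_t, std::uint32_t> regs_;
};

template <class Ports>
void writeReg(Ports& io, std::uint16_t iobase, std::uint32_t regIndex, std::uint32_t value) {
    io.out32(iobase + 0, regIndex);   // select the register via MM_INDEX
    io.out32(iobase + 4, value);      // store through MM_DATA
}

template <class Ports>
std::uint32_t readReg(Ports& io, std::uint16_t iobase, std::uint32_t regIndex) {
    io.out32(iobase + 0, regIndex);   // select the register via MM_INDEX
    return io.in32(iobase + 4);       // fetch through MM_DATA
}
```

Because every access is two port operations, code that may be interrupted between them would need to protect the pair; the sketch ignores that concern.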


Many 2D engine registers!

  • You can peruse the ‘radeon.h’ header-file to see names and register-index numbers for the Radeon 2D graphics accelerator

  • You could also write a programming loop to input the contents from various offsets and thereby get some idea of which ones appear to hold ‘live’ values (i.e., hundreds!)

  • Only a small number are used in line-drawing


Main Line-Drawing registers

  • DP_GUI_MASTER_CNTL

  • DP_BRUSH_FRGD_COLOR

  • DP_BRUSH_BKGD_COLOR

  • DP_WRITE_MSK

  • DST_LINE_START

  • DST_LINE_END
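One plausible order of register writes for a single solid-color line is sketched below. Several details are assumptions rather than facts from the slides: that endpoints are packed with y in the upper 16 bits and x in the lower 16 bits (the convention used in the open-source Radeon driver code), that writing DST_LINE_END is what starts the draw, and that an all-ones write mask enables every bit-plane. The control value for DP_GUI_MASTER_CNTL is left to the caller because its bit layout is not described here. `packPoint` and `lineCommand` are hypothetical names.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using RegWrite = std::pair<std::string, std::uint32_t>;

// Assumed packing: y in the upper 16 bits, x in the lower 16 bits.
constexpr std::uint32_t packPoint(std::uint32_t x, std::uint32_t y)
{
    return ((y & 0xFFFFu) << 16) | (x & 0xFFFFu);
}

// Returns the register writes, in order, for one solid-color line.
// guiMasterCntl is supplied by the caller; its bit layout is not given here.
std::vector<RegWrite> lineCommand(std::uint32_t guiMasterCntl, std::uint32_t color,
                                  std::uint32_t x0, std::uint32_t y0,
                                  std::uint32_t x1, std::uint32_t y1)
{
    return {
        { "DP_GUI_MASTER_CNTL",  guiMasterCntl },
        { "DP_BRUSH_FRGD_COLOR", color },
        { "DP_WRITE_MSK",        0xFFFFFFFFu },        // assumed: enable every bit-plane
        { "DST_LINE_START",      packPoint(x0, y0) },
        { "DST_LINE_END",        packPoint(x1, y1) },  // assumed to start the draw
    };
}
```

Returning the writes as a list, instead of issuing them directly, makes it easy to print or inspect the sequence before trying it on real hardware.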


Others that affect drawing

  • RB2D_DSTCACHE_MODE

  • MC_FB_LOCATION

  • DEFAULT_PITCH_OFFSET

  • DST_PITCH_OFFSET

  • SRC_PITCH_OFFSET

  • DP_DATATYPE

  • DEFAULT_SC_TOP_LEFT

  • DEFAULT_SC_BOTTOM_RIGHT


CPU/GPU synchronization

[Diagram: Intel Pentium CPU and ATI Radeon GPU]

When the CPU off-loads the work of drawing lines (and doing other common graphical operations) to the Graphics Processing Unit, this frees up the CPU to execute other instructions. But it opens up the possibility that the CPU will send more drawing commands to the GPU even before the GPU has finished earlier commands. Some mechanism is needed to prevent the GPU from becoming overwhelmed by the work the CPU sends it.

The solution is a FIFO for pending commands, plus a Status Register


Engine has 64 FIFO slots

  • Before the CPU initiates a new drawing command, it checks to see if there are enough free slots in the command FIFO for storing that command’s parameters

  • The CPU can do ‘busy-waiting’ until the GPU reports that enough FIFO slots are ready to accept new command-arguments

  • An alternative is ‘interrupt-driven’ drawing
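The busy-waiting approach can be sketched as a bounded polling loop. The assumptions here are that the status register reports the number of free FIFO slots in its low bits and that a 7-bit mask is wide enough for counts from 0 to 64; `FIFO_COUNT_MASK`, `waitForFifo` and the `readStatus` callback are hypothetical names, with the callback standing in for a read of the engine's status register.

```cpp
#include <cstdint>
#include <functional>

// Assumed: the low bits of the status register count the free FIFO slots.
constexpr std::uint32_t FIFO_COUNT_MASK = 0x7F;   // hypothetical mask, room for 0..64

// Polls the status register until at least `needed` slots are free.
// Returns false if the engine never catches up within `maxPolls` reads.
bool waitForFifo(const std::function<std::uint32_t()>& readStatus,
                 std::uint32_t needed, unsigned maxPolls)
{
    for (unsigned i = 0; i < maxPolls; ++i)
    {
        const std::uint32_t freeSlots = readStatus() & FIFO_COUNT_MASK;
        if (freeSlots >= needed) return true;     // safe to send the command
    }
    return false;                                 // give up rather than hang
}
```

The poll limit is a design choice: an unbounded loop would hang the program if the engine stalls, whereas a bounded one lets the caller report the failure or reset the engine.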


Testing ‘drawline.cpp’

  • We developed our ‘drawline.cpp’ demo on a Radeon 7000 graphics card, then tested it on a newer and faster Radeon 9250

  • Our code worked fine

  • Tonight we shall try it on the Radeon X300

  • If these various models of the Radeon are fully compatible with one another, we can expect our demo to work fine on the X300


Hardware changes?

  • But if any significant differences exist in the various Radeon design-generations, then we may discover that our ‘drawline’ fails to perform properly on an X300

  • We would then have to explore the ways in which Radeon designs have changed, and try to devise ‘fixes’ for any flaws that we have found in our software application


In-class exercises

  • Try running the ‘drawline.cpp’ application on our classroom or CS Lab workstation: maybe it works fine, maybe it doesn’t

  • Look at the source-code files for the Linux ‘open-source’ ATI Radeon device-driver

  • If our ‘drawline’ works ok, see if you can add code that programs the engine to fill rectangles or copy screen-areas; or, if ‘drawline’ fails, see if you can devise a ‘fix’

