Reducing Complexity in Signal Processing Algorithms for Communication Receiver and Image Display Software

Wireless Networking and Communications Group. Reducing Complexity in Signal Processing Algorithms for Communication Receiver and Image Display Software. Prof. Brian L. Evans. Seminar at the American University of Beirut. 27 July 2010.

Wireless Networking and Communications Group

Reducing Complexity in Signal Processing Algorithms for Communication Receiver and Image Display Software

Prof. Brian L. Evans

Seminar at the American University of Beirut

27 July 2010

Outline
  • Embedded digital systems
  • Generating sinusoidal waveforms
  • Discrete-time filters
  • Multicarrier equalizers
  • Image halftoning algorithms
  • Conclusion

Embedded Digital Systems
  • Often work on application-specific tasks
  • In consumer products (2008 units)

1200M cell phones

300M PCs

100M digital cameras

100M DVD players

70M DSL modems

55M cars/light trucks

30M gaming consoles (2007)

  • iPhone has six programmable processors (2008)
  • Embedded programmable processors

Inexpensive with small area and volume

Predictable off-chip input/output (I/O) rates

“Low” power (e.g. TI C5504 at 45 mW @ 300 MHz)

Limited on-chip memory

Fixed-point arithmetic

Embedded Digital Systems
  • Memory access in processors

External I/O: block data transfers to/from on-chip memory

Internal I/O: on-chip memory to CPU registers using data buses (e.g. TI C6000 processor has two 32-bit data buses)

  • Common word sizes for signal processing software

64-bit floating-point for desktop computing (e.g. Matlab)

32-bit floating-point for pro-audio and sonar beamforming

16-bit fixed-point for speech, consumer audio, image proc.

  • IEEE floating-point operations

Handles many special cases (e.g. +∞, -∞ and not a number)

Add, multiply, divide have comparable hardware complexity

Embedded Digital Systems
  • Fixed-point operations

Multiplication based on addition operations

Division takes 1-2 instructions per bit of accuracy

Multiplication can consume much dynamic power

  • Truncate constants for power savings

Figure: multiplier used in TI C64 processors, annotated with 56% [Han, Evans & Swartzlander, 2005]

Generating Sinusoidal Waveforms
  • Sample continuous-time cosine signal at rate fs

Discrete-time fixed frequency ω0 = 2π f0 / fs

Example: f0 = 1200 Hz and fs = 8000 Hz, ω0 = 3π/10

Discrete-time realization drops fs term in front of cosine

  • Math library call to cos function in C

Uses double-precision floating-point arithmetic

No standard in C for internal implementation

Generally meant for high-accuracy desktop calculations

  • Call to gsl_sf_cos_e in GNU Scientific Library (GSL) 1.8

20 multiply, 30 add, 2 divide, 2 power calculations/output
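The sampling relationship above can be sketched with a direct library call. This is a minimal illustration in Python rather than C, and the function names `discrete_frequency` and `cosine_samples` are made up for this example; it only shows how ω0 follows from f0 and fs and what "one math library call per output sample" looks like.

```python
import math

def discrete_frequency(f0_hz, fs_hz):
    """Return omega0 = 2*pi*f0/fs in radians per sample."""
    return 2.0 * math.pi * f0_hz / fs_hz

def cosine_samples(f0_hz, fs_hz, num_samples):
    """Generate cos(omega0 * n) for n = 0..num_samples-1, one library call per sample."""
    omega0 = discrete_frequency(f0_hz, fs_hz)
    return [math.cos(omega0 * n) for n in range(num_samples)]

# Slide example: f0 = 1200 Hz, fs = 8000 Hz gives omega0 = 3*pi/10
omega0 = discrete_frequency(1200.0, 8000.0)
samples = cosine_samples(1200.0, 8000.0, 21)
```

Every output sample costs a full double-precision cosine evaluation, which is the expense the slide's operation counts describe.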

Generating Sinusoidal Waveforms
  • Difference equation with input x[n] and output y[n]

y[n] = (2 cos ω0) y[n-1] - y[n-2] + x[n] - (cos ω0) x[n-1]

From inverse z-transform of z-transform of cos(ω0 n) u[n]

Impulse response gives cos(ω0 n) u[n]

2 multiplications and 3 adds per output value

Buildup in error as n increases due to feedback

  • Lookup table – pre-compute samples offline

Discrete-time frequency ω0 = 2π f0 / fs = 2π N / L

All common factors between integers N and L removed

Period: ω0 n = 2π (N / L) n = 2π k → n = L (with k = N) → store L samples

Entries in either floating-point or fixed-point format

Table would contain N periods of the cosine

Initial conditions are all zero (for the difference equation)
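The difference equation can be tried out directly. The sketch below is an illustration in double-precision Python, not an embedded implementation: it drives the recursion with a unit impulse from zero initial conditions and measures how far the output drifts from `math.cos`. The function name `cosine_by_recursion` is hypothetical.

```python
import math

def cosine_by_recursion(omega0, num_samples):
    """Impulse response of y[n] = 2cos(w0) y[n-1] - y[n-2] + x[n] - cos(w0) x[n-1]."""
    c = math.cos(omega0)      # computed once, offline
    two_c = 2.0 * c
    y1 = y2 = 0.0             # y[n-1], y[n-2]: zero initial conditions
    x1 = 0.0                  # x[n-1]
    out = []
    for n in range(num_samples):
        x0 = 1.0 if n == 0 else 0.0          # unit impulse input
        y0 = two_c * y1 - y2 + x0 - c * x1   # 2 multiplies, 3 adds
        out.append(y0)
        y2, y1, x1 = y1, y0, x0
    return out

omega0 = 2.0 * math.pi * 1200.0 / 8000.0
recursive = cosine_by_recursion(omega0, 1000)
max_error = max(abs(v - math.cos(omega0 * n)) for n, v in enumerate(recursive))
```

In double precision the accumulated error stays small over 1000 samples; with short fixed-point words the same feedback path would let rounding error build up much faster, which is the concern raised on the slide.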

Generating Sinusoidal Waveforms
  • Signal quality vs. implementation complexity in generating cos(ω0 n) u[n] with ω0 = 2π N / L

MAC: multiplication-accumulation

RAM: random access memory (writeable)

ROM: read-only memory
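For comparison with the recursive approach, a lookup table for ω0 = 2π N / L might look like the following. This is a sketch under the assumption that f0 and fs are integers in Hz, so that N / L can be reduced exactly; `build_cosine_table` and `cosine_from_table` are illustrative names.

```python
import math
from fractions import Fraction

def build_cosine_table(f0_hz, fs_hz):
    """Precompute one full period (L samples) of cos(2*pi*(N/L)*n)."""
    ratio = Fraction(f0_hz, fs_hz)            # removes common factors of N and L
    n_periods, length = ratio.numerator, ratio.denominator
    table = [math.cos(2.0 * math.pi * n_periods * n / length)
             for n in range(length)]
    return n_periods, length, table

def cosine_from_table(table, n):
    """Runtime cost: one modulo index and one memory read, no multiplications."""
    return table[n % len(table)]

N, L, table = build_cosine_table(1200, 8000)   # reduces to N = 3, L = 20
```

The table holds L entries covering N periods of the cosine, so memory grows with L while the per-sample arithmetic drops to an index update.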

Discrete-Time Filters
  • Finite impulse response (FIR) filter
  • Impulse response h[k] has finite extent k = 0,…, M-1

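Figure: tapped delay line. The input x[k] passes through a chain of unit delays (z^-1) to give x[k-1] and older samples; the taps are weighted by h[0], h[1], h[2], …, h[M-1] and summed (Σ) to produce y[k].

Discrete-time convolution: y[k] = h[0] x[k] + h[1] x[k-1] + … + h[M-1] x[k-(M-1)]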
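The tapped delay line can be written out in a few lines. This is a plain-Python sketch for illustration (floating-point, no circular buffering), and `fir_filter` is a name chosen here, not one from the talk.

```python
def fir_filter(h, x):
    """Compute y[k] = sum over m of h[m] * x[k-m], assuming x[k] = 0 for k < 0."""
    delay_line = [0.0] * len(h)       # holds x[k], x[k-1], ..., x[k-(M-1)]
    y = []
    for sample in x:
        delay_line = [sample] + delay_line[:-1]   # shift in the newest sample
        acc = 0.0
        for coeff, value in zip(h, delay_line):
            acc += coeff * value      # one multiply-accumulate (MAC) per tap
        y.append(acc)
    return y

averager = [0.25, 0.25, 0.25, 0.25]               # 4-tap moving average
impulse_response = fir_filter(averager, [1.0, 0.0, 0.0, 0.0, 0.0])
step_response = fir_filter(averager, [1.0] * 6)
```

Each output costs M multiply-accumulates, and the impulse response is simply the coefficient list followed by zeros.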

Discrete-Time Filters
  • Infinite impulse response (IIR) filter

Biquad building block: 2 poles and 0-2 zeros

Generally, coefficients a1, a2, b0, b1, b2 are real-valued

Biquad is short for biquadratic: transfer function is ratio of two quadratic polynomials

Figure: biquad block diagram. Input x[k] and feedback taps a1, a2 form v[k]; two unit delays hold v[k-1] and v[k-2]; feedforward taps b0, b1, b2 form the output y[k].
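A biquad with one shared pair of delays can be sketched as below. The sign convention is an assumption, since the figure labels do not show signs: the feedback taps are added, so the transfer function is (b0 + b1 z^-1 + b2 z^-2) / (1 - a1 z^-1 - a2 z^-2). The function name `biquad` is illustrative.

```python
import math

def biquad(b, a, x):
    """Biquad section with states v[k-1], v[k-2].

    Assumed convention: v[k] = x[k] + a1*v[k-1] + a2*v[k-2]
                        y[k] = b0*v[k] + b1*v[k-1] + b2*v[k-2]
    """
    b0, b1, b2 = b
    a1, a2 = a
    v1 = v2 = 0.0                       # zero initial conditions
    y = []
    for sample in x:
        v0 = sample + a1 * v1 + a2 * v2
        y.append(b0 * v0 + b1 * v1 + b2 * v2)
        v2, v1 = v1, v0                 # advance the two unit delays
    return y

# Example: poles on the unit circle turn the biquad into a cosine generator
omega0 = 3.0 * math.pi / 10.0
c = math.cos(omega0)
impulse = [1.0] + [0.0] * 49
oscillator = biquad((1.0, -c, 0.0), (2.0 * c, -1.0), impulse)
```

With these coefficients the impulse response is cos(ω0 n), so the same building block covers both filtering and waveform generation.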

Discrete-Time Filters

(1) For same piecewise constant magnitude specification

(2) Algorithm to estimate minimum order for Parks-McClellan algorithm by Kaiser may be off by 10%. Search for minimum order is often needed.

(3) Algorithms can tune design to implementation target to minimize risk

Discrete-Time Filters
  • Keep roots computed by filter design algorithms

Polynomial deflation (rooting) reliable in floating-point

Polynomial inflation (expansion) may degrade roots

  • Choice of IIR filter structure matters

Direct form IIR structures expand zeros and poles, and may become unstable for large order filters (order > 12)

Cascade of biquads expands zeros and poles in each biquad

  • Minimum order design not always most efficient

Efficiency depends on target implementation

Consider power-of-two coefficient design

Efficient designs may require search of an infinite design space
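One way to read "power-of-two coefficient design" is to approximate each coefficient by a short sum of signed powers of two, so that a multiplication becomes a few shifts and adds. The sketch below uses a simple greedy choice of terms, which is an assumption made for illustration and not necessarily how such designs are optimized in practice; both function names are hypothetical.

```python
import math

def signed_power_of_two_terms(value, num_terms):
    """Greedily approximate value as a sum of at most num_terms terms +/- 2**e."""
    terms = []
    residual = value
    for _ in range(num_terms):
        if residual == 0.0:
            break
        sign = 1 if residual > 0 else -1
        exponent = round(math.log2(abs(residual)))   # nearest power in the log domain
        terms.append((sign, exponent))
        residual -= sign * 2.0 ** exponent
    return terms

def shift_add_multiply(sample, terms):
    """Multiply an integer sample by the approximated coefficient without a multiplier."""
    acc = 0
    for sign, exponent in terms:
        # Right shifts truncate toward negative infinity for negative samples
        shifted = sample << exponent if exponent >= 0 else sample >> -exponent
        acc += sign * shifted
    return acc

terms = signed_power_of_two_terms(0.75, 2)    # 0.75 = 2**0 - 2**-2
product = shift_add_multiply(1024, terms)     # 1024 - 256
```

Restricting coefficients this way changes the frequency response, so the filter order may have to rise to meet the same specification; that is the sense in which the minimum-order design is not always the cheapest one.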

Halftime: AUB Summer 2005
  • EECE 503 Real-Time DSP Lab
  • Embedded digital systems
  • Generating sinusoidal waveforms
  • Discrete-time filters
  • Multicarrier equalizers
  • Image halftoning algorithms
  • Conclusion
Channel Equalization
  • Channel degrades transmitted signal

Nonlinear distortion, e.g. amplitude nonlinearities

Linear distortion, e.g. convolution by channel impulse response

Additive noise, e.g. thermal (Gaussian) and impulsive

  • Equalization compensates linear distortion

Spreading/attenuation in time

Magnitude/phase distortion in frequency

Figure: block diagram with message bit stream, Transmitter, Channel, Equalizer, Receiver and received bit stream

Multicarrier Modulation
  • Divide channel into narrowband subchannels
  • Discrete multitone modulation

Baseband transmission based on fast Fourier transform (FFT)

Each subchannel carries single-carrier transmission

Standardized for digital subscriber line (DSL) communication

Figure: channel magnitude versus frequency, divided into subchannels, each with its own carrier

Subchannels are 4.3 kHz wide in DSL systems

Channel Equalization
  • Equalizer

Shortens channel impulse response (time domain eq.)

Compensates phase/magnitude distortion (freq. domain eq.)

  • Single carrier system: g is scalar constant

FIR filter w performs time and frequency domain equalization

  • Multicarrier system: g is FIR filter of length ν+1

Time domain equalizer (w) then FFT & freq. domain equalizer

Figure: discretized baseband system. The training signal xk passes through channel h, noise nk is added to give yk, and equalizer w produces rk. In parallel, the receiver generates xk and passes it through the ideal channel (delay z^-Δ followed by g). The difference between the two paths is the error ek.

Equalization in DSL receivers increases bit rate by 10x

Multicarrier Equalization
  • Maximum shortening SNR time domain equalizer

Minimize energy leakage outside shortened channel length

For each position of window Δ [Melsa, Younce & Rohrs, 1996]

  • Cholesky decomposition of B leads to optimal eigensolution

Computationally-intensive: O(Lw^3)

Floating-point multiplications/divisions

Restricts TEQ length to be less than ν+1

Figure: channel impulse response and effective channel impulse response, with a window of ν+1 samples

Time Domain Equalizer Design

Figure: plot with axes bit rate (Mbps) and training complexity in log10(multiply-add operations), for TEQ length of 17 [Martin et al., 2006]

Data rates averaged over eight standard DSL test lines

Most efficient floating-point versions of algorithms used

Time Domain Equalizer Design
  • Unified framework [Martin et al., 2006]

A and B are square (Lw × Lw) and depend on choice of Δ

Constraint prevents trivial non-practical solution w = 0

  • Find eigenvector for largest generalized eigenvalue

Figure labels: formulation; iterative methods (power method, alternating Lagrangian); division-free

20 iterations to converge for 17-tap MSSNR TEQ design
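To make "eigenvector for largest generalized eigenvalue" concrete, here is a textbook power iteration for the pencil B w = λ A w, written in plain Python. It is only a sketch: the 2×2 matrices are made-up test data rather than TEQ matrices, the helper names are hypothetical, and the linear solve uses divisions, so this is not one of the division-free methods mentioned on the slide.

```python
def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def generalized_power_method(B, A, iterations=20):
    """Iterate w <- A^{-1} B w with normalization; return (eigenvalue, w)."""
    w = [1.0] * len(A)
    for _ in range(iterations):
        w = solve(A, mat_vec(B, w))
        norm = dot(w, w) ** 0.5
        w = [x / norm for x in w]
    # Generalized Rayleigh quotient estimates the largest eigenvalue
    eigenvalue = dot(w, mat_vec(B, w)) / dot(w, mat_vec(A, w))
    return eigenvalue, w

A = [[2.0, 0.0], [0.0, 1.0]]     # illustrative symmetric positive definite matrix
B = [[2.0, 1.0], [1.0, 3.0]]     # illustrative symmetric matrix
eigenvalue, w = generalized_power_method(B, A)
```

For these test matrices the characteristic equation is 2λ² - 8λ + 5 = 0, so the iteration should approach λ = 2 + √6/2. The convergence rate depends on the gap between the two largest generalized eigenvalues.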

Digital Image Halftoning
  • For display on devices with fewer bits of gray/color resolution than original image

Grayscale: 8-bit image to 1-bit image

Color: 24-bit RGB image to 12-bit RGB display

  • Produces artifacts

Each pixel in original image is 8-bit unsigned intensity in [0, 255]

For display, 0 is black and 255 is white

Figure: original image x(m) and its threshold at mid-gray b(m)

Quantization with Feedback

Consider 4-bit data on 2-bit display (unsigned)

Feedback quantization error

For constant input 1001 = 9

Average output value

¼ (10+10+10+11) = 1001

4-bit resolution at DC !

Noise shaping

Truncating from 4 to 2 bits increases noise by ~12dB

Feedback removes noise at DC & increases HF noise

Figure: 4-bit input signal words enter an adder. The upper 2 bits of the sum go to the display device; the lower 2 bits are fed back to the adder through a 1 sample delay.

Time   Adder upper input   Adder lower input   Sum    Output to display
1      1001                00                  1001   10
2      1001                01                  1010   10
3      1001                10                  1011   10
4      1001                11                  1100   11

Figure: noise spectrum versus frequency f, labeled periodic, with added noise of 12 dB (2 bits)
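The adder table can be reproduced with a few lines of integer arithmetic. This is a sketch of the structure described above; the saturation of the adder output is an added assumption to keep the sum within 4 bits, and `feedback_quantize` is an illustrative name.

```python
def feedback_quantize(samples, in_bits=4, out_bits=2):
    """Keep the upper out_bits of each sum and feed the dropped bits back one sample later."""
    dropped_bits = in_bits - out_bits
    low_mask = (1 << dropped_bits) - 1
    max_sum = (1 << in_bits) - 1
    held_error = 0                     # lower adder input (delayed truncation error)
    outputs = []
    for sample in samples:
        # Saturation is an assumption, not something shown on the slide
        total = min(sample + held_error, max_sum)
        outputs.append(total >> dropped_bits)   # upper bits go to the display
        held_error = total & low_mask           # lower bits are fed back
    return outputs

outputs = feedback_quantize([0b1001] * 4)
average_in_4_bit_units = sum(outputs) * (1 << 2) / len(outputs)
```

For the constant input 1001 the outputs are 10, 10, 10, 11, and their average scaled back to 4-bit units is 9, matching the table.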

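Error Diffusion Halftoning
  • Quantize each pixel
  • Diffuse filtered quantization error to “future” pixels

Figure: error diffusion block diagram [Floyd & Steinberg, 1976]. The quantizer input u(m) is the input x(m) minus the filtered past errors (shape error); thresholding u(m) gives the halftone b(m); the difference e(m) = b(m) - u(m) is the quantization error (compute error).

Error filter weights relative to the current pixel: 7/16 (right), 3/16 (below left), 5/16 (below), 1/16 (below right)

Figure panels: original, halftone, halftone spectrum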
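A compact version of Floyd-Steinberg error diffusion is sketched below for a grayscale image stored as a list of rows. Two choices here are assumptions made for illustration: a raster (left-to-right) scan, and simply dropping error that would fall outside the image. The code pushes u - b forward to unprocessed pixels, which is equivalent to subtracting the filtered error b - u at each future pixel.

```python
def floyd_steinberg(image):
    """Halftone an 8-bit grayscale image (values 0..255) to values in {0, 255}."""
    rows, cols = len(image), len(image[0])
    work = [[float(p) for p in row] for row in image]   # accumulates x plus diffused error
    halftone = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            u = work[r][c]                    # quantizer input
            b = 255 if u >= 128 else 0        # threshold at mid-gray
            halftone[r][c] = b
            err = u - b                       # error pushed to "future" pixels
            if c + 1 < cols:
                work[r][c + 1] += err * 7 / 16
            if r + 1 < rows:
                if c > 0:
                    work[r + 1][c - 1] += err * 3 / 16
                work[r + 1][c] += err * 5 / 16
                if c + 1 < cols:
                    work[r + 1][c + 1] += err * 1 / 16
    return halftone

flat_gray = [[64] * 32 for _ in range(32)]
halftone = floyd_steinberg(flat_gray)
mean_output = sum(sum(row) for row in halftone) / (32 * 32)
```

On a constant gray input the local average of the halftone tracks the input level, apart from error lost at the image borders.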

Error Diffusion Halftoning
  • Deterministic bit flipping quantizer (DBF) [Damera-Venkata & Evans, 2001]

Thresholds input to black (0) or white (255)

Flip quantized value about mid-gray (128)

Reduces false textures in mid-grays

Implemented with two comparisons

Figure: DBF(x) versus x, with output level 255 and points x1, 128 and x2 marked on the x axis

Sharpness Control
  • Model quantizer as gain plus noise [Kite, Evans & Bovik, 1997]

Signal transfer function models sharpening

Ks ≈ 2 for Floyd-Steinberg

Noise transfer function models noise-shaping

Kn = 1

Figure: plots for ideal lowpass H(ω) with band edges at -ω1 and ω1, and Ks = 2. Signal transfer function: pass low and enhance high frequencies (levels 1 and 2). Noise transfer function: pass high frequency noise.
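The two transfer functions can be derived from the gain-plus-noise model in a few steps. The derivation below is a sketch that assumes the usual error diffusion recursion, u(m) = x(m) - (h * e)(m) and e(m) = b(m) - u(m), with H the transform of the error filter h; those sign conventions are an assumption about the block diagram.

```latex
\begin{align*}
\text{Signal path } (B = K_s U):\quad
  & E = (K_s - 1)\,U, \quad U = X - H E
  \;\Rightarrow\;
  \mathrm{STF} = \frac{B}{X} = \frac{K_s}{1 + (K_s - 1)\,H} \\
\text{Noise path } (B = U + N,\ K_n = 1,\ X = 0):\quad
  & E = N, \quad U = -H N
  \;\Rightarrow\;
  \mathrm{NTF} = \frac{B}{N} = 1 - H
\end{align*}
```

For an ideal lowpass H (1 in the passband, 0 elsewhere) and Ks = 2, the STF is 1 at low frequencies and 2 at high frequencies, while the NTF is 0 at low frequencies and 1 at high frequencies. That agrees with "pass low and enhance high frequencies" and "pass high frequency noise".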

Sharpness Control
  • Adjust by threshold modulation [Eschbach & Knox, 1991]

Scale image by gain L and add it to quantizer input

  • Flatten signal transfer function [Kite, Evans & Bovik, 2000]

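Figure: error diffusion block diagram in which the input x(m) is scaled by gain L and added to the quantizer input; signals x(m), u(m), b(m) and e(m) are labeled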
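Under the linear gain model, the value of L that flattens the signal transfer function can be worked out directly. This is a sketch assuming the quantizer acts as a scalar gain Ks on its input u(m) + L x(m), with u(m) = x(m) - (h * e)(m) and e(m) = b(m) - u(m); the sign conventions are assumptions about the block diagram.

```latex
\begin{align*}
B &= K_s\,(U + L X), \qquad E = B - U, \qquad U = X - H E \\
\mathrm{STF} &= \frac{B}{X}
  = \frac{K_s\,\bigl(1 + L - L\,H\bigr)}{1 + (K_s - 1)\,H} \\
\mathrm{STF} &= 1 \ \text{for every } H
  \iff L = \frac{1 - K_s}{K_s}
\end{align*}
```

With Ks ≈ 2 for Floyd-Steinberg this gives L ≈ -1/2, and positive L adds sharpening on top of what the error filter already produces.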

Results

Figure panels: original, Floyd-Steinberg, DBF quantizer, unsharpened

Conclusion
  • Processor architecture

Decrease data sizes to reduce on-chip memory usage and increase data bus efficiency

Truncate multiplicand constants to reduce power

  • Compute signal values by recursion or lookup table
  • Algorithm design

Keep offline design results in full precision until end

Order of calculations matters in implementation

Exploit problem structure in developing fixed-point algorithms

Linearize nonlinear systems to leverage linear system methods

  • Many other ways to reduce complexity exist
Invitations
  • Panel discussion on graduate studies

Tomorrow (Wednesday) 1:30 – 2:30 pm in this room (RCR)

Panelists: Prof. Zaher Dawy (AUB), Prof. Imad El-Hajj (AUB) and Prof. Brian Evans (UT Austin)

  • IEEE Workshop on Signal Processing Systems

Early October 2011

Short walk from the AUB campus

Organizers include Prof. Magdy Bayoumi (Univ. of Louisiana at Lafayette), Prof. Brian Evans (UT Austin), Dean Ibrahim Hajj (AUB) and Prof. Mohammad Mansour (AUB)

Digital Signal Processors
  • Market

~1/3 of $25B embedded digital signal processing market

For scale, 2007 sales of Pfizer's cholesterol-lowering drug Lipitor: $13B

  • Applications (2007)

Figures: DSP processor market annual revenue and share (Source: Forward Concepts)

Screening (Masking) Methods
  • Periodic thresholds to binarize image

Periodic application leads to aliasing (gridding effect)

Clustered dot screening is more resistant to ink spread

Dispersed dot screening has higher spatial resolution

Blue noise masking: larger masks (e.g. 1” by 1”)

Figure: clustered dot mask and dispersed dot mask, each an indexed threshold lookup table
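Screening with a periodic threshold mask can be sketched as follows. The 4×4 Bayer matrix used here is a common dispersed dot pattern chosen for illustration; it is an assumption and not necessarily the mask shown in the talk, and the function name `screen` is hypothetical.

```python
# 4x4 Bayer index matrix (dispersed dot ordering); illustrative choice of mask
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]

def screen(image, mask=BAYER_4X4, levels=16):
    """Binarize an 8-bit image by tiling the mask periodically over it."""
    size = len(mask)
    halftone = []
    for r, row in enumerate(image):
        out_row = []
        for c, pixel in enumerate(row):
            # Map mask index 0..levels-1 to a threshold inside 0..255
            threshold = (mask[r % size][c % size] + 0.5) * 256.0 / levels
            out_row.append(255 if pixel > threshold else 0)
        halftone.append(out_row)
    return halftone

mid_gray = [[128] * 4 for _ in range(4)]
halftone = screen(mid_gray)
white_count = sum(p == 255 for row in halftone for p in row)
```

Each pixel needs only one table lookup and one comparison, with no dependence on neighboring pixels. That makes screening cheap, at the price of the periodic gridding artifacts noted above.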

Linear Gain Model for Quantizer
  • Extend sigma-delta modulation analysis to 2-D

Linear gain model for quantizer in 1-D [Ardalan and Paulos, 1988]

Linear gain model for grayscale image [Kite, Evans, Bovik, 1997]

  • Error diffusion is modeled as linear, shift-invariant

Signal transfer function (STF): quantizer acts as scalar gain

Noise transfer function (NTF): quantizer acts as additive noise

Figure: the quantizer from u(m) to b(m) is replaced by two paths. Signal path: us(m) → Ks us(m). Noise path: un(m) → un(m) + n(m).

Spatial Domain

Figure panels: original image, threshold at mid-gray, dispersed dot screening, clustered dot screening, Floyd-Steinberg error diffusion, Stucki error diffusion
Magnitude Spectra

Figure panels: original image, threshold at mid-gray, dispersed dot screening, clustered dot screening, Floyd-Steinberg error diffusion, Stucki error diffusion
Human Visual System Modeling
  • Contrast at particular spatial frequency for visibility

Bandpass: non-dim backgrounds [Mannos & Sakrison, 1974; 1978]

Lowpass: high-luminance office settings with low-contrast images [Georgeson & G. Sullivan, 1975]

Exponential decay [Näsänen, 1984]

Modified lowpass version [e.g. J. Sullivan, Ray & Miller, 1990]

Angular dependence: cosine function [Sullivan, Miller & Pios, 1993]

Linear Gain Model for Quantizer

Ks by image and error filter:

Image      Floyd   Stucki   Jarvis
barbara    2.01    3.62     3.76
boats      1.98    4.28     4.93
lena       2.09    4.49     5.32
mandrill   2.03    3.38     3.45
Average    2.03    3.94     4.37
  • Best linear fit for Ks between quantizer input u(m) and halftone b(m)

Stable for Floyd-Steinberg

Can use average value to estimate Ks from only error filter

  • Sharpening: proportional to Ks [Kite, Evans & Bovik, 2000]

Value of Ks: Floyd Steinberg < Stucki < Jarvis

  • Weighted SNR using unsharpened halftone

Floyd-Steinberg > Stucki > Jarvis at all viewing distances
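One plausible reading of "best linear fit for Ks" is a least-squares fit of b(m) ≈ Ks u(m) with no offset term. The sketch below implements that reading for two flat sequences of samples; it assumes the signals are zero-centered (mid-gray mapped to 0), which is an assumption made here, and `estimate_signal_gain` is a hypothetical name.

```python
def estimate_signal_gain(u, b):
    """Least-squares gain Ks minimizing sum((b - Ks*u)**2) over all samples."""
    numerator = sum(ui * bi for ui, bi in zip(u, b))
    denominator = sum(ui * ui for ui in u)
    return numerator / denominator

# Synthetic check: if b is exactly 2*u, the fitted gain is 2
u_samples = [0.5, -0.25, 0.125]
b_samples = [1.0, -0.5, 0.25]
ks = estimate_signal_gain(u_samples, b_samples)
```

In practice u(m) and b(m) would be collected while halftoning a test image, and the fit repeated per image and per error filter, as in the table of Ks values.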