Computer Science 631 Lecture 4: Wavelets

Ramin Zabih

Computer Science Department

CORNELL UNIVERSITY

- Announcement: HW#1 exists!
- One more try at Powerpoint formulas
- Wavelet transformation via the Haar basis
- 2D wavelets

- Compute the position of the pixel X w.r.t. an oriented ray PQ
- Coordinates are A (along PQ) and B (perpendicular to PQ)

- The problem is that A and B are in units of pixels
- Need them in percentages of the length of PQ
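
A minimal sketch of the two steps above, assuming P, Q and X are 2D points given as (x, y) tuples. The function name `ray_coordinates` is hypothetical, and the sign convention for the perpendicular direction (PQ rotated by +90 degrees) is an assumption, since the slides do not state one.

```python
import math

def ray_coordinates(p, q, x):
    """Return (a, b, a_frac, b_frac) for point x relative to the oriented ray p -> q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("P and Q must be distinct points")
    # Unit vector along PQ, and a unit vector perpendicular to it
    # (PQ rotated by +90 degrees; this sign choice is an assumption).
    ux, uy = dx / length, dy / length
    vx, vy = -uy, ux
    px, py = x[0] - p[0], x[1] - p[1]
    a = px * ux + py * uy  # distance along PQ, in pixels
    b = px * vx + py * vy  # signed distance perpendicular to PQ, in pixels
    # Dividing by |PQ| expresses both as fractions of the length of PQ.
    return a, b, a / length, b / length

print(ray_coordinates((0, 0), (4, 0), (2, 1)))  # (2.0, 1.0, 0.5, 0.25)
```

Here the point sits halfway along PQ and a quarter of the length of PQ away from it.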

Note that X’ (as well as X) is a point, not a pixel

- The value at the interior point (x,y) is
- To compute this fast:

- An image can be described in several ways
- So far, in terms of pixels (= spatial domain)
- The frequency decomposition is very useful
- Low-frequency components change slowly
- High-frequency components change rapidly

- If the image has few or no high-frequency components, then aliasing is not a problem

- By describing an image in terms of the frequency domain, many things become clear
- The image formation process itself removes really high-frequency components
- What happens when we take a picture of a checkerboard where a pixel contains 100 squares?
- Many image operations are naturally viewed in terms of their effects on various frequency components
- Local averaging removes high-frequency components

- Image compression is best viewed in this way

- The canonical way to describe the frequency of an image is in terms of its Fourier transform
- This involves a number of issues that we don’t have time to cover in depth

- Instead, we will use a wavelet representation called the Haar basis
- Same basic idea, but easier and more intuitive
- For example, our basis vectors will be mostly 0
- By contrast, the Fourier basis is sine waves

- Consider a 1-D image (signal) with four elements: I = [9 7 3 5]
- Spatial representation:
I = 9*[1 0 0 0]+7*[0 1 0 0]+3*[0 0 1 0]+5*[0 0 0 1]

- The basis elements are 1-D (in this case) vectors, each with a single 1
- What are they called for an image?

- Our basis vectors will have a scale, which intuitively means how many non-zero elements
- To begin with, we will subtract the average value of I, which is 6 in our example
I - 6 = [3 1 -3 -1]
I = 6*[1 1 1 1] + [3 1 -3 -1]

- [1 1 1 1] is our first (dull) new basis vector
- No zeros, so coarsest possible scale

- The average value of an image is referred to as its DC component, others are AC components
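
As a small illustration of the DC/AC split, the sketch below subtracts the mean from the example signal. The helper name `split_dc` is hypothetical and not part of the lecture.

```python
def split_dc(signal):
    """Split a signal into its DC component (the mean) and the AC residual."""
    dc = sum(signal) / len(signal)
    ac = [value - dc for value in signal]
    return dc, ac

dc, ac = split_dc([9, 7, 3, 5])
print(dc, ac)  # 6.0 [3.0, 1.0, -3.0, -1.0]
```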

- For wavelets, the basis functions are shifted and scaled versions of a single function
- Like Fourier basis, with sine/cosine
- The “generating function” for the Haar basis is [1 -1]
- everywhere else it’s 0

- This is a pretty simple way to describe a signal!

- The average of two numbers is halfway between them
- Therefore, a pair [a b] can be represented as:
avg*[1 1] + diff*[1 -1], where

avg = (a + b)/2 and diff = a - avg = avg - b

- For the Haar basis, we will use this trick
- For a pair of numbers, compute the average and difference
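
The average/difference trick for one pair can be sketched as follows. The function names `to_avg_diff` and `from_avg_diff` are illustrative assumptions. The formulas are the ones stated above.

```python
def to_avg_diff(a, b):
    """Represent the pair [a b] as avg*[1 1] + diff*[1 -1]."""
    avg = (a + b) / 2
    diff = a - avg  # equal to avg - b
    return avg, diff

def from_avg_diff(avg, diff):
    """Recover the pair [a b] from its average and difference."""
    return avg + diff, avg - diff

print(to_avg_diff(9, 7))    # (8.0, 1.0)
print(to_avg_diff(3, 5))    # (4.0, -1.0)
print(from_avg_diff(8, 1))  # (9, 7)
```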

scale    average
4        [9 7 3 5]
2        [8 4]
1        [6]

- Given a signal of size 2^n, we can average the pixels together at various scales
- Compute a pairwise average, to get size 2^(n-1)
- Compute a pairwise average on that, too
- Eventually you get a single number
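
A minimal sketch of the repeated pairwise averaging, assuming the input length is a power of two. The name `average_pyramid` is hypothetical.

```python
def average_pyramid(signal):
    """Return every level of pairwise averages, finest scale first."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    levels = [list(signal)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Average each adjacent pair to halve the length.
        levels.append([(prev[i] + prev[i + 1]) / 2 for i in range(0, len(prev), 2)])
    return levels

print(average_pyramid([9, 7, 3, 5]))  # [[9, 7, 3, 5], [8.0, 4.0], [6.0]]
```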

- Example (n = 2)

scale    average      detail
4        [9 7 3 5]
2        [8 4]        [1 -1]
1        [6]          [2]

- Each time we compute the average, we can write down the difference too
[9 7] -> [8 1], [3 5] -> [4 -1]

Goal: represent the input (left) in terms of a DC term, plus scaled and shifted versions of the basis function (right)

Add these to get our original input [9 7 3 5]

Add these to get the previous average [8 8 4 4]

- There are several ways to do the representation
- For the Haar basis, we use a single DC component, and everything else is details
6*[1 1 1 1] + 2*[1 1 -1 -1] + 1*[1 -1 0 0] + (-1)*[0 0 1 -1]
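
The weighted sum above can be checked numerically. This is only a sketch: `combine` and `haar_basis` are illustrative names, and the basis vectors are the unnormalized ones used in the example.

```python
def combine(coefficients, basis):
    """Return the weighted sum of the basis vectors."""
    length = len(basis[0])
    return [sum(c * vec[i] for c, vec in zip(coefficients, basis))
            for i in range(length)]

haar_basis = [
    [1, 1, 1, 1],    # DC term
    [1, 1, -1, -1],  # coarsest detail
    [1, -1, 0, 0],   # fine detail, left pair
    [0, 0, 1, -1],   # fine detail, right pair
]

print(combine([6, 2, 1, -1], haar_basis))  # [9, 7, 3, 5]
print(combine([6, 2, 0, 0], haar_basis))   # [8, 8, 4, 4]
```

Dropping the two fine-detail coefficients gives back the previous level of averages, [8 8 4 4].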

- Note that you can build the averages at one level from the averages and differences on the next level down
- We can reconstruct [8 4] from [6] and [2]
[8 4] = 6*[1 1] + 2*[1 -1]

- Given an input signal, we can create the transformed signal by repeated averaging and differencing
[9 7 3 5] -> [8 4 1 -1] -> [6 2 1 -1]

Algorithm:
- Turn each adjacent pair into an (avg, diff) pair
- Store all the avgs in the left half of the output
- Store all the diffs in the right half
- Recurse on the left half of the output
- Stop when there is a single number left

Inverting is a simple exercise...
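
One possible sketch of the algorithm and its inverse, written iteratively rather than recursively and assuming the input length is a power of two. The names `haar_forward` and `haar_inverse` are hypothetical, and the coefficients are unnormalized, as in the worked example.

```python
def haar_forward(signal):
    """Unnormalized Haar transform by repeated averaging and differencing."""
    out = [float(v) for v in signal]
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    while n > 1:
        half = n // 2
        avgs = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diffs = [out[2 * i] - avgs[i] for i in range(half)]
        out[:n] = avgs + diffs  # averages on the left, differences on the right
        n = half                # continue on the left half only
    return out

def haar_inverse(coeffs):
    """Undo haar_forward, starting from the single DC value."""
    out = [float(v) for v in coeffs]
    total = len(out)
    if total == 0 or total & (total - 1):
        raise ValueError("length must be a power of two")
    n = 1
    while n < total:
        rebuilt = []
        for avg, diff in zip(out[:n], out[n:2 * n]):
            rebuilt += [avg + diff, avg - diff]
        out[:2 * n] = rebuilt
        n *= 2
    return out

print(haar_forward([9, 7, 3, 5]))   # [6.0, 2.0, 1.0, -1.0]
print(haar_inverse([6, 2, 1, -1]))  # [9.0, 7.0, 3.0, 5.0]
```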

- The basis vectors for the Haar basis are shifted and scaled versions of [1 -1]
- We’d like the basis to be orthonormal
- For any two elements u, v we want the inner product u^T v to be 1 if u = v, otherwise 0
- What is [1 -1]^T [1 -1]?
- Solution: multiply the basis vectors by the appropriate constant
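
To make the question concrete, the sketch below computes the inner products for the four-element Haar basis and rescales each vector to unit length. Dividing by the Euclidean norm is one standard choice of constant. The helper names are assumptions.

```python
import math

def dot(u, v):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale v so that its inner product with itself is 1."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

haar_basis = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, 0, 0],
    [0, 0, 1, -1],
]

print(dot([1, -1], [1, -1]))  # 2, so [1 -1] is not yet unit length
orthonormal = [normalize(v) for v in haar_basis]
```

The unnormalized vectors are already mutually orthogonal, so only the lengths need fixing.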

- First issue is that images are not necessarily of size 2^n by 2^n
- What is the wavelet transform of [1 3 9]?

- This issue arises in many places
- For example, if you want to smooth an image.
- Obvious solution is to replace each pixel by a local average
- But what do you do at the borders?

- Pad the image with something
- Typically, pad with 0
- For a local operation, discard the values that are within the operation's radius of the border
- Just need to be sure the process doesn’t crash
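
A minimal 1D sketch of both border strategies for a local average, assuming a window of width 2*radius + 1. The function names are hypothetical.

```python
def smooth_zero_padded(signal, radius=1):
    """Local average with zeros assumed outside the signal."""
    width = 2 * radius + 1
    padded = [0] * radius + list(signal) + [0] * radius
    return [sum(padded[i:i + width]) / width for i in range(len(signal))]

def smooth_valid_only(signal, radius=1):
    """Local average, discarding outputs within `radius` of the border."""
    width = 2 * radius + 1
    return [sum(signal[i:i + width]) / width
            for i in range(len(signal) - width + 1)]

print(smooth_zero_padded([3, 3, 3, 3]))  # [2.0, 3.0, 3.0, 2.0]
print(smooth_valid_only([3, 3, 3, 3]))   # [3.0, 3.0]
```

Zero padding keeps the output the same size as the input but darkens the borders of a constant signal. Discarding the border values avoids that at the cost of a smaller output.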

- “Make up” some other stuff around the image
- Question is what to make up
- Obvious answer is to use the image itself!

- Why?
- The difference between the Fourier transform and the transform that JPEG compression uses is (to simplify) standard versus reverse tiling
- Discrete Fourier Transform = standard
- Discrete Cosine Transform = reverse
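
The two tilings can be sketched in 1D as index mappings. The function names are hypothetical, and the mirrored version shown here (each copy reversed, with the border sample repeated) is one common convention, assumed for illustration.

```python
def tile_standard(signal, index):
    """Standard tiling: the signal repeats unchanged in both directions."""
    return signal[index % len(signal)]

def tile_reverse(signal, index):
    """Reverse tiling: every other copy of the signal is mirrored."""
    n = len(signal)
    k = index % (2 * n)
    return signal[k] if k < n else signal[2 * n - 1 - k]

signal = [1, 3, 9]
print([tile_standard(signal, i) for i in range(-3, 6)])  # [1, 3, 9, 1, 3, 9, 1, 3, 9]
print([tile_reverse(signal, i) for i in range(-3, 6)])   # [9, 3, 1, 1, 3, 9, 9, 3, 1]
```

With standard tiling the value jumps from 9 back to 1 at every seam. With reverse tiling the neighbours across a seam are equal, so no artificial discontinuity is introduced.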

- The standard decomposition first computes the 1D wavelet transform of each row
- The result is a row of the same length
- Order by decreasing importance (resolution)
- DC coefficient first

- We treat the result as an image, then compute a 1D wavelet transform of each column
- Result is an image with a single DC coefficient
- Rest are details (usually, details of details…)
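
A sketch of the standard decomposition under the same assumptions as the 1D example: unnormalized coefficients and power-of-two dimensions. The names `haar_1d` and `haar_2d_standard` are illustrative.

```python
def haar_1d(signal):
    """Unnormalized 1D Haar transform; the DC coefficient comes first."""
    out = [float(v) for v in signal]
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    while n > 1:
        half = n // 2
        avgs = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diffs = [out[2 * i] - avgs[i] for i in range(half)]
        out[:n] = avgs + diffs
        n = half
    return out

def haar_2d_standard(image):
    """Transform every row, then every column of the row-transformed image."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    # Transpose back so the result is indexed as [row][column].
    return [list(row) for row in zip(*cols)]

print(haar_2d_standard([[1, 3], [5, 7]]))  # [[4.0, -1.0], [-2.0, 0.0]]
```

The top-left entry, 4.0, is the single DC coefficient (the mean of all four pixels). The other entries are details.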

- Under the standard basis, examples are:

Gray = 0