What is the WHT anyway, and why are there so many ways to compute it?

Jeremy Johnson

1, 2, 6, 24, 112, 568, 3032, 16768, … (the number of WHT algorithms of size 2^n for n = 1, 2, 3, …)

Walsh-Hadamard Transform
  • y = WHT_N x, N = 2^n
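As a concrete illustration (a pure-Python sketch with my own helper names, not code from the slides), WHT_{2^n} can be built as the n-fold Kronecker product of WHT_2 = [[1, 1], [1, -1]] and applied by a dense matrix-vector product:

```python
# Sketch (assumed names): build WHT_N, N = 2^n, as the n-fold
# Kronecker product of WHT_2, then compute y = WHT_N x directly.

WHT2 = [[1, 1], [1, -1]]  # the 2x2 Walsh-Hadamard matrix

def kron(A, B):
    """Kronecker product of matrices stored as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def wht_matrix(n):
    """WHT_{2^n} = WHT_2 (x) WHT_2 (x) ... (x) WHT_2 (n factors)."""
    M = WHT2
    for _ in range(n - 1):
        M = kron(WHT2, M)
    return M

def matvec(M, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]

# y = WHT_8 x for a unit impulse: the first column of WHT_8 is all ones
print(matvec(wht_matrix(3), [1, 0, 0, 0, 0, 0, 0, 0]))  # [1, 1, 1, 1, 1, 1, 1, 1]
```

Applied this way the transform costs O(N^2) operations; the factorizations that follow are what bring it down to N lg N.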
WHT Algorithms
  • Factor WHT_N into a product of sparse structured matrices
  • Compute: y = (M_1 M_2 … M_t) x

y_t = M_t x

…

y_2 = M_2 y_3

y = M_1 y_2
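For instance (an illustrative sketch, not package code), with the two-factor split WHT_4 = (WHT_2 ⊗ I_2)(I_2 ⊗ WHT_2) the stages are applied right to left, never forming the dense product:

```python
# Sketch: evaluate y = (M1 M2) x stage by stage, where
# M1 = WHT_2 (x) I_2 and M2 = I_2 (x) WHT_2, so M1 M2 = WHT_4.

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matvec(M, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]

WHT2 = [[1, 1], [1, -1]]
I2 = [[1, 0], [0, 1]]

M1 = kron(WHT2, I2)   # sparse: two nonzeros per row
M2 = kron(I2, WHT2)   # sparse: two nonzeros per row

x = [1, 2, 3, 4]
y2 = matvec(M2, x)    # y2 = M2 x
y = matvec(M1, y2)    # y  = M1 y2
print(y)              # WHT_4 applied to x: [10, -2, -4, 0]
```

Each sparse stage costs O(N), which is where the N lg N operation count of the full algorithm comes from.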

Factoring the WHT Matrix
  • AC ⊗ BD = (A ⊗ B)(C ⊗ D)
  • A ⊗ B = (A ⊗ I)(I ⊗ B)
  • A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C
  • I_m ⊗ I_n = I_mn

WHT_2 ⊗ WHT_2 = (WHT_2 ⊗ I_2)(I_2 ⊗ WHT_2)
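These identities can be checked numerically on small concrete matrices; a sketch (helper names are mine):

```python
# Sketch: verify the tensor identities on concrete small matrices.

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

A, B = [[1, 2], [3, 4]], [[0, 1], [5, 6]]
I2 = identity(2)

# A (x) B = (A (x) I)(I (x) B)
assert kron(A, B) == matmul(kron(A, I2), kron(I2, B))

# WHT_2 (x) WHT_2 = (WHT_2 (x) I_2)(I_2 (x) WHT_2)
W2 = [[1, 1], [1, -1]]
assert kron(W2, W2) == matmul(kron(W2, I2), kron(I2, W2))
print("identities verified")
```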

Recursive and Iterative Factorization

WHT_8 = (WHT_2 ⊗ I_4)(I_2 ⊗ WHT_4)

  • = (WHT_2 ⊗ I_4)(I_2 ⊗ ((WHT_2 ⊗ I_2)(I_2 ⊗ WHT_2)))
  • = (WHT_2 ⊗ I_4)(I_2 ⊗ (WHT_2 ⊗ I_2))(I_2 ⊗ (I_2 ⊗ WHT_2))
  • = (WHT_2 ⊗ I_4)(I_2 ⊗ (WHT_2 ⊗ I_2))((I_2 ⊗ I_2) ⊗ WHT_2)
  • = (WHT_2 ⊗ I_4)(I_2 ⊗ WHT_2 ⊗ I_2)((I_2 ⊗ I_2) ⊗ WHT_2)

= (WHT_2 ⊗ I_4)(I_2 ⊗ WHT_2 ⊗ I_2)(I_4 ⊗ WHT_2)

Recursive Algorithm

WHT_8 = (WHT_2 ⊗ I_4)(I_2 ⊗ (WHT_2 ⊗ I_2)(I_2 ⊗ WHT_2))

Iterative Algorithm

WHT_8 = (WHT_2 ⊗ I_4)(I_2 ⊗ WHT_2 ⊗ I_2)(I_4 ⊗ WHT_2)

         [ 1  1  1  1  1  1  1  1 ]
         [ 1 -1  1 -1  1 -1  1 -1 ]
         [ 1  1 -1 -1  1  1 -1 -1 ]
WHT_8 =  [ 1 -1 -1  1  1 -1 -1  1 ]
         [ 1  1  1  1 -1 -1 -1 -1 ]
         [ 1 -1  1 -1 -1  1 -1  1 ]
         [ 1  1 -1 -1 -1 -1  1  1 ]
         [ 1 -1 -1  1 -1  1  1 -1 ]

[The slide displays this matrix as the product of the three sparse 8×8 factors WHT_2 ⊗ I_4, I_2 ⊗ WHT_2 ⊗ I_2, and I_4 ⊗ WHT_2.]
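Both the recursive and the iterative factorizations of WHT_8 can be checked against the dense Kronecker construction; a sketch (helper names are mine):

```python
# Sketch: confirm that the recursive and the iterative factorizations
# both reproduce the dense WHT_8.

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

W2 = [[1, 1], [1, -1]]
I2, I4 = identity(2), identity(4)

WHT8 = kron(W2, kron(W2, W2))

# recursive: (WHT2 (x) I4)(I2 (x) ((WHT2 (x) I2)(I2 (x) WHT2)))
recursive = matmul(kron(W2, I4),
                   kron(I2, matmul(kron(W2, I2), kron(I2, W2))))

# iterative: (WHT2 (x) I4)(I2 (x) WHT2 (x) I2)(I4 (x) WHT2)
iterative = matmul(matmul(kron(W2, I4), kron(I2, kron(W2, I2))),
                   kron(I4, W2))

assert recursive == WHT8 == iterative
print("both factorizations equal WHT_8")
```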

WHT Algorithms
  • Recursive
  • Iterative
  • General
WHT Implementation
  • Definition/formula

WHT_{2^n} = ∏_{i=1}^{t} ( I_{2^{n_1+···+n_(i-1)}} ⊗ WHT_{2^{n_i}} ⊗ I_{2^{n_(i+1)+···+n_t}} )

    • N = N_1 N_2 ··· N_t, N_i = 2^{n_i}
    • x_{b,s}^M = (x(b), x(b+s), …, x(b+(M-1)s)) denotes the length-M subvector of x at base b and stride s; each stage computes x_{b,S}^{N_i} = WHT_{N_i} · x_{b,S}^{N_i}
  • Implementation (nested loop)

R = N; S = 1;

for i = t, …, 1

    R = R / N_i

    for j = 0, …, R-1

        for k = 0, …, S-1

            x_{j·N_i·S + k, S}^{N_i} = WHT_{N_i} · x_{j·N_i·S + k, S}^{N_i}

    S = S · N_i;

Partition Trees

[Figure: four example partition trees for n = 9, labeled Left Recursive, Right Recursive, Balanced, and Iterative]

Ordered Partitions
  • There is a 1-1 mapping from ordered partitions of n onto (n-1)-bit binary numbers.
  • There are 2^(n-1) ordered partitions of n.

162 = 1 0 1 0 0 0 1 0 (binary)

1|1 1|1 1 1 1|1 1  →  1 + 2 + 4 + 2 = 9
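A sketch of this bijection (the function name is my own): a set bit after position i inserts a bar, cutting the run of n ones into parts.

```python
# Sketch: map an (n-1)-bit number to the ordered partition of n it
# encodes; bit i (counted from the left) set = a bar after position i+1.

def ordered_partition(n, bits):
    parts, last = [], 0
    for i in range(n - 1):
        if (bits >> (n - 2 - i)) & 1:  # read bits left to right
            parts.append(i + 1 - last)
            last = i + 1
    parts.append(n - last)
    return parts

# 162 = 10100010 in binary -> 1|11|1111|11 -> 1+2+4+2 = 9
print(ordered_partition(9, 162))       # [1, 2, 4, 2]

# all 2^(n-1) bit patterns give distinct ordered partitions of n
n = 5
parts = {tuple(ordered_partition(n, b)) for b in range(2 ** (n - 1))}
assert len(parts) == 2 ** (n - 1) and all(sum(p) == n for p in parts)
```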

Enumerating Partition Trees

[Figure: the partition trees of n = 3, indexed by the 2-bit numbers 00, 01, 10, 11]

Search Space
  • Optimization of the WHT becomes a search, over the space of partition trees, for the fastest algorithm.
  • The number of trees:
Size of Search Space
  • Let T(z) be the generating function for T_n

T_n = Θ(α^n / n^(3/2)), where α = 4 + √8 ≈ 6.8284

  • Restricting to binary trees

T_n = Θ(5^n / n^(3/2))

WHT Package (Püschel & Johnson, ICASSP ’00)
  • Allows easy implementation of any of the possible WHT algorithms
  • Partition tree representation

W(n) = small[n] | split[W(n1), …, W(nt)]

  • Tools
    • Measure runtime of any algorithm
    • Measure hardware events
    • Search for good implementation
      • Dynamic programming
      • Evolutionary algorithm
Histogram (n = 16, 10,000 samples)
  • Wide range in performance despite equal number of arithmetic operations (n·2^n flops)
  • Pentium III consumes more run time (more pipeline stages)
  • UltraSPARC II spans a larger range
Operation Count

Theorem. Let W_N be a WHT algorithm of size N. Then the number of floating point operations (flops) used by W_N is N lg(N).

Proof. By induction.
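A sketch of the induction (my reconstruction; the slide only states the proof method): the base case WHT_2 uses 2 = 2 lg 2 flops (one add, one subtract), and each factor I ⊗ WHT_{N_i} ⊗ I applies WHT_{N_i} to N/N_i subvectors, so for a split N = N_1 ··· N_t:

```latex
\mathrm{flops}(W_N)
  = \sum_{i=1}^{t} \frac{N}{N_i}\,\mathrm{flops}(W_{N_i})
  = \sum_{i=1}^{t} \frac{N}{N_i}\, N_i \lg N_i
  = N \sum_{i=1}^{t} \lg N_i
  = N \lg (N_1 \cdots N_t)
  = N \lg N .
```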

Instruction Count Model
  • A(n) = number of calls to WHT procedure
  • α = number of instructions outside loops
  • A_l(n) = number of calls to base case of size l
  • α_l = number of instructions in base case of size l
  • L_i(n) = number of iterations of outer (i = 1), middle (i = 2), and inner (i = 3) loop
  • β_i = number of instructions in outer (i = 1), middle (i = 2), and inner (i = 3) loop body
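Presumably (my reading of the model, not stated verbatim in the transcript) these quantities combine into a total instruction count of the form

```latex
IC(n) \;=\; \alpha\,A(n) \;+\; \sum_{l} \alpha_l\,A_l(n) \;+\; \sum_{i=1}^{3} \beta_i\,L_i(n) .
```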
Small[1]

.file "s_1.c"
.version "01.01"
gcc2_compiled.:
.text
.align 4
.globl apply_small1
.type apply_small1,@function
apply_small1:
    movl 8(%esp),%edx      // load stride S into EDX
    movl 12(%esp),%eax     // load x array's base address into EAX
    fldl (%eax)            // st(0) = R7 = x[0]
    fldl (%eax,%edx,8)     // st(0) = R6 = x[S]
    fld %st(1)             // st(0) = R5 = x[0]
    fadd %st(1),%st        // R5 = x[0] + x[S]
    fxch %st(2)            // st(0) = x[0], st(2) = x[0] + x[S]
    fsubp %st,%st(1)       // st(0) = x[S] - x[0]
    fxch %st(1)            // st(0) = x[0] + x[S], st(1) = x[S] - x[0]
    fstpl (%eax)           // store x[0] = x[0] + x[S]
    fstpl (%eax,%edx,8)    // store x[S] = x[S] - x[0]
    ret

Histogram using Instruction Model (P3)

α_1 = 12, α_2 = 34, and α_3 = 106

α = 27

β_1 = 18, β_2 = 18, and β_3 = 20

Dynamic Programming

Cost(T_n)  =  min over n_1 + ··· + n_t = n of  Cost( split[ T_{n_1}, …, T_{n_t} ] ),

where T_n is the optimal tree of size n.

This depends on the assumption that Cost only depends on the size of a tree and not on where it is located (true for IC, but false for runtime).

For IC, the optimal tree is iterative with appropriate leaves.

For runtime, DP is a good heuristic (used with binary trees).
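A sketch of the DP over binary partition trees with a made-up cost function (the real search minimizes measured instruction counts or runtimes; the leaf costs and split overhead below are placeholder assumptions, only the search structure mirrors the slides):

```python
# Sketch: dynamic programming for the cheapest binary partition tree.
# LEAF_COST and the per-split overhead are HYPOTHETICAL stand-ins for
# measured costs. Trees use the deck's notation: [n] or [t1, t2].

from functools import lru_cache

LEAF_MAX = 6                                                # assumed
LEAF_COST = {m: m * 2 ** m for m in range(1, LEAF_MAX + 1)}  # made-up

@lru_cache(maxsize=None)
def best(n):
    """Return (cost, tree) for the cheapest binary tree of size n."""
    options = []
    if n <= LEAF_MAX:
        options.append((LEAF_COST[n], [n]))      # unrolled base case
    for n1 in range(1, n):                       # binary split n = n1 + n2
        n2 = n - n1
        c1, t1 = best(n1)
        c2, t2 = best(n2)
        # the size-n1 subtree is applied 2^n2 times and vice versa,
        # plus a hypothetical per-split overhead of 2^n
        options.append((c1 * 2 ** n2 + c2 * 2 ** n1 + 2 ** n, [t1, t2]))
    return min(options, key=lambda p: p[0])

cost, tree = best(16)
print(tree)   # a nested [[...],[...]] tree in the deck's notation
```

The memoization is what makes this dynamic programming: the cost of a subtree is assumed to depend only on its size, exactly the assumption the slide flags as false for real runtimes.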

Optimal Formulas

Optimal trees by size (N = 2^n):

 n   UltraSPARC              Pentium
 1   [1]                     [1]
 2   [2]                     [2]
 3   [3]                     [3]
 4   [4]                     [4]
 5   [5]                     [5]
 6   [6]                     [6]
 7   [[3],[4]]               [7]
 8   [[4],[4]]               [[4],[4]]
 9   [[4],[5]]               [[5],[4]]
10   [[5],[5]]               [[5],[5]]
11   [[5],[6]]               [[5],[6]]
12   [[4],[[4],[4]]]         [[2],[[5],[5]]]
13   [[[4],[5]],[4]]         [[2],[[5],[6]]]
14   [[4],[[5],[5]]]         [[2],[[2],[[5],[5]]]]
15   [[[5],[5]],[5]]         [[2],[[2],[[5],[6]]]]
16   [[[5],[5]],[6]]         [[2],[[2],[[2],[[5],[5]]]]]
17   [[4],[[[4],[5]],[4]]]   [[2],[[2],[[2],[[5],[6]]]]]
18   [[4],[[4],[[5],[5]]]]   [[2],[[2],[[2],[[2],[[5],[5]]]]]]
19   [[4],[[[5],[5]],[5]]]   [[2],[[2],[[2],[[2],[[5],[6]]]]]]
20   [[5],[[[5],[5]],[5]]]   [[2],[[2],[[2],[[2],[[2],[[5],[5]]]]]]]

Different Strides
  • The dynamic programming assumption does not hold in practice: execution time depends on the stride at which a subtree is applied, not only on its size.