### Research on Graph-Cut for Stereo Vision

### Review on Stereo Vision

### Hierarchical Exhaustive Search on

### Death Sentence to HES

### Partitioned (Block) Graph-Cut

### Hierarchical Parallel Graph-Cut

Presenter: Nelson Chang

Institute of Electronics,

National Chiao Tung University

Outline

- Research Overview
- Brief Review of Stereo Vision
- Hierarchical Exhaustive Search
- Partitioned Graph-Cut for Stereo Vision
- Hierarchical Parallel Graph-Cut

Our Research

HRP-2 Head

- A fast vision system for robotics
- Stereo vision
- Local block-based + diffusion (M)
- Graph-cut (PhD)
- Belief propagation (PhD)
- Segmentation
- Watershed (M)
- Meanshift
- Approaches
- Embedded solutions
- DSP (U)
- ASIC
- PC-based solutions
- Dual webcam stereo (U)

HRP-2 Tri-Camera Head

My Research

- A fast graph-cut VLSI engine for stereo vision
- ASIC approach
- Goal: 256x256 pixels, 30 depth labels, 30 fps
- Stereo vision system prototypes
- PC-based
- DSP-based
- FPGA/ASIC-based

Concept of Stereo Vision

- Computational Stereo – to determine the 3-D structure of a scene from 2 or more images taken from distinct viewpoints.

Triangulation of non-verged geometry

d : disparity

Z : depth

T : baseline

f : focal length

M. Z. Brown et al., “Advances in Computational Stereo,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 8, August 2003.
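For the non-verged (parallel-axis) geometry, the four symbols listed above are related by the standard triangulation formula Z = f·T/d. The sketch below only illustrates that relation; the function name and the choice of units (focal length in pixels, baseline in millimetres) are assumptions, not taken from the slides.

```python
def depth_from_disparity(d, f, T):
    """Depth Z for non-verged stereo geometry: Z = f * T / d.

    d: disparity in pixels
    f: focal length in pixels (assumed unit)
    T: baseline; Z is returned in the same unit as T
    """
    if d <= 0:
        # d = 0 corresponds to a point at infinity, so no finite depth exists
        raise ValueError("disparity must be positive")
    return f * T / d


# Example with made-up camera parameters: a larger disparity means a nearer point.
near = depth_from_disparity(d=60, f=500, T=120)
far = depth_from_disparity(d=30, f=500, T=120)
```

This matches the gray-scale convention of the disparity image: small d is far, large d is near.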

Disparity Image

- Disparity Map/Image
- The disparities of all the pixels in the image
- Example:

[Figure: left-camera and right-camera views of the same scene, with a 110-pixel offset marked between them, plus the left and right disparity maps. Gray scale: d=0 (farthest) to d=255 (nearest).]

Disparity map of the 4x4 block

0 0 0 0
0 0 110 0
0 100 138 0
80 123 156 176

How to find the disparity of a pixel? (1/2)

- Simple Local Method
- Block Matching
- SAD: Sum of Absolute Differences
- ∑|IL-IR|
- Find the candidate disparity with minimal SAD
- Assumption
- Disparities within a block should be the same
- Limitation
- Performs poorly in texture-less regions
- Performs poorly on repeating patterns
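A minimal sketch of SAD block matching on a single scanline, assuming the convention that pixel x in the left image corresponds to pixel x - d in the right image. The function names, the 1-D block, and the sample rows are illustrative assumptions; a real matcher would use 2-D blocks.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def best_disparity(left_row, right_row, x, half_width, max_disparity):
    """Return (disparity, SAD) of the candidate with minimal SAD.

    The block is centred at x in left_row and compared against the
    block shifted left by d pixels in right_row, for d = 0..max_disparity.
    """
    lo, hi = x - half_width, x + half_width + 1
    left_block = left_row[lo:hi]
    best = None
    for d in range(max_disparity + 1):
        if lo - d < 0:
            break  # candidate block would leave the image
        cost = sad(left_block, right_row[lo - d:hi - d])
        if best is None or cost < best[1]:
            best = (d, cost)
    return best


# Synthetic rows: the pattern 100, 200, 300 appears 2 pixels further right in the left row.
right_row = [0, 0, 100, 200, 300, 0, 0, 0, 0, 0]
left_row = [0, 0, 0, 0, 100, 200, 300, 0, 0, 0]
result = best_disparity(left_row, right_row, x=5, half_width=1, max_disparity=4)
```

On a texture-less row (all values equal) every candidate has the same SAD, which is the limitation noted in the list.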

[Figure: block-matching example comparing a block of the left image with candidate blocks of the right image. SAD=400 at d=k-1, SAD=0 at d=k, and SAD=600 at d=k+1, so d=k is selected.]

How to find the disparity of a pixel? (2/2)

- Complex Global Method
- Graph-cut, Belief Propagation
- Disparity estimation -> optimal labeling problem
- Assign the label (disparity) of each pixel such that a given global energy is minimal
- Energy is a function of the label set (disparity map/image)
- The energy considers the
- Intensity similarity of the corresponding pixel
- Example: Absolute Difference (AD), D=|IL-IR|
- Disparity smoothness of neighboring pixels
- Example: Potts Model

If (dL≠dR), V=K

else, V=0

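[Figure: Potts-model example. A pixel with unknown disparity (?) has four neighbors with disparities 0, 0, 16, and 32. Candidate d=0 gives V=2K, d=16 gives V=3K, d=32 gives V=3K, and d=2 gives V=4K.]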
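The Potts rule can be sketched in a few lines. The example reuses the neighbor disparities 0, 0, 16, and 32 and the candidate values from the slide; the function names are illustrative assumptions.

```python
def potts(d_p, d_q, K):
    """Potts smoothness cost between two neighboring pixels."""
    return K if d_p != d_q else 0


def smoothness_cost(candidate, neighbor_disparities, K):
    """Total V for one pixel: one Potts term per neighbor."""
    return sum(potts(candidate, d_q, K) for d_q in neighbor_disparities)


K = 1  # with K = 1 the result reads directly as a multiple of K
neighbors = [0, 0, 16, 32]
costs = {d: smoothness_cost(d, neighbors, K) for d in (0, 16, 32, 2)}
```

Because the Potts model only tests equality, d=2 is penalized more than d=0 even though it is numerically close to two of the neighbors.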

Swap and Expansion Moves

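- Weak move
- Modifies 1 label at a time
- Standard move
- Strong move
- Modifies multiple labels at a time
- Proposed swap and expansion moves

[Figure: energy (E) curve showing the initial labeling and the minima reached by weak and strong moves. Strong moves have more chances of finding a lower local minimum.]

[Figure: example labelings: initial labeling, standard move, α-β swap, α expansion.]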

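4-connected structure

- Most common graph/MRF (BP) structure in stereo

[Figure: 2-variable graph-cut graph with source (α) and sink (α’) terminals; each node has terminal links (D, D’) and four neighbor links (V).]

[Figure: MRF in belief propagation with observable nodes, hidden nodes, a data link (D), and four neighbor links (V). D and V are vectors.]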

Outline

- Combinatorial Optimization
- Graph-Cut
- Exhaustive Search
- Iterated Conditional Modes
- Hierarchical Exhaustive Search
- Result
- Summary & Next Step

Combinatorial Optimization

- Determine a combination (pattern, set of labels) such that the energy of this combination is minimum
- Example: 4-bit binary label problem
- Find a label-set which yields the minimal energy
- Each individual bit can be set as 0 or 1
- Each label corresponds to an energy cost
- Neighboring bit pairs should preferably share the same label (smoothness)

[Figure: chain of 4 bits (0 to 3) with unknown labels. Label-0 costs shown: 99, 92, 101, 100. Label-1 costs shown: 100, 79, 98, 114. Smoothness cost between neighboring bits: 10.]

Energy(0000) = 99+92+100+101 = 392

Energy(0001) = 99+92+100+98+10 = 399

Exhaustive Search

- List all the combinations and corresponding energy
- Example: 1100 has the minimal energy of 390
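A minimal sketch of exhaustive search on the 4-bit example. The per-bit cost assignment below is an assumption: it is inferred from the energy sums Energy(0000) = 392 and Energy(0001) = 399 and from the comparisons in the ICM example, with label strings read as bit0 bit1 bit2 bit3. The function names are illustrative.

```python
from itertools import product

# Assumed per-bit data costs (bit 0 .. bit 3), inferred from the slide's energy sums.
D0 = [99, 92, 100, 101]   # cost of assigning label 0
D1 = [100, 79, 114, 98]   # cost of assigning label 1
SMOOTH = 10               # cost when two neighboring bits differ


def energy(labels):
    """Data cost of every bit plus smoothness cost of every differing neighbor pair."""
    data = sum(D1[i] if b else D0[i] for i, b in enumerate(labels))
    smooth = sum(SMOOTH for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth


def exhaustive_search(n_bits=4):
    """Evaluate all 2**n_bits label sets and return the one with minimal energy."""
    return min(product((0, 1), repeat=n_bits), key=energy)


best = exhaustive_search()
```

The search visits 2^n label sets, which is why it only scales to very small graphs.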

Iterated Conditional Modes

- Iteratively finds the best label under the current given condition
- Greedy
- Different starting decisions (initial conditions) lead to different results
- May only find a local minimum
- Example:
- Start with bit 1 because it is more reliable
- Iteration order: bit1 -> bit0 -> bit2 -> bit3
- Final solution: 1100

Steps of the example (the label belonging to each cost is given in parentheses):

- bit 1: 79 (1) < 92 (0), so bit 1 = 1
- bit 0: 100 (1) < 99+10 (0), so bit 0 = 1
- bit 2: 100+10 (0) < 114 (1), so bit 2 = 0
- bit 3: 101 (0) < 98+10 (1), so bit 3 = 0
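The ICM walk-through can be sketched as follows. The per-bit costs are the assumed assignment inferred from the slide's comparisons (for example 79 vs. 92 for bit 1), neighbors that have no label yet are ignored, and sweeps repeat until nothing changes; these choices and the function names are assumptions, not the presenter's implementation.

```python
# Assumed per-bit data costs (bit 0 .. bit 3), inferred from the slide's comparisons.
D0 = [99, 92, 100, 101]   # cost of assigning label 0
D1 = [100, 79, 114, 98]   # cost of assigning label 1
SMOOTH = 10               # cost when two neighboring bits differ


def local_cost(i, label, labels):
    """Data cost of bit i plus smoothness cost against already-labeled neighbors."""
    cost = D1[i] if label else D0[i]
    for j in (i - 1, i + 1):
        if 0 <= j < len(labels) and labels[j] is not None and labels[j] != label:
            cost += SMOOTH
    return cost


def icm(order, max_sweeps=10):
    """Greedy ICM: visit bits in the given order, keep the locally best label."""
    labels = [None] * len(D0)
    for _ in range(max_sweeps):
        changed = False
        for i in order:
            best = min((0, 1), key=lambda lab: local_cost(i, lab, labels))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break  # converged: no bit wants to change its label
    return labels


solution = icm(order=[1, 0, 2, 3])
```

Each decision only looks at one bit and its neighbors, so the result depends on the visiting order and on the labels fixed earlier.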

Exhaustive Search Engine

- Exhaustive search can be implemented in hardware
- Less sequential dependency
- Not suitable for graphs larger than 4x4

Result of fully connected graph, NOT 4-connected graph

Hierarchical Graph-Cut?

- Solve large n graph with multiple small n GCE hierarchically
- Example:
- Solve n=16 with 4+1 n=4 graph-cuts

For each sub-graph, find the best 2 label-sets

[Figure: Sub-graph 0, Sub-graph 1, Sub-graph 2, Sub-graph 3]

For each sub-graph vertex

- Label 0 = 1st label set
- Label 1 = 2nd label set

Assumption:

!! The optimal solution must be within the combinations of sub-graph label sets !!

HGC Speed up Evaluation

- For an 8-point GCE with 8-set of ECUs
- Cost: 300 eq. adders
- Latency: 41 cycles per graph
- If only 1 GCE is used to compute 64-point 2 variable graph-cut

Latency = 41 cycles x 8 + 41 cycles + Tv = 369 cycles + Tv

If V is computed for each pixel

Tv=(8x8)X(8x7/2)X4=3584

Total Latency ~ 3953 cycles

Question: Is this solution the optimal label set for n=64???

Hierarchical Exhaustive Search

pat0 is the best candidate pattern

pat1 is 2nd best candidate pattern

- 64x64 nodes
- 4x4 based pyramid structure
- 3 levels

Level 2

- D@lv2 = E0/E1@lv1
- Label0@lv2 = pat0@lv1
- Label1@lv2 = pat1@lv1

Level 1

- D@lv1 = E0/E1@lv0
- Label0@lv1 = pat0@lv0
- Label1@lv1 = pat1@lv0

Level 0

- D@lv0 = D0/D1@lv0
- Label0@lv0 = Label0
- Label1@lv0 = Label1

Computing V term at Level 1

- For 1st order neighboring sub-graphs Gi and Gj
- Possible neighboring pair combinations
- (pat0i, pat0j)
- (pat0i, pat1j)
- (pat1i, pat0j)
- (pat1i, pat1j)
- Compute V(patXi,patXj) with original neighboring cost
- Example:
- V(pat0i, pat0j) = K
- V(pat0i, pat1j) = K+K+K = 3K

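[Figure: two neighboring 4x4 sub-graphs Gi and Gj; only the facing boundary columns are shown, all other nodes are marked "?".]

- pat0i vs. pat0j: boundary column of Gi = 0, 0, 0, 1; boundary column of Gj = 0, 0, 1, 1; 1 mismatch, so V = K
- pat0i vs. pat1j: boundary column of Gi = 0, 0, 0, 1; boundary column of Gj = 1, 0, 1, 0; 3 mismatches, so V = 3K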
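The level-1 V term can be sketched as a Potts sum over the facing boundary nodes of two neighboring sub-graphs. The boundary columns below are read from the figure (Gi's right column for pat0i, Gj's left column for pat0j and pat1j); the function name and K=20 are illustrative assumptions.

```python
def boundary_cost(boundary_i, boundary_j, K):
    """V between two candidate patterns: K for every facing node pair that differs."""
    return sum(K for a, b in zip(boundary_i, boundary_j) if a != b)


K = 20
pat0i_right = [0, 0, 0, 1]   # right boundary column of Gi under pat0i
pat0j_left = [0, 0, 1, 1]    # left boundary column of Gj under pat0j
pat1j_left = [1, 0, 1, 0]    # left boundary column of Gj under pat1j

v_00 = boundary_cost(pat0i_right, pat0j_left, K)   # V(pat0i, pat0j)
v_01 = boundary_cost(pat0i_right, pat1j_left, K)   # V(pat0i, pat1j)
```

Only the boundary columns enter the computation, so interior nodes of either sub-graph can stay unknown.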

Result of 16x16 (256) 2 level HES

- 100 randomly generated graphs
- D/V~ 10
- Symmetric V=20
- Error Rate
- Max: 17/256 ~ 6.6%
- Average: 7/256 ~ 2.8%
- Min: 2/256 ~ 0.8%

Result of 64x64 (4096) 3 level HES

- 100 randomly generated graphs
- D/V~ 10
- Symmetric V=20
- Error Rate
- Max: 185/4096 ~ 4.5%
- Average: 146/4096 ~ 3.6%
- Min: 115/4096 ~ 2.8%

Error Rate vs. Graph Size

Error rate range became smaller

- (D,V)=(~163:20)

3.63 vs. 3.65

Error rate did not increase significantly

Impact of different V cost

- 64x64(3 level) HES
- 100 patterns per V cost value
- D cost (average over s-link caps of 10 patterns, 2 for each V)
- Average: 162.8
- Std.Dev: 94.4
- V cost
- 10, 20, 40, 60, 80

Stereo Matching Case

- Stereo Pair: Tsukuba
- Expansion with random label order
- 15 labels -> 15 graph-cut computations
- Graph Size: 256 x 256
- D term: truncated Sum of Squared Error (tSSE)
- Truncated at AD=20
- V term: Potts model
- K=20

1st iteration result

- Error rate might exceed 20% for important expansion moves

[Figure: 1st iteration result compared with BnK’s expansion result; the important expansions are the panels marked 5, 4, and 9.]

Reason for failure

- The best 2 local candidates do NOT include the final optimal solution
- Errors often happen near lv2 and lv3 block boundaries
- The majority of nodes have zero source and sink link capacity
- More dependent on the neighboring nodes’ labels
- D:V ratio ~ 56:20 = 2.8:1
- Similar to D:V = 163:60 case
- Error rate for random pattern ~ 15%

Best 2 patterns in does NOT consider the pattern of

Motivation

- Global
- Considers the whole picture
- More information
- Local
- Considers a limited region of a picture
- Less information

Is it necessary to use that much information in global methods??

Concept

- Original full GC
- 1 big graph
- Partitioned GC
- N smaller graphs

What’s the smallest possible partition to achieve the same performance?

Experiment Setting

- Energy
- D term
- Luma only
- Birchfield-Tomasi cost (best result at half-pel position)
- Square Error
- V term
- Potts Model V= K x T(di≠dj)
- The constant K is the same for all partitions
- Partition Size
- 4x4, 16x16, 32x32, 64x64, 128x128
- Stereo Pairs
- Tsukuba, Teddy, Cones, Venus

Middlebury Result

Evaluation Web Page http://cat.middlebury.edu/stereo/

Best: Full GC with best parameter

Full: Full GC with K=20 (Tsukuba) and K=60 (others)

Summary

- Smallest possible partition size (2% accuracy drop)
- Tsukuba: 64x64
- Teddy & Cones: 96x96
- Venus: larger than 128x128
- Benefits
- Possible complexity or storage reduction
- Parallelism increase
- Drawbacks
- Performance (disparity accuracy) drop
- PC computation time becomes longer

Concept of Hierarchical Parallel GC

- Bottom Up
- Solve graph-cut for smaller subgraphs
- Solve graph-cut for larger subgraphs
- Larger subgraphs = set of neighboring smaller subgraphs

!! Each subgraph is temporarily independent !!

[Figure: four Level 0 subgraphs (sg0, sg1, sg2, sg3) merge into one Level 1 subgraph: larger subgraph = sg0+sg1+sg2+sg3.]

HPGC for solving a 256x256 graph

- Step 1: 64 32x32 Lv0 subgraphs
- Step 2: 16 64x64 Lv1 subgraphs
- Step 3: 4 128x128 Lv2 subgraphs
- Step 4: 1 256x256 Lv3 subgraph

Total graph-cut computations = 64+16+4+1 = 85

!! HPGC must use Ford-Fulkerson-based methods !!
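The bottom-up schedule can be reproduced with a short loop, assuming square subgraphs whose side doubles at every level. The function name and parameters are illustrative.

```python
def hpgc_schedule(full_size=256, base_size=32):
    """Return [(subgraph side, number of subgraphs)] from the lowest level upward."""
    levels = []
    size = base_size
    while size <= full_size:
        per_side = full_size // size          # subgraphs along one image side
        levels.append((size, per_side * per_side))
        size *= 2                             # each level merges 2x2 subgraphs
    return levels


schedule = hpgc_schedule()
total_cuts = sum(count for _, count in schedule)
```

Subgraphs of the same level do not depend on each other, so all entries of one level could in principle be assigned to parallel processing elements.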

Boykov and Kolmogorov’s Motivation

- Dinic Method
- Search the shortest augmenting path
- Use Breadth First Search (BFS)
- Example:
- Search shortest path (length = k)
- Use BFS, expand the search tree
- Find all paths of length k
- Search shortest path (length = k+1)
- Use BFS, RE-expand the search tree again
- Find all paths of length (k+1)
- Search shortest path (length = k+2)
- Use BFS, RE-RE-expand the search tree again
- …..

Why don’t we REUSE the expanded tree?
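The cost of re-expansion is visible in a plain shortest-augmenting-path max-flow. The sketch below is a simplified stand-in (one BFS path per iteration, in the style of Edmonds-Karp), not Dinic's full blocking-flow method and not BnK's algorithm; the graph representation and the example capacities are assumptions.

```python
from collections import deque


def bfs_augmenting_path(residual, source, sink):
    """Shortest source->sink path over edges with remaining capacity, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == sink:
            break
        for v, cap in residual[u].items():
            if cap > 0 and v not in parent:
                parent[v] = u
                queue.append(v)
    if sink not in parent:
        return None
    path, v = [], sink
    while parent[v] is not None:
        path.append((parent[v], v))
        v = parent[v]
    return path[::-1]


def max_flow(capacity, source, sink):
    """Repeat: build a BFS tree from scratch, augment along the path found."""
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edges
    flow = 0
    while True:
        path = bfs_augmenting_path(residual, source, sink)  # tree is rebuilt every time
        if path is None:
            return flow
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck


capacity = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}, "t": {}}
flow_value = max_flow(capacity, "s", "t")
```

Every call to `bfs_augmenting_path` discards the previous search tree, which is the repeated work that tree reuse aims to avoid.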

BnK’s Method

- Concept:
- Reuse the already expanded trees
- Avoid re-expanding the trees from scratch
- 3 stages
- Growth
- Grow the search tree
- Augmentation
- Ford-Fulkerson style augmentation
- Adoption
- Reconnect the unconnected sub-trees
- Connect the orphans to a new parent

Augmenting Path

Saturate Critical Edge

Adopt Orphans

Feature of BnK method

- Based on Ford-Fulkerson
- Bidirectional search tree construction
- Searched tree reuse
- Determine label (source or sink) using tree connectivity

Source tree

Sink tree

Case 2

Case 3

Case 4

Connectivity is why HPGC works

- Example: a 2x4 binary variable graph

Graph view

Tree view

How to add edges

- When should nodes A and B check their edge?
- If A & B belong to different search trees
- A is in a sink tree, B is in a source tree
- A is in a source tree, B is in a sink tree
- Implies a source->sink path
- If A or B is an orphan (not connected to any tree)
- A is an orphan, B is not an orphan
- A is not an orphan, B is an orphan
- Check for possible connectivity of the orphan
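The edge-check rule in the list above can be written as a small predicate. The encoding of tree membership ("source", "sink", or None for an orphan) and the function name are assumptions made for illustration.

```python
SOURCE, SINK, ORPHAN = "source", "sink", None


def should_check_edge(tree_a, tree_b):
    """Decide whether the edge between nodes A and B needs to be examined.

    tree_a / tree_b: the search tree each node belongs to, or ORPHAN.
    """
    a_orphan = tree_a is ORPHAN
    b_orphan = tree_b is ORPHAN
    if a_orphan or b_orphan:
        # Exactly one orphan: the other node may become its new parent.
        return a_orphan != b_orphan
    # Both in trees: different trees imply a source->sink path through this edge.
    return tree_a != tree_b
```

Two nodes in the same tree are skipped because joining them neither creates a source->sink path nor rescues an orphan.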

Complexity Result

- Method
- Annotate each line of code with basic operations
- Read
- Write
- Arithmetic
- Logic
- Compare
- Branch
- Examples
- C=A+B -> 2R, 1W, 1A
- If(A==B) -> 2R, 1C, 1B
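A sketch of how such per-line annotations could be tallied into an operation distribution. Only the two annotated statements from the examples are included; the table layout, the function name, and the three-statement trace are hypothetical.

```python
from collections import Counter

# Operation classes: R = read, W = write, A = arithmetic, L = logic, C = compare, B = branch
OP_COSTS = {
    "C=A+B": Counter(R=2, W=1, A=1),
    "If(A==B)": Counter(R=2, C=1, B=1),
}


def tally(executed_statements):
    """Sum the annotated operation counts over a trace of executed statements."""
    total = Counter()
    for statement in executed_statements:
        total += OP_COSTS[statement]
    return total


# Hypothetical trace: two additions and one comparison.
distribution = tally(["C=A+B", "If(A==B)", "C=A+B"])
memory_share = (distribution["R"] + distribution["W"]) / sum(distribution.values())
```

Even in this tiny trace, reads and writes make up most of the operations, because every statement touches its operands in memory.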

Stereo Matching Case

- Stereo Pair: Tsukuba
- Expansion with random label order
- 15 labels -> 15 graph-cut computations
- Graph Size: 256 x 256
- D term: truncated Sum of Squared Error (tSSE)
- Truncated at AD=20
- V term: Potts model
- K=20

Full BnK Graph-cut Operation Distribution

- 256x256 graph – Tsukuba iteration 0 label 5

77,407,307 Operations

Memory access dominant

Control ~22%

Arithmetic is insignificant

Conclusion

- HPGC can improve speed with multiple PEs
- To achieve 30 fps with 30 labels on a 256x256 graph-cut
- 1 PE @ 100 MHz
- Average cycle budget for each subgraph ~1.3K cycles
- Lv0 subgraph is 32x32
- Next step
- Small BnK graph-cut engine architecture design
- Estimate speed/cost
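The ~1.3K-cycle figure can be reproduced under two assumptions that the slides do not state explicitly: one graph-cut per label per frame, and 85 subgraph computations per graph-cut (the HPGC total for a 256x256 graph with 32x32 Lv0 subgraphs). The function name and parameters are illustrative.

```python
def cycle_budget_per_subgraph(clock_hz=100_000_000, fps=30, labels=30,
                              subgraphs_per_cut=85):
    """Average cycles one PE can spend on each subgraph computation."""
    cuts_per_second = fps * labels                 # assumed: one graph-cut per label per frame
    subgraphs_per_second = cuts_per_second * subgraphs_per_cut
    return clock_hz / subgraphs_per_second


budget = cycle_budget_per_subgraph()   # about 1.3K cycles
```

With several PEs working on independent subgraphs of the same level, the per-subgraph budget would grow roughly in proportion, which is the speed-up argument of the first bullet.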

Progress Check

- Previous plan
- Parallel graph-cut engine for binary-variable graph
- Based on Boykov and Kolmogorov’s graph cut algorithm
- Complexity analysis: done
- Hierarchical parallel algorithm SW model: done
- Small BnK graph-cut engine architecture design
- Based on Boykov and Kolmogorov’s algorithm: next 2 weeks
- Hierarchical parallel graph-cut engine architecture design
- Based on my hierarchical parallel algorithm modified from BnK’s algorithm: June/July
