
Research on Graph-Cut for Stereo Vision

Presenter: Nelson Chang

Institute of Electronics,

National Chiao Tung University

Outline
  • Research Overview
  • Brief Review of Stereo Vision
  • Hierarchical Exhaustive Search
  • Partitioned Graph-Cut for Stereo Vision
  • Hierarchical Parallel Graph-Cut
Our Research

HRP-2 Head

  • A fast vision system for robotics
    • Stereo vision
      • Local block-based + diffusion (M)
      • Graph-cut (PhD)
      • Belief propagation (PhD)
    • Segmentation
      • Watershed (M)
      • Meanshift
  • Approaches
    • Embedded solutions
      • DSP (U)
      • ASIC
    • PC-based solutions
      • Dual webcam stereo (U)

HRP-2 Tri-Camera Head

My Research
  • A fast graph-cut VLSI engine for stereo vision
    • ASIC approach
    • Goal: 256x256 pixels, 30 depth labels, 30 fps
  • Stereo vision system prototypes
    • PC-based
    • DSP-based
    • FPGA/ASIC-based

Review on Stereo Vision



Concept of Stereo Vision
  • Computational Stereo – determining the 3-D structure of a scene from 2 or more images taken from distinct viewpoints.

Triangulation of non-verged geometry: d = disparity, Z = depth, T = baseline, f = focal length.

M. Z. Brown et al., “Advances in Computational Stereo,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 8, August 2003.
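The symbols above obey the standard non-verged triangulation relation Z = f·T/d. A minimal sketch; the f, T, and d values below are illustrative assumptions, not numbers from the slides:

```python
def depth_from_disparity(d, f, T):
    # Non-verged triangulation: depth Z = focal length * baseline / disparity
    return f * T / d

# Illustrative values: f = 500 (pixels), T = 0.1 (meters)
print(depth_from_disparity(d=10, f=500, T=0.1))  # 5.0 (meters)
```

Larger disparity means a nearer point, i.e. smaller Z.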

Disparity Image
  • Disparity Map/Image
    • The disparities of all the pixels in the image
  • Example:

[Figure: left and right camera images; a scene point shifts 110 pixels between the two views. Left and right disparity maps of a 4x4 block contain values such as 0, 80, 100, 110, 123, 138, 156, 176; grayscale encodes d = 0 (farthest) to d = 255 (nearest).]

How to find the disparity of a pixel? (1/2)
  • Simple Local Method
    • Block Matching
      • SAD: Sum of Absolute Differences
        • ∑|IL-IR|
      • Pick the candidate disparity with the minimal SAD
    • Assumption
      • Disparities within a block should be the same
    • Limitations
      • Works poorly in texture-less regions
      • Works poorly on repeating patterns

[Figure: block matching example — comparing a left block against right-image blocks at candidate disparities gives SAD = 400 at d = k-1, SAD = 0 at d = k, and SAD = 600 at d = k+1, so d = k is chosen.]
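The block-matching search can be sketched on one scanline; the pixel values and block position below are made-up, not the slide's:

```python
def sad(a, b):
    # Sum of Absolute Differences between two equal-length pixel lists
    return sum(abs(x - y) for x, y in zip(a, b))

def best_disparity(left, right, x, block=2, max_d=3):
    # For a left block starting at column x, try each candidate disparity d
    # (the matching right block starts at x - d) and keep the minimal-SAD one.
    costs = {}
    for d in range(max_d + 1):
        if x - d >= 0 and x - d + block <= len(right):
            costs[d] = sad(left[x:x + block], right[x - d:x - d + block])
    return min(costs, key=costs.get)

left = [5, 9, 40, 7, 13, 80]             # one scanline of the left image
right = [40, 7, 13, 80]                  # right scanline: left shifted by disparity 2
print(best_disparity(left, right, x=2))  # 2
```

With a texture-less or repeating scanline several disparities would tie at the minimal SAD, which is exactly the limitation noted above.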

How to find the disparity of a pixel? (2/2)
  • Complex Global Method
    • Graph-cut, Belief Propagation
  • Disparity Estimation → Optimal Labeling Problem
    • Assign the label (disparity) of each pixel such that a given global energy is minimized
      • The energy is a function of the label set (disparity map/image)
      • The energy considers
        • Intensity similarity of the corresponding pixels
          • Example: Absolute Difference (AD), D=|IL-IR|
        • Disparity smoothness of neighboring pixels
          • Example: the Potts Model

Potts model: if (dL≠dR), V=K; else, V=0.

[Figure: a pixel "?" whose neighbors are labeled 0, 0, 16, 32; candidate labels give d=0 → V=2K, d=16 → V=3K, d=32 → V=3K, d=2 → V=4K.]
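The D and V terms combine into one global energy. A minimal 1-D sketch; the data costs below are hypothetical, only K = 20 matches the value used in the later experiments:

```python
K = 20  # Potts constant (the value the later Tsukuba experiments use)

def potts(di, dj, k=K):
    # V term: K if the neighboring disparities differ, else 0
    return k if di != dj else 0

def energy(labels, data_cost, k=K):
    # Global energy = sum of D terms + sum of V terms over 1-D neighbors
    e = sum(data_cost[i][labels[i]] for i in range(len(labels)))
    e += sum(potts(a, b, k) for a, b in zip(labels, labels[1:]))
    return e

# Hypothetical D costs for 3 pixels and 2 candidate disparities (0 and 1)
D = [{0: 5, 1: 50}, {0: 40, 1: 10}, {0: 5, 1: 50}]
print(energy([0, 0, 0], D), energy([0, 1, 0], D))  # 50 60
```

Here the smoothness term makes the uniform labeling cheaper even though the middle pixel's own data cost prefers label 1.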

Swap and Expansion Moves

Strong moves have a better chance of reaching a lower local minimum.

  • Weak move
    • Modifies 1 label at a time
    • The standard move
  • Strong move
    • Modifies multiple labels at a time
    • The proposed α-β swap and α-expansion moves

[Figure: energy E over labelings — from the initial labeling, the standard (weak) move and the strong moves (α-β swap, α-expansion) reach different minima.]


4-connected structure
  • Most common graph/MRF(BP) structure in stereo

[Figure: 4-connected structure — each pixel node carries a data term D and V terms to its four neighbors.]

2-variable Graph-Cut: [Figure: a source α and a sink α′ terminal connected to the pixel nodes.]

MRF in Belief Propagation: [Figure: observable and hidden nodes; here D and V are vectors.]

Hierarchical Exhaustive Search

Outline
  • Combinatorial Optimization
  • Graph-Cut
  • Exhaustive Search
  • Iterated Conditional Modes
  • Hierarchical Exhaustive Search
  • Result
  • Summary & Next Step
[Figure: a 4-bit chain with node indices 0-3; label-0 costs D0 = (99, 92, 100, 101), label-1 costs D1 = (100, 79, 114, 98), and smoothness cost 10 between neighboring bits that differ.]

Combinatorial Optimization
  • Determine a combination (pattern, set of labels) such that the energy of this combination is minimal
  • Example: 4-bit binary label problem
    • Find a label-set which yields the minimal energy
      • Each individual bit can be set to 0 or 1
        • Each label corresponds to an energy cost
      • Each neighboring bit pair should preferably have the same label (smoothness)

Energy(0000) = 99+92+100+101 = 392

Energy(0001) = 99+92+100+98+10 = 399


Graph-Cut
  • Formulate the previous problem as a graph-cut problem
    • Find the cut with minimum total capacity (cost, energy)
  • Solving the graph-cut: the Ford-Fulkerson method

0

3

13

12

2

?

?

?

?

10

10

10

9

7

0

1

2

3

14

4

1

1

1

Total Flow Pushed

=

99+79+100+98

+1

+10

+3

=390 Max Flow (Energy of the cut 1100)
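The max-flow value can be reproduced with a plain BFS-based Ford-Fulkerson (Edmonds-Karp) sketch. Assumptions: one standard t-link construction (source links carry the label-1 costs, sink links the label-0 costs), and the D1 ordering is reconstructed from the worked energy examples:

```python
from collections import deque, defaultdict

def max_flow(cap, s, t):
    # Edmonds-Karp: repeatedly push flow along shortest augmenting paths (BFS)
    total = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in list(cap[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:          # no augmenting path left
            return total
        bn, v = float("inf"), t      # bottleneck capacity along the path
        while parent[v] is not None:
            bn = min(bn, cap[parent[v]][v])
            v = parent[v]
        v = t                        # augment and add residual capacity back
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bn
            cap[v][u] += bn
            v = u
        total += bn

# t-links: source->i carries D_i(1), i->sink carries D_i(0); n-links carry V=10
D0, D1, V = [99, 92, 100, 101], [100, 79, 114, 98], 10
cap = defaultdict(lambda: defaultdict(int))
for i in range(4):
    cap["s"][i] += D1[i]
    cap[i]["t"] += D0[i]
for i in range(3):
    cap[i][i + 1] += V
    cap[i + 1][i] += V
result = max_flow(cap, "s", "t")
print(result)  # 390, the energy of the minimal cut 1100
```

Nodes ending on the sink side of the cut take label 1, so the cut of capacity 390 corresponds to the labeling 1100.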


Exhaustive Search
  • List all the combinations and their corresponding energies
  • Example: 1100 has the minimal energy of 390
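The exhaustive table is small enough to brute-force directly; the costs are reconstructed from the worked energy examples:

```python
from itertools import product

D0 = [99, 92, 100, 101]   # data cost of label 0 for bits 0..3
D1 = [100, 79, 114, 98]   # data cost of label 1 for bits 0..3
V = 10                    # smoothness cost per differing neighbor pair

def energy(bits):
    e = sum(D1[i] if b else D0[i] for i, b in enumerate(bits))
    return e + sum(V for a, b in zip(bits, bits[1:]) if a != b)

best = min(product((0, 1), repeat=4), key=energy)
print(best, energy(best))  # (1, 1, 0, 0) 390
```

All 2^4 = 16 combinations are scored, matching the minimum of 390 at 1100 (0000 scores 392).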

Iterated Conditional Modes
  • Iteratively find the best label under the currently given condition
    • Greedy
    • Different starting decisions (initial conditions) yield different results
    • Converges only to a local minimum
  • Example:
    • Start with bit 1 because it is more reliable
    • Iteration order: bit1 → bit0 → bit2 → bit3
    • Final solution: 1100

bit1: 79 (label 1) < 92 (label 0) → 1
bit0: 100 (label 1) < 99+10 (label 0) → 1
bit2: 100+10 (label 0) < 114 (label 1) → 0
bit3: 101 (label 0) < 98+10 (label 1) → 0

Exhaustive Search Engine
  • Exhaustive search can be implemented in hardware
  • Less sequential dependency
  • Not suitable for graphs larger than 4x4

(Result for a fully connected graph, NOT a 4-connected graph)


Hierarchical Graph-Cut?
  • Solve a large-n graph with multiple small-n GCEs hierarchically
  • Example:
    • Solve n=16 with 4+1 n=4 graph-cuts

For each sub-graph (sub-graph 0 … sub-graph 3), find the best 2 label-sets.

For each sub-graph vertex: Label 0 = 1st label set, Label 1 = 2nd label set.

Assumption: !! The optimal solution must be within the combinations of sub-graph label sets !!
HGC Speed-up Evaluation
  • For an 8-point GCE with 8 sets of ECUs
    • Cost: 300 eq. adders
    • Latency: 41 cycles per graph
  • If only 1 GCE is used to compute a 64-point 2-variable graph-cut

Latency = 41 cycles × 8 + 41 cycles + TV = 369 cycles + TV

If V is computed for each pixel: TV = (8×8) × (8×7/2) × 4 = 3584

Total latency ~ 3953 cycles

Question: Is this solution the optimal label set for n=64?

Hierarchical Exhaustive Search

pat0 is the best candidate pattern; pat1 is the 2nd-best candidate pattern.

  • 64x64 nodes
    • 4x4-based pyramid structure
    • 3 levels

Level 2: D@lv2 → E0/E1@lv1; Label0@lv2 → pat0@lv1; Label1@lv2 → pat1@lv1

Level 1: D@lv1 → E0/E1@lv0; Label0@lv1 → pat0@lv0; Label1@lv1 → pat1@lv0

Level 0: D@lv0 → D0/D1@lv0; Label0@lv0 → Label0; Label1@lv0 → Label1

Computing the V term at Level 1
  • For 1st-order neighboring sub-graphs Gi and Gj
    • 4 possible neighboring pair combinations
      • (pat0i, pat0j)
      • (pat0i, pat1j)
      • (pat1i, pat0j)
      • (pat1i, pat1j)
  • Compute V(patXi, patXj) with the original neighboring cost
    • Example:
      • V(pat0i, pat0j) = K
      • V(pat0i, pat1j) = K+K+K = 3K

[Figure: the boundary labels of Gi's patterns (pat0i, pat1i) placed against Gj's patterns (pat0j, pat1j); only the label pairs across the shared boundary contribute to V.]
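With a Potts V, the inter-sub-graph term reduces to counting differing label pairs along the shared boundary. A sketch; the boundary columns below are hypothetical, chosen only to reproduce the K and 3K examples above:

```python
K = 20  # Potts constant (hypothetical value)

def boundary_v(col_i, col_j, k=K):
    # V between neighboring sub-graphs: K per differing label pair on the boundary
    return k * sum(a != b for a, b in zip(col_i, col_j))

pat0i = [0, 0, 0, 1]   # hypothetical boundary column of Gi's best pattern
pat0j = [0, 0, 1, 1]   # hypothetical boundary column of Gj's best pattern
pat1j = [1, 1, 0, 0]   # hypothetical boundary column of Gj's 2nd pattern
print(boundary_v(pat0i, pat0j), boundary_v(pat0i, pat1j))  # 20 60  (K and 3K)
```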

Result of 16x16 (256) 2 level HES
  • Randomly generated 100 graphs
    • D/V~ 10
    • Symmetric V=20
  • Error Rate
    • Max: 17/256 ~ 6.6%
    • Average: 7/256 ~ 2.8%
    • Min: 2/256 ~ 0.8%
Result of 64x64 (4096) 3 level HES
  • Randomly generated 100 graphs
    • D/V~ 10
    • Symmetric V=20
  • Error Rate
    • Max: 185/4096 ~ 4.5%
    • Average: 146/4096 ~ 3.6%
    • Min: 115/4096 ~ 2.8%
Death Sentence to HES

Error Rate vs. Graph Size

Error rate range became smaller

  • (D,V)=(~163:20)

3.63 vs. 3.65

Error rate did not increase significantly

Impact of Different V Cost

[Figure: 256x256, 1-pattern result.]

  • 64x64 (3-level) HES
    • 100 patterns per V cost value
  • D cost (averaged over s-link caps of 10 patterns, 2 for each V)
    • Average: 162.8
    • Std. Dev: 94.4
  • V cost
    • 10, 20, 40, 60, 80
Stereo Matching Case
  • Stereo Pair: Tsukuba
  • Expansion with random label order
    • 15 labels → 15 graph-cut computations
  • Graph Size: 256 x 256
  • D term: truncated Sum of Squared Error (tSSE)
    • Truncated at AD=20
  • V term: Potts model
    • K=20
1st iteration result

  • Error rate might exceed 20% for important expansion moves

[Figure: BnK's expansion results; the important expansions are labels 4, 5, and 9.]

Reason for Failure
  • The best 2 local candidates do NOT include the final optimal solution
    • Errors often happen near lv2 and lv3 block boundaries
      • Most nodes have both zero source- and zero sink-link capacity
      • More dependent on neighboring nodes' labels
    • D:V ratio ~ 56:20 → 2.8:1
      • Similar to the D:V = 163:60 case
      • Error rate for random patterns ~ 15%

The best 2 patterns of one sub-graph do NOT consider the patterns of its neighbors.

Partitioned (Block) Graph-Cut

Motivation
  • Global
    • Considers the whole picture
    • More information
  • Local
    • Considers a limited region of a picture
    • Less information

Is it necessary to use that much information in global methods?

Concept
  • Original full GC
    • 1 big graph
  • Partitioned GC
    • N smaller graphs

What’s the smallest possible partition to achieve the same performance?

Experiment Setting
  • Energy
    • D term
      • Luma only
      • Birchfield-Tomasi cost (best result at the half-pel position)
      • Squared Error
    • V term
      • Potts Model: V = K × T(di≠dj)
      • The K constant is the same for all partitions
  • Partition Size
    • 4x4, 16x16, 32x32, 64x64, 128x128
  • Stereo Pairs
    • Tsukuba, Teddy, Cones, Venus
Tsukuba 96x96, 128x128

[Figure: disparity maps — Full GC vs. 96x96 and 128x128 partitions.]

Venus 96x96, 128x128

[Figure: disparity maps — Full GC vs. 96x96 and 128x128 partitions.]

Teddy 96x96, 128x128

[Figure: disparity maps — Full GC vs. 96x96 and 128x128 partitions.]

Cones 96x96, 128x128

[Figure: disparity maps — Full GC vs. 96x96 and 128x128 partitions.]

Middlebury Result

Evaluation Web Page: http://cat.middlebury.edu/stereo/

Best: Full GC with the best parameters

Full: Full GC with k=20 (Tsukuba) and k=60 (others)

Summary
  • Smallest possible partition size (2% accuracy drop)
    • Tsukuba → 64x64
    • Teddy & Cones → 96x96
    • Venus → larger than 128x128
  • Benefits
    • Possible complexity or storage reduction
    • Increased parallelism
  • Drawbacks
    • Performance (disparity accuracy) drop
    • Computation on a PC takes longer
Hierarchical Parallel Graph-Cut

Concept of Hierarchical Parallel GC
  • Bottom Up
    • Solve graph-cut for smaller subgraphs
    • Solve graph-cut for larger subgraphs
      • Larger subgraphs = sets of neighboring smaller subgraphs

!! Each subgraph is temporarily independent !!

[Figure: level-0 subgraphs sg0-sg3 combine into one level-1 subgraph: larger subgraph = sg0+sg1+sg2+sg3.]

HPGC for Solving a 256x256 Graph

Step 1 → 64 32x32 Lv0 subgraphs

Step 2 → 16 64x64 Lv1 subgraphs

Step 3 → 4 128x128 Lv2 subgraphs

Step 4 → 1 256x256 Lv3 subgraph

Total graph-cut computations = 64+16+4+1 = 85

!! HPGC must use Ford-Fulkerson-based methods !!
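The per-level subgraph counts follow from quadrupling the block area at each level:

```python
# Count Lv0..Lv3 subgraphs for a 256x256 graph with 32x32 base blocks
size, block = 256, 32
counts = []
while block <= size:
    counts.append((size // block) ** 2)  # subgraphs of this block size
    block *= 2                           # next level doubles the block edge
print(counts, sum(counts))  # [64, 16, 4, 1] 85
```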

Boykov and Kolmogorov's Motivation
  • Dinic's Method
    • Search for the shortest augmenting path
    • Use Breadth-First Search (BFS)
  • Example:
    • Search for the shortest path (length = k)
      • Use BFS, expand the search tree
      • Find all paths of length k
    • Search for the shortest path (length = k+1)
      • Use BFS, RE-expand the search tree again
      • Find all paths of length (k+1)
    • Search for the shortest path (length = k+2)
      • Use BFS, RE-RE-expand the search tree again
      • …

[Figure: an example graph with unit edge capacities.]

Why don't we REUSE the expanded tree?

BnK's Method
  • Concept:
    • Reuse the already-expanded trees
    • Avoid re-expanding the trees from scratch
  • 3 stages
    • Growth
      • Grow the search trees
    • Augmentation
      • Ford-Fulkerson-style augmentation
    • Adoption
      • Reconnect the disconnected sub-trees
      • Connect the orphans to a new parent

Augmenting Path → Saturate Critical Edge → Adopt Orphans

Features of the BnK Method
  • Based on Ford-Fulkerson
    • Bidirectional search tree construction
    • Search-tree reuse
    • Determine the label (source or sink) using tree connectivity

[Figure: a source tree and a sink tree growing toward each other.]

Connectivity Is Why HPGC Works
  • Example: a 2x4 binary-variable graph

[Figure: cases 1-4 shown in both graph view and tree view.]

How to Add Edges
  • When should nodes A and B check their edge?
    • If A & B belong to different search trees
      • A is in a sink tree, B is in a source tree
      • A is in a source tree, B is in a sink tree
      • Implies a source→sink path
    • If A or B is an orphan (not connected to any tree)
      • A is an orphan, B is not an orphan
      • A is not an orphan, B is an orphan
      • Check for possible connectivity of the orphan

Complexity Result
  • Method
    • Annotate each line of code with basic operations
      • Read
      • Write
      • Arithmetic
      • Logic
      • Compare
      • Branch
  • Examples
    • C=A+B → 2R, 1W, 1A
    • if(A==B) → 2R, 1C, 1B
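The annotation method amounts to summing per-line operation counts; a tiny sketch using the two example lines above:

```python
from collections import Counter

# per-line basic-operation annotations (R=read, W=write, A=arith, C=compare, B=branch)
annotations = {
    "C=A+B":    {"R": 2, "W": 1, "A": 1},
    "if(A==B)": {"R": 2, "C": 1, "B": 1},
}
total = Counter()
for ops in annotations.values():
    total.update(ops)  # accumulate operation counts across all lines
print(dict(total))  # {'R': 4, 'W': 1, 'A': 1, 'C': 1, 'B': 1}
```

Applied to every line of the BnK implementation, this yields the operation totals reported on the next slides.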
Stereo Matching Case
  • Stereo Pair: Tsukuba
  • Expansion with random label order
    • 15 labels → 15 graph-cut computations
  • Graph Size: 256 x 256
  • D term: truncated Sum of Squared Error (tSSE)
    • Truncated at AD=20
  • V term: Potts model
    • K=20
1st iteration result

  • Labels 4, 5, and 9 are the key moves

[Figure: BnK's expansion results for the important expansions — labels 4, 5, and 9.]

Full BnK Graph-Cut Operation Distribution
  • 256x256 graph – Tsukuba, iteration 0, label 5
  • 77,407,307 operations
    • Memory access is dominant
    • Control ~22%
    • Arithmetic is insignificant

Full GC vs. HPGC
  • 256x256 graph – Tsukuba, iteration 0, label 5
  • Full GC: 77,407,307 operations

[Figure: operation counts for HPGC with 4, 8, 16, 32, and 64 PEs.]

Conclusion
  • HPGC can improve speed with multiple PEs
  • To perform 30 fps, 30-label, 256x256 graph-cut
    • 1 PE @ 100 MHz
    • Average cycle budget for each subgraph ~1.3K cycles
      • A Lv0 subgraph is 32x32
  • Next step
    • Small BnK graph-cut engine architecture design
      • Estimate speed/cost
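The ~1.3K-cycle budget above follows from the frame rate, the label count, and the 85 subgraphs per graph-cut:

```python
fps, labels, clock_hz = 30, 30, 100e6   # 1 PE @ 100 MHz
subgraphs_per_cut = 64 + 16 + 4 + 1     # 85, from the HPGC decomposition
cuts_per_second = fps * labels * subgraphs_per_cut
budget = clock_hz / cuts_per_second
print(round(budget))  # 1307 cycles per subgraph, i.e. ~1.3K
```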
Progress Check
  • Previous plan
    • Parallel graph-cut engine for binary-variable graphs
      • Based on Boykov and Kolmogorov's graph-cut algorithm
        • Complexity analysis → done
        • Hierarchical parallel algorithm SW model → done
    • Small BnK graph-cut engine architecture design
      • Based on Boykov and Kolmogorov's algorithm → next 2 weeks
    • Hierarchical parallel graph-cut engine architecture design
      • Based on my hierarchical parallel algorithm modified from BnK's algorithm → June/July