Xing Mei; Xun Sun; Mingcai Zhou; Shaohui Jiao; Haitao Wang; Xiaopeng Zhang

Presentation Transcript

  1. On Building an Accurate Stereo Matching System on Graphics Hardware • Xing Mei; Xun Sun; Mingcai Zhou; Shaohui Jiao; Haitao Wang; Xiaopeng Zhang • Samsung Advanced Institute of Technology, China Lab • Computer Vision Workshops, 2011 IEEE

  2. Outline • Introduction • Related Works • Algorithm • CUDA Implementation • Experimental Results • Conclusion

  3. Introduction

  4. Introduction • Dense two-frame stereo matching • Compute a disparity map from a pair of stereo images. • Broad applications: 3D reconstruction, view interpolation

  5. Related Works

  6. Related Works • Local methods • Compute each pixel’s disparity independently over a local support region. • Fast but inaccurate. • Global methods • Solve the stereo problem as an energy minimization. • Accurate but slow, due to a time-consuming global optimizer (graph cuts, belief propagation).

  7. Related Works • Propagation-based methods • Produce quasi-dense or dense disparity results from a set of seed pixels. • Relatively fast, but sensitive to early wrong matches. • Some methods use segmented regions as the guided propagation unit, at an expensive computational cost.

  8. Related Works • This work introduces a simple guided unit for propagation: pixel-wise 1D line segments. • No image segmentation is required. • Simple, fast and accurate.

  9. Algorithm

  10. Algorithm • Framework • Input: stereo images • Output: disparity map

  11. Algorithm • Input: stereo images • Output: disparity map

  12. Disparity Cost Computing • Cost measures: AD, BT, gradient-based measures, non-parametric transforms (rank/census [3]), ... • Combinations: SAD + gradient [6], AD + census • AD (absolute difference) • Constant color assumption • Repetitive structures • Census • Encodes local image structures • Textureless regions [3] H. Hirschmuller and D. Scharstein. “Evaluation of stereo matching costs on images with radiometric differences.” IEEE TPAMI, 31(9), 2009. [6] A. Klaus, M. Sormann, and K. Karner. “Segment-based stereo matching using belief propagation and a self-adapting dissimilarity measure.” ICPR, 2006.

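  13. AD-Census Cost Initialization • The initial cost is the sum (+) of the census cost and the AD cost, each passed through a robust function • p: pixel • d: disparity level • ρ: a robust function on variable 𝑐 • pd = (x − d, y): the pixel in the right image that corresponds to p = (x, y) at disparity d • Census cost: the Hamming distance [22] between the census bit strings of p in the left image and pd in the right image [22] R. Zabih and J. Woodfill. “Non-parametric local transforms for computing visual correspondence.” In Proc. ECCV, 1994.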

  14. Census Transform • Census transform window:

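  15. Census Hamming Distance • The census bit strings of the left-image pixel and the right-image pixel are combined with XOR. • The number of set bits in the result is the Hamming distance (3 in the slide’s example).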
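
The census comparison on slides 14 and 15 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the grayscale nested-list image, the 3 × 3 default window, and the border clamping are all assumptions of the sketch (the transcript does not preserve the window size used in the talk).

```python
def census_bits(img, x, y, half_w=1, half_h=1):
    """Census bit string of pixel (x, y), packed into an int.

    Each neighbour in the window contributes one bit: 1 if it is darker
    than the centre pixel, 0 otherwise. Neighbours outside the image are
    clamped to the border (an assumption of this sketch).
    """
    height, width = len(img), len(img[0])
    centre = img[y][x]
    bits = 0
    for dy in range(-half_h, half_h + 1):
        for dx in range(-half_w, half_w + 1):
            if dx == 0 and dy == 0:
                continue  # the centre pixel is not compared with itself
            ny = min(max(y + dy, 0), height - 1)
            nx = min(max(x + dx, 0), width - 1)
            bits = (bits << 1) | (1 if img[ny][nx] < centre else 0)
    return bits


def hamming(a, b):
    """Number of differing bits: XOR the strings, then count the ones."""
    return bin(a ^ b).count("1")


left = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
right = [[60, 20, 30], [40, 50, 45], [70, 80, 40]]
distance = hamming(census_bits(left, 1, 1), census_bits(right, 1, 1))
```

Three neighbours change their ordering relative to the centre between the two patches, so `distance` is 3 here.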

  16. AD-Census Cost Initialization • The census cost and the AD cost are added (+) after each is passed through a robust function on variable 𝑐.
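
The combination step can be sketched as follows. The transcript does not preserve the formulas, so this sketch assumes the form given in the published AD-Census description: an exponential robust function 1 − exp(−c/λ) applied to each cost before summation, with the AD cost averaged over the three color channels. The function names and the default λ values are illustrative assumptions, not settings taken from the slides.

```python
import math


def robust(c, lam):
    """Map a non-negative cost c into [0, 1); lam controls the saturation.

    Assumed form: 1 - exp(-c / lam).
    """
    return 1.0 - math.exp(-c / lam)


def ad_cost(left_rgb, right_rgb):
    """Absolute difference averaged over the R, G and B channels."""
    return sum(abs(l - r) for l, r in zip(left_rgb, right_rgb)) / 3.0


def ad_census_cost(c_ad, c_census, lam_ad=10.0, lam_census=30.0):
    """Sum of the two robustified costs; the result lies in [0, 2).

    lam_ad and lam_census are hypothetical defaults for illustration.
    """
    return robust(c_census, lam_census) + robust(c_ad, lam_ad)
```

Because each term saturates below 1, a single very large AD or census value cannot dominate the combined cost.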

  17. AD-Census Cost Initialization • AD-Census measure produces proper disparity results for both repetitive structures and textureless regions.

  18. Algorithm • Input: stereo images • Output: disparity map

  19. Cross-based Cost Aggregation [23] • Cross construction • The line ending points P1, P2 for P are located where rule 1 or rule 2 is violated: • R1: Color self-similarity in the line region (smooth depth assumption) • R2: Arm length limitation (avoids over-smoothing) [23] K. Zhang, J. Lu, and G. Lafruit. “Cross-based local stereo matching using orthogonal integral images.” IEEE TCSVT, 2009.
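
The two rules can be sketched for a single arm. This is a simplified illustration on one grayscale row: the function name, the thresholds `tau` and `max_len`, and the use of a scalar intensity difference in place of a color difference are assumptions of the sketch.

```python
def left_arm_length(row, x, tau=20, max_len=17):
    """Number of pixels the left arm of pixel x extends along its row.

    The arm grows until one of the two rules stops it:
      R1: the next pixel's intensity differs from row[x] by tau or more;
      R2: the arm has already reached max_len pixels.
    tau and max_len are hypothetical values for illustration.
    """
    length = 0
    while length < max_len:                    # R2: arm length limitation
        nx = x - (length + 1)
        if nx < 0:
            break                              # reached the image border
        if abs(row[nx] - row[x]) >= tau:       # R1: color similarity
            break
        length += 1
    return length


row = [200, 52, 50, 51, 50]
arm = left_arm_length(row, 4)
```

With the row above, the arm of the last pixel covers the three similar pixels and stops at the bright pixel at index 0. The right, up and down arms follow the same pattern with the scan direction changed.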

  20. Cross-based Cost Aggregation

  21. Cross-based Cost Aggregation • Enhanced cross construction (using pixel p’s left arm and its endpoint pixel pl as an example)

  22. Cross-based Cost Aggregation • Cost aggregation • This step is run for 4 iterations to get stable cost values. • In iterations 1 and 3, costs are aggregated horizontally and then vertically. • In iterations 2 and 4, costs are aggregated vertically and then horizontally. • Alternating the order reduces the errors at depth discontinuities.
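
The alternating pass order can be sketched as below for one disparity slice of the cost volume. The data layout is an assumption: `arms_h[y][x]` holds the (left, right) arm lengths of a pixel and `arms_v[y][x]` its (up, down) arm lengths. Each pass here averages over the arm span to keep values bounded; that normalisation is a simplification of this sketch, not something the slide states.

```python
def aggregate_rows(cost, arms):
    """Average each pixel's cost over its horizontal arm span."""
    out = []
    for y, row in enumerate(cost):
        new_row = []
        for x in range(len(row)):
            left, right = arms[y][x]
            span = row[x - left : x + right + 1]
            new_row.append(sum(span) / len(span))
        out.append(new_row)
    return out


def transpose(grid):
    return [list(col) for col in zip(*grid)]


def aggregate_cols(cost, arms):
    """A vertical pass is a horizontal pass on the transposed data."""
    return transpose(aggregate_rows(transpose(cost), transpose(arms)))


def cross_aggregate(cost, arms_h, arms_v, iterations=4):
    """Alternate the pass order between iterations, as on the slide."""
    for it in range(1, iterations + 1):
        if it % 2 == 1:   # iterations 1 and 3: horizontal, then vertical
            cost = aggregate_cols(aggregate_rows(cost, arms_h), arms_v)
        else:             # iterations 2 and 4: vertical, then horizontal
            cost = aggregate_rows(aggregate_cols(cost, arms_v), arms_h)
    return cost
```

The same function would be applied independently to every disparity level of the cost volume.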

  23. Cross-based Cost Aggregation • Our aggregation method can better handle large textureless regions and depth discontinuities.

  24. Cross-based Cost Aggregation [21] K.-J. Yoon and I.-S. Kweon. “Adaptive support-weight approach for correspondence search.” IEEE TPAMI, 2006. [23] K. Zhang, J. Lu, and G. Lafruit. “Cross-based local stereo matching using orthogonal integral images.” IEEE TCSVT, 2009.

  25. Algorithm • Input: stereo images • Output: disparity map

  26. Scanline Optimization [2] • Four scanline optimization processes are performed independently: • 2 horizontal directions • 2 vertical directions [2] H. Hirschmuller. “Stereo processing by semiglobal matching and mutual information.” IEEE TPAMI, 2008.

  27. Scanline Optimization • r: direction • p − r: the previous pixel along the same direction • 𝑃1, 𝑃2: penalties on disparity changes between neighboring pixels (𝑃1 ≤ 𝑃2) [8] [8] S. Mattoccia, F. Tombari, and L. D. Stefano. “Stereo vision enabling precise border localization within a scanline optimization framework.” In Proc. ACCV, pages 517–527, 2007.

  28. Scanline Optimization • The final cost 𝐶2(p, d) is the average of the path costs from the four directions. • The disparity with the minimum 𝐶2 value is selected as pixel p’s intermediate result.
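
A single-direction pass can be sketched as follows. The recurrence is not preserved in the transcript, so this sketch assumes the standard semi-global form from [2]: the path cost at p adds the aggregated cost to the cheapest transition from p − r (same disparity for free, ±1 disparity for 𝑃1, any other disparity for 𝑃2), minus the previous minimum to keep values bounded. Constant penalties are a further simplification; all names are illustrative.

```python
def scanline_pass(costs, p1, p2):
    """Path costs along one scanline, processed in list order.

    costs[i][d] is the aggregated cost of the i-th pixel at disparity d.
    p1 <= p2 are constant penalties (an assumption of this sketch).
    """
    n_disp = len(costs[0])
    path = [list(costs[0])]           # the first pixel has no predecessor
    for i in range(1, len(costs)):
        prev = path[-1]
        prev_min = min(prev)
        cur = []
        for d in range(n_disp):
            best = min(prev[d], prev_min + p2)
            if d > 0:
                best = min(best, prev[d - 1] + p1)
            if d + 1 < n_disp:
                best = min(best, prev[d + 1] + p1)
            cur.append(costs[i][d] + best - prev_min)
        path.append(cur)
    return path


def winner_take_all(path_costs):
    """Average the per-direction costs of one pixel and pick the best d."""
    n_disp = len(path_costs[0])
    mean = [sum(pc[d] for pc in path_costs) / len(path_costs)
            for d in range(n_disp)]
    return min(range(n_disp), key=mean.__getitem__)
```

The opposite direction along the same scanline can be obtained with `scanline_pass(costs[::-1], p1, p2)[::-1]`; the two vertical directions apply the same function to image columns.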

  29. Algorithm • Input: stereo images • Output: disparity map

  30. Multi-step Disparity Refinement • Outlier Handling • Outlier Detection • Iterative Region Voting • Proper Interpolation • Depth Discontinuity Adjustment • Sub-pixel Enhancement

  31. Outlier Handling--Detection • Outliers: pixels p with 𝐷𝐿(p) ≠ 𝐷𝑅(p − (𝐷𝐿(p), 0)) • Outliers are further classified into occlusion and mismatch points. • For each outlier p, its epipolar line is checked for an intersection with 𝐷𝑅. • If there is no intersection, p is labelled “occlusion”; otherwise, “mismatch”.
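
The left-right check and the occlusion/mismatch split can be sketched as below. The interpretation of "intersection" is an assumption of this sketch: an outlier counts as a mismatch if some disparity d along its row satisfies 𝐷𝑅(x − d, y) = d, and as an occlusion otherwise. Pixels that map outside the right image are also treated as outliers here, which the slide does not specify.

```python
def label_outliers(disp_left, disp_right, max_disp):
    """Return {(x, y): "occlusion" | "mismatch"} for inconsistent pixels.

    disp_left and disp_right are integer disparity maps as nested lists.
    """
    labels = {}
    for y, row in enumerate(disp_left):
        for x, d in enumerate(row):
            xr = x - d
            if xr >= 0 and disp_right[y][xr] == d:
                continue  # left and right maps agree: p is reliable
            # Search the epipolar line (same row) for any consistent match.
            hit = any(
                x - dd >= 0 and disp_right[y][x - dd] == dd
                for dd in range(max_disp + 1)
            )
            labels[(x, y)] = "mismatch" if hit else "occlusion"
    return labels


disp_l = [[0, 1, 1, 2]]
disp_r = [[1, 1, 0, 0]]
labels = label_outliers(disp_l, disp_r, max_disp=3)
```

In this one-row example the first pixel has no consistent match at any disparity and is labelled an occlusion, while the last pixel would be consistent at disparity 0 and is labelled a mismatch.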

  32. Outlier Handling--Iterative Region Voting • Construct cross-based regions and apply a robust voting scheme. • Sp: the number of reliable pixels in the cross-based region of p • 𝜏𝑆, 𝜏𝐻: threshold values • 5 iterations
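
One voting decision can be sketched as follows. The acceptance test is not preserved in the transcript; this sketch assumes the rule from the published AD-Census description, where the most voted disparity is accepted only if the region holds more than 𝜏𝑆 reliable pixels and the winner's share of the votes exceeds 𝜏𝐻. The function name and the default thresholds are illustrative.

```python
from collections import Counter


def region_vote(reliable_disparities, tau_s=20, tau_h=0.4):
    """Return the winning disparity for an outlier, or None if rejected.

    reliable_disparities: disparities of the reliable pixels inside the
    outlier's cross-based region. tau_s and tau_h are hypothetical values.
    """
    s_p = len(reliable_disparities)
    if s_p == 0:
        return None
    d_star, votes = Counter(reliable_disparities).most_common(1)[0]
    if s_p > tau_s and votes / s_p > tau_h:
        return d_star
    return None
```

Running the vote for several iterations, as the slide describes, lets pixels filled in one round act as reliable voters in the next.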

  33. Outlier Handling--Proper Interpolation • Occlusion points • The pixel with the lowest disparity value is selected for interpolation. • It most likely comes from the background. • Mismatch points • The pixel with the most similar color is selected for interpolation.
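
The two selection rules can be sketched as below. How the candidate pixels are gathered is not described in the transcript, so the sketch simply takes a list of (disparity, color) pairs for nearby reliable pixels; that input format, the function name, and the scalar color are assumptions.

```python
def interpolate(label, pixel_color, candidates):
    """Pick a disparity for an outlier from nearby reliable pixels.

    candidates: list of (disparity, color) pairs (hypothetical format).
    """
    if label == "occlusion":
        # Occluded pixels most likely belong to the background,
        # which has the lowest disparity.
        return min(d for d, _ in candidates)
    # Mismatches take the disparity of the most similar-looking pixel.
    return min(candidates, key=lambda c: abs(c[1] - pixel_color))[0]


neighbours = [(12, 100), (4, 180), (9, 52)]
```

With these neighbours, an occluded pixel receives disparity 4, while a mismatched pixel of color 50 receives disparity 9.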

  34. Depth Discontinuity Adjustment • For each pixel p on a disparity edge, two pixels p1 and p2, one from each side of the edge, are collected. • 𝐷𝐿(p) is replaced by 𝐷𝐿(p1) or 𝐷𝐿(p2) if one of the two has a smaller matching cost than 𝐶2(p, 𝐷𝐿(p)).
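
The replacement rule can be sketched in a few lines. The sketch assumes that the candidate disparities are scored with pixel p's own cost, that is 𝐶2(p, 𝐷𝐿(p1)) and 𝐶2(p, 𝐷𝐿(p2)); `cost_at_p` is a hypothetical callable standing in for a lookup into the cost volume.

```python
def adjust_edge_pixel(d_p, d_p1, d_p2, cost_at_p):
    """Return the disparity among d_p, d_p1, d_p2 with the lowest cost at p.

    cost_at_p(d) is assumed to return C2(p, d). The current disparity is
    kept unless a neighbour's disparity is strictly cheaper.
    """
    best = d_p
    for candidate in (d_p1, d_p2):
        if cost_at_p(candidate) < cost_at_p(best):
            best = candidate
    return best


costs = {3: 0.9, 5: 0.4, 8: 0.7}
```

For an edge pixel with disparity 3 and neighbours at 5 and 8, the costs above move it to disparity 5.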

  35. Sub-pixel Enhancement [20] • Quadratic polynomial interpolation • Followed by a 3 × 3 median filter [20] Q. Yang, L. Wang, R. Yang, H. Stewenius, and D. Nister. “Stereo matching with color-weighted correlation, hierarchical belief propagation and occlusion handling.” IEEE TPAMI, 2009.
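
Quadratic interpolation fits a parabola through the costs at d − 1, d and d + 1 and moves the disparity to the parabola's minimum. A minimal sketch follows; the function name and the fallback for a flat or non-convex fit are assumptions.

```python
def subpixel_disparity(d, c_minus, c, c_plus):
    """Refine integer disparity d using the costs at d-1, d and d+1.

    The vertex of the parabola through the three points lies at
    d - (c_plus - c_minus) / (2 * (c_plus + c_minus - 2 * c)).
    """
    denom = 2.0 * (c_plus + c_minus - 2.0 * c)
    if denom <= 0:
        return float(d)   # flat or non-convex fit: keep the integer value
    return d - (c_plus - c_minus) / denom
```

For example, costs of 2.0, 1.0 and 4.0 around d = 5 shift the estimate towards the cheaper left neighbour, giving 4.75. The median filter mentioned on the slide would then be applied to the refined map.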

  36. Multi-step Disparity Refinement • The average error percentages after performing each refinement step.

  37. CUDA Implementation

  38. CUDA Implementation • Compute Unified Device Architecture (CUDA) is a programming interface for parallel computation tasks on NVIDIA graphics hardware. • The computation task is coded into a kernel function. • The allocation of threads is controlled with two hierarchical concepts: grid and block. • A kernel creates a grid with multiple blocks, and each block consists of multiple threads.

  39. CUDA Implementation • Cost Initialization: • Parallelized with 𝑊 × 𝐻 threads. • Threads are organized into a 2D grid, and the block size is set to 32 × 32. • Each thread computes a cost value for a pixel at a given disparity. • For the census transform, a square window is required for each pixel, which requires loading more data into shared memory for fast access.
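
The thread layout can be illustrated with a little index arithmetic, written here in Python rather than CUDA. This is a sketch of the usual one-thread-per-pixel mapping, not the authors' kernel code; the function names and the 450 × 375 image size in the example are assumptions.

```python
def grid_dims(width, height, block=(32, 32)):
    """Blocks needed per axis so that every pixel gets a thread."""
    bx, by = block
    return ((width + bx - 1) // bx, (height + by - 1) // by)


def thread_pixel(block_idx, thread_idx, block=(32, 32)):
    """Pixel handled by a thread, mirroring blockIdx * blockDim + threadIdx."""
    return (block_idx[0] * block[0] + thread_idx[0],
            block_idx[1] * block[1] + thread_idx[1])


def in_image(pixel, width, height):
    """Threads in the padded border blocks must skip out-of-range pixels."""
    x, y = pixel
    return 0 <= x < width and 0 <= y < height


grid = grid_dims(450, 375)
```

A 450 × 375 image needs a 15 × 12 grid of 32 × 32 blocks, so the last block in each axis contains threads that fall outside the image and must return early.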

  40. CUDA Implementation • Cross-based Cost Aggregation: • A grid with 𝑊 × 𝐻 threads. • Cross construction: the block size is 𝑊 or 𝐻, to handle a scanline efficiently. • Cost aggregation: the block size is 32 × 32. • Data reuse with shared memory is considered in both steps.

  41. CUDA Implementation • Scanline Optimization: • This step is different, because the process is sequential in the scanline direction and parallel in the orthogonal direction. • 𝑊 × 𝐷 or 𝐻 × 𝐷 threads • Disparity Refinement: • 𝑊 × 𝐻 threads

  42. Experimental Results

  43. Experimental Results • Device: a PC with a Core 2 Duo 2.20 GHz CPU and an NVIDIA GeForce GTX 480 graphics card • Parameter settings: • Sources: Middlebury (http://vision.middlebury.edu/stereo/), HHI database (book arrival), Microsoft i2i database (Ilkay)

  44. Experimental Results • The GPU-friendly system brings an impressive 140× speedup. • The average proportions of the GPU running time for the four computation steps are 1%,70%,28% and 1% respectively. • The iterative cost aggregation step and the scanline optimization process dominate the running time.

  45. Experimental Results • First row: disparity maps generated with our system. • Second row: disparity error maps with threshold 1. • Errors in unoccluded and occluded regions are marked in black and gray respectively.

  46. Experimental Results

  47. Experimental Results • video

  48. Experimental Results Snapshots on ’book arrival’ stereo video

  49. Experimental Results Snapshots on ’Ilkay’ stereo video

  50. Conclusion