
Journal of Visual Communication and Image Representation

Multi-view video based multiple objects segmentation using graph cut and spatiotemporal projections. Journal of Visual Communication and Image Representation Volume 21, Issues 5–6, July–August 2010, Pages 453–461 Qian Zhang,  King Ngi Ngan


Presentation Transcript


  1. Multi-view video based multiple objects segmentation using graph cut and spatiotemporal projections Journal of Visual Communication and Image Representation Volume 21, Issues 5–6, July–August 2010, Pages 453–461 Qian Zhang, King Ngi Ngan Department of Electronic Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong Speaker: Yi-Ting Chen

  2. Outline • Introduction • The proposed framework • Method • Segmentation for key view • Multi-view video segmentation • Experimental results • Conclusion

  3. Outline • Introduction • The proposed framework • Method • Segmentation for key view • Multi-view video segmentation • Experimental results • Conclusion

  4. Introduction • Most research interest has focused on single-view segmentation. • Depth information of the 3D scene can be reconstructed from multi-view images, but multiple-view segmentation has not attracted much attention. • Most of the classical and state-of-the-art graph-cut-based segmentation algorithms require user intervention to specify the initial foreground and background regions as hard constraints.

  5. Outline • Introduction • The proposed framework • Method • Segmentation for key view • Multi-view video segmentation • Experimental results • Conclusion

  6. Overview of the proposed framework • We built a five-view camera system for views v ∈ {0, 1, 2, 3, 4} • To reduce the projection error and avoid extensive computational load, we select view 2 as the key view to start the segmentation process.

  7. Outline • Introduction • The proposed framework • Method • Segmentation for key view • Multi-view video segmentation • Experimental results • Conclusion

  8. Automatic initial interested objects (IIOs) extraction based on saliency model (SM) • Inspired by the work in [33], more sophisticated cues such as motion and depth are combined into our topographical SM. (a) input image, (b) saliency map using depth and motion, (c) extracted IIOs. [33] W.X. Yang, K.N. Ngan, Unsupervised multiple object segmentation of multiview images, Advanced Concepts for Intelligent Vision Systems Conference (2007), pp. 178–189

  9. Multiple objects segmentation using graph cut • For each individual object, we construct a sub-graph over the pixels belonging to its “Object Rectangle” • an enlarged rectangle that encompasses the whole object and restricts the segmentation region • this converts multiple objects segmentation into several sub-segmentation problems

  10. Objects segmentation by using graph cut • Graph cut • The general formulation of the energy function: E(f) = Σp Dp(fp) + Σ(p,q)∈N Ep,q(fp, fq), where Dp(fp) is the data term and Ep,q(fp, fq) is the smoothness term

  11. Basic energy function • Data term • evaluates the likelihood of a certain pixel p being assigned the label fp • color (RGB) and depth information are combined: the color distribution is modeled by a Gaussian Mixture Model (GMM), and the depth by a histogram model • g(·) denotes a Gaussian probability distribution, h(·) is the histogram model, w(·) is the mixture weighting coefficient • each pixel p carries a four-dimensional feature vector {d, r, g, b} (depth plus RGB)
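The data term above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the diagonal-covariance GMM, the histogram lookup, and the way the two negative log-likelihoods are simply summed are all assumptions for the sketch.

```python
import numpy as np

def gmm_likelihood(color, means, covs, weights):
    """Evaluate a diagonal-covariance GMM at an RGB color.

    means: (K, 3) component means, covs: (K, 3) diagonal variances,
    weights: (K,) mixture weights w(.). Returns sum_k w_k * g_k(color).
    """
    diff = color - means                                 # (K, 3)
    exponent = -0.5 * np.sum(diff**2 / covs, axis=1)     # (K,)
    norm = (2 * np.pi) ** 1.5 * np.sqrt(np.prod(covs, axis=1))
    return float(np.sum(weights * np.exp(exponent) / norm))

def depth_likelihood(depth, hist, bin_edges):
    """Evaluate a normalized depth histogram model h(d)."""
    idx = np.clip(np.searchsorted(bin_edges, depth) - 1, 0, len(hist) - 1)
    return float(hist[idx])

def data_term(color, depth, fg_model, bg_model, label):
    """Negative log-likelihood of assigning `label` (0 = bg, 1 = fg) to a
    pixel, combining the GMM color density and the depth histogram density."""
    model = fg_model if label == 1 else bg_model
    p_color = gmm_likelihood(color, *model["gmm"])
    p_depth = depth_likelihood(depth, *model["hist"])
    eps = 1e-12  # guard against log(0)
    return -np.log(p_color + eps) - np.log(p_depth + eps)
```

A pixel that matches the foreground color/depth model should receive a lower cost for the foreground label than for the background label.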

  12. Basic energy function • Smoothness term • Ep,q(fp, fq) measures the penalty of two neighboring pixels p and q taking different labels • dist(p, q) is the coordinate distance between p and q • diff(cp, cq) is the average RGB color difference between p and q • βr = (2⟨‖rp − rq‖²⟩)⁻¹, where ⟨·⟩ is the expectation operator for the red channel.
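A sketch of a contrast-sensitive smoothness term built from the quantities the slide defines. The exact functional form used in the paper may differ; the exponential-decay form below is the standard one in graph-cut segmentation, and `estimate_beta` mirrors the slide's β definition.

```python
import numpy as np

def estimate_beta(channel_diffs):
    """beta = (2 * <||delta||^2>)^-1, matching the slide's beta_r for one channel."""
    return 1.0 / (2.0 * np.mean(np.square(channel_diffs)))

def smoothness_term(fp, fq, cp, cq, p, q, beta):
    """Penalty for neighbors p, q taking different labels.

    Zero when labels agree; otherwise decays with the average color
    difference diff(cp, cq) and is attenuated by coordinate distance dist(p, q),
    so cuts are cheap across strong edges and expensive in flat regions.
    """
    if fp == fq:
        return 0.0
    dist = np.linalg.norm(np.asarray(p, float) - np.asarray(q, float))
    diff = np.mean(np.abs(np.asarray(cp, float) - np.asarray(cq, float)))
    return np.exp(-beta * diff**2) / dist
```

Note the intended behavior: a label change across similar colors is penalized more heavily than one across a strong color edge.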

  13. The result with basic energy function basic energy function using: (a) color, (b) depth, (c) combined color and depth 

  14. The segmentation errors in the rectangles • errors occur because the color and depth information of these regions are very similar to the foreground data

  15. Background penalty with occlusion reasoning(1/2) • Since we capture the same scene from different viewpoints, occluded background regions often occur around the object boundary. • the occluded regions have a higher probability of being background than the visible ones • we impose a background penalty factor αbp = 3.5
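One way the penalty factor could enter the model, sketched in NumPy. How αbp is actually wired into the paper's energy is not spelled out on the slide, so treating it as a multiplicative boost on the background probability of occluded pixels is an assumption for illustration.

```python
import numpy as np

ALPHA_BP = 3.5  # background penalty factor from the slide

def apply_occlusion_penalty(bg_prob_map, occlusion_map):
    """Boost the background probability of occluded pixels.

    bg_prob_map: (H, W) background probabilities in [0, 1].
    occlusion_map: (H, W) boolean, True where a pixel is occluded in
    at least one other view. Occluded pixels get their background
    probability scaled by ALPHA_BP (clipped back into [0, 1]);
    visible pixels are left untouched.
    """
    out = bg_prob_map.copy()
    out[occlusion_map] = np.clip(out[occlusion_map] * ALPHA_BP, 0.0, 1.0)
    return out
```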

  16. Background penalty with occlusion reasoning(2/2) (a) background probability map without occlusion penalty, (b) combined occlusion map, (c) background probability map with occlusion penalty.

  17. The erroneous segmentations marked as ellipses • errors are mainly caused by the strong color contrast in the background compared with the weak contrast across the “true” object boundary

  18. Foreground contrast enhancement(1/3) • To make the color contrast representation more effective • the average color difference is computed in the perceptually uniform L*a*b* color space • To enhance the contrast across the foreground/background boundary and attenuate the background contrast • we adopt the motion residual information

  19. Foreground contrast enhancement(2/3) • The motion residual of a pixel is the absolute difference between the current frame and the image reconstructed from the previous frame and the motion field • the smoothness term combines the L*a*b* color contrast with the motion residual contrast of p and q
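A minimal sketch of the motion residual as described above: reconstruct the current frame from the previous frame via the motion field, then take the per-pixel absolute difference. Integer offsets and clip-at-border handling are simplifying assumptions for the sketch.

```python
import numpy as np

def motion_residual(frame, prev_frame, flow):
    """|I_t(p) - reconstructed I_t(p)|: difference between the current frame
    and its motion-compensated reconstruction from the previous frame.

    frame, prev_frame: (H, W) grayscale images.
    flow: (H, W, 2) integer (dy, dx) offsets from current to previous frame.
    Out-of-range offsets are clipped to the image border.
    """
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    py = np.clip(ys + flow[..., 0], 0, h - 1)
    px = np.clip(xs + flow[..., 1], 0, w - 1)
    recon = prev_frame[py, px]                 # motion-compensated prediction
    return np.abs(frame.astype(float) - recon.astype(float))
```

Static background with correct flow yields a near-zero residual, while a moving object leaves a large residual, which is why this cue attenuates background contrast.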

  20. Foreground contrast enhancement(3/3) • But combining the color and motion residual contrasts will not only attenuate the background contrast but also weaken the “true” foreground contrast

  21. Foreground contrast enhancement(3/3) • we define a local color contrast to enhance the discontinuity distribution in its neighborhood • calculate the local mean μ and the local variance δ of the contrast, and re-scale the contrast by these local statistics (following [26]) [26] J. Wang, P. Bhat, R.A. Colburn, M. Agrawala, M.F. Cohen, Interactive video cutout, ACM Transactions on Graphics 24 (2005) 585–594.

  22. The result with modified energy function

  23. Outline • Introduction • The proposed framework • Method • Segmentation for key view • Multi-view video segmentation • Experimental results • Conclusion

  24. Multi-view video segmentation • the key-view mask is projected to neighboring views by pixel-based disparity compensation • this exploits the spatial consistency • each frame is also predicted by pixel-based motion compensation from the mask of its previous frame • this enforces the temporal consistency
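The spatial projection step can be sketched as a per-pixel warp of the key-view mask by a disparity map. Rectified cameras (purely horizontal disparity) and integer offsets are assumptions for the sketch; the same scatter with a 2-D motion field gives the temporal prediction.

```python
import numpy as np

def project_mask(mask, disparity):
    """Project a segmentation mask to a neighboring view by pixel-based
    disparity compensation.

    mask: (H, W) boolean labels in the source (key) view.
    disparity: (H, W) integer horizontal offsets from source to target view.
    Foreground pixels that land outside the target image are dropped;
    positions nothing maps to stay background.
    """
    h, w = mask.shape
    out = np.zeros_like(mask)
    ys, xs = np.nonzero(mask)            # foreground pixels in source view
    xt = xs + disparity[ys, xs]          # shifted column in target view
    valid = (xt >= 0) & (xt < w)
    out[ys[valid], xt[valid]] = True
    return out
```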

  25. Uncertain boundary band validation • To improve the segmentation results, we construct an uncertain band along the object boundary based on an activity measure • then apply our graph cut algorithm within the band to yield more accurate segmentation layers (a) prediction mask of view 3, (b) uncertain band based on the post-processing
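A common way to build such a band, sketched in pure NumPy: dilate and erode the predicted mask and keep the difference. Using a fixed-radius square structuring element (rather than the paper's activity measure) is a simplification for illustration; only pixels inside the band would be re-labelled by graph cut, with the rest kept as hard constraints.

```python
import numpy as np

def binary_dilate(mask, r):
    """Dilation by a (2r+1) x (2r+1) square structuring element."""
    out = mask.copy()
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            shifted = np.zeros_like(mask)
            ys = slice(max(dy, 0), mask.shape[0] + min(dy, 0))
            xs = slice(max(dx, 0), mask.shape[1] + min(dx, 0))
            ys_src = slice(max(-dy, 0), mask.shape[0] + min(-dy, 0))
            xs_src = slice(max(-dx, 0), mask.shape[1] + min(-dx, 0))
            shifted[ys, xs] = mask[ys_src, xs_src]
            out |= shifted
    return out

def uncertain_band(pred_mask, r=2):
    """Band of width ~2r straddling the predicted object boundary:
    (mask dilated by r) minus (mask eroded by r)."""
    dilated = binary_dilate(pred_mask, r)
    eroded = ~binary_dilate(~pred_mask, r)   # erosion via dual dilation
    return dilated & ~eroded
```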

  26. Outline • Introduction • The proposed framework • Method • Segmentation for key view • Multi-view video segmentation • Experimental results • Conclusion

  27. Experimental results • five-view camera system • resolution of 640 × 480 at a frame rate of 30 frames per second (fps) • we demonstrate the algorithm on two types of multi-view videos simulating different scenarios • IOs with similar and low depth • IOs with different depths

  28. Segmentation of IOs with similar and low depth

  29. Segmentation of IOs with different depths

  30. Comparison with others’ methods(1/2) • compare with Kolmogorov’s bilayer segmentation algorithm [21] by using their test images (a) left view, (b) right view, (c) result by our proposed algorithm, (d) result by Kolmogorov’s algorithm. [21] V. Kolmogorov, A. Criminisi, A. Blake, G. Cross, C. Rother, Probabilistic fusion of stereo with color and contrast for bi-layer segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (9) (2006) 1480–1492.

  31. Comparison with others’ methods(2/2) • compare our proposed algorithm with an existing method employing multi-way cut with α-expansion [33] using our test images [33] W.X. Yang, K.N. Ngan, Unsupervised multiple object segmentation of multiview images, Advanced Concepts for Intelligent Vision Systems Conference (2007), pp. 178–189

  32. Outline • Introduction • The proposed framework • Method • Segmentation for key view • Multi-view video segmentation • Experimental results • Conclusion

  33. Conclusion • In this paper, we propose an automatic segmentation algorithm for multiple objects from multi-view video. • The experiment was implemented on two representative multi-view videos. • Accurate segmentation results with good visual quality and subjective comparison with others’ methods attest to the efficiency and robustness of our proposed algorithm.

  34. Thank you for listening!
