
Gunhee Kim Eric P. Xing

Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines. Gunhee Kim, Eric P. Xing. School of Computer Science, Carnegie Mellon University. June 19, 2013.


Presentation Transcript


  1. Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines. Gunhee Kim, Eric P. Xing. School of Computer Science, Carnegie Mellon University. June 19, 2013.

  2. Outline • Problem Statement • Algorithm • Dataset and preprocessing • Alignment of Multiple Photo Streams • Large-scale Cosegmentation • Experiments • Conclusion

  3. Background • Online photo sharing has become very popular • Information overload problem in visual data

  4. Background • Query scuba+diving from Flickr: any meaningful structural summary? • Photos are taken from different spatial, temporal, and personal perspectives • Yet they are likely to share common storylines

  5. Our Ultimate Goal • Reconstructing photo storylines from large-scale online images • An example of a scuba+diving storyline (figure labels: beach, on boat, diving, underwater, dinner, boat, coral, sunset) • Narrative structural summary vs. independently retrieved images (cf. ranking and retrieval)

  6. Objective of This Paper • As a first technical step, jointly perform two crucial, mutually rewarding tasks • Alignment: match images from different photo streams • Cosegmentation: segment K common regions from M aligned images • Example photo streams: PS2, User 1 at 10/19/2008 (Cayman Islands); PS14, User 2 at 03/19/2005 (Phuket, Thailand); PS3, User 3 at 08/27/2008 (Cozumel, Mexico)

  7. Objective of This Paper • As a first technical step, jointly perform two crucial, mutually rewarding tasks: alignment and cosegmentation • Alignment helps cosegmentation: online images are too diverse to segment together at once, and the alignment discovers the images that share common regions • Example photo streams: PS2, User 1 at 10/19/2008 (Cayman Islands); PS14, User 2 at 03/19/2005 (Phuket, Thailand); PS3, User 3 at 08/27/2008 (Cozumel, Mexico)

  8. Objective of This Paper • As a first technical step, jointly perform two crucial, mutually rewarding tasks: alignment and cosegmentation • Cosegmentation helps alignment: image matching improves with a better image similarity measure • This closes a loop between the two tasks

  9. Outline • Problem Statement • Algorithm • Dataset and preprocessing • Alignment of Multiple Photo Streams • Large-scale Cosegmentation • Experiments • Conclusion

  10. Flickr Dataset • Flickr dataset of 15 outdoor recreational activities • # of photo streams: 13,157 • # of images: 1,514,976 • Experiments with more than 100K images of 1K photo streams • Larger than those of previous work by orders of magnitude • Classes: SurfingBeach, HorseRiding, AirBallooning, RAfting, YAcht, ROwing, ScubaDiving, FormulaOne, SNowboarding, SafariPark, MountainCamping, RockClimbing, LondonMarathon, FlyFishing, Tour de France

  11. Image Descriptor and Similarity Measure • Image description: HSV color SIFT and HOG features on a regular grid; L1-normalized spatial pyramid histogram (SPH) using 300 visual words • Image similarity measure (our assumption: segmentation enhances the image alignment) • Case 1, no segmentation available: histogram intersection on SPH; not robust against location/pose changes • Case 2, segmentation available: histogram intersection on the best assignment of segments
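
A minimal sketch of histogram intersection on L1-normalized histograms, as named on the slide above. The function names and the toy 4-bin histograms are illustrative assumptions; the real descriptors are spatial pyramid histograms over 300 visual words, and the segment-assignment variant is not shown.

```python
def l1_normalize(hist):
    """Scale a non-negative histogram so that its bins sum to 1."""
    total = sum(hist)
    if total == 0:
        return [0.0 for _ in hist]
    return [h / total for h in hist]


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; 1.0 for identical L1-normalized histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy histograms (hypothetical counts, not from the dataset).
a = l1_normalize([2, 1, 1, 0])  # [0.5, 0.25, 0.25, 0.0]
b = l1_normalize([1, 1, 0, 2])  # [0.25, 0.25, 0.0, 0.5]
print(histogram_intersection(a, b))  # 0.25 + 0.25 + 0.0 + 0.0 = 0.5
```

Because both inputs sum to 1, the score lies in [0, 1], which makes it convenient as a similarity value for the later graph-building steps.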

  12. Outline • Problem Statement • Algorithm • Dataset and preprocessing • Alignment of Multiple Photo Streams • Large-scale Cosegmentation • Experiments • Conclusion

  13. Alignment of Photo Streams • Input: a set of photo streams (PS), P = {P1,…,PL} • Photo stream: a set of photos taken in sequence by a single user in a single day • Idea: align all photo streams at once after building a K-NN graph • Naïve-Bayes Nearest Neighbor (NBNN) [Boiman et al. 08] as the similarity metric (figure: photo streams P1 to P4)

  14. Alignment of Photo Streams • Input: a set of photo streams (PS), P = {P1,…,PL} • Photo stream: a set of photos taken in sequence by a single user in a single day • Idea: align all photo streams at once after building a K-NN graph • Naïve-Bayes Nearest Neighbor (NBNN) [Boiman et al. 08] as the similarity metric • For simplicity, first consider the pairwise alignment of two photo streams (figure: pairwise alignment among P1 to P4)
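
The K-NN graph between photo streams can be sketched as follows. This is only an illustration under stated assumptions: the similarity matrix is a hypothetical stand-in for the NBNN scores, and `knn_graph` is a name introduced here, not taken from the authors' code.

```python
def knn_graph(similarity, k):
    """Connect each node to its k most similar other nodes.

    similarity: square list of lists, similarity[i][j] between streams i and j.
    Returns a set of undirected edges (i, j) with i < j.
    """
    n = len(similarity)
    edges = set()
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Most similar streams first.
        others.sort(key=lambda j: similarity[i][j], reverse=True)
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


# Hypothetical similarities among four photo streams P1..P4 (indices 0..3).
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.4, 0.3],
    [0.2, 0.4, 1.0, 0.8],
    [0.1, 0.3, 0.8, 1.0],
]
stream_edges = knn_graph(sim, k=1)
print(sorted(stream_edges))  # [(0, 1), (2, 3)]
```

Only the stream pairs that end up as edges would then be passed to pairwise alignment, which keeps the number of alignments proportional to the number of edges rather than to all pairs.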

  15. Pairwise Alignment • Goal of alignment: find a matching between a pair of photo streams • A null assignment means that image I in P1 has no match in P2. • Optimization: MRF-based energy minimization • e.g., deformable image matching [Shekhovtsov et al. 07] or SIFT flow [Liu et al. 2009] • Flexibility: various energy terms • Solved by discrete belief propagation (BP)

  16. Pairwise Alignment • Objective function • Data term: the matched image pairs should be visually similar. • Smoothness term: neighboring images in P1 should be matched to neighboring images in P2. • Time term: the matched image pairs should be temporally similar (figure example: 9AM, 10AM, 6PM).
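
To make the three energy terms concrete, here is a hedged sketch of how a candidate matching could be scored. The slides do not give the functional forms, so the linear penalties, the weights `w_smooth` and `w_time`, and the `null_cost` for unmatched images are all assumptions made for illustration; the actual optimization is done with discrete BP, which is not shown.

```python
def alignment_energy(match, sim, t1, t2, w_smooth=1.0, w_time=1.0, null_cost=1.0):
    """Energy of a matching from stream P1 to stream P2 (lower is better).

    match[i] is the index in P2 matched to image i of P1, or None (no match).
    sim[i][j] is a visual similarity in [0, 1]; t1 and t2 are times in hours.
    """
    energy = 0.0
    for i, j in enumerate(match):
        if j is None:
            energy += null_cost  # assumed constant penalty for a null match
            continue
        energy += 1.0 - sim[i][j]              # data term: visual dissimilarity
        energy += w_time * abs(t1[i] - t2[j])  # time term: timestamp gap
    for i in range(len(match) - 1):
        a, b = match[i], match[i + 1]
        if a is not None and b is not None:
            energy += w_smooth * abs(b - a)    # smoothness: neighbors stay close
    return energy


# Toy example: P1 has two images, P2 has three (taken at 9AM, 10AM, 6PM).
sim = [[0.9, 0.2, 0.1],
       [0.1, 0.8, 0.7]]
t1 = [9.0, 10.0]
t2 = [9.0, 10.0, 18.0]
e_good = alignment_energy([0, 1], sim, t1, t2)
e_bad = alignment_energy([0, 2], sim, t1, t2)
print(e_good < e_bad)  # True: matching the 10AM image to the 6PM one costs more
```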

  17. Alignment of Multiple Photo Streams • Objective: MRF-based energy minimization over all pairs of NN photo streams • Message-passing based optimization: repeat pairwise alignment until convergence or for a fixed number of iterations (figure: pairwise alignments among P1 to P4)

  18. Alignment of Multiple Photo Streams • Objective: MRF-based energy minimization over all pairs of NN photo streams • Message-passing based optimization: until convergence or for a fixed number of iterations (figure: P1 to P4)

  19. Outline • Problem Statement • Algorithm • Dataset and preprocessing • Alignment of Multiple Photo Streams • Large-scale Cosegmentation • Experiments • Conclusion

  20. Build an Image Graph • Idea: connect the images that are similar enough to be cosegmented • Image graph G = (I, E) • I: the set of images. E: the set of edges. • E = EB ∪ EW • EB: edges between different photo streams (results of alignment) • EW: edges within a photo stream: from a candidate subset of the stream's images, link each image I with its K-NN
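
A small sketch of assembling the image graph as the union of between-stream and within-stream edges. The image ids, the scalar toy features, and the `sim` function are hypothetical; the condition that restricts which within-stream images are considered is not reproduced, so every other image of the same stream is treated as a candidate here.

```python
def build_image_graph(streams, aligned_pairs, sim, k):
    """Return (E_B, E_W) as sets of undirected edges between image ids.

    streams: dict mapping stream id -> list of image ids.
    aligned_pairs: iterable of (image_a, image_b) matches from the alignment.
    sim: function (image_a, image_b) -> similarity score.
    """
    e_b = {tuple(sorted(pair)) for pair in aligned_pairs}
    e_w = set()
    for images in streams.values():
        for img in images:
            candidates = [o for o in images if o != img]
            candidates.sort(key=lambda o: sim(img, o), reverse=True)
            for other in candidates[:k]:
                e_w.add(tuple(sorted((img, other))))
    return e_b, e_w


# Hypothetical 1-D "features"; similarity is the negative feature distance.
features = {"a0": 0.0, "a1": 0.1, "a2": 0.9, "b0": 0.05, "b1": 0.95}

def sim(x, y):
    return -abs(features[x] - features[y])

streams = {"A": ["a0", "a1", "a2"], "B": ["b0", "b1"]}
aligned = [("a0", "b0"), ("a2", "b1")]
e_b, e_w = build_image_graph(streams, aligned, sim, k=1)
edges = e_b | e_w  # E = E_B ∪ E_W
print(sorted(edges))
```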

  21. Scalable Cosegmentation • Iteratively run the MFC algorithm [Kim and Xing, 2012] on the image graph • Review of the MFC algorithm • Cosegmentation: jointly segment M images into K+1 regions (K foregrounds (FG) + background (BG)) • Iterate between two steps • Foreground modeling: learn appearance models of K FGs and BG; any region classifiers or their combination can be used (e.g., Gaussian mixture on RGB, linear SVM on SPH) • Region assignment: allocate the regions of each image into one of K FGs or BG; solved very efficiently using the idea of combinatorial auction

  22. Scalable Cosegmentation on Image Graph • Message-passing based optimization: iteratively solve for each image Ii • Learn FG models from the neighbors of Ii. • Run region assignment on Ii. (figure example: FG 1 (car), FG 2 (road), FG 3 (BG))

  23. Scalable Cosegmentation on Image Graph • Initialization • Supervised: start from seed labels • Unsupervised: use the CoSand algorithm [Kim et al. 2011] • Message-passing based optimization: iteratively solve for each image Ii • Learn FG models from the neighbors of Ii. • Run region assignment on Ii. (figure example: FG 1 (car), FG 2 (road), FG 3 (BG))
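
The control flow of the graph-based iteration (learn models from the neighbors of an image, then assign that image's regions) can be sketched with a deliberately simplified stand-in. This is not the MFC algorithm: regions are single numbers, a "model" is just the mean feature per label, and assignment picks the nearest mean instead of using combinatorial auction. All names are hypothetical.

```python
def learn_models(neighbor_ids, regions, labels, num_labels):
    """Mean region feature per label, using only already-labeled neighbors."""
    sums = [0.0] * num_labels
    counts = [0] * num_labels
    for n in neighbor_ids:
        for feat, lab in zip(regions[n], labels[n]):
            if lab is not None:
                sums[lab] += feat
                counts[lab] += 1
    return [sums[l] / counts[l] if counts[l] else None for l in range(num_labels)]


def assign_regions(feats, models):
    """Give each region the label whose model mean is closest."""
    known = [(lab, m) for lab, m in enumerate(models) if m is not None]
    if not known:
        return [None] * len(feats)
    return [min(known, key=lambda lm: abs(f - lm[1]))[0] for f in feats]


def cosegment(graph, regions, labels, num_labels, seeds, iterations=5):
    """Supervised variant: seed images keep their labels, others are updated."""
    labels = {i: list(l) for i, l in labels.items()}
    for _ in range(iterations):
        for i in graph:
            if i in seeds:
                continue
            models = learn_models(graph[i], regions, labels, num_labels)
            labels[i] = assign_regions(regions[i], models)
    return labels


# Toy chain img0 - img1 - img2; img0 carries the seed labels (0 = FG, 1 = BG).
graph = {"img0": ["img1"], "img1": ["img0", "img2"], "img2": ["img1"]}
regions = {"img0": [0.1, 0.9], "img1": [0.2, 0.8], "img2": [0.85, 0.15]}
labels = {"img0": [0, 1], "img1": [None, None], "img2": [None, None]}
result = cosegment(graph, regions, labels, num_labels=2, seeds={"img0"})
print(result)
```

In this toy run the seed labels reach img2 only through img1, which mirrors how the image graph lets label information travel between images that are not directly connected.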

  24. Summary of Algorithm • As a first technical step, jointly perform two mutually rewarding tasks with message-passing style optimization • Alignment: build a K-NN graph between photo streams (PS); align all photo streams at once guided by the PS graph; O(TL) where L: # of photo streams • Cosegmentation: build a K-NN graph between images; cosegment all images at once guided by the image graph; O(TN) where N: # of images • Computation time: O(T|E|)

  25. Outline • Problem Statement • Algorithm • Dataset and preprocessing • Alignment of Multiple Photo Streams • Large-scale Cosegmentation • Experiments • Conclusion

  26. Evaluation – Two Experiments • Evaluation for alignment • Very hard to obtain ground truth! Correspondences between two sets of thousands of images? • Task: temporal localization, inspired by geo-location estimation [Hays and Efros. 2008]: "Where is it likely to be taken?" becomes "When are they likely to be taken?" • Evaluation for cosegmentation • Task: foreground detection • We manually annotate 100 images per class • Accuracy is measured by intersection-over-union
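
For reference, intersection-over-union between a predicted foreground mask and an annotated one can be computed as in this minimal sketch. The nested-list mask format and the convention of returning 1.0 when both masks are empty are assumptions of the example.

```python
def intersection_over_union(pred, truth):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly (assumed convention).
    return inter / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(intersection_over_union(pred, truth))  # 2 shared pixels / 4 in the union = 0.5
```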

  27. Evaluation of Alignment • Procedure of temporal localization • 1. Given a set of photo streams, randomly split them into training (80%) and test (20%) sets • 2. Run alignment • 3. Estimate timestamps of all images in test photo streams • 4. Temporal localization is correct if the estimated timestamp is within a threshold of the true timestamp • Better temporal localization ≠ Better Alignment • Baselines • BPS: our alignment + cosegmentation • BP: our alignment only (BPS vs. BP: justify closing a loop) • KNN: K-nearest neighbors, image similarity only (the simplest) • HMM: Hidden Markov Models • DTW: Dynamic Time Warping (HMM and DTW: popular multiple sequence alignment)

  28. Evaluation of Alignment • Temporal localization is correct if the estimated timestamp is within a threshold of the true timestamp (threshold = 60 min.) • Methods compared: BPS: our alignment + cosegmentation; BP: our alignment only; KNN: K-nearest neighbors; HMM: Hidden Markov Models; DTW: Dynamic Time Warping
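
The temporal localization score can be illustrated as the fraction of test images whose estimated timestamp lands close enough to the true one. The exact correctness condition is not legible in the transcript, so the absolute-difference test and the inclusive comparison are assumptions; only the 60-minute threshold comes from the slide.

```python
def localization_accuracy(estimated, actual, threshold_min=60):
    """Fraction of images whose estimated time is within threshold_min minutes.

    estimated, actual: timestamps in minutes since midnight, same length.
    """
    if len(estimated) != len(actual):
        raise ValueError("need one estimate per ground-truth timestamp")
    if not estimated:
        return 0.0
    correct = sum(
        1 for e, a in zip(estimated, actual) if abs(e - a) <= threshold_min
    )
    return correct / len(estimated)


# Hypothetical timestamps for three test images.
est = [540, 600, 1080]   # 9:00, 10:00, 18:00
true = [560, 700, 1085]  # 9:20, 11:40, 18:05
accuracy = localization_accuracy(est, true)
print(accuracy)  # 2 of 3 estimates are within 60 minutes
```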

  29. Evaluation of Cosegmentation • Task: foreground detection • BP+MFC: (proposed) alignment + cosegmentation • MFC: our cosegmentation without alignment • COS: submodular optimization [Kim et al. ICCV11] • LDA: LDA-based localization [Russell et al. CVPR06] • Examples

  30. Outline • Problem Statement • Algorithm • Dataset and preprocessing • Alignment of Multiple Photo Streams • Large-scale Cosegmentation • Experiments • Conclusion

  31. Conclusion • Ultimate goal: building photo storylines from large-scale online images (examples: horse+riding, safari+park)

  32. Conclusion • Jointly perform alignment and cosegmentation • As a first technical step • Much remains to be done. • Code will be available soon. • Message-passing based optimization • A unified framework for both alignment and cosegmentation • Scalable (e.g., computation time linear in # of photo streams and # of images)

  33. Thank you! Stop by our poster.
