Saliency-Assisted Navigation of Very Large Landscape Images

Saliency-Assisted Navigation of Very Large Landscape Images • Cheuk Yiu Ip • Amitabh Varshney

Very Large Landscape Images • Image Acquisition: • Gigapan • MS HD View, ToG 2007 • Image Stitching: • Kazhdan et al. ToG 2008, 2010 • Summa et al. ToG 2010 • Stitch images to create multi-gigapixel very large images • But WHERE should we start looking?

Visual Knowledge Discovery • Visual knowledge discovery • Identify what is interesting • Visualize them

Results Preview

Visual Scalability

Information Scalability • Design effective algorithms to process large images • The SMALL unique regions in the large images contain the MOST information • Identify informative regions from repetitive scene elements

Data Scalability • Very large images represent a large amount of data: 5 Gpix RGBA = 20 GB uncompressed • Multicore and manycore parallel processing • Requires efficient O(n) algorithms and out-of-core GPU methods
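The 20 GB figure above follows from simple arithmetic. A minimal sketch, assuming 8 bits per channel and decimal gigabytes (the function name is illustrative, not from the talk):

```python
def uncompressed_bytes(pixels: int, channels: int = 4, bytes_per_channel: int = 1) -> int:
    """Raw storage for an image, assuming no compression and no padding."""
    return pixels * channels * bytes_per_channel

# 5 gigapixels, RGBA, one byte per channel (assumed)
size = uncompressed_bytes(5_000_000_000)
print(size / 1e9, "GB")  # prints 20.0 GB
```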

Overview • Sliding-Window Saliency Map • Detection of Anomalous Regions • Interactive Exploration

Traditional Multiscale Image Saliency • Detects “pop-out” spots in the scene • Inspired by the human visual system • Pre-attentive vision • Find multiscale contrasting regions • Intensity and color opponencies (I, RG, BY) • Convolve (I, RG, BY) with a Difference of Gaussians (DoG) filter (σ is the standard deviation) • Repeat on downsampled images for multiple scales • Image Saliency • Itti et al. PAMI 1998 • Bruce et al. IJCV 2009 • Goferman et al. CVPR 2010 • These methods work on small images: very accurate but slow.
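The DoG step above can be sketched for a single channel (for example I) in plain Python. This is an illustrative sketch, not the authors' implementation: the surround-to-center ratio of 2, the 3σ kernel truncation, and the clamped borders are all assumptions.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian, truncated at 3 sigma (assumed cutoff)."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(row, kernel):
    """Convolve one row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(row)
    out = []
    for x in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            xi = min(max(x + k - radius, 0), n - 1)
            acc += w * row[xi]
        out.append(acc)
    return out

def blur_2d(img, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(r, kernel) for r in img]
    cols = [blur_1d(list(c), kernel) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def dog_response(channel, sigma, ratio=2.0):
    """|G(sigma) - G(ratio * sigma)| applied to one channel; ratio is assumed."""
    center = blur_2d(channel, sigma)
    surround = blur_2d(channel, sigma * ratio)
    return [[abs(c - s) for c, s in zip(cr, sr)]
            for cr, sr in zip(center, surround)]

# A single bright pixel on a dark background "pops out" in the response.
spot = [[0.0] * 15 for _ in range(15)]
spot[7][7] = 1.0
response = dog_response(spot, 1.0)
```

A flat image yields a response of (numerically) zero everywhere, which is why uniform sky or rock faces contribute little to the saliency map.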

Multiscale Aggregation • Works well on small images • If we have many more scales … • Large regions dominate small regions • Wait… we don’t want to miss the small regions • Traditional multiscale saliency is insufficient

Our Sliding-Window Aggregation • We see different things at different zoom levels • One saliency map per level • Only aggregate up to 4x • Use a sliding window across scales • Why 4x? • Eye resolution difference ~5x • [Figure panels: σ – 4σ, 4σ – 16σ, 16σ – 64σ, All (σ – 256σ)]
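The sliding window across scales can be sketched as follows. The slide only states that aggregation is limited to a 4x range; the dyadic scale spacing (σ, 2σ, 4σ, ...), the one-level stride, and the per-pixel summation used here are assumptions for illustration.

```python
def scale_windows(num_levels, max_ratio=4):
    """Return groups of scales (as multiples of sigma) spanning at most max_ratio."""
    scales = [2 ** i for i in range(num_levels)]  # assumed dyadic spacing
    windows = []
    for lo in scales:
        hi = lo * max_ratio
        if hi > scales[-1]:
            break  # keep only windows that span the full 4x range
        windows.append([s for s in scales if lo <= s <= hi])
    return windows

def aggregate(maps_by_scale, window):
    """Sum the per-scale saliency maps that fall inside one window (assumed rule)."""
    selected = [maps_by_scale[s] for s in window]
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*selected)]

# Nine levels cover sigma through 256 sigma, as in the "All" panel.
for window in scale_windows(9):
    print(window)
```

Each window yields its own saliency map, so a small salient region at σ is never outweighed by a large region at 64σ.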

There are still too many regions… • 18,000+ regions in 1.3 Gpix (5 hours if a user spends 1 s on each) • Regions are enlarged for visibility • There are many contrasting repetitive elements

Information Discovery • Identify the informative regions from the salient regions • Compare regions to find the most different ones • Detect the anomalous regions and outliers • Visual Data Analysis • Mesh and Volume Saliency (Lee et al. ToG 2005, Kim et al. TVCG 2006) • Video Summarization (Daniel et al. Vis 2003) • Flow and Information Theory (Janicke et al. TVCG 2010) • Molecular Dynamics Layout (Patro et al. Biovis 2011)

Image Region Descriptors • Represent salient regions by histograms (rotational invariance) • Global colors (RGB, HSV, CIELAB): not discriminative • Local edges: too discriminative • Histograms of colors in 8x8 moving windows work well (MPEG-7 CSD) • Compare histograms p, q by the Euclidean distance
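A simplified sketch of a color-structure histogram and the Euclidean comparison. It assumes the region is already quantized to integer color indices; the full MPEG-7 CSD also specifies a color space and quantization that are omitted here, and the function names are illustrative.

```python
import math

def color_structure_histogram(region, num_colors, window=8):
    """For each color, the fraction of window positions that contain it."""
    height, width = len(region), len(region[0])
    if height < window or width < window:
        raise ValueError("region must be at least window x window")
    counts = [0] * num_colors
    positions = 0
    for y in range(height - window + 1):
        for x in range(width - window + 1):
            present = {region[yy][xx]
                       for yy in range(y, y + window)
                       for xx in range(x, x + window)}
            for color in present:
                counts[color] += 1
            positions += 1
    return [c / positions for c in counts]

def euclidean(p, q):
    """Euclidean distance between two histograms of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

plain = [[0] * 8 for _ in range(8)]
marked = [row[:] for row in plain]
marked[3][3] = 1  # one pixel of a second color
p = color_structure_histogram(plain, num_colors=2)
q = color_structure_histogram(marked, num_colors=2)
print(euclidean(p, q))
```

Because the histogram counts window positions rather than pixels, a single stray pixel of a new color registers strongly, which is what makes the descriptor sensitive to local structure.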

k-Nearest-Neighbors Anomaly Detection • Uniqueness, U(p), is the average distance of p to its k-Nearest-Neighbors. • Repeating regions have a low U(p) • Distinct regions have a high U(p) • Spatial data structures (kD-trees) accelerate the retrieval
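The uniqueness score U(p) can be sketched with a brute-force neighbor search. Note that this is O(n²) and stands in for the kD-tree retrieval mentioned above; the function name is illustrative.

```python
import math

def uniqueness(descriptors, k):
    """U(p): average distance from each descriptor to its k nearest neighbors."""
    if len(descriptors) < 2:
        raise ValueError("need at least two descriptors")
    scores = []
    for i, p in enumerate(descriptors):
        dists = sorted(math.dist(p, q)
                       for j, q in enumerate(descriptors) if j != i)
        nearest = dists[:k]
        scores.append(sum(nearest) / len(nearest))
    return scores

# Three similar descriptors and one outlier.
descriptors = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
scores = uniqueness(descriptors, k=2)
print(scores.index(max(scores)))  # prints 3, the outlier
```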

Where are they … ? • Top 3% (500) of the most distinct regions. • Most of the repeating regions are eliminated. • Can you see the remaining regions?

Visualizing the Detected Regions • Problem: Small regions of interest are NOT visible • Adaptively enlarge regions • Determine the scale and colors by the region’s rank of uniqueness • Increase when zooming out • Decrease when zooming in • (Formula in paper)

Automatic Exploration • Explore the regions in descending order of their uniqueness • k-NN anomaly detection step provides uniqueness ordering
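The exploration order is a sort on the uniqueness scores. A minimal sketch, with the 3% cutoff from the earlier slide as a default and a hypothetical function name:

```python
def exploration_order(scores, keep_fraction=0.03):
    """Indices of the most unique regions, in descending order of uniqueness."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[:keep]

scores = [0.1, 0.9, 0.5, 0.7]
print(exploration_order(scores, keep_fraction=0.5))  # prints [1, 3]
```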

Interactive Refinement • Locate similar undesired regions • Select a representative • Move the slider to adjust the coverage • Delete the selection • The spatial data structure indexes the regions and provides fast retrieval
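One way to picture the refinement step is as a radius query in descriptor space. In this sketch the slider value is assumed to map to a distance radius around the selected representative, and a linear scan stands in for the spatial data structure; both are assumptions, not details from the talk.

```python
import math

def refine(descriptors, representative, radius):
    """Indices of regions kept after deleting those within radius of the representative."""
    rep = descriptors[representative]
    return [i for i, d in enumerate(descriptors)
            if math.dist(rep, d) > radius]

descriptors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
# Select region 0 as the undesired representative; a larger radius covers more regions.
print(refine(descriptors, representative=0, radius=0.15))  # prints [2, 3]
print(refine(descriptors, representative=0, radius=0.5))   # prints [3]
```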

After User Refinement • The remaining 300 regions after 3 refinement interactions

Data Scalability • GPU out-of-core saliency computation • Break the image into tiles • Parallel Gaussian filtering on GPU • Filter overlapping boundary tiles to maintain continuity • Saliency map storage • Fit and store ellipses of the salient regions • Do not store an extra image • Tiled Image Viewer • View-dependent loading and prefetching of mipmap image tiles for smooth pan and zoom
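The tiling with overlapping boundaries can be sketched on the CPU side as a list of tile rectangles. Each tile carries a padded rectangle to load and filter, and a core rectangle to keep. The halo width is an assumption; a common choice is about 3σ of the largest Gaussian so that the filtered core matches the untiled result.

```python
def tile_bounds(width, height, tile, halo):
    """Return (core, padded) rectangles as (x0, y0, x1, y1), end-exclusive."""
    tiles = []
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            x1 = min(x0 + tile, width)
            y1 = min(y0 + tile, height)
            core = (x0, y0, x1, y1)
            padded = (max(x0 - halo, 0), max(y0 - halo, 0),
                      min(x1 + halo, width), min(y1 + halo, height))
            tiles.append((core, padded))
    return tiles

# A 10 x 4 image split into 4-pixel tiles with a 1-pixel halo.
for core, padded in tile_bounds(10, 4, tile=4, halo=1):
    print(core, padded)
```

Only the padded rectangle of one tile needs to be resident on the GPU at a time, which is what keeps the computation out-of-core.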

Royal Gorge Bridge (1.4 Gpix)

Cacti (4.0 GPix)

Mount Whitney (5.0 GPix)

Gigapan Community Tags Grimsel Pass Royal Gorge Bridge

Gigapan Community Tags Cacti Mount Whitney

Limitations • Buffelgrass after fire • The “Original” cactus • Tags with semantic information • Domain knowledge necessary • Why are they tagged?

Performance • Each GPix takes 2.5 1 hours to preprocess (1 NVIDIA GeForce GTX 285 GPU and 1 CPU) • Each interaction takes 10 ms

Conclusions • A first step toward visual knowledge discovery in very large landscape images • Visual Scalability: Sliding-Window Saliency • Information Scalability: Anomaly Detection • Data Scalability: Parallel filtering, Saliency Storage • Interactive Navigation

Future work • There are a lot of very large images • Astronomy • Microscopy • Product inspection • Urban Scenes • Domain specific descriptors • Fast discovery of locally distinct regions. • Accurate Identification of globally unique regions.

Acknowledgements • National Science Foundation: CCF 05-41120, CMMI 08-35572, CNS 09-59979 • NVIDIA CUDA Center of Excellence Program • Derek Juba, Sujal Bista, Rob Patro, Icaro da Cunha, Yang Yang, Adil Yalcin, and the reviewers for improving this paper and presentation • The Vis paper award committees • Thank you!

Questions? • Please see our websites for the paper and video: • Cheuk Yiu Ip • www.cs.umd.edu/~ipcy/ • GVIL Research Highlights • www.cs.umd.edu/gvil/
