1 / 32

Large-scale Satellite Image Browsing using Automatic Semantic Categorization and Content-based Retrieval

Large-scale Satellite Image Browsing using Automatic Semantic Categorization and Content-based Retrieval. Ashish Parulekar, Ritendra Datta, Jia Li, James Z. Wang The Pennsylvania State University http://riemann.ist.psu.edu/image. Outline. The Problem The Data Related Work Our Approach

sandra_john
Download Presentation

Large-scale Satellite Image Browsing using Automatic Semantic Categorization and Content-based Retrieval

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Large-scale Satellite Image Browsing using Automatic Semantic Categorizationand Content-based Retrieval Ashish Parulekar, Ritendra Datta, Jia Li, James Z. Wang The Pennsylvania State University http://riemann.ist.psu.edu/image

  2. Outline • The Problem • The Data • Related Work • Our Approach • Results • Conclusion

  3. Problem: How to effectively query on a very large collection of satellite imagery for semantically meaningful regions ? Use semantic categorization combined with CBIR on small “patches” of the images

  4. Query Formulation Query “patch” – pertaining to some semantics, e.g. mountains Satellite Image Database Ranked Results • Purpose ? • Geography - Find mountainous regions with snow-caps (low-level semantics). • Forestry – Find forests of a certain density, analyze deforestation (mid-level semantics). • Military – Find air-bases in certain regions of the world (high-level semantics).

  5. Outline • The Problem • The Data • Related Work • Our Approach • Results • Conclusion

  6. The Data • Freely available multi-spectral imagery – • Satellite: Landsat 7 • Sensor: Enhanced Thematic Mapper Plus (ETM+) • WWW: http://glcf.umiacs.umd.edu/data/landsat/ • Resolution: ~ 30 Meters. • Seven Bands (R, G, B, NIR, SWIR1, SWIR2, TIR) • Size (approximate) • 6000 x 6600 pixels • 180 Km x 196 Km • ~ 64MB ( TIFF format, each band) Visual NIR SWIR

  7. Spectral Bands of TM and ETM+ Landsat Images

  8. Outline • The Problem • The Data • Related Work • Our Approach • Results • Conclusion

  9. Classifiers • Rigid Classifiers • Neural networks • P. D. Heerman and N. Khazenie • H. Bischof and W. Schneider • Decision trees • A. S. Kumar and K. L. Majumder • Genetic algorithms • B. C. K. Tso and P. M. Mather • Markov Random Fields • A. H. S. Solberg et al. • Soft classifiers • Fuzzy Logic • G. M. Foody • L. Bastin • Decision Fusion • J. A. Benediktsson and I. Kanellopoulos • L. Jimenez et al.

  10. Improving Performance with Features/Data • Texture to improve performance • R. M. Haralick et al. • M. F. Augusteijn et al. • B.S. Manjunath et al. • Structure to improve performance • P. Gong and P. J. Howarth • Multi-source data • L. Bruzzone et al. • E. Binaghi et al. Browsing and Querying Archives • Content-based Image Retrieval • W. Y. Ma and B. S. Manjunath • C.-S. Li and V. Castelli • S. Newsam et al. • Information Mining • M. Datcu and K. Seidel • H. Daschiel and M. Datcu

  11. Limitations of Current Approaches • Difficulty in defining classes • Continuum of variety • Subjectivity • No major improvement in classification accuracy • G. G. Wilkinson, “Results and Implications of a Study of Fifteen Years of Satellite Image Classification Experiments,” IEEE Trans. On Geoscience and Remote Sensing, 2005. • Recognition of higher-order semantics • More focus on spectral than spatial dimensions • Scalability of complex querying and browsing • Flexibility to handle diverse applications

  12. Outline • The Problem • The Data • Related Work • Our Approach • Results • Conclusion

  13. Summary of Our Approach • Practical implementation on large-scale archives • Flexibility to adapt to various applications of satellite imagery, e.g., Geography, Military, Metallurgy, Agriculture etc. • Exploiting spectral and spatial information using a generative model for semantic categorization (supervised learning) • Handling of untrained classes (supervised learning) • Using a scalable CBIR system for efficient querying and browsing (unsupervised)

  14. Training and Querying Patches • Patches • 128 x 128 pixels – NIR Bands (why ?) • 40 samples per semantic category • Military/ Geography Application – 4 classes/categories • Difficulties • Ambiguity • Untrained Classes • Normalization Mountain Crop Field Residential Urban

  15. Overview of the System • Classification • Generative Classifier: Two-dimensional Multi-resolution Hidden Markov Models (2-D MHMM) • Handling untrained classes • Discriminative Classifier: Support Vector Machines • Querying for fast retrieval • Integrated Region Matching (IRM) measure used in SIMPLIcity system Query patch Database search using IRM 40 x K training patches K trained 2D-MHMMs Patch Class K Likelihood scores Random samples from trained and untrained classes Trained biased SVM Query Results

  16. Details of the Architecture • Efficient database design • Pre-computation of classes using 2D-MHMM for faster response • Suitable user interface and visualization (Common CBIR issue)

  17. 2-D MHMM • Extends the HMM to 2-D and over multiple resolutions by a Markov chain • J. Li et al., “Multiresolution Image Classification by Hierarchical Modeling with Two-Dimensional Hidden Markov Models,” IEEE Trans. on Information Theory, 2000. • Reflects inter-scale and intra-scale dependence • States: hierarchical Markov mesh, unobservable • Features: Color (LUV space) and texture (Daubechies wavelet) on 4x4 blocks • Builds models of semantic categories from training samples

  18. 2-D MHMM • Start from the conventional 1-D HMM • Extend to 2-D transitions • Conditional Gaussian distributed feature vectors • Then add Markovian statistical dependence across resolutions • Use EM algorithm for parameter estimation • Shown effective in modeling generic semantic categories for image annotation • J. Li and J. Z. Wang, “Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach,'' IEEE Trans. on Pattern Analysis and Machine Intelligence, 2003.

  19. Training process

  20. Automatic Semantic Categorization

  21. SVM • Not practical to train all concepts • May not be available or required • May not be meaningful to the given context • May be too costly • Samples not belonging to a trained classes add noise to the query results • Solution: Automatically determine untrained sample and eliminate from consideration • Train an SVM (RBF Kernel) on the likelihood scores for samples from trained and untrained concepts • Bias classification towards low false negatives

  22. IRM • IRM defines an image-to-image distance as a weighted sum of region-to-region distances • Different names in different fields (e.g., Mallows Distance) • Weighting matrix is determined based on significance constraints and a ‘MSHP’ greedy algorithm • Successfully used in the generic content-based image retrieval system SIMPLIcity • J. Z. Wang et al., “Semantics-Sensitive Integrated Matching for Picture Libraries,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 2001.

  23. Post-retrieval Browsing

  24. Outline • The Problem • The Data • Related Work • Our Approach • Results • Conclusion

  25. Classification Results 2D-MHMM: SVM: Patch is from a trained class when it actually is : 96.05% Patch is from an untrainedclass when it actually is: 53.7%

  26. Retrieval Results http://riemann.ist.psu.edu/image

  27. Retrieval Results http://riemann.ist.psu.edu/image

  28. More Retrieval Results

  29. Retrieval Accuracy

  30. Conclusions • Proposed a convenient learning-based approach for large-scale browsing and retrieval of satellite image patches • Combine semantic categorization and CBIR • 2-D MHMM and IRM complement each other to boost retrieval performance • SVM has been effectively used to deal with patches from untrained categories • Lack of Homogeneity in Data • Sun Elevation • Seasonal changes of terrain • Appropriate normalization procedure ? Other solutions ? • Optimizing size of patch • Larger size – more efficient, can capture larger semantics • Smaller size – less ambiguous patches, more accurate results • What is optimal for a given Application ? Formal procedure to determine this ?

  31. Acknowledgement • This work was supported by the U.S. National Science Foundation • Discussions with satellite image analyst Bradley Farster at the Penn State Applied Research Laboratory were helpful

  32. Questions

More Related