Presentation Transcript


  1. A Comparison Between Bayesian Networks and Generalized Linear Models in the Indoor/Outdoor Scene Classification Problem

  2. Overview • Introduce Scene Classification Problems • Motivation for Scene Classification • Kodak's JBJL Database and Features • Bayesian Networks • Brief Overview (description, inference, structure learning) • Classification Results • GLM • Briefer Overview • Classification Results • Comparison and Conclusion

  3. Problem Statement: Given a set of consumer digital images, can we use a statistical model to distinguish between indoor images and outdoor images?

  4. Motivation • Kodak • Increase visual appeal by processing based on classification • Object Recognition • Provide context information which may give clues to scale, location, identity, etc.

  5. Procedure • Establish ground truth for all images • Perform feature extraction and confidence/probability mapping for features • Divide images into training and test sets • Use the training images to train a model to predict ground truth • Use the model to predict ground truth for the test set • Evaluate performance
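The procedure on this slide can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the 70/30 split fraction, and the accuracy metric are assumptions, not details from the slides.

```python
import random

def split_train_test(images, labels, train_frac=0.7, seed=0):
    """Shuffle (image, label) pairs and divide them into training and test sets."""
    pairs = list(zip(images, labels))
    random.Random(seed).shuffle(pairs)      # fixed seed for a repeatable split
    cut = int(train_frac * len(pairs))
    return pairs[:cut], pairs[cut:]

def accuracy(predict, test_pairs):
    """Fraction of test images whose predicted label matches the ground truth."""
    return sum(predict(x) == y for x, y in test_pairs) / len(test_pairs)
```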

  6. Kodak JBJL • Consumer image database • 615 indoor and 693 outdoor images • Some images are difficult even for the human visual system (HVS) to classify as indoor or outdoor • Some images have both indoor and outdoor parts

  7. Features and Probability Mapping • “Low-level” features • Ohta-space color histogram (color information) • MSAR model (texture information) • “Mid-level” features • Grass classifier • Sky classifier • K-NN used to map features to probabilities • Quantized to the nearest 10% (11 states for mid-level, 3 states for low-level)
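The K-NN probability mapping on this slide might look like the following sketch. It is hypothetical: the Euclidean distance, the value of k, and the outdoor-fraction vote are assumptions; only the nearest-10% quantization into 11 states comes from the slide.

```python
import math

def knn_prob_outdoor(feature, train_feats, train_labels, k=10):
    """Estimate P(outdoor | feature) as the outdoor fraction among the k
    nearest training points, then quantize to the nearest 10% (11 states)."""
    nearest = sorted(
        (math.dist(feature, f), lbl) for f, lbl in zip(train_feats, train_labels)
    )[:k]
    p = sum(lbl == "outdoor" for _, lbl in nearest) / k
    return round(p * 10) / 10    # one of 0.0, 0.1, ..., 1.0
```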

  8. Feature Probabilities and Classes

  9. Stat. Model 1: Bayesian Network • Graphical model • Variables are represented by vertices of a graph • Conditional dependencies are represented by directed edges • Conditional probability table (CPT) associated with each vertex • Quantifies vertex relationships • Facilitates automated inference

  10. Exact Inference • Model Joint Probability • Inference
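When the class node is the lone parent of every feature node (the structure the search later selects on slide 15), exact inference reduces to multiplying CPT entries and normalizing: P(c | evidence) ∝ P(c) · Π P(fᵢ | c). A minimal sketch, where the dictionary layout is my assumption:

```python
def posterior(prior, cpts, evidence):
    """Exact inference when the class is the only parent of each feature:
    P(c | evidence) is proportional to P(c) * prod_i P(f_i = e_i | c)."""
    unnorm = {}
    for c, pc in prior.items():
        p = pc
        for feat, val in evidence.items():
            p *= cpts[feat][c][val]     # CPT lookup: P(feature = val | class = c)
        unnorm[c] = p
    z = sum(unnorm.values())            # normalizing constant
    return {c: p / z for c, p in unnorm.items()}
```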

  11. Structure Learning Search Space • Space of BNs • Variable-state combinations • Product of #states over all the nodes • 2,178 possible (11 × 11 × 3 × 3 × 2) • Structures • Limited to DAGs • 29,281 possible
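The 29,281 figure matches the number of labelled DAGs on five nodes, which can be checked with Robinson's recurrence. The recurrence itself is standard; using it here to verify the slide's count is my addition.

```python
from math import comb

def count_dags(n):
    """Number of labelled DAGs on n nodes via Robinson's recurrence:
    a(n) = sum_{k=1}^{n} (-1)^(k+1) * C(n,k) * 2^(k(n-k)) * a(n-k), a(0) = 1."""
    a = [1]
    for m in range(1, n + 1):
        a.append(sum((-1) ** (k + 1) * comb(m, k) * 2 ** (k * (m - k)) * a[m - k]
                     for k in range(1, m + 1)))
    return a[n]
```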

  12. Scoring Metric • Score a structure based on how well it models the data • We have an expression for the probability of the data given the structure • Unfortunately, that probability is difficult to estimate

  13. The Bayesian Dirichlet Likelihood Equivalent (BDe) • Can compare structures two at a time • What is the prior on structures? • Assume all structures are equally likely • Use #edges to penalize complex networks
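For reference, the BD family of scores (as presented in the Heckerman tutorial cited on slide 25) has the closed form below, where N_ijk counts the cases with node i in state k under parent configuration j and the α_ijk are Dirichlet prior parameters; comparing two structures amounts to taking a ratio of these scores.

```latex
P(D \mid G) \;=\; P(G) \prod_{i=1}^{n} \prod_{j=1}^{q_i}
  \frac{\Gamma(\alpha_{ij})}{\Gamma(\alpha_{ij} + N_{ij})}
  \prod_{k=1}^{r_i} \frac{\Gamma(\alpha_{ijk} + N_{ijk})}{\Gamma(\alpha_{ijk})},
\qquad
N_{ij} = \sum_{k} N_{ijk}, \quad \alpha_{ij} = \sum_{k} \alpha_{ijk}
```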

  14. Challenges • Not all structures can be considered when there is only a small amount of data • Context dilution • Can't consider cases where the CPTs cannot be filled in • Finding an optimal structure is NP-hard

  15. BDe Structure For I/O Classification • Greedy algorithm with BDe scoring • Naïve Bayes Model!
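Once the greedy search settles on the naïve Bayes structure, fitting it is just counting. A minimal sketch of the maximum-likelihood CPT estimates, where the data layout is my assumption:

```python
from collections import Counter, defaultdict

def fit_naive_bayes(samples):
    """Maximum-likelihood CPT estimates for a naive Bayes model.
    samples: list of (class_label, {feature_name: state}) pairs."""
    class_counts = Counter(c for c, _ in samples)
    prior = {c: n / len(samples) for c, n in class_counts.items()}
    counts = defaultdict(Counter)               # counts[(feature, class)][state]
    for c, feats in samples:
        for f, v in feats.items():
            counts[(f, c)][v] += 1
    cpts = {key: {v: n / class_counts[key[1]] for v, n in states.items()}
            for key, states in counts.items()}
    return prior, cpts
```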

  16. Result Compared to Previous • Previous Results • Our Results

  17. Misclassified: Inferred Outdoor

  18. Misclassified: Inferred Indoor

  19. Generalized Linear Model • Outdoor vs. indoor can be thought of as a binary outcome • Logit link function
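The logit link maps a linear predictor onto a probability in (0, 1), which is what makes the binary-outcome view workable. A sketch of the model on this slide, where the feature vector x and coefficients beta are placeholders:

```python
import math

def logit_prob(x, beta):
    """Bernoulli GLM with a logit link: P(y = outdoor | x) = sigmoid(x . beta)."""
    eta = sum(xi * bi for xi, bi in zip(x, beta))   # linear predictor
    return 1.0 / (1.0 + math.exp(-eta))             # inverse logit (sigmoid)
```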

  20. Likelihood for GLM • Newton-Raphson • Get estimates of the mean and variance (1st and 2nd derivatives) • Find the optimum based on the estimates (Taylor expansion) • Iterate • Generally, this quickly converges to the optimal solution
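The Newton-Raphson iteration on this slide can be illustrated with a one-coefficient logistic model. This is a toy sketch of my own: the real fit has one coefficient per feature and solves a linear system at each step instead of a scalar division.

```python
import math

def fit_logistic_1d(xs, ys, iters=25):
    """Newton-Raphson for P(y = 1 | x) = sigmoid(b * x): repeatedly move b
    by (1st derivative) / (-2nd derivative) of the log-likelihood."""
    b = 0.0
    for _ in range(iters):
        ps = [1.0 / (1.0 + math.exp(-b * x)) for x in xs]
        score = sum(x * (y - p) for x, y, p in zip(xs, ys, ps))   # 1st derivative
        hess = -sum(x * x * p * (1 - p) for x, p in zip(xs, ps))  # 2nd derivative
        if abs(hess) < 1e-12:
            break
        b -= score / hess
    return b
```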

  21. Side by Side Comparison

  22. Misclassified: Predicted Outdoor

  23. Misclassified: Predicted Indoor

  24. Conclusion • The newer Bayesian network model may perform classification slightly better than the GLM • The BN is more computationally intensive • It is unclear whether there is in fact a difference • Both models have difficulty with the same images • Better to introduce new data than to use a new model • A new model gives (at most) marginal improvement

  25. References • Heckerman, D. A Tutorial on Learning with Bayesian Networks. In Learning in Graphical Models, M. Jordan, ed. MIT Press, Cambridge, MA, 1999. • Murphy, K. A Brief Introduction to Graphical Models and Bayesian Networks, http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html (viewed 4/1/08). • Lehmann, E.L. and Casella, G. Theory of Point Estimation, 2nd edition. • Weisberg, S. Applied Linear Regression, 3rd edition.

  26. Probability of the Data Given the Model
