Learning of Data Collections in High-dimensional Spaces Without Supervision

Djemel Ziou, NSERC/Bell Canada Chair in Personal Imaging, Computer Science Dept., Université de Sherbrooke, Quebec, Canada. Content: visual collection management, machine learning, image segmentation.


Presentation Transcript


  1. Learning of Data Collections in High-dimensional Spaces Without Supervision. Djemel Ziou, NSERC/Bell Canada Chair in Personal Imaging, Computer Science Dept., Université de Sherbrooke, Quebec, Canada

  2. Content • Visual collection management • Machine learning • Image segmentation • Content-based image suggestion

  3. Visual collection management

  4. Motivations NSF 2007, B. Efron 2002.

  5. Reactive Access to Collections. Short-term need: the user queries an information retrieval system • Content-based image retrieval • Visual appearance: color, shape, texture, regions of interest, … • Limitations • Query, features, similarity, indexing, … • Text-based image retrieval • Text: keywords extracted from Web pages containing the image, figure captions, …

  6. Proactive Access to Collections. Predict the buyers' needs: suggestion. Suggestion rules • 1. Collaboration: users' conformity to groups, i.e. opinions of other users • 2. Content: the user's conformity to himself • Items with the same tags (keywords)

  7. Machine Learning

  8. Introduction • Representation of stimulus

  9. Introduction • Data: observations X and class labels C • Generative learning: under certain assumptions (structural, MAP) • Discriminative learning: unlike generative learning, (1) it provides no information about x; (2) it cannot be used with unlabelled data (C must be observed).

  10. Discriminative Learning: Bayesian Logistic Regression (Ksantini, Ziou, Colin, Dubeau, IEEE Trans. on PAMI, 2008) • Maximizing the conditional log-likelihood • There are several drawbacks (high dimension, separability, …) • Bayesian formulation
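The conditional log-likelihood that this slide maximizes is not preserved in the transcript. As a minimal sketch, assuming the standard binary logistic model p(c=1|x) = σ(b + w·x) with labels c in {0, 1} (the names `w`, `b`, `xs`, `cs` are illustrative and not taken from the cited paper), it could be evaluated as follows:

```python
import math


def softplus(t):
    """Numerically stable log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def conditional_log_likelihood(w, b, xs, cs):
    """Sum of log p(c_i | x_i) under a binary logistic model.

    w  : list of weights (assumed name)
    b  : bias term (assumed name)
    xs : list of feature vectors
    cs : list of labels in {0, 1}
    """
    total = 0.0
    for x, c in zip(xs, cs):
        t = b + sum(wj * xj for wj, xj in zip(w, x))
        # log p(c=1|x) = t - softplus(t); log p(c=0|x) = -softplus(t)
        total += c * t - softplus(t)
    return total


if __name__ == "__main__":
    xs = [[1.0], [2.0]]
    cs = [0, 1]
    print(conditional_log_likelihood([0.0], 0.0, xs, cs))   # uninformative model
    print(conditional_log_likelihood([1.0], -1.5, xs, cs))  # better separating model
```

When the classes are linearly separable this quantity can be driven arbitrarily close to zero by scaling `w`, which is one form of the separability drawback the slide mentions and one motivation for placing a prior on the weights.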

  11. Variational approximation and Jensen’s inequality lead to:

  12. Generative learning: case of finite mixture of pdfs • Finite mixture model. Problems: pdf, estimation, model selection, … • Which pdf? Gaussian, Gamma, …; same or different pdfs for the populations • Mixture of different pdfs for SAR images (El Zaart and Ziou, Int. J. Remote Sensing, 2007) • [Figure: moment diagram, third moment vs. fourth moment: Gaussian (point), Gamma (line), Beta (area)]
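A finite mixture density is a weighted sum of component densities, p(x) = Σ_j π_j p_j(x), with weights that sum to one. The sketch below illustrates the "different pdfs for different populations" idea with one Gaussian and one Gamma component; the function names and parameter values are illustrative assumptions, not the model from the cited SAR paper.

```python
import math


def gaussian_pdf(x, mean, std):
    """Univariate Gaussian density."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def gamma_pdf(x, shape, scale):
    """Gamma density with shape/scale parameterization; zero for x <= 0."""
    if x <= 0.0:
        return 0.0
    log_p = ((shape - 1.0) * math.log(x) - x / scale
             - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_p)


def mixture_pdf(x, weights, components):
    """Evaluate sum_j weights[j] * pdf_j(x).

    components is a list of (pdf_function, parameter_tuple) pairs, so each
    population may use a different family of densities.
    """
    return sum(w * pdf(x, *params) for w, (pdf, params) in zip(weights, components))


if __name__ == "__main__":
    weights = [0.3, 0.7]
    components = [(gaussian_pdf, (0.0, 1.0)), (gamma_pdf, (2.0, 2.0))]
    for x in (-1.0, 0.5, 3.0):
        print(x, mixture_pdf(x, weights, components))
```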

  13. The Generalized Dirichlet Distribution (GDD)
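The density shown on this slide is not preserved in the transcript. For reference, the generalized Dirichlet density is commonly written in the Connor-Mosimann form below; the symbols α_d, β_d and γ_d are the usual textbook notation and are assumed here rather than copied from the slide.

```latex
p(x_1,\dots,x_D \mid \alpha_1,\beta_1,\dots,\alpha_D,\beta_D)
  = \prod_{d=1}^{D}
    \frac{\Gamma(\alpha_d+\beta_d)}{\Gamma(\alpha_d)\,\Gamma(\beta_d)}\;
    x_d^{\alpha_d-1}
    \Bigl(1-\sum_{j=1}^{d} x_j\Bigr)^{\gamma_d},
\qquad x_d > 0,\quad \sum_{d=1}^{D} x_d < 1,

\gamma_d = \beta_d - \alpha_{d+1} - \beta_{d+1} \quad (d = 1,\dots,D-1),
\qquad \gamma_D = \beta_D - 1 .
```

Each component therefore carries 2D shape parameters, which is consistent with the count of M(2D+1) parameters quoted later for a mixture of M components (2D shape parameters plus one mixing weight per component).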

  14. Multi-dimensionality is Omnipresent • Multidimensional data • Image descriptors: 128 000 features (128 SIFT features × 1000 interest points) • Faces: 128 × 128 pixels = 16 384 features/face • Text: number of terms in a corpus ≈ 10 000

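  15. High-Dimensional Data (Bouguila and Ziou, IEEE Trans. on PAMI, 2007; Boutemedjet, Bouguila and Ziou, IEEE Trans. on PAMI, 2009) • If $\vec{X} = (X_1, \dots, X_D)$ is GDD($\alpha_1, \beta_1, \dots, \alpha_D, \beta_D$), define $W_d = X_d$ if $d = 1$ and $W_d = X_d / (1 - X_1 - \dots - X_{d-1})$ for $d = 2, \dots, D$ • Each $W_d$ is a Beta($\alpha_d, \beta_d$) variable, and the $W_d$ are mutually independent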
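The property used here is that a generalized Dirichlet vector can be mapped to independent Beta variables by dividing each coordinate by the mass not yet consumed by the preceding ones: W_1 = X_1 and W_d = X_d / (1 - X_1 - ... - X_{d-1}). A minimal sketch of the mapping and its inverse follows; the function names are hypothetical, and the sampler simply draws independent Betas and inverts the mapping.

```python
import random


def gd_to_independent(x):
    """Map a generalized Dirichlet vector x to w with
    w[0] = x[0] and w[d] = x[d] / (1 - x[0] - ... - x[d-1])."""
    w = []
    remaining = 1.0  # mass not yet used by earlier coordinates
    for xd in x:
        w.append(xd / remaining)
        remaining -= xd
    return w


def independent_to_gd(w):
    """Inverse mapping: rebuild x from the independent coordinates w."""
    x = []
    remaining = 1.0
    for wd in w:
        xd = wd * remaining
        x.append(xd)
        remaining -= xd
    return x


def sample_gd(alphas, betas, rng):
    """Draw one generalized Dirichlet vector from independent Beta draws."""
    w = [rng.betavariate(a, b) for a, b in zip(alphas, betas)]
    return independent_to_gd(w)


if __name__ == "__main__":
    rng = random.Random(0)
    x = sample_gd([2.0, 3.0, 4.0], [5.0, 6.0, 7.0], rng)
    print(x, sum(x))
    print(gd_to_independent(x))
```

Working in the transformed space lets each dimension be modelled by a one-dimensional Beta, which is what makes per-feature relevance decisions tractable in high dimension.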

  16. Feature Selection • Mixture model before and after transformation:

  17. Feature Selection Model (Boutemedjet, Bouguila and Ziou, IEEE Trans. on PAMI, 2009) • Relevance criterion: marginal independence of X_l from the class label Z • Label X_l with a hidden Bernoulli variable φ_l, such that φ_l = 0 when X_l ~ ξ_l • General definition: ξ_l = mixture of K components ξ_kl • e.g. distribution of the background in object images • Label X_l in the mixture ξ_l by a hidden multinomial variable • Approximation: • New mixture model: Generalized Dirichlet (GD) with selection of independent features

  18. Unsupervised Learning using the MML Principle (Bouguila and Ziou, IEEE Trans. on TKDE, 2007) • Paradigm: encode, send, decode. What is the minimum message length? • $N_p$ is the number of parameters being estimated, equal to $M(2D+1)$ • $h(\Theta)$ is the prior probability • $|F(\Theta)|$ is the Fisher information (determinant of the expected Hessian matrix of the negative log-likelihood) • Problems: $h(\Theta)$? and $|F(\Theta)|$?
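The message-length expression itself is missing from the transcript. A hedged sketch, assuming the usual Wallace-Freeman approximation MessLen ≈ -log h(Θ) - log p(X|Θ) + ½ log|F(Θ)| + (N_p/2)(1 + log κ), is given below. The function and argument names are illustrative, and taking the lattice constant κ as 1/12 is a simplifying assumption, not something stated on the slide.

```python
import math


def num_parameters(n_components, n_dims):
    """Parameter count quoted on the slide: M * (2D + 1)."""
    return n_components * (2 * n_dims + 1)


def message_length(log_prior, log_likelihood, log_det_fisher, n_params,
                   log_kappa=math.log(1.0 / 12.0)):
    """Wallace-Freeman style message length (in nats).

    log_prior      : log h(Theta)
    log_likelihood : log p(X | Theta)
    log_det_fisher : log |F(Theta)|
    n_params       : N_p, number of free parameters
    log_kappa      : log of the quantization lattice constant (assumed 1/12)
    """
    return (-log_prior - log_likelihood
            + 0.5 * log_det_fisher
            + 0.5 * n_params * (1.0 + log_kappa))


if __name__ == "__main__":
    n_p = num_parameters(n_components=3, n_dims=5)
    print(n_p)
    print(message_length(-10.0, -250.0, 40.0, n_p))
```

Model selection then amounts to fitting the mixture for several values of M and keeping the one with the shortest message.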

  19. Unsupervised Learning MML (Boutemedjet, Bouguila and Ziou, IEEE Trans. on PAMI, 2009) • Fisher information: • E.g. • Prior distribution: • E.g. • Message length of the data set

  20. Optimization of MML 2x2 matrix Expectation Maximization (EM) algorithm E-step: expected posterior probabilities M-step:
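The E-step computes, for every data point, the posterior probability of each mixture component. A minimal sketch is shown below, assuming the data have already been mapped to independent coordinates in (0, 1) so that each component is a product of Beta densities; all names are hypothetical. The weight update shown is the plain maximum-likelihood one, and the Beta shape parameters would need a numerical update that is not reproduced here.

```python
import math


def beta_log_pdf(w, a, b):
    """Log-density of Beta(a, b) at w in (0, 1)."""
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(w) + (b - 1.0) * math.log(1.0 - w))


def e_step(data, weights, params):
    """Posterior component probabilities (responsibilities).

    data    : list of vectors with entries in (0, 1)
    weights : mixing proportions, one per component
    params  : params[j] is a list of (a, b) pairs, one per dimension
    """
    responsibilities = []
    for w in data:
        log_terms = []
        for pi_j, comp in zip(weights, params):
            lp = math.log(pi_j)
            lp += sum(beta_log_pdf(wd, a, b) for wd, (a, b) in zip(w, comp))
            log_terms.append(lp)
        # log-sum-exp normalization to avoid underflow in high dimension
        m = max(log_terms)
        log_norm = m + math.log(sum(math.exp(t - m) for t in log_terms))
        responsibilities.append([math.exp(t - log_norm) for t in log_terms])
    return responsibilities


def update_weights(responsibilities):
    """Maximum-likelihood update of the mixing proportions (assumed form)."""
    n = len(responsibilities)
    n_components = len(responsibilities[0])
    return [sum(r[j] for r in responsibilities) / n for j in range(n_components)]


if __name__ == "__main__":
    data = [[0.75], [0.20]]
    params = [[(2.0, 1.0)], [(1.0, 2.0)]]
    resp = e_step(data, [0.5, 0.5], params)
    print(resp)
    print(update_weights(resp))
```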

  21. Object image categorization: a challenging problem in computer vision • Goal: identify categories and irrelevant features • Challenge: intra-class variability + inter-class similarity • Existing: supervised, K-NN with Euclidean distance • Collection: 2688 images, 8 classes • Features: • Scale Invariant Feature Transform (SIFT) ≈ 1.5 × 10^6 descriptors, 128-D (2 GB) • Visual vocabulary: 700 "visual words" • Probabilistic Latent Semantic Indexing (pLSI) • P(z|I): hidden aspects defined on the simplex, hence non-Euclidean

  22. Results • Feature selection improves the accuracy of image categorization

  23. Image segmentation and object tracking (M. S. Allili and Ziou, Int. J. of Computer Mathematics, 2007; M. S. Allili and Ziou, J. Neurocomputing, 2008)

  24. Problem formulation of segmentation • Active-contour-based approach • Variational formulation • [Figure: initial contour and final contour]

  25. Proposed approach Statistical Model selection Contrast estimation + Energy functional Euler-Lagrange PDE

  26. Topology change (Level sets) Experimental results

  27. Object tracking in video

  28. Content-Based Image Suggestion (CBIS) as a Model Selection Problem (Boutemedjet and Ziou, IEEE Trans. on Multimedia, 2008)

  29. Suggestion Criteria • Data • Users: U = {u1, u2, …, uNu} • Contexts: E = {e1, e2, …, eNe} • Images: X = {x1, x2, …, xNx} • Ratings of users on images: D = {(u(i), e(i), x(i), r(i)), i = 1, …, N} • Data modeling principle • Similar users prefer visually and semantically similar products • Suggestion: consumers need highly rated and less redundant products

  30. Data model: p(u,e,x,r) • Rating: model the data; each quadruplet (u,e,x,r) is a random vector • Discover user/image classes (z,c) and label (u,e,x,r) with 2 hidden variables: z: user class, c: image class • All variables except x are discrete (multinomial distributions); x ~ GD • Parameters: • Diversity: penalize predicted ratings for consumed images Xue • Consumed images become irrelevant: Nue = {(u, e, xtue, r-), t = 1, …, Nue} • Update Θ from Nue • New data are handled

  31. Algorithm

  32. Results: Mean Absolute Error (MAE) • Feature selection improves the rating prediction accuracy • PCC: Pearson Correlation Coefficients (P. Resnick et al., CSCW 1994) • Aspect Model (T. Hofmann, ACM TOIS 2004) • Flexible Mixture Model (L. Si & R. Jin, ICML 2003) • User Rating Profile (B. Marlin, NIPS 2004) • V-FMM: no contextual information, E = singleton • V-GD-FMM: no feature selection
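The evaluation metric named on this slide, the mean absolute error, is the average absolute difference between predicted and observed ratings. A small sketch with made-up ratings (the numbers are illustrative only, not results from the presentation):

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted_i - actual_i| over all rated items."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    predicted = [4.0, 3.0, 5.0]  # hypothetical predicted ratings
    actual = [5.0, 3.0, 2.0]     # hypothetical observed ratings
    print(mean_absolute_error(predicted, actual))
```

Lower values indicate better rating prediction, so a method that "improves accuracy" on this slide is one with a smaller MAE.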

  33. Thank you

  34. References
      • M. S. Allili, D. Ziou. Object tracking in videos using adaptive mixture models and active contours. Neurocomputing 7, pp. 2001-2011, 2008.
      • M. S. Allili, D. Ziou. Automatic colour-texture image segmentation using active contours. Int. J. Comput. Math. 84(9): 1325-1338, 2007.
      • S. Boutemedjet, D. Ziou. A Graphical Model for Context-Aware Visual Content Recommendation. IEEE Trans. on Multimedia 10, pp. 52-62, 2008.
      • S. Boutemedjet, N. Bouguila, D. Ziou (in press). A Hybrid Feature Extraction Selection Approach for High-Dimensional Non-Gaussian Data Clustering. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2009.
      • N. Bouguila, D. Ziou. High-Dimensional Unsupervised Selection and Estimation of a Finite Generalized Dirichlet Mixture Model Based on Minimum Message Length. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2007.
      • R. Ksantini, D. Ziou, B. Colin, F. Dubeau. Weighted Pseudometric Discriminatory Power Improvement Using a Bayesian Logistic Regression Model Based on a Variational Method. IEEE Trans. on Pattern Analysis and Machine Intelligence 30(2): 253-266, 2008.
      • D. Ziou, T. Hamri, S. Boutemedjet. A hybrid probabilistic framework for content-based image retrieval with feature weighting. Pattern Recognition 42(7): 1511-1519, 2009.
      • M. L. Kherfi, D. Ziou. Relevance feedback for CBIR: a new approach based on probabilistic feature weighting with positive and negative examples. IEEE Trans. on Image Processing 15(4): 1017-1030, 2006.
      • M.-F. Auclair-Fortier, D. Ziou. A Global Approach for Solving Evolutive Heat Transfer for Image Denoising and Inpainting. IEEE Trans. on Image Processing 15: 2558-2574, 2006.
      • A. F. El Ouafdi, D. Ziou, H. Krim. A smart stochastic approach for manifolds smoothing. Comput. Graphic Forum 27, pp. 1357-1364, 2008.
