
Neighboring Feature Clustering


Presentation Transcript


  1. Neighboring Feature Clustering Authors: Z. Wang, W. Zheng, Y. Wang, J. Ford, F. Makedon, J. Pearlman. Presenter: Prof. Fillia Makedon, Dartmouth College

  2. What is Neighboring Feature Clustering • Given an m × n matrix M, where m denotes the number of samples and n the number of (ordered) features, the goal is to find an intrinsic partition of the features, based on their characteristics, such that each cluster is a contiguous run of features. • We assume there is a natural ordering of the features that is relevant to the problem being solved. • E.g., in spectral datasets, such characteristics could be correlations. • For example, if features 1 and 10 belong to a cluster, then features 2 through 9 must also belong to it (a sketch follows).
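
A minimal sketch (ours, not the authors' code): a contiguous partition of n ordered features is fully described by its cluster boundaries, which is what the contiguity constraint buys us.

def boundaries_to_clusters(n, boundaries):
    # boundaries: sorted indices 0 < b < n, each starting a new cluster.
    # E.g. n = 10, boundaries = [4, 7] -> features {0..3}, {4..6}, {7..9}.
    edges = [0] + list(boundaries) + [n]
    return [list(range(a, b)) for a, b in zip(edges[:-1], edges[1:])]

print(boundaries_to_clusters(10, [4, 7]))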

  3. MR spectral features and DNA copy number • MR spectral features are highly redundant, suggesting that the data effectively lie in a low-dimensional space (the features have far fewer underlying degrees of freedom than the number of recorded dimensions). • Neighboring spectral features of MR spectra are highly correlated. • Using NFC, we can partition the features into clusters. • A cluster can then be represented by a single feature, reducing the dimensionality. • A similar idea applies, with variation, to DNA copy number (aCGH) analysis; see slide 11.

  4. Use the MDL method to solve NFC • Reduce NFC to a one-dimensional piece-wise linear approximation problem: given a sequence of n one-dimensional points <x1, ..., xn>, find the optimal step-function-like line segments that can be fitted to the points (Fig. 1). • Piecewise linear approximation [3][4] is usually 2D; here we use its concept in a 1D setting. • We use the minimum description length (MDL) method [2], defined on the next slide, to solve this reduced problem. • A sketch of the generic step-function fit follows.
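
A minimal sketch of the generic step-function fit (ours, for intuition): given fixed breakpoints, each segment of the 1D points is approximated by a constant level, and the fit error is the squared deviation from that level.

def stepwise_fit(x, boundaries):
    # Fit a step function to 1D points x given segment boundaries;
    # returns the per-segment levels and the total squared error.
    edges = [0] + list(boundaries) + [len(x)]
    levels, err = [], 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        seg = x[a:b]
        level = sum(seg) / len(seg)   # generic choice: the segment mean;
        levels.append(level)          # the NFC variant forces level = 0 (slide 8)
        err += sum((v - level) ** 2 for v in seg)
    return levels, err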

  5. Minimum Description Length (MDL) • MDL is a model selection principle: among candidate models, prefer the one that minimizes the total code length of the model plus the data given the model. • Here, the points created by the transformation (as in [1]) are represented by step-function line segments. • There is a trade-off between approximation accuracy and the number of line segments: more segments fit better but cost more bits to describe. • MDL resolves this compromise by minimizing the combined code length.
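
In symbols (the standard two-part MDL criterion, in our notation):

    \hat{M} = \arg\min_{M} \bigl[ L(M) + L(D \mid M) \bigr]

where L(M) is the code length needed to describe the model (here, the number and positions of the segment boundaries plus any distribution parameters) and L(D | M) is the code length of the data under that model: more segments decrease L(D | M) but increase L(M).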

  6. Outline • The problem • Spectral data • The abstract problem • Related work • HMM based, partial sum based, maximum likelihood based • Our approach • Problem reduced to 1D linear approximation • MDL approach

  7. Reducing NFC to a 1D Piece-Wise Linear Approximation Problem • Let the correlation coefficient matrix of M be denoted C. • Let C∗ be the strictly upper triangular matrix derived from 1 − |C| (entries near 0 imply high correlation between the corresponding two features). • For features i through j (1 ≤ i ≤ j ≤ n), the submatrix C∗i:j,i:j depicts the pairwise correlations. We use its entries (excluding the lower and diagonal entries) as the points to be explained by a line in the 1D piece-wise linear approximation problem. • The objective is to find the optimal piece-wise line segments to fit the created points. • Points near 0 mean high correlation. Since we need to enforce high correlation within each cluster, the points are always approximated by 0. • A code sketch follows.
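
A hedged numpy sketch of this construction (function names are ours): derive C∗ from the correlation matrix, then collect the strictly upper-triangular entries of a candidate cluster's submatrix as the 1D points to fit.

import numpy as np

def make_cstar(M):
    # C* = strictly upper-triangular part of 1 - |corr|;
    # M is m x n (rows = samples, columns = ordered features).
    C = np.corrcoef(M, rowvar=False)      # n x n correlation matrix
    return np.triu(1.0 - np.abs(C), k=1)  # entries near 0 = high correlation

def cluster_points(Cstar, i, j):
    # Points generated by candidate cluster [i, j] (0-based, inclusive):
    # the strictly upper-triangular entries of the submatrix C*[i:j+1, i:j+1].
    sub = Cstar[i:j + 1, i:j + 1]
    return sub[np.triu_indices_from(sub, k=1)]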

  8. Example • Suppose a candidate cluster generates a set of points all around 0.3. • In generic piece-wise linear approximation, 0.3 would be the better approximation. • In NFC, however, we want to penalize points that stray from 0, so we still use 0 as the approximation. • Unlike the usual 1D piece-wise linear approximation problem, the reduced problem has dynamic points, because they are created on the fly from each candidate cluster's submatrix. • A toy numeric illustration follows.
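
A toy numeric illustration (values invented):

import numpy as np

pts = np.array([0.28, 0.31, 0.30, 0.33])    # points near 0.3: weak correlation
sse_mean = ((pts - pts.mean()) ** 2).sum()  # generic fit at the mean: ~0.0013
sse_zero = (pts ** 2).sum()                 # NFC fit at 0: ~0.37
# NFC deliberately takes the larger penalty, discouraging clusters whose
# pairwise correlations are not actually high.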

  9. Spectral data • MR spectral data are high-dimensional data points (Fig. 1: high-dimensional data points, intensity vs. frequency). • Spectral features are highly redundant (highly correlated). • Goal: find neighboring features with high correlation in a spectral dataset, such as an MR spectral dataset. • Fig. 2: correlation coefficient matrix; both axes index the features (dimensions).

  10. Problem • Finding a low-dimensional space: many redundant features lead to the curse of dimensionality, so we seek a representation with far fewer dimensions. • We extract an abstract problem: Neighboring Feature Clustering (NFC). • Features are ordered, and each cluster contains only neighboring features. • Find an optimal clustering according to certain criteria.

  11. Another application (with variation) • Array Comparative Genomic Hybridization (aCGH) is used to detect DNA copy number alterations (Fig. 3: aCGH technology). • aCGH data are noisy, so the analysis involves smoothing and segmentation. • Fig. 4: aCGH data (smoothed); Fig. 5: aCGH data (segmented); the X axis is the log ratio.

  12. Related work • An algorithm addressing a similar problem: Baumgartner, et al., “Unsupervised feature dimension reduction for classification of MR spectra”, Magnetic Resonance Imaging, 22:251-256, 2004. • An extensive literature on the reduced problem, e.g., Teh, et al., “On the detection of dominant points on digital curves”, IEEE PAMI, 11(8):859-872, 1989 (Fig. 6: 1D piece-wise approximation). • Statistical methods (next slide).

  13. Related work: statistical methods • HMM based: Fridlyand, et al., “Hidden Markov models approach to the analysis of array CGH data”, J. Multivariate Anal., 90:132-153, 2004. • Partial-sum based: Lipson, et al., “Efficient Calculation of Interval Scores for DNA Copy Number Data Analysis”, RECOMB 2005. • Maximum-likelihood based: Picard, et al., “A statistical approach for array CGH data analysis”, BMC Bioinformatics, 6:27, 2005.

  14. Framework of the proposed method (Fig. 7: our approach) • 1. Compute the correlation coefficient matrix of the spectral data (intensity vs. frequency). • 2. Treat each pair of features (i, j) as a candidate cluster. • 3. Compute the (revised) MDL code length for each candidate. • 4. Collect these into a code length matrix with entries Ci,j. • 5. Find the shortest path through the matrix (dynamic programming). • A code sketch of steps 1-4 follows; step 5 is sketched under slide 17.
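
Steps 1-4 as a hedged code sketch, reusing make_cstar and cluster_points from the slide 7 sketch; code_length is the per-cluster MDL cost sketched under slide 16.

import numpy as np

def code_length_matrix(M, code_length):
    # cost[i, j] = MDL code length of treating features i..j as one cluster.
    n = M.shape[1]
    Cstar = make_cstar(M)
    cost = np.full((n, n), np.inf)
    for i in range(n):
        for j in range(i, n):
            cost[i, j] = code_length(cluster_points(Cstar, i, j), n)
    return cost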

  15. Minimum Description Length • An information criterion is a model selection scheme. • Common information criteria include the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Minimum Description Length (MDL). • MDL encodes the model plus the data given the model; the balance between model complexity and goodness of fit is struck in terms of code length. (Fig. 6: 1D piece-wise approximation.)

  16. Encoding the model and the data given the model • For each pair of features (n(n−1)/2 pairs in total): • Encode the model: the cluster boundary and the Gaussian parameter (standard deviation). • Encode the data given the model (Fig. 8: encoding the data given the model for each feature pair). • A hedged code sketch follows.
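
One plausible per-cluster code length as a hedged sketch: the slides specify a boundary plus a Gaussian standard deviation as the model; the bit budgets and the zero-mean Gaussian are our assumptions.

import numpy as np

def code_length(points, n):
    # Model bits: one cluster boundary out of n positions plus a
    # quantized standard deviation (16 bits assumed).
    model_bits = np.log2(n) + 16.0
    k = len(points)
    if k == 0:
        return model_bits
    # Data bits: negative log2-likelihood under a zero-mean Gaussian,
    # consistent with approximating the points by 0 (slide 8).
    sigma2 = max(np.mean(points ** 2), 1e-6)  # MLE of sigma^2 for zero mean
    data_bits = 0.5 * k * np.log2(2 * np.pi * sigma2) + k / (2 * np.log(2))
    return model_bits + data_bits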

  17. Minimize the code length • Arrange the per-pair costs as a code length matrix with entries Ci,j (Fig. 9: an alternative representation of the matrix C). • The optimal clustering corresponds to a shortest path through this matrix. • The shortest path satisfies a recursive function (Fig. 10), which is solved by dynamic programming; a sketch follows.
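
A hedged sketch of the shortest path: with D(0) = 0, the recursion is D(j) = min over i < j of [D(i) + cost[i, j-1]], and backtracking over the stored split points recovers the cluster boundaries.

import numpy as np

def shortest_path(cost):
    # cost[i, j] = code length of making features i..j one cluster;
    # returns the clusters as (start, end) pairs plus the total code length.
    n = cost.shape[0]
    best = np.full(n + 1, np.inf)
    best[0] = 0.0
    prev = np.zeros(n + 1, dtype=int)
    for j in range(1, n + 1):          # D(j) = min_i D(i) + cost[i, j-1]
        for i in range(j):
            cand = best[i] + cost[i, j - 1]
            if cand < best[j]:
                best[j], prev[j] = cand, i
    clusters, j = [], n
    while j > 0:                       # backtrack over stored split points
        clusters.append((prev[j], j - 1))
        j = prev[j]
    return clusters[::-1], float(best[n])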

  18. Results • We tested the method on simulated data (Fig. 11: the revised correlation matrix and the computed code length matrix). A toy simulation sketch follows.
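
For instance, a toy simulation of our own (not the authors' setup), wiring together the sketches from slides 7, 14, 16, and 17:

import numpy as np

rng = np.random.default_rng(0)
m, widths = 200, [6, 4, 8]                 # three true contiguous clusters
cols = []
for w in widths:
    base = rng.normal(size=(m, 1))         # signal shared within a block
    cols.append(base + 0.1 * rng.normal(size=(m, w)))
M = np.hstack(cols)                        # 200 x 18 block-correlated matrix

cost = code_length_matrix(M, code_length)
clusters, total_bits = shortest_path(cost)
print(clusters)                            # ideally [(0, 5), (6, 9), (10, 17)]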
