
Modified Multi-Dimensional Scaling (MDS) Algorithm for Mining Gene Expression Patterns



Presentation Transcript


  1. Modified Multi-Dimensional Scaling (MDS) Algorithm for Mining Gene Expression Patterns
     X.J. Ge*, S. Yonamene*, Y.M. Mi*, S. Tsutsumi**, Y. Kobune**, H. Aburatani** and S. Iwata*
     *Research into Artifacts, Center for Engineering (RACE), The University of Tokyo, Komaba 4-6-1, Meguro-ku, Tokyo 153-8904, Japan
     **Department of Life Sciences, Research Center for Advanced Science and Technology (RCAST), The University of Tokyo, Komaba 4-6-1, Meguro-ku, Tokyo 153-8904, Japan
     ABSTRACT: The dataset of Golub et al. is analyzed using dimensionality-reduction techniques, including principal component analysis (PCA), multi-dimensional scaling (MDS) and a modified MDS algorithm. These methods produce snapshots that are helpful for class discovery.
     Data Set 2: Golub et al., Science 286: 531 (1999).

  2. Background Gene expression patterns can be considered as points in a multi-dimensional Euclidean space. Because the high dimensionality makes analysis difficult, it is helpful to have a low-dimensional representation that captures some characteristics of the raw dataset. In principal component analysis (PCA), the raw data points are linearly projected onto the plane along which their variance is greatest. In multi-dimensional scaling (MDS), data points are represented in a low-dimensional space such that the distances between points are preserved as closely as possible. MDS is nonlinear. [Diagram labels: n-D data points, similarity matrix, 2-D map]

  3. Results of PCA A linear projection of gene expression patterns using the first two principal components. Samples of ALL and AML are roughly mapped into different clusters.

  4. Results of conventional MDS MDS minimizes an objective function (the formula is not preserved in this transcript) that compares two quantities for every pair of samples: the distance between the two points in the x-y plot, and the Euclidean distance between the corresponding gene expression patterns. Figure: Mapping of gene expression patterns by multi-dimensional scaling (MDS). AML samples and the two subtypes of ALL samples are found in different regions. However, classification remains difficult without clinical information.

  5. Modified MDS Goal: enlarge trans-cluster distances to make separation easier. Physics background: condensation of atoms to form solids with minimum free energy. The slide defines the objective function and the dissimilarity as formulas, which are not preserved in this transcript. Figure: Mapping of gene expression patterns by a modified multi-dimensional scaling (MDS) algorithm. AML samples and the two subtypes of ALL samples are found in different regions.

  6. Conclusions
     • The difference between AML and ALL can be discovered even with linear methods such as principal component analysis (PCA). For more complicated data structures, such as the difference between subtypes of ALL, PCA is not sufficient.
     • Multi-dimensional scaling (MDS) can produce 2-D maps of gene expression patterns that reveal more complicated data structures.
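
The PCA projection and the conventional MDS mapping described in slides 2 to 4 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. Since the slide's objective function is not preserved, the code assumes the common raw-stress form, the sum over sample pairs of (d_ij - D_ij)^2, where d_ij is the distance in the 2-D map and D_ij is the Euclidean distance between expression patterns. The function names, the PCA initialization, and the step size 1/(2n) are illustrative choices. The modified MDS of slide 5 is not sketched because its objective is not recoverable from the transcript.

```python
import numpy as np

def pca_2d(X):
    """Project rows of X (samples x genes) onto the first two principal components."""
    Xc = X - X.mean(axis=0)                      # center each gene (column)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T                         # coordinates along PC1 and PC2

def pairwise_distances(P):
    """Euclidean distance between every pair of rows of P."""
    diff = P[:, None, :] - P[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def stress(Y, D):
    """Assumed objective: sum over pairs i<j of (d_ij - D_ij)^2."""
    d = pairwise_distances(Y)
    iu = np.triu_indices(len(Y), k=1)
    return ((d[iu] - D[iu]) ** 2).sum()

def mds_2d(X, n_iter=300):
    """Minimize the assumed raw stress by gradient descent, starting from PCA."""
    n = len(X)
    D = pairwise_distances(X)                    # target distances in gene space
    Y = pca_2d(X)                                # centered 2-D starting layout
    for _ in range(n_iter):
        diff = Y[:, None, :] - Y[None, :, :]
        d = np.sqrt((diff ** 2).sum(axis=-1))
        np.fill_diagonal(d, 1.0)                 # avoid 0/0 on the diagonal
        ratio = (d - D) / np.maximum(d, 1e-12)
        np.fill_diagonal(ratio, 0.0)             # a point does not act on itself
        grad = 2.0 * (ratio[:, :, None] * diff).sum(axis=1)
        Y = Y - grad / (2.0 * n)                 # illustrative fixed step size
    return Y
```

Calling `pca_2d(X)` and `mds_2d(X)` on a samples-by-genes matrix `X` returns one 2-D coordinate per sample for each method, and `stress` allows the two layouts to be compared against the original pairwise distances. Any matrix used to try this out would be a stand-in; no values from the Golub et al. dataset are reproduced here.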
