1 / 6

Markov Cluster (MCL) algorithm http://micans.org/mcl/

Markov Cluster (MCL) algorithm http://micans.org/mcl/. Stijn van Dongen. Markov Cluster (MCL) algorithm http://micans.org/mcl/. • Traditionally, most methods deal with similarity relationships in a pairwise manner, while graph theory allows classification of proteins

kaipo
Download Presentation

Markov Cluster (MCL) algorithm http://micans.org/mcl/

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Markov Cluster (MCL) algorithm http://micans.org/mcl/ Stijn van Dongen

  2. Markov Cluster (MCL) algorithm http://micans.org/mcl/ • Traditionally, most methods deal with similarity relationships in a pairwise manner, while graph theory allows classification of proteins into families based on a global treatment of all relationships in similarity space simultaneously. • Similarity between proteins are arranged in a matrix that represents a connection graph. • Nodes of the graph represent proteins, and edges represent sequence similarity that connects such proteins. • A weight is assigned to each edge by taking -log10(E-value) obtained by a BLAST comparison.

  3. • These weights are transformed into probabilities associated with a transition from one protein to another within this graph. • This matrix is passed through iterative rounds of matrix multiplication and matrix inflation until there is little or no net change in the matrix. The final matrix is then interpreted as a protein family clustering. • The inflation value parameter of the MCL algorithm is used to control the granularity of these clusters.

  4. The MCL algorithm simulates random walks within a graph by alternation of two operators called expansion and inflation. Expansion coincides with taking the power of a stochastic matrix using the normal matrix product (i.e. matrix squaring). Inflation corresponds with taking the Hadamard power of a matrix, followed by a scaling step, such that the resulting matrix is stochastic again, i.e. the matrix elements (on each column) correspond to probability values. Expansion and inflation represent different tidal forces which are alternated until an equilibrium state is reached. An equilibrium state takes the form of a so-called doubly idempotent matrix, i.e. a matrix that does not change with further expansion of inflation steps.

  5. Inflation operator: A column stochastic matrix is a non-negative matrix with the property that each of its columns sums to 1. M  Rkxk, M >= 0, and real number r >1. rM is the column stochastic matrix resulting from inflating each of the columns of M with power coefficient r. r is called the inflation operator with power coefficient r. Formally, the action of r : Rkxk -> Rkxk is defined by: (rM)pq = (Mpq)r/ {∑(Miq)r ; i=1,k}. Each column j of a stochastic matrix M corresponds with node j of the stochastic graph associated with M. M corresponds with the probability of going from node j to node i. It is observed that for values of r >1, inflation changes the probabilities associated with the collection of random walks departing from one particular node by favouring more probable walks over less probable walks.

  6. blastp proteome specific comparisons all protein significant hits Adapted from Enright et al. NAR 2002.

More Related