Network randomization

Sylvain Brohée <sylvain@bigre.ulb.ac.be>
Université Libre de Bruxelles, Belgium
Laboratoire de Bioinformatique des Génomes et des Réseaux (BiGRe)
http://www.bigre.ulb.ac.be/

Need for graph randomization
• Negative controls in biology
  • In bioinformatics as well as in wet-lab biology, negative controls are essential to estimate the relevance of an observation.
  • The principle of the negative control is to perform the same (wet-lab or in silico) experiment, but under conditions where the result has to be negative.
  • If a method is good, the negative control will return a negative answer.
• Randomization as an in silico negative control methodology
  • The methods used to study a network and its properties (topology, network comparison, cluster structure) can be applied to randomized networks, as a negative control.
  • There are many ways to randomize a network.
  • The choice of the randomization method is essential and depends on the question to be addressed.
• Applications
  • To compare methods (e.g. methods that detect protein complexes in protein interaction networks), in order to estimate their respective merits in terms of the difference between their results on real data and on a randomized network.
  • To estimate the “signal” that is present in a given data set: if the data set contains signal, it should behave differently from a randomized data set (considered as made of “noise”).

Major ways of randomizing networks
• Randomization with the Erdős-Rényi (ER) model
  • Keeps the same number of nodes and edges as in the original graph.
  • First create all the nodes, then draw edges between nodes, with equal probability for all pairs of nodes.
  • Does not preserve the network topology (e.g. “hubs” of the original network are lost in the randomized one).
  • Example of application: assessing the impact of the presence of hubs on path-finding methods.
Figure: ER randomization
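
The ER procedure described above can be sketched in a few lines of Python. This is only an illustration, not the implementation used by any of the tools cited in these slides: the function name `er_randomize` and the toy graph are hypothetical, and the sketch assumes an undirected graph without self-loops or duplicate edges.

```python
import random
from itertools import combinations

def er_randomize(nodes, n_edges, seed=None):
    """Draw n_edges distinct undirected edges, uniformly among all node pairs.

    Assumption: undirected graph, no self-loops, no duplicate edges.
    """
    rng = random.Random(seed)
    # Every unordered pair of distinct nodes is an equally likely edge.
    possible_pairs = list(combinations(sorted(nodes), 2))
    return set(rng.sample(possible_pairs, n_edges))

# Toy network (hypothetical): node "A" is a hub with degree 4.
original = {("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C")}
nodes = {n for edge in original for n in edge}

randomized = er_randomize(nodes, len(original), seed=0)
```

The node set and the edge count are the same as in the original graph, but nothing forces "A" to remain a hub, which is exactly the loss of topology mentioned in the slide.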

Major ways of randomizing networks
• Randomization with node degree (ND) conservation
  • Consists of shuffling the edges between the nodes.
  • Keeps the same number of nodes and edges as in the original graph.
  • Moreover, each node keeps the same degree.
  • Good random control for biological networks, as it keeps the global network topology: hubs stay hubs but are no longer connected to the same nodes.
  • Example of application: assessing the random expectation of clustering methods.
Figure: ND randomization
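
One common way to shuffle edges while keeping every node degree is repeated pairwise edge swapping: pick two edges (a, b) and (c, d) and rewire them to (a, d) and (c, b). The slides do not say which algorithm is used, so the sketch below is an assumption, with hypothetical names (`degree_preserving_shuffle`, `degrees`) and a toy undirected graph without self-loops.

```python
import random
from collections import Counter

def degrees(edges):
    """Count how many edges touch each node."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

def degree_preserving_shuffle(edges, n_swaps, seed=None):
    """Rewire an undirected simple graph by random pairwise edge swaps."""
    rng = random.Random(seed)
    edge_list = sorted(tuple(sorted(e)) for e in edges)
    edge_set = set(edge_list)
    if len(edge_list) < 2:
        return edge_set
    done, tries = 0, 0
    while done < n_swaps and tries < 100 * n_swaps:  # give up eventually
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # try both orientations of the second edge
            c, d = d, c
        if a == d or c == b:  # the swap would create a self-loop
            continue
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:  # would duplicate an edge
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_set

# Toy network (hypothetical): "H" is a hub with degree 4.
original = {("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"),
            ("A", "B"), ("E", "F"), ("F", "G")}
shuffled = degree_preserving_shuffle(original, n_swaps=20, seed=0)
```

Each accepted swap removes one edge end and adds one edge end at each of the four nodes involved, so `degrees(shuffled)` equals `degrees(original)` whatever the number of swaps actually performed.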

Major ways of randomizing networks
• Permutation of node labels
  • Consists of switching the node names.
  • Keeps the same number of nodes and edges as in the original graph.
  • Completely keeps the network topology, but the identities of the nodes are changed.
  • This mode of randomization is relevant only for applications where the node identity matters.
  • Example of application: comparison between graph clusters and functional catalogs (e.g. metabolic pathways, Gene Ontology, regulons, ...).
Figure: Node label randomization
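
Label permutation amounts to applying a random one-to-one mapping to the node names. A minimal sketch follows, assuming an undirected graph stored as a set of node pairs; `permute_labels` and the toy protein names are hypothetical.

```python
import random
from collections import Counter

def permute_labels(edges, seed=None):
    """Relabel nodes with a random permutation of the existing labels."""
    rng = random.Random(seed)
    nodes = sorted({n for edge in edges for n in edge})
    shuffled = nodes[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(nodes, shuffled))  # old label -> new label
    relabeled = {tuple(sorted((mapping[a], mapping[b]))) for a, b in edges}
    return relabeled, mapping

def degree_sequence(edges):
    """Sorted list of node degrees, independent of node names."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return sorted(deg.values())

# Toy network (hypothetical labels).
original = {("p1", "p2"), ("p1", "p3"), ("p1", "p4"), ("p4", "p5")}
relabeled, mapping = permute_labels(original, seed=0)
```

The degree sequence, and more generally the whole wiring, is unchanged; only the association between a name and its position in the graph is randomized, which is why this control only makes sense when clusters are compared against annotations attached to node names.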

Comparison example: Uetz et al. (2000) vs Ito et al. (2001)
• Number of edges in common: 122
• Jaccard: 8%
• P-value: 2.5e-228
Figure: Comparison example, Uetz (2000) versus Ito (2001)

Comparison example: Uetz et al. (2000) vs Ito et al. (2001), after node degree conservation randomization of the Uetz data set
• Number of edges in common: 4
• Jaccard: 0.24%
• P-value: 6.8e-03
Figure: Random control for the comparison Uetz (2000) versus Ito (2001)
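
The two statistics reported in these comparisons can be illustrated on toy data. The Jaccard index is the size of the edge intersection divided by the size of the edge union. The slides do not state how the P-value was computed; the sketch below assumes a hypergeometric tail probability (the chance of sharing at least k edges when edges are drawn at random among all possible node pairs). Function names and the toy edges are hypothetical, and the numbers are not the Uetz or Ito data.

```python
from fractions import Fraction
from math import comb

def jaccard(edges_a, edges_b):
    """Intersection size over union size of two edge sets."""
    a, b = set(edges_a), set(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def hypergeom_tail(k, n_a, n_b, n_possible):
    """P(X >= k), where X is the overlap obtained when n_b edges are drawn
    without replacement among n_possible pairs, n_a of which are in graph A.

    Assumption: this null model is one possible choice, not necessarily
    the one behind the P-values shown in the slides.
    """
    total = comb(n_possible, n_b)
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, i) * comb(n_possible - n_a, n_b - i)
        for i in range(k, upper + 1)
    )
    return float(Fraction(favourable, total))

# Toy graphs on 5 nodes, edges stored as sorted pairs.
graph_a = {(1, 2), (2, 3), (3, 4)}
graph_b = {(2, 3), (3, 4), (4, 5)}
common = len(graph_a & graph_b)
n_possible = comb(5, 2)  # number of unordered node pairs

jac = jaccard(graph_a, graph_b)
pval = hypergeom_tail(common, len(graph_a), len(graph_b), n_possible)
```

Running the same two functions on a randomized version of one graph gives the random expectation against which the real overlap can be judged, as in the Uetz versus Ito control above.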

Tools for network randomization
• NeAT
  • http://rsat.ulb.ac.be/neat/
• GraphCrunch
  • http://www.ics.uci.edu/~bio-nets/graphcrunch/
• Network Workbench
  • http://nwb.slis.indiana.edu/
