Canonical Graph labelling Alice Miller

Canonical Graph labellingAlice Miller (As used by nauty, bliss, saucy, traces) Most of the examples in this talk are from “Combinatorial Algoritms, Generation, Enumeration and Search”, Kreher + Stinson The actual algorithms are contained therein.

Graph isomorphism Deciding whether two graphs are isomorphic is fundamental in graph theory. As many combinatorial objects can be modelled as graphs, efficient isomorphism programs are necessary for the study of other combinatorial objects. Example: What if the graphs had hundreds of vertices, and given only as adjacency matrices? G4 G1 G2 G3 G1 , G3 ,G4 are isomorphic G1 and G4 are the same graph (same adjacency matrix).

Definition of graph isomorphism and automorphism Definition Graphs G1 = (V1;E1) and G2 = (V2;E2) are isomorphic if there is a bijection f : V1V2 such that (f(u), f(v))  E2 (u,v) E1 f is said to be an isomorphism between G1 and G2. If f is an isomorphism from G to itself, it is called an automorphism. The set of all automorphisms of a graph is a permutation group (which is a group under the “composition of permutations" operation).

Permutation matrix Given a permutation sof nelements, given in two-line form by 0 1 2 3 . . . n-1 s(0) s(1) s(2) s(3) . . . s(n-1) its permutation matrix is the n × nmatrix Pswhose entries are all 0 except that in row i, the entry s(i) equals 1. Properties: If Ps and Pm are permutation matrices relating to s and m, permutations on n elements, then Ps Pm =Psm Ps ‘ = Ps’=Ps T where ‘ denotes inverse. In this talk I use ‘ to denote inverse and transpose. Permutation matrices are doubly stochastic (not relevant here!)

Permutation matrices If s is a permutation, and As the adjacency obtained from A when s is applied to the vertices, then As[i][j] = A[s-1[i],s-1[j]] When a permutation matrix is applied to a matrix A from the left (PA) it will permute the rows of A. If it is applied to A from the right (AP) it will permute the columns of A. Since P’ is also a permutation matrix, it will permute the rows/columns of A. If B = s(A) then B = P’AP for permutation matrix P corresponding to s. If G is a graph with adjacency matrix A, and sis a permutation such that Ps’A1Ps= A1 Then s is an automorphism of G1

Example s : 01234567  06172345 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 1 1 0 1 0 1 0 0 0 0 1 0 1 0 0 1 0 1 0 1 0 1 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 A = Ps’ = Ps = 0 1 0 1 0 0 0 1 0 1 0 1 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 0 0 0 1 1 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 1 0 1 1 0 0 0 1 0 0 1 1 0 0 1 0 1 1 0 0 0 1 0 1 0 1 0 0 1 0 0 1 1 0 0 0 1 1 1 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 1 6 5 7 2 1 Ps’APs = Ps’A = 7 3 6 4 3 5 4 2 swaps rows swaps columns

Determining graph isomorphism • The problem of determining if two graphs are isomorphic is in general difficult, but most researchers believe it is not NP-complete. • Some special cases can be solved in polynomial time, such as: graphs with maximum degree bounded by a constant and trees. • Brute force approach, If vertex set is 1,2,3,…,n try all permutations in Sn to see if any maps G1 to G2 (very expensive). • Alternative, use a certificate (or canonical labelling). This is the method used by nauty (etc.) • Definition • A certificate Cert() for a family F of graphs is a function such that for • G1, G2F, we have • Cert(G1) = Cert(G2) G1 and G2 are isomorphic

Certificates for Trees (Ronald Read, 1972) • Label all vertices with string 01. • While there are more than 2 vertices in G: • for each non-leaf x of G do • Let Y be the set of labels of the leaves adjacent to x and the label of x • with initial 0 and trailing 1 deleted from x; • Replace the label of x with the concatenation of the labels in Y, sorted • in increasing lexicographic order, with a 0 prepended and a 1 appended. • Remove all leaves adjacent to x. • If there is only one vertex x left, report x's label as the certificate. • If there are 2 vertices x and y left, concatenate x and y in increasing • lexicographic order, and report it as the certificate. Tree certificates are totally balanced sequences or Catalan numbers (see Kreher and Stinson), .

Example 1: Tree with one centre T  C

Example 2: Tree with two centres T  C

Certificate to tree • I’m going to illustrate this using the two previous examples • Draw “mountain range” using the certificate (0= up, 1=down), from sea level (y=0) • Initial number of vertices = number of distinct mountains • At each step, raise sea level by 1 unit. If a single mountain becomes m distinct mountains (m1), add m new edges from the corresponding vertex. • Stop when raising sea level covers/touches peaks of all mountains. • Tree produced is isomorphic to the original (but not necessarily identical).

Example 1: Tree with one centre C  T certificate: 00001011100011100111

Example 2: Tree with two centres C  T certificate: 00011001101100011011

Certificates for General Graphs Let G = (V;E) with adjacency matrix A1. Consider all permutations s : V  V . Each s determines an adjacency matrix A2 = Ps ‘A1Ps Look at the relevant entries of A2 and form a number Num. We will use these Num to define a certificate. For any graph, Num is the (value of) the binary string formed by tracing the elements in the top right corner of the adjacency matrix, column by column.

Example of Num 0 00 0 0 0 1 0 0 0 0 1 1 0 0 0 0 1 0 1 0 0 0 1 0 1 1 1 0 1 0 1 0 1 1 0 1 1 1 1 0 1 1 0 0 1 1 1 0 NumI(G)=(000001010101111100111)binary = 44007 A1= s = (0 1 2 3 4 5 6)  (3 0 1 2 4 6 5) Nums(G)=(001000101000111111011)binary = 283131 0 0 0 0 1 0 1 0 0 1 0 0 0 1 0 1 0 0 1 1 1 0 0 0 0 0 1 0 1 0 1 0 0 1 1 0 0 1 1 1 0 1 1 1 1 0 1 1 0 1 2 4 A2= 0 6 5 3

Certificate for general graph, idea 1 • Cert1(G) = min{Nums(G) : s  Sym(V )} • difficult to compute. • has as many leading 0's as possible. • So, k is as large as possible, where k is the number of all-zero • columns above the diagonal. • So, vertices {0,1,2,…,k-1} form a maximum independent set in G (or • equivalently a maximum clique in the complement graph G). • So, computing Cert1(G) as defined above is NP-hard. • So, it is possible the approach of computing Cert1(G) to solve • the graph isomorphism problem is more work than necessary.

Certificate for general graph, idea 2 So, instead, we will define the certificate as follows: Cert(G) = min{Nums(G) : s PG} where PGis a set of permutations determined by the structure of G but not by any particular ordering of V . Main idea is to do partition refinement, and use backtracking whenever we reach an equitable partition (partition that can't be further refined). The minimum above is taken over permutations considered in this backtracking tree.

Equitable partition Definition An (ordered) partition B of set S is a set (sequence) of blocks (or parts) B[1], B[2], …., B[n] such that every element of S appears in precisely one block. Definition A partition B is a discretepartition if |B[j]|= 1 for all j and a unit partition if |B| = 1. Definition Let G = (V,E) be a graph and NG(u) = {x V : (u,x)  E} (neighbourhood of u). A partition B is an equitable partition with respect to the graph G if For all i and j |NG(u)  B[j]| = |NG(v) B[j]| for all u, v  B[i]. i.e. no two elements of the same block have different number of edges into another block

Forming an equitable partition from an intitial partition. Example B = [{0},{1,2,3,4,5,6,7}] [{0}{2,4,5,6},{1,3,7}] (edges into {0}) [{0},{2,4},{5,6},{1,3,7}] (edges into {1,3,7}) [{0},{2,4},{5,6},{7},{1,3}] (edges into {2,4}) Nb. At each stage, find the first set B[i] such that some B[j] contains elements x and y with different numbers of edges into B[i]. 0 1 • Initial partition might come from a known colouring, and/or degree information, for example. • We are interested in ending up with a discrete partition. To do this, successively break parts and refine. • The strategy for choosing a partition to break varies between implementations. Nauty chooses the smallest part of size > 1 7 2 3 6 5 4

A note re discrete partitions and permutations To find canoncial labelling of graph, use process of “individualization and refinement”. Leaves of search tree are discrete partitions. Each discrete partition p0|p1|….|pn-1 corresponds to an orderingp = p0p1p2….pn-1 Which corresponds to a permutation s = p-1 Hence permuted matrix Ap is Ps’ A Ps or, if Pp is the permutation matrix corresponding to the ordering, PpAPp’

Search Tree example For the following graph: (draw on board!) 0 1 0 1 0 0 0 1 1 0 1 0 1 0 0 0 0 1 0 1 0 0 1 0 1 0 1 0 1 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 0 0 0 1 1 0 A = 0 1 7 2 3 6 5 4

Initial (unit) partition is already equitable. Can split this (take one element into its own part) in 8 ways, and refine. E.g. via 0, gives previous example.

Zoom in: 0|24|56|7|13 0|4|2|6|5|7|13 0|2|4|5|6|7|13 0|2|4|5|6|7|3|1 0|2|4|5|6|7|1|3 0|4|2|6|5|7|1|3 0|4|2|6|5|7|3|1 Equal Equal Equal First : p = 02456713 Nump =5667952

Example 0|2|4|5|6|7|1|3 p = 02456713 Nump =5667952 0 1 0 1 0 0 0 1 1 0 1 0 1 0 0 0 0 1 0 1 0 0 1 0 1 0 1 0 1 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 A = Pp’ = Pp = 0 1 0 1 0 0 0 1 0 1 0 1 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0 0 1 0 0 1 0 1 1 0 0 0 0 1 1 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 1 0 1 1 0 0 0 1 0 0 1 1 0 0 1 0 1 1 0 0 0 1 0 1 0 1 0 0 1 0 0 1 1 0 0 0 1 1 1 0 0 0 0 0 1 1 1 0 0 0 0 0 Num(G)=(0000010101100111110001110000) value = 5667952 PpA = PpAPp’ = Note this is the example we saw earlier, but used s : 01234567  06172345 and found Ps’Aps Rather than p = 02456713 and PpAPp’ swaps rows swaps columns

Pruning By calculating Nump early.. By generating and exploiting automorphisms during search.

Pruning 1 Better : p = 13567024 Nump =5192304 Equal Pruning! See next slide

2|04|57|6|13 Worse: (gives value > 5192304) …. But why? 2|4|0|7|5|6|13 2|0|4|5|7|6|13 Note that 204576|13 will lead to p with permutation matrix 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 *0 *0 0 0 0 0 * 0 * 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 * * 1 0 0 0 0 0 0 0 0 0 0 0 0 0 * * 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 where * = 0 or 1 Pp= Pp’ =

0 1 0 1 0 0 1 0 0 1 0 1 0 0 0 1 0 1 0 1 0 1 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 * * * * * * * * * * * * * * * * 0 0 0 0 0 1 * * 0 0 0 0 1 0 * * 0 0 0 1 0 0 * * 0 0 1 0 1 1 * * 0 1 0 1 0 1 * * 1 0 0 1 1 0 * * * * * * * * * * * * * * * * * * PpAPp’ = PpA = Nump(G)= (000001010110011*************) ≥ (0000010101100110000000000000) value ≥ 5660672 > 5192304 In fact, 5192304 is the best (lowest) value for the whole tree. This is the certificate of the graph.

Pruning 2 • Recall that if p is an ordering of the vertices of graph G = (V,E) corresponding to permutation s on the vertices of G, then • As[i,j] = A[s-1[i],s-1[j]] • In terms of the ordering, Ap denotes the adjacency matrix with respect to that ordering, and Ap[i,j] =A[p[i],p[j]] • is an automorphism if As = Ap = A (nb.p is an automorphismiffs is) • Theorem: Let G=(V,E) be a graph and let p, m  Sym(V) be two orderings of the vertices of G. If Nump (G) = Numm (G) then s = pm-1 is an automorphism of G. • This gives us a way to collect the automorphisms of G, and use them to prune the search tree. We collect the automorphisms via the Schrier vector • (but don’t go into that here!).

Example 1 0|24|56|7|13 0|4|2|6|5|7|13 0|2|4|5|6|7|13 0|2|4|5|6|7|3|1 0|2|4|5|6|7|1|3 0|4|2|6|5|7|1|3 0|4|2|6|5|7|3|1 Equal Equal Equal First : p = 02456713 Nump =5667952 (1,3)  Aut(G) (2,4)(5,6)  Aut(G) Prune, as know (1,3)  Aut(G)

Example 2 Look at the branch starting at 1|3|567|024. Already know that (1,3) and (2,4)(5,6)  Aut(G) 1|3|567|024 Split on 7 1|3|7|56|024 Split on 5 1|3|5|67|024 Split on 6 1|3|6|57|024 1|3|5|6|7|0|2|4 1|3|5|7|6|2|0|4 So what do we know about the automorphism group so far? See next slide… • Better • = 13567024 • Num = 5192304 Equal (0,2)(6,7)  Aut(G)

Aut(G) so far Aut(G) contains (1,3); (2,4)(5,6); (0,2)(6,7) and all combinations thereof. So Aut(G) contains the following subgroup: I, (1,3), (2,4)(5,6), (1,3)(2,4)(5,6), (0,2)(6,7), (0,2,4)(5,7,6), (0,2)(1,3)(6,7), (0,2,4)(1,3)(5,7,6), (0,4,2)(5,6,7), (0,4)(5,7), (0,4,2)(1,3)((5,6,7), (0,4)(1,3)(5,7) Remember these for the next slide!

Example 2 Look at the branch starting at 1|3|567|024. Already know that (1,3) and (2,4)(5,6)  Aut(G) 1|3|567|024 Split on 7 1|3|7|56|024 Split on 5 1|3|5|67|024 Split on 6 1|3|6|57|024 No need, as (2,4)(5,6)  Aut(G) No need, as (0,4)(5,7) Aut(G) 1|3|5|6|7|0|2|4 1|3|5|7|6|2|0|4 • Better • = 13567024 • Num = 5192304 Equal (0,2)(6,7)  Aut(G)

Pruned Search Compare to original, unpruned graph on slide 22

History of individualization-refinement paradigm • Introduced by Parris and Read (1969) • Developed by others (Corneil and Gotlieb (1970), Arlazarov et al. (1974) • McKay’s algorithm (later to become nauty, 1980) first to use automorphisms to prune search. • Kreher and Stinson book (from which I used examples) 1999. • Darga et al : saucy (2004), reimplementation using sparse data strauctures, then (2008) introduced early automorphism detection • Juntilla and Kaski (2007): Bliss, same algorithm as saucy, allowed refinement operations to be aborted early • Traces (Piperino, 2008) : major revision of how search tree is scanned, now part of nauty (2014). • Traces differs from nauty in 4 ways: • Node visiting strategy (BFS) • Manipulation of automorphisms using Schreier-Simms algorithm. • Refinement function • The selection of the next refinement step

Bibliography • R. Parris and R. Read. A coding procedure for graphs. Scientific report UWI/CC 10. Univ. of W. Indies Computer Centre. (1969) • D. Corneil, C. Gotlieb. An efficient algorithm for graph isomorphism. J. ACM 17, 51-64. 1970. • R. Read. The coding of various kinds of unlabeled trees. In Graph Theory and Computing, R.C. Read, ed., Academic Press, 1972, pp. 153-182 • V. Arzalarov, L. Zuev, A. Uskov, L. Faradzev. An algorithm for the reduction of finite non-oriented graphs to canonical form. Z Vycisi. Mat. Mat. Fiz. 14. pp. 737-743. 1974. • B. McKay. Practical graph isomorphism. Congr. Numer. 30, pp. 45-87. 1980. • D. Kreher and D. Stinson. Combinatorial Algorithms, Generation, Enumeration and Search. CRC Press. 1999. • P. Darga, M. Liffiton, K. Sakallah, L. Markov. Exploiting structure in symmetry detection for CNF. In Proceedings of the 41st Design Automation Conference, pp. 530-534. 2004. • T. Junttila, P. Kaski. Engineering of an efficient canonical labeling tool for large and sparse graphs. In Proceedings of the 9th Workshop on Algorithm Engineering and Experiments. Pp. 135-149. 2007. • P. Darga, K. Sakallah, L. Markov. Faster symmetry discovery using sparsity of symmetries. In Proceedings of the 45th Design Automation Conference, pp. 149-154. 2008. • A. Piperno. Search space contraction in canonical labeling of graphs. Arxiv.org/abs/0804,4881 • B. McKay and A. Piperno. Practical graph isomorphism II. Congr. Numer. J. Symb. Computation 60. pp. 94-112. 2014.

Canonical Graph labelling Alice Miller

Canonical Graph labelling Alice Miller

Presentation Transcript

Food Labelling

Canonical Correlation

Fast canonical labelling of graphs with sites

Labelling Triangles

Labelling

Labelling Theory

Food Labelling

Canonical composers

Canonical Distance

Canonical Transformation

Canonical Producer

Canonical Correlation

Canonical Analysis

Food Labelling

Canonical Correlation

Sandwich Labelling

Canonical correlations

Canonical Transformation

Energy labelling

Region labelling

Labelling Forces

Canonical and non-canonical clause structures