
Register Allocation: Graph Coloring

Compiler. Baojian Hua, bjhua@ustc.edu.cn. Covers the middle and back end of a compiler, the back-end structure (instruction selector, register allocator, instruction scheduler), and register allocation by graph coloring.



  1. Register Allocation: Graph Coloring • Compiler • Baojian Hua • bjhua@ustc.edu.cn

  2. Middle and Back End • AST → (translation) → IR1 → (translation) → IR2 → (other IR and translation) → asm

  3. Back-end Structure • IR → instruction selector → Assem → register allocator (TempMap) → instruction scheduler → Assem

  4. Instruction Selection
     C source:
       int f (int x, int y) { int a; int b; int c; int d; a = x + y; b = a + 4; c = b * 2; d = c / 8; return d; }
     After instruction selection (x: 8(%ebp), y: 12(%ebp)):
       int f (int x, int y){
         int a,b,c,d;
         int t1, t2;
         pushl %ebp
         movl %esp, %ebp
         movl 8(%ebp), t1
         movl 12(%ebp), t2
         movl t1, a
         addl t2, a
         movl a, b
         addl $4, b
         movl b, %eax
         imull $2
         movl %eax, c
         movl c, %eax
         cltd
         idivl $8
         movl %eax, d
         movl d, %eax
         leave
         ret
       }
     • The first two instructions form the prolog; leave and ret form the epilog. • Positions for a, b, c, d cannot be determined during this phase.

  5. Register allocation • After instruction selection, there may be some variables left • Basic idea: • put as many of these variables as possible into registers (speed!) • put a variable into memory only if the registers are out of supply • This process is called register allocation • It is one of the most important optimizations in modern compilers

  6. Register Allocation • (same code as on slide 4) • Suppose that register allocation determines the following (we will discuss how to do this a little later): a => %eax, b => %eax, c => %eax, d => %eax, t1 => %eax, t2 => %edx • This data structure is called a temp map.

  7. Rewriting • With the given temp map (a => %eax, b => %eax, c => %eax, d => %eax, t1 => %eax, t2 => %edx), we can rewrite the code accordingly, to generate the final assembly code. • The slide annotates the first few temporaries with their registers (%eax, %edx, ...). The rest are left to you!
       .text
       .globl f
       f:
         pushl %ebp
         movl %esp, %ebp
         movl 8(%ebp), t1
         movl 12(%ebp), t2
         movl t1, a
         addl t2, a
         movl a, b
         addl $4, b
         movl b, %eax
         imull $2
         movl %eax, c
         movl c, %eax
         cltd
         idivl $8
         movl %eax, d
         movl d, %eax
         leave
         ret
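The rewriting step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's implementation: the function name `apply_temp_map` is hypothetical, and instructions are assumed to be plain strings whose operands are separated by commas.

```python
def apply_temp_map(instrs, temp_map):
    """Replace each operand that is a temporary by its assigned register.

    instrs   -- list of strings such as "movl t1, a" (assumed format)
    temp_map -- dict from temporary name to register name
    """
    rewritten = []
    for ins in instrs:
        opcode, _, rest = ins.partition(" ")
        operands = [op.strip() for op in rest.split(",")] if rest else []
        # Operands that are not in the temp map (registers, immediates,
        # memory references) are left unchanged.
        operands = [temp_map.get(op, op) for op in operands]
        rewritten.append(opcode + (" " + ", ".join(operands) if operands else ""))
    return rewritten


temp_map = {"a": "%eax", "b": "%eax", "c": "%eax", "d": "%eax",
            "t1": "%eax", "t2": "%edx"}
code = ["movl 8(%ebp), t1", "movl 12(%ebp), t2", "movl t1, a", "addl t2, a", "cltd"]
for line in apply_temp_map(code, temp_map):
    print(line)
```

Matching whole operands, rather than substrings, keeps a temporary named `a` from being confused with the `a` inside `%eax`.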

  8. Peephole Optimization
       .globl f
       f:
         pushl %ebp
         movl %esp, %ebp
         movl 8(%ebp), %eax
         movl 12(%ebp), %edx
         movl %eax, %eax
         addl %edx, %eax
         movl %eax, %eax
         addl $4, %eax
         movl %eax, %eax
         imull $2
         movl %eax, %eax
         movl %eax, %eax
         cltd
         idivl $8
         movl %eax, %eax
         movl %eax, %eax
         leave
         ret
     • Peephole optimizations try to improve the code by examining it through a small code window, so they are local in nature. • For example, we can use a code window of width 1 to eliminate the obvious redundancy of the form: movl r, r
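A width-1 peephole pass of this kind might look like the following Python sketch. The name `remove_self_moves` and the string representation of instructions are assumptions made for the example; a real compiler would work on its own instruction data structure.

```python
def remove_self_moves(instrs):
    """Drop every 'movl r, r' whose source and destination are identical."""
    kept = []
    for ins in instrs:
        opcode, _, rest = ins.partition(" ")
        operands = [op.strip() for op in rest.split(",")]
        if opcode == "movl" and len(operands) == 2 and operands[0] == operands[1]:
            continue  # the move has no effect, so it can be deleted
        kept.append(ins)
    return kept


code = ["movl %eax, %eax", "addl %edx, %eax", "movl %eax, %edx", "cltd"]
print(remove_self_moves(code))
```

Because the window is a single instruction, the pass needs no information about neighbouring instructions.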

  9. Final Assembly
     C source:
       int f (int x, int y) { int a; int b; int c; int d; a = x + y; b = a + 4; c = b * 2; d = c / 8; return d; }
     Assembly:
       // This function does NOT need a (stack) frame!
       .text
       .globl f
       f:
         pushl %ebp
         movl %esp, %ebp
         movl 8(%ebp), %eax
         movl 12(%ebp), %edx
         addl %edx, %eax
         addl $4, %eax
         imull $2
         cltd
         idivl $8
         leave
         ret

  10. Register Allocation • (same code as on slide 4) • Register allocation determines a temp map: a => %eax, b => %eax, c => %eax, d => %eax, t1 => %eax, t2 => %edx • How to generate such a temp map? • Key observation: two variables can reside in one register iff they are NOT live simultaneously.

  11. Liveness Analysis • (same code as on slide 4) • So, we can perform liveness analysis to calculate the live variable information. • On the slide, the liveOut set is marked between each two statements. Sets shown: {…} {eax} {d} {eax} {eax}
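For straight-line code, live-out sets can be computed with a single backward pass. The Python sketch below is a simplified illustration: it assumes each instruction is given as a pair (defs, uses) of variable sets, ignores control flow, and uses the C-level statements of `f` rather than the assembly. The name `liveness` is hypothetical.

```python
def liveness(instrs, live_at_exit=frozenset()):
    """Return the live-out set of every instruction in straight-line code.

    instrs -- list of (defs, uses) pairs, each a set of variable names
    """
    live = set(live_at_exit)
    live_out = [None] * len(instrs)
    for i in range(len(instrs) - 1, -1, -1):
        defs, uses = instrs[i]
        live_out[i] = set(live)
        # live-in = uses ∪ (live-out − defs), which becomes the live-out
        # set of the preceding instruction
        live = (live - set(defs)) | set(uses)
    return live_out


# a = x + y; b = a + 4; c = b * 2; d = c / 8; return d
program = [
    ({"a"}, {"x", "y"}),
    ({"b"}, {"a"}),
    ({"c"}, {"b"}),
    ({"d"}, {"c"}),
    (set(), {"d"}),   # return d
]
print(liveness(program))
```

In this program at most one of a, b, c, d is live at any point, which is consistent with all four sharing one register in the temp map.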

  12. Interference Graph (IG) • (same code as on slide 4) • Interference edges: t1 and t2; a and t2 • Register allocation determines the temp map, shown as colors on the graph nodes: a => %eax, b => %eax, c => %eax, d => %eax, t1 => %eax, t2 => %edx

  13. Steps in Register Allocator • Do liveness analysis • Build the interference graph (IG) • draw an edge between any two variables which are live simultaneously • Color the IG with K colors (registers) • K is the number of available registers on a machine • A classical problem in graph theory • NP-complete (for K>=3), thus one must use heuristics • Allocate physical registers to variables

  14. History • Early work by Cocke suggests that register allocation can be viewed as a graph coloring problem (1971) • The first working allocator is Chaitin’s for the IBM PL/1 compiler (1981) • Later, the IBM PL.8 compiler • Had some impact on RISC

  15. History, cont • A more recent graph coloring allocator is due to Briggs (1992) • For now, graph coloring is the most popular approach, used in many production compilers • e.g., GCC • But more advanced allocators have been invented in recent years • so, is graph coloring a lesson abandoned? • more in the next few lectures …

  16. Graph coloring • Once we have the interference graph, we can try to color the graph with K colors • K: number of machine registers • adjacent nodes must get different colors • But this problem is NP-complete (for K>=3) • So we must use some heuristics

  17. Kempe’s Allocator

  18. Kempe’s Theorem • [Kempe] Given a graph G with a node n such that degree(n)<K, G is K-colorable iff (G-{n}) is K-colorable (remove n and all edges connected to n) • Proof?

  19. Kempe’s Algorithm
       kempe(graph G, int K)
         while (there is any node n, degree(n)<K)
           remove this node n
           assign a color to the removed node n // greedy
         if (G is empty) // i.e., G is K-colorable
           return success;
         return failure;
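A runnable Python sketch of this idea follows. It makes two assumptions that are not on the slide: the graph is a dict mapping each node to the set of its neighbours, and colors are assigned in reverse removal order rather than at removal time. The reverse order is what guarantees that a node sees fewer than K already-colored neighbours when it is colored. The name `kempe_color` and the example graphs are hypothetical.

```python
def kempe_color(graph, k):
    """Return a dict node -> color (1..k), or None if removal gets stuck."""
    work = {n: set(adj) for n, adj in graph.items()}  # mutable copy
    order = []
    while work:
        # any node with fewer than k remaining neighbours can be removed
        n = next((v for v in work if len(work[v]) < k), None)
        if n is None:
            return None  # every remaining node has degree >= k
        for m in work.pop(n):
            work[m].discard(n)
        order.append(n)
    colors = {}
    for n in reversed(order):
        # the colored neighbours are exactly those removed after n,
        # and there were fewer than k of them when n was removed
        used = {colors[m] for m in graph[n] if m in colors}
        colors[n] = min(c for c in range(1, k + 1) if c not in used)
    return colors


path = {"p1": {"p2"}, "p2": {"p1", "p3"}, "p3": {"p2", "p4"}, "p4": {"p3"}}
triangle = {"x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"}}
print(kempe_color(path, 2))
print(kempe_color(triangle, 2))
```

The linear scan for a removable node is the simplest choice; a worklist of low-degree nodes would avoid rescanning the graph on every iteration.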

  20. Example (K = 4; colors 1, 2, 3, 4; graph nodes a, b, c, d, e) • degree(a) = 3 < 4: remove node “a”, assign the first available color

  21. Example, cont. • degree(b) = 2 < 4: remove node “b”, assign the first available color • Here, we want to choose the node with the lowest degree. What kind of data structure should we use?

  22. Example, cont. • degree(c) = 2 < 4: remove node “c”, assign the first available color

  23. Example, cont. • degree(d) = 1 < 4: remove node “d”, assign the first available color

  24. Example, cont. • degree(e) = 0 < 4: remove node “e”, assign the first available color

  25. Example, cont. • All five nodes have been removed and colored.

  26. Example (K = 3; colors 1, 2, 3; graph nodes a, b, c, d, e) • So this graph is 3-colorable. But if we have three colors, we can NOT apply the Kempe algorithm. (Why?) • We can refine it to the following one:
       kempe(graph G, int K)
         stack = [];
         while (G is not empty)
           if (there is a node n with degree(n) < K)
             remove n and push it onto the stack;
           else
             remove a node n with degree(n) >= K and push it onto the stack;
         while (stack is not empty)
           pop a node and assign it a color;
     • Essentially, this is a lazy algorithm!
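The refined, stack-based version can be sketched in Python as below. Several choices here are assumptions rather than part of the slides: the dict-of-sets graph representation, preferring a low-degree node whenever one exists, picking the highest-degree node otherwise, and returning the nodes that could not be colored in a separate list. The name `color_with_stack` and the example graph are hypothetical.

```python
def color_with_stack(graph, k):
    """Optimistically remove every node, then color while popping the stack.

    Returns (colors, uncolored): a dict node -> color in 1..k and the list
    of nodes for which no color was free.
    """
    work = {n: set(adj) for n, adj in graph.items()}  # mutable copy
    stack = []
    while work:
        low = [v for v in work if len(work[v]) < k]
        # prefer a node with degree < k; otherwise push a significant node
        n = low[0] if low else max(work, key=lambda v: len(work[v]))
        for m in work.pop(n):
            work[m].discard(n)
        stack.append(n)
    colors, uncolored = {}, []
    while stack:
        n = stack.pop()
        used = {colors[m] for m in graph[n] if m in colors}
        free = [c for c in range(1, k + 1) if c not in used]
        if free:
            colors[n] = free[0]
        else:
            uncolored.append(n)
    return colors, uncolored


# A 4-cycle: every node has degree 2, so with k = 2 no node has degree < k,
# yet the graph is 2-colorable.
square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
print(color_with_stack(square, 2))
```

Pushing a significant node does not commit to spilling it: its neighbours may end up sharing colors, in which case a color is still free when the node is popped.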

  27. Example (K = 3; colors 1, 2, 3; graph nodes a, b, c, d, e) • remove node “a”, push onto the stack

  28. Example, cont. • stack shown (top first): a, marked significant • remove node “b”, push onto the stack

  29. Example, cont. • stack shown (top first): b, a • remove node “c”, push onto the stack

  30. Example, cont. • stack shown (top first): d, c, b, a • remove node “d”, push onto the stack • remove node “e”, push onto the stack

  31. Example, cont. • stack shown (top first): e, d, c, b, a • pop the stack, assign suitable colors • pop “e”

  32. Example, cont. • stack shown (top first): d, c, b, a • pop “d”

  33. Example, cont. • stack shown (top first): c, b, a • pop “c”

  34. Example, cont. • stack shown (top first): b, a • pop “b”

  35. Example, cont. • stack shown (top first): a • pop “a”

  36. Example, cont. • the stack is empty: every node has been popped

  37. Moral • Kempe’s algorithm: • step #1: simplify • remove graph nodes, be optimistic • step #2: select • assign a color to each node, be lazy • You should use this algorithm for your lab6 first • But what if the select phase fails? • not enough colors (registers)!

  38. Example (K = 2; colors 1, 2; graph nodes a, b, c, d, e) • remove node “a”, push onto the stack

  39. Failure • It’s often the case that Kempe’s algorithm fails • The IG is not K-colorable • The basic idea is to generate spilling code • some variables should be put into memory, instead of into registers • Usually, spilled variables reside in the call stack • Code using such variables must be modified: • for a variable use: read from the memory • for a variable def: store into the memory

  40. Spill code generation • The effect of spill code is to turn long live ranges into shorter ones • This may introduce more temporaries • The register allocator should start over, after generating spill code • We’ll talk about this shortly
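Spill code generation can be illustrated with the Python sketch below. It rests on assumptions that are not in the slides: instructions are three-address tuples (dst, op, srcs), the pseudo-operations are called "load" and "store", and every access to the spilled variable gets a fresh temporary named after it. The function name `insert_spill_code` and the slot name are hypothetical.

```python
from itertools import count


def insert_spill_code(instrs, spilled, slot):
    """Rewrite instrs so that variable `spilled` lives in memory at `slot`.

    Each use is preceded by a load into a fresh temporary, and each
    definition is followed by a store from a fresh temporary.
    """
    fresh = count()
    out = []
    for dst, op, srcs in instrs:
        new_srcs = []
        for s in srcs:
            if s == spilled:
                t = f"{spilled}_{next(fresh)}"
                out.append((t, "load", (slot,)))      # use: read from memory
                new_srcs.append(t)
            else:
                new_srcs.append(s)
        if dst == spilled:
            t = f"{spilled}_{next(fresh)}"
            out.append((t, op, tuple(new_srcs)))
            out.append((None, "store", (t, slot)))    # def: store into memory
        else:
            out.append((dst, op, tuple(new_srcs)))
    return out


program = [("c", "add", ("a", "b")), ("d", "add", ("a", "c"))]
for ins in insert_spill_code(program, "c", "slot_c"):
    print(ins)
```

The new temporaries each live for only one or two instructions, which is how spilling shortens live ranges while also introducing more temporaries.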

  41. Chaitin’s Allocator

  42. Chaitin’s Algorithm • Build: build the interference graph (IG) • Simplify: simplify the graph • Spill: when only significant nodes remain, mark one as a potential spill (ps), remove it and continue • Select: pop nodes and try to assign colors • if this fails for a potential spill node, mark the potential spill as an actual spill and continue • Start over: generate spill code for actual spills and start over from step #1 (build)

  43. Chaitin’s Algorithm • build → simplify → potential spill → select → actual spill → (start over from build)

  44. Step 1: build the IG • Program: a = 1; b = 2; c = a+b; d = a+c; e = a+b; f = d+e • Graph nodes: a, b, c, d, e, f • K = 2; colors 1, 2
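The build step for this program can be sketched in Python. The sketch is an illustration under assumptions: the code is straight-line, each instruction is a pair (dst, uses), f is taken to be live at exit, and an edge is added between a defined variable and every other variable live after the definition. The function name `build_interference_graph` is hypothetical.

```python
def build_interference_graph(instrs, live_at_exit):
    """Build an IG (dict node -> set of neighbours) in one backward pass."""
    graph = {v: set() for v in live_at_exit}
    live = set(live_at_exit)
    for dst, uses in reversed(instrs):
        graph.setdefault(dst, set())
        # `live` is the live-out set here: dst interferes with every
        # other variable that is live right after this definition
        for v in live - {dst}:
            graph[dst].add(v)
            graph.setdefault(v, set()).add(dst)
        live = (live - {dst}) | set(uses)
        for u in uses:
            graph.setdefault(u, set())
    return graph


program = [
    ("a", []),            # a = 1
    ("b", []),            # b = 2
    ("c", ["a", "b"]),    # c = a+b
    ("d", ["a", "c"]),    # d = a+c
    ("e", ["a", "b"]),    # e = a+b
    ("f", ["d", "e"]),    # f = d+e
]
ig = build_interference_graph(program, {"f"})
for node in sorted(ig):
    print(node, sorted(ig[node]))
```

With K = 2, only e and f have degree below 2 in this graph, which is why they are the only nodes that can be simplified before a potential spill is needed.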

  45. Step 2: simplification • stack (top first): f

  46. Step 2: simplification • stack (top first): e, f

  47. Step 2: simplification • stack (top first): c (ps), e, f

  48. Step 2: simplification • stack (top first): d (ps), c (ps), e, f

  49. Step 2: simplification • stack (top first): a, d (ps), c (ps), e, f

  50. Step 2: simplification • stack (top first): b, a, d (ps), c (ps), e, f
