optimizing compilers cisc 673 spring 2009 control flow
Download
Skip this Video
Download Presentation
Optimizing Compilers CISC 673 Spring 2009 Control Flow

Loading in 2 Seconds...

play fullscreen
1 / 23

Optimizing Compilers CISC 673 Spring 2009 Control Flow - PowerPoint PPT Presentation


  • 104 Views
  • Uploaded on

Optimizing Compilers CISC 673 Spring 2009 Control Flow. John Cavazos University of Delaware. Control-Flow Analysis. Motivating example: identifying loops majority of runtime focus optimization on loop bodies! remove redundant code, replace expensive operations ) speed up program

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Optimizing Compilers CISC 673 Spring 2009 Control Flow' - hiroko


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
optimizing compilers cisc 673 spring 2009 control flow

Optimizing CompilersCISC 673Spring 2009Control Flow

John Cavazos

University of Delaware

control flow analysis
Control-Flow Analysis
  • Motivating example: identifying loops
    • majority of runtime
    • focus optimization on loop bodies!
      • remove redundant code, replace expensive operations ) speed up program
  • Finding loops:
    • easy…

i = 1; j = 1; k = 1;

A1: if i > 1000 goto L1;

A2: if j > 1000 goto L2;

A3: if k > 1000 goto L3;

do something

k = k + 1; goto A3;

L3: j = j + 1; goto A2;

L2: i = i + 1; goto A1;

L1: halt

for i = 1 to 1000

for j = 1 to 1000

for k = 1 to 1000

do something

  • or harder(GOTOs)
steps to finding loops
Steps to Finding Loops
  • Identify basic blocks
  • Build control-flow graph
  • Analyze CFG to find loops
control flow graphs
Control-Flow Graphs
  • Control-flow graph:
    • Node: an instruction or sequence of instructions (a basic block)
      • Two instructions i, j in same basic blockiff execution of i guarantees execution of j
    • Directed edge: potential flow of control
    • Distinguished start node Entry
      • First instruction in program
identifying basic blocks
Identifying Basic Blocks
  • Input: sequence of instructions instr(i)
  • Identify leaders:first instruction of basic block
  • Iterate: add subsequent instructions to basic block until we reach another leader
basic block partition algorithm
Basic Block Partition Algorithm

leaders = instr(1) // first instruction

for i = 1 to |n| // iterate thru all instrs

if instr(i) is a branch

leaders = leaders ∪ targets of instr(i)

leaders = leaders ∪ instr(i+1) // instr after branch

worklist = leaders

While worklist notempty

x = first instruction in worklist

worklist = worklist – {x}

block(x) = {x}

for (i = x + 1; i <= |n| && i notin leaders; i++)

block(x) = block(x) ∪ {i}

class example
Class Example

leaders = instr(1)

for i = 1 to |n|

if instr(i) is a branch

leaders = leaders

∪ targets of instr(i)

leaders = leaders ∪ instr(i+1)

worklist = leaders

While worklist notempty

x = first instruction in worklist

worklist = worklist – {x}

block(x) = {x}

for (i = x + 1; i <= |n| && i notin

leaders; i++)

block(x) = block(x) ∪ {i}

A = 4

t1 = A * B

L1: t2 = t1/C

if t2 < W goto L2

M = t1 * k

t3 = M + I

L2: H = I

M = t3 – H

if t3 >= 0 goto L4

L3: goto L1

L4: goto L3

basic block example
Basic Block Example
  • A = 4
  • t1 = A * B
  • L1: t2 = t1/C
  • if t2 < W goto L2
  • M = t1 * k
  • t3 = M + I
  • L2: H = I
  • M = t3 – H
  • if t3 >= 0 goto L4
  • L3: goto L1
  • L4: goto L3

Leaders

Basic blocks

control flow edges
Control-Flow Edges
  • Basic blocks = nodes
  • Edges:
    • Add directed edge between B1 and B2 if:
      • BRANCH from last statement of B1 to first statement of B2 (B2 is a leader), or
      • B2 immediately follows B1 in program order and B1 does NOT end with unconditional branch (goto)
control flow edge algorithm
Control-Flow Edge Algorithm

Input: block(i), sequence of basic blocks

Output: CFG where nodes are basic blocks

for i = 1 to the number of blocks

x = last instruction of block(i)

if instr(x) is a branch

for each target y of instr(x),

create edge block i to block y

if instr(x) is not unconditional branch,

create edge block i to block i+1

cfg edge example
CFG Edge Example

A

  • A = 4
  • t1 = A * B
  • L1: t2 = t1/C
  • if t2 < W goto L2
  • M = t1 * k
  • t3 = M + I
  • L2: H = I
  • M = t3 – H
  • if t3 >= 0 goto L4
  • L3: goto L1
  • L4: goto L3

Leaders

B

Basic blocks

C

D

E

F

steps to finding loops1
Steps to Finding Loops
  • Identify basic blocks
  • Build control-flow graph
  • Analyze CFG to find loops
  • Spanning trees, depth-first spanning trees
  • Reducibility
  • Dominators
  • Dominator tree
  • Strongly-connected components
spanning tree
Spanning Tree
  • Build a tree containing every node and some edges from CFG

A

procedure Span (v)

for w in Succ(v)

if not InTree(w)

add w, v→ w to ST

InTree(w) = true

Span(w)

for v in V do inTree = false

InTree(root) = true

Span(root)

B

C

D

E

F

cfg edge classification
CFG Edge Classification

Tree edge:

in CFG & ST

Advancing edge:

(v,w) not tree edge but w is descendant of v in ST

Back edge:

(v,w): v=w or w is proper ancestor of v in ST

Cross edge:

(v,w): w neither ancestor nor descendant of v in ST

A

B

C

loop

D

E

F

depth first spanning tree
Depth-first spanning tree

procedure DFST (v)

pre(v) = vnum++

InStack(v) = true

for w in Succ(v)

if not InTree(w)

add v→w to TreeEdges

InTree(w) = true

DFST(w)

else if pre(v) < pre(w)

add v→w to AdvancingEdges

else if InStack(w)

add v→w to BackEdges

else

add v→w to CrossEdges

InStack(v) = false

for v in V do inTree(v) = false

vnum = 0

InTree(root)

DFST(root)

A

1

B

2

C

3

D

4

E

F

5

6

class problem identify edges
Class Problem: Identify Edges

procedure DFST (v)

pre(v) = vnum++

InStack(v) = true

for w in Succ(v)

if not InTree(w)

add v→w to TreeEdges

InTree(w) = true

DFST(w)

else if pre(v) < pre(w)

add v→w to AdvancingEdges

else if InStack(w)

add v→w to BackEdges

else

add v→w to CrossEdges

InStack(v) = false

for v in V do inTree(v) = false

vnum = 0

InTree(root)

DFST(root)

compiler optimizations i
Compiler Optimizations I
  • Unrolling and other loop optimizations for improving instruction level parallelism (ILP)

IDEA 1: Predict Best Unrolling Factor

IDEA 2: Implement Different Loop Opt

Reversal, Interchange, Fusion, Fission

compiler optimizations ii
Compiler Optimizations II
  • Policies for method inlining

Question 1: What Method Characteristics are most important for Inlining?

Question 2: What is the impact of Inlining to Register Allocation?

compiler optimizations iii
Compiler Optimizations III
  • Escape analysis

IDEA 1 : Move object access that do not “escape” method into registers.

IDEA 2: Allocate objects that do not “escape” on the stack.

compiler optimizations iv
Compiler Optimizations IV
  • Improving virtual memory behavior

IDEA 1: Performance study of optimizations on virtual memory and/or parts of memory hierarchy (e.g., L3 cache). Use tool to analyze memory behavior.

compiler optimizations v
Compiler Optimizations V
  • Class analysis for safe object inlining
  • Policies for object inlining

IDEA 1: Implement object inlining optimization.

compiler optimizations vi
Compiler Optimizations VI
  • Field Reordering
  • Graph Coloring RA
    • Porting of old code
  • Instruction Scheduling for X86
    • Porting PowerPC version to X86
  • Autotuning Java applications
    • Interesting!
next time
Next Time
  • Reducibility
  • Dominance
    • Wikipedia: Dominator (graph_theory)
  • Some Dataflow
    • T.J. Marlowe and B.G. Ryder Properties of Data Flow Frameworks, pp. 121-163, ACTA Informatica, 28, 1990.
ad