Code Motion

Global value numbering enables us to decide whether a computation at a given program point is redundant or not, and therefore whether to eliminate it. Another way of reducing the number of computations a program carries out is to move them from frequently-executed parts of the program to parts that are executed less often. There are, of course, constraints on such moves that must be checked by checking global relationships among computations.

Example

Embedding our array calculation in a loop illustrates the problems and potential profits:
i = l1 + 1;
do {
  j = l2 + 1;
  do {
    k = l3 + 1;
    do {
      a[i,j,k] = ( a[i-1,j,k] + a[i+1,j,k] +
                   a[i,j-1,k] + a[i,j+1,k] +
                   a[i,j,k-1] + a[i,j,k+1] ) / 6.0;
      k++;
    } while (k < u3);
    j++;
  } while (j < u2);
  i++;
} while (i < u1);
Recall that the array assignment computed i*C1 + j*C2 + k*C3 seven times.

[Discuss-

]

Control flow graph and dominator tree

The dominance frontier of each dominator tree node is given in square brackets beside the node:

Here are the tuples, grouped into basic blocks numbered according to the diagram. Only the ones we've already seen from the array assignment are shown, and all program point numbers are omitted. Moreover, the value numbers of the original tuples have been preserved and new ones assigned to the new tuples. Finally, i, j and k without subscripts represent the initial variables; subscripted occurrences represent renamings due to assignments within the for-loops:

1:
V11:=l1+1
i1:=V11
Jump(2)

2:
i2:=phi(i1,i3);
j1:=phi(j,j4);
k1:=phi(k,k5);
V21:=l2+1
j2:=V21
Jump(3)

3:
j3:=phi(j2,j4);
k2:=phi(k1,k5);
V31:=l3+1
k3:=V31
Jump(4)

4:
k4:=phi(k3,k5);
V1:=i2
V2:=V1*C1
V3:=j3
V4:=V3*C2
V5:=V2+V4
V6:=k4
V7:=V6*C3
V8:=V5+V7
V9:=V8-C01
V10:=a[V9]
...
V41:=V6+1
k5:=V41
V42:=V41<u3
JumpT(4,V42)
JumpF(5,V42)

5:
V51:=V3+1
j4:=V51
V52:=V51<u2
JumpT(3,V52)
JumpF(6,V52)

6:
V61:=V1+1
i3=V61
V62:=V61<u1
JumpT(2,V62)
JumpF(exit,V62)

[Discuss-

]

Constraints on code motion

In general, one cannot move a computation

[Discuss-

]

Example

Consider V1 and V2, which are computed in block 4:

Thus the lumped computation could be carried out in any of the three starred blocks.

[Discuss-

]

A more complex example

This example is taken from Lazy code motion, ACM SIGPLAN Notices, 27 (July, 1992) 224-234.

[Discuss-

]

Critical edges

A critical edge is an edge in the flow graph leading from a node with multiple successors to a node with multiple predecessors. Critical edges can cause problems for code motion.

Consider the implementation of

if (i != 0)
  while (j > (k / i)) {
    ...  /* No changes made to i or k here */
    j--;
  }

1:
V11:=i
V12:=V11!=0
JumpT(exit,V12)
JumpF(2,V12)

2:
j1:=phi(j,j2)
V21:=k
V22:=V21/V11
V23:=j1
V24:=V23>V22
JumpT(3,V24)
JumpF(exit,V24)

3:
...
V32:=j1-1
j2:=V32
Jump(2)
Both V21 and V22 are constant within the loop, yet neither can be moved to block 1 because that would move them onto a path along which they would not have been computed by the original program.

[Discuss-

]

Removing critical edges

The problem can be easily resolved: Break the critical edge by introducing an empty basic block:

[Discuss-

]

Pre-tested loops

The original example of the array calculation could have been written using for statements instead of do-while:
for (i = l1 + 1; i < u1; i++)
  for (j = l2 + 1; j < u2; j++)
    for (k = l3 + 1; k < u3; k++)
      a[i,j,k] = ( a[i-1,j,k] + a[i+1,j,k] +
                   a[i,j-1,k] + a[i,j+1,k] +
                   a[i,j,k-1] + a[i,j,k+1] ) / 6.0;
Here is the definition of a for statement:
for (i = l1 + 1; i < u1; i++) S;

i = l1 + 1;
while (i < u1) {
  S;
  i++;
}

[Discuss-

]

Control flow graph and dominator tree

The dominance frontier of each dominator tree node is given in square brackets beside the node:

Here are the tuples, grouped into basic blocks numbered according to the diagram:

1:
V11:=l1+1
i1:=V11
Jump(2)

2:
i2:=phi(i1,i3)
j1:=phi(j,j3)
V21:=i2
V22:=V21<u3
JumpT(3,V22)
JumpF(exit,V22)

3:
V31:=l2+1
j2:=V31
Jump(4)

4:
j3:=phi(j2,j4)
k1:=phi(k,k3)
V41:=j3
V42:=V41<u3
JumpT(5,V42)
JumpF(9,V42)

5:
V51:=l3+1
k2:=V51
Jump(6)

6:
k3:=phi(k2,k4)
V61:=k3
V62:=V61<u3
JumpT(7,V62)
JumpF(8,V62)

7:
V1:=i2
V2:=V1*C1
...
V71:=V6+1
k4:=V71
Jump(6)

8:
V81:=V41+1
j4:=V81
Jump(4)

9:
V61:=V21+1
i3=V61
Jump(2)

[Discuss-

]

Landing pads

A partial solution to the problem is for the translator to modify the implementation of the for statement by duplicating the test:

[Discuss-

]


waite@isa.informatik.th-darmstadt.de
Revision $Revision: 1.4 $