Combining Analyses

When we do an analysis of a program, we gain certain information. We can then use that information in later analyses. For example, we've seen how constant propagation can be used to eliminate edges that will never be traversed and thus allow computations to be moved out of loops. Sometimes, however, two analyses depend on each other in such a way that they cannot effectively be carried out separately. In that case, they should be combined (Combining Analyses, Combining Optimizations, ACM Transactions on Programming Languages and Systems 17 (March, 1995) 181-196).

Example

Consider the following trivial C procedure:
int
p(void)
{ int x = 1;
  do { if (x != 1) x = 2; } while (pred());
  return x;
}
Here pred is some arbitrary predicate about which we know nothing.

[Discuss-

]

SSA representation

1:
p11: x:=1
p12: Jump(2)
2:
p21: x1:=phi(x,x3)
p22: V21:=x1
p23: V22:=V21!=1
p24: JumpT(3,V22)
p25: JumpF(4,V22)
3:
p31: x2:=2
p32: Jump(4)
4:
p41: x3:=phi(x1,x2)
p42: V41:=pred
p43: V42:=V41!=0
p44: JumpT(2,V42)
p45: JumpF(5,V42)
5:
p51: V51:=x3
p52: Return(V51)

[Discuss-

]

Propagate constants

Assume that the work list is operated like a stack. We first run through all of the computations in order, as though they had been on the work list. As we do this, we place only lower-numbered program points on the work list (because higher-numbered ones will be reached in due course). After the first complete scan, we take program points off of the work list.

StepProgram pointResultTo work list
1p11x = 1--
2p21x1 = 1--
3p22V21 = 1--
4p23V22 = 0--
5p31x2 = 2--
6p41x3 = Variablep21
7p42V41 = Variable--
8p43V42 = Variable--
9p51V51 = Variable--
10p21x1 = Variablep22, p41
11p41x3 = Variable--
12p22V21 = Variablep23
13p23V22 = Variable--

[Discuss-

]

Combine analyses

Suppose that we make the constant propagation calculation dependent on the executability calculation as well as vice-versa:
fe+(e,v1,v2) =
  e == Non-executable ? Undetermined : f+(v1,v2)

[Discuss-

]

Combined calculation

Assume that the work list is operated like a stack. We first run through all of the computations in order, as though they had been on the work list. As we do this, we place only lower-numbered program points on the work list (because higher-numbered ones will be reached in due course). After the first complete scan, we take program points off of the work list.

StepProgram pointResultTo work list
1p11 x = 1--
Executable--
2p21 x1 = 1--
Executable--
3p22 V21 = 1--
Executable--
4p23 V22 = 0--
Executable--
5p24Non-executable--
6p25Executable--
7p31 x2 = Undetermined--
Non-executable--
8p32Non-executable--
9p41 x3 = 1p21
Executable--
10p42 V41 = Variable--
Executable--
11p43 V42 = Variable--
Executable--
12p44Executable--
13p45Executable--
14p51 V51 = 1--
Executable--
15p52Executable--
16p21 x1 = 1--
Executable--

[Discuss-

]

Global value numbering

Global value numbering clearly uses information from constant propagation and executability analysis; constant propagation can also use information from global value numbering: Here's an example in which global value numbering must be combined with constant propagation and executability determination to get the right answers:
main()
{ int x = 1;
  int z = read();	/* z is not constant */
  int y = z;		/* y equal to z but not constant */

  while (pred()) {
    if (y != z) x = 2;	/* y equal to z, so x unchanged */
    x = 2 - x;		/* x is still 1 after this */
    if (x != 1) y = 2;	/* x == 1, so y unchanged */
  }
  printf("x is %d\n", x);
}

[Discuss-

]

Value numbering as a data flow analysis problem

Global value numbering establishes a value at each program point, in the same manner as constant propagation. But we've taken the position that we allow either a constant or a value number as an operand of an expression and hence as a value at each program point. In some sense, constants are nothing but distinguished value numbers. This implies that a single semilattice should describe both constants and value numbers. The meet operation and the functions in a single F set should also yield both constants and value numbers, depending on their arguments.

[Discuss-

]

Lattice for global value numbering

[Discuss-

]

Functions for global value numbering

The set F contains functions similar to those in the set for the constant computations. Most can be derived in a straightforward way from those according to the models provided by fe+ and fe!=:
fe+(e,v1,v2) =
  e == Non-executable ? Undetermined : f+(v1,v2)
f+ Undetermined (K,i0) (V,i0) Variable
Undetermined Undetermined Undetermined Undetermined Undetermined
(K,i1) Undetermined (K,i0+i1) (V,C+((V,i0),(K,i1))) Variable
(V,i1) Undetermined (V,C+((K,i0),(V,i1))) (V,C+((V,i0),(V,i1))) Variable
Variable Undetermined Variable Variable Variable
Here the function C+ is the congruence function that yields the value number for the expression described by a + operator with two operands.
fe!=(e,v1,v2) =
  e == Non-executable ? Undetermined : f!=(v1,v2)
f!= Undetermined (K,i0) (V,i0) Variable
Undetermined Undetermined Undetermined Undetermined Undetermined
(K,i1) Undetermined (K,i0!=i1) (V,C!=((V,i0),(K,i1))) Variable
(V,i1) Undetermined (V,C!=((K,i0),(V,i1))) i0==i1 ? (K,0) :
(V,C!=((V,i0),(V,i1)))
Variable
Variable Undetermined Variable Variable Variable
Here the function C!= is the congruence function that yields the value number for the expression described by a + operator with two operands.

Function fphi is not monotone:

fphi = v == Variable ? (V,NewValueNumber()) : v
(Here v is the meet of all of the arguments and NewValueNumber is a function that yields a value number that has never been used.)

[Discuss-

]

SSA representation of main

1:
p11: x1:=1
p12: z1:=read()
p13: y1:=z1
p14: Jump(2)
2:
p21: x2:=phi(x1,x52)
p22: y2:=phi(y1,y2,y7)
p23: V21:=pred()
p24: JumpT(3,V21)
p25: JumpF(8,V21)
3:
p31: V31:=y2!=z1
p32: JumpT(4,V31)
p33: JumpF(5,V31)
4:
p41: x4:=2
p42: Jump(5)
5:
p51: x51:=phi(x2,x4)
p52: x52:=2-x51
p53: Jump(6)
6:
p61: V61:=x52!=1
p62: JumpT(7,V61)
p63: JumpF(2,V61)
7:
p71: y7:=2
p72: Jump(2)
8:
p81: V81:=x2
p82: printf(V81)
p83: Return()

[Discuss-

]

Combined global value numbering

Assume that the work list is operated like a stack. We first run through all of the computations in order, as though they had been on the work list. As we do this, we place only lower-numbered program points on the work list (because higher-numbered ones will be reached in due course). After the first complete scan, we take program points off of the work list.

StepProgram pointResultTo work list
1p11 x1 = (K,1)--
Executable--
2p12 z1 = (V,1)--
Executable--
3p13 y1 = (V,1)--
Executable--
4p14Executable--
5p21 x2 = 1--
Executable--
6p22 y2 = (V,1)--
Executable--
7p23 V21 = (V,2)--
Executable--
8p24Executable--
9p25Executable--
10p31 V31 = 0--
Executable--
11p32Non-executable--
12p33Executable--
13p41 x4 = Undetermined--
Non-executable--
14p42Non-executable--
15p51 x51 = 1--
Executable--
16p52 x52 = 1p21
Executable--
17p53Executable--
18p61 V61 = 0--
Executable--
19p62Non-executable--
20p63Executable--
21p71 y7 = Undetermined--
Non-executable--
22p72Non-executable--
23p81 V81 = 1--
Executable--
24p82Executable--
25p83Executable--
26p21 x2 = 1--

[Discuss-

]


waite@isa.informatik.th-darmstadt.de
Revision $Revision: 1.1 $