# Values

A computer program calculates a number of values. One way to improve its performance is to reduce the number of values it calculates. This is done primarily by avoiding repeated calculation of the same value. The problem for the compiler is then to determine when two computations yield the same value.

## Redundant computations

A programmer generally does not knowingly write the code to compute the same value several times. The programming language may, however, conceal such recomputations in such a way that the programmer has no choice in the matter. Address calculations are the primary culprit.

Suppose that the array a has bounds l1, u1, l2, u2, l3, u3. Assuming that the array is stored by rows starting at address A, and that the size of each element is e, the address of a[i,j,k] can be expressed as:

```A + e * ( (i-l1) +
(j-l2) * (u1-l1) +
(k-l3) * (u1-l1) * (u2-l2)
)
```
Let di = (ui-li), and after a little algebra:
```A - e * (l1 + l2*d1 + l3*d1*d2)
+ i*e + j*e*d1 + k*e*d1*d2)
```

### [Discuss-

• The programmer has no knowledge of or control over these computations
• The first line is a constant that depends only on the array dimensions
• d1*d2 is a constant
• The computation to be done at run time is A - C0 + i*C1 + j*C2 + k*C3
• A similar expression can be obtained if the array is stored by columns

## Example

Consider the following expression that is part of a typical scientific computation:
```a[i,j,k] := ( a[i-1,j,k] + a[i+1,j,k] +
a[i,j-1,k] + a[i,j+1,k] +
a[i,j,k-1] + a[i,j,k+1] ) / 6.0;
```
The address of a[i,j,k] was given above. The address of a[i-1,j,k] is:
```A - C0 + (i-1)*C1 + j*C2 + k*C3
= A - C0 - C1 + i*C1 + j*C2 + k*C3
= A - C01 + i*C1 + j*C2 + k*C3
```

### [Discuss-

• i*C1 + j*C2 + k*C3 appears in every index computation
• Only the ``base address'' changes
• There is no practical way for the programmer to specify that i*C1 + j*C2 + k*C3 should only be computed once
• A non-optimizing compiler will translate each array access independently, thus computing this same value 7 times

## How are values determined?

The normal code generator transforms the expression tree into a sequence of instructions. Considering the MIPS R2000, each instruction does one of the following things:
• Loads a register
• Computes a value into a register from values in registers
• Stores a register
• Transfers control

### [Discuss-

• RISC machine only computes values into registers
• Operands of a computation are always registers
• Specific operations that read and write memory, and do nothing else
• Strong distinction between registers and memory

## Optimizer conventions

The optimizer represents the program as a sequence of actions that are basically identical to RISC instructions.
• Values computed from other values by single operations:
```p: V:=E
```
Here E is an expression with exactly one operator
• Value read from a variable:
```p: V:=S
p: V1:=S[V2]
```
• Value written to a variable:
```p: S:=V
p: S[V1]:=V2
```
• Conditional and unconditional control transfers:
```p: Jump(L)
p: JTrue(L,V)
p: JFalse(L,V)
```

### [Discuss-

• Each action is a ``tuple''; p is a ``program point''
• Distinction between values and variables
• Possibility that variable addresses are computed
• Values are identified by integers
• Distinct integers represent distinct values

## When are values identical?

It is important to note that, for the optimizer, two values are identical if and only if they are obtained from the same values by use of the same operator. Consider the following sequence:
```pi: Vj:=Vm+Vm
pi+1: Vk:=Vm*2
pi+2: Vj:=Vm+Vm
```
Vj and Vk are not identical, even though they represent the same numerical result. The computations at program points i and i+2, on the other hand, involve identical operations on identical values and thus lead to identical values.

### [Discuss-

• The same computation can appear at any number of program points
• We say that values are identical if their defining tuples are congruent
• Congruence is used because mathematical equality is undecidable in general and complicated even in decidable cases
• It's common to extend the notion of congruence to encompass some simple identities:
• v + 0 is congruent to v
• v * 1 is congruent to v
• v2 op v1 is congruent to v1 op v2 when op is commutative
• ...
The length of the list depends on the time the compiler designer wishes to spend analyzing congruence.

## Variables

A value is a specific value that gets computed at a particular program point and never changes. A variable, on the other hand, can be changed by writing to it.
• If we see a read tuple for a variable whose value is unknown, the value delivered must be considered new -- not identical to any previous value
• Unless we can track the values of variables, we'll never find identical values

### [Discuss-

• The abstract program is a sequence of trees
• There is no subtree all of whose leaves are constants
• Non-constant leaves must be variable references
• If all variable references yield distinct values, then by induction on the tree height we will never find any congruent tuples

## Example

The first part of the tuple sequence for the array assignment mentioned above might be:
```p1: V1:=i
p2: V2:=V1*C1
p3: V3:=j
p4: V4:=V3*C2
p5: V5:=V2+V4
p6: V6:=k
p7: V7:=V6*C3
p8: V8:=V5+V7
p9: V9:=V8-C01
p10: V10:=a[V9]
```
If we didn't know that the variables i, j and k were unchanged by this tuple sequence, then we would continue with a sequence having the same form but different values. Actually, however, the next 8 tuples will be identical to those at program points 1-8 and can therefore be omitted.

### [Discuss-

• Need to keep track of the value of a variable
• Need to have some sort of tuple lookup to test congruence
• Writes to a variable are included in this solution
• Need to be able to reliably distinguish variables -- no aliasing

## Basic blocks

A basic block is a sequence of tuples containing no transfers of control: All of the tuples are executed, in order, if any of them are executed.

This technique for tracking the values of variables works within a single basic block. It also enables us to implement an arbitrarily complex basic block as one multiple assignment:

• Each variable used before being assigned in the block is read exactly once
• When a write tuple is encountered, the value is noted
• At the end of the basic block, write tuples are generated for all variables that were assigned
• The value assigned is the one currently shown as being the variable's value

### [Discuss-

• Theoretically interesting
• Demonstrates that all variable reads can be at the beginning of a block and all variable writes at the end
• Reads and writes are ``boundary conditions''

## Regions larger than a basic block

It is possible to identify several regions larger than a basic block that have useful properties for optimization:
Extended basic block
A maximal tree of code sequences
Strongly-connected region
A collection of program points in which there is a path from every program point to every other program point
Procedure
A collection of program points constituting a unit of abstraction

### [Discuss-

• Our previous technique for tracking variable values still works in an extended basic block but not in any larger region
• Computations yielding fixed values within a strongly connected region should be moved out of that region
• Modularity allows procedures to be optimized independently

## Tracking variables outside of extended basic blocks

The cornerstone of our analysis of values is our ability to track the values stored in variables. Let's look at the desired result in an extended basic block and then a region that is neither a basic block nor an extended basic block. Here the source language statements are to the right of the box representing a basic block. The top three boxes, enclosed by the dotted line, form an extended basic block. With the addtion of the fourth basic block, the structure no longer satisfies the definition of an extended basic block.

### [Discuss-

• Variable tracking within the extended basic block
• Addition of the join block
• Congruence does not apply to read and write tuples any more
• The symbol is a key that we use to obtain the variable's value
• Our method would work if an new variable name were used instead of x in the join block

## Static single assignment (SSA)

Use a new symbol for a variable whenever it is assigned or its value becomes unknown because of a join. Here's our example, with the source code altered so that it is SSA form. The phi function expresses the fact that any of the possible values reaching the beginning of the join box could be the value of the new variable. It permits some higher-level analysis of value propagation. ### [Discuss-

• Every variable has exactly one value
• There may be no assignment to a symbol, in which case a new value is created when the variable is accessed
• phi never has associated tuples -- it's an artifact of the analysis

## Converting a program to SSA form

Of course the program is not originally written in SSA form, so the compiler must transform it. There are two tasks that must be carried out:
• Find the places where variables are assigned (including joins) and establish the new symbol names
• Consistently rename the variables so that each use has the same symbol as the assignment that establishes its value
For a complete description of the process, see Efficiently Computing Static Single Assignment Form and the Control Dependence Graph, ACM Transactions on Programming Languages and Systems, 13 (1991) 451-490. (Their treatment does not use the standard consistent renaming technique.)

### [Discuss-

• Explicit assignments are easy to spot, but which variables are arguments of phi functions at joins?
• Consistent renaming is a well-known compiler process, but what is the ``scope'' of an assignment?
• It turns out that these two questions are related.

## Control flow graph

Here is an example of a control flow graph for complete procedure that illustrates most of the issues. It is slightly modifed from the cited SSA paper. The numbered nodes represent the code of the procedure. Entry and Exit represent the entry and exit operations of the procedure.

### [Discuss-

• Each numbered node of the graph corresponds to a basic block
• Directed edges represent the successor relation
• For the moment, the actual code is uninteresting

## Dominator trees

A node X in the control flow graph dominates a node Y if X lies on all paths from the entry node to Y. The dominance relation is reflexive and transitive. We say that X strictly dominates Y when X dominates Y and the two are different nodes. Strict dominance can be described by a tree. For our example: See A Fast Algorithm for Finding Dominators in a Flowgraph (ACM Transactions on Programming Languages and Systems 1 (1979) 121-141).

### [Discuss-

• Each node of the dominator tree is a node of the control flow graph
• Linkage among the nodes is the only difference between the dominator tree and the control flow graph
• Show that this tree really does describe strict dominance for this control flow graph

## Expansion of basic blocks

We've already seen that all of the assignments made within a basic block can be collected at the end. Also, if there is an assignment due to a phi function, it occurs at the beginning of the basic block. Thus we can regard each node of the dominator tree as consisting of three nodes: ### [Discuss-

• This expansion merely emphsizes the relations among assignments and computations
• No processing algorithm will actually expand the nodes in this way, but the expansion is reflected in attributes that decorate the tree

## The scope of an assignment

In order to use standard consistent renaming algorithms (see Modularity and Reusability in Attribute Grammars, Acta Informatica, 31(1994) 601-627), we need to identify three roles:
• Defining occurrences
• Scopes
• Applied occurrence
An assignment (either in the original program or as the result of a phi function) is a defining occurrence. If a variable is not assigned, its defining occurrence is its first use. If a variable use is not a defining occurrence, it's an applied occurrence. The scope of a defining occurrence is the subtree of the dominator tree rooted in the node containing that defining occurrence, less any subtrees rooted in nodes containing defining occurrences of the same variable.

### [Discuss-

• A use of a variable preceding any assignment establishes a value for that variable
• An assignment in a node of the dominator tree precedes all uses in nodes of the subtree
• If another assignment intervenes, the use will be in a subtree of the node containing that assignment
• If there is a join, and a variable has been assigned in the blocks joined, there will be an assignment via a phi function at the beginning of the join block

## Example

Let's return to the simple example we used earlier: Here's the expanded dominator tree: The dotted box on the right encloses the expansion of the basic block at the bottom of the previous figure to include the assignment to x via the phi function. (Only the non-empty elements of each expansion are shown.)

### [Discuss-

• All occurrences of x have been systematically renamed in the tuples of both pictures
• Notice how the scope of each variable is related to the structure of the dominator tree
• y is still valid in the lower left block because there was no assignment to it

## Dominance frontiers

The dominance frontier DF(X) of a control flow graph node X is the set of all nodes Y such that X dominates a predecessor of Y but does not strictly dominate Y.

If there is an assignment to a variable in node X, then there must be an assignment to that variable via a phi function at every node in DF(X).

Consider a node Y in DF(X)

• X dominates a predecessor of Y, so there is a path from X to Y
• X does not strictly dominate Y, so there is a path from Entry to Y that doesn't go through X
• Therefore we can arrive at Y with either of two values for the variable assigned in X
There is a linear algorithm for computing all dominance frontiers. (See Efficiently Computing Static Single Assignment Form and the Control Dependence Graph, ACM Transactions on Programming Languages and Systems, 13(1991) 451-490.)

### [Discuss-

• Strictly is important here because it allows a node to lie on its own dominance frontier

## Example

The dominance frontier for each node of our sample control flow graph is given next to that node, enclosed in square brackets: ### [Discuss-

• The dominance frontiers must be determined from a combination of the dominantor tree and the control flow graph
• Notice that none of these algorithms require the tuples

## Global value numbering

We can now determine which tuples result in identical values over the entire procedure:
• Determine the basic blocks, along with their predecessors and successors (this defines the control flow graph)
• Construct the dominator tree
• Add phi functions as necessary to the basic blocks
• Consistently rename the variables
• Generate the tuples in a depth-first, left-to-right traversal of the dominator tree

### [Discuss-

• Note that we end up with a tree representation of the program
• Consistent renaming can be combined with tuple generation
• When tuples are being generated, don't generate duplicates within a basic block
• Note that the whole tuple need not be stored at each program point -- the tuple identification is sufficient

### ]

 waite@isa.informatik.th-darmstadt.de Revision \$Revision: 1.4 \$