Code to carry out the action mapping task is designed on the basis of an exhaustive case analysis in which source tree constructs are implemented by sequences of target machine assembly code instructions. This case analysis is used to ensure that every source program can, in fact, be implemented correctly by some target program. The goal should be a clean, systematic translation that is easy to understand and to verify. Special-case optimization should not be included in the initial design, because it distracts the designer and sometimes leads to more convoluted translations.
Once the case analysis has been completed, the assembly code sequences must be abstracted and represented by constructs of the abstract target program tree. The purpose of this abstraction is to defer decisions like register allocation, evaluation order within expressions, and instruction selection. It abstracts the appropriate characteristics of the target machine, and is independent of any particular source language. Because of that independence, there may be no adequate abstraction for some of the assembly language sequences needed to implement particular source language constructs. In that case the designer must decide whether to extend the target abstraction used for their compiler. Such extensions are common, but should be carefully justified.
The final design is a mapping from source tree constructs to target tree constructs that specifies sets of translations and when each is applicable. Standard techniques can then be used to generate code that carries out the mapping.
It is always important to go to the best sources for information about the problem: the source language standard and the target machine manual. Textbooks can provide overviews and discussion, but the details are sometimes missing or incorrect. The target machine for homework in this course is the SPIM simulator for the MIPS R2000. An additional reference discussing the MIPS hardware is the book by Kane. Use the SPIM Manual as your primary reference for the target machine and the Kane book as backup for providing insight.
Peer review is a good way to weed out errors. You can also try out the effects of your assembly language sequences by stepping through them on the SPIM simulator. The commands spim (providing a command line interface) and xspim (providing an X-windows interface) can be used to run SPIM on either rintintin or nag, or you can download a copy and implement it on some other machine of your choice.
One obvious starting point for the case analysis is the source language arithmetic and logical operators. These operators are usually easy to understand, and correspond to very short assembly language sequences (often just single instructions). Try to write the most general sequence in each case.
As an example, consider the C-- integer addition operation (see Waite, Section A 1.5.2 and Table AI). This operation can be implemented by a single SPIM instruction:
Note that specific register numbers aren't given. The reason is that the register numbers must be determined by the context in which the integer addition operation appears.
The implementation given in the previous paragraph is the most general one. SPIM also provides an integer add instruction in which r3 is replaced by an integer constant. That instruction is less general, because its right operand is restricted. It is also unnecessary for a complete mapping, because SPIM has an instruction to load an integer constant into a register. Thus a C-- integer addition operation can be carried out by a register add instruction even when the right operand is an integer denotation.
The C-- complement operator illustrates the care that must be taken to ensure that the assembly language sequence actually has the effect defined for the source language construct. According to the C-- definition (Waite Section A.1.5.2), the effect of the expression !e is the same as the effect of the expression:
e == 0 ? 1 : 0
A search of the SPIM instruction set reveals that the following will implement !e correctly if the value of e is in r2:
Specific implementations of the arithmetic and logical operators impose requirements in addition to those stemming from the source language definition. For example, the sequences presented above require that operands be loaded into registers and that results reside in registers. Thus a SPIM li instruction would be needed to implement a C-- IntVal construct.
Sometimes a construct simply provides an environment for code, as in the case of DyadicExp: The code corresponding to this construct is actually the code corresponding to its Binop child. DyadicExp only represents the ``glue'' binding instructions like add and sub together.
Other constructs, like OneSided (Waite Section A.1.4.1), embody components implemented by code sequences of arbitrary length. When doing the case analysis, simply write descriptions of those component sequences:
Code to evaluate the component Expression into r beqz r,0,L Code to execute the component Body L:
The examples given below use the abstraction described in An Encoder for SPIM; you may use it for your translation. It is important to note that this abstraction is given, and is not a result of the action mapping design process.
Instruction selection decisions, described in Section 9.1.2 of Waite, are generally trivial for SPIM because only one instruction is available to implement each machine operation. Thus each abstract SPIM node alone provides enough information to support the code generator's instruction selection decision.
A target machine abstraction supports deferred decisions about resource allocation by prohibiting the specification of register numbers. Each value created by the program is represented by a node in the tree, and is used exactly once in the context specified by its parent. For example, consider the C-- expression x+y*z. Abstract SPIM constructs that could be used to represent this expression are:
RULE iRegrr: IntReg ::= Mop IntReg IntReg END; RULE iLoad: IntReg ::= Mop Memory END; RULE Local: Memory ::= Denoter END;
If x were stored at offset -4 in the activation record, y at offset -8, and z at offset -12, then the target program tree fragment would be:
iRegrr(add, iLoad(lw,Local(-4)), iRegrr(mul, iLoad(lw,Local(-8)), iLoad(lw,Local(-12)) ) )
(Here each Mop value is written as the corresponding SPIM assembly language operation code and each Denoter value as the number it denotes.)
The abstraction supports deferred decisions about execution order by using a tree rather than a sequence. An instruction represented by a construct must be executed after all of the instructions represented by its children have been executed, but there is generally no restriction on the order in which the children's instructions are carried out. Assuming appropriate register allocations, either of the following sequences would correctly implement the computation of x+y*z:
Sequence using 3 registers Sequence using 2 registers
lw $3,-4($fp) lw $4,-8($fp) lw $5,-12($fp) mul $4,$4,$5 add $3,$3,$4
lw $3,-8($fp) lw $4,-12($fp) mul $3,$3,$4 lw $4,-4($fp) add $3,$3,$4
Here is an example of a tabular format, applied to the sequences described above:
|Integer addition||add rl,r2,r3||add_iRegrr(el,e2)|
|Complement||seq rl,r2,0||seq-iRegri(e, 0)|
Code to evaluate Expression->r beqz r,L Code to execute Body L:
do_branch( beqz_Branchr(e), L) Items coding Body define_label(L)
The first column of the table is the name of a source language construct, the second column is the assembly code sequence implementing that construct, and the third is the (sequence of) factory method calls needed to build the target program abstraction of the assembly code sequence. 0 stands for a specific Denoter, while L stands for a definition table key representing the label. Abstract target program trees implementing expressions are represented by e, el and e2.
The implementation of OneSided consists of an arbitrary sequence of trees. That sequence begins with the tree implementing the conditional transfer of control, whose destination is determined by the translator. A list of items that implement the actions of the body of the OneSided follow, and the sequence ends with the item implementing the label definition.
Complete your personal postmortem for this assignment and your personal project plan for the next assignment, and submit them via the web before class on the due date of this assignment. Your group must hand in one set of answers to the questions at the beginning of class on the due date.
|Instructor||Revision 1.4 (1998/11/1417:18:23)|