Target Program Tree

Purpose

To solve the C-- to SPIM translation problem:

Due dates

You must complete the PEP tasks (individually), and then hand in the requested material (as a group) at the beginning of class on Monday, November 23. This assignment is worth 40 points.

Background

The action mapping task constructs the abstract target program tree during a computation over the source program tree. Additional computations must then be carried out over this constructed tree to implement the encoding task. In order to make the encoding computations possible, the compiler attaches the constructed abstract target program tree as a sibling of the source program tree. Attributes of target program tree symbols are then computed in the normal way.

Source program constructs like OneSided, which are elements of a Body, have fixed mappings into abstract SPIM constructs. The corresponding abstract target program tree fragments can therefore be constructed by straightforward LIDO computations. Those computations can be written directly from the action mapping design. Target program fragments for constructs like DyadicExp, on the other hand, depend upon the context.

A tree unparser can output the constructed target program tree for diagnostic purposes. That unparser is generated from the extended target abstraction required by the mapping specification.

Target tree construction

An abstract target program tree is constructed bottom-up by invoking ``factory methods'' provided by the target abstraction.

The design of the computation to build an abstract SPIM target program tree from a C-- source program tree is carried out using the normal two-step approach:

Statement computations

Target program tree construction processes for statements are often completely determined by the the statement construct itself. For example, here is an Eli computation for the OneSided rule, whose abstraction was given in the last assignment:

RULE: Statement::= 'if' '(' Expression ')' Body
COMPUTE
  .exit=NewKey();
  Body.items=
    TwoItems(
      Statement.items,
      do_branch(beqz_Branchr(Expression.Tree),.exit));
  Statement.items=
    TwoItems(
      Body.items,
      define_label(.exit)); 
END;

TwoItems, do_branch, beqz_Branchr and define_label are all factory methods defined by the SPIM abstraction. As discussed above, such factory methods abstract from the details of the tree construction function calls.

The chain that accumulates the list of Item nodes making up the body of the program is called items. The conditional branch is added to that list before the items of the Body component are added, and the label definition is added after the items of the Body component are added, but before any the items of any subsequent statement are added.

A new label, represented by the definition table key stored in the rule attribute exit, is generated and used in the conditional branch and the label declaration.

The Tree attribute of the Expression component holds the abstract target program tree fragment generated from the source program tree defining that component.

In a C++ implementation, a translation visitor would traverse the abstract program tree depth-first, left-to-right in the same manner as the items chain. Its visit to the Expression child of the OneSided node would yield the value of Expression.Tree, and its visit to the Body child would yield Body.items. These abstractions would be stitched together by using appropriate constructors and the same factory methods that appear in the LIDO computation above. The LIDO computation acts as a formal representation of the C++ design.

Expression computations

Target program tree construction processes for expressions often depend on the context in which the expression occurs -- on properties of the operands and the desired result. The major problem in writing such computations is therefore to efficiently obtain and use the necessary contextual information.

If the source-to-target mapping has been designed so that expression values of a certain type are always stored in the same way, then all of the necessary information is embodied in the operators identified during type analysis. For example, suppose that the source-to-target mapping always stores integer operands and results in MIPS general registers and always stores floating-point operands and results in floating-point coprocessor registers. There is then a 1-1 correspondence between identified dyadic operators and factory methods for target program tree construction:

Identified operatorFactory method
integer additionadd_iRegrr
floating additionadds_fRegrr
integer subtractionsub_iRegrr
floating subtractionsubs_fRegrr
integer multiplicationmul_iRegrr
floating multiplicationmuls_fRegrr
integer divisiondiv_iRegrr
floating divisiondivs_fRegrr
......

With this kind of a source-to-target mapping design, the translator can base its expression computations on the identified operators. One straightforward approach is to associate explicit tests with every operator context:

RULE: Binop ::= '+'
COMPUTE
  Binop.Tree=
    IF(EQ(Binop.operator,OilOplntegerAdd),
      add_iRegrr(Binop.left,Binop.right),
      adds_fRegrr(Binop.left,Binop.right)); 
END;

The operands must be passed into the operator context, and the result passed out:

RULE: Expression::= Expression Binop Expression
COMPUTE
  Expression[1].Tree=Binop.Tree;
  Binop.left=Expression[2].Tree;
  Binop.right=Expression[3].Tree;
END;

Again, the LIDO rules could be considered to be formalizations of a C++ design.

Coercions

Each node construction function knows what symbols are required in the right-hand side of the node it constructs. It checks the arguments it is given to ensure that their left-hand-side symbols match the required symbol. If the check fails, the construction function looks for rules that could be inserted to ``transform'' the given symbol to the required one. For example, consider the following rules:

RULE iRegrr: IntReg ::= Mop IntReg IntReg END;
RULE fRegrr: FltReg ::= Mop FltReg FltReg END;

Suppose that the action mapper wished to build a target program tree by invoking the following factory methods:

subs_fRegrr(add_iRegrr(i,j) ,a)

Expanding the definitions of the factory methods shows that this is equivalent to the following construction function invocations:

MkfRegrr(NoPosition,subsOp,MkiRegrr(NoPosition,addOp,i,j),a)

The third argument of MkfRegrr should be a tree fragment corresponding to a rule whose left-hand-side symbol is FltReg, but the left-hand-side symbol of rule iRegrr is IntReg. Thus the check fails, and the construction function MkfRegrr looks for a sequence of rules capable of transforming IntReg into FltReg. An appropriate sequence of rules would have the form:

RULE rl: FltReg ::= nl     END;
RULE r2: nl     ::= n2     END; 
  ...
RULE rk: nk     ::= IntReg END;

Each member of the sequence is a chain rule, a rule with a single nonterminal symbol on the right-hand side. If such a sequence can be found, MkfRegrr will behave as though the original invocation had been written:

MkfRegrr(
  NoPosition,
  subsOp,
  Mkrl(NoPosition,Mkr2(...Mkrk(NoPosition,MkiRegrr(NoPosition,addOp,i,j).. .))),
  a)

When more than one sequence is possible, an arbitrary selection will be made.

This property of the construction functions means that target abstractions implementing coercions must be defined, but no explicit computation need be written to add coercion nodes to the target program tree. They will be inserted automatically by the construction functions as the tree is being built.

Task

Implement the action mapping task of a compiler that translates C-- to assembly code for the MIPS R2000, using any tools and techniques you wish. Combine your action mapper with the structural analyzer, name analyzer, type analyzer, and data mapper that you implemented for earlier assignments. The resulting program must read input text, construct the corresponding source program tree, and report violations of all context conditions if the text is lexically and syntactically correct. If the input text is incorrect according to the lexical or syntactic rules of C--, your program must report at least the first error. It may terminate immediately after reporting the first error, or it may continue in an attempt to detect further errors.

If no errors are detected in the input text, your program must construct the corresponding SPIM target program tree, and print a representation of that tree. If errors are detected, your program may construct and print a target program tree or it may terminate after reporting all of the errors. The target program tree constructed in this case need not reflect the source program, but it must be consistent according to the SPIM abstraction.

Spim.fw is the complete definition of the SPIM target abstraction. SpimTree.fw contains the additional declarations to complete a tree unparser for the SPIM target abstraction.

Test your program by applying it to the C-- test suite. If you feel that the test suite is inadequate, provide any additional tests you deem necessary. Your program should not detect any errors in any conformance test, and it should detect at least one error in every deviance test.

Hand in the printed representation of the target program tree for the factorial program. A directory containing the complete source code (or specifications, if you used tools) for your program must be permitted for world read by 1700 on the due date for this assignment. You need not make it available to be read by others before that time. Hand in the location of this directory (machine and path name).

Complete your personal postmortem for this assignment and your personal project plan for the next assignment, and submit them via the web before class on the due date of this assignment.


Instructor
Revision 1.5 (1998/11/14 17:15:19)