Trees and Visitors

Purpose

To provide some insight into important data structures and operations:

Due dates

You must submit your solution via electronic mail by 9:00am on Tuesday, September 13. Your submission must be an attachment, and must contain your name(s) and the assignment subject. The attachment may be either a tar file or a jar file.

This assignment is worth 20 points.

Background

Experience in object-oriented programming techniques has been distilled into a number of design patterns -- proven techniques for solving large classes of problems. Two of these patterns are applicable to many of the sub-problems of compiler construction. Because of their importance, you need to gain some experience in using them. Your program design for this homework is therefore constrained to make use of these two patterns.

Representing a tree

A tree used in a compiler is an example of the use of the Composite design pattern. The remainder of this subsection is organized in the same manner as Gamma, pages 163-173. Our example is a tree representing simple integer expressions.

Intent

Compose constants with operators into tree structures to represent expressions. Composite lets us treat individual constants and compositions of constants and operators uniformly.

Motivation

An operator has two operands, each of which can be any constant or combination of operators and operands. Although it would be possible to build a different class for every possible combination of components, this would lead to a large and complex definition.

Applicability

We want to represent arbitrary expressions, and we want clients processing individual expressions to be able to ignore the specific kinds of operands. Clients will treat all operands uniformly.

Structure

Here is the formal definition of an (infinite) set of expression trees:

RULE Sum:        Expression ::= Expression '+' Expression END;
RULE Difference: Expression ::= Expression '-' Expression END;
RULE Product:    Expression ::= Expression '*' Expression END;
RULE Quotient:   Expression ::= Expression '/' Expression END;
RULE Constant:   Expression ::= Integer                   END;

This definition is basically a textual form of the diagrams described in Gamma, Appendix B.

Participants

Note that, unlike the description in Gamma, Expression has no interface for accessing and managing its child components. The reason for this is that, in a compiler, trees are never modified after they are constructed. Thus there is no need to ``manage'' child components. Visitors are the only clients that need to access child components, and they are specific to each composite.

Because there are no child-related operations in the Component interface, there is no need for Composite objects to implement them.

Collaborations

Consequences

Gamma says that use of the Composite pattern ``can make your design overly general''. The problem is that in its published form, the Composite pattern provides only one ``Component'' class. For example, consider the following tree structure (adapted from Appel, Section 1.3):

RULE CompoundStm: Statement  ::= Statement ';' Statement          END;
RULE AssignStm:   Statement  ::= Identifier ':=' Expression       END;
RULE PrintStm:    Statement  ::= 'print' '(' ExpList ')'          END;
RULE PairExpList: ExpList    ::= Expression ',' ExpList           END;
RULE LastExpList: ExpList    ::= Expression                       END;
RULE EseqExp:     Expression ::= '(' Statement ',' Expression ')' END;
RULE Sum:         Expression ::= Expression '+' Expression        END;
RULE Difference:  Expression ::= Expression '-' Expression        END;
RULE Product:     Expression ::= Expression '*' Expression        END;
RULE Quotient:    Expression ::= Expression '/' Expression        END;
RULE Constant:    Expression ::= Integer                          END;
RULE Variable:    Expression ::= Identifier                       END;

If we were to implement this tree with only one ``Component'' class (Node, say) then the two children of Sum would be of type Node*, as would the two children of CompoundStm. This would allow ill-formed trees and require run-time checks.

The solution to this problem is to have a two-level hierarchy of ``Component'' classes: At the root is Node, which plays the role of the ``Component'' class in the Composite pattern. Statement, ExpList, and Expression each plays the role of a ``sub-Component'' class. For example, ExpList declares a common interface for objects of classes PairExpList and LastExpList and implements default behavior for that interface as appropriate.

Implementation

Click here for a C++ implementation of the simple expression tree defined in the Structure section.

Click here for a C++ implementation of the more complex tree defined in the Consequences section. This implementation uses the two-level hierarchy of ``Component'' classes to guarantee that the tree will be well-formed and thereby avoid run-time checks.

Computations over a tree

Computations over a tree are examples of the Visitor design pattern. The remainder of this sub-section is organized in the same manner as Gamma, pages 331-344. Our example is a printing visitor.

Intent

Define a new operation to print a simple expression tree in normal algebraic notation. The C++ implementation of the tree, given by file Tree.h, will not be changed.

Motivation

A compiler must perform a number of computations over the tree that represents the source program. We do not want to have to repeatedly edit the file defining that tree as we develop those computations.

Applicability

The Visitor pattern is applicable to a compiler because

Structure

The concrete elements of the tree are the Leaf and Composite objects of the Composite pattern defined in the last section.

Participants

Collaboration

Consequences

None of the operations defined by ExpressionVisitor returns a result or is given an argument other than the node pointer. This design reflects the fact that visitors are used for such a wide variety of computations that nothing else makes sense. A visitor must therefore retain information in its own state as it traverses the tree.

For example, Printer keeps a public variable that contains the string representation of the tree it has just visited. Each operation is responsible for ensuring that when it terminates this variable contains the string representation of the construct. Since there is only one such variable, each operation must save its value when necessary. (Clearly the VisitSum operation must save the result of visiting one of its operands while it is visiting the other operand.)

An alternative to saving the content of a public variable is to simply create a new visitor to visit each child. Creating a new visitor is usually more costly than saving state in local variables of a method, however, and may lead to memory leaks.

Information can be passed into a visitor invocation by storing that information into a public variable of the visitor's class before invoking the visitor.

Task

Write a C++ or Java program that reads and executes a sequence of commands to build an expression tree, print the current tree as a string with only required parentheses, and print the integer result of evaluating the current tree. The program need not do any error checking: you may assume that all commands are valid, that they are given in an appropriate sequence, and that all integer inputs and intermediate results are in range.

Program behavior

The behavior of the program is defined in terms of a single stack of tree nodes:

CommandEffect
integerCreate a leaf specifying the integer value and push it onto the stack.
+Remove the top two nodes from the stack, create a new node specifying the addition of those nodes, and push it onto the stack.
-Remove the top two nodes from the stack, create a new node specifying the subtraction of those nodes, and push it onto the stack.
*Remove the top two nodes from the stack, create a new node specifying the multiplication of those nodes, and push it onto the stack.
/Remove the top two nodes from the stack, create a new node specifying the integer division of those nodes, and push it onto the stack.
pPrint the expression specified by the top node of the stack, with only essential parentheses.
eEvaluate the expression specified by the top node of the stack.
xTerminate the program.

A prompt ``> '' indicates that the program is ready to read a command. Here are some examples:

> 1
> 2
> 3
> *
> +
> p
1+2*3
> e
7
> x
> 1
> 2
> +
> 3
> 4
> +
> *
> p
(1+2)*(3+4)
> e
21
> x
> 2
> 3
> /
> p
2/3
> e
0
> 3
> 2
> /
> p
3/2
> e
1
> +
> p
2/3+3/2
> e
1
> x
> 4
> 2
> -
> 3
> 2
> -
> -
> p
4-2-(3-2)
> e
1
> x

Program development

Use Eli to build C++ or Java classes that define the expression tree structure discussed above and its associated abstract visitor. You should plan to do this step in the laboratory on Wednesday, August 31. Click here for a description of the process. Note that you must specify either lang=CPP (for C++) or lang=Java (for Java).

Write the following additional code:

  1. A concrete visitor class to print a tree as a string in standard algebraic notation with only essential parentheses. Assume that each of the operators associates to the left, that they have the standard precedence relations, and that none are either associative or commutative. Parentheses are necessary if the interpretation of these rules would result in a tree different than the original.
  2. A concrete visitor class to evaluate a tree and print the result as an integer.
  3. A main program that reads commands from the standard input stream and invokes tree constructors and visitors as necessary to provide the behavior described above.
Your printing visitor should contain only information needed during printing, and your evaluation visitor should contain only information needed during evaluation.

Planning information

Please note that this information is provided only to give you some indication of the problem's complexity.

The instructor's solution to this problem required about 230 lines of code. Development time was approximately 100 minutes.

Program submission

When you have convinced yourself that you have a working program that meets the specifications of the problem, create a compressed tar file or jar file containing your solution. Send that file as an electronic mail attachment to a message giving your name(s) and the title of the assignment. Do not include your solution file as part of the message body!
Instructor
Revision $Revision: 1.12 $ ($Date: 2005/08/23 21:11:16 $)