# Purpose

To provide some insight into important data structures and operations:
• Gain experience with important design patterns
• Develop tree computations
• See how trees might be constructed

# Due dates

You must submit your solution via electronic mail by 9:00am on Tuesday, September 13. Your submission must be an attachment, and must contain your name(s) and the assignment subject. The attachment may be either a tar file or a jar file.

This assignment is worth 20 points.

# Background

Experience in object-oriented programming techniques has been distilled into a number of design patterns -- proven techniques for solving large classes of problems. Two of these patterns are applicable to many of the sub-problems of compiler construction. Because of their importance, you need to gain some experience in using them. Your program design for this homework is therefore constrained to make use of these two patterns.

## Representing a tree

A tree used in a compiler is an example of the use of the Composite design pattern. The remainder of this subsection is organized in the same manner as Gamma, pages 163-173. Our example is a tree representing simple integer expressions.

### Intent

Compose constants with operators into tree structures to represent expressions. Composite lets us treat individual constants and compositions of constants and operators uniformly.

### Motivation

An operator has two operands, each of which can be any constant or combination of operators and operands. Although it would be possible to build a different class for every possible combination of components, this would lead to a large and complex definition.

### Applicability

We want to represent arbitrary expressions, and we want clients processing individual expressions to be able to ignore the specific kinds of operands. Clients will treat all operands uniformly.

### Structure

Here is the formal definition of an (infinite) set of expression trees:

```RULE Sum:        Expression ::= Expression '+' Expression END;
RULE Difference: Expression ::= Expression '-' Expression END;
RULE Product:    Expression ::= Expression '*' Expression END;
RULE Quotient:   Expression ::= Expression '/' Expression END;
RULE Constant:   Expression ::= Integer                   END;
```

This definition is basically a textual form of the diagrams described in Gamma, Appendix B.

### Participants

• Component (Expression)
• declares the interface (the Accept method) for objects in the composition (the tree).
• implements the default behavior for the interface (the Accept method) common to all classes (do nothing).

• Leaf (Constant)
• represents leaf objects (integer values) in the composition. A leaf has no children.
• defines behavior (yielding a value, invoking a method) for primitive objects (integer values) in the composition.

• Composite (Sum, Difference, Product, Quotient)
• defines behavior (yielding children, invoking a method) for components having children.
• stores child components.

• Client (e.g. a program building an expression tree)
• manipulates objects in the composition through the component interface

Note that, unlike the description in Gamma, Expression has no interface for accessing and managing its child components. The reason for this is that, in a compiler, trees are never modified after they are constructed. Thus there is no need to ``manage'' child components. Visitors are the only clients that need to access child components, and they are specific to each composite.

Because there are no child-related operations in the Component interface, there is no need for Composite objects to implement them.

### Collaborations

• A client usually constructs a tree from a linear sequence by using a stack. When the representation of a leaf is encountered in the sequence, a leaf object is constructed and pushed onto the stack as a component. When the representation of an operator is encountered, the components representing its operands are popped off of the stack. The composite defined by the operator is constructed with the operand components as children, and pushed onto the stack as a component.

• When a computation must be done over the tree (for example, the expression is to be evaluated or printed), the client uses the interface of the component at the root of the tree to Accept the appropriate visitor. (Visitors are discussed below.)

### Consequences

Gamma says that use of the Composite pattern ``can make your design overly general''. The problem is that in its published form, the Composite pattern provides only one ``Component'' class. For example, consider the following tree structure (adapted from Appel, Section 1.3):

```RULE CompoundStm: Statement  ::= Statement ';' Statement          END;
RULE AssignStm:   Statement  ::= Identifier ':=' Expression       END;
RULE PrintStm:    Statement  ::= 'print' '(' ExpList ')'          END;
RULE PairExpList: ExpList    ::= Expression ',' ExpList           END;
RULE LastExpList: ExpList    ::= Expression                       END;
RULE EseqExp:     Expression ::= '(' Statement ',' Expression ')' END;
RULE Sum:         Expression ::= Expression '+' Expression        END;
RULE Difference:  Expression ::= Expression '-' Expression        END;
RULE Product:     Expression ::= Expression '*' Expression        END;
RULE Quotient:    Expression ::= Expression '/' Expression        END;
RULE Constant:    Expression ::= Integer                          END;
RULE Variable:    Expression ::= Identifier                       END;
```

If we were to implement this tree with only one ``Component'' class (Node, say) then the two children of Sum would be of type Node*, as would the two children of CompoundStm. This would allow ill-formed trees and require run-time checks.

The solution to this problem is to have a two-level hierarchy of ``Component'' classes: At the root is Node, which plays the role of the ``Component'' class in the Composite pattern. Statement, ExpList, and Expression each plays the role of a ``sub-Component'' class. For example, ExpList declares a common interface for objects of classes PairExpList and LastExpList and implements default behavior for that interface as appropriate.

### Implementation

Click here for a C++ implementation of the simple expression tree defined in the Structure section.

Click here for a C++ implementation of the more complex tree defined in the Consequences section. This implementation uses the two-level hierarchy of ``Component'' classes to guarantee that the tree will be well-formed and thereby avoid run-time checks.

## Computations over a tree

Computations over a tree are examples of the Visitor design pattern. The remainder of this sub-section is organized in the same manner as Gamma, pages 331-344. Our example is a printing visitor.

### Intent

Define a new operation to print a simple expression tree in normal algebraic notation. The C++ implementation of the tree, given by file Tree.h, will not be changed.

### Motivation

A compiler must perform a number of computations over the tree that represents the source program. We do not want to have to repeatedly edit the file defining that tree as we develop those computations.

### Applicability

The Visitor pattern is applicable to a compiler because
• a program is composed of many constructs with differing properties, and the analysis and translation operations depend on particular constructs.

• a number of largely-independent operations must be performed on the program tree, and we want to separate concerns to facilitate development and maintenance.

• the tree structure representing the program rarely changes, because it reflects fundamental concepts in programming languages.

### Structure

The concrete elements of the tree are the Leaf and Composite objects of the Composite pattern defined in the last section.

### Participants

• Visitor (ExpressionVisitor)
• declares a Visit operation for each of the concrete elements of the tree. The operator's name (e.g. VisitSum) and signature identifies the class of the element being visited. Thus the operation knows which interface to use to carry out its task.

• ConcreteVisitor (Printer)
• implements the printing of the tree. Each operation implements a fragment of the complete algorithm, printing a representation of one tree node. Storage for the string being built up is provided as a public variable of the Printer class. Individual methods may contain additional local storage to save this variable during visits.

• Element (Expression)
• defines an Accept operation that takes a Visitor as its argument.

• ConcreteElement (Constant, Sum)
• implements an Accept operation that takes a Visitor as its argument.

• ObjectStructure (Stack)
• can enumerate its elements
• provides a high-level interface to allow the Visitor to visit its elements
• is a collection of Composites

### Collaboration

• A program that wants to print a tree must create a Printer and then pass it to the Accept method of the tree's root. Each operation that Printer implements for a Composite object in the tree uses the interface of that Composite object to visit its children. Thus the statement made on page 335 Gamma is false for compilers: The client does not traverse the object structure; that's done by the visitor itself.

The reason for leaving control of the tree traversal in the hands of the concrete visitor is that different tasks may require different traversals. We tend to fall into the rut of using depth-first, left-to-right traversals but in many cases we should do something completely different.

• When a Sum node of the tree is visited, it calls Printer's VisitSum operation. A pointer to the specific Sum node is provided so that VisitSum can access its state.

### Consequences

None of the operations defined by ExpressionVisitor returns a result or is given an argument other than the node pointer. This design reflects the fact that visitors are used for such a wide variety of computations that nothing else makes sense. A visitor must therefore retain information in its own state as it traverses the tree.

For example, Printer keeps a public variable that contains the string representation of the tree it has just visited. Each operation is responsible for ensuring that when it terminates this variable contains the string representation of the construct. Since there is only one such variable, each operation must save its value when necessary. (Clearly the VisitSum operation must save the result of visiting one of its operands while it is visiting the other operand.)

An alternative to saving the content of a public variable is to simply create a new visitor to visit each child. Creating a new visitor is usually more costly than saving state in local variables of a method, however, and may lead to memory leaks.

Information can be passed into a visitor invocation by storing that information into a public variable of the visitor's class before invoking the visitor.

Write a C++ or Java program that reads and executes a sequence of commands to build an expression tree, print the current tree as a string with only required parentheses, and print the integer result of evaluating the current tree. The program need not do any error checking: you may assume that all commands are valid, that they are given in an appropriate sequence, and that all integer inputs and intermediate results are in range.

## Program behavior

The behavior of the program is defined in terms of a single stack of tree nodes:

CommandEffect
integerCreate a leaf specifying the integer value and push it onto the stack.
+Remove the top two nodes from the stack, create a new node specifying the addition of those nodes, and push it onto the stack.
-Remove the top two nodes from the stack, create a new node specifying the subtraction of those nodes, and push it onto the stack.
*Remove the top two nodes from the stack, create a new node specifying the multiplication of those nodes, and push it onto the stack.
/Remove the top two nodes from the stack, create a new node specifying the integer division of those nodes, and push it onto the stack.
pPrint the expression specified by the top node of the stack, with only essential parentheses.
eEvaluate the expression specified by the top node of the stack.
xTerminate the program.

A prompt ``> '' indicates that the program is ready to read a command. Here are some examples:

 ```> 1 > 2 > 3 > * > + > p 1+2*3 > e 7 > x ``` ```> 1 > 2 > + > 3 > 4 > + > * > p (1+2)*(3+4) > e 21 > x ``` ```> 2 > 3 > / > p 2/3 > e 0 > 3 > 2 > / > p 3/2 > e 1 > + > p 2/3+3/2 > e 1 > x ``` ```> 4 > 2 > - > 3 > 2 > - > - > p 4-2-(3-2) > e 1 > x ```

## Program development

Use Eli to build C++ or Java classes that define the expression tree structure discussed above and its associated abstract visitor. You should plan to do this step in the laboratory on Wednesday, August 31. Click here for a description of the process. Note that you must specify either lang=CPP (for C++) or lang=Java (for Java).

1. A concrete visitor class to print a tree as a string in standard algebraic notation with only essential parentheses. Assume that each of the operators associates to the left, that they have the standard precedence relations, and that none are either associative or commutative. Parentheses are necessary if the interpretation of these rules would result in a tree different than the original.
2. A concrete visitor class to evaluate a tree and print the result as an integer.
3. A main program that reads commands from the standard input stream and invokes tree constructors and visitors as necessary to provide the behavior described above.
Your printing visitor should contain only information needed during printing, and your evaluation visitor should contain only information needed during evaluation.

## Planning information

Please note that this information is provided only to give you some indication of the problem's complexity.

The instructor's solution to this problem required about 230 lines of code. Development time was approximately 100 minutes.

## Program submission

When you have convinced yourself that you have a working program that meets the specifications of the problem, create a compressed tar file or jar file containing your solution. Send that file as an electronic mail attachment to a message giving your name(s) and the title of the assignment. Do not include your solution file as part of the message body!
 Instructor Revision \$Revision: 1.12 \$ (\$Date: 2005/08/23 21:11:16 \$)