Abstractions of Programs


To focus your attention on the use of trees as abstractions of programs and to explore two possible implementations:

Due dates

You must complete the PEP tasks (individually), and then hand in the answers to the questions in this exercise (as a group) at the beginning of class on Wednesday, September 9. This assignment is worth 20 points.


A compiler reads text and builds an internal representation that it can analyze. The particular internal representation is chosen because it facilitates the compiler's analysis by making important contextual relationships explicit. Contextual relationships are important if they bear on the goals of the analysis:

Both the legality and the meaning of a specific construct are usually stated in terms of the components of that construct, so a tree is a common abstraction because it makes the ``component'' relation explicit.

In designing an abstraction of a program, we need to understand what information is present explicitly in the text and what information is implicit in the structure. Explicit information must be captured by the original abstraction, while implicit information must be made explicit through computation over the abstraction.

The precise information, both explicit and implicit, to be carried by the abstraction is determined by the problem to be solved. Tree Abstractions for Programs analyzes a specific example: an interpreter for a simple straight-line programming language. In that example, the explicit information carried by the abstraction includes the relationship among the components of the program, the names of the variables, and the values of the constants. The implicit information is the set of expression values computed during interpretation.

How much effort will you need to put forth to understand a document like Tree Abstractions for Programs? Reading and understanding such documents is something that you must do throughout your career, and is an integral part of every project you undertake. You need to develop a process that will generate information to answer that question. After applying such a process you need to reflect on the factors that brought your estimate close to the actual figures, the factors that worked against a close estimate, and how you might use that knowledge to your benefit in the future.


Tree Abstractions for Programs solves the programming problem from Section 1.3 of Appel's book. Familiarize yourself with both the hand-coded solution and the formal specification. Click here for a zip file containing the source code for both. Obtain copies of these solutions via the web, and run both of them.

The hand-coded solution was tested using gcc version 2.7.2. Since it uses the standard template library it may or may not work with other versions of C++. You may make changes to accommodate your C++ compiler, but you must retain the basic structure. You may also produce a Java version of the implementation if you wish. If you do so, you must follow the consistent implementation procedure for tree abstractions.

The specifications were tested using Eli version 4.2. Eli is available on the ITS machine supporting this course and on almost all CS machines. The command is eli. You may also implement Eli on another machine of your choice; click here for information. Answer the following questions about your experiences.

  1. What are the sizes of the source code for the two solutions? Do you think that the complexities of the two solutions are proportional to their size in lines (in other words, is the size of the source code an appropriate measure of complexity)?
  2. Compare the sizes of the executable programs obtained from the two solutions. Can you draw any conclusions about the results? If so, what are they; if not, why can't you?
  3. Appel returns to source program trees ("abstract parse trees") in Section 4.2. The discussion around Figure 4.10 distinguishes two styles, which Appel terms ``syntax separate from interpretations'' and ``object-oriented''. At issue is the flexibility of the style with respect to certain changes.
    1. How would you characterize the hand-coded solution in his terms? Does the use of the "visitor" design pattern have a bearing on this characterization?
    2. How flexible is the generated solution with respect to the changes Appel envisions?
  4. The two solutions store the value of an expression in different ways. How could you change the hand-coded solution so that the two would use identical storage mechanisms? Would that change affect the modularity of the hand-coded solution? What are the corresponding modularity issues for the generated solution?
  5. Some of the rules in the generated solution have no associated computation, but in the hand-coded solution the visitor has a method for every node. Explain this discrepancy.

Complete your personal postmortem for this assignment and your personal project plan for the next assignment, and submit them via the web before class on the due date of this assignment. Your group must hand in one set of answers to the questions at the beginning of class on the due date.

Revision 1.6 (1998/08/31 20:07:52)