Data Mapping


To illustrate typical techniques for representing source program data on computers:

Due dates

You must complete the PEP tasks (individually), and then hand in the answers to the questions in this exercise (as a group) at the beginning of class on Monday, November 2. This assignment is worth 30 points.


Data mapping is the task of representing source language objects on the target machine. Section 8.1 of Waite explains the relationship between source and target data representations, and lays out the approach that the compiler designer follows when determining how to map source data onto the target machine. This approach involves a sequence of distinct steps:

  1. Decide how to use target data types to implement source data types.
  2. Decide how to implement storage areas (such as activation records) implied by the source language.
  3. Determine the target storage characteristics of source data types and storage areas.
  4. Identify each source tree context (such as a variable declaration or construct bounding variable lifetime) that plays a role in data mapping.
  5. Use the two-step approach to design the necessary computations.

Using target data types to implement source data types

It is always important to go to the best sources for information about the problem: the source language standard and the target machine manual. Textbooks can provide overviews and discussion, but the details are sometimes missing or incorrect. The target machine for homework in this course is the SPIM simulator for the MIPS R2000. An additional reference discussing the MIPS hardware is the book by Kane. Use the SPIM Manual as your primary reference for the target machine and the Kane book as backup for providing insight.

You can run the SPIM simulator via the command spim (providing a command line interface) or xspim (providing an X-windows interface) on either rintintin or nag, or you can download a copy and implement it on some other machine of your choice.

Implementing activation records

Activation record mapping is discussed in Section 8.1.3, again in terms of Pascal and the VAX. C-- is much simpler than Pascal, since there is only a single routine. Nevertheless, C-- objects have defined extents within the program as described in Section A.1.3. While a C-- program has only a single activation record, the compiler can minimize the size of that activation record by overlaying the storage needed for variables whose extents are disjoint.

The basic structure of the activation record should, as always, be determined by the target machine. Thus the activation record design for a compiler from C-- to SPIM code must follow the overall conventions dictated by the SPIM simulator. These are summarized in Section 4 of the SPIM Manual, entitled ``Calling Conventions''. Within that structure the compiler can lay out storage for the parameters and local variables of the C-- program in such a way as to minimize the amount of storage required.

One simple strategy is to treat the space for parameters and local variables like a Pascal variant record (Section 8.1.2): Storage for nested ranges acts like nested variants.


Answer the following questions:
  1. How will you represent the C-- types on the SPIM simulator?

  2. How do C-- parameters (Section A.1.3.2) differ from the Pascal parameters described in Section 8.1.3?

  3. Define the layout of the activation record you will use. Be specific about what will be stored where, and what characteristics of C-- or SPIM you are accommodating. Explain how parameters and variables will be accessed using SPIM addressing modes.

  4. Show the activation record for the factorial program, writing a SPIM load instruction for each variable and parameter.

  5. Write a C-- program that illustrates sharing of storage by variables. Show its activation record, identifying the contents of each location. Quote the section of the C-- definition that allows the sharing you describe.

  6. Define C++ operations or Eli attribute computations that establish
    1. The equivalent of Section 8.3.1's StorageRequired value for each of the basic types
    2. The equivalent of Section 8.3.1's StorageRequired value for an empty activation record.
    Briefly justify your definition.

  7. Where would you invoke these operations, and where would you store the result in each case?

  8. Which symbols in the C-- abstract tree are the roots of subtrees representing extents of parameters or variables?

  9. Suppose that I want to determine, for each subtree representing an extent of parameters or variables, the total storage required for the variables of that extent and of any greater extent.
    1. Carry out the two-step approach to sketch the design of a computation yielding this information.
    2. Explain why the minimum storage for the activation record results from overlaying these storage requirements.
    3. Carry out the two-step approach to sketch the design of a computation yielding the storage for the activation record.
Write a brief description of the tasks to be carried out by each team member for the next assignment, giving a rationale for your partitioning. This description forms a part of your group's project plan, but it should be submitted by electronic mail to rather than being entered into the PEP system.

Complete your personal postmortem for this assignment and your personal project plan for the next assignment, and submit them via the web before class on the due date of this assignment. Your group must hand in one set of answers to the questions at the beginning of class on the due date.

Revision 1.1 (1998/09/29 17:43:11)