# Purpose

To characterize a specification language:
• Problem class
• Type of leverage
• Learning curve

# Due dates

You must submit your answers to the questions via electronic mail by 9:00am on Tuesday, August 30. Your submission must be an attachment, and must contain your name(s) and the assignment subject. The attachment may be a PDF file, PostScript file, or Word document.

This assignment is worth 30 points.

# Background

Language is more than a communication mechanism; our language shapes our thought processes. In a very real sense, if we don't have a name for something then it doesn't exist!

Programming languages allow us to describe algorithms for solving problems. Thus we can express our understanding of a problem by providing a program solving that problem. Our only strategy for controlling complexity in this case is a modular decomposition of the program: we partition the code into components with which we associate mnemonic identifiers. Because the identifiers are mnemonic, we are able to recognize them as representing the associated components and understand the function of those components. In effect, we are defining a new language whose words are the components. That language is ideally suited to the problem at hand, because of the way in which the words were chosen.

Programs solving common problems can be written, packaged as components, named, and placed in a library. We can then express our understanding of a problem simply by stating the name of the library component solving it. For example, the problem of finding the roots of a quadratic equation involves solving ``the square root problem''. That is a common problem, and most mathematical libraries include a function named sqrt to solve it. Thus, when writing a program to find the roots of a quadratic equation, I can solve the subsidiary square root problem by simply invoking the sqrt component from a mathematical library.

The square root problem is solved by a component that has no internal state. In other words, sqrt does not remember any information from one invocation to the next. Many problem solutions have both state and behavior, and the concept of an abstract data type was developed to embody such solutions. In some situations, a number of abstract data types share state and/or behavior. The common aspects of these abstract data types can be factored out and shared via inheritance. Classes are simply abstract data types on which an inheritance relation is defined.

Because classes embody both state and behavior, they broaden the set of problems for which we can create and name components. For example, the problem of recognizing arithmentic expressions involves last-in, first-out storage. Last-in, first-out storage is a common data structure, and most class libraries include a class named Stack to implement it. Thus, when writing a program to recognize an arithmetic expression, I can obtain last-in, first-out storage simply by instantiating the Stack module from a class library.

Libraries cater for the situation in which the problems being solved can be named. Any problem involving an interaction of named problems must either be packaged into a component and named or expressed in terms of the programming language and a set of names. Although this facility is extremely useful, it fails to address a wide range of problems. The issue is one of generalization. Consider the problem of recognizing an arithmetic expression whose operands are single digits or single letters. I can express my understanding of this problem by a program that uses a stack and a single-character input routine. That understanding can be packaged into a class and put into a library that anyone can use to recognize such expressions.

It has long been known, however, that the problem of recognizing such an arithmetic expression is only one instance of the problem of recognizing any sentence in in a very broad class of languages. There is an infinite number of such ``LALR(1) parsing problems'', and once I understand how to solve one of them I understand how to solve them all. How can I express this understanding so that it can be re-used in the same way that sqrt and Stack can be used?

The key insight is that I need a specification language to describe my particular LALR(1) parsing problem in the infinite universe of LALR(1) parsing problems, and there must be a generator to accept my description and create an appropriate program using a stack and a single-character input routine. The specification language captures exactly the information needed to differentiate one LALR(1) parsing problem from another; it does not embody any understanding of how to implement the component that solves the problem. The knowledge about how to produce an appropriate component lies with the generator.

Chapters 2 and 3 of the text describe two specification languages that are useful in specifying generators, and shows how each provides a different sort of leverage on problems in its domain.

One argument against using specification languages and generators is that one must spend time learning the language and learning how to use the generator. The same argument can be used against using libraries, and it represents an important component of the design process: We must estimate how long it will take to master new material relative to the effort to solve the problem without using that material. One significant point in favor of specification languages is that one must understand the problem that one is trying to solve in any event. Usually some record of this understanding must written down, both to guide development and to support maintenance. If the notation used for this record is similar to that of the specification language, one can get the implementation free by using the specification language for the record.

Choose a specification language other than those supported by Eli or described in the text. (You may choose a problem class with which you are familiar and define your own specification language if you wish.)

Characterize the problem class for which this specification language is useful, and briefly describe one example from that class.

Explain how the language and its associated generator provides leverage for the user:

• Does it dramatically reduce the amount they have to write?
• Does it hide complexity that they will not have to master?
• Does it guarantee some important property of the solution that would be questionable in a hand implementation?
• Does it allow the user to gather information in one location that would otherwise be scattered?

Will it require significant effort to master the language, or would the user have to write a similar specification as part of a design anyway? Is the generator easy to use, or does it require a lot of coordination with other tools and specifications?

What is the cost generator relative to the savings in development cost from using it? (Assume that one man-year of programming effort costs \$150,000.)