170 THE STATIC STRUCTURE: CLASSES §7.3

Modules and types

Programming languages and other notations used in software development (design languages, specification languages, graphical notations for analysis) always include both some module facility and some type system.

A module is a unit of software decomposition. Various forms of module, such as routines and packages, were studied in an earlier chapter (see chapter 3). Regardless of the exact choice of module structure, we may call the notion of module a syntactic concept, since the decomposition into modules only affects the form of software texts, not what the software can do; it is indeed possible in principle to write any Ada program as a single package, or any Pascal program as a single main program. Such an approach is not recommended, of course, and any competent software developer will use the module facilities of the language at hand to decompose his software into manageable pieces. But if we take an existing program, for example in Pascal, we can always merge all the modules into a single one, and still get a working system with equivalent semantics. (The presence of recursive routines makes the conversion process less trivial, but does not fundamentally affect this discussion.) So the practice of decomposing into modules is dictated by sound engineering and project management principles rather than intrinsic necessity.

Types, at first sight, are a quite different concept. A type is the static description of certain dynamic objects: the various data elements that will be processed during the execution of a software system. The set of types usually includes predefined types such as INTEGER and CHARACTER as well as developer-defined types: record types (also known as structure types), pointer types, set types (as in Pascal), array types and others. The notion of type is a semantic concept, since every type directly influences the execution of a software system by defining the form of the objects that the system will create and manipulate at run time.

The class as module and type

In non-O-O approaches, the module and type concepts remain distinct. The most remarkable property of the notion of class is that it subsumes these two concepts, merging them into a single linguistic construct. A class is a module, or unit of software decomposition; but it is also a type (or, in cases involving genericity, a type pattern).

Much of the power of the object-oriented method derives from this identification. Inheritance, in particular, can only be understood fully if we look at it as providing both module extension and type specialization.

What is not clear yet is how it is possible in practice to unify two concepts which appear at first so distant. The discussion and examples in the rest of this chapter will answer this question.
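The double role of a class can be previewed with a small sketch. The Python below is only an illustration, not the notation used in this book, and the names Counter, ResettableCounter, increment and reset are hypothetical. It shows one class acting as a module (a named group of features) and as a type (a description of run-time objects), with an heir that both extends the module and specializes the type.

```python
class Counter:
    """Module view: groups the feature `increment` under one name.
    Type view: describes run-time objects that hold a `count`."""

    def __init__(self) -> None:
        self.count = 0

    def increment(self) -> None:
        self.count += 1


class ResettableCounter(Counter):
    """Heir of Counter: adds a feature (module extension) and
    describes a more specific kind of object (type specialization)."""

    def reset(self) -> None:
        self.count = 0


c = ResettableCounter()
c.increment()
c.increment()

# Type view: an instance of the heir is also an instance of the parent.
is_counter = isinstance(c, Counter)

# Module view: the heir's public routines include those of the parent.
features = {name for name in dir(ResettableCounter) if not name.startswith("_")}
```

Here `features` lists the routines visible through the heir, and `is_counter` records that every ResettableCounter object may be used where a Counter is expected.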
7.4 A UNIFORM TYPE SYSTEM

An important aspect of the O-O approach as we will develop it is the simplicity and uniformity of the type system, deriving from a fundamental property:

Object rule
Every object is an instance of some class.

The Object rule will apply not just to composite, developer-defined objects (such as data structures with several fields) but also to basic objects such as integers, real numbers, boolean values and characters, which will all be considered to be instances of predefined library classes (INTEGER, REAL, DOUBLE, BOOLEAN, CHARACTER).

This zeal to make every possible value, however simple, an instance of some class may at first appear exaggerated or even extravagant. After all, mathematicians and engineers have used integers and reals successfully for a long time, without knowing they were manipulating class instances. But insisting on uniformity pays off for several reasons:

• It is always desirable to have a simple and uniform framework rather than many special cases. Here the type system will be entirely based on the notion of class.

• Describing basic types as ADTs and hence as classes is simple and natural. It is not hard, for example, to see how to define the class INTEGER with features covering arithmetic operations such as "+", comparison operations such as "<=", and the associated properties, derived from the corresponding mathematical axioms. (The mathematical axioms defining integers are known as Peano's axioms.)

• By defining the basic types as classes, we allow them to take part in all the O-O games, especially inheritance and genericity. If we did not treat the basic types as classes, we would have to introduce severe limitations and many special cases.

As an example of inheritance, classes INTEGER, REAL and DOUBLE will be heirs to more general classes: NUMERIC, introducing the basic arithmetic operations such as "+", "-" and "*", and COMPARABLE, introducing comparison operations such as "<". As an example of genericity, we can define a generic class MATRIX whose generic parameter represents the type of matrix elements, so that instances of MATRIX [INTEGER] represent matrices of integers, instances of MATRIX [REAL] represent matrices of reals and so on. As an example of combining genericity with inheritance, the preceding definitions also allow us to use the type MATRIX [NUMERIC], whose instances represent matrices containing objects of type INTEGER as well as objects of type REAL and objects of any new type T defined by a software developer so as to inherit from NUMERIC.

With a good implementation, we do not need to fear any negative consequence from the decision to define all types from classes. Nothing prevents a compiler from having special knowledge about the basic classes; the code it generates for operations on values of types such as INTEGER and BOOLEAN can then be just as efficient as if these were built-in types in the language.
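To make the Object rule and the MATRIX examples more tangible, here is a minimal sketch in Python, a language that happens to follow a similar rule. It is an analogy under stated assumptions, not the notation of this book: the standard abstract class numbers.Number stands in loosely for NUMERIC, the class Matrix and its routine total are hypothetical, and Python's type hints are not enforced at run time.

```python
from fractions import Fraction
from numbers import Number
from typing import Generic, List, TypeVar

# Object rule, Python flavour: basic values are instances of classes too.
# (Python has no separate character class; a one-character str stands in.)
basic_class_names = [type(v).__name__ for v in (3, 2.5, True, "c")]

# Operators are features of the class: 3 + 4 and 3 <= 4 are calls on int.
seven = (3).__add__(4)
three_le_four = (3).__le__(4)

# numbers.Number plays the role of NUMERIC here (a loose analogy).
T = TypeVar("T", bound=Number)


class Matrix(Generic[T]):
    """Hypothetical generic matrix; T is the type of the elements."""

    def __init__(self, rows: List[List[T]]) -> None:
        self.rows = rows

    def total(self):
        """Sum of all elements, relying only on the "+" operation."""
        return sum(item for row in self.rows for item in row)


# Counterpart of MATRIX [INTEGER]: all elements are integers.
ints = Matrix[int]([[1, 2], [3, 4]])

# Counterpart of MATRIX [NUMERIC]: elements of several numeric classes.
mixed = Matrix[Number]([[1, 2.5], [Fraction(1, 2), 4]])
```

Fraction appears only as an example of a further numeric class beyond the basic ones; a developer-defined class would take part in the same way provided it supplies the arithmetic operations that total relies on.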
Reaching the goal of a fully consistent and uniform type system requires the combination of several important O-O techniques, to be seen only later: expanded classes, to ensure proper representation of simple values; infix and prefix operators, to enable usual arithmetic syntax (such as a < b or -a rather than the more cumbersome a.less_than (b) or a.negated); constrained genericity, needed to define classes which may be adapted to various types with specific operations, for example a class MATRIX that can represent matrices of integers as well as matrices of elements of other numeric types.

7.5 A SIMPLE CLASS

Let us now see what classes look like by studying a simple but typical example, which shows some of the fundamental properties applicable to almost all classes.

The features

The example is the notion of point, as it could appear in a two-dimensional graphics system.

[Figure: A point and its coordinates (a point p1 with its cartesian coordinates x, y and its polar coordinates ρ, θ)]

To characterize type POINT as an abstract data type, we would need the four query functions x, y, ρ, θ. (The names of the last two will be spelled out as rho and theta in software texts.) Function x gives the abscissa of a point (horizontal coordinate), y its ordinate (vertical coordinate), ρ its distance to the origin, θ the angle to the horizontal axis. The values of x and y for a point are called its cartesian coordinates, those of ρ and θ its polar coordinates. Another useful query function is distance, which will yield the distance between two points.

Then the ADT specification would list commands such as translate (to move a point by a given horizontal and vertical displacement), rotate (to rotate the point by a certain angle, around the origin) and scale (to bring the point closer to or further from the origin by a certain factor). (The name translate refers to the "translation" operation of geometry.)

It is not difficult to write the full ADT specification including these functions and some of the associated axioms. For example, two of the function signatures will be

    x: POINT → REAL
    translate: POINT × REAL × REAL → POINT

and one of the axioms will be (for any point p1 and any reals a, b):

    x (translate (p1, a, b)) = x (p1) + a

expressing that translating a point by <a, b> increases its abscissa by a.
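The axiom can be checked mechanically on sample values. The Python sketch below is an illustration under stated assumptions, not part of the ADT specification: it models POINT as an immutable record, writes x and translate as side-effect-free functions to mirror the two signatures, and assumes that translate also shifts the ordinate by b. The names Point and axiom_holds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Immutable stand-in for a value of the ADT POINT."""
    abscissa: float
    ordinate: float


def x(p: Point) -> float:
    """x: POINT -> REAL"""
    return p.abscissa


def translate(p: Point, a: float, b: float) -> Point:
    """translate: POINT x REAL x REAL -> POINT (yields a new point)."""
    return Point(p.abscissa + a, p.ordinate + b)


def axiom_holds(p1: Point, a: float, b: float) -> bool:
    """Check x (translate (p1, a, b)) = x (p1) + a for one sample."""
    return x(translate(p1, a, b)) == x(p1) + a


samples = [
    (Point(1.0, 2.0), 3.0, -4.0),
    (Point(0.0, 0.0), 0.0, 0.0),
    (Point(-2.5, 7.0), 10.0, 1.0),
]
all_hold = all(axiom_holds(p1, a, b) for p1, a, b in samples)
```

A check on samples does not prove the axiom; it only shows that this particular model is consistent with it for the values tried.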
You may wish to complete this ADT specification by yourself (see exercise E7.2, page 216). The rest of this discussion will assume that you have understood the ADT, whether or not you have written it formally in full, so that we can focus on its implementation: the class.

Attributes and routines

Any abstract data type such as POINT is characterized by a set of functions, describing the operations applicable to instances of the ADT. In classes (ADT implementations), functions will yield features, the operations applicable to instances of the class.

We have seen that ADT functions are of three kinds: queries, commands and creators (see "Function categories", page 134). For features, we need a complementary classification, based on how each feature is implemented: by space or by time.

The example of point coordinates shows the difference clearly. Two common representations are available for points: cartesian and polar. If we choose cartesian representation, each instance of the class will contain two fields representing the x and y of the corresponding point:

[Figure: Representing a point in cartesian coordinates: an object with fields x and y (CARTESIAN_POINT)]

If p1 is the point shown, getting its x or its y simply requires looking up the corresponding field in this structure. Getting ρ or θ, however, requires a computation: for ρ we must compute √(x² + y²), and for θ we must compute arctg (y/x) with non-zero x.

If we use polar representation, the situation is reversed: ρ and θ are now accessible by simple field lookup, x and y require small computations (of ρ cos θ and ρ sin θ).

[Figure: Representing a point in polar coordinates: an object with fields rho and theta (POLAR_POINT)]

This example shows the need for two kinds of feature:

• Some features will be represented by space, that is to say by associating a certain piece of information with every instance of the class. They will be called attributes. For points, x and y are attributes in cartesian representation; rho and theta are attributes in polar representation.
• Some features will be represented by time, that is to say by defining a certain computation (an algorithm) applicable to all instances of the class. They will be called routines. For points, rho and theta are routines in cartesian representation; x and y are routines in polar representation.

A further distinction affects routines (the second of these categories). Some routines will return a result; they are called functions. Here x and y in polar representation, as well as rho and theta in cartesian representation, are functions since they return a result, of type REAL. Routines which do not return a result correspond to the commands of an ADT specification and are called procedures. For example the class POINT will include procedures translate, rotate and scale.

Be sure not to confuse the use of "function" to denote result-returning routines in classes with the earlier use of this word to denote the mathematical specifications of operations in abstract data types. This conflict is unfortunate, but follows from well-established usage of the word in both the mathematics and software fields.

The following tree helps visualize this classification of features:

[Figure: Feature classification, by role]

    Feature
        No result: Command → Procedure (routine)
        Returns result: Query
            Arguments → Function (routine)
            No argument
                Computation → Function (routine)
                Memory → Attribute

This is an external classification, in which the principal question is how a feature will look to its clients (its users).

We can also take a more internal view, using as primary criterion how each feature is implemented in the class; this leads to a different classification.
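Before moving on, the distinctions drawn so far (attribute versus routine, function versus procedure) can be made concrete with the two point representations. The following is a minimal Python sketch, not the class text developed in this chapter; the class names CartesianPoint and PolarPoint are hypothetical, rotate is omitted for brevity, and math.atan2 is substituted for arctg (y/x) so that the case of a zero x needs no special treatment.

```python
import math


class CartesianPoint:
    """Cartesian representation: x and y take space, rho and theta take time."""

    def __init__(self, x: float, y: float) -> None:
        self.x = x  # attribute: stored in every instance
        self.y = y  # attribute

    def rho(self) -> float:
        """Function: distance to the origin, computed on demand."""
        return math.hypot(self.x, self.y)

    def theta(self) -> float:
        """Function: angle to the horizontal axis (atan2 replaces arctg (y/x))."""
        return math.atan2(self.y, self.x)

    def translate(self, a: float, b: float) -> None:
        """Procedure: changes the point and returns no result."""
        self.x += a
        self.y += b

    def scale(self, factor: float) -> None:
        """Procedure: moves the point closer to or further from the origin."""
        self.x *= factor
        self.y *= factor


class PolarPoint:
    """Polar representation: rho and theta take space, x and y take time."""

    def __init__(self, rho: float, theta: float) -> None:
        self.rho = rho      # attribute
        self.theta = theta  # attribute

    def x(self) -> float:
        """Function: rho cos theta."""
        return self.rho * math.cos(self.theta)

    def y(self) -> float:
        """Function: rho sin theta."""
        return self.rho * math.sin(self.theta)

    def scale(self, factor: float) -> None:
        """Procedure: only rho changes in this representation."""
        self.rho *= factor


p = CartesianPoint(3.0, 4.0)
q = PolarPoint(p.rho(), p.theta())  # the same point, represented differently
```

The same four queries exist in both classes; what changes from one representation to the other is which of them are attributes and which are functions.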