222 THE RUN-TIME STRUCTURE: OBJECTS §8.1

implies that the author of each class controls the precise set of operations that clients may execute on its instances. No such direct field assignment is possible in an O-O context; clients will perform field modifications through procedures of the class. Later in this chapter we will add to BOOK1 a procedure that gives clients the effect of the above assignment, if the author of the class indeed wishes to grant them such privileges.

We have already seen that C++ and Java actually permit assignments of the form b1.page_count := 355. But this simply reflects the inherent limits of attempts to integrate object technology in a C context. As the designers of Java themselves write in their book about the language ([Arnold 1996], page 40): “A programmer could still mess up the object by setting [a public] field, because the field [is] subject to change” through direct assignment instructions. Too many languages require such “don’t do this” warnings. Rather than propose a language and then explain at length how not to use it, it is desirable to define hand in hand the method and a notation that will support it. (See also “If it is baroque, fix it”, page 670.)

In proper O-O development, classes without routines, such as BOOK1, have little practical use (except as ancestors in an inheritance hierarchy, where descendants will inherit the attributes and provide their own routines; or to represent external objects which the O-O part can access but not modify, for example sensor data in a real-time system). But they will help us go through the basic concepts; then we will add routines.

Writers

Using the types mentioned above, we can also define a class WRITER describing a simple notion of book author:

    class WRITER feature
        name, real_name: STRING
        birth_year, death_year: INTEGER
    end

[Figure: A “writer” object. An instance of WRITER with fields name = "Stendhal", real_name = "Henri Beyle", birth_year = 1783, death_year = 1842.]

References

Objects whose fields are all of basic types will not take us very far. We need objects with fields that represent other objects. For example we will want to represent the property that a book has an author, denoted by an instance of class WRITER.
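As a rough illustration, in Python rather than in the notation of this book, the sketch below mirrors class WRITER and the earlier point that clients change a field only through a procedure supplied by the class. The names Writer and record_death are hypothetical, introduced only for this example; the use of 0 for an unrecorded death year is likewise an assumption. Python enforces the restriction only by convention (read-only properties), not by language rule.

```python
class Writer:
    """Rough Python counterpart of class WRITER (illustrative only)."""

    def __init__(self, name: str, real_name: str, birth_year: int) -> None:
        self._name = name
        self._real_name = real_name
        self._birth_year = birth_year
        self._death_year = 0  # assumption: 0 stands for "not recorded"

    # Clients may query the fields ...
    @property
    def name(self) -> str:
        return self._name

    @property
    def real_name(self) -> str:
        return self._real_name

    @property
    def birth_year(self) -> int:
        return self._birth_year

    @property
    def death_year(self) -> int:
        return self._death_year

    # ... but they modify them only through a procedure of the class.
    def record_death(self, year: int) -> None:
        if year < self._birth_year:
            raise ValueError("death year precedes birth year")
        self._death_year = year


stendhal = Writer("Stendhal", "Henri Beyle", 1783)
stendhal.record_death(1842)

# A direct field assignment is rejected, because the property has no setter.
try:
    stendhal.death_year = 1900
    direct_assignment_allowed = True
except AttributeError:
    direct_assignment_allowed = False
```

The class author decides which modifications exist at all: here a client can record a death year, subject to a consistency check, but cannot rename the writer.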
A possibility is to introduce a notion of subobject. For example we might think of a book object, in a new version BOOK2 of the book class, as having a field author which is itself an object, as informally suggested by the following picture:

[Figure: Two “book” objects with “writer” subobjects. Two instances of BOOK2, one with title = "The Red and the Black", date = 1830, page_count = 341, the other with title = "Life of Rossini", date = 1823, page_count = 307; each contains its own WRITER subobject with name = "Stendhal", real_name = "Henri Beyle", birth_year = 1783, death_year = 1842.]

Such a notion of subobject is indeed useful and we will see, later in this chapter, how to write the corresponding classes. But here it is not exactly what we need. The example represents two books with the same author; we ended up duplicating the author information, which now appears as two subobjects, one in each instance of BOOK2. This duplication is probably not acceptable:

• It wastes memory space. Other examples would make this waste even more unacceptable: imagine for example a set of objects representing people, each one with a subobject representing the country of citizenship, where the number of people represented is large but the number of countries is small.

• Even more importantly, this technique fails to account for the need to express sharing. Regardless of representation choices, the author fields of the two objects refer to the same instance of WRITER; if you update the WRITER object (for example to record an author’s death), you will want the change to affect all book objects associated with the given author.

Here then is a better picture of the desired situation, assuming yet another version of the book class, BOOK3:
[Figure: Two “book” objects with references to the same “writer” object. Two instances of BOOK3, one with title = "The Red and the Black", date = 1830, page_count = 341, the other with title = "The Charterhouse of Parma", date = 1839, page_count = 307; the author fields of both are attached to a single WRITER object with name = "Stendhal", real_name = "Henri Beyle", birth_year = 1783, death_year = 1842.]

The author field of each instance of BOOK3 contains what is known as a reference to a possible object of type WRITER. It is not difficult to define this notion precisely:

Definition: reference
A reference is a run-time value which is either void or attached. If attached, a reference identifies a single object. (It is then said to be attached to that particular object.)

In the last figure, the author reference fields of the BOOK3 instances are both attached to the WRITER instance, as shown by the arrows, which are conventionally used on such diagrams to represent a reference attached to an object. The following figure has a void reference (perhaps to indicate an unknown author), showing the graphical representation of void references:

[Figure: An object with a void reference field. An instance of BOOK3 with title = "Candide, or Optimism", date = 1759, page_count = 120 and a void author field. (“Candide” was published anonymously.)]
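The difference between duplicated subobjects and shared references can be sketched in Python, where every attribute holding an object is already a reference and None plays the role of void. This is only an approximation of the notation used in this book: the class names Writer and Book3, the procedure record_death, and the use of 0 for an unrecorded death year are assumptions made for the illustration.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class Writer:
    name: str
    real_name: str
    birth_year: int
    death_year: int = 0  # assumption: 0 stands for "not recorded"

    def record_death(self, year: int) -> None:
        self.death_year = year


@dataclass
class Book3:
    title: str
    date: int
    page_count: int
    author: Optional[Writer] = None  # None stands for a void reference


stendhal = Writer("Stendhal", "Henri Beyle", 1783)

# Both author fields are attached to the same Writer object.
red_and_black = Book3("The Red and the Black", 1830, 341, stendhal)
charterhouse = Book3("The Charterhouse of Parma", 1839, 307, stendhal)

# Subobject-like alternative: this book carries its own copy of the author data.
rossini = Book3("Life of Rossini", 1823, 307, copy.copy(stendhal))

# One update of the shared object is visible through every attached reference,
# but the duplicated copy is left behind.
stendhal.record_death(1842)

# A void author reference, as in the "Candide" figure.
candide = Book3("Candide, or Optimism", 1759, 120)
```

After the update, both red_and_black.author and charterhouse.author show the new death year, while rossini.author still holds the stale copy. That stale copy is the failure to express sharing described above.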
The definition of references makes no mention of implementation properties. A reference, if not void, is a way to identify an object; an abstract name for the object. This is similar to a social security number that uniquely identifies a person, or an area code that identifies a phone area. Nothing implementation-specific or computer-specific here.

The reference concept of course has a counterpart in computer implementations. In machine-level programming it is possible to manipulate addresses; many programming languages offer a notion of pointer. The notion of reference is more abstract. Although a reference may end up being represented as an address, it does not have to; and even when the representation of a reference includes an address, it may include other information.

Another property sets references apart from addresses, although pointers in typed languages such as Pascal and Ada (not C) also enjoy it: as will be explained below, a reference in the approach described here is typed. This means that a given reference may only become attached to objects of a specific set of types, determined by a declaration in the software text. This idea again has counterparts in the non-computer world: a social security number is only meant for persons, and area codes are only meant for phone areas. (They may look like normal integers, but you would not add two area codes.)

Object identity

The notion of reference brings about the concept of object identity. Every object created during the execution of an object-oriented system has a unique identity, independent of the object’s value as defined by its fields. In particular:

I1 • Two objects with different identities may have identical fields.

I2 • Conversely, the fields of a certain object may change during the execution of a system; but this does not affect the object’s identity.
These observations indicate that a phrase such as “a denotes the same object as b” may be ambiguous: are we talking about objects with different identities but the same contents (I1)? Or about the states of an object before and after some change is applied to its fields (I2)? We will use the second interpretation: a given object may take on new values for its constituent fields during an execution, while remaining “the same object”. Whenever confusion is possible the discussion will be more explicit. For case I1 we may talk of equal (but distinct) objects; equality will be defined more precisely below.

A point of terminology may have caught your attention. It is not a mistake to say (as in the definition of I2) that the fields of an object may change. The term “field” as defined above denotes one of the values that make up an object, not the corresponding field identifier, which is the name of one of the attributes of the object’s generating class. For each attribute of the class, for example date in class BOOK3, the object has a field, for example 1832 in the object of the last figure. During execution the attributes will never change, so each object’s division into fields will remain the same; but the fields themselves may change. For example an instance of BOOK3 will always have four fields, corresponding to attributes title, date, page_count, author; these fields (the four values that make up a given object of type BOOK3) may change.

The study of how to make objects persistent will lead us to explore further properties of object identity. (See “Object identity”, page 1052.)
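Properties I1 and I2 can be observed in a short Python sketch, offered as an illustration, not as the notation of this book. Python's `is` operator compares identities and `==` (as generated by a dataclass) compares fields; the class Book and the variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Book:  # hypothetical class with basic-type fields only
    title: str
    date: int
    page_count: int


a = Book("The Red and the Black", 1830, 341)
b = Book("The Red and the Black", 1830, 341)  # I1: distinct object, identical fields
c = a                                         # another reference attached to a's object

# I1: equal (field by field) but distinct (different identities).
equal_but_distinct = (a == b) and (a is not b)

# I2: changing a field does not affect the object's identity.
identity_before = id(a)
a.page_count = 355
same_object_after_change = (id(a) == identity_before) and (c is a)

# The change is visible through c, which denotes the same object,
# and a is no longer equal to b, which was only ever an equal copy.
seen_through_c = c.page_count
still_equal_to_b = (a == b)
```

Here “a denotes the same object as c” holds before and after the field change, whereas a and b were merely equal, and only until the change.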
Declaring references

Let us see how to extend the initial book class, BOOK1, which only had attributes of basic types, to the new variant BOOK3 which has an attribute representing references to potential authors. Here is the class text, again just showing the attributes; the only difference is an extra attribute declaration at the end:

    class BOOK3 feature
        title: STRING
        date, page_count: INTEGER
        author: WRITER
            -- This is the new attribute.
    end

The type used to declare author is simply the name of the corresponding class: WRITER. This will be a general rule: whenever a class is declared in the standard form

    class C feature … end

then any entity declared of type C through a declaration of the form

    x: C

denotes values that are references to potential objects of type C. The reason for this convention is that references provide more flexibility, and so are appropriate in the vast majority of cases. You will find further examination of this rule (and of the other possible conventions) in the discussion section of this chapter. (See page 272.)

Self-reference

Nothing in the preceding discussion precludes an object O1 from containing a reference field which (at some point of a system’s execution) is attached to O1 itself. This kind of self-reference can also be indirect. In the situation pictured below, the object with "Almaviva" in its name field is its own landlord (direct reference cycle); the object "Figaro" loves "Susanna" which loves "Figaro" (indirect reference cycle).

[Figure: Direct and indirect self-reference. Three instances of PERSON1, each with fields name, landlord and loved_one: the landlord field of the object named "Almaviva" is attached to that object itself; the loved_one fields of the objects named "Figaro" and "Susanna" are attached to each other.]
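Both kinds of cycle are easy to build in any language with references. The following Python sketch is an approximation, not the notation of this book: the class Person1 mirrors the PERSON1 objects of the figure, None stands for a void reference, and only the attachments named in the text are set up. Fields are assigned directly for brevity, which the approach of this book would route through procedures of the class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # eq=False: compare by identity, never field by field
class Person1:
    name: str
    landlord: Optional["Person1"] = None   # reference to a possible Person1
    loved_one: Optional["Person1"] = None  # reference to a possible Person1


almaviva = Person1("Almaviva")
figaro = Person1("Figaro")
susanna = Person1("Susanna")

# Direct reference cycle: the object is attached to itself.
almaviva.landlord = almaviva

# Indirect reference cycle: two objects attached to each other.
figaro.loved_one = susanna
susanna.loved_one = figaro

# Following references around a cycle leads back to the starting object.
back_to_almaviva = almaviva.landlord.landlord
back_to_figaro = figaro.loved_one.loved_one
```

The self-referential declaration works because a field of type Person1 holds a reference to an object, not the object itself; a subobject containing itself would be impossible to lay out.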