284 MEMORY MANAGEMENT $9.1 D2.An assignment of the formy:==,where=is attached to an object other than O2, detaches y from 02. D3.Termination of a routine detaches formal arguments from any attached objects. D4.A creation instruction !!x,attaches x to a newly created object,and hence detaches x if it was previously attached to an object O1. Case D3 corresponds to the rule given earlier that the semantics of an assignment a=b is exactly the same as that of initializing a formal argument a of a routine r at the time of a call t.(...,b....),where the position of b in the call corresponds to that of a in the declaration ofr. Unreachable objects Does detachment mean that the detached object-Ol or 02 on the preceding figure- becomes useless and hence that the runtime mechanisms can reclaim the memory space it occupies,then recycle it for other objects?That would be too easy!The entity for which an object was initially created may have lost all interest in it,but because of dynamic aliasing other references may still be attached to it.For example the last figure may have shown only a partial view of attachments;looking at a broader context might reveal that Ol and O2 are still reachable from other objects: Detachment is not always death 02 But this is still not the entire object structure.By getting even more context,we might now discover that O4 and O5 are themselves useless,so that in the absence of other references O1 and O2 are not needed after all. So the answer to the question"what objects can we reclaim?"must follow from a global analysis of the entire set of objects created so far.We can identify three kinds of object: C1.Objects directly attached to entities of the software text,known(from the language rules)to be needed
284 MEMORY MANAGEMENT §9.1 D2 • An assignment of the form y := z, where z is attached to an object other than O2, detaches y from O2. D3 • Termination of a routine detaches formal arguments from any attached objects. D4 • A creation instruction !! x, attaches x to a newly created object, and hence detaches x if it was previously attached to an object O1. Case D3 corresponds to the rule given earlier that the semantics of an assignment a := b is exactly the same as that of initializing a formal argument a of a routine r at the time of a call t ● r (…, b, …), where the position of b in the call corresponds to that of a in the declaration of r. Unreachable objects Does detachment mean that the detached object — O1 or O2 on the preceding figure — becomes useless and hence that the runtime mechanisms can reclaim the memory space it occupies, then recycle it for other objects? That would be too easy! The entity for which an object was initially created may have lost all interest in it, but because of dynamic aliasing other references may still be attached to it. For example the last figure may have shown only a partial view of attachments; looking at a broader context might reveal that O1 and O2 are still reachable from other objects: But this is still not the entire object structure. By getting even more context, we might now discover that O4 and O5 are themselves useless, so that in the absence of other references O1 and O2 are not needed after all. So the answer to the question “what objects can we reclaim?” must follow from a global analysis of the entire set of objects created so far. We can identify three kinds of object: C1 • Objects directly attached to entities of the software text, known (from the language rules) to be needed. Detachment is not always death O1 O2 x y z O3 ✄ ✄ O4 O5
$9.1 WHAT HAPPENS TO OBJECTS 285 C2.Dependents of objects of category C1.(Recall that the direct dependents of an object are those to which it has references;here we are considering both direct and indirect dependents.) C3.Objects which are in neither of the preceding two categories. The objects of category CI may be called the origins.Together with those of category C2,the origins make up the set of reachable objects.Those of category C3 are unreachable.They correspond to what was informally called "useless objects"above.A more lively if somewhat macabre terminology uses the term"dead objects"for C3,the origins and their dependents being then called "live objects".(Computing scientists, however,have not quite managed to reconcile their various metaphors,as the process of reclaiming dead objects,studied below,is called "garbage collection".) The term“root”is also used for“orig in'”.But here the latter is preferable because an O-O system also has a "root object"and a root class.The resulting ambiguity would not be too damaging since the root object,as seen below,is indeed one of the origins. The first step towards addressing the problem of memory management under the free mode is to separate the reachable objects from the unreachable ones.To identify reachable objects,we must start from the origins and repeatedly follow all references.So the first question is to identify the origins;the answer depends on the run-time structure defined by the underlying language. Reachable objects in classical approaches Because the unreachability problem is already present in the run-time structure of such classical approaches as Pascal,C and Ada,it is interesting to start with this case.(More accurately,this is interesting for the reader who is familiar with one of these approaches. If you are not in this category,you may prefer to skip this section and go directly to the next one,which moves right on to the run-time structure of O-O software.) The approaches quoted combine the stack-based and free modes of allocation.C and Ada also support the static mode,but to keep things simple we may ignore static allocation by viewing it as a special case of stack-based allocation:we treat static objects as if they were allocated once and for all,when execution starts,at the bottom of the stack.(This is indeed the way Pascal developers emulate static entities:they declare them in the outermost block.) Another common property of these approaches is that entities may denote pointers. To provide a better preparation for the object-oriented approach of this book,where instead of pointers we use references (a more abstract notion,as discussed in the previous chapter),let us pretend that the pointers in question are actually references.This means in particular that we disregard the weakly typed nature of pointers in C. With these assumptions and simplifications the origins,shown with thick borders on the following figure,are all the objects which are either allocated on the stack or attached to references allocated on the stack.The reachable objects (including the origins)appear in color,the unreachable objects in black
§9.1 WHAT HAPPENS TO OBJECTS 285 C2 • Dependents of objects of category C1. (Recall that the direct dependents of an object are those to which it has references; here we are considering both direct and indirect dependents.) C3 • Objects which are in neither of the preceding two categories. The objects of category C1 may be called the origins. Together with those of category C2, the origins make up the set of reachable objects. Those of category C3 are unreachable. They correspond to what was informally called “useless objects” above. A more lively if somewhat macabre terminology uses the term “dead objects” for C3, the origins and their dependents being then called “live objects”. (Computing scientists, however, have not quite managed to reconcile their various metaphors, as the process of reclaiming dead objects, studied below, is called “garbage collection”.) The term “root” is also used for “origin”. But here the latter is preferable because an O-O system also has a “root object” and a root class. The resulting ambiguity would not be too damaging since the root object, as seen below, is indeed one of the origins. The first step towards addressing the problem of memory management under the free mode is to separate the reachable objects from the unreachable ones. To identify reachable objects, we must start from the origins and repeatedly follow all references. So the first question is to identify the origins; the answer depends on the run-time structure defined by the underlying language. Reachable objects in classical approaches Because the unreachability problem is already present in the run-time structure of such classical approaches as Pascal, C and Ada, it is interesting to start with this case. (More accurately, this is interesting for the reader who is familiar with one of these approaches. If you are not in this category, you may prefer to skip this section and go directly to the next one, which moves right on to the run-time structure of O-O software.) The approaches quoted combine the stack-based and free modes of allocation. C and Ada also support the static mode, but to keep things simple we may ignore static allocation by viewing it as a special case of stack-based allocation: we treat static objects as if they were allocated once and for all, when execution starts, at the bottom of the stack. (This is indeed the way Pascal developers emulate static entities: they declare them in the outermost block.) Another common property of these approaches is that entities may denote pointers. To provide a better preparation for the object-oriented approach of this book, where instead of pointers we use references (a more abstract notion, as discussed in the previous chapter), let us pretend that the pointers in question are actually references. This means in particular that we disregard the weakly typed nature of pointers in C. With these assumptions and simplifications the origins, shown with thick borders on the following figure, are all the objects which are either allocated on the stack or attached to references allocated on the stack. The reachable objects (including the origins) appear in color, the unreachable objects in black
286 MEMORY MANAGEMENT $9.1 Live objects (in color)and deadobjects (in black)in a combined THE STACK stack-based Stack top and free model Origin Reference origin Because the unreachability problem only arises for objects allocated under the free mode,and such objects are always attached to entities of reference types,it is convenient to ignore the reclamation problem for objects allocated on the stack(which can be handled simply by popping the stack at the time of block exit)and to start from the references coming from the stack.We may call these references reference origins.They are shown with thick arrows in the figure.A reference origin is either. O1.The value of a local entity or routine argument of reference type (as with the top two reference origins in the figure). O2.A field of reference type,in an object allocated on the stack(as with the lowest reference origin in the figure). As an example,consider the following type and procedure declarations,written in a syntax half-way between Pascal and the notation of the rest of this book(an entity of type reference G is a reference that may become attached to objects of type G): type COMPOSITE= record m:INTEGER r:reference COMPOSITE end
286 MEMORY MANAGEMENT §9.1 Because the unreachability problem only arises for objects allocated under the free mode, and such objects are always attached to entities of reference types, it is convenient to ignore the reclamation problem for objects allocated on the stack (which can be handled simply by popping the stack at the time of block exit) and to start from the references coming from the stack. We may call these references reference origins. They are shown with thick arrows in the figure. A reference origin is either: O1 • The value of a local entity or routine argument of reference type (as with the top two reference origins in the figure). O2 • A field of reference type, in an object allocated on the stack (as with the lowest reference origin in the figure). As an example, consider the following type and procedure declarations, written in a syntax half-way between Pascal and the notation of the rest of this book (an entity of type reference G is a reference that may become attached to objects of type G): type COMPOSITE = record m: INTEGER r: reference COMPOSITE end … THE STACK Stack top Reference origin Origin Live objects (in color) and dead objects (in black) in a combined stack-based and free model
$9.1 WHAT HAPPENS TO OBJECTS 287 procedure p is local n:INTEGER c:COMPOSITE s:reference COMPOSITE do end Every execution ofp allocates three values on the stack: Entity THE STACK allocation for a procedure New stack top (COMPOSITE) Previous stack top The three new values are an integer n,which does not affect the problem of object management(since it will disappear when the procedure terminates,and does not refer to any other object);a reference s,which is an example of category O1;and an object c of type COMPOSITE.This object is itself stack-based and its allocated memory may be reclaimed on procedure termination;but it contains a reference field for r,which is an example of category O2. In summary,to determine the reachable objects in a classical approach combining the stack-based and free modes,you can start from the references on the stack (variables of reference types,and reference fields of composite objects),and repeatedly follow all reference fields of the attached objects if any
§9.1 WHAT HAPPENS TO OBJECTS 287 procedure p is local n: INTEGER c: COMPOSITE s: reference COMPOSITE do … end Every execution of p allocates three values on the stack: The three new values are an integer n, which does not affect the problem of object management (since it will disappear when the procedure terminates, and does not refer to any other object); a reference s, which is an example of category O1; and an object c of type COMPOSITE. This object is itself stack-based and its allocated memory may be reclaimed on procedure termination; but it contains a reference field for r, which is an example of category O2. In summary, to determine the reachable objects in a classical approach combining the stack-based and free modes, you can start from the references on the stack (variables of reference types, and reference fields of composite objects), and repeatedly follow all reference fields of the attached objects if any. Entity allocation for a procedure THE STACK New stack top Previous stack top m r (COMPOSITE) c n s
288 MEMORY MANAGEMENT $9.1 Reachable objects in the object-oriented model The object-oriented run-time structure presented in the preceding chapter has a few differences from the one just discussed. Reachability in the object- THE ROOT oriented model THE STACK Stack top The execution of any system starts with the creation of one object,called the root object of the system,or just its root (when there is no confusion with the root class,a static notion).Clearly,the root is one of the origins in this case. Another set of origins arises because of the possible presence of local entities in a routine.Assume a routine of the form some routine is local rbl.rb2:BOOK3 eb:expanded BOOK3 do !rbl ..Operations possibly involving rbl,rb2 and eb... end Whenever a call to some routine is executed,and for the duration ofthat execution, the instructions in the routine's body may refer to rb/,rb2 and eb,and hence to the attached objects if any.(For eb there is always an attached object,but at various points rb/ and rb2 may be void.)This means that such objects must be part of the reachable set,even though they are not necessarily dependents of the root. Local entities of reference types,such as rb/and rb2,are similar to the local routine variables which,in the previous model,were allocated on the stack.Local entities of expanded types,such as eb,are similar to the stack-based objects
288 MEMORY MANAGEMENT §9.1 Reachable objects in the object-oriented model The object-oriented run-time structure presented in the preceding chapter has a few differences from the one just discussed. The execution of any system starts with the creation of one object, called the root object of the system, or just its root (when there is no confusion with the root class, a static notion). Clearly, the root is one of the origins in this case. Another set of origins arises because of the possible presence of local entities in a routine. Assume a routine of the form some_routine is local rb1, rb2: BOOK3 eb: expanded BOOK3 do … !! rb1 … Operations possibly involving rb1, rb2 and eb … end Whenever a call to some_routine is executed, and for the duration of that execution, the instructions in the routine’s body may refer to rb1, rb2 and eb, and hence to the attached objects if any. (For eb there is always an attached object, but at various points rb1 and rb2 may be void.) This means that such objects must be part of the reachable set, even though they are not necessarily dependents of the root. Local entities of reference types, such as rb1 and rb2, are similar to the local routine variables which, in the previous model, were allocated on the stack. Local entities of expanded types, such as eb, are similar to the stack-based objects. Reachability in the objectoriented model THE STACK Stack top THE ROOT