§23.1 SIDE EFFECTS IN FUNCTIONS

We saw then that an object of our software (a concrete object) is the representation of an abstract object, and that two concrete objects may represent the same abstract object. For example, two different stack representations, each made of an array and a top marker count, represent the same stack if they have the same value for count and the same array elements up to index count. They may differ in other properties, such as the array sizes and the values stored at indices above count. In mathematical terms, every concrete object belongs to the domain of the abstraction function a, and we can have c1 ≠ c2 even with a (c1) = a (c2).

What this means for us is that a function that modifies a concrete object is harmless if the result of this modification still represents the same abstract object, that is, yields the same a value. For example, assume that a function on stacks contains the operation

    representation.put (some_value, count + 1)

(with the guarantee that the array’s capacity is at least count + 1). This side effect changes a value above the stack-significant section of the array; it can do no ill.

More generally, a concrete side effect, which changes the concrete state of an object c, is an abstract side effect if it also changes its abstract state, that is to say the value of a (c) (a more directly usable definition of abstract side effects will appear shortly). If a side effect is only concrete (it does not affect the abstract state), it is harmless.

In the object-as-machine metaphor (see the figure on page 751), functions producing concrete-only side effects correspond to query buttons that may produce an internal state change having absolutely no effect on the answers given by any query button. For example, the machine might save energy by automatically switching off some internal circuits if nobody presses a button for some time, and turning them on again whenever someone presses any button, queries included. Such an internal state change is unnoticeable from the outside and hence legitimate.

The object-oriented approach is particularly favorable to clever implementations which, when computing a function, may change the concrete state behind the scenes without producing any visible side effect. The example of a stack function that changes array elements above the top is somewhat academic, but we will see below a practical and useful design that relies on this technique.

Since not every class definition is accompanied by a full-fledged specification of the underlying abstract data type, we need a more directly usable definition of “abstract side effect”. This is not difficult. In practice, the abstract data type is defined by the interface offered by a class to its clients (expressed for example as the short form of the class). A side effect will affect the abstract object if it changes the result of any query accessible to these clients. Hence the definition:

Definition: abstract side effect
    An abstract side effect is a concrete side effect that can change the value of a non-secret query.
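To make the distinction concrete, here is a minimal sketch in Python rather than in the book's own notation. The class name ArrayStack, the method abstraction (playing the role of the function a) and the query is_empty are all assumptions made for the illustration. Python lists are indexed from 0, so the slot just above the top is at index count rather than count + 1.

```python
class ArrayStack:
    """Stack represented by a fixed-capacity array and a top marker `count`."""

    def __init__(self, capacity):
        self._representation = [None] * capacity  # secret: the array
        self.count = 0                            # non-secret query

    def put(self, value):
        """Command: push `value` on top."""
        assert self.count < len(self._representation), "stack full"
        self._representation[self.count] = value
        self.count += 1

    def remove(self):
        """Command: pop the top element (the old value stays in the array)."""
        assert self.count > 0, "stack empty"
        self.count -= 1

    def item(self):
        """Query: the top element."""
        assert self.count > 0, "stack empty"
        return self._representation[self.count - 1]

    def is_empty(self):
        """Query with a concrete-only side effect (for illustration only):
        it overwrites the slot just above the significant section."""
        if self.count < len(self._representation):
            self._representation[self.count] = None
        return self.count == 0

    def abstraction(self):
        """a(c): the abstract stack, i.e. the elements below the top marker."""
        return tuple(self._representation[:self.count])


# Two different concrete objects representing the same abstract stack.
c1 = ArrayStack(4)
c1.put(10); c1.put(20); c1.put(30); c1.remove()   # 30 lingers above the top
c2 = ArrayStack(8)
c2.put(10); c2.put(20)
```

Here c1 and c2 differ in capacity and in what is stored above the top, yet abstraction gives the same result for both. Calling c1.is_empty() changes the array but not the value of abstraction, count or item, so under the definition above it produces no abstract side effect.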
This notion of abstract side effect is the one used by the Command-Query Separation principle (which appears on page 751), the principle that prohibits abstract side effects in functions.

The definition refers to “non-secret” rather than exported queries. The reason is that in between generally exported and fully secret status, we must permit a query to be selectively exported to a set of clients. As soon as a query is non-secret (exported to any client other than NONE) we consider that changing its result is an abstract side effect, since the change will be visible to at least some clients.

The policy

As announced at the beginning of this discussion, abstract side effects are (unlike concrete side effects) not easily detectable by a compiler. In particular it does not suffice to check that a function preserves the values of all non-secret attributes: the effect on other queries might be indirect, or (as in the max example) several concrete side effects might in the end cancel out. The most a compiler can do would be to issue a warning if a function modifies an exported attribute.

So the Command-Query Separation principle is a methodological precept, not a language constraint. This does not, however, diminish its importance.

Past what for some people will be an initial shock, every object-oriented developer should apply the principle without exception. I have followed it for years, and would never write a side-effect-producing function. ISE applies it in all its O-O software (for the C part we have of course to adapt to the dominant style, although even here we try to apply the principle whenever we can). It has helped us produce much better results: tools and libraries that we can reuse, explain to others, extend and scale up.

Objections

It is important here to deal with two common objections to the side-effect-free style.

The first has to do with error handling. Sometimes a function with side effects is really a procedure, which in addition to doing its job returns a status code indicating how things went. But there are better ways to do this; roughly speaking, the proper O-O technique is to enable the client, after an operation on an object, to perform a query on the status, represented for example by an attribute of the object, as in

    target.some_operation (…)
    how_did_it_go := target.status

Note that the technique of returning a status as function result is lame anyway. It transforms a procedure into a function by adding the status as a result; but it does not work if the routine was already a function, which already has a result of its own. It is also problematic if you need more than one status indicator. In such cases the C approach is either to return a “structure” (the equivalent of an object) with several components, which is getting close to the above scheme, or to use global variables, which raises a whole set of new problems, especially in a large system where many modules can trigger errors.
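The status-query scheme can be sketched as follows, in Python rather than in the book's notation. The class IntegerParser, its status values and the query last_value are hypothetical, invented for this illustration; only the shape of the client code (a command followed by a query on the status) comes from the text.

```python
class IntegerParser:
    """Hypothetical supplier: a command does the work, queries report on it."""

    OK = "ok"
    EMPTY_INPUT = "empty input"
    NOT_A_NUMBER = "not a number"

    def __init__(self):
        self.status = self.OK    # query: how the last operation went
        self.last_value = None   # query: result of the last successful parse

    def parse(self, text):
        """Command: try to read an integer from `text`; returns nothing.
        On failure, `last_value` keeps its previous value."""
        stripped = text.strip()
        if not stripped:
            self.status = self.EMPTY_INPUT
            return
        try:
            self.last_value = int(stripped)
            self.status = self.OK
        except ValueError:
            self.status = self.NOT_A_NUMBER


target = IntegerParser()
target.parse("42")                 # target.some_operation (...)
how_did_it_go = target.status      # how_did_it_go := target.status
```

Because the status is an ordinary query, the same scheme extends without change to several status indicators: each one is simply another attribute of the object.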
The second objection is a common misconception: the impression that Command-Query Separation, for example the list-with-cursor type of interface, is incompatible with concurrent access to objects. That belief is remarkably widespread (this is one of the places where I know that, if I am lecturing on these topics, someone in the audience will raise his hand, and the question will be the same whether we are in Santa Barbara, Seattle, Singapore, Sydney, Stockholm or Saint-Petersburg); but it is incorrect nonetheless.

The misconception is that in a concurrent context it is essential to have atomic access-cum-modification operations, for example get on a buffer, the concurrent equivalent of a first-in, first-out queue. Such a get function non-interruptibly performs, in our terminology, both a call to item (obtain the oldest element) and remove (remove that element), returning the result of item as the result of get. But using such an example as an argument for get-style functions with side effects is confusing two notions. What we need in a concurrent context is a way to offer a client exclusive access to a supplier object for certain operations. With such a mechanism, we can protect a client extract of the form

    x := buffer.item; buffer.remove

thereby guaranteeing that the buffer element returned by the call to item is indeed the same one removed by the following call to remove. Whether or not we permit functions to have side effects, we will have to provide a mechanism to ensure such exclusive access; for example a client may need to dequeue two elements

    buffer.remove; buffer.remove

with the guarantee that the removed elements will be consecutive; this requires exclusive access, and is unrelated to the question of side effects in functions.
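As a rough illustration of this argument, and not of the concurrency mechanism developed later in the book, the following Python sketch uses a reentrant lock from the standard threading module to stand in for exclusive access. The classes Buffer and Consumer and the attribute exclusive are assumptions of the sketch.

```python
import threading
from collections import deque


class Buffer:
    """First-in, first-out buffer with a separate query (item) and command (remove)."""

    def __init__(self):
        self._elements = deque()
        # Assumption of this sketch: a reentrant lock that clients hold
        # for the duration of any sequence that must not be interrupted.
        self.exclusive = threading.RLock()

    def put(self, value):
        """Command: add `value` as the newest element."""
        with self.exclusive:
            self._elements.append(value)

    def item(self):
        """Query: the oldest element; changes nothing."""
        with self.exclusive:
            return self._elements[0]

    def remove(self):
        """Command: remove the oldest element; returns nothing."""
        with self.exclusive:
            self._elements.popleft()


class Consumer:
    """Hypothetical client that records what it takes from a buffer."""

    def __init__(self):
        self.received = []

    def consume(self, buffer):
        """Command: x := buffer.item; buffer.remove, under exclusive access."""
        with buffer.exclusive:
            x = buffer.item()
            buffer.remove()
        self.received.append(x)


buffer = Buffer()
for i in range(1000):
    buffer.put(i)

consumers = [Consumer() for _ in range(4)]


def work(consumer):
    for _ in range(250):
        consumer.consume(buffer)


threads = [threading.Thread(target=work, args=(c,)) for c in consumers]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every element ends up with exactly one consumer even though item and remove are separate calls. The same lock protects a pair of consecutive remove calls, which no get-style function could do on its own.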
Later in this book (chapter 30; see in particular “Support for command-query separation”, page 1029) we will have an extensive discussion of concurrency, where we will study a simple and elegant approach to concurrent and distributed computation, fully compatible with the Command-Query Separation principle, which in fact will help us arrive at it.

Legitimate side effects: an example

To conclude this discussion of side effects let us examine a typical case of legitimate side effects: functions that do not change the abstract state, but can change the concrete state, and for good reason. The example is representative of a useful design pattern.

Consider the implementation of complex numbers. As with points, discussed in an earlier chapter, two representations are possible: cartesian (by axis coordinates x and y) and polar (by distance to the origin ρ and angle θ). Which one do we choose? There is no easy answer. If we take, as usual, the abstract data type approach, we will note that what counts is the applicable operations (addition, subtraction, multiplication and division among others, as well as queries to access x, y, ρ and θ) and that for each of them one of the representations is definitely better: cartesian for addition, subtraction and such, polar for multiplication and division. (Try expressing division in cartesian coordinates!)
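For readers who want to take up the challenge, the standard identities are set out below; they are textbook facts about complex numbers, not formulas taken from this chapter. The notation z_k = x_k + i y_k = ρ_k e^{iθ_k} is an assumption of this note, and z_2 is taken to be non-zero for the division.

```latex
\begin{align*}
z_1 + z_2 &= (x_1 + x_2) + i\,(y_1 + y_2)
  && \text{(cartesian: two additions)} \\
|z_1 + z_2| &= \sqrt{\rho_1^2 + \rho_2^2 + 2\rho_1\rho_2\cos(\theta_1 - \theta_2)}
  && \text{(polar: already awkward for the modulus alone)} \\
z_1 z_2 &= \rho_1\rho_2\, e^{i(\theta_1 + \theta_2)}
  && \text{(polar: one product, one sum)} \\
\frac{z_1}{z_2} &= \frac{\rho_1}{\rho_2}\, e^{i(\theta_1 - \theta_2)}
  && \text{(polar: one quotient, one difference)} \\
\frac{z_1}{z_2} &= \frac{(x_1 x_2 + y_1 y_2) + i\,(y_1 x_2 - x_1 y_2)}{x_2^2 + y_2^2}
  && \text{(cartesian)}
\end{align*}
```

The cartesian division follows from multiplying numerator and denominator by the conjugate x_2 − i y_2.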
We could let the client decide what representation to use. But this would make our classes difficult to use, and violate information hiding: for the client author, the representation should not matter.

Alternatively, we could keep both representations up to date at all times. But this may cause unnecessary performance penalties. Assume for example that a client only performs multiplications and divisions. The operations use polar representations, but after each one of them we must recompute x and y, a useless but expensive computation involving trigonometric functions.

A better solution is to refuse to choose between the representations a priori, but update each of them only when we need it. As compared to the preceding approach, we do not gain anything in space (since we will still need attributes for each of x, y, ρ and θ, plus two boolean attributes to tell us which of the representations are up to date); but we avoid wasting computation time.

We may assume the following public operations, among others:

    class COMPLEX feature
        … Feature declarations for:
            infix "+", infix "-", infix "*", infix "/",
            add, subtract, multiply, divide,
            x, y, rho, theta, …
    end

The queries x, y, rho and theta are exported functions returning real values. They are always defined (except theta for the complex number 0) since a client may request the x and y of a complex number even if the number is internally represented in polar, and its ρ and θ even if it is in cartesian. In addition to the functions "+" etc., we assume procedures add etc. which modify an object: z1 + z2 is a new complex number equal to the sum of z1 and z2, whereas the procedure call z1.add (z2) changes z1 to represent that sum. In practice, we might need only the functions or only the procedures.
Internally, the class includes the following secret attributes for the representation:

    cartesian_ready: BOOLEAN
    polar_ready: BOOLEAN
    private_x, private_y, private_rho, private_theta: REAL

Not all of the four real attributes are necessarily up to date at all times; in fact only two need be up to date. More precisely, the following implementation invariant should be included in the class:

    invariant
        cartesian_ready or polar_ready
        polar_ready implies (0 <= private_theta and private_theta <= Two_pi)
            -- cartesian_ready implies (private_x and private_y are up to date)
            -- polar_ready implies (private_rho and private_theta are up to date)
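The following Python sketch, which is not the book's Eiffel text, shows how the two flags and the invariant might cooperate with queries that compute a representation on demand. The class name Complex, the leading-underscore names for secret features and the helper _invariant are choices made for the illustration. Note one assumption in particular: Python's math.atan2 returns an angle between -π and π, so the sketch reduces it modulo 2π to satisfy the invariant.

```python
import math

TWO_PI = 2 * math.pi


class Complex:
    """Complex number with cartesian and polar forms, each computed only when needed."""

    def __init__(self, x=0.0, y=0.0):
        # Secret representation: four reals and two "up to date" flags.
        self._x, self._y = float(x), float(y)
        self._rho, self._theta = 0.0, 0.0
        self._cartesian_ready, self._polar_ready = True, False

    def _invariant(self):
        """The two invariant clauses that can be expressed formally."""
        return ((self._cartesian_ready or self._polar_ready)
                and (not self._polar_ready or 0 <= self._theta <= TWO_PI))

    def _prepare_cartesian(self):
        if not self._cartesian_ready:
            assert self._polar_ready
            self._x = self._rho * math.cos(self._theta)
            self._y = self._rho * math.sin(self._theta)
            self._cartesian_ready = True

    def _prepare_polar(self):
        if not self._polar_ready:
            assert self._cartesian_ready
            self._rho = math.hypot(self._x, self._y)
            # Reduce atan2's result modulo 2*pi so that it lies in [0, 2*pi].
            self._theta = math.atan2(self._y, self._x) % TWO_PI
            self._polar_ready = True

    # Non-secret queries: concrete side effects only.
    def x(self):
        self._prepare_cartesian()
        return self._x

    def y(self):
        self._prepare_cartesian()
        return self._y

    def rho(self):
        self._prepare_polar()
        return self._rho

    def theta(self):
        self._prepare_polar()
        return self._theta

    # Commands: abstract side effects; the other representation goes stale.
    def add(self, other):
        dx, dy = other.x(), other.y()
        self._prepare_cartesian()
        self._x += dx
        self._y += dy
        self._polar_ready = False

    def multiply(self, other):
        r, t = other.rho(), other.theta()
        self._prepare_polar()
        self._rho *= r
        self._theta = (self._theta + t) % TWO_PI
        self._cartesian_ready = False


z = Complex(1.0, 1.0)
z.multiply(Complex(0.0, 1.0))   # (1 + i) * i = -1 + i, computed in polar form
```

After the multiplication only the polar form of z is up to date. A later call to z.x() fills in the cartesian fields and sets the flag, which is a concrete side effect, but no answer given by x, y, rho or theta changes as a result.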
The value of Two_pi is assumed to be 2π. The last two clauses may only be expressed informally, in the form of comments.

At any time at least one of the representations is up to date, although both may be. Any operation requested by a client will be carried out in the most appropriate representation; this may require computing that representation if it was not up to date. If the operation produces a (concrete) side effect, the other representation will cease to be up to date.

Two secret procedures are available for carrying out representation changes:

    prepare_cartesian is
            -- Make cartesian representation available
        do
            if not cartesian_ready then
                check polar_ready end
                    -- (Because the invariant requires at least one of the
                    -- two representations to be up to date)
                private_x := private_rho * cos (private_theta)
                private_y := private_rho * sin (private_theta)
                cartesian_ready := True
                    -- Here both cartesian_ready and polar_ready are true:
                    -- Both representations are available
            end
        ensure
            cartesian_ready
        end

    prepare_polar is
            -- Make polar representation available
        do
            if not polar_ready then
                check cartesian_ready end
                private_rho := sqrt (private_x ^ 2 + private_y ^ 2)
                private_theta := atan2 (private_y, private_x)
                polar_ready := True
                    -- Here both cartesian_ready and polar_ready are true:
                    -- Both representations are available
            end
        ensure
            polar_ready
        end

Functions cos, sin, sqrt and atan2 are assumed to be taken from a standard mathematical library; atan2 (y, x) should compute the arc tangent of y / x.

We will also need creation procedures make_cartesian and make_polar: