23 Principles of class design

Experienced software developers know that few issues are more critical than the proper design of module interfaces. In a multi-person, or just multi-week software project, many of the decisions, discussions, disputes and confusions tend to revolve around matters of module interface specification: “Who takes care of making sure that…?”, “But I thought you only passed me normalized input…”, “Why are you processing this since I already took care of it?”.

If there were just one advantage to expect from object technology, this would have to be it. From the outset of this presentation, object-oriented development has been described as an architectural technique for producing systems made of coherent, properly interfaced modules. We have now accumulated enough technical background to review the design principles through which you can take advantage of the best O-O mechanisms to develop modules with attractive interfaces.

In the following pages we will explore a set of class design principles which extensive practice has shown to yield quality and durability. Because what determines the success of a class is how it will look to its clients, the emphasis here is not on the internal implementation of a class but on how to make its interface simple, easy to learn, easy to remember, and able to withstand the test of time and change.

We will successively examine: whether functions should be permitted to have side effects; how many arguments a feature should reasonably have, and the associated notions of operand and option; whether you should be concerned about the size of your classes; making abstract structures active; the role of selective exports; how to document a class; how to deal with abnormal cases.

From this discussion will emerge an image of the class designer as a patient craftsman who chisels out and polishes each class to make it as attractive as possible to clients. This spirit of treating classes as carefully engineered products, aiming at perfection from the start and yet always perfectible, is a pervasive quality of well-applied object technology. For obvious reasons it is particularly visible in the construction of library classes, and indeed many of the design principles reviewed in this chapter originated in library design; in the same way that successful ideas first tried in Formula 1 racing eventually trickle down to the engineering of cars for the rest of us, a technique that has shown its value by surviving the toughest possible test (being applied to the development of a successful library of reusable components) will eventually benefit all object-oriented software, whether or not initially intended for reuse.
748 DESIGNING CLASS INTERFACES §23.1

23.1 SIDE EFFECTS IN FUNCTIONS

The first question that we must address will have a deep effect on the style of our designs. Is it legitimate for functions (routines that return a result) also to produce a side effect, that is to say, to change something in their environment?

The gist of the answer is no, but we must first understand the role of side effects, and distinguish between good and potentially bad side effects. We must also discuss the question in light of all we now know about classes: their filiation from abstract data types, the notion of abstraction function, and the role of class invariants.

Commands and queries

A few reminders on terminology will be useful. The features that characterize a class are divided into commands and queries. A command serves to modify objects, a query to return information about objects. A command is implemented as a procedure. A query may be implemented either as an attribute, that is to say by reserving a field in each run-time instance of the class to hold the corresponding value, or as a function, that is to say through an algorithm that computes the value when needed. Procedures (which also have an associated algorithm) and functions are together called routines. (See “Attributes and routines”, page 173.)

The definition of queries does not specify whether in the course of producing its result a query may change objects. For commands, the answer is obviously yes, since it is the role of commands (procedures) to change things. Among queries, the question only makes sense for functions, since accessing an attribute cannot change anything. A change performed by a function is known as a side effect to indicate that it is ancillary to the function’s official purpose of answering a query. Should we permit side effects?

Forms of side effect

Let us define precisely what constructs may cause side effects. The basic operation that changes an object is an assignment a := b (or an assignment attempt a ?= b, or a creation instruction !! a) where the target a is an attribute; execution of this operation will assign a new value to the field of the corresponding object (the target of the current routine call).

We only care about such assignments when a is an attribute: if a is a local entity, its value is only used during an execution of the routine and assignments to it have no permanent effect; if a is the entity Result denoting the result of the routine, assignments to it help compute that result but have no effect on objects.

Also note that as a result of information hiding principles we have been careful, in the design of the object-oriented notation, to avoid any indirect form of object modification. In particular, the syntax excludes assignments of the form obj.attr := b, whose aim has to be achieved through a call obj.set_attr (b), where the procedure set_attr (x:…) performs the attribute assignment attr := x. (See “The client’s privileges on an attribute”, page 206.)

The attribute assignment that causes a function to produce a side effect may be in the function itself, or in another routine that the function calls. Hence the full definition:
Definition: concrete side effect
A function produces a concrete side effect if its body contains any of the following:
• An assignment, assignment attempt or creation instruction whose target is an attribute.
• A procedure call.

(The term “concrete” will be explained below.) In a more fine-tuned definition we would replace the second clause by “A call to a routine that (recursively) produces a concrete side effect”, the definition of side effects being extended to arbitrary routines rather than just functions. But the above form is preferable in practice even though it may be considered both too strong and too weak:

• The definition seems too strong because any procedure call is considered to produce a side effect whereas it is possible to write a procedure that changes nothing. Such procedures, however, are rarely useful, except if their role is to change something in the software’s environment, for example printing a page, sending a message to the network or moving a robot arm; but then we do want to consider this a side effect even if it does not directly affect an object of the software itself.
• The definition seems too weak because it ignores the case of a function f that calls a side-effect-producing function g. The convention will simply be that f can still be considered side-effect-free. This is acceptable because the rule at which we will arrive in this discussion will prohibit all side effects of a certain kind, so we will need to certify each function separately.

The advantage of these conventions is that to determine the side-effect status of a function you only need to look at the body of the function itself. It is in fact trivial, if you have a parser for the language, to write a simple tool that will analyze a function and tell you whether it produces a concrete side effect according to the definition.

Referential transparency

Why should we be concerned about side effects in functions? After all it is in the nature of software execution to change things.

The problem is that if we allow functions to change things as well as commands, we lose many of the simple mathematical properties that enable us to reason about our software. As noted in the discussion of abstract data types, when we first encountered the distinction between the applicative and the imperative, mathematics is change-free: it talks about abstract objects and defines operations on these objects, but the operations do not change the objects. (Computing √2 does not change the number two.) This immutability is the principal difference between the worlds of mathematics and computer software. (See “Introducing a more imperative view”, page 145.)
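To illustrate how small the analysis tool mentioned in connection with the definition of concrete side effects can be, here is a minimal sketch, written in Python rather than in the notation of this book so that it can rely on a readily available parser (Python's standard ast module). Several choices are assumptions made only for this sketch: the names has_concrete_side_effect and _is_self_attribute are hypothetical; an “attribute” is taken to be any name of the form self.x; and, since Python does not separate procedures from functions, a “procedure call” is approximated as a call whose result is discarded.

```python
import ast
import textwrap


def _is_self_attribute(node: ast.AST) -> bool:
    """True for a target of the form `self.<name>` (assumed to denote an attribute)."""
    return (
        isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id == "self"
    )


def has_concrete_side_effect(source: str) -> bool:
    """Inspect only the body of the single function defined in `source`."""
    tree = ast.parse(textwrap.dedent(source))
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.FunctionDef):
        raise ValueError("expected exactly one function definition")

    for node in ast.walk(tree.body[0]):
        # Clause 1: an assignment whose target is an attribute.
        if isinstance(node, ast.Assign):
            targets = node.targets
        elif isinstance(node, (ast.AugAssign, ast.AnnAssign)):
            targets = [node.target]
        else:
            targets = []
        if any(_is_self_attribute(t) for t in targets):
            return True
        # Clause 2: a procedure call, approximated here as a call statement
        # (an expression statement whose value is a call).
        if isinstance(node, ast.Expr) and isinstance(node.value, ast.Call):
            return True
    return False


# Three sample functions, given as source text because the tool is purely static.
SNEAKY = """
def sneaky(self):
    self.attr = self.attr + 1
    return 0
"""

PRINTS = """
def report(self):
    print(self.count)
    return self.count
"""

CALLS_FUNCTION = """
def f(self):
    return self.g() + 1
"""

for name, text in [("sneaky", SNEAKY), ("report", PRINTS), ("f", CALLS_FUNCTION)]:
    print(name, has_concrete_side_effect(text))
```

Under these assumptions sneaky and report are flagged while f is not. The last two results mirror the two remarks made about the definition: report is flagged merely because it prints (the definition is “too strong”), and f is accepted even though g might itself change objects (the definition is “too weak”), which is why each function has to be certified separately.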
Some approaches to programming seek to retain the immutability of mathematics: Lisp in its so-called “pure” form, “Functional Programming” languages such as Backus’s FP, and other applicative languages shun change. But they have not caught on for practical software development, suggesting that change is a fundamental property of software.

The object immutability of mathematics has an important practical consequence known as referential transparency, a property defined as follows:

Definition: referential transparency
An expression e is referentially transparent if it is possible to exchange any subexpression with its value without changing the value of e.
(Definition from “The Free On-Line Dictionary of Computing”, http:// wombat.)

If x has value three, we can use x instead of 3, or conversely, in any part of a referentially transparent expression. (Only Swift’s Laputa academicians were willing to pay the true price of renouncing referential transparency: always carrying around all the things you will ever want to talk about. The Swift quotation was on page 672.) As a consequence of the definition, if we know that x and y have the same value, we can use one interchangeably with the other. For that reason referential transparency is also called “substitutivity of equals for equals”.

With side-effect-producing functions, referential transparency disappears. Assume a class contains the attribute and the function

    attr: INTEGER

    sneaky: INTEGER is
        do
            attr := attr + 1
        end

Then the value of sneaky (meaning: of a call to that function) is always 0 (remember that Result in an integer function is initialized to zero); but you cannot use 0 and sneaky interchangeably, since an extract of the form

    attr := 0; if attr /= 0 then print ("Something bizarre!") end

will print nothing, but would print Something bizarre! if you replaced 0 by sneaky.

Maintaining referential transparency in expressions is important to enable us to reason about our software. One of the central issues of software construction, analyzed clearly by Dijkstra many years ago (see [Dijkstra 1968]), is the difficulty of getting a clear picture of the dynamic behavior (the myriad possible executions of even a simple software element) from its static description (the text of the element). In this effort it is essential to be able to rely on the proven form of reasoning, provided by mathematics. With the demise of referential transparency, however, we lose basic properties of mathematics, so deeply rooted in our practice that we may not even be aware of them. For example, it is no longer true that n + n is the same thing as 2 * n if n is the sneaky-like function

    n: INTEGER is
        do
            attr := attr + 1
            Result := attr
        end

since, with attr initially zero, 2 * n will return 2 whereas n + n will return 3.

By limiting ourselves to functions that do not produce side effects, we will ensure that talking about “functions” in software ceases to betray the meaning of this term in ordinary mathematics. We will maintain a clear distinction between commands, which
change objects but do not directly return results, and queries, which provide information about objects but do not change them.

Another way to express this rule informally is to state that asking a question should not change the answer.

Objects as machines

The following principle expresses the prohibition in more precise terms:

Command-Query Separation principle
Functions should not produce abstract side effects.

Note that we have only defined concrete side effects so far; for the moment you can ignore the difference. (The definition of abstract side effects appears on page 757.)

As a result of the principle, only commands (procedures) will be permitted to produce side effects. (In fact, as noted, we not only permit but expect them to change objects, unlike in applicative, completely side-effect-free approaches.)

[Figure: A list object as list machine. Button labels: start, forth, go, put, search, item, before, after, index, count.]

The view of objects that emerges from this discussion (a metaphor, to be treated with care as usual) is that of a machine, with an internal state that is not directly observable, and two kinds of button: command buttons, rectangular on the picture, and query buttons, round. Pressing a command button is a way to make the machine change state: it starts moving and clicking, then comes back to a new stable state (one of the states shown in the earlier picture of object lifecycle, page 366). You cannot directly see the state (open the machine) but you can press a query button. This does not change the state (remember: asking a question does not change the answer) but yields a response in the form of a message appearing in the display panel at the top; for boolean queries one of the two indicators in the display panel,