We may go further and specify that the handler is assigned to the object at the time of creation, and remains the same throughout the object’s life. This assumption will help keep the mechanism simple. It may seem restrictive at first, since some distributed systems may need to support object migration across a network. But we can address this need in at least two other ways:

• By allowing the reassignment of a processor to a different CPU (with this solution, all objects handled by a processor will migrate together).

• By treating object migration as the creation of a new object.

The dual semantics of calls

With multiple processors, we face a possible departure from the usual semantics of the fundamental operation of object-oriented computation, feature call, of one of the forms

    x ● f (a)          -- if f is a command

    y := x ● f (a)     -- if f is a query

As before, let O2 be the object attached to x at the time of the call, and O1 the object on whose behalf the call is executed. (In other words, the instruction in either form is part of a call to a certain routine, whose execution uses O1 as its target.)

We have grown accustomed to understanding the effect of the call as the execution of f’s body applied to O2, using a as argument, and returning a result in the query case. If the call is part of a sequence of instructions, as with

    … previous_instruction; x ● f (a); next_instruction; …

(or the equivalent in the query case), the execution of next_instruction will not commence until after the completion of f.

Not so any more with multiple processors. The very purpose of concurrent architectures is to enable the client computation to proceed without waiting for the supplier to have completed its job, if that job is handled by another processor. In the example of print controllers, sketched at the beginning of this chapter, a client application will want to send a print request (a “job”) and continue immediately with its own agenda.

So instead of one call semantics we now have two cases:

• If O1 and O2 have the same handler, any further operation on O1 (next_instruction) must wait until the call terminates. Such calls are said to be synchronous.

• If O1 and O2 are handled by different processors, operations on O1 can proceed as soon as it has initiated the call on O2. Such calls are said to be asynchronous.

The asynchronous case is particularly interesting for a command, since the remainder of the computation may not need any of the effects of the call on O2 until much later (if at all: O1 may just be responsible for spawning one or more concurrent computations and then terminating). For a query, we need the result, as in the above example where we assign it to y, but as explained below we might be able to proceed concurrently anyway.
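To make the two cases concrete, here is a minimal sketch in the notation of this chapter. The class and feature names (PRINT_CONTROLLER, LOG, DOCUMENT and their features) are invented for the illustration; the declaration form that puts controller in the hands of another processor is introduced in the next section, and the synchronization rules governing such calls are examined later in this chapter.

    class APPLICATION feature

        controller: separate PRINT_CONTROLLER
                -- Assumed to be attached to an object handled by another processor.

        log: LOG
                -- Attached to an object handled by the same processor as the current object.

        submit (d: DOCUMENT) is
                -- Send d to the print controller, then record the request locally.
            do
                controller ● print_job (d)
                        -- Asynchronous: the target's handler is not ours,
                        -- so execution proceeds without waiting for the printing.
                log ● record (d)
                        -- Synchronous: same handler, so the caller waits
                        -- until record has terminated.
            end

    end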
Separate entities

A general rule of software construction is that a semantic difference should always be reflected by a difference in the software text.

Now that we have two variants of call semantics we must make sure that the software text incontrovertibly indicates which one is intended in each case. What determines the answer is whether the call’s target, O2, has the same handler (the same processor) as the call’s originator, O1. So rather than the call itself we should mark x, the entity denoting the target object. In accordance with the static typing policy, developed in earlier chapters to favor clarity and safety, the mark should appear in the declaration of x.

This reasoning yields the only notational extension supporting concurrency. Along with the usual

    x: SOME_TYPE

we allow ourselves the declaration form

    x: separate SOME_TYPE

to express that x may become attached to objects handled by a different processor. If a class is meant to be used only to declare separate entities, you can also declare it as

    separate class X … The rest as usual …

instead of just class X … or deferred class X …. The convention is the same as for declaring an expanded status (see “Expanded types”, page 254): you can declare y as being of type expanded T, or equivalently just as T if T itself is a class declared as expanded class T …. The three possibilities — expanded, deferred, separate — are mutually exclusive, so at most one qualifying keyword may appear before class.

It is quite remarkable that this addition of a single keyword suffices to turn our sequential object-oriented notation into one supporting general concurrent computation.

Some straightforward terminology. We may apply the word “separate” to various elements, both static (appearing in the software text) and dynamic (existing at run time). Statically: a separate class is a class declared as separate class …; a separate type is based on a separate class; a separate entity is declared of a separate type, or as separate T for some T; x ● f (…) is a separate call if its target x is a separate entity. Dynamically: the value of a separate entity is a separate reference; if not void, it will be attached to an object handled by another processor — a separate object.
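The following sketch illustrates these declaration forms and the accompanying terminology. BOUNDED_BUFFER and PRINT_CONTROLLER appear among the typical examples listed just below; PRODUCER, MESSAGE and the feature names are invented for the occasion, with routine bodies elided.

    separate class BOUNDED_BUFFER [G] feature
            -- Every entity whose type is based on this class is a separate entity.

        put (x: G) is
                -- Add x to the buffer.
            do … end

        item: G is
                -- Oldest element not yet consumed.
            do … end

    end

    class PRODUCER feature

        buffer: BOUNDED_BUFFER [MESSAGE]
                -- Separate entity: its type is based on a separate class.

        controller: separate PRINT_CONTROLLER
                -- Separate entity: an ordinary class, marked separate in this declaration.

        deposit (m: MESSAGE) is
                -- Hand m over to the buffer.
            do
                buffer ● put (m)
                        -- A separate call: its target, buffer, is a separate entity.
            end

    end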
Typical examples of separate classes include:

• BOUNDED_BUFFER, to describe a buffer structure that enables various concurrent components to exchange data (some components, the producers, depositing objects into the buffer, and others, the consumers, acquiring objects from it).

• PRINTER, perhaps better called PRINT_CONTROLLER, to control one or more printers. By treating the print controllers as separate objects, applications do not need to wait for the print job to complete (unlike early Macintoshes, with which you were stuck until the last page had come out of the printer).

• DATABASE, which in the client part of a client-server architecture may serve to describe the database hosted by a distant server machine, to which the client may send queries through the network.

• BROWSER_WINDOW, in a Web browser that allows you to spawn a new window where you can examine different Web pages.

Obtaining separate objects

In practice, as illustrated by the preceding examples, separate objects will be of two kinds:

• In the first case an application will want to spawn a new separate object, grabbing the next available processor. (Remember that we can always get a new processor; since processors are not material resources but abstract facilities, their number is not bounded.) This is typically the case with BROWSER_WINDOW: you create a new window when you need one. A BOUNDED_BUFFER or PRINT_CONTROLLER may also be created in this way.

• An application may simply need to access an existing separate object, usually shared between many different clients. This is the case in the DATABASE example: the client application uses an entity db_server: separate DATABASE to access the database through such separate calls as db_server ● ask_query (sql_query). The client must have at some stage obtained the value of db_server — the database handle — from the outside. Accesses to existing BOUNDED_BUFFER or PRINT_CONTROLLER objects will use a similar scheme.

The separate object is said to be created in the first case and external in the second.

To obtain a created object, you simply use the creation instruction. If x is a separate entity, the creation instruction

    !! x ● make (…)

will, in addition to its usual effect of creating and initializing a new object, assign a new processor to handle that object. Such an instruction is called a separate creation.

To obtain an existing external object, you will typically use an external routine, such as

    server (name: STRING; … Other arguments …): separate DATABASE

where the arguments serve to identify the requested object. Such a routine will typically send a message over the network and obtain in return a reference to the object.
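The following sketch puts the two mechanisms side by side. The class CLIENT, the feature set_up, the creation procedure make of BROWSER_WINDOW and the argument values are assumptions introduced for the example; the body of server stands for whatever platform-specific lookup an actual implementation would perform.

    class CLIENT feature

        window: separate BROWSER_WINDOW
        db_server: separate DATABASE

        set_up is
                -- Obtain one created and one external separate object.
            do
                !! window ● make
                        -- Separate creation: creates and initializes a new object
                        -- and assigns a new processor to handle it.
                db_server := server ("sales", …)
                        -- External object: the routine returns a reference to an
                        -- object handled by an existing, possibly remote, processor.
            end

        server (name: STRING; … Other arguments …): separate DATABASE is
                -- Database known under name.
            do
                …  -- Typically a message sent over the network.
            end

    end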
A word about possible implementations may be useful here to visualize the notion of separate object. Assume each of the processors is associated with a task (process) of an operating system such as Windows or Unix, with its own address space; this is of course just one of many concurrent architectures. Then one way to represent a separate object within a task is to use a small local object, known as a proxy:

(Figure: A proxy for a separate object. O1, an instance of a class T in address space 1, has a field x: separate U together with other, non-separate fields; that field leads to a local PROXY OBJECT, which identifies O2, of type U, among the objects of address space 2.)

The figure shows an object O1, instance of a class T with an attribute x: separate U. The corresponding reference field in O1 is conceptually attached to an object O2, handled by another processor. Internally, however, the reference leads to a proxy object, handled by the same processor as O1. The proxy is an internal object, not visible to the author of the concurrent application. It contains enough information to identify O2: the task that serves as O2’s handler, and O2’s address within that task. All operations on x on behalf of O1 or other clients from the same task will go through the proxy. Any other processor that also handles objects containing separate references to O2 will have its own proxy for O2.

Be sure to note that this is only one possible technique, not a required property of the model. Operating system tasks with separate address spaces are just one way to implement processors. With threads, for example, the techniques may be different.
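To make the picture more tangible, here is one conceivable shape for such a proxy, expressed in the notation of this chapter purely for illustration; it is an implementation-level sketch for the task-based scheme just described, not part of the mechanism visible to application authors, and the class and attribute names are invented.

    class PROXY feature

        handler_task: INTEGER
                -- Identifier of the task (process) serving as the handler
                -- of the remote object (O2 in the figure).

        remote_address: INTEGER
                -- Address of that object within the handler's own address space.

    end

Under this scheme a call x ● f (a) whose reference leads to such a proxy would be forwarded by the run-time system to handler_task, which applies f to the object at remote_address and, for a query, sends the result back. With a thread-based implementation the two fields would take a different form, since all threads of a process share one address space.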
Objects here, and objects there

When first presented with the notion of separate entity, some people complain that it is over-committing: “I do not want to know where the object resides! I just want to request the operation, x ● f (…), and let the machinery do the rest — execute f on x wherever x is.”

Although legitimate, this desire to avoid over-commitment does not obviate the need for separate declarations. It is true that the precise location of an object is often an implementation detail that should not affect the software. But one “yes or no” property of the object’s location remains relevant: whether the object is handled by the same processor or by another. This is a fundamental semantic difference since it determines whether calls on the object are synchronous or asynchronous — cause the client to wait, or not. Ignoring this property in the software would not be a convenience; it would be a mistake.

Once we know the object is separate, it should not in most cases matter for the functionality of our software (although it may matter for its performance) whether the object belongs to another thread of the same process, another process on the same computer, another computer in the same room, another room in the same building, another site on the company’s private network, or another Internet node half-way around the world. But it matters that it is separate.

A concurrency architecture

The use of separate declarations to cover the fundamental boolean property “is this object here, or is it elsewhere?” while leaving room for various physical implementations of concurrency suggests a two-level architecture, similar to what is available for the graphical mechanisms (with the Vision library sitting on top of platform-specific libraries):

(Figure: Two-level architecture for the concurrency mechanism. The general concurrency mechanism (SCOOP) forms the top level; below it sit platform-specific handles: a process-based handle, a thread-based handle, a CORBA-based handle. See a similar architecture for graphical libraries on page 1067.)

At the highest level the mechanism is platform-independent. This is the level which most applications use, and which this chapter describes. To perform concurrent computation, applications simply use the separate mechanism.

Internally, the implementation will rely on some practical concurrent architecture (lower level on the figure). The figure lists some possibilities:

• There may be an implementation using processes (tasks) as provided by the operating system. Each processor is associated with a process. This solution supports distributed computing: the process of a separate object can be on a remote machine as well as a local one. For non-distributed processing, it has the advantage that processes are stable and well known, and the disadvantage that they are CPU-intensive; both the creation of a new process and the exchange of information between processes are expensive operations.

• There may be an implementation using threads. Threads, as already noted, are a lighter alternative to processes, minimizing the cost of creation and context switching. Threads, however, have to reside on the same machine.

• A CORBA implementation is also possible, using CORBA distribution mechanisms as the physical layer to exchange objects across the network.

• Other possible mechanisms include PVM (Parallel Virtual Machine), the Linda language for concurrent programming, Java threads…
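One way to picture the two levels, offered here only as a hypothetical sketch rather than a description of any actual implementation, is as a deferred class capturing the operations that the general mechanism needs from the platform, with one effective descendant per physical mechanism; the class and feature names are invented, and the sketch reuses the hypothetical PROXY class shown earlier.

    deferred class CONCURRENCY_HANDLE feature

        create_processor is
                -- Allocate a new processor on the underlying platform.
            deferred end

        transmit_call (target: PROXY; feature_name: STRING) is
                -- Forward a call to the processor handling the target object.
            deferred end

    end

    class PROCESS_HANDLE inherit
        CONCURRENCY_HANDLE
    feature
            -- Version relying on operating system processes.

        create_processor is do … end

        transmit_call (target: PROXY; feature_name: STRING) is do … end

    end

A THREAD_HANDLE or CORBA_HANDLE descendant would provide the same features in terms of threads or of CORBA requests.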
As always with such two-level architectures, the correspondence between high-level constructs and the actual platform mapping (the handle in terms of a previous chapter) is in most cases automatic, so that application developers will see the highest level only. But