6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology 6.001 Notes: Section 32.1 Slide 32.1.1 We have spent some time looking at computational objects that rency and asynchronous computing are built around the use of local state variables and methods for Object oriented approaches lose "referential transparency Referential transparency means equal expressions changing those variables to reflect evolution in the state of the can be substituted for one another without changing the object. In particular, we saw an example of building a system around an object oriented paradigm, in which the central component of our system was a set of communicating objects, that took messages are arguments and returned methods that could be applied to objects and other arguments to simulate interactions in complex systems We saw a hint of the power of orienting system design around 4 such principles, but we also saw that this power of using local state to model systems also extracts a price the loss of referential transparency So what does this mean? A language with referential transparency means that equal expressions can be substituted for one another without changing the value of the expression Slide 32.1.2 Example of referential transparency or example consider the code shown on the slide. Make-adder creates a procedure that adds a fixed number to its argument. we (define (make-adder n) can use it to create two adders, as shown, called d1 and d2 The question in which we are interested is whether Dl and D2 are the same? We would argue that in one sense the answer is no (define D2 (make-adder 4)) since they point to different procedural structures, but in another sense the answer is yes, since we can replace any expression Are D1 and D2 the same? involving Dl with an equivalent expression involving D2 and we with D1 by will get exactly the same behavior expression with D2 and get same value- so YES Slide 32.1.3 But now consider the code shown on this slide Example of loss of transparency Here we have a simple message passing procedure for representing(dnt afine (make-accoumt balance bank accounts (if (>e balance amount) begin (setI Balance ( balance amount)) We can again ask whether peter and paul are the same. Here e know intuitively that the answer Even though the detine (depos it amount) expression used to create each is the know that the (define (dispatch m) behavior of these objects is different, because of the local state In this case, we do not have referential transparency, since the ance) balance) same expression does not give rise to things that can be substituted (else (error unknown request n)))) for one another define paul(make-aocount 100)) Are these the same
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. 6.001 Notes: Section 32.1 Slide 32.1.1 We have spent some time looking at computational objects that are built around the use of local state variables, and methods for changing those variables to reflect evolution in the state of the object. In particular, we saw an example of building a system around an object oriented paradigm, in which the central component of our system was a set of communicating objects, that took messages are arguments, and returned methods that could be applied to objects and other arguments to simulate interactions in complex systems. We saw a hint of the power of orienting system design around such principles, but we also saw that this power of using local state to model systems also extracts a price: the loss of referential transparency. So what does this mean? A language with referential transparency means that equal expressions can be substituted for one another without changing the value of the expression. Slide 32.1.2 For example consider the code shown on the slide. Make-adder creates a procedure that adds a fixed number to its argument. We can use it to create two adders, as shown, called D1 and D2. The question in which we are interested is whether D1 and D2 are the same? We would argue that in one sense the answer is no, since they point to different procedural structures, but in another sense the answer is yes, since we can replace any expression involving D1 with an equivalent expression involving D2 and we will get exactly the same behavior. Slide 32.1.3 But now consider the code shown on this slide. Here we have a simple message passing procedure for representing bank accounts. We can again ask whether peter and paul are the same. Here, we know intuitively that the answer is no. Even though the expression used to create each is the same, we know that the behavior of these objects is different, because of the local state. In this case, we do not have referential transparency, since the same expression does not give rise to things that can be substituted for one another
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 32.1.4 The role of time in evaluation The apparently simple introduction of local state and mutation into ((petor ' deposit) 5) our language thus has some drastic consequences: it raises value: 9 : value: 105 questions about sameness and change ' deposit)5) g beneath the complexity sameness, and change is that by introducing assignment we are Order of evaluation doesn't forced to admit time into our computational models. Before we Order of evaluation does matter introduced assignment, all our programs were timeless, in the sense that any expression that has a value al ways has the same value. Thus, calling (d1 5) would al ways return the same value In contrast, look at our modeling of deposits to a bank account, that returns the resulting balance here successive evaluations of he same expression yield different values. This behavior arise from the fact that the execution of assignment statements (in this case, assignments to the variable balance) delineates moments in time when values change. The result of evaluating an expression depends not only on the of computational objects with local state forces us to confront time as an essential concept in programming n terms expression itself, but also on whether the evaluation occurs before or after these moments. Building models Slide 32. 1.5 We can go further in structuring computational models to match Role of concurrency and time our perception of the physical world. Objects in the world do not Behavior of objects with state depends on sequence of change one at a time in sequence rather we perceive them as events that precede it. Objects don't change one at a time; they act concurrently. acting concurrently---all at once. So it is often natural to model Computation could take advantage of this by letting systems as collections of computational processes that execute concurrently. Just as we can make our programs modular by But this raises issues of controlling interactions organizing models in terms of objects with separate local state, it is often appropriate to divide computational models into parts that evolve separately and concurrently. Even if the programs are to be executed on a sequential computer, the practice of writing programs as if they were to be executed concurrently forces the 4 programmer to avoid inessential timing constraints and thus makes programs more modular In addition to making programs more modular, concurrent computation can provide a speed advantage over sequential computation. Sequential computers execute only one operation at a time, so the amount of time it takes to erform a task is proportional to the total number of operations performed. However, if it is possible to decompose a problem into pieces that are relatively independent and need to communicate only rarely, it may be possible to allocate pieces to separate computing processors, producing a speed advantage proportional to the number of processors available Unfortunately, the complexities introduced by assignment become even more problematic in the presence of concurrency. The fact of concurrent execution, either because the world operates in parallel or because our computers do, entails additional complexity in our understanding of time Today, we are going to look at those issues and ways to try to get around them
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 32.1.4 The apparently simple introduction of local state and mutation into our language thus has some drastic consequences: it raises questions about sameness and change. The central issue lurking beneath the complexity of state, sameness, and change is that by introducing assignment we are forced to admit time into our computational models. Before we introduced assignment, all our programs were timeless, in the sense that any expression that has a value always has the same value. Thus, calling (d1 5) would always return the same value. In contrast, look at our modeling of deposits to a bank account, that returns the resulting balance. Here successive evaluations of the same expression yield different values. This behavior arises from the fact that the execution of assignment statements (in this case, assignments to the variable balance) delineates moments in time when values change. The result of evaluating an expression depends not only on the expression itself, but also on whether the evaluation occurs before or after these moments. Building models in terms of computational objects with local state forces us to confront time as an essential concept in programming. Slide 32.1.5 We can go further in structuring computational models to match our perception of the physical world. Objects in the world do not change one at a time in sequence. Rather we perceive them as acting concurrently---all at once. So it is often natural to model systems as collections of computational processes that execute concurrently. Just as we can make our programs modular by organizing models in terms of objects with separate local state, it is often appropriate to divide computational models into parts that evolve separately and concurrently. Even if the programs are to be executed on a sequential computer, the practice of writing programs as if they were to be executed concurrently forces the programmer to avoid inessential timing constraints and thus makes programs more modular. In addition to making programs more modular, concurrent computation can provide a speed advantage over sequential computation. Sequential computers execute only one operation at a time, so the amount of time it takes to perform a task is proportional to the total number of operations performed. However, if it is possible to decompose a problem into pieces that are relatively independent and need to communicate only rarely, it may be possible to allocate pieces to separate computing processors, producing a speed advantage proportional to the number of processors available. Unfortunately, the complexities introduced by assignment become even more problematic in the presence of concurrency. The fact of concurrent execution, either because the world operates in parallel or because our computers do, entails additional complexity in our understanding of time. Today, we are going to look at those issues and ways to try to get around them
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 32.1.6 Why is time an issue? Now why should time be an issue? For any two events, a and b detine peter (nake-account 100) ither A occurs before B. a and b are simultaneous or A occurs define paul p。tex after B But let's look at that carefully. Suppose we create an account for withdraw) 25) Peter, that Paul shares. Now suppose thatthat Peter peter Bank paull eter 100 au dollars and Paul withdraws 25 dollars from this joint account that 90 -25 initially contains 100 dollars, leaving 65 in the account.Depending on the order of the two withdrawals, the sequence of balances in the account is either 100. then 90 then 65 or 100. then 75. then 65 In a computer implementation of the banking system, this 4 hanging sequence of balances could be modeled by successive signments to the variable balance Slide 32.1.7 In complex situations, however, such a view can be problematic. Why is time an issue? Suppose that Peter and Paul, and many other people besides, are accessing the same bank account through a network of banking ance ( balance anount)) machines distributed all over the world. The actual sequence of xuffieignt fund“) balances in the account will depend critically on the detailed (peter withdraW) 10 paul withdraw) 25 timing of the accesses and the details of the communication among [Peer the machines The size of the event matters in determining whether the outcome occurs correctly This indeterminacy in the order of events can pose serious nwva|u100-25=75 roblems in the design of concurrent systems. For instance Set balance 90 that the withdrawals made by peter and Paul set balanee 75 plemented as two separate processes sharing a common variable balance, each process specified by the withdraw procedure Consider the expression(set! balance (-balance amount)) executed as part of each withdrawal process. This consists of three steps: (1) accessing the value of the balance variable; (2)computing the new balance; (3)setting balance to this new value. If Peter and Paul's withdrawals execute this statement concurrently, then the two withdrawals might interleave the order in which they access balance and set it to the new value The timing diagram depicts an order of events where balance starts at 100, Peter withdraws 10, Paul withdraws 25, and yet the final value of balance is 75. As shown in the diagram, the reason for this anomaly is that Paul's assignment of 75 to balance is made under the assumption that the value of balance to be decremented is 100 That assumption, however, became invalid when Peter changed balance to 90. This is a catastrophic failure for the banking system, because the total amount of money in the system is not conserved. Before the transactions, the total amount of money was 100. Afterwards, Peter has 10, Paul has 25, and the bank has 75 The general phenomenon illustrated here is that several processes may share a common state variable. What makes this complicated is that more than one process may be trying to manipulate the shared state at the same time For the bank account example, during each transaction, each customer should be able to act as if the other customers did not exist. When a customer changes the balance in a way that depends on the balance he must be able to assume that, just before the moment of change, the balance is still what he thought it was 6.001 Notes: Section 32.2
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 32.1.6 Now why should time be an issue? For any two events, A and B, either A occurs before B, A and B are simultaneous, or A occurs after B. But let's look at that carefully. Suppose we create an account for Peter, that Paul shares. Now suppose that that Peter withdraws 10 dollars and Paul withdraws 25 dollars from this joint account that initially contains 100 dollars, leaving 65 in the account. Depending on the order of the two withdrawals, the sequence of balances in the account is either 100, then 90 then 65, or 100, then 75, then 65. In a computer implementation of the banking system, this changing sequence of balances could be modeled by successive assignments to the variable balance. Slide 32.1.7 In complex situations, however, such a view can be problematic. Suppose that Peter and Paul, and many other people besides, are accessing the same bank account through a network of banking machines distributed all over the world. The actual sequence of balances in the account will depend critically on the detailed timing of the accesses and the details of the communication among the machines. The size of the event matters in determining whether the outcome occurs correctly. This indeterminacy in the order of events can pose serious problems in the design of concurrent systems. For instance, suppose that the withdrawals made by Peter and Paul are implemented as two separate processes sharing a common variable balance, each process specified by the withdraw procedure: Consider the expression (set! balance (- balance amount)) executed as part of each withdrawal process. This consists of three steps: (1) accessing the value of the balance variable; (2) computing the new balance; (3) setting balance to this new value. If Peter and Paul's withdrawals execute this statement concurrently, then the two withdrawals might interleave the order in which they access balance and set it to the new value. The timing diagram depicts an order of events where balance starts at 100, Peter withdraws 10, Paul withdraws 25, and yet the final value of balance is 75. As shown in the diagram, the reason for this anomaly is that Paul's assignment of 75 to balance is made under the assumption that the value of balance to be decremented is 100. That assumption, however, became invalid when Peter changed balance to 90. This is a catastrophic failure for the banking system, because the total amount of money in the system is not conserved. Before the transactions, the total amount of money was 100. Afterwards, Peter has 10, Paul has 25, and the bank has 75. The general phenomenon illustrated here is that several processes may share a common state variable. What makes this complicated is that more than one process may be trying to manipulate the shared state at the same time. For the bank account example, during each transaction, each customer should be able to act as if the other customers did not exist. When a customer changes the balance in a way that depends on the balance, he must be able to assume that, just before the moment of change, the balance is still what he thought it was. 6.001 Notes: Section 32.2
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 32.2.1 The above example typifies the subtle bugs that can creep into Correct behavior of concurrent programs concurrent programs. The root of this complexity lies in the REQUIRE assignments to variables that are shared among the different ations that change any shared state variables can occur at the same time processes. We already know that we must be careful in writing That a concurrent system produces the same result as if programs that use set!, because the results of a computation the processes had run sequentially in some order depend on the order in which the assignments occur With concurrent processes we must be especially careful about assignments, because we may not be able to control the order of the assignments made by the different processes. If several sut changes might be made concurrently(as with two depositors accessing a joint account)we need some way to ensure that our 4 stem behaves correctly. For example, in the case of withdrawals from a joint bank account, we must ensure that money is conserved. To make concurrent programs behave correctly, we may have to place some restrictions on concurrent execution One possible restriction on concurrency would stipulate that no two operations that change any shared state variables can occur at the same time. This is an extremely stringent requirement For distributed banking, it would require the system designer to ensure that only one transaction could proceed at a time. This would be both inefficient and overly conservative a less stringent restriction on concurrency would ensure that a concurrent system produces the same result as if the processes had run sequentially in some order Slide 32.2.2 Correct behavior of concurrent programs There are two important aspects to this requirement. First, it does REQUIRE not require the processes to actually run sequentially, but only to That no two operations that change any shared state produce results that are the same as if they had run sequentially That a concurrent system produces the same result as if For our previous example, the designer of the bank account system the processes had run sequentially in some orde fely allow Paul's deposit and Peter,s withdrawal to happen by to areer e sa es ey a urena. concurrently, because the net result will be the same as if the two sequentially operations had happened sequentially. Second, there may be more -There may be more than one"correct" result as a han one possible correct result produced by a concurrent progra because we require only that the result be the same as for some sequential order Slide 32. 2.3 Let's look at this a bit more carefully Parallel execution To make the above mechanism more concrete, suppose that we (define x 10) have extended Scheme to include a procedure called parallel (define p3 (Lambda ((set! x (*xx))) execute this procedure takes a set of procedures of no ( define p4(1 anda()《set!K件x1)))) arguments as input. It then creates a separate process for each such procedure( think of this as a separate evaluator, with its own environment) and applies such procedure within that process. All these processes(or evaluators)then run concurrently As an example of how this is used, consider the example shown here This creates two concurrent processes: Pl, which sets x to x times 4 x and p2. which increments x
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 32.2.1 The above example typifies the subtle bugs that can creep into concurrent programs. The root of this complexity lies in the assignments to variables that are shared among the different processes. We already know that we must be careful in writing programs that use set!, because the results of a computation depend on the order in which the assignments occur. With concurrent processes we must be especially careful about assignments, because we may not be able to control the order of the assignments made by the different processes. If several such changes might be made concurrently (as with two depositors accessing a joint account) we need some way to ensure that our system behaves correctly. For example, in the case of withdrawals from a joint bank account, we must ensure that money is conserved. To make concurrent programs behave correctly, we may have to place some restrictions on concurrent execution. One possible restriction on concurrency would stipulate that no two operations that change any shared state variables can occur at the same time. This is an extremely stringent requirement. For distributed banking, it would require the system designer to ensure that only one transaction could proceed at a time. This would be both inefficient and overly conservative. A less stringent restriction on concurrency would ensure that a concurrent system produces the same result as if the processes had run sequentially in some order. Slide 32.2.2 There are two important aspects to this requirement. First, it does not require the processes to actually run sequentially, but only to produce results that are the same as if they had run sequentially. For our previous example, the designer of the bank account system can safely allow Paul's deposit and Peter's withdrawal to happen concurrently, because the net result will be the same as if the two operations had happened sequentially. Second, there may be more than one possible correct result produced by a concurrent program, because we require only that the result be the same as for some sequential order. Slide 32.2.3 Let's look at this a bit more carefully. To make the above mechanism more concrete, suppose that we have extended Scheme to include a procedure called parallelexecute: this procedure takes a set of procedures of no arguments as input. It then creates a separate process for each such procedure (think of this as a separate evaluator, with its own environment) and applies such procedure within that process. All these processes (or evaluators) then run concurrently. As an example of how this is used, consider the example shown here. This creates two concurrent processes: P1, which sets x to x times x, and P2, which increments x
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 32. 2.4 Parallel execution To understand what would be legitimate results under concurrent (define r 10) operation of PI and P2, let's break down the stages a bit more define p3 (lambda ()(setI x(*xx))) Pl executes three different stages. as shown And p2 executes two different stages, as shown a: lookup tirst m,地甲ubtx e: assign sun of d and l to x Slide 32. 2.5 For these processes to operate correctly, PI simply needs to ensureParallel execution that the ordering a, b, c takes places, and p2 simply needs to (lambda ( ensure that the ordering d, e occurs (define p3 (detine p4 (lambda ( X(·x1))》 Here are the different ways in which we can preserve these (parallel-execute p3 p4) orderings, while allowing for intertwining of stages between the P1: a: lookup tirst x in p3 two processes. look up second x in p3 since the result would reflect a correct sequence of operations irompoece o MB b m p ra act of a amd b to In other words, each of these orderings is a legitimate sequence, p2: d: looky each process. All that differs is how the sequences intertwine ab Slide 32.2.6 Parallel execution After execution is complete, x will be left with one of five possible values, depending on the interleaving of the events of Pl and p2 define p3 (lambda ()(setI x(* xx)))) define p4 (lambla ()(set1 x(. x 1))) We can see this by marking the value of x at each stage as shown The blue values come from Pl, the red from P2. If we allow any intertwining of the stages of each process, we see that there are five different possible final values for x: 11, 100, 101, 110, and e: assign sun of d and 1 to x abcde1010100t0010t bc。1101t01 dac bc s0 10 1111110 h dec 1010 1011 100 dea bc 1n 11111112 Slide 32.2.7 On the other hand, if we insist that only sequential orderings of theParallel execution two processes occur, i.e., no intertwining of intermediate stages, then there are only two possible outcomes (lambda ()(setI x(*x x))) be0t1111t0 a hce 10 1010 100 :1
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 32.2.4 To understand what would be legitimate results under concurrent operation of P1 and P2, let's break down the stages a bit more finely. P1 executes three different stages, as shown. And P2 executes two different stages, as shown. Slide 32.2.5 For these processes to operate correctly, P1 simply needs to ensure that the ordering a, b, c takes places, and P2 simply needs to ensure that the ordering d, e occurs. Here are the different ways in which we can preserve these orderings, while allowing for intertwining of stages between the two processes. In other words, each of these orderings is a legitimate sequence, since the result would reflect a correct sequence of operations from each process. All that differs is how the sequences intertwine. Slide 32.2.6 After execution is complete, x will be left with one of five possible values, depending on the interleaving of the events of P1 and P2. We can see this by marking the value of x at each stage as shown. The blue values come from P1, the red from P2. If we allow any intertwining of the stages of each process, we see that there are five different possible final values for x: 11, 100, 101, 110, and 121. Slide 32.2.7 On the other hand, if we insist that only sequential orderings of the two processes occur, i.e., no intertwining of intermediate stages, then there are only two possible outcomes