6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology 6.001 Notes: Section 11.1 Slide ll1.1 Elements of a Data Abstraction For the past few lectures, we have been exploring the topic of data abstractions, and their role in modularizing complex systems. We have particularly looked at the relationship between data structures and the procedures that manipulate them Today we are going to add a new aspect to this topic, by introducing the idea of mutation. This is the idea of changing or altering an existing data structure, rather than simply making a copy of it. We are going to look at two examples of useful data structures and show how while mutation carries some hazards 60015e with it; it also supports very efficient implementation of such structure Elements of a Data Abstraction Slide ll1.2 a data abstraction consists of To set the stage for our exploration, lets first review what we know about data abstractions First we know that a data structure has a constructor, a way of gluing pieces together Typically the definition of the constructor also involves a designation of the type contract of the constructor: what kind operations and how many objects are glued together Associated with the constructor are sets of accessors or selectors that get pieces of the object back out. These governed by a contract with the constructor, designating the 1001 SICP behavior by which pieces glued together can be pulled apar and we typically had a set of operations that used the data structures without actually using the details of the implementation. The standard form is to use the selectors to get pieces of an existing object, manipulate these pieces to create new components, then reassemble these using the constructor into a new version of the data abstraction The new thing we are adding is a mutator. This is something that will change an existing data object, that is, go into the exist structure and alter pieces of that structure in place, rather than constructing a new version of the object with copies of the parts. This is the topic to which we now turn
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. 6.001 Notes: Section 11.1 Slide 11.1.1 For the past few lectures, we have been exploring the topic of data abstractions, and their role in modularizing complex systems. We have particularly looked at the relationship between data structures and the procedures that manipulate them. Today we are going to add a new aspect to this topic, by introducing the idea of mutation. This is the idea of changing or altering an existing data structure, rather than simply making a copy of it. We are going to look at two examples of useful data structures, and show how while mutation carries some hazards with it; it also supports very efficient implementation of such structures. Slide 11.1.2 To set the stage for our exploration, let's first review what we know about data abstractions. First, we know that a data structure has a constructor, a way of gluing pieces together. Typically the definition of the constructor also involves a designation of the type contract of the constructor: what kind and how many objects are glued together. Associated with the constructor are sets of accessors or selectors that get pieces of the object back out. These are governed by a contract with the constructor, designating the behavior by which pieces glued together can be pulled apart. And we typically had a set of operations that used the data structures without actually using the details of the implementation. The standard form is to use the selectors to get pieces of an existing object, manipulate these pieces to create new components, then reassemble these using the constructor into a new version of the data abstraction. The new thing we are adding is a mutator. This is something that will change an existing data object, that is, go into the exist structure and alter pieces of that structure in place, rather than constructing a new version of the object with copies of the parts. This is the topic to which we now turn
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 11.1.3 Let's start by looking at very simple data structures. For Primitive Data example, we can actually consider a variable or name to be a creates a new binding for name kind of data abstraction special form If we take this view then our "constructor" in this case is the special form define. It binds the name (or symbol) given returns value bound to name by the first argument to the value of the second expression. The key point is that we can consider this as a kind of primitive data abstraction that pairs up a name and a value The associated selector in this case is very simple. We just use the name itself, that is, when we evaluate the name, it looks up the value bound to it by the define expression, and returns 4 6001 sICP that value, that is, it pulls apart the binding of name and value, and returns one component of that binding Primitive Data Slide 1l. 1.4 (define x 10) creates a new binding for name Now we can introduce the ability to mutate that variable, that is special form the expression that accomplishes this, which is the set at to change the binding of the name and a value. Let's look firs eturns value bound to name expression shown To Mutate changes the binding for name This is also a special form, as it doesn't follow the normal rules for combinations The rule for evaluation of this form is take the second argument and evaluate it using standard rules then take the first argument and just treat it as a symbol (i.e. don't evaluate it). Now, find the binding of that symbol and change it 1001 SICP to take on the new value Note that this looks a lot like a define expression, but as we will see in a few lectures there is a fundamental difference. For now the distinction is that a de fine would create a new binding for the name, using up additional space, whereas set! changes an existing binding Slide 11.1.5 Assignment-- set! What does adding mutation do to our language? One of the primary effects of adding mutation is that we end up breaking 【 define x10 our substitution model. This is okay, as we will shortly replace +x5)=>15 each time it ey it with a better model +x5)=>15 same scope In fact, the substitution model inherently assumed that we were dealing with functional programming, with no mutation or side effects what does this mean? In essence. functional programming implies that we can conceptually treat our procedures as if they were mathematical fu know that we have procedures that deal with things other thar numbers but the same idea holds This means that we could treat our procedures as mappings from input values to output values, and more importantly, that mapping was consistent, providing the same output value for a particular input value no matter when we did the actual evaluation For example, if i define x to have the value 10, and then I evaluate the expression ( x 5), i of course get the value 15. If I then do some other computation and return to the evaluation of ( x 5)I still get the value 15
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 11.1.3 Let's start by looking at very simple data structures. For example, we can actually consider a variable or name to be a kind of data abstraction. If we take this view, then our "constructor" in this case is the special form define. It binds the name (or symbol) given by the first argument to the value of the second expression. The key point is that we can consider this as a kind of primitive data abstraction that pairs up a name and a value. The associated selector in this case is very simple. We just use the name itself, that is, when we evaluate the name, it looks up the value bound to it by the define expression, and returns that value, that is, it pulls apart the binding of name and value, and returns one component of that binding. Slide 11.1.4 Now we can introduce the ability to mutate that variable, that is, to change the binding of the name and a value. Let's look first at the expression that accomplishes this, which is the set! expression shown. This is also a special form, as it doesn't follow the normal rules for combinations. The rule for evaluation of this form is: take the second argument and evaluate it using standard rules; then take the first argument and just treat it as a symbol (i.e. don't evaluate it). Now, find the binding of that symbol and change it to take on the new value. Note that this looks a lot like a define expression, but as we will see in a few lectures, there is a fundamental difference. For now, the distinction is that a define would create a new binding for the name, using up additional space, whereas set! changes an existing binding. Slide 11.1.5 What does adding mutation do to our language? One of the primary effects of adding mutation is that we end up breaking our substitution model. This is okay, as we will shortly replace it with a better model. In fact, the substitution model inherently assumed that we were dealing with functional programming, with no mutation or side effects. What does this mean? In essence, functional programming implies that we can conceptually treat our procedures as if they were mathematical functions. Yes, we know that we have procedures that deal with things other than numbers, but the same idea holds. This means that we could treat our procedures as mappings from input values to output values, and more importantly, that mapping was consistent, providing the same output value for a particular input value, no matter when we did the actual evaluation. For example, if I define x to have the value 10, and then I evaluate the expression (+ x 5), I of course get the value 15. If I then do some other computation and return to the evaluation of (+ x 5) I still get the value 15
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology This means that this expression always has the same value, no matter when I evaluate it, and the procedure acts like a mapping from an input value to an output value, independent of time Assignment--set Slide 11.1.6 Substitution model- functional programming Once we introduce mutation or assignment into our language, (define z 10) this model no longer holds. In particular, an expressions value +x5)=>15 now depends on when it is evaluated, that is, what other (+x5)=>15 ame scope as binding expressions have been evaluated before it. In fact, notice the With assignment use of the term when" We have now introduced time into our (define x 10 language in a very fundamental way. Now, two expression +x5) 15-expression value" depends on when it is evaluated with identical syntax may have different semantics, because 【set!x94} they inherently rely on the context surrounding their evaluation (+x5}=>99 To see this, consider the example shown. We again define x to 1001 SICP have the value 10. If we immediately evaluate the expression ( x 5) we will still get the value 15. Now suppose that at some future point, we mutate the value of x to have some new value, 94. Remember that set! finds the existing binding for x and changes it. If we then evaluate (+x 5) we now get 99 as a value, since the value of x has Notice, we now have two syntactically identical expressions, but with different values or meanings or semantics Thus, time now matters. Adding the ability to mutate a value means that context will influence the values of expressions. As we will see, having mutation makes some things much easier to do, but at the cost of greater potential for unanticipated effects Slide 11.1.7 Compound Data Not only can we mutate names, we can also mutate other basic data structures. For pairs, we also have procedures that change creates a new pair p their pieces To remind you, the constructor for a pair is the primitive selectors procedure cons, and the associated selectors are car and returns car part of pair returns cdr part of pair cdr. Now we introduce two mutators, one for each part of the mutators. data structure Their forms are shown on the slide (set-car! P new-x) changes car pointer in pair Both of these are normal procedures. Their behavior is to (set-cdr! p new-y) changes evaluate the first argument, which is expected to be a pair(or i Pair, anytype - undef something made by cons). They also evaluate their second 6 001 SICP argument to get a new value or structure. The behavior is to then take the pair pointed to by the first argument and change or mutate either the car or cdr part of that pair ( depending on which expression we are using) to point to the value of the second argument, thereby breaking the current pointer in the car or cdr part of the pair Notice the type definition of these procedures. In particular, the return type is unspecified, since the procedure is Ised strictly for the side effect of changing the pair structure
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. This means that this expression always has the same value, no matter when I evaluate it, and the procedure acts like a mapping from an input value to an output value, independent of time. Slide 11.1.6 Once we introduce mutation or assignment into our language, this model no longer holds. In particular, an expressions value now depends on when it is evaluated, that is, what other expressions have been evaluated before it. In fact, notice the use of the term "when". We have now introduced time into our language in a very fundamental way. Now, two expressions with identical syntax may have different semantics, because they inherently rely on the context surrounding their evaluation. To see this, consider the example shown. We again define x to have the value 10. If we immediately evaluate the expression (+ x 5) we will still get the value 15. Now suppose that at some future point, we mutate the value of x to have some new value, 94. Remember that set! finds the existing binding for x and changes it. If we then evaluate (+ x 5) we now get 99 as a value, since the value of x has changed. Notice, we now have two syntactically identical expressions, but with different values or meanings or semantics. Thus, time now matters. Adding the ability to mutate a value means that context will influence the values of expressions. As we will see, having mutation makes some things much easier to do, but at the cost of greater potential for unanticipated effects. Slide 11.1.7 Not only can we mutate names, we can also mutate other basic data structures. For pairs, we also have procedures that change their pieces. To remind you, the constructor for a pair is the primitive procedure cons, and the associated selectors are car and cdr. Now we introduce two mutators, one for each part of the data structure. Their forms are shown on the slide. Both of these are normal procedures. Their behavior is to evaluate the first argument, which is expected to be a pair (or something made by cons). They also evaluate their second argument to get a new value or structure. The behavior is to then take the pair pointed to by the first argument and change or mutate either the car or cdr part of that pair (depending on which expression we are using) to point to the value of the second argument, thereby breaking the current pointer in the car or cdr part of the pair. Notice the type definition of these procedures. In particular, the return type is unspecified, since the procedure is used strictly for the side effect of changing the pair structure
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide lll 8 Example: Pair/List Mutation As we will see shortly, having mutation buys us some new power in our computational capabilities, but it also raises some (define b a) a=→>(12) L-n new problems, which we want to highlight first. For example, b=>(12} suppose I create the list of l and 2, and give it the name a. I also give the name b to the same structure, and remember that the second define expression literally binds the name b to the value of a, which we know is the pointer to this box-and pointer structure. Remember that there is no new creation of pairs in this case b literally points to the same structure IfI 6 001 SICP ao ask for the value of a or b, in each case i get the list(1 2) shown in red. This gets the value of a which is the list structure(define bast 12 or Slide ll1,g Now suppose that some time later, I evaluate the expression in the upper right. It gets the value of the second argument, 12 which is just 10. It then takes the car part of the box pointed to by a, breaks the current pointer in that box, and creates a (set-car! a 10) new pointer to the new value, 10 b=>(102) So what! Well. notice a subtle effect. If I ask for the value ofb it has changed! We get the value of b by tracing out the box and-pointer structure, to get(10 2). Yet nowhere in my code is there an explicit expression changing b. In this simple case, we know there is a tie between a and b, but in more general cases you can see how trouble can arise. Given the ability to share data structures, and to mutate data structures, one piece of code could change a value affecting some other variable without realizing it Example 2: Pair/List Mutation Slide lll1o To use these mutators it is important to understand their affect enex3a"b)x山- on variables and structures so that we can work backwards from a desired effect to determine the right mutation to cause that to happen. For example, consider the simple list structure shown here 4 6001 SICP ②D
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 11.1.8 As we will see shortly, having mutation buys us some new power in our computational capabilities, but it also raises some new problems, which we want to highlight first. For example, suppose I create the list of 1 and 2, and give it the name a. I also give the name b to the same structure, and remember that the second define expression literally binds the name b to the value of a, which we know is the pointer to this box-andpointer structure. Remember that there is no new creation of pairs in this case, b literally points to the same structure. If I ask for the value of a or b, in each case I get the list (1 2). Slide 11.1.9 Now suppose that some time later, I evaluate the expression shown in red. This gets the value of a which is the list structure in the upper right. It gets the value of the second argument, which is just 10. It then takes the car part of the box pointed to by a, breaks the current pointer in that box, and creates a new pointer to the new value, 10. So what! Well, notice a subtle effect. If I ask for the value of b it has changed!. We get the value of b by tracing out the boxand-pointer structure, to get (10 2). Yet nowhere in my code is there an explicit expression changing b. In this simple case, we know there is a tie between a and b, but in more general cases you can see how trouble can arise. Given the ability to share data structures, and to mutate data structures, one piece of code could change a value affecting some other variable without realizing it. Slide 11.1.10 To use these mutators it is important to understand their affect on variables and structures, so that we can work backwards from a desired effect to determine the right mutation to cause that to happen. For example, consider the simple list structure shown here
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide lll Now let's suppose I want to change the structure to l Example 2: Pair/List Mutation form shown in red. What expression do I need to ev (define x (list cause this change? When you are ready to answer, next slide How mutate to achieve the result at right? 6001 sICP Example 2: Pair/List Mutation Slide l1.1. 12 Here is the answer, and let's reason through why. First, we ( define(a"b)x→-山 know that we want to change the car part of the second pair How mutate to achieve the in the list. So, we need to evaluate (cdr x) to get to the result at right? Hhi second pair, the one shown in blue. Since we want to change (set-car! (cdr c (1ist12)) the car part of this pair, we need to use set-car!And what should the new value be? Just the list(1 2), which we 1. Eval (cdr x) to get pair object know we can create directly 2. Change car pointer of 601 SICP Slide lll13 Sharing, Equivalence and Identity o it might occur to you to ask: " Given that different parts of a How can we tell if two things are equivalent? list structure can be changed by using mutation, how do we tell Well, what do you mean by"equivalent"? if two things are equivalent? We already saw that we could 1. The same object: test with eq? have two different names for the same structure. and thus (eq?ab)一># 2. Objects that look" the s test with equal? changing one name's value could cause the other name's value ?(1st12)(1ist12)) (eq?(1ist12)(1st12))=> to also change In fact, mutation actually causes us to first consider what equivalent means. If we want to know if two names point to exactly the same object, our test is eq? This returns true, if the values of the two arguments point to exactly the same 6m: SICP structure. Another way of saying that is that if we make some change to one structure, the corresponding change will be visible in the other structure. Thus eg? provides us ith the finest level of detail in testing equality On the other hand, if we just want to know if two objects"look the same", that is, they print out the same form, then we use equal?. Thus, the test using equal? return true because these two list expressions result in structures that look the same. But we know that each call to list causes a new pair to be create so in fact these two list expressions create different box-and-pointer structures, and are not eg? Thus, we have two different levels of granularity in deciding equivalence of structures
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 11.1.11 Now let's suppose I want to change the structure to have the form shown in red. What expression do I need to evaluate to cause this change? When you are ready to answer, go to the next slide. Slide 11.1.12 Here is the answer, and let's reason through why. First, we know that we want to change the car part of the second pair in the list. So, we need to evaluate (cdr x) to get to the second pair, the one shown in blue. Since we want to change the car part of this pair, we need to use set-car!. And what should the new value be? Just the list (1 2), which we know we can create directly. Slide 11.1.13 So it might occur to you to ask: "Given that different parts of a list structure can be changed by using mutation, how do we tell if two things are equivalent?" We already saw that we could have two different names for the same structure, and thus changing one name's value could cause the other name's value to also change. In fact, mutation actually causes us to first consider what equivalent means. If we want to know if two names point to exactly the same object, our test is eq?. This returns true, if the values of the two arguments point to exactly the same structure. Another way of saying that is that if we make some change to one structure, the corresponding change will be visible in the other structure. Thus eq? provides us with the finest level of detail in testing equality. On the other hand, if we just want to know if two objects "look the same", that is, they print out the same form, then we use equal?. Thus, the test using equal? return true because these two list expressions result in structures that look the same. But we know that each call to list causes a new pair to be create so in fact these two list expressions create different box-and-pointer structures, and are not eq?. Thus, we have two different levels of granularity in deciding equivalence of structures