6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology 6.001 Notes: Section 13.1 Slide 13.1.1 The role of abstractions In this lecture, we are going to look at a very different style of creating large systems, a style called object oriented programming. This style focuses on breaking systems up in a different manner than those we have seen before To set the stage for this, we are first going to return to the notion of abstractions, and use that idea to see how we can capture objects with some internal state that reflects the status of those objects. We are going to be led from there to a style of programming called message-passing in which we treat systems as if they consist of large collections of objects that 60015e communicate with one another to cause computation to take The role of abstractions Slide 13.1.2 Let's start by going back and thinking about the tools we have Data abstraction developed for thinking about computation. Two of the key tools we have developed dealt with abstractions We have seen procedural abstractions here the idea is to capture a common pattern of processing into a procedure, then isolate the details of the computation from the use of the computation, by simply naming the procedure and using that name with appropriate conditions on the procedure's input. We saw that this style of approach is particularly useful when dealing with problems that are easily addressed in a functional programming approach, that is, where we can treat the dures as generalized mathematical function that their output for a given input will be the same whenever we evaluate it We have also seen data abstractions. Here the idea is to modularize our system by creating data structures that capture key parts of the information we need to handle. The goal is to hide the details of the representation and storage of the data behind standard interfaces, primarily our constructors and selectors. This means the user can then manipulate data objects without having to worry about details of how they are maintained As you might expect, often the data abstractions and the procedural abstractions work hand-in-hand, with the procedures used to manipulate the data using the data abstraction interfaces, and with the structure of the procedures tending to mirror the actual structure of the data
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. 6.001 Notes: Section 13.1 Slide 13.1.1 In this lecture, we are going to look at a very different style of creating large systems, a style called object oriented programming. This style focuses on breaking systems up in a different manner than those we have seen before. To set the stage for this, we are first going to return to the notion of abstractions, and use that idea to see how we can capture objects with some internal state that reflects the status of those objects. We are going to be led from there to a style of programming called message-passing in which we treat systems as if they consist of large collections of objects that communicate with one another to cause computation to take place. Slide 13.1.2 Let's start by going back and thinking about the tools we have developed for thinking about computation. Two of the key tools we have developed dealt with abstractions. We have seen procedural abstractions. Here the idea is to capture a common pattern of processing into a procedure, then isolate the details of the computation from the use of the computation, by simply naming the procedure and using that name with appropriate conditions on the procedure's input. We saw that this style of approach is particularly useful when dealing with problems that are easily addressed in a functional programming approach, that is, where we can treat the procedures as generalized mathematical functions, meaning that their output for a given input will be the same whenever we evaluate it. We have also seen data abstractions. Here the idea is to modularize our system by creating data structures that capture key parts of the information we need to handle. The goal is to hide the details of the representation and storage of the data behind standard interfaces, primarily our constructors and selectors. This means the user can then manipulate data objects without having to worry about details of how they are maintained. As you might expect, often the data abstractions and the procedural abstractions work hand-in-hand, with the procedures used to manipulate the data using the data abstraction interfaces, and with the structure of the procedures tending to mirror the actual structure of the data
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 13.1.3 The goal in each case is actually the same: we want to hide The role of abstractions details of the abstraction so that we can treat complex things as if they are primitive units. In the case of procedural abstractions, we want to hide the details of the computation, Goal: treat complex things as primitives, and hide details and treat the procedure as a primitive computational unit. In the case of data abstractions we want to hide the details of how components are glued together, and treat each unit as an abstract collection of parts 6001 sICP The role of abstractions Slide 13.1. 4 Procedural abstractions Given that we want to use abstractions as a tool in controlling Data abstractions complexity in large systems, there are several questions that come up when thinking about how to use abstractions. The first Goal: treat complex things as primitives, and hide details is: what is the best way to break a new problem area up into a set of modules? Both data modules and procedure modules? As ve have already seen in earlier lectures, some problems break Questions: How easy is it to em into abstraction modules? up in multiple ways, and breaking them up in different ways makes some processes easier and others harder so a ke Adding new question is: How do I use the idea of abstraction to break Adding new 601 SICP systems into modules and what s the best way to do this? The second question deals with how easy it is to extend the system. If I want to add new data types to my system, is that easy? If I want to add new methods to my system, new ways of manipulating data types, is that easy? We have seen several examples of this already, we are now going to return to these questions in order to lead to a very different way of breaking systems up into convenient sized chunks Slide 13. 1.5 Let's start by going back to data objects and data abstractions One view of Data Here is the traditional way of looking at data, at least as we Some complex structure constructed from cons cells have done thi Explicit tags to ke of data types Implement a data abstraction as set of procedures that operate on First, we build some complex data structure out of primitives the data for example, cons cells or pairs. Second, we use tags to identify the type of structure being represented This tells us how to interpret the different slots in the list structure. For example, is the car of the list structure the name of a person or his batting average or his GPA? Then, the data abstraction is actually built by creating a set of 4 procedures that operate on the data. These are procedures that take in instances of the data, use selectors to get out the pieces, do some manipulation to create new pieces, and then use the constructor to re-glue the abstraction back together. This led to the concept of data-directed ogramming, which we saw earlier. We use the tag to determine the right set of procedures to apply. and this allows the user to program in a generic fashion. They can focus on what they want to do, but have the code direct the data to the right place for the actual work
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 13.1.3 The goal in each case is actually the same: we want to hide details of the abstraction so that we can treat complex things as if they are primitive units. In the case of procedural abstractions, we want to hide the details of the computation, and treat the procedure as a primitive computational unit. In the case of data abstractions, we want to hide the details of how components are glued together, and treat each unit as an abstract collection of parts. Slide 13.1.4 Given that we want to use abstractions as a tool in controlling complexity in large systems, there are several questions that come up when thinking about how to use abstractions. The first is: what is the best way to break a new problem area up into a set of modules? Both data modules and procedure modules? As we have already seen in earlier lectures, some problems break up in multiple ways, and breaking them up in different ways makes some processes easier and others harder. So a key question is: How do I use the idea of abstraction to break systems into modules and what’s the best way to do this? The second question deals with how easy it is to extend the system. If I want to add new data types to my system, is that easy? If I want to add new methods to my system, new ways of manipulating data types, is that easy? We have seen several examples of this already, we are now going to return to these questions in order to lead to a very different way of breaking systems up into convenient sized chunks. Slide 13.1.5 Let's start by going back to data objects and data abstractions. Here is the traditional way of looking at data, at least as we have done things so far. First, we build some complex data structure out of primitives, for example, cons cells or pairs. Second, we use tags to identify the type of structure being represented. This tells us how to interpret the different slots in the list structure. For example, is the car of the list structure the name of a person or his batting average or his GPA? Then, the data abstraction is actually built by creating a set of procedures that operate on the data. These are procedures that take in instances of the data, use selectors to get out the pieces, do some manipulation to create new pieces, and then use the constructor to re-glue the abstraction back together. This led to the concept of data-directed programming, which we saw earlier. We use the tag to determine the right set of procedures to apply. And this allows the user to program in a generic fashion. They can focus on what they want to do, but have the code direct the data to the right place for the actual work
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 13.1.6 One view of Data Here is a simple example to illustrate this point. Suppose I have a set of different geometric objects, things like numbers, lines Some complex structure constructed from cons cells shapes, and I want to write a procedure, or an operation, that Explicit tags to keep track of data types mplement a data abstraction as set of procedures that will scale each of those objects by some amount. Then a perate on the data generic operation, under the data-directed approach would look Generic"operations by looking at types like this procedure shown here. Given an object and my desired scale factor, I use the type of the object to dispatch: if it is a cond ((number? x)(* x factor)) number, I just multiply; if it is a line, I ship it to the procedure that will scale a line and so on (else (error unknown type)))) The point of this example is that I think about things in terms of 6 001 SICP I the kinds of objects I have and procedures for manipulating each distinct object type. I use the tag or the type of the object to tell me which procedure to send the object to Slide 13.17 Dispatch on Type So now let's go back to our questions. How easy is it to extend such a system, a system where we are breaking things up into Adding new data types Must change every generic operation tagged data, and using data directed programming? First, if we add a new data type to our system, what do we have to do? Adding new methods Just create generic operations Well we can see from our example that we will have to all the procedures like scale, to add a new clause to each cond dispatching on that new type of object. As a consequence, if there are many such procedures, we have a lot of changes to nake, both a great deal of code to write, and more importantly aking sure that we change all the relevant procedures If we add a new operation or method what do we need to do? This is easier, as we just need to develop a subprocedure for each type of object to which the method will apply Thus in this style of programming, adding a new data type is painful, while adding a new method is reasonable. As when the changes are mostly new methods or operations, or when the different kinds of data structures in the or a consequence, this approach to modularizing systems works best when there are only a few data abstractions system are mostly independent of one another. In those cases, this style of approach works well. But not everything fits these cases What should we do in those cases? Slide 13.1.8 Adding new data types So let's step back from this organization for a second. One way Must change every generic operation to think about structuring a large system is to realize that we are Must keep names distinct Adding new methods likely to have a large number of different data objects( Just create generic operations stances of data abstractions ) and a large number of operations we want to perform on those objects. Conceptually ata type 1 Data type 2 Data type 3 Data tyt this means we have a big table, where we can use a different operation 1 Some proc Some proc some proc Some proc row for each operation we want to perform, and a different peration 2 Some proe Some proc Some proc Some proc column for each kind of data abstraction we have. Then at each peration a some proc some proc some proc some proc element in this table, we can conceptualize having a specific Operation 4 Some proc Some proe Some proc Some proe procedure, intended to perform the particular operation(e.g scaling )on the particular kind of data object(e. g. a number)
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 13.1.6 Here is a simple example to illustrate this point. Suppose I have a set of different geometric objects, things like numbers, lines, shapes, and I want to write a procedure, or an operation, that will scale each of those objects by some amount. Then a generic operation, under the data-directed approach would look like this procedure shown here. Given an object and my desired scale factor, I use the type of the object to dispatch: if it is a number, I just multiply; if it is a line, I ship it to the procedure that will scale a line, and so on. The point of this example is that I think about things in terms of the kinds of objects I have and procedures for manipulating each distinct object type. I use the tag or the type of the object to tell me which procedure to send the object to. Slide 13.1.7 So now let's go back to our questions. How easy is it to extend such a system, a system where we are breaking things up into tagged data, and using data directed programming? First, if we add a new data type to our system, what do we have to do? Well we can see from our example that we will have to all the procedures like scale, to add a new clause to each cond, dispatching on that new type of object. As a consequence, if there are many such procedures, we have a lot of changes to make, both a great deal of code to write, and more importantly making sure that we change all the relevant procedures. If we add a new operation or method, what do we need to do? This is easier, as we just need to develop a subprocedure for each type of object to which the method will apply. Thus in this style of programming, adding a new data type is painful, while adding a new method is reasonable. As a consequence, this approach to modularizing systems works best when there are only a few data abstractions or when the changes are mostly new methods or operations, or when the different kinds of data structures in the system are mostly independent of one another. In those cases, this style of approach works well. But not everything fits these cases. What should we do in those cases? Slide 13.1.8 So let's step back from this organization for a second. One way to think about structuring a large system is to realize that we are likely to have a large number of different data objects (or instances of data abstractions), and a large number of operations we want to perform on those objects. Conceptually, this means we have a big table, where we can use a different row for each operation we want to perform, and a different column for each kind of data abstraction we have. Then at each element in this table, we can conceptualize having a specific procedure, intended to perform the particular operation (e.g. scaling) on the particular kind of data object (e.g. a number)
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 13.1.9 One way of actually building such a system is to focus on the Dispatch on Type rows of the table, that is the operations. Indeed, our use of tagged data was based around this viewpoint, in which we Must change every generic operation Must keep names distinct created generic operations that handle the same operation for Adding new methods different data objects, and used the tag on the data object to Just create generic operations dispatch to the appropriate version of the procedure to handle that kind of data Data type 1 Data type 2 Data type 3 Data type 4 oc Some pro Some Operation 4 Some proc Some Some proc Some proc Dispatch on type Slide 13.1.10 Adding new data types But given this table, there is an alternative possible Must change every generic operation organization, which is around the columns of the table. This Must keep names distinct would focus on creating a generic data object that would know Adding new methods Just create generic operations how to handle different operations on that kind of data structure Data ty ata type 3Data type 4 proc Some proc me proc Some pr operation 4 Some proc ome proc/ Some proc Some pro Slide 13.1.1 Let's step back and rethink data. This sounds like an odd thing An Alternative View of Data Procedures with state to do but let's think about data in a very different way. Rather than thinking of data abstractions as some slots into which we A procedure has can put things, let's instead consider data to be a procedure with parameters and body as specified by n expression environment (which can hold name-value bindings some internal state This sounds strange! But, what is a procedure? It really has two parts: it has a set of parameters and a body which define the pattern of computation to perform as a function of the objects d in and as we saw in the environment model it has an associated environment which can hold name-value bindings, 4 60015e that is, pall of names and values
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 13.1.9 One way of actually building such a system is to focus on the rows of the table, that is the operations. Indeed, our use of tagged data was based around this viewpoint, in which we created generic operations that handle the same operation for different data objects, and used the tag on the data object to dispatch to the appropriate version of the procedure to handle that kind of data. Slide 13.1.10 But given this table, there is an alternative possible organization, which is around the columns of the table. This would focus on creating a generic data object that would know how to handle different operations on that kind of data structure. Slide 13.1.11 Let's step back and rethink data. This sounds like an odd thing to do but let's think about data in a very different way. Rather than thinking of data abstractions as some slots into which we can put things, let's instead consider data to be a procedure with some internal state. This sounds strange! But, what is a procedure? It really has two parts: it has a set of parameters and a body which define the pattern of computation to perform as a function of the objects passed in; and as we saw in the environment model, it has an associated environment which can hold name-value bindings, that is, pairings of names and values
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 13.1.12 An Alternative view of data: So what, you say! Well, we can use this idea to captu Procedures with State information about a data structure. In particular, we can use a procedure to represent data objects with state. What would that A procedure has parameters and body as specified by a expression mean? It would say that we could use the local environment of environment (which can hold name-value bindings the procedure plus its parameters to hold the values of the late(and hide) data, and datum, and we could create local procedures within the data procedure to manipulate these values, to change the state of the Need access to that environment ructor, accessors, mutators, predicates This means that the only access to the values of the data object ion: changes in the private state of the will be through the procedure representing the data. This would 6 001 SICP I nicely encapsulate the data structure inside this procedure This probably still sounds odd so let's look at a specific 6.001 Notes: Section 13.2 Slide 13.2.1 Example: Pair as a Procedure with State To illustrate this idea of using a procedure to represent a data structure,an object with state, let's look at the following, rather(define (cons x y) dd, example Here is a very different way of implementing a cond《(eq?mg'cAR)x) ((eq? m84 'CDR) y) cons cell or a pair. Let me stress that this is not the way that 《eq?m8 f PAIR?)# else (error pair cannot msg))))) Scheme normally represents pairs. Of course, the idea of data abstraction is that the actual implementation of a data structure should be irrelevant to the user. This example is used to drive home a conceptual point Here, we have implemented a pair as a procedure! Thus our undamental data structure is now a procedure rather than some 60015c storage in memory slots Example: Pair as a Procedure with State Slide 13.2.2 Look at this carefully. First, note that cons, as defined here involves two lambdas Remem ber that there is a hidden ((eq? m84CDR lambda inside the syntactic sugar of this definition. This ((eqr? msg 'PAIR?) #t) means that there is a second l ambda as the body of the cons and thus when we evaluate(cons x y)using this pa ntation, we get value. a procedure of one parameter, ms g So what does this say? It says that when we use cons with this implementation our representation for our fundamental way of gluing things together is now a procedure of one argument So what would that cons thing do? Since it is a procedure, if we send it a value, or if we apply the procedure to a single argument, note what it does. It uses the value of the argument, in this case a particular symbol, to decide
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 13.1.12 So what, you say! Well, we can use this idea to capture information about a data structure. In particular, we can use a procedure to represent data objects with state. What would that mean? It would say that we could use the local environment of the procedure plus its parameters to hold the values of the datum, and we could create local procedures within the data procedure to manipulate these values, to change the state of the object. This means that the only access to the values of the data object will be through the procedure representing the data. This would nicely encapsulate the data structure inside this procedure. This probably still sounds odd so let's look at a specific example. 6.001 Notes: Section 13.2 Slide 13.2.1 To illustrate this idea of using a procedure to represent a data structure, an object with state, let's look at the following, rather odd, example. Here is a very different way of implementing a cons cell or a pair. Let me stress that this is not the way that Scheme normally represents pairs. Of course, the idea of data abstraction is that the actual implementation of a data structure should be irrelevant to the user. This example is used to drive home a conceptual point. Here, we have implemented a pair as a procedure! Thus our fundamental data structure is now a procedure rather than some storage in memory slots. Slide 13.2.2 Look at this carefully. First, note that cons, as defined here, involves two lambdas. Remember that there is a hidden lambda inside the syntactic sugar of this definition. This means that there is a second lambda as the body of the cons and thus when we evaluate (cons x y) using this particular implementation, we get back as a value, a procedure of one parameter, msg. So what does this say? It says that when we use cons with this implementation our representation for our fundamental way of gluing things together is now a procedure of one argument. So what would that cons thing do? Since it is a procedure, if we send it a value, or if we apply the procedure to a single argument, note what it does. It uses the value of the argument, in this case a particular symbol, to decide