6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology The concept of a tag Slide 9.1.16 Tagged data This simple example thus motivates the key idea we are going to attach an identifying symbol to all nontrivial data values explore in this lecture: by putting tags on data structures, we can identify the right operations to apply to that data structure. In define ( make-poant x y) (ist "point x y) fact, a careful programmer would always put tags on data system, and to provide a means verifying that correct operations are being applied to data Thus, tagged data now refers to the concept of attaching an ing label to all non-trivial data types in my 6001 SICP convention, we will try to always check the label on the data structure before applying any procedures to manipulate those structures Slide 9.1.17 Benefits of tagged data As we will see, there are two key reasons for wanting to use tagged data. The first is that it makes available to use the functions that decide what to do based on the arguments owerful idea of data directed programming. Here, the idea to let the type of an object direct the procedure to the right cample: in a graphics program area: triangle square -> number method to apply to that type of object. This means that our style will be to write procedures that look at the type of the argument, and use that information to apply a procedure specifically tuned to that type of object. This, in fact, is exactly what we just did ith complex numbers. In that case, the selectors used the data type to direct the system to the right method 6001 sICP another example, suppose you are writing a graphics program, and want to be able to compute the area of a figure Rather than writing one giant procedure to do this for all types of figures, we could let the type of the figure(a triangle, a square, some other form) direct the procedure to a subprocedure specifically designed for that type of figure. Such an approach leads to very modular code, which is much easier to modify and maintain Benefits of tagged data Slide 91.18 data-directed programming The second key reason is that this approach allows us to practice functions that decide what to do based on the arguments defensive programming. We want to be careful to ensure that we don't have unwarranted assumptions about inputs to our area: triangle square l circle - number procedures, and in particular, we want our procedures to fail gracefully when they unexpected receive inputs of the wrong type. As we saw in the previous example, when we created a selector for a complex number, we did exactly that. I checked .much better to give an error message than for specific types of objects, specifying the method to use, but if I did not recognize the type of the object as something I was set to handle, I returned an error message with appropriate debug code if the error doesn't show up until several stages of additional processing have been applied to thay s 6001 SICP information. The point is that by handling unknown types rather than assuming something about a data object, we catch errors before they can propagate. It is much harder to incorrect data type
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 9.1.16 This simple example thus motivates the key idea we are going to explore in this lecture: by putting tags on data structures, we can identify the right operations to apply to that data structure. In fact, a careful programmer would always put tags on data structures, exactly to provide the flexibility to extend his or her system, and to provide a means verifying that correct operations are being applied to data. Thus, tagged data now refers to the concept of attaching an identifying label to all non-trivial data types in my system. By convention, we will try to always check the label on the data structure before applying any procedures to manipulate those structures. Slide 9.1.17 As we will see, there are two key reasons for wanting to use tagged data. The first is that it makes available to use the powerful idea of data directed programming. Here, the idea is to let the type of an object direct the procedure to the right method to apply to that type of object. This means that our style will be to write procedures that look at the type of the argument, and use that information to apply a procedure specifically tuned to that type of object. This, in fact, is exactly what we just did with complex numbers. In that case, the selectors used the data type to direct the system to the right method. As another example, suppose you are writing a graphics program, and want to be able to compute the area of a figure. Rather than writing one giant procedure to do this for all types of figures, we could let the type of the figure (a triangle, a square, some other form) direct the procedure to a subprocedure specifically designed for that type of figure. Such an approach leads to very modular code, which is much easier to modify and maintain. Slide 9.1.18 The second key reason is that this approach allows us to practice defensive programming. We want to be careful to ensure that we don't have unwarranted assumptions about inputs to our procedures, and in particular, we want our procedures to fail gracefully when they unexpected receive inputs of the wrong type. As we saw in the previous example, when we created a selector for a complex number, we did exactly that. I checked for specific types of objects, specifying the method to use, but if I did not recognize the type of the object as something I was set to handle, I returned an error message with appropriate information. The point is that by handling unknown types here, rather than assuming something about a data object, we catch errors before they can propagate. It is much harder to debug code if the error doesn't show up until several stages of additional processing have been applied to that incorrect data type
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 9.1. 19 To summarize we have introduced the idea of tagged data Benefits of tagged data types, and used that to consider clean ways to modularize data-directed program systems. We now want to explore in more detail how using functions that decide what to do based on the arguments tagged data makes it easier to build large systems ole: in a graphics program functions that fail gracefully if given bad arguments . much better to give an error message that 6001 SICP 6.001 Notes: Section 9. 2 Slide 9.2.1 Example: Arithmetic evaluation To show how to use these two ideas: data directed programming and defensive programming, the rest of this lecture is going to (define expl (make-sum (make-sum 3 15)20) consider an extended example. Because this example involves expl 仆+(+315}20 fair amount of code, it is important to keep in mind the key points being illustrated, so you don,'t lose sight of them amidst all the code fragments. To make this easier, we are going to incremental build new versions of the example on top of simpler versions, thus highlight key ideas and changes The example we are going to use to illustrate the ideas of data directed programming and defensive programming is a system 6001 sICP to manipulate arithmetic expressions. This will be similar to our example of symbolic derivatives, but now applying to more general expressions. So what does this mean? Not only do i want to be able to create symbolic arithmetic expressions, using appropriate constructors, I also want to be able to reduce those expressions to simpler forms, whenever possible. Thus, I would like to create symbolic expressions, such as exp l, as shown. Thus the value of the exp l will be the actual expression. And, I want to create a system that can evaluate expressions such as expl, that is, reduce this expression to a simpler form, in this case, the simpler expression 3 8
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 9.1.19 To summarize, we have introduced the idea of tagged data types, and used that to consider clean ways to modularize systems. We now want to explore in more detail how using tagged data makes it easier to build large systems. 6.001 Notes: Section 9.2 Slide 9.2.1 To show how to use these two ideas: data directed programming and defensive programming, the rest of this lecture is going to consider an extended example. Because this example involves a fair amount of code, it is important to keep in mind the key points being illustrated, so you don't lose sight of them amidst all the code fragments. To make this easier, we are going to incremental build new versions of the example on top of simpler versions, thus highlight key ideas and changes. The example we are going to use to illustrate the ideas of data directed programming and defensive programming is a system to manipulate arithmetic expressions. This will be similar to our example of symbolic derivatives, but now applying to more general expressions. So what does this mean? Not only do I want to be able to create symbolic arithmetic expressions, using appropriate constructors, I also want to be able to reduce those expressions to simpler forms, whenever possible. Thus, I would like to create symbolic expressions, such as exp1, as shown. Thus the value of the exp1 will be the actual expression. And, I want to create a system that can evaluate expressions such as exp1, that is, reduce this expression to a simpler form, in this case, the simpler expression 38
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 9.2.2 Example: Arith metic evaluation Building a system to evaluate expressions such as those just shown would be interesting in its own right, but I would like to (define expl m"m+t3120 add more to my system. In particular, I want my system to not (eval-1 expl) ==>38 only simplify standard arithmetic expressions, but also expressions that involve numbers that are only known to lie Expressions might include values other than numbers within a particular range. For example, I want my system to be e unknown number between min and may able to simplify expressions where the numbers are only known arithmetic:3,7]+[13]=[4,10] to lie between specific bounds of uncertainty. In the example Limited pr shown, I may only know that one number lies between 3 and 7, some value t some error amount rithmetic:(100±1)+(3±0.5)=(103±1.5) and another number lies between I and 3, but I still want my OOt sIcP system to be able to simplify an addition involving these numbers whose result i know must lie between 4 and 10 Moreover, I would like my system to be able to deal with expressions involving data from scientific experiments, in which I know the ostensible value of a number and a specified range of precision, that is, that the true value may lie within some plus/minus range of the ostensible value. In the example shown, I want my system to be able to add together expressions where I know one number is 100 plus/minus 1 and the other number is 3 plus/minus 0.5, and ave the system deduce that the sum should be 103 plus/minus 1.5 Thus, my goal is to build an arithmetic evaluation system that reduces arithmetic expressions, consisting of a mix of symbols and numbers, to simplest form; where the expressions may be standard numbers, ranges of values or imited precision values Slide 9.2.3 Approach: start simple, then extend The basic approach we are going to take is to start with easy Characteristic of all software engineering projects things first, an obvious thing to do! In our case, we will first tart with eval for numbers, then add support for build a system to handle normal numbers, and then we will look at how to build on top of that to handle other expressions, like ranges and limited precision values This does sound obvious, but it is important to stress that this is an important characteristic of a well-designed software engineering project, (and one that all too often is not followed, even in commercial systems). It is almost al ways easier to extend a base system, than to try to do the whole thing at once. 4 This is an important lesson to learn, and one that hopefully you will see in this extended example Approach: start simple, then extend Slide 9.2.4 Characteristic of all software engineering projects One of the things to watch for as we go through this exercise is Start with eval for numbers, then add support fo ranges and limited-precision values to notice how, by doing the development in stages, we make extensions to the system much easier to conceptualize Indee -Goal: build eval in a way that it will extend easily safely our goal is to build our simple evaluator so that extensions are Easily: requires data-directed pro safely requires defensive programming both easy and safe So what does that mean To allow for easy extensions, we will need to utilize tools from data-directed programming. That is, if we use data-directed programming tools(e.g. tagged data, dispatch to methods based on tag types), we will see it is easy to 6001 SICP I add new types of expressions to our system(create a new structure, an appropriate tag, and a dispatch to a method to handle that new type) To allow for safe extensions, we will need to use methods from defensive programming. This will require explicit
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 9.2.2 Building a system to evaluate expressions such as those just shown would be interesting in its own right, but I would like to add more to my system. In particular, I want my system to not only simplify standard arithmetic expressions, but also expressions that involve numbers that are only known to lie within a particular range. For example, I want my system to be able to simplify expressions where the numbers are only known to lie between specific bounds of uncertainty. In the example shown, I may only know that one number lies between 3 and 7, and another number lies between 1 and 3, but I still want my system to be able to simplify an addition involving these numbers, whose result I know must lie between 4 and 10. Moreover, I would like my system to be able to deal with expressions involving data from scientific experiments, in which I know the ostensible value of a number and a specified range of precision, that is, that the true value may lie within some plus/minus range of the ostensible value. In the example shown, I want my system to be able to add together expressions where I know one number is 100 plus/minus 1 and the other number is 3 plus/minus 0.5, and have the system deduce that the sum should be 103 plus/minus 1.5. Thus, my goal is to build an arithmetic evaluation system that reduces arithmetic expressions, consisting of a mix of symbols and numbers, to simplest form; where the expressions may be standard numbers, ranges of values or limited precision values. Slide 9.2.3 The basic approach we are going to take is to start with easy things first, an obvious thing to do! In our case, we will first build a system to handle normal numbers, and then we will look at how to build on top of that to handle other expressions, like ranges and limited precision values. This does sound obvious, but it is important to stress that this is an important characteristic of a well-designed software engineering project, (and one that all too often is not followed, even in commercial systems). It is almost always easier to extend a base system, than to try to do the whole thing at once. This is an important lesson to learn, and one that hopefully you will see in this extended example. Slide 9.2.4 One of the things to watch for as we go through this exercise is to notice how, by doing the development in stages, we make extensions to the system much easier to conceptualize. Indeed, our goal is to build our simple evaluator so that extensions are both easy and safe. So what does that mean? To allow for easy extensions, we will need to utilize tools from data-directed programming. That is, if we use data-directed programming tools (e.g. tagged data, dispatch to methods based on tag types), we will see it is easy to add new types of expressions to our system (create a new structure, an appropriate tag, and a dispatch to a method to handle that new type). To allow for safe extensions, we will need to use methods from defensive programming. This will require explicit