6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology 6.001 Notes: Section 10.1 Slide 1o.1.1 Table: a set of bindings Over the past few lectures, we have introduced the concepts of data abstractions, types and aggregate structures. In this lecture we want to explore the idea of pulling those pieces together to explore data abstractions in more depth. to do this we will examine a couple of particular examples, but we want you to step away from these details to see how in design data structure we have a tradeoff to consider That tradeoff is between onstructing concrete structures that are very efficient but open to misuse versus abstract structures that nicely separate the implementation from its use but which may suffer in efficiency This may sound a bit odd. The typical instinct for a less experienced programmer is to opt for efficiency. But in fact, if one is writing code for a large system that will be used over extended periods of time and will involve components from a large number of coders, then robustness modularity and ease of use and maintenance are typically much more important that speed In this lecture we are going to do an initial exploration of these ideas. We are going to examine three different kinds of data structures that move from concrete to abstract. culminating in the creation of a table data abstraction We will also see how design choices influence the implementation of an abstract data structure, with different kinds of performance. And we will see that the fundamental issue in designing abstract data types is using this methodology to hide information, both in the types and in the code Slide 10.1.2 Table: a set of bindings So here is where we are headed. We would like to build an abstract binding: a pairing of a key and a value data structure of a table. Conceptually a table is just a collection of bindings, and we are free to consider different ways of structuring that collection. What's a binding? It is just a pairing of a key and a value, or in other words, the key tells us the entry into the table, or how to find the binding in the table and the value is the thing actually associated with that key
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. 6.001 Notes: Section 10.1 Slide 10.1.1 Over the past few lectures, we have introduced the concepts of data abstractions, types and aggregate structures. In this lecture, we want to explore the idea of pulling those pieces together to explore data abstractions in more depth. To do this we will examine a couple of particular examples, but we want you to step away from these details to see how in designing any data structure we have a tradeoff to consider. That tradeoff is between constructing concrete structures that are very efficient but open to misuse versus abstract structures that nicely separate the implementation from its use but which may suffer in efficiency as a consequence. This may sound a bit odd. The typical instinct for a less experienced programmer is to opt for efficiency. But in fact, if one is writing code for a large system that will be used over extended periods of time and will involve components from a large number of coders, then robustness, modularity and ease of use and maintenance are typically much more important that speed. In this lecture we are going to do an initial exploration of these ideas. We are going to examine three different kinds of data structures that move from concrete to abstract, culminating in the creation of a table data abstraction. We will also see how design choices influence the implementation of an abstract data structure, with different kinds of performance. And we will see that the fundamental issue in designing abstract data types is using this methodology to hide information, both in the types and in the code. Slide 10.1.2 So here is where we are headed. We would like to build an abstract data structure of a table. Conceptually a table is just a collection of bindings, and we are free to consider different ways of structuring that collection. What's a binding? It is just a pairing of a key and a value, or in other words, the key tells us the entry into the table, or how to find the binding in the table, and the value is the thing actually associated with that key
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 10.1.3 We can define the behavior that we want a table to have without Table: a set of bindings having to consider the specifics of the implementation. In binding a pairing of a key and a value particular, we want the following abstract interface between the user of a table and the actual table itself. .Abstract interface to a table First, we need a constructor for building a table. We will just call that constructor, make. we ignore the details of how make works . put! key value insert a new binding We can just use make as a black box to construct a table for us replaces any previous binding of that key We need a way of inserting a binding into a table. We will assume look up the key, return the corresponding value that inserting a binding replaces any previous binding for that key in the table. Note that we don 't specify whether that old binding is actually removed from the table, or only if the old binding will longer be found by procedures that search the table old bindings around)versus efficiency in time(how quickly we can add new bindings and whether keeping old Why is this relevant? Well, notice that there is likely to be a tradeoff between efficiency in space(whether we ke bindings in the table will affect the speed of searching the table for bindings) Note that at this stage of our design, we have simply specified that put should take a key and a value, create a binding of those two pieces of information, and install that binding into the table in such a way that searching the table for that key will find this binding and not any previous ones We will also need a way of getting values out of the table. Thus, we will have another abstract interface, called get that takes as input a key, looks up the key in the table, and returns the corresponding value. Note again that we have said nothing about how to implement this operation, only what behave we expect as a user of tables Slide 10.1. 4 Table: a set of bindings This is a very important point. The definition we have just binding: a pairing of a key and a value data type of a table. From the user's perspective, using these cr provided of an interface to a table really does define the abstrac interface abstractions is sufficient to allow them to use the table eate a new table The details(the code we are about to show will be an .put! key valu implementation of this abstract data type, but there can be many alternatives that also satisfy the contract inherent in this description ok up the key return the corresponding value of the behavior of bindings in a table .This definition Is the table abstract data type .Code shown later is an implementation of the ADT Slide 10.1.5 Before we start exploring different kinds of data structures that willExamples of using tables lead to the implementation of a table, let's first think about why People tables might be of value to a user An obvious example is keeping track of personnel information. For example, suppose you have just started a new "dot com" company You will want to keep track of your employees, and a table is an obvious way to store data about them. You might start with information keyed by the names of the employees
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 10.1.3 We can define the behavior that we want a table to have, without having to consider the specifics of the implementation. In particular, we want the following abstract interface between the user of a table and the actual table itself. First, we need a constructor for building a table. We will just call that constructor, make. We ignore the details of how make works. We can just use make as a black box to construct a table for us. We need a way of inserting a binding into a table. We will assume that inserting a binding replaces any previous binding for that key in the table. Note that we don't specify whether that old binding is actually removed from the table, or only if the old binding will no longer be found by procedures that search the table. Why is this relevant? Well, notice that there is likely to be a tradeoff between efficiency in space (whether we keep old bindings around) versus efficiency in time (how quickly we can add new bindings and whether keeping old bindings in the table will affect the speed of searching the table for bindings). Note that at this stage of our design, we have simply specified that put! should take a key and a value, create a binding of those two pieces of information, and install that binding into the table in such a way that searching the table for that key will find this binding and not any previous ones. We will also need a way of getting values out of the table. Thus, we will have another abstract interface, called get, that takes as input a key, looks up the key in the table, and returns the corresponding value. Note again that we have said nothing about how to implement this operation, only what behave we expect as a user of tables. Slide 10.1.4 This is a very important point. The definition we have just provided of an interface to a table really does define the abstract data type of a table. From the user's perspective, using these interface abstractions is sufficient to allow them to use the table. The details (the code we are about to show) will be an implementation of this abstract data type, but there can be many alternatives that also satisfy the contract inherent in this description of the behavior of bindings in a table. Slide 10.1.5 Before we start exploring different kinds of data structures that will lead to the implementation of a table, let's first think about why tables might be of value to a user. An obvious example is keeping track of personnel information. For example, suppose you have just started a new "dot com" company. You will want to keep track of your employees, and a table is an obvious way to store data about them. You might start with information keyed by the names of the employees
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 10.1.6 Examples of using tables Associated with each name key, we might have another table that People lists information about that person: their age, their salary, and so on. And associated with each of those pieces of information might be other tables, for example, the salary information for previous Slide 10.1.7 Not only do our table data abstractions need to be flexible enoughExamples of using tables to allow for other tables to be entries in a table, we may have other ways of gathering information together in a table. For example, we might have another table that lists employees by age Slide 10.1.8 Examples of using tables Clearly we can build a table where the keys are the ages of the employees. Here, however, the value associated with a key should be a list whose elements point to the entries in other tables that capture all the data about each employee So our tables need to be very flexible in terms of what information 6.001 Notes: Section 10.2
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 10.1.6 Associated with each name key, we might have another table that lists information about that person: their age, their salary, and so on. And associated with each of those pieces of information might be other tables, for example, the salary information for previous years. Slide 10.1.7 Not only do our table data abstractions need to be flexible enough to allow for other tables to be entries in a table, we may have other ways of gathering information together in a table. For example, we might have another table that lists employees by age. Slide 10.1.8 Clearly we can build a table where the keys are the ages of the employees. Here, however, the value associated with a key should be a list whose elements point to the entries in other tables that capture all the data about each employee. So our tables need to be very flexible in terms of what information they can store. 6.001 Notes: Section 10.2
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 10.2.1 Let's start by looking at the idea of pairing keys and values, as Traditional LISP structure: association list this is going to be the heart of a table implementation. Our goal A list where each element is a list of the key and value is to see how different choices of implementation influence the behavior of the table A traditional structure for doing this is called an a-list or an association list This is a list in which each element is itself a list of a key and a value, or in other words, it is a list of lists, all of which are two elements long Traditional LISP structure: association list So for example, if we want to represent this table(and notice that A list where each element is a list of the key and value this is an abstract version of a table that is. a binding of x to 15 and epresent the table y to 20)as an association list, we can do so as the illustrated list of sts. Each of the inner lists is a two-element list of a key and a as the alist: ((x 15)(y 20)) alue Slide 10.2.3 Or to do this a little more concretely, we could represent this Traditional LIsP structure: association list particular table with the illustrated box-and-pointer diagram. This. A list where each element is a list of the key and value illustrates the nice structure of this system, in which each element of the top level list is a binding, and each of the bindings is just a two-element list of a name and a binding as the alist: ((x 15)(y 20) Alist operation: find-assoc If we were to make the design choice to use a-lists as our (define (find-assoc key alist) representation for a table, then we just need to add a couple of (cond ((nu11?a1ist)静f) operations. First, we will need a way to find an entry for a key in ((equal? key (caar alist))(cadar alist)) the a-list and returning the associated value else (find-assoc key (cdr alist))))) Here is one way to do this. Notice how it walks down the list ne key against the first element of the first element of the list(this is what caar extracts, which you can check by looking back at the box-and- pointer structure). If we get to the end of the dicate the key is not present. No how we use equal? to check the equality of the key and the L element. And notice that when we do find a matching key, we use cadar to extract the second element of the first element which exactly the value associated with this ke
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 10.2.1 Let's start by looking at the idea of pairing keys and values, as this is going to be the heart of a table implementation. Our goal is to see how different choices of implementation influence the behavior of the table. A traditional structure for doing this is called an a-list or an association list. This is a list in which each element is itself a list of a key and a value, or in other words, it is a list of lists, all of which are two elements long. Slide 10.2.2 So for example, if we want to represent this table (and notice that this is an abstract version of a table, that is, a binding of x to 15 and y to 20) as an association list, we can do so as the illustrated list of lists. Each of the inner lists is a two-element list of a key and a value. Slide 10.2.3 Or to do this a little more concretely, we could represent this particular table with the illustrated box-and-pointer diagram. This illustrates the nice structure of this system, in which each element of the top level list is a binding, and each of the bindings is just a two-element list of a name and a binding. Slide 10.2.4 If we were to make the design choice to use a-lists as our representation for a table, then we just need to add a couple of operations. First, we will need a way to find an entry for a key in the a-list and returning the associated value. Here is one way to do this. Notice how it walks down the list, checking the key against the first element of the first element of the list (this is what caar extracts, which you can check by looking back at the box-and-pointer structure). If we get to the end of the list, we return false to indicate the key is not present. Notice how we use equal? to check the equality of the key and the element. And notice that when we do find a matching key, we use cadar to extract the second element of the first element, which is exactly the value associated with this key
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 10. 2.5 As an example, let's give a name to our simple alist(noticd the useAlist operation: find of the quoted list structure to create this list of lists ). If we then define (find-assoc key alist) evaluate (find-assoc 'y al)(note the use of the to get the (cond symbol y rather than any value associated with it)we get out the ((equal? key (caar alist))(cadar alist)) expected value of 20. Trace through the code and the box-and else (find-assoc key (cdr alist))))) pointer structure to convince yourself that this is correct (define al ' ((x 15)(y 20))) (find-assoc 'y al) >20 Slide 10.2.6 Alist operation: add-asso Adding a new entry into this a-list is pretty easy. We'll just"cons define (add-assoc key val alist it onto the front of the existing list. All we have to do is take the (cons (list key val) alist) key and value to be associated together, put them into a list, then put that list at the front of the existing a-list. Since the a-list is a list, consing something onto the front of the list, by the closure property, gives us a new list Note that in this implementation, we don't actually remove any prior binding of a key from the representation. We simply shadow the old binding with the new one, meaning that the procedure fo looking up a binding will always find the new one first, since it is I at the beginning of the list. So what does this do to the efficiency of find-assoc? Clearly the more things we add, the more things we have to search through to find a binding different from those added Slide 10.2.7 Nonetheless, this is a nice implementation. Thus, we can add a new|Alist operation:add-assoc binding to our old table, giving it a new name(which we need to lo because we need a way to refer to the table). Now, a 2 is a list (cons (list key val) alist)) of 3 elements, with the new binding of y at the front and our original binding further down the list (define a2 (add-assoc y 10 al)) Looking up the binding of y in this a-list will give us the value 10 2 Y10)(x15)y20)) that is, the new binding value (find-assoc y a2)=> 10 Slide 10.2.8 Alists are not an abstract data type This seems like a reasonable implementation, but let's think about this structure. In particular, this is not an abstract data type. And here is why
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 10.2.5 As an example, let's give a name to our simple alist (noticd the use of the quoted list structure to create this list of lists). If we then evaluate (find-assoc 'y a1) (note the use of the ' to get the symbol y rather than any value associated with it) we get out the expected value of 20. Trace through the code and the box-andpointer structure to convince yourself that this is correct. Slide 10.2.6 Adding a new entry into this a-list is pretty easy. We'll just "cons" it onto the front of the existing list. All we have to do is take the key and value to be associated together, put them into a list, then put that list at the front of the existing a-list. Since the a-list is a list, consing something onto the front of the list, by the closure property, gives us a new list. Note that in this implementation, we don't actually remove any prior binding of a key from the representation. We simply shadow the old binding with the new one, meaning that the procedure for looking up a binding will always find the new one first, since it is at the beginning of the list. So what does this do to the efficiency of find-assoc? Clearly the more things we add, the more things we have to search through to find a binding different from those added. Slide 10.2.7 Nonetheless, this is a nice implementation. Thus, we can add a new binding to our old table, giving it a new name (which we need to do because we need a way to refer to the table). Now, a2 is a list of 3 elements, with the new binding of y at the front and our original binding further down the list. Looking up the binding of y in this a-list will give us the value 10 that is, the new binding value. Slide 10.2.8 This seems like a reasonable implementation, but let's think about this structure. In particular, this is not an abstract data type. And here is why: