6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology 6.001 Notes: Section 15.1 Slide 15.1.1 Why do we need an interpreter? Our goal over the next few lectures is to build an interpreter which in a very basic sense is the ultimate in programming, since doing so will allow us to define our language This is a somewhat surprising statement. But, in fact, as we will see through these lectures, it really is correct. The reason it is correct is the following Every expression we write in our language has a meaning associated with it. Deducing the meaning associated with an expression is the process of evaluation. And therefore, if we can define the program that executes that deduction for us, our definition of that program (i.e. our definition of the interpreter) provides for us the exact specification of what's legal in our anguage and how to deduce meanings of expressions within our language Our goal is to work through this understanding in stages. We will explore a series of examples, initially some very simple examples, but culminating in a full-scale Scheme evaluator. We are going work our way up to this, letting you see how to add the different pieces into an interpreter Before you proceed, however, there is a code handout that goes with this lecture. I suggest that you stop, go back to the web page, and print out a copy of the code to have next to you as we go through the lecture Why do we need an interpreter? Slide 15.1.2 First, let's set the stage for why we want to do this, why do we modules t want to build an interpreter? Why do we need an interpreter? Think about what we have seen so far in this course we have spent a lot of time talking about, and using, the ideas of abstraction, both procedural and data. We know that this is a powerful idea, as it lets us bury details, and it certainly suppor the expression of ideas at an appropriate level. Said another way, we can think about what we want to do and separate that from the details of how to actually make it happen
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. 6.001 Notes: Section 15.1 Slide 15.1.1 Our goal over the next few lectures is to build an interpreter, which in a very basic sense is the ultimate in programming, since doing so will allow us to define our language. This is a somewhat surprising statement. But, in fact, as we will see through these lectures, it really is correct. The reason it is correct is the following: Every expression we write in our language has a meaning associated with it. Deducing the meaning associated with an expression is the process of evaluation. And therefore, if we can define the program that executes that deduction for us, our definition of that program (i.e. our definition of the interpreter) provides for us the exact specification of what's legal in our language and how to deduce meanings of expressions within our language. Our goal is to work through this understanding in stages. We will explore a series of examples, initially some very simple examples, but culminating in a full-scale Scheme evaluator. We are going work our way up to this, letting you see how to add the different pieces into an interpreter. Before you proceed, however, there is a code handout that goes with this lecture. I suggest that you stop, go back to the web page, and print out a copy of the code to have next to you as we go through the lecture. Slide 15.1.2 First, let's set the stage for why we want to do this, why do we want to build an interpreter? Why do we need an interpreter? Think about what we have seen so far in this course. We have spent a lot of time talking about, and using, the ideas of abstraction, both procedural and data. We know that this is a powerful idea, as it lets us bury details, and it certainly supports the expression of ideas at an appropriate level. Said another way, we can think about what we want to do and separate that from the details of how to actually make it happen
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 15.1.3 But eventually, we need a process for unwinding all of those Why do we need an interpreter? abstractions, to get the value that corresponds to an expression's. Abstractions let us bury details and focus on use of meaning. That means we need to implement semantics of modules to sole large systems Interpreting expressions Need to unwind abstractions at execution time to deduce Why do we need an interpreter? Slide 15.1. 4 Abstractions let us bury details and focus on use of Notice the words I used. I said, "we need a process for modules to sole large systems unwinding those abstractions".If we can have such a process, Need to unwind abstractions at execution time to deduce hen we should be able to describe it in a procedure: that is our meaning language for describing processes Have seen such a process- Environment Model In fact, you have already seen a version of this description, just not as a procedure. what was the description?.the ow want to describe that process as a procedure environment model! If you think about it that makes sense. The environment model just described the process for how to determine a meaning I associated with an expression, which in turn just unwrapped the abstractions down to the primitive operations. Today, we want to talk about how to actually build an evaluator as a procedure rather than as that abstract environment model Slide 15.1 Stages of an interpreter First, what are the stages of an interpreter? For the kind of Lexical analy? languages we are considering, Scheme or Lisp like languages, typically there are five stages. There is a lexical analyzer, a Parse parser, an evaluator that typically works hand-in-hand with an environment, and there is a printer Evaluator Let's look at what goes on in each of these at a high level, before we build them evironment
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 15.1.3 But eventually, we need a process for unwinding all of those abstractions, to get the value that corresponds to an expression’s meaning. That means we need to implement semantics of interpreting expressions. Slide 15.1.4 Notice the words I used. I said, "we need a process for unwinding those abstractions". If we can have such a process, then we should be able to describe it in a procedure: that is our language for describing processes. In fact, you have already seen a version of this description, just not as a procedure. What was the description? ... the environment model! If you think about it, that makes sense. The environment model just described the process for how to determine a meaning associated with an expression, which in turn just unwrapped the abstractions down to the primitive operations. Today, we want to talk about how to actually build an evaluator as a procedure rather than as that abstract environment model. Slide 15.1.5 First, what are the stages of an interpreter? For the kind of languages we are considering, Scheme or Lisp like languages, typically there are five stages. There is a lexical analyzer, a parser, an evaluator that typically works hand-in-hand with an environment, and there is a printer. Let's look at what goes on in each of these at a high level, before we build them
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 15.1.6 Stages of an interpreter Linput to each stage To do that, let's talk about the input and output characteristics Lexical analyzer of each of them By focusing on the input to each successive stage in the interpretation, we can get a sense of what should happen at each stage, and thus get a sense of how to build an interpreter Eva1uat。r Environmen p⊥nte Slide 15.1.7 Stages of an interpreter input to each stage The initial input is a string of characters, which represents the Lexical analyzer) "(average 4(+55)) typewritten version of the expression we want to evaluate. Th is exactly the thing that we would type in at a terminal if we Parser wanted to have an expression evaluated So the initial input is a tring of characters Evaluato Printer Stages of an interpreter input to each stage Slide 15.1. 8 Lexical analy: r(average 4(+ 55))" The first step is to use a lexical analyzer to convert that string of characters into units or words This is shown here. where the average Parser string gets converted into a set of words or isolated characters like"("and")"and+and numbers. Thus the input to the next Evaluator stage is an ordered sequence of these units or words Env⊥ ronment Printer Slide 15.1.9 Stages of an interpreter input to each stage The second stage then parses those words into a structure that Lexical analyzer) "(average 4(+55) we can use for evaluation. In particular, we convert the linear sequence of words into a tree structure. We are using pairs here Parser for convenience but that is not required we could use any other 回国国 representation of trees as well Evaluator As we do this, we are going convert the self-evaluating expressions into their internal values So notice the form we get Environment for the next stage: it's a tree structure, and hanging off of the symbol + leaves of the tree are numbers, symbols, or other objects that Printer are represented as basic words. This is the input to the next
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 15.1.6 To do that, let's talk about the input and output characteristics of each of them. By focusing on the input to each successive stage in the interpretation, we can get a sense of what should happen at each stage, and thus get a sense of how to build an interpreter. Slide 15.1.7 The initial input is a string of characters, which represents the typewritten version of the expression we want to evaluate. This is exactly the thing that we would type in at a terminal if we wanted to have an expression evaluated. So the initial input is a string of characters. Slide 15.1.8 The first step is to use a lexical analyzer to convert that string of characters into units or words. This is shown here, where the string gets converted into a set of words or isolated characters like "(" and ")" and "+" and numbers. Thus the input to the next stage is an ordered sequence of these units or words. Slide 15.1.9 The second stage then parses those words into a structure that we can use for evaluation. In particular, we convert the linear sequence of words into a tree structure. We are using pairs here for convenience but that is not required. We could use any other representation of trees as well. As we do this, we are going convert the self-evaluating expressions into their internal values. So notice the form we get for the next stage: it's a tree structure, and hanging off of the leaves of the tree are numbers, symbols, or other objects that are represented as basic words. This is the input to the next stage
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Stages of an interpreter input to each stage Slide 15.1.10 Lexical a (average 4 (+ 55)) Now comes the heart of the interpreter. We want to take that tree structure and deduce its value. first. notice the form of tree (Parser structure. We will talk about this in detail later, but you can 回回 the parser has converted to a tree Evaluator Every time we see an"(", indicating the beginning of a new Environment we build up a tree where each horizontal slice through the tree Printer corresponds to some combination. Now what is the evaluator m supposed to do? It wants to take that tree structure, plus an environment,and interpret it. And what does that mean? Thin of the environment as a way of associating names with more primitive values. It acts much like a dictionary. The evaluator will use a set of rules to walk through this tree, looking up values associated with symbols from the environment, i.e. from that dictionary, and then using the rules to reduce complex expressions to simpler things culminating in some simple value that we will return 15.L.11 Stages of an interpreter input to each stage] value becomes input to the final stage. The printer simply Lexical analyzer) "(average 4(+55)) converts things back into the appropriate form for display on the monitor and then Parse 凹回回 Eva1uat⊙ symbol Environment avera symbol+5 7 Stages of an interpreter input to each stage Slide 15.1.12 Lexical analyzer m(average 4(+ 55)) that just gets displayed to the user average arser Eva1uat。r 口→[团 ment) Bera Printer 7
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 15.1.10 Now comes the heart of the interpreter. We want to take that tree structure and deduce its value. First, notice the form of tree structure. We will talk about this in detail later, but you can already see how the parser has converted things into a tree. Every time we see an "(", indicating the beginning of a new combination, we have created a new list structure. If the parser is already inside a list structure, it drops down a level, so that we build up a tree where each horizontal slice through the tree corresponds to some combination. Now what is the evaluator supposed to do? It wants to take that tree structure, plus an environment, and interpret it. And what does that mean? Think of the environment as a way of associating names with more primitive values. It acts much like a dictionary. The evaluator will use a set of rules to walk through this tree, looking up values associated with symbols from the environment, i.e. from that dictionary, and then using the rules to reduce complex expressions to simpler things, culminating in some simple value that we will return. Slide 15.1.11 That value becomes input to the final stage. The printer simply converts things back into the appropriate form for display on the monitor, and then ... Slide 15.1.12 ... that just gets displayed to the user
6.001 Structure and Interpretation of Computer Programs. Copyright o 2004 by Massachusetts Institute of Technology Slide 15.1.13 So here is a summary of that process in words Role of each part of the interpreter break up input string into"words"called tokens Parser also convert self-evaluating tokens to their internal values #f is converted to the internal false value Printe convert value to human-readable output string Goal of this lecture Slide 15.1.14 lement an interpreter for a programming language Our goal is to implement an interpreter. Actually, that's not quite right. Our goal is for you to understand what goes into an Only write evaluator and environment use schemes reader for lexical analysis and parsing interpreter, which we will explore by implementing one. Since use scheme's printer for output the key part of an interpreter, the crucial part, is the evaluator, to do this, our language must look like scheme we are going to concentrate almost exclusively on that. We are going to use Scheme for all the rest of the pieces, that is, we All names end with a star will use Scheme's lexical analyzer and parser, and Scheme's Start with a simple calculator for arithmeti printer, rather than building them from scratch. This means of Progressively add scheme* features course that we will need to create an evaluator for a language that looks a lot like scheme in the sense of having a tree structure as the output of its parser and a set of rules for manipulating that tree structure, as the way of dealing with the actual evaluation. It also says that the syntax of the language we are going to use to demonstrate an interpreter will need to look at lot like Scheme, in terms of things lke using parentheses to delimit expressions and other related issues We say this because we don,'t want you to get confused between what is going in Scheme and the general ideas of building an evaluator and interpreter. Our goal is to build an interpreter, especially the evaluator part, and let you going to build our own simple language and its evaluator For convenience, we are going to call this language are see how that occurs and use that to explore the idea of how these things implement the rules for a language. w Scheme*. It has a lot of the characteristics of Scheme, but we will use the convention that a* will be placed at the end of every expression in our language, to distinguish it from the corresponding Scheme expression We'll start with a simple evaluator and work our way up The first simple evaluator will be one that handles simple arithmetic expressions 6.001 Notes: Section 15.2
6.001 Structure and Interpretation of Computer Programs. Copyright © 2004 by Massachusetts Institute of Technology. Slide 15.1.13 So here is a summary of that process in words. Slide 15.1.14 Our goal is to implement an interpreter. Actually, that's not quite right. Our goal is for you to understand what goes into an interpreter, which we will explore by implementing one. Since the key part of an interpreter, the crucial part, is the evaluator, we are going to concentrate almost exclusively on that. We are going to use Scheme for all the rest of the pieces, that is, we will use Scheme's lexical analyzer and parser, and Scheme's printer, rather than building them from scratch. This means of course that we will need to create an evaluator for a language that looks a lot like Scheme in the sense of having a tree structure as the output of its parser and a set of rules for manipulating that tree structure, as the way of dealing with the actual evaluation. It also says that the syntax of the language we are going to use to demonstrate an interpreter will need to look at lot like Scheme, in terms of things like using parentheses to delimit expressions and other related issues. We say this because we don't want you to get confused between what is going in Scheme and the general ideas of building an evaluator and interpreter. Our goal is to build an interpreter, especially the evaluator part, and let you see how that occurs and use that to explore the idea of how these things implement the rules for a language. We are going to build our own simple language and its evaluator. For convenience, we are going to call this language, Scheme*. It has a lot of the characteristics of Scheme, but we will use the convention that a * will be placed at the end of every expression in our language, to distinguish it from the corresponding Scheme expression. We'll start with a simple evaluator and work our way up. The first simple evaluator will be one that handles simple arithmetic expressions. 6.001 Notes: Section 15.2