of M's tapes: for every cell in M's tape we will have log |Γ| cells in the corresponding tape of M̃ (see Figure 1.3). To simulate one step of M, the machine M̃ will: (1) use log |Γ| steps to read from each tape the log |Γ| bits encoding a symbol of Γ, (2) use its state register to store the symbols read, (3) use M's transition function to compute the symbols M writes and M's new state given this information, (4) store this information in its state register, and (5) use log |Γ| steps to write the encodings of these symbols on its tapes. One can verify that this can be carried out if M̃ has access to registers that can store M's state, k symbols in Γ, and a counter from 1 to k. Thus, there is such a machine M̃ utilizing no more than 10|Q||Γ|^k k states. (In general, we can always simulate several registers using one register with a larger state space. For example, we can simulate three registers taking values in the sets A, B and C respectively with one register taking a value in the set A × B × C, which is of size |A||B||C|.) It is not hard to see that for every input x ∈ {0,1}^n, if on input x the TM M outputs f(x) within T(n) steps, then M̃ will output the same value within less than 4 log |Γ| T(n) steps.

Claim 1.9
For every f : {0,1}* → {0,1} and time-constructible T : N → N, if f is computable in time T(n) by a TM M using k tapes (plus additional input and output tapes), then it is computable in time 5kT(n)^2 by a TM M̃ using only a single work tape (plus additional input and output tapes).

Figure 1.4: Simulating a machine M with 3 work tapes using a machine M̃ with a single work tape (in addition to the input and output tapes).

Proof Sketch: Again the idea is simple: the TM M̃ encodes the k tapes of M on a single tape by using locations 1, k+1, 2k+1, ... to encode the first tape, locations 2, k+2, 2k+2, ... to encode the second tape, etc. (see Figure 1.4). For every symbol a in M's alphabet, M̃'s alphabet will contain both the symbol a and the symbol â. In the encoding of each tape, exactly one symbol will be of the "hatted" type, indicating that the corresponding head of M is positioned at that location (see figure). M̃ uses the input and output tapes in the same way M does. To simulate one step of M, the machine M̃ makes two sweeps of its work tape: first it sweeps the tape in the left-to-right direction and
records to its register the k symbols that are marked with a hat. Then M̃ uses M's transition function to determine the new state, symbols, and head movements, and sweeps the tape back in the right-to-left direction to update the encoding accordingly. Clearly, M̃ will have the same output as M. Also, since on n-length inputs M never reaches more than location T(n) of any of its tapes, M̃ will never need to reach more than location kT(n) of its work tape, meaning that for each of the at most T(n) steps of M, M̃ performs at most 5kT(n) work (sweeping back and forth requires about 2kT(n) steps, and some additional steps may be needed for updating head movements and bookkeeping). ■

Remark 1.10
With a bit of care, one can ensure that the proof of Claim 1.9 yields a TM M̃ with the following property: the head movements of M̃ do not depend on the contents of its tapes but only on the input length (i.e., M̃ always performs a sequence of left-to-right and back sweeps of the same form regardless of what the input is). A machine with this property is called oblivious, and the fact that every TM can be simulated by an oblivious TM will be useful for us later on (see Exercises 8 and 9 and the proof of Theorem 2.10).

Claim 1.11
Define a bidirectional TM to be a TM whose tapes are infinite in both directions. For every f : {0,1}* → {0,1}* and time-constructible T : N → N, if f is computable in time T(n) by a bidirectional TM M, then it is computable in time 4T(n) by a standard (unidirectional) TM M̃.

Figure 1.5: To simulate a machine M with alphabet Γ that has tapes infinite in both directions, we use a machine M̃ with alphabet Γ^2 whose tapes encode the "folded" version of M's tapes.

Proof Sketch: The idea behind the proof is illustrated in Figure 1.5. If M uses alphabet Γ, then M̃ will use the alphabet Γ^2 (i.e., each symbol in M̃'s alphabet corresponds to a pair of symbols in M's alphabet). We encode a tape of M that is infinite in both directions using a standard (infinite in one direction) tape by "folding" it at an arbitrary location, with each location of M̃'s tape encoding two locations of M's tape. At first, M̃ will ignore the second symbol in each cell it reads and act according to M's transition function. However, if this transition function instructs M̃ to go "over the edge" of its tape, then instead it will start ignoring the first symbol in each cell and use only the second symbol. When it is in this mode, it will translate left movements into right movements and vice versa. If it needs to go "over the edge" again, then it will go back to reading the first symbol of each cell, and to translating movements normally. ■
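To make the "folding" encoding in the proof of Claim 1.11 concrete, here is a minimal Python sketch. It is only an illustration under our own conventions (the function names, the use of a dictionary to model a bidirectional tape, and the choice to fold at position 0 are ours, not part of the text): cell i of the one-way tape holds the pair of symbols at positions +i and -i of the two-way tape, as in Figure 1.5.

```python
# A hedged sketch of the "folding" encoding from the proof of Claim 1.11.
# A bidirectional tape is modeled as a dict mapping integers (possibly
# negative) to symbols; we fold at position 0, so cell i of the one-way
# tape holds the pair (symbol at +i, symbol at -i). The treatment of the
# fold point itself (position 0) is glossed over in this toy version.

BLANK = "_"  # stand-in for the blank symbol

def fold(tape, radius):
    """Encode positions -radius..radius as a one-way tape over Gamma x Gamma."""
    return [(tape.get(i, BLANK), tape.get(-i, BLANK)) for i in range(radius + 1)]

def read(folded, position):
    """Read the symbol at an integer position of the original two-way tape."""
    pair = folded[abs(position)]
    return pair[0] if position >= 0 else pair[1]

def write(folded, position, symbol):
    """Write a symbol at an integer position of the original two-way tape."""
    first, second = folded[abs(position)]
    folded[abs(position)] = (symbol, second) if position >= 0 else (first, symbol)

# Example: the word "completely" written starting at position -4.
tape = {i - 4: ch for i, ch in enumerate("completely")}
folded = fold(tape, radius=8)
assert read(folded, -4) == "c" and read(folded, 5) == "y"
```

The remaining ingredient of the proof, not shown here, is the mode switch: whenever the simulated head crosses the fold, M̃ starts reading the other component of each pair and swaps the meaning of left and right moves.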
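The interleaved encoding from the proof of Claim 1.9 (Figure 1.4) can be prototyped in the same spirit. Again this is only a hedged sketch with names of our choosing: a "hatted" symbol is modeled as a (symbol, True) pair, and we show the encoding together with one left-to-right sweep that collects the k symbols currently under M's heads.

```python
# A hedged sketch of the single-work-tape encoding from the proof of
# Claim 1.9. Tape i of M (0-indexed here) occupies locations i, i+k,
# i+2k, ... of the single tape; the cell under M's i-th head is "hatted".

BLANK = "_"

def encode(tapes, heads):
    """Interleave k tapes into one list; each cell is (symbol, hatted?)."""
    k = len(tapes)
    length = max(len(t) for t in tapes)
    single = []
    for j in range(length):            # cell index on M's tapes
        for i in range(k):             # which of M's tapes
            symbol = tapes[i][j] if j < len(tapes[i]) else BLANK
            single.append((symbol, heads[i] == j))
    return single

def sweep(single, k):
    """Left-to-right sweep: collect the k hatted symbols, one per tape."""
    under_heads = [None] * k
    for location, (symbol, hatted) in enumerate(single):
        if hatted:
            under_heads[location % k] = symbol   # location mod k = tape index
    return under_heads

tapes = [list("completely"), list("replaced"), list("machines")]
heads = [0, 2, 4]                      # M's three head positions
single = encode(tapes, heads)
assert sweep(single, k=3) == ["c", "p", "i"]
```

In M̃ itself these sweeps are carried out with a single head and a finite state register rather than with lists, and a second, right-to-left sweep writes the new symbols and moves the hats; since the encoded tape has length at most kT(n), each simulated step costs O(kT(n)) steps of M̃, which is where the 5kT(n)^2 bound in Claim 1.9 comes from.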
Other changes that will not have a very significant effect include having two- or three-dimensional tapes, allowing the machine random access to its tape, and making the output tape write-only (see Exercises 11 and 12; also the texts [SIP96, HMU01] contain more examples). In particular, none of these modifications will change the class P of polynomial-time computable decision problems defined below in Section 1.5.

1.2.3 The expressive power of Turing machines.

When you encounter Turing machines for the first time, it may not be clear that they do indeed fully encapsulate our intuitive notion of computation. It may be useful to work through some simple examples, such as expressing the standard algorithms for addition and multiplication in terms of Turing machines computing the corresponding functions (see Exercise 4). You can also verify that you can simulate a program in your favorite programming language using a Turing machine. (The reverse direction also holds: most programming languages can simulate a Turing machine.)

Example 1.12
(This example assumes some background in computing.) We give a hand-wavy proof that Turing machines can simulate any program written in any of the familiar programming languages such as C or Java. First, recall that programs in these languages can be translated (the technical term is compiled) into an equivalent machine language program. This is a sequence of simple instructions to read from memory into one of a finite number of registers, write a register's contents to memory, perform basic arithmetic operations such as adding two registers, and control instructions that perform actions conditioned on, say, whether a certain register is equal to zero.

All these operations can be easily simulated by a Turing machine. The memory and registers can be implemented using the machine's tapes, while the instructions can be encoded by the machine's transition function. For example, it is not hard to show TMs that add or multiply two numbers, or a two-tape TM that, if its first tape contains a number i in binary representation, can move the head of its second tape to the i-th location.

Exercise 13 asks you to give a more rigorous proof of such a simulation for a simple tailor-made programming language.

1.3 Machines as strings and the universal Turing machine.

It is almost obvious that a Turing machine can be represented as a string: since we can write the description of any TM M on paper, we can certainly encode this description as a sequence of zeros and ones. Yet this simple observation, that we can treat programs as data, has had far-reaching consequences for both the theory and practice of computing. Without it, we would not have general-purpose electronic computers that, rather than being fixed to performing one task, can execute arbitrary programs.
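To tie Example 1.12 above to this "programs as data" viewpoint, here is a minimal Python sketch of an interpreter for a made-up machine language of the kind described there. The instruction names (LOAD, STORE, ADD, JZ) and the three-register layout are invented for this illustration; the point is only that a program is itself a piece of data that another program, or a Turing machine, can read and execute.

```python
# A toy register-machine interpreter (illustrative only; the instruction
# set is invented for this sketch). A program is just data: a list of tuples.

def run(program, memory):
    registers = [0, 0, 0]              # a finite number of registers
    pc = 0                             # program counter
    while pc < len(program):
        op, *args = program[pc]
        if op == "LOAD":               # LOAD r, addr : memory -> register
            registers[args[0]] = memory[args[1]]
        elif op == "STORE":            # STORE r, addr : register -> memory
            memory[args[1]] = registers[args[0]]
        elif op == "ADD":              # ADD r1, r2 : r1 <- r1 + r2
            registers[args[0]] += registers[args[1]]
        elif op == "JZ":               # JZ r, target : jump if register is zero
            if registers[args[0]] == 0:
                pc = args[1]
                continue
        pc += 1
    return memory

# A short program that adds memory[0] and memory[1] into memory[2].
program = [("LOAD", 0, 0), ("LOAD", 1, 1), ("ADD", 0, 1), ("STORE", 0, 2)]
print(run(program, [3, 4, 0]))         # -> [3, 4, 7]
```

The interpreter takes the program as one of its inputs, which is precisely the observation above; Remark 1.14 below draws the same analogy for the universal Turing machine.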
Because we will use this notion of representing TMs as strings quite extensively, it is worth spelling out our representation a bit more concretely. Since the behavior of a Turing machine is determined by its transition function, we will use the list of all inputs and outputs of this function (which can easily be encoded as a string in {0,1}*) as the encoding of the Turing machine.⁴ We will also find it convenient to assume that our representation scheme satisfies the following properties:

1. Every string in {0,1}* represents some Turing machine.
This is easy to ensure by mapping strings that are not valid encodings into some canonical trivial TM, such as the TM that immediately halts and outputs zero on any input.

2. Every TM is represented by infinitely many strings.
This can be ensured by specifying that the representation may end with an arbitrary number of 1's, which are ignored. This has a somewhat similar effect to the comments of many programming languages (e.g., the /*...*/ construct in C, C++ and Java), which allow adding superfluous symbols to any program.

If M is a Turing machine, then we use ⌞M⌟ to denote M's representation as a binary string. If α is a string, then we denote the TM that α represents by M_α. As is our convention, we will also often use M to denote both the TM and its representation as a string. Exercise 14 asks you to fully specify a representation scheme for Turing machines with the above properties.

1.3.1 The Universal Turing Machine

It was Turing who first observed that general-purpose computers are possible, by exhibiting a universal Turing machine that can simulate the execution of every other TM M given M's description as input. Of course, since we are so used to having a universal computer on our desktops or even in our pockets, today we take this notion for granted. But it is good to remember why it was once counterintuitive. The parameters of the universal TM are fixed: alphabet size, number of states, and number of tapes. The corresponding parameters for the machine being simulated could be much larger. The reason this is not a hurdle is, of course, the ability to use encodings. Even if the universal TM has a very simple alphabet, say {0,1}, this is sufficient to allow it to represent the other machine's state and transition table on its tapes, and then follow along in the computation step by step.

Now we state a computationally efficient version of Turing's construction due to Hennie and Stearns [HS66]. To give the essential idea, we first prove a slightly relaxed variant where the term T log T below is replaced with T^2. But since the efficient version is needed a few times in the book, a full proof is also given at the end of the chapter (see Section 1.A).

⁴ Note that the size of the alphabet, the number of tapes, and the size of the state space can be deduced from the transition function's table. We can also reorder the table to ensure that the special states q_start and q_halt are the first two states of the TM. Similarly, we may assume that the start symbol ▷, the blank symbol □, 0, and 1 are the first four symbols of the machine's alphabet.
Theorem 1.13 (Efficient Universal Turing machine)
There exists a TM U such that for every x, α ∈ {0,1}*, U(x, α) = M_α(x), where M_α denotes the TM represented by α.
Furthermore, if M_α halts on input x within T steps, then U(x, α) halts within C T log T steps, where C is a number independent of |x| and depending only on M_α's alphabet size, number of tapes, and number of states.

Remark 1.14
A common exercise in programming courses is to write an interpreter for a particular programming language using the same language. (An interpreter takes a program P as input and outputs the result of executing the program P.) Theorem 1.13 can be considered a variant of this exercise.

Proof: Our universal TM U is given an input x, α, where α represents some TM M, and needs to output M(x). A crucial observation is that we may assume that M (1) has a single work tape (in addition to the input and output tapes) and (2) uses the alphabet {▷, □, 0, 1}. The reason is that U can transform a representation of every TM M into a representation of an equivalent TM M̃ that satisfies these properties, as shown in the proofs of Claims 1.8 and 1.9. Note that these transformations may introduce a quadratic slowdown (i.e., they transform M from running in T time to running in C'T^2 time, where C' depends on M's alphabet size and number of tapes).

Figure 1.6: The universal TM U has, in addition to the input and output tapes, three work tapes. One work tape will have the same contents as the simulated machine M's work tape, another tape contains the description of M (converted to an equivalent one-work-tape form), and another tape contains the current state of M.

The TM U uses the alphabet {▷, □, 0, 1} and three work tapes in addition to its input and output tapes (see Figure 1.6). U uses its input tape, output tape, and one of the work tapes in the same way M uses its three tapes. In addition, U will use its first extra work tape to store the table of values of M's transition function (after applying the transformations of Claims 1.8 and 1.9 as noted above), and its other extra work tape to store the current state of M. To simulate one computational step of M, U scans the table of M's transition function and the current state to find out the new state, symbols to be written, and head movements, which it then executes. We see that each computational step of M is simulated using C steps of U, where C is some number depending on the size of the transition function's table. ■
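To make the simulation loop at the end of the proof concrete, here is a minimal Python sketch of the "scan the table, apply the rule" step for a single-work-tape machine. Representing the transition function as a dictionary and ignoring the separate input and output tapes are our own simplifications; U of course performs this lookup by scanning a table written on one of its work tapes rather than by hashing.

```python
# A hedged sketch of U's main loop for a single-work-tape machine M.
# delta maps (state, symbol under the head) to (new state, symbol to
# write, head movement); U stores this table and M's current state on
# its extra work tapes.

BLANK = "_"

def simulate(delta, start_state, halt_state, tape_input):
    tape = list(tape_input)            # M's work tape, grown on demand
    state, head = start_state, 0
    steps = 0
    while state != halt_state:
        symbol = tape[head] if head < len(tape) else BLANK
        state, write, move = delta[(state, symbol)]   # look up the stored rule
        if head >= len(tape):
            tape.append(BLANK)
        tape[head] = write
        head = max(0, head + move)     # move is -1, 0, or +1; head stays >= 0
        steps += 1
    return "".join(tape), steps

# Example: a two-state machine that flips bits until it reaches a blank.
delta = {("flip", "0"): ("flip", "1", +1),
         ("flip", "1"): ("flip", "0", +1),
         ("flip", BLANK): ("done", BLANK, 0)}
print(simulate(delta, "flip", "done", "0110"))   # -> ('1001_', 5)
```

Each pass through the loop corresponds to one step of the (already transformed) machine, and in U each such pass costs some number C of steps depending only on the size of the transition table, which is the constant-per-step claim at the end of the proof.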