2.Algorithms and Data 21 sense when it has reached the end of the list.The resulting algorithm would look very much like the version given,but would use the form"repeat the following until end of list reached"in clause (2).You should try writing down the full algorithm for this case. Notice how iteration constructs make it possible for a short portion of an algo- rithm's text to prescribe very long processes,the length being dictated by the size of the inputs-in this case the length of the employee list.Iteration,therefore,is the key to the seeming paradox of a single,fixed algorithm performing tasks of ever-longer duration. Combining Control Structures An algorithm can contain many control-flow constructs in nontrivial combinations. Sequencing,branching,and iteration can be interleaved and nested within each other.For example,algorithms can contain nested iterations,more commonly called nested loops.A loop inside a loop can take on the form"do A exactly N times," where A itself is,say,of the form"repeat B until C."The processor executing such a segment has to work quite hard;each of the N times it carries out A-that is, each time the outer loop is traversed-the inner loop must be traversed repeatedly until C becomes true.Here the outer loop is bounded and the inner one conditional, but other combinations are possible too.The A part of the outer loop can contain many further segments,each of which can,in turn,employ additional sequencing, branching,and iteration constructs,and the same goes for the inner loop.Thus,there is no limit to the potential intricacy of algorithms. Let us consider a simple example of the power of nested iterations.Suppose that the problem was to sum salaries,but not of all employees,only of those who earn more than their direct managers.Of course it is assumed that (except for the true "boss")an employee's record contains the name of that employee's manager.An algorithm that solves this problem might be constructed so that an outer loop runs down the list as before,but for each employee "pointed at"an inner loop searches the list for the record of that employee's direct manager.When the manager has finally been found,a conditional construct is used to determine whether or not the employee's salary should be accumulated in the "noted number,"a decision that requires comparing the two salaries.Upon completing this "internal"activity,the outer loop resumes control and proceeds to the next employee,whose manager is then sought for,until the end of the list is reached.(See Figure 2.4 for a diagrammatic version of this algorithm.) Bubblesort:An Example To further illustrate control structures,let us examine a sorting algorithm.Sorting is one of the most interesting topics in algorithmics,and many important developments are connected with it in one way or another.The input to a sorting problem is an unordered list of elements,say numbers.Our task is to produce the list sorted in
P1: GIG PE002-02drv PE002-Harel PE002-Harel-v4.cls March 18, 2004 13:47 2. Algorithms and Data 21 sense when it has reached the end of the list. The resulting algorithm would look very much like the version given, but would use the form “repeat the following until end of list reached” in clause (2). You should try writing down the full algorithm for this case. Notice how iteration constructs make it possible for a short portion of an algorithm’s text to prescribe very long processes, the length being dictated by the size of the inputs—in this case the length of the employee list. Iteration, therefore, is the key to the seeming paradox of a single, fixed algorithm performing tasks of ever-longer duration. ■ Combining Control Structures An algorithm can contain many control-flow constructs in nontrivial combinations. Sequencing, branching, and iteration can be interleaved and nested within each other. For example, algorithms can contain nested iterations, more commonly called nested loops. A loop inside a loop can take on the form “do A exactly N times,” where A itself is, say, of the form “repeat B until C.” The processor executing such a segment has to work quite hard; each of the N times it carries out A—that is, each time the outer loop is traversed—the inner loop must be traversed repeatedly until C becomes true. Here the outer loop is bounded and the inner one conditional, but other combinations are possible too. The A part of the outer loop can contain many further segments, each of which can, in turn, employ additional sequencing, branching, and iteration constructs, and the same goes for the inner loop. Thus, there is no limit to the potential intricacy of algorithms. Let us consider a simple example of the power of nested iterations. Suppose that the problem was to sum salaries, but not of all employees, only of those who earn more than their direct managers. Of course it is assumed that (except for the true “boss”) an employee’s record contains the name of that employee’s manager. An algorithm that solves this problem might be constructed so that an outer loop runs down the list as before, but for each employee “pointed at” an inner loop searches the list for the record of that employee’s direct manager. When the manager has finally been found, a conditional construct is used to determine whether or not the employee’s salary should be accumulated in the “noted number,” a decision that requires comparing the two salaries. Upon completing this “internal” activity, the outer loop resumes control and proceeds to the next employee, whose manager is then sought for, until the end of the list is reached. (See Figure 2.4 for a diagrammatic version of this algorithm.) ■ Bubblesort: An Example To further illustrate control structures, let us examine a sorting algorithm. Sorting is one of the most interesting topics in algorithmics, and many important developments are connected with it in one way or another. The input to a sorting problem is an unordered list of elements, say numbers. Our task is to produce the list sorted in
22 I.Preliminaries Figure 2.1 dog dog dog typical typical typical typical Two bubblesort body body typical dog dog dog sun traversals on five typical typical body body body sun dog elements.(Arrows dogma sun sun sun sun body body indicate elements sun dogma dogma dogma dogma dogma dogma exchanged,not elements compared.) start 而 end start 88 end First traversal Second traversal (a) (b) ascending order.The problem can be phrased more generally by substituting,say, lists of words for lists of numbers,with the intention that they be sorted by their lexicographic ordering (that is,as in a dictionary or telephone book).It is assumed that the list of elements is preceded by its length,N,and that the only way to obtain information concerning the magnitude of these elements is to perform binary comparisons;that is,to compare two elements and act according to the outcome of the comparison. One of the many known sorting algorithms is called bubblesort.Actually,bubble- sort is considered to be a bad sorting algorithm,for reasons explained in Chapter 6. It is used here only to illustrate control structures. The bubblesort algorithm is based on the following observation.If the jumbled list is traversed in sequence,one element at a time,and whenever two adjacent elements are found to be in the wrong order (that is,the first is larger than the second)they are exchanged,then on completion of the traversal,the largest element is in its rightful place;namely,at the end of the list. Figure 2.1(a)illustrates such a traversal for a simple five-element list.(The list has been drawn from bottom to top:the first element is the lowest in the picture. The arrows show only the elements exchanged,not those compared.)Clearly,the traversal might correct other incorrect orderings besides placing the maximal ele- ment in its final position.However,Figure 2.1(a)shows that one traversal does not necessarily sort the list.Now,a second traversal will bring the second largest ele- ment to its proper resting point,the penultimate position in the list,as can be seen in Figure 2.1(b).This leads to an algorithm that carries out N-1 such traversals(why not N?),resulting in the sorted list.The name"bubblesort"stems from the way in which large elements "bubble up"to the top of the list as the algorithm proceeds, exchanging places with smaller elements that are pushed lower down. Before writing down the algorithm in more detail,it should be pointed out that the second traversal need not extend to the last element,since by the time the second traversal starts,the last position in the list already contains its rightful tenant-the largest element in the list.Similarly,the third traversal need not go any further than the first N-2 elements.This means that a more efficient algorithm would traverse only the first N elements in its first traversal,the first N-1 in its second,N-2 in its third,and so on.We shall return to the bubblesort algorithm and this improvement 1 There is some subtlety to this.If we knew in advance,for example,that the input list consisted precisely of half of the integers between 1 and N jumbled in some unknown way,a trivial sorting algorithm could be written that simply prepared a new list of length N,initially containing blanks in all locations. then directly inserted each number encountered in the input list into its proper place in the new list. and finally simply reading out the contents of the non-blank places from beginning to end
P1: GIG PE002-02drv PE002-Harel PE002-Harel-v4.cls March 18, 2004 13:47 22 I. Preliminaries start end First traversal Second traversal typical dog body sun dogma (a) (b) dog body typical dogma sun dog body typical sun dogma dog typical body sun dogma typical dog body sun dogma typical dog sun body dogma typical sun dog body dogma start end Figure 2.1 Two bubblesort traversals on five elements. (Arrows indicate elements exchanged, not elements compared.) ascending order. The problem can be phrased more generally by substituting, say, lists of words for lists of numbers, with the intention that they be sorted by their lexicographic ordering (that is, as in a dictionary or telephone book). It is assumed that the list of elements is preceded by its length, N, and that the only way to obtain information concerning the magnitude of these elements is to perform binary comparisons; that is, to compare two elements and act according to the outcome of the comparison.1 One of the many known sorting algorithms is called bubblesort. Actually, bubblesort is considered to be a bad sorting algorithm, for reasons explained in Chapter 6. It is used here only to illustrate control structures. The bubblesort algorithm is based on the following observation. If the jumbled list is traversed in sequence, one element at a time, and whenever two adjacent elements are found to be in the wrong order (that is, the first is larger than the second) they are exchanged, then on completion of the traversal, the largest element is in its rightful place; namely, at the end of the list. Figure 2.1(a) illustrates such a traversal for a simple five-element list. (The list has been drawn from bottom to top: the first element is the lowest in the picture. The arrows show only the elements exchanged, not those compared.) Clearly, the traversal might correct other incorrect orderings besides placing the maximal element in its final position. However, Figure 2.1(a) shows that one traversal does not necessarily sort the list. Now, a second traversal will bring the second largest element to its proper resting point, the penultimate position in the list, as can be seen in Figure 2.1(b). This leads to an algorithm that carries out N − 1 such traversals (why not N?), resulting in the sorted list. The name “bubblesort” stems from the way in which large elements “bubble up” to the top of the list as the algorithm proceeds, exchanging places with smaller elements that are pushed lower down. Before writing down the algorithm in more detail, it should be pointed out that the second traversal need not extend to the last element, since by the time the second traversal starts, the last position in the list already contains its rightful tenant—the largest element in the list. Similarly, the third traversal need not go any further than the first N − 2 elements. This means that a more efficient algorithm would traverse only the first N elements in its first traversal, the first N − 1 in its second, N − 2 in its third, and so on. We shall return to the bubblesort algorithm and this improvement 1 There is some subtlety to this. If we knew in advance, for example, that the input list consisted precisely of half of the integers between 1 and N jumbled in some unknown way, a trivial sorting algorithm could be written that simply prepared a new list of length N, initially containing blanks in all locations, then directly inserted each number encountered in the input list into its proper place in the new list, and finally simply reading out the contents of the non-blank places from beginning to end
2.Algorithms and Data 23 Figure 2.2 The main stages in 24 78 78 78 78 78 bubblesort of eight 24 69 69 elements.(The first w 69 element of each list is 78 12 24 46 46 46 at the bottom;arrows 14 69 12 24 26 26 indicate elements changing places.) 26 14 46 12 24 24 26 14 26 12 69 26 14 14 12 46 46 8 8 start end in Chapter 6,but for now the unimproved version will suffice.The algorithm reads as follows: (1)do the following N-1 times: (1.1)point to the first element; (1.2)do the following N-I times: (1.2.1)compare the element pointed to with the next element; (1.2.2)if the compared elements are in the wrong order,exchange them; (1.2.3)point to the next element. Notice how two-level indentation is used here.The first "following,"on line (1), involves all lines starting with 1,and the second,on line(1.2),involves those starting with 1.2.In this way,the nested nature of the looping constructs is clearly visible. The main steps taken by the algorithm on an eight-item list are illustrated in Figure 2.2,where the situation is depicted just before each execution of clause(1.2). The elements appearing above the line are in their final positions.Notice that in this particular example the last two traversals(not shown)are redundant;the list becomes sorted after five,not seven,traversals.However,observe that if,for example,the smallest element happens to be last in the original list (that is,at the top in our illustrations),then all N-1 traversals are in fact necessary,since elements that are to be "bubbled down"(elbbubed?...)cause more trouble than those that "bubble p” The "Goto"Statement There is another important control instruction that is generally called the goto state- ment.It has the general form"goto G,"where G marks some point in the text of the algorithm.In our examples we could write,say,"goto (1.2),"an instruction that causes the processor Runaround to literally go to line (1.2)of the algorithm and re- sume execution from there.This construct is controversial for a number of reasons, the most obvious of which is pragmatic in nature.An algorithm that contains many "goto"statements directing control backwards and forwards in a tangled fashion
P1: GIG PE002-02drv PE002-Harel PE002-Harel-v4.cls March 18, 2004 13:47 2. Algorithms and Data 23 start end 24 12 78 14 26 8 69 46 78 24 12 69 14 26 8 46 78 69 24 12 46 14 26 8 78 69 46 24 12 26 14 8 78 69 46 26 24 12 14 8 78 69 46 26 24 14 12 8 Figure 2.2 The main stages in bubblesort of eight elements. (The first element of each list is at the bottom; arrows indicate elements changing places.) in Chapter 6, but for now the unimproved version will suffice. The algorithm reads as follows: (1) do the following N − 1 times: (1.1) point to the first element; (1.2) do the following N − 1 times: (1.2.1) compare the element pointed to with the next element; (1.2.2) if the compared elements are in the wrong order, exchange them; (1.2.3) point to the next element. Notice how two-level indentation is used here. The first “following,” on line (1), involves all lines starting with 1, and the second, on line (1.2), involves those starting with 1.2. In this way, the nested nature of the looping constructs is clearly visible. The main steps taken by the algorithm on an eight-item list are illustrated in Figure 2.2, where the situation is depicted just before each execution of clause (1.2). The elements appearing above the line are in their final positions. Notice that in this particular example the last two traversals (not shown) are redundant; the list becomes sorted after five, not seven, traversals. However, observe that if, for example, the smallest element happens to be last in the original list (that is, at the top in our illustrations), then all N − 1 traversals are in fact necessary, since elements that are to be “bubbled down” (elbbubed? ...) cause more trouble than those that “bubble up.” ■ The “Goto” Statement There is another important control instruction that is generally called the goto statement. It has the general form “goto G,” where G marks some point in the text of the algorithm. In our examples we could write, say, “goto (1.2),” an instruction that causes the processor Runaround to literally go to line (1.2) of the algorithm and resume execution from there. This construct is controversial for a number of reasons, the most obvious of which is pragmatic in nature. An algorithm that contains many “goto” statements directing control backwards and forwards in a tangled fashion
24 I.Preliminaries quickly becomes very difficult to understand.Clarity,as we shall argue later,is a very important consideration in algorithmic design. Besides potentially reducing our ability to understand an algorithm,"goto"state- ments can also introduce technical difficulties.What happens if a"goto"statement directs the processor into the midst of a loop?Inserting the instruction"goto(1.2.1)" between(1.1)and(1.2)in the bubblesort algorithm is an example of this.Is the pro- cessor to execute (1.2.1)through(1.2.3)and then halt,or should these be executed N-1 times?What if the same instruction appears within the (1.2.1)-(1.2.3)se- quence?This kind of problem is rooted in the ambiguity resulting from an attempt to match the text of an algorithm with the process it prescribes.Clearly,there is such a match,but since fixed algorithms can prescribe processes of varying length, a single point in the text of the algorithm can be associated with many points in the execution of the corresponding process.Consequently,"goto"statements are in a sense inherently ambiguous creatures,and many researchers are opposed to using them freely in algorithms. Diagrams for Algorithms Visual,diagrammatic techniques are one way of presenting the control flow of an algorithm in a clear and readable fashion.There are various ways of"drawing" algorithms,as opposed to writing them down.One of the best known of these involves writing the elementary instructions in rectangular boxes and the tests in diamond- shaped ones,and using arrows to describe how the processor Runaround runs around executing the algorithm.The resulting objects are called flowcharts.Figure 2.3 shows a flowchart of the regular salary summation algorithm,and Figure 2.4 shows Figure 2.3 start Flowchart for salary summation. note 0: point to first salary add salary pointed 3 to noted number Loop YES at output NO point to end of noted number list? next salary stop
P1: GIG PE002-02drv PE002-Harel PE002-Harel-v4.cls March 18, 2004 13:47 24 I. Preliminaries quickly becomes very difficult to understand. Clarity, as we shall argue later, is a very important consideration in algorithmic design. Besides potentially reducing our ability to understand an algorithm, “goto” statements can also introduce technical difficulties. What happens if a “goto” statement directs the processor into the midst of a loop? Inserting the instruction “goto (1.2.1)” between (1.1) and (1.2) in the bubblesort algorithm is an example of this. Is the processor to execute (1.2.1) through (1.2.3) and then halt, or should these be executed N − 1 times? What if the same instruction appears within the (1.2.1)–(1.2.3) sequence? This kind of problem is rooted in the ambiguity resulting from an attempt to match the text of an algorithm with the process it prescribes. Clearly, there is such a match, but since fixed algorithms can prescribe processes of varying length, a single point in the text of the algorithm can be associated with many points in the execution of the corresponding process. Consequently, “goto” statements are in a sense inherently ambiguous creatures, and many researchers are opposed to using them freely in algorithms. ■ Diagrams for Algorithms Visual, diagrammatic techniques are one way of presenting the control flow of an algorithm in a clear and readable fashion. There are various ways of “drawing” algorithms, as opposed to writing them down. One of the best known of these involves writing the elementary instructions in rectangular boxes and the tests in diamondshaped ones, and using arrows to describe how the processor Runaround runs around executing the algorithm. The resulting objects are called flowcharts. Figure 2.3 shows a flowchart of the regular salary summation algorithm, and Figure 2.4 shows start note 0; point to first salary add salary pointed at to noted number at end of list? output noted number stop point to next salary YES NO Loop Figure 2.3 Flowchart for salary summation
2.Algorithms and Data 25 Figure 2.4 start Flowchart for sophisticated salary summation. note 0: point with P to first employee point with O to first employee is o YES direct manager of P? is P's YES salary more NO than O's? iso NO at end advance O to add P's salary to NO noted number of list? next employee YES is P YES at end of list? NO output advance Pto noted number next employee stop one for the more sophisticated version that involves only employees earning more than their direct managers. Notice the way an iteration shows up visually as a cycle of boxes,diamonds, and arrows,and nested iterations show up as cycles within cycles.This explains the use of the term"looping."The flowcharts in Figures 2.3 and 2.4 also illustrate the appropriateness of the term "branching"that was associated with conditional instructions. Flowcharts also have disadvantages.One of these is rooted in the fact that it is more difficult to encourage people to adhere to a small number of"well-formed" control structures.When using flowcharts it is easy to succumb to the temptation to employ many"goto's,"since these are depicted simply as arrows just like those that
P1: GIG PE002-02drv PE002-Harel PE002-Harel-v4.cls March 18, 2004 13:47 2. Algorithms and Data 25 start note 0; point with P to first employee point with Q to first employee is Q direct manager of P? is Q at end of list? advance Q to next employee advance P to next employee output noted number is P at end of list? add P’s salary to noted number is P’s salary more than Q’s? stop NO YES YES YES YES NO NO NO Internal loop External loop Figure 2.4 Flowchart for sophisticated salary summation. one for the more sophisticated version that involves only employees earning more than their direct managers. Notice the way an iteration shows up visually as a cycle of boxes, diamonds, and arrows, and nested iterations show up as cycles within cycles. This explains the use of the term “looping.” The flowcharts in Figures 2.3 and 2.4 also illustrate the appropriateness of the term “branching” that was associated with conditional instructions. Flowcharts also have disadvantages. One of these is rooted in the fact that it is more difficult to encourage people to adhere to a small number of “well-formed” control structures. When using flowcharts it is easy to succumb to the temptation to employ many “goto’s,” since these are depicted simply as arrows just like those that