The running time of mergesort as implemented here depends only on the number of elements in the array being sorted, not on the way they are arranged. For many other sorting methods, the running time may vary substantially as a function of the initial ordering of the input. Typically, in the theory of algorithms, we are most interested in worst-case performance, since it can provide a guarantee on the performance characteristics of the algorithm no matter what the input is; in the analysis of particular algorithms, we are most interested in average-case performance for a reasonable input model, since that can provide a path to predict performance on “typical” input.

We always seek better algorithms, and a natural question that arises is whether there might be a sorting algorithm with asymptotically better performance than mergesort. The following classical result from the theory of algorithms says, in essence, that there is not.

Theorem 1.2 (Complexity of sorting). Every compare-based sorting program uses at least ⌈lg N!⌉ > N lg N − N/(ln 2) compares for some input.

Proof. A full proof of this fact may be found in [30] or [19]. Intuitively the result follows from the observation that each compare can cut down the number of possible arrangements of the elements to be considered by, at most, only a factor of 2. Since there are N! possible arrangements before the sort and the goal is to have just one possible arrangement (the sorted one) after the sort, the number of compares must be at least the number of times N! can be divided by 2 before reaching a number less than unity—that is, ⌈lg N!⌉. The theorem follows from Stirling’s approximation to the factorial function (see the second corollary to Theorem 4.3).
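Spelling out that last step as a sketch: Stirling’s approximation gives

    lg N! = lg 1 + lg 2 + · · · + lg N = N lg N − N lg e + O(log N),

and N lg e = N/(ln 2); since the terms hidden in the O(log N) are positive, ⌈lg N!⌉ > N lg N − N/(ln 2), the form used in the statement of the theorem.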
From a theoretical standpoint, this result demonstrates that N log N is a “lower bound” on the intrinsic difficulty of the sorting problem:

    All compare-based sorting algorithms require time proportional to N log N to sort some N-element input file.

This is a general statement about an entire class of algorithms. We say that the “time complexity of sorting is Ω(N log N).” This lower bound is significant because it matches the upper bound of Theorem 1.1, thus showing that mergesort is optimal in the sense that no algorithm can have a better asymptotic running time. We say that the “time complexity of sorting is Θ(N log N).” From a theoretical standpoint, this completes the “solution” of the sorting “problem”: matching upper and lower bounds have been proved.
Again, these results hold under rather generous assumptions, though they are perhaps not as general as it might seem. For example, the results say nothing about sorting algorithms that do not use compares. Indeed, there exist sorting methods based on index calculation techniques (such as those discussed in Chapter 9) that run in linear time on average.

Exercise 1.9 Suppose that it is known that each of the items in an N-item array has one of two distinct values. Give a sorting method that takes time proportional to N.

Exercise 1.10 Answer the previous exercise for three distinct values.
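As a concrete illustration of the index-calculation idea mentioned above (such methods are developed in Chapter 9), here is a minimal sketch in Java of key-indexed counting; the class name, the method name, and the assumption that the keys are integers in a small range 0..R−1 are ours, for illustration only.

    // Sketch of key-indexed counting: sorts integer keys in the range 0..R-1
    // in time proportional to N + R, using no compares between keys.
    public class KeyIndexedSort {

        public static void sort(int[] a, int R) {
            int[] count = new int[R];
            for (int key : a)                    // tally how often each key occurs
                count[key]++;
            int i = 0;
            for (int key = 0; key < R; key++)    // write the keys back in order
                for (int c = 0; c < count[key]; c++)
                    a[i++] = key;
        }

        public static void main(String[] args) {
            int[] a = { 2, 0, 1, 2, 0, 1, 1, 2 };
            sort(a, 3);                          // three distinct values: 0, 1, 2
            for (int x : a) System.out.print(x + " ");
            System.out.println();
        }
    }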
We have omitted many details that relate to proper modeling of computers and programs in the proofs of Theorem 1.1 and Theorem 1.2. The essence of the theory of algorithms is the development of complete models within which the intrinsic difficulty of important problems can be assessed and “efficient” algorithms representing upper bounds matching these lower bounds can be developed. For many important problem domains there is still a significant gap between the lower and upper bounds on asymptotic worst-case performance. The theory of algorithms provides guidance in the development of new algorithms for such problems. We want algorithms that can lower known upper bounds, but there is no point in searching for an algorithm that performs better than known lower bounds (except perhaps by looking for one that violates conditions of the model upon which a lower bound is based!).

Thus, the theory of algorithms provides a way to classify algorithms according to their asymptotic performance. However, the very process of approximate analysis (“within a constant factor”) that extends the applicability of theoretical results often limits our ability to accurately predict the performance characteristics of any particular algorithm. More important, the theory of algorithms is usually based on worst-case analysis, which can be overly pessimistic and not as helpful in predicting actual performance as an average-case analysis. This is not relevant for algorithms like mergesort (where the running time is not so dependent on the input), but average-case analysis can help us discover that nonoptimal algorithms are sometimes faster in practice, as we will see. The theory of algorithms can help us to identify good algorithms, but then it is of interest to refine the analysis to be able to more intelligently compare and improve them. To do so, we need precise knowledge about the performance characteristics of the particular computer being used and mathematical techniques for accurately determining the frequency of execution of fundamental operations. In this book, we concentrate on such techniques.
1.3 Analysis of Algorithms. Though the analysis of sorting and mergesort that we considered in §1.2 demonstrates the intrinsic “difficulty” of the sorting problem, there are many important questions related to sorting (and to mergesort) that it does not address at all. How long might an implementation of mergesort be expected to run on a particular computer? How might its running time compare to other O(N log N) methods? (There are many.) How does it compare to sorting methods that are fast on average, but perhaps not in the worst case? How does it compare to sorting methods that are not based on compares among elements? To answer such questions, a more detailed analysis is required. In this section we briefly describe the process of doing such an analysis.

To analyze an algorithm, we must first identify the resources of primary interest so that the detailed analysis may be properly focused. We describe the process in terms of studying the running time since it is the resource most relevant here. A complete analysis of the running time of an algorithm involves the following steps:

• Implement the algorithm completely.
• Determine the time required for each basic operation.
• Identify unknown quantities that can be used to describe the frequency of execution of the basic operations.
• Develop a realistic model for the input to the program.
• Analyze the unknown quantities, assuming the modeled input.
• Calculate the total running time by multiplying the time by the frequency for each operation, then adding all the products (summarized in the formula sketched below).
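The last of these steps amounts to a simple formula. As a sketch (the notation t_j and f_j(N) is ours, not from the text): if the program uses k basic operations, where the jth operation takes t_j time units and is executed f_j(N) times on an input of size N, then its total running time is

    T(N) = t_1·f_1(N) + t_2·f_2(N) + · · · + t_k·f_k(N).

The costs t_j depend on the machine and the compiler; the frequencies f_j(N) depend only on the algorithm and its input, and they are the “machine-independent” quantities that the mathematical analysis targets.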
The first step in the analysis is to carefully implement the algorithm on a particular computer. We reserve the term program to describe such an implementation. One algorithm corresponds to many programs. A particular implementation not only provides a concrete object to study, but also can give useful empirical data to aid in or to check the analysis. Presumably the implementation is designed to make efficient use of resources, but it is a mistake to overemphasize efficiency too early in the process. Indeed, a primary application for the analysis is to provide informed guidance toward better implementations.
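For concreteness, the following is a minimal sketch of such an implementation in Java: a standard top-down mergesort instrumented to count the compares it uses, the quantity that the analysis in §1.2 focuses on. The code and the class name are ours, for illustration only, not the implementation referred to in the text.

    // Sketch (our code): mergesort on an int array, instrumented to count
    // the compares it uses. Counting is isolated in less(), so the
    // instrumentation does not obscure the algorithm itself.
    public class MergeCount {
        static long compares = 0;                  // frequency of the basic operation

        static boolean less(int v, int w) { compares++; return v < w; }

        static void sort(int[] a, int[] aux, int lo, int hi) {
            if (hi <= lo) return;
            int mid = lo + (hi - lo) / 2;
            sort(a, aux, lo, mid);                 // sort the left half
            sort(a, aux, mid + 1, hi);             // sort the right half
            for (int k = lo; k <= hi; k++)         // merge the two halves
                aux[k] = a[k];
            int i = lo, j = mid + 1;
            for (int k = lo; k <= hi; k++) {
                if      (i > mid)                a[k] = aux[j++];
                else if (j > hi)                 a[k] = aux[i++];
                else if (less(aux[j], aux[i]))   a[k] = aux[j++];
                else                             a[k] = aux[i++];
            }
        }

        public static void main(String[] args) {
            int N = 1000000;
            int[] a = new int[N];
            java.util.Random rnd = new java.util.Random();
            for (int k = 0; k < N; k++) a[k] = rnd.nextInt();
            sort(a, new int[N], 0, N - 1);
            System.out.println(N + " items sorted using " + compares + " compares");
        }
    }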
The next step is to estimate the time required by each component instruction of the program. In principle and in practice, we can often do so with great precision, but the process is very dependent on the characteristics of the computer system being studied. Another approach is to simply run the program for small input sizes to “estimate” the values of the constants, or to do so indirectly in the aggregate, as described in Exercise 1.7. We do not consider this process in detail; rather we focus on the “machine-independent” parts of the analysis in this book.
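One common form of the aggregate approach is a doubling experiment: run the program at successive input sizes, each twice the previous, and examine the measured times and their ratios. The sketch below is ours and assumes the MergeCount sketch above (any implementation under study could be substituted); it is illustrative only, since careful timing on a modern system also requires attention to warmup, repetition, and similar effects.

    // Sketch of an aggregate ("doubling") timing experiment for estimating
    // constants empirically. Assumes the MergeCount sketch above.
    public class Doubling {
        static long timeTrial(int N) {             // time one random trial of size N
            int[] a = new int[N];
            java.util.Random rnd = new java.util.Random();
            for (int k = 0; k < N; k++) a[k] = rnd.nextInt();
            long start = System.nanoTime();
            MergeCount.sort(a, new int[N], 0, N - 1);
            return System.nanoTime() - start;
        }

        public static void main(String[] args) {
            long prev = timeTrial(125000);
            for (int N = 250000; N <= 8000000; N += N) {
                long t = timeTrial(N);
                // for an N log N method the ratio of successive times approaches 2
                System.out.printf("%8d %10.2f ms   ratio %.2f%n",
                                  N, t / 1e6, (double) t / prev);
                prev = t;
            }
        }
    }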
Indeed, to determine the total running time of the program, it is necessary to study the branching structure of the program in order to express the frequency of execution of the component instructions in terms of unknown mathematical quantities. If the values of these quantities are known, then we can derive the running time of the entire program simply by multiplying the frequency and time requirements of each component instruction and adding these products. Many programming environments have tools that can simplify this task. At the first level of analysis, we concentrate on quantities that have large frequency values or that correspond to large costs; in principle the analysis can be refined to produce a fully detailed answer. We often refer to the “cost” of an algorithm as shorthand for the “value of the quantity in question” when the context allows.

The next step is to model the input to the program, to form a basis for the mathematical analysis of the instruction frequencies. The values of the unknown frequencies are dependent on the input to the algorithm: the problem size (usually we name that N) is normally the primary parameter used to express our results, but the order or value of input data items ordinarily affects the running time as well. By “model,” we mean a precise description of typical inputs to the algorithm. For example, for sorting algorithms, it is normally convenient to assume that the inputs are randomly ordered and distinct, though the programs normally work even when the inputs are not distinct. Another possibility for sorting algorithms is to assume that the inputs are random numbers taken from a relatively large range. These two models can be shown to be nearly equivalent. Most often, we use the simplest available model of “random” inputs, which is often realistic. Several different models can be used for the same algorithm: one model might be chosen to make the analysis as simple as possible; another model might better reflect the actual situation in which the program is to be used.
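To make the first of these models concrete: one convenient way to generate an input that is “randomly ordered and distinct” is to take the keys 1 through N in order and apply a uniformly random shuffle. A minimal sketch (our code, for illustration):

    // Sketch of an input generator for the "randomly ordered and distinct"
    // model: the N distinct keys 1..N, rearranged by a Fisher-Yates shuffle
    // so that each of the N! orderings is equally likely.
    import java.util.Random;

    public class RandomPermutation {
        public static int[] generate(int N, Random rnd) {
            int[] a = new int[N];
            for (int k = 0; k < N; k++) a[k] = k + 1;     // distinct keys
            for (int k = N - 1; k > 0; k--) {             // shuffle
                int j = rnd.nextInt(k + 1);
                int t = a[k]; a[k] = a[j]; a[j] = t;
            }
            return a;
        }

        public static void main(String[] args) {
            for (int x : generate(10, new Random()))
                System.out.print(x + " ");
            System.out.println();
        }
    }

Under the second model mentioned above, the array would instead be filled with random numbers drawn from a relatively large range.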
The last step is to analyze the unknown quantities, assuming the modeled input. For average-case analysis, we analyze the quantities individually, then multiply the averages by instruction times and add them to find the running time of the whole program. For worst-case analysis, it is usually difficult to get an exact result for the whole program, so we can only derive an upper bound, by multiplying worst-case values of the individual quantities by instruction times and summing the results.

This general scenario can successfully provide exact models in many situations. Knuth’s books [17][18][19][20] are based on this precept. Unfortunately, the details in such an exact analysis are often daunting. Accordingly, we typically seek approximate models that we can use to estimate costs.

The first reason to approximate is that determining the cost details of all individual operations can be daunting in the context of the complex architectures and operating systems on modern computers. Accordingly, we typically study just a few quantities in the “inner loop” of our programs, implicitly hypothesizing that total cost is well estimated by analyzing just those quantities. Experienced programmers regularly “profile” their implementations to identify “bottlenecks,” which is a systematic way to identify such quantities. For example, we typically analyze compare-based sorting algorithms by just counting compares. Such an approach has the important side benefit that it is machine independent. Carefully analyzing the number of compares used by a sorting algorithm can enable us to predict performance on many different computers. Associated hypotheses are easily tested by experimentation, and we can refine them, in principle, when appropriate. For example, we might refine comparison-based models for sorting to include data movement, which may require taking caching effects into account.

Exercise 1.11 Run experiments on two different computers to test the hypothesis that the running time of mergesort divided by the number of compares that it uses approaches a constant as the problem size increases.
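As an illustration of such a refinement (the notation here is ours): one might posit that the running time of a sort is approximately

    α·C(N) + β·M(N),

where C(N) is the number of compares, M(N) is the number of data moves, and α and β are constants to be estimated on the computer at hand; the hypothesis tested in Exercise 1.11 corresponds to the simpler one-term model α·C(N).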
Approximation is also effective for mathematical models. The second reason to approximate is to avoid unnecessary complications in the mathematical formulae that we develop to describe the performance of algorithms. A major theme of this book is the development of classical approximation methods for this purpose, and we shall consider many examples. Beyond these, a major thrust of modern research in the analysis of algorithms is methods of developing mathematical analyses that are simple, sufficiently precise that they can be used to accurately predict performance and to compare algorithms, and able to be refined, in principle, to the precision needed for the application at hand. Such techniques primarily involve complex analysis and are fully developed in our book [10].