of compares required by this "median-of-three" quicksort is described by the recurrence

$$C_N = N + 1 + \sum_{1\le k\le N}\frac{(N-k)(k-1)}{\binom{N}{3}}\,\bigl(C_{k-1}+C_{N-k}\bigr) \qquad \text{for } N>3 \tag{4}$$

where $\binom{N}{3}$ is the binomial coefficient that counts the number of ways to choose 3 out of $N$ items. This is true because the probability that the $k$th smallest element is the partitioning element is now $(N-k)(k-1)/\binom{N}{3}$ (as opposed to $1/N$ for regular quicksort). We would like to be able to solve recurrences of this nature so that we can determine how large a sample to use and when to switch to insertion sort. However, such recurrences require more sophisticated techniques than the simple ones used so far. In Chapters 2 and 3, we will see methods for developing precise solutions to such recurrences, which allow us to determine the best values for parameters such as the sample size and the cutoff for small subarrays. Extensive studies along these lines have led to the conclusion that median-of-three quicksort with a cutoff point in the range 10 to 20 achieves close to optimal performance for typical implementations.

Radix-exchange sort. Another variant of quicksort involves taking advantage of the fact that the keys may be viewed as binary strings. Rather than comparing against a key from the file for partitioning, we partition the file so that all keys with a leading 0 bit precede all those with a leading 1 bit. Then these subarrays can be independently subdivided in the same way using the second bit, and so forth. This variation is referred to as "radix-exchange sort" or "radix quicksort." How does this variation compare with the basic algorithm? To answer this question, we first have to note that a different mathematical model is required, since keys composed of random bits are essentially different from random permutations. The "random bitstring" model is perhaps more realistic, as it reflects the actual representation, but the models can be proved to be roughly equivalent. We will discuss this issue in more detail in Chapter 8. Using a similar argument to the one given above, we can show that the average number of bit compares required by this method is described by the recurrence

$$C_N = N + \frac{1}{2^N}\sum_{k}\binom{N}{k}\bigl(C_k + C_{N-k}\bigr) \qquad \text{for } N>1 \text{ with } C_0 = C_1 = 0.$$
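Before developing the analytic machinery of later chapters, we can at least evaluate such a recurrence numerically. The sketch below (not from the text; the function name is mine) iterates the bit-compare recurrence just given, taking the unlabeled sum to run over 0 ≤ k ≤ N, the natural reading since the left subfile can have any size from 0 to N. Because the k = 0 and k = N terms each contain $C_N$ itself, they are moved to the left-hand side before solving for $C_N$. The comparison with N lg N is included only as a rough point of reference.

```python
from math import comb, log2

def radix_exchange_compares(max_n):
    """Iterate C_N = N + 2^{-N} * sum_k binom(N,k)(C_k + C_{N-k}) with C_0 = C_1 = 0.

    The k = 0 and k = N terms of the sum each contribute C_N to the right-hand
    side, so they are moved to the left before solving; the remaining factor
    1 - 2^(1-N) is positive for N > 1.
    """
    c = [0.0] * (max_n + 1)            # c[0] = c[1] = 0 by definition
    for n in range(2, max_n + 1):
        s = sum(comb(n, k) * (c[k] + c[n - k]) for k in range(1, n))
        c[n] = (n + s / 2**n) / (1.0 - 2.0**(1 - n))
    return c

if __name__ == "__main__":
    c = radix_exchange_compares(100)
    for n in (10, 50, 100):
        print(n, round(c[n], 1), round(n * log2(n), 1))   # recurrence value vs. N lg N
```

Numerical iteration like this produces values, but not the concise characterization we are after; for that, the recurrence has to be solved analytically.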
This turns out to be a rather more difficult recurrence to solve than the one given earlier; we will see in Chapter 3 how generating functions can be used to transform the recurrence into an explicit formula for $C_N$, and in Chapters 4 and 8, we will see how to develop an approximate solution.

One limitation to the applicability of this kind of analysis is that all of the preceding recurrence relations depend on the "randomness preservation" property of the algorithm: if the original file is randomly ordered, it can be shown that the subarrays after partitioning are also randomly ordered. The implementor is not so restricted, and many widely used variants of the algorithm do not have this property. Such variants appear to be extremely difficult to analyze. Fortunately (from the point of view of the analyst), empirical studies show that they also perform poorly. Thus, though it has not been analytically quantified, the requirement for randomness preservation seems to produce more elegant and efficient quicksort implementations. More important, the versions that preserve randomness do admit to performance improvements that can be fully quantified mathematically, as described earlier.

Mathematical analysis has played an important role in the development of practical variants of quicksort, and we will see that there is no shortage of other problems to consider where detailed mathematical analysis is an important part of the algorithm design process.

1.6 Asymptotic Approximations. The derivation of the average running time of quicksort given earlier yields an exact result, but we also gave a more concise approximate expression in terms of well-known functions that still can be used to compute accurate numerical estimates. As we will see, it is often the case that an exact result is not available, or at least an approximation is far easier to derive and interpret. Ideally, our goal in the analysis of an algorithm should be to derive exact results; from a pragmatic point of view, it is perhaps more in line with our general goal of being able to make useful performance predictions to strive to derive concise but precise approximate answers.

To do so, we will need to use classical techniques for manipulating such approximations. In Chapter 4, we will examine the Euler-Maclaurin summation formula, which provides a way to estimate sums with integrals. Thus, we can approximate the harmonic numbers by the calculation

$$H_N = \sum_{1\le k\le N}\frac{1}{k} \;\approx\; \int_1^N \frac{dx}{x} \;=\; \ln N.$$
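As a quick numerical check (a sketch, not from the text), the next few lines compute $H_N$ directly and compare it with $\ln N$; the difference does not vanish but settles toward a constant near 0.5772, which the following paragraph identifies.

```python
from math import log

def harmonic(n):
    # H_n = 1 + 1/2 + ... + 1/n, summed directly
    return sum(1.0 / k for k in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    h = harmonic(n)
    print(n, round(h, 6), round(log(n), 6), round(h - log(n), 6))
```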
But we can be much more precise about the meaning of $\approx$, and we can conclude (for example) that $H_N = \ln N + \gamma + 1/(2N) + O(1/N^2)$ where $\gamma = .57721\cdots$ is a constant known in analysis as Euler's constant. Though the constants implicit in the $O$-notation are not specified, this formula provides a way to estimate the value of $H_N$ with improving accuracy as $N$ increases. Moreover, if we want even better accuracy, we can derive a formula for $H_N$ that is accurate to within $O(N^{-3})$ or indeed to within $O(N^{-k})$ for any constant $k$. Such approximations, called asymptotic expansions, are at the heart of the analysis of algorithms, and are the subject of Chapter 4.

The use of asymptotic expansions may be viewed as a compromise between the ideal goal of providing an exact result and the practical requirement of providing a concise approximation. It turns out that we are normally in the situation of, on the one hand, having the ability to derive a more accurate expression if desired, but, on the other hand, not having the desire, because expansions with only a few terms (like the one for $H_N$ above) allow us to compute answers to within several decimal places. We typically drop back to using the $\approx$ notation to summarize results without naming irrational constants, as, for example, in Theorem 1.3.

Moreover, exact results and asymptotic approximations are both subject to inaccuracies inherent in the probabilistic model (usually an idealization of reality) and to stochastic fluctuations. Table 1.1 shows exact, approximate, and empirical values for the number of compares used by quicksort on random files of various sizes. The exact and approximate values are computed from the formulae given in Theorem 1.3; the "empirical" is a measured average, taken over 100 files consisting of random positive integers less than $10^6$; this tests not only the asymptotic approximation that we have discussed, but also the "approximation" inherent in our use of the random permutation model, ignoring equal keys. The analysis of quicksort when equal keys are present is treated in Sedgewick [28].

Exercise 1.20 How many keys in a file of $10^4$ random integers less than $10^6$ are likely to be equal to some other key in the file? Run simulations, or do a mathematical analysis (with the help of a system for mathematical calculations), or do both.

Exercise 1.21 Experiment with files consisting of random positive integers less than $M$ for $M$ = 10,000, 1000, 100, and other values. Compare the performance of quicksort on such files with its performance on random permutations of the same size. Characterize situations where the random permutation model is inaccurate.
Exercise 1.22 Discuss the idea of having a table similar to Table 1.1 for mergesort.

In the theory of algorithms, O-notation is used to suppress detail of all sorts: the statement that mergesort requires $O(N\log N)$ compares hides everything but the most fundamental characteristics of the algorithm, implementation, and computer. In the analysis of algorithms, asymptotic expansions provide us with a controlled way to suppress irrelevant details, while preserving the most important information, especially the constant factors involved. The most powerful and general analytic tools produce asymptotic expansions directly, thus often providing simple direct derivations of concise but accurate expressions describing properties of algorithms. We are sometimes able to use asymptotic estimates to provide more accurate descriptions of program performance than might otherwise be available.

  file size    exact solution    approximate    empirical

     10,000           175,771        175,746      176,354
     20,000           379,250        379,219      374,746
     30,000           593,188        593,157      583,473
     40,000           813,921        813,890      794,560
     50,000         1,039,713      1,039,677    1,010,657
     60,000         1,269,564      1,269,492    1,231,246
     70,000         1,502,729      1,502,655    1,451,576
     80,000         1,738,777      1,738,685    1,672,616
     90,000         1,977,300      1,977,221    1,901,726
    100,000         2,218,033      2,217,985    2,126,160

Table 1.1  Average number of compares used by quicksort
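To get a sense of how the "empirical" column of a table like Table 1.1 might be produced (and as a starting point for experiments like those in Exercises 1.20 and 1.21), here is a small simulation sketch. It is not from the text: the function names are mine, the partitioning code is one standard textbook-style scheme, and exact compare counts depend on such implementation details, so the numbers should be expected to track the table only roughly.

```python
import random

def quicksort_compares(a):
    """Sort list a in place and return the number of key compares performed.

    Uses plain quicksort with the first element as the partitioning element
    and two inward scans; this is one standard scheme, chosen for clarity.
    """
    compares = 0

    def sort(lo, hi):
        nonlocal compares
        if hi <= lo:
            return
        v = a[lo]                      # partitioning element
        i, j = lo, hi + 1
        while True:
            i += 1
            while i <= hi:             # scan right for an element >= v
                compares += 1
                if a[i] >= v:
                    break
                i += 1
            j -= 1
            while j > lo:              # scan left for an element <= v
                compares += 1
                if a[j] <= v:
                    break
                j -= 1
            if i >= j:
                break
            a[i], a[j] = a[j], a[i]
        a[lo], a[j] = a[j], a[lo]      # put the partitioning element into position
        sort(lo, j - 1)
        sort(j + 1, hi)

    sort(0, len(a) - 1)
    return compares

def average_compares(n, m, trials):
    # average compare count over `trials` files of n random integers less than m
    total = 0
    for _ in range(trials):
        total += quicksort_compares([random.randrange(m) for _ in range(n)])
    return total / trials

if __name__ == "__main__":
    for n in (10_000, 20_000):
        print(n, average_compares(n, 10**6, trials=20))
```

Increasing the number of trials and the file sizes smooths out the stochastic fluctuations mentioned above, at the cost of longer running times.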
1.7 Distributions. In general, probability theory tells us that other facts about the distribution $\Pi_{Nk}$ of costs are also relevant to our understanding of performance characteristics of an algorithm. Fortunately, for virtually all of the examples that we study in the analysis of algorithms, it turns out that knowing an asymptotic estimate for the average is enough to be able to make reliable predictions. We review a few basic ideas here. Readers not familiar with probability theory are referred to any standard text, for example [9].

The full distribution for the number of compares used by quicksort for small $N$ is shown in Figure 1.2. For each value of $N$, the points $C_{Nk}/N!$ are plotted: the proportion of the inputs for which quicksort uses $k$ compares. Each curve, being a full probability distribution, has area 1. The curves move to the right, since the average $2N\ln N + O(N)$ increases with $N$. A slightly different view of the same data is shown in Figure 1.3, where the horizontal axes for each curve are scaled to put the mean approximately at the center and shifted slightly to separate the curves. This illustrates that the distribution converges to a "limiting distribution."

For many of the problems that we study in this book, not only do limiting distributions like this exist, but also we are able to precisely characterize them. For many other problems, including quicksort, that is a significant challenge. However, it is very clear that the distribution is concentrated near

Figure 1.2  Distributions for compares in quicksort, 15 ≤ N ≤ 50
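Curves like those in Figure 1.2 can be reproduced numerically from the random permutation model itself: partitioning a file of size N costs N + 1 compares (as in the recurrences above) and leaves two independent random subfiles of sizes j − 1 and N − j, with each j from 1 to N equally likely. The sketch below is not from the text; the function name and the small-file convention (files of size 0 or 1 cost no compares) are my assumptions. It builds the full distribution of compare counts for modest N by convolving subfile distributions, giving the probabilities $C_{Nk}/N!$ that are plotted.

```python
from collections import defaultdict

def quicksort_compare_distribution(max_n):
    """dist[n][k] = probability that quicksort uses exactly k compares on a
    random permutation of size n, under the model: partitioning costs n + 1
    compares and leaves independent random subfiles of sizes j-1 and n-j,
    each j in 1..n equally likely; files of size 0 or 1 are assumed to cost
    no compares."""
    dist = [defaultdict(float) for _ in range(max_n + 1)]
    dist[0][0] = 1.0
    if max_n >= 1:
        dist[1][0] = 1.0
    for n in range(2, max_n + 1):
        for j in range(1, n + 1):          # rank of the partitioning element
            left, right = dist[j - 1], dist[n - j]
            for a, pa in left.items():
                for b, pb in right.items():
                    dist[n][n + 1 + a + b] += pa * pb / n
    return dist

if __name__ == "__main__":
    d = quicksort_compare_distribution(15)
    mean = sum(k * p for k, p in d[15].items())
    print("mean compares for N = 15:", round(mean, 3))
    for k in sorted(d[15])[:5]:            # a few points of the distribution
        print(k, round(d[15][k], 6))
```

Summing k times the probability over the computed distribution recovers the average number of compares, and plotting the distributions for several values of N shows the drift of the curves to the right as N grows.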