Acknowledgments

In the course of working on this book, I've received considerable help from many individuals. Among them I'd like to thank the reviewers who read and commented on the initial proposal: Fikret Ercal (Missouri University of Science and Technology), Dan Harvey (Southern Oregon University), Joel Hollingsworth (Elon University), Jens Mache (Lewis and Clark College), Don McLaughlin (West Virginia University), Manish Parashar (Rutgers University), Charlie Peck (Earlham College), Stephen C. Renk (North Central College), Rolfe Josef Sassenfeld (The University of Texas at El Paso), Joseph Sloan (Wofford College), Michela Taufer (University of Delaware), Pearl Wang (George Mason University), Bob Weems (University of Texas at Arlington), and Cheng-Zhong Xu (Wayne State University).

I'm also deeply grateful to the following individuals for their reviews of various chapters of the book: Duncan Buell (University of South Carolina), Matthias Gobbert (University of Maryland, Baltimore County), Krishna Kavi (University of North Texas), Hong Lin (University of Houston–Downtown), Kathy Liszka (University of Akron), Leigh Little (The State University of New York), Xinlian Liu (Hood College), Henry Tufo (University of Colorado at Boulder), Andrew Sloss (Consultant Engineer, ARM), and Gengbin Zheng (University of Illinois). Their comments and suggestions have made the book immeasurably better. Of course, I'm solely responsible for remaining errors and omissions.

Kathy Liszka is also preparing slides that can be used by faculty who adopt the text, and a former student, Jinyoung Choi, is working on preparing a solutions manual. Thanks to both of them.

The staff of Morgan Kaufmann has been very helpful throughout this project. I'm especially grateful to the developmental editor, Nate McFadden. He gave me much valuable advice, and he did a terrific job arranging for the reviews. He's also been tremendously patient with all the problems I've encountered over the past few years. Thanks also to Marilyn Rash and Megan Guiney, who have been very prompt and efficient during the production process.

My colleagues in the computer science and mathematics departments at USF have been extremely helpful during my work on the book. I'd like to single out Professor Gregory Benson for particular thanks: his understanding of parallel computing, especially Pthreads and semaphores, has been an invaluable resource for me. I'm also very grateful to our system administrators, Alexey Fedosov and Colin Bean. They've patiently and efficiently dealt with all of the "emergencies" that cropped up while I was working on programs for the book.

I would never have been able to finish this book without the encouragement and moral support of my friends Holly Cohn, John Dean, and Robert Miller. They helped me through some very difficult times, and I'll be eternally grateful to them.

My biggest debt is to my students. They showed me what was too easy, and what was far too difficult. In short, they taught me how to teach parallel computing. My deepest thanks to all of them.
About the Author

Peter Pacheco received a PhD in mathematics from Florida State University. After completing graduate school, he became one of the first professors in UCLA's "Program in Computing," which teaches basic computer science to students at the College of Letters and Sciences there. Since leaving UCLA, he has been on the faculty of the University of San Francisco. At USF Peter has served as chair of the computer science department and is currently chair of the mathematics department.

His research is in parallel scientific computing. He has worked on the development of parallel software for circuit simulation, speech recognition, and the simulation of large networks of biologically accurate neurons. Peter has been teaching parallel computing at both the undergraduate and graduate levels for nearly twenty years. He is the author of Parallel Programming with MPI, published by Morgan Kaufmann Publishers.
CHAPTER 1
Why Parallel Computing?

From 1986 to 2002 the performance of microprocessors increased, on average, 50% per year [27]. This unprecedented increase meant that users and software developers could often simply wait for the next generation of microprocessors in order to obtain increased performance from an application program. Since 2002, however, single-processor performance improvement has slowed to about 20% per year. This difference is dramatic: at 50% per year, performance will increase by almost a factor of 60 in 10 years, while at 20%, it will only increase by about a factor of 6.

Furthermore, this difference in performance increase has been associated with a dramatic change in processor design. By 2005, most of the major manufacturers of microprocessors had decided that the road to rapidly increasing performance lay in the direction of parallelism. Rather than trying to continue to develop ever-faster monolithic processors, manufacturers started putting multiple complete processors on a single integrated circuit.

This change has a very important consequence for software developers: simply adding more processors will not magically improve the performance of the vast majority of serial programs, that is, programs that were written to run on a single processor. Such programs are unaware of the existence of multiple processors, and the performance of such a program on a system with multiple processors will be effectively the same as its performance on a single processor of the multiprocessor system.

All of this raises a number of questions:

1. Why do we care? Aren't single processor systems fast enough? After all, 20% per year is still a pretty significant performance improvement.
2. Why can't microprocessor manufacturers continue to develop much faster single processor systems? Why build parallel systems? Why build systems with multiple processors?
3. Why can't we write programs that will automatically convert serial programs into parallel programs, that is, programs that take advantage of the presence of multiple processors?

Let's take a brief look at each of these questions. Keep in mind, though, that some of the answers aren't carved in stone. For example, 20% per year may be more than adequate for many applications.
1.1 WHY WE NEED EVER-INCREASING PERFORMANCE

The vast increases in computational power that we've been enjoying for decades now have been at the heart of many of the most dramatic advances in fields as diverse as science, the Internet, and entertainment. For example, decoding the human genome, ever more accurate medical imaging, astonishingly fast and accurate Web searches, and ever more realistic computer games would all have been impossible without these increases. Indeed, more recent increases in computational power would have been difficult, if not impossible, without earlier increases. But we can never rest on our laurels. As our computational power increases, the number of problems that we can seriously consider solving also increases. The following are a few examples:

- Climate modeling. In order to better understand climate change, we need far more accurate computer models, models that include interactions between the atmosphere, the oceans, solid land, and the ice caps at the poles. We also need to be able to make detailed studies of how various interventions might affect the global climate.
- Protein folding. It's believed that misfolded proteins may be involved in diseases such as Huntington's, Parkinson's, and Alzheimer's, but our ability to study configurations of complex molecules such as proteins is severely limited by our current computational power.
- Drug discovery. There are many ways in which increased computational power can be used in research into new medical treatments. For example, there are many drugs that are effective in treating a relatively small fraction of those suffering from some disease. It's possible that we can devise alternative treatments by careful analysis of the genomes of the individuals for whom the known treatment is ineffective. This, however, will involve extensive computational analysis of genomes.
- Energy research. Increased computational power will make it possible to program much more detailed models of technologies such as wind turbines, solar cells, and batteries. These programs may provide the information needed to construct far more efficient clean energy sources.
- Data analysis. We generate tremendous amounts of data. By some estimates, the quantity of data stored worldwide doubles every two years [28], but the vast majority of it is largely useless unless it's analyzed. As an example, knowing the sequence of nucleotides in human DNA is, by itself, of little use. Understanding how this sequence affects development and how it can cause disease requires extensive analysis. In addition to genomics, vast quantities of data are generated by particle colliders such as the Large Hadron Collider at CERN, medical imaging, astronomical research, and Web search engines, to name a few.

These and a host of other problems won't be solved without vast increases in computational power.
1.2 WHY WE'RE BUILDING PARALLEL SYSTEMS

Much of the tremendous increase in single processor performance has been driven by the ever-increasing density of transistors (the electronic switches) on integrated circuits. As the size of transistors decreases, their speed can be increased, and the overall speed of the integrated circuit can be increased. However, as the speed of transistors increases, their power consumption also increases. Most of this power is dissipated as heat, and when an integrated circuit gets too hot, it becomes unreliable. In the first decade of the twenty-first century, air-cooled integrated circuits are reaching the limits of their ability to dissipate heat [26].

Therefore, it is becoming impossible to continue to increase the speed of integrated circuits. However, the increase in transistor density can continue, at least for a while. Also, given the potential of computing to improve our existence, there is an almost moral imperative to continue to increase computational power. Finally, if the integrated circuit industry doesn't continue to bring out new and better products, it will effectively cease to exist.

How, then, can we exploit the continuing increase in transistor density? The answer is parallelism. Rather than building ever-faster, more complex, monolithic processors, the industry has decided to put multiple, relatively simple, complete processors on a single chip. Such integrated circuits are called multicore processors, and core has become synonymous with central processing unit, or CPU. In this setting a conventional processor with one CPU is often called a single-core system.

1.3 WHY WE NEED TO WRITE PARALLEL PROGRAMS

Most programs that have been written for conventional, single-core systems cannot exploit the presence of multiple cores. We can run multiple instances of a program on a multicore system, but this is often of little help. For example, being able to run multiple instances of our favorite game program isn't really what we want; we want the program to run faster with more realistic graphics. In order to do this, we need to either rewrite our serial programs so that they're parallel, so that they can make use of multiple cores, or write translation programs, that is, programs that will automatically convert serial programs into parallel programs. The bad news is that researchers have had very limited success writing programs that convert serial programs in languages such as C and C++ into parallel programs.

This isn't terribly surprising. While we can write programs that recognize common constructs in serial programs, and automatically translate these constructs into efficient parallel constructs, the sequence of parallel constructs may be terribly inefficient. For example, we can view the multiplication of two n x n matrices as a sequence of dot products, but parallelizing a matrix multiplication as a sequence of parallel dot products is likely to be very slow on many systems.
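To make the dot-product view concrete, the following is a minimal C sketch (not taken from the book; the fixed size N and the variable names are illustrative assumptions). Each entry C[i][j] is the dot product of row i of A with column j of B. A translator that parallelized each of these n*n short dot products separately would pay thread creation and synchronization costs n*n times, which suggests why such an approach tends to be slow.

    #include <stdio.h>

    #define N 4   /* illustrative matrix dimension */

    int main(void) {
       double A[N][N], B[N][N], C[N][N];
       int i, j, k;

       /* Fill A and B with some arbitrary sample values. */
       for (i = 0; i < N; i++)
          for (j = 0; j < N; j++) {
             A[i][j] = i + j;
             B[i][j] = i - j;
          }

       /* Serial matrix multiplication: the innermost loop computes the
          dot product of row i of A with column j of B, so the whole
          computation is a sequence of n*n dot products. */
       for (i = 0; i < N; i++)
          for (j = 0; j < N; j++) {
             C[i][j] = 0.0;
             for (k = 0; k < N; k++)
                C[i][j] += A[i][k] * B[k][j];
          }

       printf("C[0][0] = %f\n", C[0][0]);
       return 0;
    }

A more effective parallelization would typically distribute whole rows or blocks of C among the cores, so that each core does a substantial amount of work between coordination points rather than coordinating once per dot product.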