Acknowledgments

We especially acknowledge Ian Buck, the father of CUDA, and John Nickolls, the lead architect of the Tesla GPU computing architecture. Their teams created an excellent infrastructure for this course. Ashutosh Rege and the NVIDIA DevTech team contributed to the original slides and content used in the ECE498AL course. Bill Bean, Simon Green, Mark Harris, Manju Hedge, Nadeem Mohammad, Brent Oster, Peter Shirley, Eric Young, and Cyril Zeller provided review comments and corrections to the manuscripts. Nadeem Mohammad organized the NVIDIA review efforts and also helped to plan Chapter 11 and Appendix B. Calisa Cole helped with the cover. Nadeem's heroic efforts have been critical to the completion of this book.

We also thank Jensen Huang for providing a great amount of financial and human resources for developing the course. Tony Tamasi's team contributed heavily to the review and revision of the book chapters. Jensen also took the time to read the early drafts of the chapters and gave us valuable feedback. David Luebke has facilitated the GPU computing resources for the course. Jonah Alben has provided valuable insight. Michael Shebanow and Michael Garland have given guest lectures and contributed materials.

John Stone and Sam Stone in Illinois contributed much of the base material for the case study and OpenCL chapters. John Stratton and Chris Rodrigues contributed some of the base material for the computational thinking chapter. I-Jui "Ray" Sung, John Stratton, Xiao-Long Wu, and Nady Obeid contributed to the lab material and helped to revise the course material as they volunteered to serve as teaching assistants on top of their research. Laurie Talkington and James Hutchinson helped to dictate early lectures that served as the base for the first five chapters. Mike Showerman helped build two generations of GPU computing clusters for the course.
Jeremy Enos worked tirelessly to ensure that students have a stable, user-friendly GPU computing cluster to work on their lab assignments and projects.

We acknowledge Dick Blahut, who challenged us to create the course in Illinois. His constant reminder that we needed to write the book helped keep us going. Beth Katsinas arranged a meeting between Dick Blahut and NVIDIA Vice President Dan Vivoli. Through that gathering, Blahut was introduced to David and challenged him to come to Illinois and create the course with Wen-mei.

We also thank Thom Dunning of the University of Illinois and Sharon Glotzer of the University of Michigan, Co-Directors of the multi-university Virtual School of Computational Science and Engineering, for graciously
hosting the summer school version of the course. Trish Barker, Scott Lathrop, Umesh Thakkar, Tom Scavo, Andrew Schuh, and Beth McKown all helped organize the summer school. Robert Brunner, Klaus Schulten, Pratap Vanka, Brad Sutton, John Stone, Keith Thulborn, Michael Garland, Vlad Kindratenko, Naga Govindaraj, Yan Xu, Arron Shinn, and Justin Haldar contributed to the lectures and panel discussions at the summer school.

Nicolas Pinto tested the early versions of the first chapters in his MIT class and assembled an excellent set of feedback comments and corrections. Steve Lumetta and Sanjay Patel both taught versions of the course and gave us valuable feedback. John Owens graciously allowed us to use some of his slides. Tor Aamodt, Dan Connors, Tom Conte, Michael Giles, Nacho Navarro, and numerous other instructors and their students worldwide have provided us with valuable feedback. Michael Giles reviewed the semi-final draft chapters in detail and identified many typos and inconsistencies.

We especially thank our colleagues Kurt Akeley, Al Aho, Arvind, Dick Blahut, Randy Bryant, Bob Colwell, Ed Davidson, Mike Flynn, John Hennessy, Pat Hanrahan, Nick Holonyak, Dick Karp, Kurt Keutzer, Dave Liu, Dave Kuck, Yale Patt, David Patterson, Bob Rao, Burton Smith, Jim Smith, and Mateo Valero, who have taken the time to share their insight with us over the years.

We are humbled by the generosity and enthusiasm of all the great people who contributed to the course and the book.

David B. Kirk and Wen-mei W. Hwu
To Caroline, Rose, and Leo
To Sabrina, Amanda, Bryan, and Carissa
For enduring our absence while working on the course and the book
CHAPTER 1
Introduction

CHAPTER CONTENTS
1.1 GPUs as Parallel Computers
1.2 Architecture of a Modern GPU
1.3 Why More Speed or Parallelism?
1.4 Parallel Programming Languages and Models
1.5 Overarching Goals
1.6 Organization of the Book
References and Further Reading

INTRODUCTION

Microprocessors based on a single central processing unit (CPU), such as those in the Intel Pentium family and the AMD Opteron family, drove rapid performance increases and cost reductions in computer applications for more than two decades. These microprocessors brought giga (billion) floating-point operations per second (GFLOPS) to the desktop and hundreds of GFLOPS to cluster servers. This relentless drive of performance improvement has allowed application software to provide more functionality, have better user interfaces, and generate more useful results. The users, in turn, demand even more improvements once they become accustomed to them, creating a positive cycle for the computer industry.

During this drive, most software developers have relied on advances in hardware to increase the speed of their applications under the hood; the same software simply runs faster as each new generation of processors is introduced.
This drive, however, has slowed since 2003 due to energy-consumption and heat-dissipation issues that have limited the increase of the clock frequency and the level of productive activities that can be performed in each clock period within a single CPU. Virtually all microprocessor vendors have switched to models where multiple processing units, referred to as processor cores, are used in each chip to increase the