memory also offers different addressing modes, as well as data filtering, for some specific data formats (see Texture and Surface Memory).

The global, constant, and texture memory spaces are persistent across kernel launches by the same application.

Figure 7 Memory Hierarchy (threads have per-thread local memory, thread blocks have per-block shared memory, and all grids share global memory)
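As an illustration of this hierarchy, the following device-side sketch (not taken from this guide; the kernel name, the 256-thread block size, and the partial-sum reduction are illustrative assumptions) shows where per-thread local variables, per-block shared memory, constant memory, and global memory appear in a kernel.

// Minimal sketch: each memory space is annotated where it is used.
// Assumes the kernel is launched with 256-thread blocks and that the host
// has allocated 'in' and 'out' in global memory and initialized 'scale'
// with cudaMemcpyToSymbol().
__constant__ float scale;                 // constant memory, persistent across kernel launches

__global__ void scaleAndReduce(const float* in, float* out, int n)
{
    __shared__ float tile[256];           // per-block shared memory

    int i = blockIdx.x * blockDim.x + threadIdx.x;   // per-thread local variable
    float v = (i < n) ? in[i] * scale : 0.0f;        // 'in' and 'out' reside in global memory

    tile[threadIdx.x] = v;
    __syncthreads();

    if (threadIdx.x == 0) {
        // One thread per block writes a partial sum back to global memory.
        float sum = 0.0f;
        for (int t = 0; t < blockDim.x; ++t)
            sum += tile[t];
        out[blockIdx.x] = sum;
    }
}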
2.4. Heterogeneous Programming

As illustrated by Figure 8, the CUDA programming model assumes that the CUDA threads execute on a physically separate device that operates as a coprocessor to the host running the C program. This is the case, for example, when the kernels execute on a GPU and the rest of the C program executes on a CPU.

The CUDA programming model also assumes that both the host and the device maintain their own separate memory spaces in DRAM, referred to as host memory and device memory, respectively. Therefore, a program manages the global, constant, and texture memory spaces visible to kernels through calls to the CUDA runtime (described in Programming Interface). This includes device memory allocation and deallocation as well as data transfer between host and device memory.

Unified Memory provides managed memory to bridge the host and device memory spaces. Managed memory is accessible from all CPUs and GPUs in the system as a single, coherent memory image with a common address space. This capability enables oversubscription of device memory and can greatly simplify the task of porting applications by eliminating the need to explicitly mirror data on host and device. See Unified Memory Programming for an introduction to Unified Memory.
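The explicit host-side management described above (device memory allocation and deallocation, plus data transfer between host and device memory) can be sketched as follows. The vecAdd kernel, the array size N, and the launch configuration are illustrative assumptions rather than code from this guide, and error checking is omitted for brevity.

#include <cstdlib>
#include <cuda_runtime.h>

// Illustrative kernel: element-wise addition of two vectors in global memory.
__global__ void vecAdd(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main()
{
    const int N = 1 << 20;                    // illustrative problem size
    size_t bytes = N * sizeof(float);

    // Host memory.
    float* h_a = (float*)malloc(bytes);
    float* h_b = (float*)malloc(bytes);
    float* h_c = (float*)malloc(bytes);
    for (int i = 0; i < N; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Device memory allocation.
    float *d_a, *d_b, *d_c;
    cudaMalloc((void**)&d_a, bytes);
    cudaMalloc((void**)&d_b, bytes);
    cudaMalloc((void**)&d_c, bytes);

    // Host-to-device transfers.
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Kernel launch on the device.
    int threadsPerBlock = 256;
    int blocksPerGrid = (N + threadsPerBlock - 1) / threadsPerBlock;
    vecAdd<<<blocksPerGrid, threadsPerBlock>>>(d_a, d_b, d_c, N);

    // Device-to-host transfer of the result.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

    // Deallocation.
    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}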
Serial code executes on the host while parallel code executes on the device.

Figure 8 Heterogeneous Programming (sequential execution of a C program alternating between serial code on the host and parallel kernel launches, Kernel0<<<...>>>() on Grid 0 and Kernel1<<<...>>>() on Grid 1, on the device)
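As an alternative to the explicit copies sketched above, the Unified Memory mechanism described earlier in this section lets host and device code share a single managed allocation. The following is a minimal sketch under illustrative assumptions (the increment kernel and the array size are not from this guide); error checking is again omitted.

#include <cuda_runtime.h>

// Illustrative kernel operating directly on a managed allocation.
__global__ void increment(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

int main()
{
    const int N = 1 << 20;
    float* data;

    // One managed allocation, visible from both host and device code.
    cudaMallocManaged((void**)&data, N * sizeof(float));
    for (int i = 0; i < N; ++i) data[i] = 0.0f;    // initialized directly on the host

    increment<<<(N + 255) / 256, 256>>>(data, N);
    cudaDeviceSynchronize();                        // wait before the host reads the results

    cudaFree(data);                                 // managed memory is freed with cudaFree
    return 0;
}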
2.5. Compute Capability

The compute capability of a device is represented by a version number, also sometimes called its "SM version". This version number identifies the features supported by the GPU hardware and is used by applications at runtime to determine which hardware features and/or instructions are available on the present GPU.

The compute capability comprises a major revision number X and a minor revision number Y and is denoted by X.Y.

Devices with the same major revision number are of the same core architecture. The major revision number is 7 for devices based on the Volta architecture, 6 for devices based on the Pascal architecture, 5 for devices based on the Maxwell architecture, 3 for devices based on the Kepler architecture, 2 for devices based on the Fermi architecture, and 1 for devices based on the Tesla architecture.

The minor revision number corresponds to an incremental improvement to the core architecture, possibly including new features.

CUDA-Enabled GPUs lists all CUDA-enabled devices along with their compute capability. Compute Capabilities gives the technical specifications of each compute capability.

The compute capability version of a particular GPU should not be confused with the CUDA version (e.g., CUDA 7.5, CUDA 8, CUDA 9), which is the version of the CUDA software platform. The CUDA platform is used by application developers to create applications that run on many generations of GPU architectures, including future GPU architectures yet to be invented. While new versions of the CUDA platform often add native support for a new GPU architecture by supporting the compute capability version of that architecture, new versions of the CUDA platform typically also include software features that are independent of hardware generation.

The Tesla and Fermi architectures are no longer supported starting with CUDA 7.0 and CUDA 9.0, respectively.
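For instance, an application can inspect the compute capability of the present GPU at runtime with cudaGetDeviceProperties; the fields prop.major and prop.minor correspond to X and Y. The device index, the printed message, and the feature check below are illustrative, and error handling is omitted.

#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int device = 0;                                 // illustrative: query the first device
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, device);

    // prop.major and prop.minor form the X.Y compute capability.
    printf("Device %d: %s, compute capability %d.%d\n",
           device, prop.name, prop.major, prop.minor);

    if (prop.major >= 6) {
        // Example of a runtime decision keyed on compute capability:
        // enable a code path that requires Pascal (6.x) or newer.
        printf("Pascal-or-newer code path available\n");
    }
    return 0;
}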
Chapter 3. PROGRAMMING INTERFACE

CUDA C provides a simple path for users familiar with the C programming language to easily write programs for execution by the device. It consists of a minimal set of extensions to the C language and a runtime library.

The core language extensions have been introduced in Programming Model. They allow programmers to define a kernel as a C function and use some new syntax to specify the grid and block dimension each time the function is called. A complete description of all extensions can be found in C Language Extensions. Any source file that contains some of these extensions must be compiled with nvcc as outlined in Compilation with NVCC.

The runtime is introduced in Compilation Workflow. It provides C functions that execute on the host to allocate and deallocate device memory, transfer data between host memory and device memory, manage systems with multiple devices, etc. A complete description of the runtime can be found in the CUDA reference manual.

The runtime is built on top of a lower-level C API, the CUDA driver API, which is also accessible by the application. The driver API provides an additional level of control by exposing lower-level concepts such as CUDA contexts - the analogue of host processes for the device - and CUDA modules - the analogue of dynamically loaded libraries for the device. Most applications do not use the driver API, as they do not need this additional level of control; when using the runtime, context and module management are implicit, resulting in more concise code. The driver API is introduced in Driver API and fully described in the reference manual.

3.1. Compilation with NVCC

Kernels can be written using the CUDA instruction set architecture, called PTX, which is described in the PTX reference manual. It is, however, usually more effective to use a high-level programming language such as C. In both cases, kernels must be compiled into binary code by nvcc to execute on the device.

nvcc is a compiler driver that simplifies the process of compiling C or PTX code: it provides simple and familiar command line options and executes them by invoking the collection of tools that implement the different compilation stages. This section gives