Shanghai Jiao Tong University
CS427 Multicore Architecture and Parallel Computing
Lecture 7: CUDA
Prof. Xiaoyao Liang
2012/10/15
CUDA
• "Compute Unified Device Architecture"
  ➢ General purpose programming model
  ➢ User kicks off batches of threads on the GPU
• Targeted software stack
  ➢ Compute oriented drivers, language, and tools
• Driver for loading computation programs into GPU
  ➢ Standalone driver, optimized for computation
  ➢ Interface designed for compute: graphics-free API
  ➢ Data sharing with OpenGL buffer objects
  ➢ Guaranteed maximum download & readback speeds
  ➢ Explicit GPU memory management
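The points "user kicks off batches of threads on the GPU" and "explicit GPU memory management" can be illustrated with a minimal sketch. It assumes the CUDA runtime API (`cuda_runtime.h`), the `nvcc` compiler, and a CUDA-capable GPU. The names `vec_add` and `run_vec_add` are illustrative and do not come from the lecture.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Kernel: each GPU thread adds one pair of elements.
__global__ void vec_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) c[i] = a[i] + b[i];                  // guard the last, partial block
}

// Hypothetical helper (not from the lecture): true if the GPU result is correct.
bool run_vec_add(int n) {
    size_t bytes = (size_t)n * sizeof(float);
    float* h_a = (float*)malloc(bytes);
    float* h_b = (float*)malloc(bytes);
    float* h_c = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = (float)i; h_b[i] = 2.0f * i; }

    // Explicit GPU memory management: the program allocates device buffers itself.
    float *d_a = NULL, *d_b = NULL, *d_c = NULL;
    bool ok = cudaMalloc((void**)&d_a, bytes) == cudaSuccess &&
              cudaMalloc((void**)&d_b, bytes) == cudaSuccess &&
              cudaMalloc((void**)&d_c, bytes) == cudaSuccess;

    if (ok) {
        // Download inputs to the GPU.
        cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

        // Kick off a batch of threads: enough 256-thread blocks to cover n.
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        vec_add<<<blocks, threads>>>(d_a, d_b, d_c, n);

        // Read the result back; this copy waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    for (int i = 0; ok && i < n; ++i) ok = (h_c[i] == 3.0f * i);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);  // cudaFree(NULL) is a no-op
    free(h_a); free(h_b); free(h_c);
    return ok;
}

int main() {
    bool ok = run_vec_add(1000);
    printf("%s\n", ok ? "vec_add OK" : "vec_add FAILED");
    return ok ? 0 : 1;
}
```

Saved under a file name such as `vec_add.cu` (an assumed name), this would typically be built with `nvcc vec_add.cu -o vec_add`. No graphics API appears anywhere in the program, which is the sense in which the interface is graphics-free.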
GPU Location
[Figure: system block diagram. Labels: CPU, FSB, Northbridge, AGP (graphics), RAM, Southbridge]
GPU vs. CPU
[Figure: CPU and GPU layouts side by side. Labels: Control, ALU, Cache, DRAM]
CUDA Execution Model
[Figure: kernels launched with <<<...>>> run on the Device as blocks. Labels: Device, Block (0, 0), Block (1, 0), Block (1, 1)]
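The block coordinates in this slide, such as Block (1, 0) and Block (1, 1), correspond to the built-in `blockIdx` variable inside a kernel. The sketch below is one way to see that mapping, assuming the CUDA runtime API and a 3 x 2 grid of 4 x 2 thread blocks chosen only for illustration. The names `record_block` and `run_record_block` are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative launch shape (an assumption, not taken from the slide).
const int GX = 3, GY = 2;   // blocks per grid in x and y
const int BX = 4, BY = 2;   // threads per block in x and y
const int WIDTH = GX * BX;  // 12 threads across
const int HEIGHT = GY * BY; // 4 threads down

// Kernel: each thread records the linear id of the block it belongs to.
__global__ void record_block(int* out, int width) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;  // global column
    int y = blockIdx.y * blockDim.y + threadIdx.y;  // global row
    // The grid covers the array exactly, so no bounds check is needed here.
    out[y * width + x] = blockIdx.y * gridDim.x + blockIdx.x;
}

// Hypothetical helper: true if every element holds the expected block id.
bool run_record_block() {
    int h_out[WIDTH * HEIGHT];
    size_t bytes = sizeof(h_out);
    int* d_out = NULL;
    if (cudaMalloc((void**)&d_out, bytes) != cudaSuccess) return false;

    dim3 grid(GX, GY);    // Block (0, 0) through Block (2, 1)
    dim3 block(BX, BY);
    record_block<<<grid, block>>>(d_out, WIDTH);

    bool ok = cudaGetLastError() == cudaSuccess &&
              cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_out);

    // Recompute on the host which block should have written each element.
    for (int y = 0; ok && y < HEIGHT; ++y)
        for (int x = 0; ok && x < WIDTH; ++x)
            ok = (h_out[y * WIDTH + x] == (y / BY) * GX + (x / BX));
    return ok;
}

int main() {
    bool ok = run_record_block();
    printf("%s\n", ok ? "block ids OK" : "block ids FAILED");
    return ok ? 0 : 1;
}
```

Each element ends up holding the id of the block that wrote it, so the output array is tiled into 4 x 2 regions, one per block.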