University of Electronic Science and Technology of China
GPUS: THROUGHPUT ORIENTED DESIGN
▪ Small caches
  ▪ To boost memory throughput
▪ Simple control
  ▪ No branch prediction
  ▪ No data forwarding
▪ Energy efficient ALUs
  ▪ Many, long latency but heavily pipelined for high throughput
▪ Require massive number of threads to tolerate latencies
  ▪ Threading logic
  ▪ Thread state
[Figure: block diagram of a GPU chip attached to DRAM]
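The "massive number of threads" point can be made concrete by asking the CUDA runtime how many threads a device can keep resident at once. The following is a minimal sketch, assuming a machine with the CUDA toolkit and at least one CUDA-capable GPU; the helper name `max_resident_threads` is hypothetical and introduced only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: upper bound on the threads the whole device can
// keep resident at once (SM count times per-SM thread limit).
long long max_resident_threads(const cudaDeviceProp& prop) {
    return static_cast<long long>(prop.multiProcessorCount) *
           prop.maxThreadsPerMultiProcessor;
}

int main() {
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);  // device 0 is an assumption
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }
    std::printf("Device: %s\n", prop.name);
    std::printf("Streaming multiprocessors: %d\n", prop.multiProcessorCount);
    std::printf("Max resident threads per SM: %d\n",
                prop.maxThreadsPerMultiProcessor);
    std::printf("Max resident threads on device: %lld\n",
                max_resident_threads(prop));
    std::printf("L2 cache size: %d bytes\n", prop.l2CacheSize);
    return 0;
}
```

On typical hardware the resident-thread count is in the tens of thousands or more, while the L2 cache is only a few megabytes, which matches the slide's trade-off: little cache per thread, with latency hidden by switching among many threads.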
WINNING APPLICATIONS USE BOTH CPU AND GPU
▪ CPUs for sequential parts where latency matters
  ▪ CPUs can be 10X+ faster than GPUs for sequential code
▪ GPUs for parallel parts where throughput wins
  ▪ GPUs can be 10X+ faster than CPUs for parallel code
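A small sketch of this division of labor, assuming the CUDA toolkit and one GPU: the element-wise squaring has no dependencies between elements, so it runs as a GPU kernel, while the recurrence that follows carries a dependency from one step to the next, so it is left on the CPU. The names `square_kernel` and `hybrid_filter` are hypothetical, error handling is reduced to a single macro, and the workload is far too small to show a real speedup.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "CUDA error: %s\n",                  \
                         cudaGetErrorString(e_));                     \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Parallel part: each GPU thread squares one independent element.
__global__ void square_kernel(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * in[i];
}

float hybrid_filter(const std::vector<float>& h_in) {
    int n = static_cast<int>(h_in.size());
    if (n == 0) return 0.0f;
    size_t bytes = n * sizeof(float);

    float *d_in = nullptr, *d_out = nullptr;
    CUDA_CHECK(cudaMalloc(&d_in, bytes));
    CUDA_CHECK(cudaMalloc(&d_out, bytes));
    CUDA_CHECK(cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up to cover all n
    square_kernel<<<blocks, threads>>>(d_in, d_out, n);
    CUDA_CHECK(cudaGetLastError());

    std::vector<float> h_sq(n);
    CUDA_CHECK(cudaMemcpy(h_sq.data(), d_out, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_out));

    // Sequential part: each step needs the previous state, so it stays on the CPU.
    float state = 0.0f;
    for (float v : h_sq) state = 0.5f * state + v;
    return state;
}

int main() {
    std::vector<float> data = {1.0f, 2.0f, 3.0f};
    std::printf("%.2f\n", hybrid_filter(data));  // prints 11.25
    return 0;
}
```

For the input {1, 2, 3} the GPU produces {1, 4, 9} and the CPU recurrence yields 1, then 4.5, then 11.25.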
Introduction to Heterogeneous Parallel Computing
CUDA C vs. CUDA Libs vs. OpenACC
Memory Allocation and Data Movement API Functions
Data Parallelism and Threads
OBJECTIVE
▪ To learn the main venues and developer resources for GPU computing
▪ Where CUDA C fits in the big picture
3 WAYS TO ACCELERATE APPLICATIONS
Applications
▪ Libraries: easy to use, most performance
▪ Compiler Directives: easy to use, portable code
▪ Programming Languages: most performance, most flexibility
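To compare the three routes on one operation, the sketch below computes SAXPY (y = a*x + y) twice: once through a library and once with a hand-written CUDA C kernel. Thrust is used as the library only because it ships with the CUDA toolkit as headers; the slides do not name a specific library. The directive route is shown only as a comment, since it needs an OpenACC-capable compiler rather than plain nvcc. The function names `saxpy_library` and `saxpy_cuda_c` are hypothetical, both functions assume x and y have the same length, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/transform.h>

// --- Route 1: library (Thrust). No kernel or launch configuration to write.
struct SaxpyOp {
    float a;
    __host__ __device__ float operator()(float x, float y) const {
        return a * x + y;
    }
};

std::vector<float> saxpy_library(float a, const std::vector<float>& x,
                                 const std::vector<float>& y) {
    thrust::device_vector<float> dx(x.begin(), x.end());
    thrust::device_vector<float> dy(y.begin(), y.end());
    thrust::transform(dx.begin(), dx.end(), dy.begin(), dy.begin(), SaxpyOp{a});
    std::vector<float> out(y.size());
    thrust::copy(dy.begin(), dy.end(), out.begin());
    return out;
}

// --- Route 2: compiler directive (OpenACC), not compiled here:
//   #pragma acc parallel loop
//   for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];

// --- Route 3: programming language (CUDA C). Full control over threads and memory.
__global__ void saxpy_kernel(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

std::vector<float> saxpy_cuda_c(float a, const std::vector<float>& x,
                                std::vector<float> y) {
    int n = static_cast<int>(x.size());
    if (n == 0) return y;
    size_t bytes = n * sizeof(float);
    float *dx = nullptr, *dy = nullptr;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, x.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y.data(), bytes, cudaMemcpyHostToDevice);
    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up to cover all n
    saxpy_kernel<<<blocks, threads>>>(n, a, dx, dy);
    cudaMemcpy(y.data(), dy, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dx);
    cudaFree(dy);
    return y;
}

int main() {
    std::vector<float> x = {1.0f, 2.0f, 3.0f};
    std::vector<float> y = {10.0f, 20.0f, 30.0f};
    std::vector<float> r1 = saxpy_library(2.0f, x, y);
    std::vector<float> r2 = saxpy_cuda_c(2.0f, x, y);
    for (size_t i = 0; i < r1.size(); ++i)
        std::printf("%.1f %.1f\n", r1[i], r2[i]);
    return 0;
}
```

The library version is shorter and hides the launch details, while the CUDA C version exposes the grid size, block size, and every memory transfer, which is where the extra flexibility on the slide comes from.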