Features(Contd Operations per step Read/write a word from to the memory Local operation An instruction could perform the following three operations in one cycle Fetch one or two words from the memory as operands Perform an arithmetic/ logic operation Store the result back in memory
Features (Cont’d) • Operations per step – Read/write a word from/to the memory – Local operation – An instruction could perform the following three operations in one cycle • Fetch one or two words from the memory as operands • Perform an arithmetic/logic operation • Store the result back in memory 11
Problems with pram accurate description of real-world parallel systems Unaccounted costs Latency, bandwidth, non-local memory access, memory access contention issues, synchronization costs, etc Algorithms perceived to work well in Pram may have poor performance in practice
Problems with PRAM • Inaccurate description of real-world parallel systems – Unaccounted costs • Latency, bandwidth, non-local memory access, memory access contention issues, synchronization costs, etc • Algorithms perceived to work well in PRAM may have poor performance in practice 12
PRAM Variants Variants arise to model some of these costs Each introduces some practical aspect of machine Gives algorithm designer better idea for optimization Variants can be grouped into 4 categories Memory access Synchronization Latenc Bandwidth
PRAM Variants • Variants arise to model some of these costs • Each introduces some practical aspect of machine – Gives algorithm designer better idea for optimization • Variants can be grouped into 4 categories – Memory access – Synchronization – Latency – Bandwidth 13
Memory access Impractical to have concurrent read and write to same memory location Contention issues CRCW PRAM CPRAM-CRCW(Common PRAM-CRCW):仅允许写入相同数据 PPRAM-CRCW( Priority PRAM-CRCW):仅允许优先级最高的处理器写入 APRAM-CRCW(Arbitrary PRAM-CRCW):允许任意处理器自由写入 · EREW or creW Pran QRQW(queue-read, queue-write Expensive Multiple ports required for concurrent access maybe prohibitively expensive
Memory Access • Impractical to have concurrent read and write to same memory location – Contention issues • CRCW PRAM – CPRAM-CRCW(Common PRAM-CRCW):仅允许写入相同数据 – PPRAM-CRCW(Priority PRAM-CRCW):仅允许优先级最高的处理器写入 – APRAM-CRCW(Arbitrary PRAM-CRCW):允许任意处理器自由写入 • EREW or CREW PRAM • QRQW (queue-read, queue-write ) – Expensive • Multiple ports required for concurrent access maybe prohibitively expensive. 14
Synchronization Standard PRam globally synchronized Standard Pram model do not charge a cost for synchronization Unrealistic! Synchronization is necessary and expensive in practical parallel systems Variants model cost of synchronization APRAM( asynchrony Pram每个处理器有其局部存储器、局部时钟 局部程序;无全局时钟,各处理器异步执行;处理器通过SM进行 通讯;处理器间依赖关系,需在并行程序中显式地加入同步路障 XPRAM(bulk synchronous PRAM, also known as bSP model Provides an incentive for algorithm designers to synchronize only when necessary
Synchronization • Standard PRAM globally synchronized • Standard PRAM model do not charge a cost for synchronization • Unrealistic! Synchronization is necessary and expensive in practical parallel systems • Variants model cost of synchronization – APRAM (asynchrony PRAM ):每个处理器有其局部存储器、局部时钟、 局部程序;无全局时钟,各处理器异步执行;处理器通过SM进行 通讯;处理器间依赖关系,需在并行程序中显式地加入同步路障 – XPRAM (bulk synchronous PRAM, also known as BSP model) – Provides an incentive for algorithm designers to synchronize only when necessary 15