Synchronization (cont.)
– BPRAM (Block-Parallel RAM)
  • Assumes n nodes, each containing a processor and a memory module, interconnected by a communication medium.
  • A computation is a sequence of phases, called supersteps. In one superstep, each processor can execute operations on data residing in its local memory, send messages, and execute a global synchronization instruction.
  • Charge L units to access the first message and b units for each subsequent contiguous block.
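The charging rule above can be illustrated with a small sketch. It assumes one reading of the slide: a transfer of k contiguous blocks costs L for the first block plus b for each of the remaining k - 1. The function name `block_access_cost` and the values L = 10 and b = 1 are hypothetical, chosen only for illustration.

```python
def block_access_cost(num_blocks: int, L: int, b: int) -> int:
    """Cost of one transfer of `num_blocks` contiguous blocks.

    Assumed reading of the rule: L units for the first block,
    b units for each subsequent contiguous block.
    """
    if num_blocks <= 0:
        return 0
    return L + b * (num_blocks - 1)


# One contiguous 8-block transfer versus eight separate 1-block transfers.
one_long = block_access_cost(8, L=10, b=1)
many_short = 8 * block_access_cost(1, L=10, b=1)
print(one_long, many_short)
```

Under these assumed values the single contiguous transfer is much cheaper, which is the incentive the model gives for grouping data into blocks.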
Latency
• Standard PRAM assumes unit cost for non-local memory access
• In practice, non-local memory access has a severe effect on performance
• PRAM variant
  – LPRAM (Local-memory PRAM)
    • A set of nodes, each with a processor and a local memory
    • The nodes can communicate through a globally shared memory
    • Two types of steps are defined and separately accounted for: computation steps, where each processor performs one operation on local data, and communication steps, where each processor can write, and then read, a word from global memory
    • Charge a cost of L units to access global memory
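The separate accounting of the two step types can be sketched as a simple cost function. This is a minimal illustration, assuming each computation step costs 1 unit and each communication step costs L units; the name `lpram_cost` and the step counts below are hypothetical.

```python
def lpram_cost(computation_steps: int, communication_steps: int, L: int) -> int:
    """Total time with the two step types accounted for separately.

    Assumption: a computation step costs 1 unit and a
    communication step (global memory access) costs L units.
    """
    return computation_steps + L * communication_steps


# Two hypothetical schedules for the same task, with L = 20.
chatty = lpram_cost(computation_steps=100, communication_steps=10, L=20)
local_heavy = lpram_cost(computation_steps=120, communication_steps=4, L=20)
print(chatty, local_heavy)
```

With a large L, the schedule that does extra local work to avoid global accesses comes out ahead, even though it performs more operations in total.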
Bandwidth
• Standard PRAM assumes unlimited bandwidth
• In practice, bandwidth is limited
• PRAM variants
  – DRAM (Distributed Random Access Machine)
    • Two-level memory hierarchy
    • Access to global memory is charged a cost based on possible data congestion
  – PRAM(m)
    • Global memory is segmented into modules
    • In any given step, only m memory accesses can be serviced
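The PRAM(m) restriction can be illustrated with a short calculation. This sketch assumes the simplest possible reading, in which pending accesses are serviced m at a time with no other source of delay; the function name `steps_to_service` is hypothetical.

```python
def steps_to_service(requests: int, m: int) -> int:
    """Steps needed when at most m memory accesses are serviced per step.

    Assumption: requests are simply served m at a time, so the
    answer is the ceiling of requests / m.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    return -(-requests // m)  # integer ceiling division


# 64 processors each issue one global access.
unlimited = steps_to_service(64, m=64)  # as if bandwidth were not a limit
limited = steps_to_service(64, m=8)     # only 8 accesses serviced per step
print(unlimited, limited)
```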
Other Distributed Models
• Distributed Memory Model
  – No global memory
  – Each processor is associated with some local memory
• Postal Model
  – Processor sends a request for non-local memory
  – Instead of stalling, it continues working while the data is en route
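The benefit of not stalling can be sketched with a deliberately simplified comparison. It assumes a processor has one outstanding remote request with a fixed latency and some independent local work that does not need the requested data; the function names and numbers are hypothetical and not part of the Postal model's formal definition.

```python
def stalling_time(latency: int, independent_work: int) -> int:
    """Processor waits for the remote data, then does its local work."""
    return latency + independent_work


def overlapped_time(latency: int, independent_work: int) -> int:
    """Processor does its local work while the data is en route.

    Assumption: the work is fully independent of the requested data,
    so the two activities overlap completely.
    """
    return max(latency, independent_work)


stalled = stalling_time(latency=50, independent_work=20)
overlapped = overlapped_time(latency=50, independent_work=20)
print(stalled, overlapped)
```

In this simplified setting the overlapped schedule is bounded by the longer of the two activities rather than by their sum.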
Network Models
• Focus on the impact of the topology of the communications network
• Early focus of parallel computation
• Distributed Memory Model?
• Cost of remote memory access is a function of both the topology and the access pattern
• Provides incentives for efficient
  – Data mappings
  – Communications routing
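How cost depends on both topology and access pattern can be illustrated by counting hops. The sketch below assumes cost equals the number of links on a shortest path, a 16-node ring, and a 4x4 mesh without wraparound with nodes numbered row by row; all function names and these choices are hypothetical, made only for illustration.

```python
def ring_hops(src: int, dst: int, n: int) -> int:
    """Shortest-path hops between two nodes on an n-node ring."""
    d = abs(src - dst)
    return min(d, n - d)


def mesh_hops(src: int, dst: int, width: int) -> int:
    """Hops on a 2D mesh without wraparound, nodes numbered row by row."""
    r1, c1 = divmod(src, width)
    r2, c2 = divmod(dst, width)
    return abs(r1 - r2) + abs(c1 - c2)


def pattern_cost(hops, n: int, shift: int) -> int:
    """Total hops when every node i accesses node (i + shift) mod n."""
    return sum(hops(i, (i + shift) % n) for i in range(n))


N = 16
ring = lambda s, d: ring_hops(s, d, N)
mesh = lambda s, d: mesh_hops(s, d, 4)

for shift in (1, 8):
    print(shift, pattern_cost(ring, N, shift), pattern_cost(mesh, N, shift))
```

Under these assumptions the neighbour pattern (shift 1) is cheapest on the ring, while the far pattern (shift 8) is much cheaper on the mesh, which is why data mapping and routing matter in these models.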