LogP Model
[Figure: message transfer between sender and receiver; labels: o, L, g, o; axis: t]
Bulk Synchronous Parallel
• Bulk Synchronous Parallel (BSP)
  – P processors with local memory
  – Router
  – Facilities for periodic global synchronization
    • Every l steps
  – Models
    • Bandwidth limitations
    • Latency
    • Synchronization costs
  – Does not model
    • Communication overhead
    • Processor topology
BSP Computer
• Distributed memory architecture
• 3 components
  – Node
    • Processor
    • Local memory
  – Router (Communication Network)
    • Point-to-point, message passing (or shared variable)
  – Barrier synchronizing facility
    • All or subset
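The three components above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a BSP library: threads stand in for nodes, per-node lists guarded by locks stand in for the router, and `threading.Barrier` plays the barrier synchronizing facility. The function name `run_bsp_demo`, the ring communication pattern, and the values computed are all hypothetical.

```python
import threading


def run_bsp_demo(num_nodes=4):
    """Run one compute/communicate/barrier cycle on num_nodes simulated nodes."""
    barrier = threading.Barrier(num_nodes)            # barrier facility (all nodes)
    inboxes = [[] for _ in range(num_nodes)]          # stand-in for the router
    locks = [threading.Lock() for _ in range(num_nodes)]
    results = [None] * num_nodes

    def node(pid):
        # Computation phase: work only on this node's local data.
        local_value = (pid + 1) * 10

        # Communication phase: point-to-point message to the right neighbor.
        dest = (pid + 1) % num_nodes
        with locks[dest]:
            inboxes[dest].append(local_value)

        # Barrier: no node proceeds until every node has sent its message.
        barrier.wait()

        # Next superstep: received messages are now safe to read.
        results[pid] = local_value + sum(inboxes[pid])

    threads = [threading.Thread(target=node, args=(pid,)) for pid in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_bsp_demo(4))
```

Reading the inbox only after the barrier is the point of the sketch: a node never sees a partially delivered set of messages.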
Illustration of BSP
[Figure: three nodes, each with a processor (P) and local memory (M), labeled w; a communication network connecting the nodes, labeled g; a barrier, labeled l]
Three Parameters
• w parameter
  – Maximum computation time within each superstep
  – Computation operation takes at most w cycles.
• g parameter
  – Number of cycles for communication of a unit message when all processors are involved in communication (network bandwidth)
  – h: the maximum number of incoming or outgoing messages for a superstep
  – Communication operation takes gh cycles.
• l parameter
  – Barrier synchronization takes l cycles.
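As a small worked example of how the three parameters combine, the sketch below adds the per-phase costs to estimate one superstep as w + g·h + l cycles. The slide states only the individual phase costs. Summing them is the usual BSP cost accounting and is an assumption here, as are the function names and the sample numbers.

```python
def superstep_cost(w, h, g, l):
    """Estimated cycles for one superstep: compute + communicate + barrier.

    w: maximum computation time on any processor in the superstep
    h: maximum number of incoming or outgoing messages on any processor
    g: cycles per unit message when all processors communicate
    l: cycles for barrier synchronization
    """
    return w + g * h + l


def program_cost(supersteps, g, l):
    """Total estimated cycles for a list of (w, h) pairs, one per superstep."""
    return sum(superstep_cost(w, h, g, l) for w, h in supersteps)


if __name__ == "__main__":
    # Hypothetical machine: g = 4 cycles per message, l = 20 cycles per barrier.
    print(superstep_cost(w=100, h=5, g=4, l=20))          # 100 + 4*5 + 20
    print(program_cost([(100, 5), (50, 2)], g=4, l=20))   # two supersteps
```

Because l is paid once per superstep, an algorithm with many short supersteps can be dominated by synchronization cost even when w and h are small.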