Advanced Computer Architecture Design and Its Applications in Data Centers and Cloud Computing

Organizing Point-To-Point Networks
• Network topology: organization of network
  – Tradeoff: perf. (connectivity, latency, bandwidth) vs. cost
• Router chips
  – Networks w/ separate router chips are indirect
  – Networks w/ processor/memory/router in chip are direct
    • Fewer components, “Glueless MP”
[Figure: example networks built from CPU($), Mem, and router (R) blocks]

Issues for Shared Memory Systems
• Two big ones
  – Cache coherence
  – Memory consistency model
• Closely related
• Often confused

Cache Coherence: The Problem (1/2)
• Variable A initially has value 0
• P1 stores value 1 into A
• P2 loads A from memory and sees old value 0
[Figure: P1 and P2, each with an L1 cache, share a bus to main memory. t1: P1 stores A=1, so the copy of A in P1's L1 goes from 0 to 1. t2: P2 loads A while main memory still holds A: 0.]
Need to do something to keep P2’s cache coherent
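
The scenario on this slide can be reproduced with a toy model. This is a minimal sketch, not a real cache: it assumes a write-back policy (the slide does not name one, but main memory keeping A: 0 after the store suggests it), and the class name `WriteBackCache` and the dict standing in for main memory are hypothetical.

```python
class WriteBackCache:
    """Toy private cache: stores stay local and are never pushed to memory.

    Assumption: write-back behaviour with no coherence mechanism at all.
    """

    def __init__(self, memory):
        self.memory = memory   # shared dict standing in for main memory
        self.lines = {}        # address -> locally cached value

    def load(self, addr):
        if addr not in self.lines:               # miss: fetch from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def store(self, addr, value):
        self.lines[addr] = value                 # memory is NOT updated


memory = {"A": 0}                  # variable A initially has value 0
p1_cache = WriteBackCache(memory)
p2_cache = WriteBackCache(memory)

p1_cache.store("A", 1)             # t1: P1 stores A=1 (only in P1's L1)
p2_value = p2_cache.load("A")      # t2: P2 misses and reads main memory
print("P2 sees A =", p2_value)
```

In this model P2 gets the old value because the only up-to-date copy of A lives in P1's cache, where P2 cannot see it.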

Cache Coherence: The Problem (2/2)
• P1 and P2 have variable A (value 0) in their caches
• P1 stores value 1 into A
• P2 loads A from its cache and sees old value 0
[Figure: P1 and P2, each with an L1 cache holding A: 0, share a bus to main memory (A: 0). t1: P1 stores A=1, so the copy of A in P1's L1 goes from 0 to 1. t2: P2 loads A and hits its own L1 copy, still A: 0.]
Need to do something to keep P2’s cache coherent
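
A toy model of this second scenario is sketched below. To separate it from the first case, the sketch assumes a write-through policy, which is not stated on the slide: main memory is updated on every store, yet P2 still reads the old value because it hits its own cached copy. The class name `WriteThroughCache` is hypothetical.

```python
class WriteThroughCache:
    """Toy private cache that updates main memory on every store.

    Assumption: write-through policy, with no invalidation of other caches.
    """

    def __init__(self, memory):
        self.memory = memory   # shared dict standing in for main memory
        self.lines = {}        # address -> locally cached value

    def load(self, addr):
        if addr not in self.lines:               # miss: fetch from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]                  # hit: return the local copy

    def store(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value                # write-through to memory


memory = {"A": 0}
p1_cache = WriteThroughCache(memory)
p2_cache = WriteThroughCache(memory)

p1_cache.load("A")                 # both processors cache A = 0
p2_cache.load("A")

p1_cache.store("A", 1)             # t1: P1 stores A=1 (L1 and memory updated)
stale_value = p2_cache.load("A")   # t2: P2 hits its old cached copy
print("memory A =", memory["A"], "but P2 sees A =", stale_value)
```

Keeping main memory current is therefore not enough on its own; the other cached copies also have to be updated or invalidated.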

Approaches to Cache Coherence
• Software-based solutions
  – Mechanisms:
    • Mark cache blocks/memory pages as cacheable/non-cacheable
    • Add “Flush” and “Invalidate” instructions
  – Could be done by compiler or run-time system
  – Difficult to get perfect (e.g., what about memory aliasing?)
• Hardware solutions are far more common
  – System ensures everyone always sees the latest value
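
The software-based approach can be illustrated with a toy model in which software must call the flush and invalidate operations itself. This is a sketch under assumptions: the class `SoftwareManagedCache` and its `flush`/`invalidate` methods are hypothetical stand-ins for the “Flush” and “Invalidate” instructions, not the interface of any real ISA.

```python
class SoftwareManagedCache:
    """Toy write-back cache with explicit, software-issued maintenance ops."""

    def __init__(self, memory):
        self.memory = memory   # shared dict standing in for main memory
        self.lines = {}        # address -> locally cached value

    def load(self, addr):
        if addr not in self.lines:               # miss: fetch from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def store(self, addr, value):
        self.lines[addr] = value                 # stays local until flushed

    def flush(self, addr):
        if addr in self.lines:                   # write local copy to memory
            self.memory[addr] = self.lines[addr]

    def invalidate(self, addr):
        self.lines.pop(addr, None)               # drop local copy, if any


memory = {"A": 0}
p1_cache = SoftwareManagedCache(memory)
p2_cache = SoftwareManagedCache(memory)

p2_cache.load("A")                     # P2 caches A = 0
p1_cache.store("A", 1)                 # P1 stores A=1
p1_cache.flush("A")                    # writer pushes the new value to memory

before = p2_cache.load("A")            # reader forgot to invalidate: stale
p2_cache.invalidate("A")               # reader drops its old copy
after = p2_cache.load("A")             # miss, refetch from memory
print("before invalidate:", before, "after invalidate:", after)
```

The `before` read shows why this approach is hard to get perfect: a single missing flush or invalidate silently yields a stale value, and the compiler or run-time system has to place those operations correctly on every path.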