Sequential Consistency
• SC constrains the order of all memory operations:
  – Write → Read
  – Write → Write
  – Read → Read
  – Read → Write
• It is a simple model of parallel program execution.
• However, memory-operation reorderings that are intuitively reasonable on a uniprocessor violate the SC model.
• Modern microprocessor designs have long relied on reordering to gain performance (write buffers, overlapped writes, non-blocking reads, …).
• Question: how can these performance gains be reconciled with the constraints of SC?
2021/2/11 Computer Architecture
Issues in Implementing Sequential Consistency
Two problems in implementing SC on modern computer systems:
• Out-of-order execution capability. Can the pair be reordered?
  – Load(a); Load(b): yes
  – Load(a); Store(b): yes if a ≠ b
  – Store(a); Load(b): yes if a ≠ b
  – Store(a); Store(b): yes if a ≠ b
• Caches, write buffers
  – Caches keep a store by one processor from being seen immediately by another processor.
[Figure: a memory M shared by several processors P]
No common commercial architecture has a sequentially consistent memory model!
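The reordering rule in the list above can be written as a small predicate. This is a minimal sketch, assuming the condition on the slide reads a ≠ b; the names `MemOp` and `can_reorder` are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemOp:
    kind: str  # "load" or "store"
    addr: str  # symbolic address, e.g. "a" or "b"


def can_reorder(first: MemOp, second: MemOp) -> bool:
    """Return True if a uniprocessor may swap two adjacent memory operations.

    Rule as listed on the slide: Load; Load is always reorderable, and any
    pair that involves a store is reorderable only if the addresses differ.
    """
    if first.kind == "load" and second.kind == "load":
        return True
    return first.addr != second.addr


# Usage: Store(a); Load(b) may be swapped, Store(a); Load(a) may not.
print(can_reorder(MemOp("store", "a"), MemOp("load", "b")))
print(can_reorder(MemOp("store", "a"), MemOp("load", "a")))
```

Every swap this predicate allows is invisible to a single thread, yet the Store; Load case is precisely a reordering that another processor can observe, which is why it conflicts with SC.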
Relaxed Consistency Models
• Rules:
  – X → Y: operation X must complete before operation Y is done
• Sequential consistency (SC) requires: R → W, R → R, W → R, W → W
• Relax W → R (TSO): "Total store ordering" (x86)
• Relax W → W (PSO): "Partial store order"
• Relax R → W and R → R: "Weak ordering" and "release consistency"
• Relax R → R, R → W, W → R, W → W (RMO): "Relaxed Memory Order"
  – Maintains program order for accesses to the same location: W → R, W → W
[Figure 2.24: Relationship among models according to the "stricter" relation. Models shown include IBM-370, TSO, PSO, RMO, and PowerPC.]
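One way to read the list above is as a lookup from model to the set of X → Y orderings it gives up. The sketch below makes two assumptions that the slide does not state outright: the relaxations are cumulative (PSO also relaxes W → R), and the same-location rule for W → R and W → W is applied to every model, not only RMO. The names `RELAXED` and `must_stay_ordered` are hypothetical.

```python
# Each ordering is a pair (X, Y) meaning "X must complete before Y is done".
ALL_ORDERINGS = frozenset({("R", "W"), ("R", "R"), ("W", "R"), ("W", "W")})

# Orderings each model relaxes (assumed cumulative, see lead-in).
RELAXED = {
    "SC": frozenset(),
    "TSO": frozenset({("W", "R")}),
    "PSO": frozenset({("W", "R"), ("W", "W")}),
    "RMO": ALL_ORDERINGS,
}


def must_stay_ordered(model: str, first: str, second: str,
                      same_address: bool = False) -> bool:
    """Return True if `model` keeps `first` -> `second` in program order."""
    if (first, second) not in ALL_ORDERINGS:
        raise ValueError("operations must be 'R' or 'W'")
    # Assumption: a write followed by an access to the same location
    # stays ordered under every model.
    if same_address and first == "W":
        return True
    return (first, second) not in RELAXED[model]


# Usage: TSO lets a later read pass an earlier write; SC does not.
print(must_stay_ordered("SC", "W", "R"))
print(must_stay_ordered("TSO", "W", "R"))
```

Weak ordering and release consistency are left out because their guarantees depend on synchronization operations, which a plain table of R/W pairs cannot express.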
Simple categorization of relaxed models
[Table: one row per memory model. Columns: W → R Order, W → W Order, R → RW Order, Read Others' Write Early, Read Own Write Early, Safety net.]

Model            Safety net
SC [6]
IBM 370 [4]      serialization instructions
TSO [20]         RMW
PC               RMW
PSO              RMW, STBAR
WO               synchronization
RCsc [13,12]     release, acquire, nsync, RMW
RCpc [13,12]     release, acquire, nsync, RMW
Alpha [19]       MB, WMB
RMO [21]         various MEMBARs
PowerPC [7,4]    SYNC

Figure 8: Simple categorization of relaxed models. A √ indicates that the corresponding relaxation is allowed by straightforward implementations of the corresponding model. It also indicates that the relaxation can be detected by the programmer (by affecting the results of the program) except for the following cases. The "Read Own Write Early" relaxation is not detectable with the SC, WO, Alpha, and PowerPC models. The "Read Others' Write Early" relaxation is possible and detectable with complex implementations of RCsc.
TABLE 4.1: Can both r1 and r2 be set to 0?

Core C1          Core C2          Comments
S1: x = NEW;     S2: y = NEW;     /* Initially, x = 0 & y = 0 */
L1: r1 = y;      L2: r2 = x;

[Figure: each execution shows the program order (<p) of Core C1, the memory order (<m), and the program order (<p) of Core C2.]

(a) TSO and SC Execution 1
Memory order (<m):
  S1: x = NEW;   /* NEW */
  S2: y = NEW;   /* NEW */
  L1: r1 = y;    /* NEW */
  L2: r2 = x;    /* NEW */
Outcome: (r1, r2) = (NEW, NEW)

(b) TSO and SC Execution 2
Memory order (<m):
  S1: x = NEW;   /* NEW */
  L1: r1 = y;    /* 0 */
  S2: y = NEW;   /* NEW */
  L2: r2 = x;    /* NEW */
Outcome: (r1, r2) = (0, NEW)
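The question in Table 4.1 can be explored by brute force: enumerate every memory order of the four operations, keep the ones a model allows, and collect the resulting (r1, r2) pairs. The sketch below is a simplified model, not a full TSO definition: TSO is approximated by dropping the Store → Load program-order constraint when the two addresses differ, and store-buffer forwarding is ignored because this program never loads an address that the same core stored to. All names (`PROGRAM`, `allowed`, `outcomes`) are hypothetical.

```python
from itertools import permutations

NEW = "NEW"

# Each operation: (label, kind, variable, destination register or None)
S1 = ("S1", "store", "x", None)
L1 = ("L1", "load", "y", "r1")
S2 = ("S2", "store", "y", None)
L2 = ("L2", "load", "x", "r2")

PROGRAM = {"C1": [S1, L1], "C2": [S2, L2]}  # program order per core
OPS = [S1, L1, S2, L2]


def allowed(order, relax_store_to_load):
    """Check that a memory order respects each core's program order.

    With relax_store_to_load=True, a load may be placed before an earlier
    store of the same core when the two addresses differ (TSO-like).
    """
    pos = {op: i for i, op in enumerate(order)}
    for ops in PROGRAM.values():
        for i, first in enumerate(ops):
            for second in ops[i + 1:]:
                is_w_to_r = first[1] == "store" and second[1] == "load"
                if relax_store_to_load and is_w_to_r and first[2] != second[2]:
                    continue  # this ordering is relaxed
                if pos[first] > pos[second]:
                    return False
    return True


def run(order):
    """Execute the operations in memory order and return (r1, r2)."""
    memory = {"x": 0, "y": 0}  # initially x = 0 and y = 0
    regs = {}
    for _, kind, var, reg in order:
        if kind == "store":
            memory[var] = NEW
        else:
            regs[reg] = memory[var]
    return regs["r1"], regs["r2"]


def outcomes(relax_store_to_load):
    return {run(order) for order in permutations(OPS)
            if allowed(order, relax_store_to_load)}


print("SC :", sorted(outcomes(False), key=str))
print("TSO:", sorted(outcomes(True), key=str))
```

Under these assumptions the SC set contains (NEW, NEW) and (0, NEW), matching executions (a) and (b), but not (0, 0); the relaxed set additionally contains (0, 0), which arises when both loads are placed before both stores.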