Computer Architecture Chapter4_3.11
2-bit BHT
▪ Solution: record 2 bits of branch history per entry
▪ Red: stop, not taken
▪ Green: go, taken
[State diagram: four states, two labeled Predict Taken and two labeled Predict Not Taken, with transitions on T (taken) and NT (not taken)]
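The 2-bit scheme on this slide can be sketched in Python. This is a minimal illustration, not the slide's exact state machine: it assumes the common saturating-counter variant (states 0..3, predict taken when the counter is 2 or 3), and the diagram on the slide may differ in individual transitions. The function names and the loop-branch trace are hypothetical. A 1-bit predictor is included only for comparison.

```python
def predict(counter: int) -> bool:
    """Predict taken when the 2-bit counter is in one of the two upper states."""
    return counter >= 2


def update(counter: int, taken: bool) -> int:
    """Move one step toward 3 on taken, toward 0 on not taken, saturating at both ends."""
    return min(counter + 1, 3) if taken else max(counter - 1, 0)


def count_mispredictions_2bit(outcomes, counter=3):
    """Count mispredictions of a single 2-bit counter over a sequence of outcomes."""
    misses = 0
    for taken in outcomes:
        if predict(counter) != taken:
            misses += 1
        counter = update(counter, taken)
    return misses


def count_mispredictions_1bit(outcomes, last_taken=True):
    """Count mispredictions of a 1-bit predictor that repeats the last outcome."""
    misses = 0
    for taken in outcomes:
        if last_taken != taken:
            misses += 1
        last_taken = taken
    return misses


# Hypothetical loop branch: taken 9 times, then not taken at loop exit, run 3 times.
loop_branch = ([True] * 9 + [False]) * 3

print(count_mispredictions_2bit(loop_branch))  # 3: one miss per loop exit
print(count_mispredictions_1bit(loop_branch))  # 5: exit misses plus re-entry misses
```

In this variant, a counter sitting in a strong state (0 or 3) has to mispredict twice before its prediction flips, which is why the single not-taken outcome at loop exit does not disturb the prediction for the next run of the loop.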
Computer Architecture Chapter4_3.12
[Figure: bar chart over the SPEC89 benchmarks, axis from 0% to 18%]
FIGURE 3.8 Prediction accuracy of a 4096-entry two-bit prediction buffer for the SPEC89 benchmarks. The misprediction rate for the integer benchmarks (gcc, espresso, eqntott, and li) is substantially higher (average of 11%) than that for the FP programs (average of 4%). Even omitting the FP kernels (nasa7, matrix300, and tomcatv) still yields a higher accuracy for the FP benchmarks than for the integer benchmarks. These data, as well as the rest of the data in this section, are taken from a branch prediction study done using the IBM Power architecture and optimized code for that system. See Pan et al. [1992].
Computer Architecture Chapter4_3.13
[Figure: bar chart over the SPEC89 benchmarks; axis: Frequency of mispredictions, 0% to 18%; legend: 4096 entries, 2 bits per entry; Unlimited entries, 2 bits per entry]
FIGURE 3.9 Prediction accuracy of a 4096-entry two-bit prediction buffer versus an infinite buffer for the SPEC89 benchmarks.
Computer Architecture Chapter4_3.14
BHT Accuracy
▪ Causes of branch mispredictions:
• A wrong guess for that branch
• The BHT is indexed with the low-order bits of the PC, so a lookup may return the history of a different branch
▪ BHT size
• With a 4096-entry table, the misprediction rate ranges from 1% (nasa7, tomcatv) to 18% (eqntott), with spice at 9% and gcc at 12%
• Adding more entries yields almost no improvement in prediction accuracy (in Alpha 21164)
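The second cause of misprediction, two branches sharing one table entry, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the class name, the 4-byte instruction alignment, the saturating-counter update, the initial counter value, and the example addresses are not taken from the slides.

```python
class BranchHistoryTable:
    """A table of 2-bit counters indexed by the low-order bits of the PC (no tags)."""

    def __init__(self, entries: int = 4096):
        if entries <= 0 or entries & (entries - 1):
            raise ValueError("entries must be a power of two")
        self.mask = entries - 1
        self.counters = [1] * entries  # assumed initial state: weakly not taken

    def index(self, pc: int) -> int:
        # Assumes 4-byte aligned instructions, so the two lowest PC bits are dropped.
        return (pc >> 2) & self.mask

    def predict(self, pc: int) -> bool:
        return self.counters[self.index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = self.index(pc)
        c = self.counters[i]
        self.counters[i] = min(c + 1, 3) if taken else max(c - 1, 0)


bht = BranchHistoryTable(entries=4096)
branch_a = 0x1000
branch_b = 0x1000 + 4096 * 4  # hypothetical second branch with the same low-order PC bits

# Train the entry with branch A only.
for _ in range(4):
    bht.update(branch_a, taken=True)

# Branch B has never executed, yet it inherits branch A's history.
print(bht.index(branch_a) == bht.index(branch_b))
print(bht.predict(branch_b))
```

Because the table holds no tags, the lookup cannot tell the two branches apart. A larger table makes such collisions rarer, which is consistent with the slide's observation that going beyond 4096 entries helps very little.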
Computer Architecture Chapter4_3.15
Correlating Branch Predictor
▪ Example:
    if (aa==2) aa=0;
    if (bb==2) bb=0;
    if (aa!=bb) {
▪ Translated to DLX:
        SUBI R3,R1,#2
        BNEZ R3,L1      ; branch b1 (aa!=2)
        ADD  R1,R0,R0   ; aa=0
    L1: SUBI R3,R2,#2
        BNEZ R3,L2      ; branch b2 (bb!=2)
        ADD  R2,R0,R0   ; bb=0
    L2: SUB  R3,R1,R2   ; R3=aa-bb
        BEQZ R3,L3      ; branch b3 (aa==bb)
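To see why these three branches motivate a correlating predictor, the fragment can be traced in Python. This is a sketch under stated assumptions: the function name and the range of test values are hypothetical, and the code mirrors only the branch conditions of the DLX sequence above, not any predictor hardware.

```python
def branch_outcomes(aa: int, bb: int):
    """Return (b1, b2, b3) as taken/not-taken flags for the DLX sequence."""
    b1 = aa != 2        # BNEZ R3,L1 is taken when aa != 2
    if not b1:
        aa = 0          # fall-through path: aa = 0
    b2 = bb != 2        # BNEZ R3,L2 is taken when bb != 2
    if not b2:
        bb = 0          # fall-through path: bb = 0
    b3 = aa == bb       # BEQZ R3,L3 is taken when aa == bb
    return b1, b2, b3


# Enumerate a small, arbitrary range of inputs and print every outcome triple.
for aa in range(4):
    for bb in range(4):
        print(aa, bb, branch_outcomes(aa, bb))

# Whenever b1 and b2 both fall through, aa and bb are both 0, so b3 is taken.
correlated = all(
    b3
    for aa in range(4)
    for bb in range(4)
    for (b1, b2, b3) in [branch_outcomes(aa, bb)]
    if not b1 and not b2
)
print(correlated)
```

The outcome of b3 is fully determined whenever b1 and b2 are both not taken, yet a predictor that looks only at b3's own history cannot exploit this. That dependence on the outcomes of other recent branches is what a correlating predictor is meant to capture.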