Advanced Computer Architecture Design and Its Applications in Data Centers and Cloud Computing

Branch Correlation in OS Code

(a) OS Assembly Code to Perform General Exception Handling

    0x80007dd4: <exception_ip12>
    andi  $k0,$k0,0x7c
    li    $k1,124
    beq   $k0,$k1,0x80007d0c <handle_vced>
    li    $k1,56
    beq   $k0,$k1,0x80007cec <handle_vcei>
    li    $k1,32
    beqz  $k0,0x800080f0 <inttrap>
    sw    $at,-24524($zero)
    beq   $k0,$k1,0x80008770 <systrap>
    li    $at,8
    beq   $k0,$at,0x80007e78 <kmiss>
    li    $at,12
    beq   $k0,$at,0x80007e78 <kmiss>
    li    $at,92
    beq   $k0,$at,0x80007e60 <exception_ip12+8c>
    li    $at,36
    bne   $k0,$at,0x80008274 <longway>
    mfc0  $k0,$12
    andi  $k0,$k0,0x18
    bnez  $k0,0x80008274 <longway>
    mfc0  $k0,$13
    bgez  $k0,0x80007e48 <exception_ip12+74>
    . . .
    jr    $at

(b) Binary Decision Tree based Branching Sequence Corresponding to Code Shown in (a)

[Figure: binary decision tree. Each node is one conditional branch from (a), in program order. The taken (T) edge leads to the branch target; the not-taken (N) edge leads to the next branch. History patterns annotated in the figure: NNT, NNNT.]

Nodes, in order (T target shown; N falls through to the next node):
1. beq: handle_vced
2. beq: handle_vcei
3. beqz: inttrap
4. beq: systrap
5. beq: kmiss
6. beq: kmiss
7. beq: exception_ip12+8c
8. bne: longway
9. bnez: longway
10. bgez: exception_ip12+74
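The correlation in the dispatch sequence can be illustrated with a small model. The sketch below is not the OS code itself; it only mimics the compare-and-branch chain in (a) up to the bne to longway, under the assumption that each beq is taken exactly when the masked value in $k0 equals the loaded constant. The names CHAIN and dispatch are hypothetical. It returns the target reached together with the taken/not-taken history, so that the history string alone identifies the handler.

```python
# Hypothetical model of the branch chain in (a); constants and labels are
# copied from the listing, everything else is an illustrative assumption.
CHAIN = [
    (124, "handle_vced"),
    (56, "handle_vcei"),
    (0, "inttrap"),            # beqz $k0
    (32, "systrap"),
    (8, "kmiss"),
    (12, "kmiss"),
    (92, "exception_ip12+8c"),
]


def dispatch(cause):
    """Return (target, history) for a given cause value.

    history holds one character per executed conditional branch:
    'T' for taken, 'N' for not taken.
    """
    code = cause & 0x7C        # andi $k0,$k0,0x7c
    history = ""
    for value, target in CHAIN:
        if code == value:
            return target, history + "T"
        history += "N"
    # bne $k0,$at,longway with $at = 36: taken when code != 36
    if code != 36:
        return "longway", history + "T"
    return "fall-through", history + "N"


if __name__ == "__main__":
    for cause in (124, 0, 32, 8, 36):
        print(cause, dispatch(cause))
```

In this model a cause value of 0 produces the history NNT and a value of 32 produces NNNT, the two patterns annotated in the figure. A predictor that records global branch history can therefore anticipate the taken branch from the run of not-taken outcomes that precedes it.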
Control Flow Prediction for User/OS Code

[Figure: branch prediction hardware shared by user and kernel code. Components shown: an Instruction Stream that alternates between User and Kernel segments; the PC; a Branch History Shift Register (BHSR); a Branch History Table (BHT) of 2-bit counters, producing the Branch Prediction; and a Branch Target Buffer (BTB) of Tag/Target entries. Entries in the BHT and BTB are marked User or Kernel.]
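A minimal sketch of how a BHSR and a BHT of 2-bit counters can work together is given below, using the gshare indexing scheme (PC XOR global history). The class name, the table size, the initial counter value, and the trace format are assumptions made for illustration; the slide does not specify them, and the BTB is left out.

```python
class GsharePredictor:
    """Toy gshare predictor: BHT of 2-bit counters indexed by PC XOR BHSR."""

    def __init__(self, index_bits=12):
        self.mask = (1 << index_bits) - 1
        self.bhsr = 0                          # global branch history shift register
        self.bht = [1] * (1 << index_bits)     # 2-bit counters, start weakly not-taken

    def _index(self, pc):
        # Drop the two low PC bits (word-aligned instructions), then hash with history.
        return ((pc >> 2) ^ self.bhsr) & self.mask

    def predict(self, pc):
        return self.bht[self._index(pc)] >= 2  # 2 or 3 means predict taken

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.bht[i] = min(3, self.bht[i] + 1)
        else:
            self.bht[i] = max(0, self.bht[i] - 1)
        # Shift the outcome into the history register.
        self.bhsr = ((self.bhsr << 1) | int(taken)) & self.mask


def misprediction_rate(predictor, trace):
    """trace is a list of (pc, taken) pairs; returns the fraction mispredicted."""
    if not trace:
        return 0.0
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses / len(trace)


if __name__ == "__main__":
    alternating = [(0x40, i % 2 == 0) for i in range(1000)]
    print(misprediction_rate(GsharePredictor(index_bits=4), alternating))
```

Because user and kernel branches update the same history register and the same counters, a switch between the two modes changes the index that the other mode's branches see. This is one plausible source of the interference that the shared structures in the figure suggest.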
Impacts of User/Kernel Execution on Gshare Branch Predictor

[Figure: two stacked-bar charts. x-axis: Predictor (Gshare) Size (# of BHT Entries), with 16k, 64k, 128k, and 256k for each of the benchmarks db, jess, javac, jack, mtrt, and compress. y-axis: Misprediction Rate (%).
First chart (y-axis 0 to 12): User Only, plus Extra Caused by OS Execution.
Second chart (y-axis 0 to 8): Kernel Only, plus Extra Caused by User Execution.]
Impacts of User/Kernel Execution on Other Predictors

[Figure: four bar charts, one per benchmark (javac, jess, db, jack). x-axis: predictor type (2bc, GAg, GAs, Gshare, SAs). y-axis: Misprediction Rate (%), 0 to 16. Series: system (user), user only, system (kernel), kernel only.]
Hybridizing Optimized Branch Predictors for User and Kernel Code

Misprediction Rates on Hybrid User/OS Predictors

Unified predictors, misprediction rate (%):

| Benchmark | BHT Entries | 2bc | GAg | GAs | Gshare | SAg | SAs |
|---|---|---|---|---|---|---|---|
| db | 4k | 12.1 | 5.7 | 7.7 | 5.6 | 4.2 | 4.8 |
| db | 16k | 11.6 | 5.1 | 6.9 | 4.2 | 3.3 | 3.9 |
| jess | 4k | 9.5 | 11.4 | 7.5 | 9.9 | 6.4 | 5.9 |
| jess | 16k | 8.8 | 9.4 | 6.7 | 7.7 | 4.9 | 4.4 |
| javac | 4k | 10.6 | 8.9 | 7.7 | 8.0 | 6.0 | 6.0 |
| javac | 16k | 10.2 | 7.5 | 6.6 | 6.4 | 4.9 | 4.7 |
| jack | 4k | 5.3 | 11.2 | 4.7 | 9.6 | 5.2 | 4.3 |
| jack | 16k | 4.7 | 9.0 | 3.6 | 6.7 | 3.5 | 2.6 |
| mtrt | 4k | 7.8 | 5.3 | 4.8 | 4.8 | 3.5 | 4.4 |
| mtrt | 16k | 7.6 | 4.4 | 4.6 | 3.6 | 2.8 | 3.8 |

Split + Optimized predictors, misprediction rate (%); (U) is the user-side predictor, (K) the kernel-side predictor:

| Benchmark | BHT Entries | 2bc(U) + Gshare(K) | 2bc(U) + GAg(K) | 2bc(U) + SAg(K) | GAg(U) + GAg(K) | Gshare(U) + Gshare(K) | SAs(U) + Gshare(K) | GAs(U) + Gshare(K) | SAs(U) + SAg(K) | SAs(U) + GAg(K) |
|---|---|---|---|---|---|---|---|---|---|---|
| db | 4k | 4.0 | 4.3 | 4.0 | 4.3 | 4.5 | 3.9 | 3.8 | 3.9 | 4.3 |
| db | 16k | 3.3 | 3.5 | 3.5 | 3.7 | 3.2 | 2.7 | 2.8 | 2.9 | 2.9 |
| jess | 4k | 5.3 | 5.7 | 5.1 | 7.6 | 6.8 | 6.0 | 5.4 | 5.8 | 6.4 |
| jess | 16k | 4.3 | 4.6 | 4.4 | 5.6 | 4.7 | 4.0 | 3.9 | 4.0 | 4.2 |
| javac | 4k | 5.8 | 5.9 | 5.8 | 7.0 | 6.5 | 5.9 | 5.6 | 5.9 | 6.0 |
| javac | 16k | 5.1 | 5.2 | 5.2 | 5.4 | 4.7 | 4.3 | 4.2 | 4.5 | 4.4 |
| jack | 4k | 4.3 | 4.4 | 4.3 | 5.9 | 5.0 | 4.6 | 4.1 | 4.5 | 4.7 |
| jack | 16k | 3.6 | 3.7 | 3.6 | 4.3 | 2.9 | 2.6 | 2.8 | 2.7 | 2.7 |
| mtrt | 4k | 6.7 | 6.7 | 6.7 | 4.1 | 3.7 | 4.7 | 4.5 | 4.7 | 4.7 |
| mtrt | 16k | 6.3 | 6.3 | 6.4 | 3.4 | 2.7 | 3.8 | 4.0 | 3.8 | 3.8 |
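The split organization can be sketched as two independent predictors selected by the current privilege mode. The code below is only an illustration of that idea under simplifying assumptions: both sides use a plain table of 2-bit counters (the 2bc scheme), the trace carries an explicit in_kernel flag, and the two PC values are hypothetical addresses chosen so that they alias in a 16-entry unified table. None of these details come from the slide.

```python
class TwoBitTable:
    """Table of 2-bit saturating counters indexed by PC (a 2bc predictor)."""

    def __init__(self, entries):
        self.entries = entries
        self.counters = [1] * entries          # start weakly not-taken

    def _index(self, pc):
        return (pc >> 2) % self.entries

    def predict(self, pc, in_kernel=False):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken, in_kernel=False):
        i = self._index(pc)
        c = self.counters[i]
        self.counters[i] = min(3, c + 1) if taken else max(0, c - 1)


class SplitPredictor:
    """Routes each branch to a user-side or kernel-side predictor by mode."""

    def __init__(self, user, kernel):
        self.user = user
        self.kernel = kernel

    def _side(self, in_kernel):
        return self.kernel if in_kernel else self.user

    def predict(self, pc, in_kernel=False):
        return self._side(in_kernel).predict(pc)

    def update(self, pc, taken, in_kernel=False):
        self._side(in_kernel).update(pc, taken)


def count_mispredictions(predictor, trace):
    """trace is a list of (pc, taken, in_kernel) triples."""
    misses = 0
    for pc, taken, in_kernel in trace:
        if predictor.predict(pc, in_kernel) != taken:
            misses += 1
        predictor.update(pc, taken, in_kernel)
    return misses


# Hypothetical addresses that map to the same entry of a 16-entry table.
USER_PC, KERNEL_PC = 0x00400010, 0x80000010
TRACE = [(USER_PC, True, False), (KERNEL_PC, False, True)] * 50

if __name__ == "__main__":
    print("unified:", count_mispredictions(TwoBitTable(16), TRACE))
    print("split:  ", count_mispredictions(
        SplitPredictor(TwoBitTable(8), TwoBitTable(8)), TRACE))
```

In this contrived trace the always-taken user branch and the never-taken kernel branch keep pushing one shared counter in opposite directions, while the split version, with the same total number of counters, separates them. Real workloads are far less adversarial, so the sketch shows the mechanism only and says nothing about the magnitudes reported in the table.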