Computer Architecture

Review: Fetch Stage with BTB
[Figure: fetch stage. The Program Counter (address of the current instruction) feeds both the direction predictor (2-bit counters; output "taken?") and the cache of target addresses (BTB: Branch Target Buffer; outputs "hit?" and the target address). The Next Fetch Address is selected between the target address and PC + inst size.]
• Always-taken CPI = [ 1 + (0.20 * 0.3) * 2 ] = 1.12 (70% of branches taken)
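
The selection in the figure and the CPI arithmetic can be sketched in a few lines of Python. This is only an illustration under assumptions: the function and parameter names are hypothetical, the BTB is modeled as a plain dictionary from branch PC to target, and the CPI helper reads the slide's numbers as 20% branches, a 30% misprediction rate (always-taken with 70% of branches taken), and a 2-cycle misprediction penalty.

```python
def next_fetch_address(pc, inst_size, btb, predict_taken):
    """Pick the next fetch address from the PC, a BTB lookup, and a direction prediction.

    btb is a hypothetical dict {branch_pc: target_address}; a missing key is a BTB miss.
    """
    target = btb.get(pc)  # BTB hit if this PC has a cached target
    if target is not None and predict_taken:
        return target  # hit and predicted taken: fetch from the target
    return pc + inst_size  # otherwise fetch the sequential instruction


def cpi_with_branches(branch_fraction, mispredict_rate, penalty_cycles, base_cpi=1.0):
    """CPI = base CPI + (branch fraction * misprediction rate) * penalty cycles."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


if __name__ == "__main__":
    btb = {0x100: 0x40}
    print(hex(next_fetch_address(0x100, 4, btb, predict_taken=True)))
    print(hex(next_fetch_address(0x100, 4, btb, predict_taken=False)))
    print(round(cpi_with_branches(0.20, 0.3, 2), 2))
```

With the slide's numbers the helper reproduces the always-taken CPI of 1.12.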

Simple Branch Direction Prediction Schemes
• Compile time (static)
  – Always not taken
  – Always taken
  – BTFN (Backward taken, forward not taken)
  – Profile based (likely direction)
• Run time (dynamic)
  – Last time prediction (single-bit)
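
As a rough illustration of the one dynamic scheme listed here, last time prediction can be modeled as one bit per branch that records the most recent outcome. The class name, the dictionary-based table, and the default prediction for a branch that has not been seen are assumptions made for this sketch, not details from the slides.

```python
class LastTimePredictor:
    """Single-bit predictor: predict whatever the branch did last time."""

    def __init__(self, default_taken=False):
        self.last = {}  # branch PC -> last observed outcome (True = taken)
        self.default_taken = default_taken  # assumed prediction for unseen branches

    def predict(self, pc):
        return self.last.get(pc, self.default_taken)

    def update(self, pc, taken):
        self.last[pc] = taken


def accuracy(predictor, pc, outcomes):
    """Fraction of correct predictions over a sequence of outcomes for one branch."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)


if __name__ == "__main__":
    loop_branch = [c == "T" for c in "TTTTTTTTTN"]
    alternating = [c == "T" for c in "TNTNTNTNTN"]
    print(accuracy(LastTimePredictor(), 0x100, loop_branch))
    print(accuracy(LastTimePredictor(), 0x100, alternating))
```

In this model a loop branch is mispredicted on its first and last iteration, while a strictly alternating branch is mispredicted every time.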

More Sophisticated Direction Prediction
• Compile time (static)
  – Always not taken
  – Always taken
  – BTFN (Backward taken, forward not taken)
  – Profile based (likely direction)
  – Program analysis based (likely direction)
• Run time (dynamic)
  – Last time prediction (single-bit)
  – Two-bit counter based prediction
  – Two-level prediction (global vs. local)
  – Hybrid
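
The slide only names two-bit counter based prediction. One common formulation, assumed here, is a saturating counter per branch with values 0 to 3, where 2 and 3 predict taken. The class name, the dictionary table, and the initial counter value are hypothetical choices for this sketch.

```python
class TwoBitCounterPredictor:
    """Per-branch 2-bit saturating counter: 0,1 predict not taken; 2,3 predict taken."""

    def __init__(self, initial=1):
        self.counters = {}  # branch PC -> counter value in 0..3
        self.initial = initial  # assumed starting state (1 = weakly not taken)

    def predict(self, pc):
        return self.counters.get(pc, self.initial) >= 2

    def update(self, pc, taken):
        c = self.counters.get(pc, self.initial)
        # Saturate at 3 when counting up and at 0 when counting down.
        self.counters[pc] = min(c + 1, 3) if taken else max(c - 1, 0)


if __name__ == "__main__":
    predictor = TwoBitCounterPredictor()
    for outcome in "TTTNT":
        taken = outcome == "T"
        print(outcome, predictor.predict(0x100))
        predictor.update(0x100, taken)
```

Unlike a single-bit scheme, a single not-taken outcome in a run of taken outcomes does not flip the prediction in this model.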

Static Branch Prediction (I)
• Always not-taken
  – Simple to implement: no need for BTB, no direction prediction
  – Low accuracy: ~30-40%
  – Compiler can lay out code such that the likely path is the "not-taken" path
• Always taken
  – No direction prediction
  – Better accuracy: ~60-70%
    • Backward branches (i.e. loop branches) are usually taken
    • Backward branch: target address lower than branch PC
• Backward taken, forward not taken (BTFN)
  – Predict backward (loop) branches as taken, others not-taken
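
The three static schemes above reduce to very small decision functions. The sketch below uses hypothetical function names and takes the branch PC and its target address as plain integers. The BTFN rule follows the slide's definition of a backward branch (target address lower than branch PC).

```python
def predict_always_not_taken(branch_pc, target_pc):
    """Always not-taken: ignores both addresses."""
    return False


def predict_always_taken(branch_pc, target_pc):
    """Always taken: ignores both addresses."""
    return True


def predict_btfn(branch_pc, target_pc):
    """BTFN: backward branches (target below the branch PC) are predicted taken."""
    return target_pc < branch_pc


if __name__ == "__main__":
    print(predict_btfn(0x200, 0x100))  # backward (loop-style) branch
    print(predict_btfn(0x200, 0x240))  # forward branch
```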

Static Branch Prediction (II)
• Profile-based
  – Idea: Compiler determines likely direction for each branch using profile run. Encodes that direction as a hint bit in the branch instruction format.
  + Per branch prediction (more accurate than schemes in previous slide) → accurate if profile is representative!
  - Requires hint bits in the branch instruction format
  - Accuracy depends on dynamic branch behavior:
      TTTTTTTTTTNNNNNNNNNN → 50% accuracy
      TNTNTNTNTNTNTNTNTNTN → 50% accuracy
  - Accuracy depends on the representativeness of profile input set
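
The two 50% figures can be checked with a small sketch. It assumes the hint bit is set to the majority direction seen in the profile run, with ties broken toward taken. The tie-break and the function names are assumptions for illustration only. The profile input here is the same sequence used for evaluation, which is the best case for representativeness.

```python
def profile_hint(profile_outcomes):
    """Hint bit from a profile run: True (taken) if taken at least half the time."""
    taken_count = sum(profile_outcomes)
    return taken_count * 2 >= len(profile_outcomes)  # assumed tie-break: taken


def hint_accuracy(hint, outcomes):
    """Fraction of dynamic instances of the branch that match the static hint."""
    return sum(1 for taken in outcomes if taken == hint) / len(outcomes)


if __name__ == "__main__":
    phased = [c == "T" for c in "T" * 10 + "N" * 10]
    alternating = [c == "T" for c in "TN" * 10]
    for pattern in (phased, alternating):
        hint = profile_hint(pattern)
        print(hint, hint_accuracy(hint, pattern))
```

Both patterns are taken exactly half the time, so any single fixed hint is right for only half of the dynamic instances, regardless of how the outcomes are ordered.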