– 6 – Processor
Return Example

demo-ret.ys
    0x000:    irmovl Stack,%esp  # Initialize stack pointer
    0x006:    nop                # Avoid hazard on %esp
    0x007:    nop
    0x008:    nop
    0x009:    call p             # Procedure call
    0x00e:    irmovl $5,%esi     # Return point
    0x014:    halt
    0x020: .pos 0x20
    0x020: p: nop                # procedure
    0x021:    nop
    0x022:    nop
    0x023:    ret
    0x024:    irmovl $1,%eax     # Should not be executed
    0x02a:    irmovl $2,%ecx     # Should not be executed
    0x030:    irmovl $3,%edx     # Should not be executed
    0x036:    irmovl $4,%ebx     # Should not be executed
    0x100: .pos 0x100
    0x100: Stack:                # Stack: Stack pointer

◼ Requires lots of nops to avoid data hazards
– 7 – Processor
Incorrect Return Example

# demo-ret
    0x023: ret                       F D E M W
    0x024: irmovl $1,%eax  # Oops!     F D E M W
    0x02a: irmovl $2,%ecx  # Oops!       F D E M W
    0x030: irmovl $3,%edx  # Oops!         F D E M W
    0x00e: irmovl $5,%esi  # Return          F D E M W

Pipeline state when ret reaches write-back:
    W: valM = 0x0e
    M: valE = 1, dstE = %eax
    E: valE ← 2, dstE = %ecx
    D: valC = 3, dstE = %edx
    F: valC ← 5, rB ← %esi

◼ Incorrectly executes 3 instructions following ret
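The snapshot on this slide can be reproduced with a small sketch. It assumes one stage per cycle with no stalls or bubbles, and it numbers cycles from the fetch of ret as cycle 1; the slide does not number cycles, so that numbering and the helper names `stage_at` and `snapshot` are illustrative only.

```python
STAGES = ["F", "D", "E", "M", "W"]

def stage_at(fetch_cycle, cycle):
    """Stage occupied at `cycle` by an instruction fetched at `fetch_cycle`.

    Assumes the instruction advances one stage per cycle (no stall, no bubble).
    Returns None when the instruction is not in the pipeline at that cycle.
    """
    idx = cycle - fetch_cycle
    return STAGES[idx] if 0 <= idx < len(STAGES) else None

# (address, instruction, assumed fetch cycle), in the fetch order shown on the slide
program = [
    (0x023, "ret", 1),
    (0x024, "irmovl $1,%eax", 2),   # fetched by mistake
    (0x02a, "irmovl $2,%ecx", 3),   # fetched by mistake
    (0x030, "irmovl $3,%edx", 4),   # fetched by mistake
    (0x00e, "irmovl $5,%esi", 5),   # correct return point
]

def snapshot(cycle):
    """Map each occupied stage to the instruction it holds at `cycle`."""
    result = {}
    for _addr, instr, fetched in program:
        stage = stage_at(fetched, cycle)
        if stage is not None:
            result[stage] = instr
    return result

snap = snapshot(5)
for stage in reversed(STAGES):
    print(stage, snap.get(stage))
```

With ret in write-back, the three instructions after it already occupy M, E, and D, which is why they would complete unless the pipeline prevents it.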
– 8 – Processor
Handling Misprediction

Predict branch as taken
◼ Fetch 2 instructions at target

Cancel when mispredicted
◼ Detect branch not-taken in execute stage
◼ On following cycle, replace instructions in execute and decode by bubbles
◼ No side effects have occurred yet

    # demo-j.ys                                 1  2  3  4  5  6  7  8  9  10
    0x000:    xorl %eax,%eax                    F  D  E  M  W
    0x002:    jne target       # Not taken         F  D  E  M  W
    0x011: t: irmovl $2,%edx   # Target               F  D
              bubble                                        E  M  W
    0x017:    irmovl $3,%ebx   # Target+1                F
              bubble                                        D  E  M  W
    0x007:    irmovl $1,%eax   # Fall through               F  D  E  M  W
    0x00d:    nop                                              F  D  E  M  W

Figure 4.63 P346
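The cancellation timing above can be sketched as a short simulation. This is an illustration only: the function names are hypothetical, and marking a bubble with a lowercase stage letter is a convention chosen here, not one from the slides.

```python
STAGES = "FDEMW"

def timeline(fetch_cycle, squash_cycle=None):
    """Return {cycle: stage letter} for one instruction.

    From `squash_cycle` onward the instruction is replaced by a bubble,
    shown as a lowercase letter for the stage the bubble occupies.
    """
    cells = {}
    for i, stage in enumerate(STAGES):
        cycle = fetch_cycle + i
        squashed = squash_cycle is not None and cycle >= squash_cycle
        cells[cycle] = stage.lower() if squashed else stage
    return cells

def mispredict_schedule(jump_fetch):
    """Schedule for a not-taken jump that was predicted taken."""
    detect = jump_fetch + 2   # jump is in execute: misprediction detected
    squash = detect + 1       # following cycle: bubbles replace D and E contents
    return {
        "jne": timeline(jump_fetch),
        "target": timeline(jump_fetch + 1, squash),
        "target+1": timeline(jump_fetch + 2, squash),
        "fall through": timeline(squash),  # correct path is fetched in the squash cycle
    }

def render(schedule, last_cycle=10):
    """Format the schedule as one text row per instruction, '.' for idle cycles."""
    lines = []
    for name, cells in schedule.items():
        row = " ".join(cells.get(c, ".") for c in range(1, last_cycle + 1))
        lines.append(f"{name:<13}{row}")
    return "\n".join(lines)

sched = mispredict_schedule(jump_fetch=2)
print(render(sched))
```

The target instruction gets as far as decode and target+1 only as far as fetch before being replaced, so neither reaches a stage that updates registers, condition codes, or memory.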
– 9 – Processor
Detecting Mispredicted Branch

[Figure: pipeline hardware diagram with the Fetch, Decode, Execute, Memory, and Write back stages and the F, D, E, M, W pipeline registers; the E register holds icode and the execute stage produces e_Bch. Figure 4.64 P347]

    Condition              Trigger
    Mispredicted Branch    E_icode = IJXX & !e_Bch
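The trigger condition can be written as a one-line predicate. This is a minimal sketch, not the course's HCL: the numeric value given to `IJXX` is assumed here for illustration, and `mispredicted_branch` is a hypothetical name.

```python
IJXX = 7  # assumed icode value for the jXX instructions (illustrative)

def mispredicted_branch(E_icode, e_Bch):
    """True when the instruction in execute is a jump whose condition is false.

    E_icode: instruction code held in pipeline register E.
    e_Bch:   branch-taken signal computed in the execute stage.
    Since every branch is predicted taken, a jump with e_Bch false was mispredicted.
    """
    return E_icode == IJXX and not e_Bch

print(mispredicted_branch(IJXX, e_Bch=False))
print(mispredicted_branch(IJXX, e_Bch=True))
```

Both signals come from the execute stage, so the check is available in the same cycle in which the jump's condition is evaluated.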
– 10 – Processor
Control for Misprediction

    # demo-j.ys                                 1  2  3  4  5  6  7  8  9  10
    0x000:    xorl %eax,%eax                    F  D  E  M  W
    0x002:    jne t            # Not taken         F  D  E  M  W
    0x011: t: irmovl $2,%edx   # Target               F  D
              bubble                                        E  M  W
    0x017:    irmovl $3,%ebx   # Target+1                F
              bubble                                        D  E  M  W
    0x007:    irmovl $1,%eax   # Fall through               F  D  E  M  W
    0x00d:    nop                                              F  D  E  M  W

Figure 4.63 P346

    Condition              F        D        E        M        W
    Mispredicted Branch    normal   bubble   bubble   normal   normal

Figure 4.66 P348
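The control table can be expressed as a function from the detected condition to one action per pipeline register. The sketch below is illustrative: the names `pipeline_control`, `NORMAL`, and `BUBBLE` are hypothetical, and it handles only the mispredicted-branch case shown on this slide.

```python
NORMAL, BUBBLE = "normal", "bubble"
PIPE_REGS = ["F", "D", "E", "M", "W"]

def pipeline_control(mispredicted):
    """Return the action for each pipeline register on the next clock edge.

    With no special condition every register loads normally. On a mispredicted
    branch, registers D and E load a bubble, which cancels the two instructions
    fetched from the predicted target.
    """
    actions = {reg: NORMAL for reg in PIPE_REGS}
    if mispredicted:
        actions["D"] = BUBBLE
        actions["E"] = BUBBLE
    return actions

for reg, action in pipeline_control(mispredicted=True).items():
    print(reg, action)
```

F stays normal because the fetch stage must load the fall-through address in that cycle, while M and W hold the jump and older instructions, which are on the correct path.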