Edge mapping: The 3 fundamental edges corresponding to weight, input, and result can be mapped to corresponding edges in the systolic array according to the following table:

| e | p^T e | In block diagram | s^T e | In block diagram |
|---|---|---|---|---|
| e_weight^T = (1 0) | 0 | Weights stay | 1 | With delay |
| e_input^T = (0 1) | 1 | Input forward | 0 | Without delay |
| e_result^T = (1 -1) | -1 | Result backward | 1 | With delay |
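The table entries can be reproduced with two dot products per edge. The sketch below assumes the processor vector p^T = (0 1) and the scheduling vector s^T = (1 0); these are not stated in the table itself but are consistent with its values. The names `EDGES`, `P`, `S`, `dot`, and `map_edges` are illustrative only.

```python
# Fundamental edges of the dependence graph as (i, j) displacement vectors,
# taken from the edge-mapping table.
EDGES = {
    "weight": (1, 0),
    "input": (0, 1),
    "result": (1, -1),
}

# Assumed processor and scheduling vectors (inferred, not given in the table).
P = (0, 1)  # p^T
S = (1, 0)  # s^T


def dot(a, b):
    """Inner product of two equal-length tuples."""
    return sum(x * y for x, y in zip(a, b))


def map_edges(edges=EDGES, p=P, s=S):
    """Return {edge name: (p^T e, s^T e)} for every fundamental edge."""
    return {name: (dot(p, e), dot(s, e)) for name, e in edges.items()}


if __name__ == "__main__":
    for name, (space, delay) in map_edges().items():
        print(f"{name:>6}: p^T e = {space:2d}, s^T e = {delay}")
```

Here p^T e gives the processor displacement of an edge (0 means the value stays in place, +1 forward, -1 backward) and s^T e gives the number of delays on that edge (0 means the value is broadcast without delay).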
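[Figure: space-time diagram with time and processor axes. Each input is broadcast to all processors in the same time step; the processors hold the weights w0, w1, w2.]

Partial results in each processor for each broadcast input:

| Broadcast input | Processor with w0 | Processor with w1 | Processor with w2 |
|---|---|---|---|
| x0 | w0x0 | w1x0 | w2x0 |
| x1 | w0x1+w1x0 | w1x1+w2x0 | w2x1 |
| x2 | w0x2+w1x1+w2x0 | w1x2+w2x1 | w2x2 |
| x3 | w0x3+w1x2+w2x1 | w1x3+w2x2 | w2x3 |

Node I^T = (i j) is mapped to processor p^T I = j.
Node I^T = (i j) is executed at time s^T I = i.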
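The node mapping above can be checked with a small simulation. The following is a minimal sketch, assuming a 3-tap FIR filter in which processor j holds weight w_j, the input is broadcast to all processors, and each partial result moves one processor backward with one delay. The function names `systolic_fir` and `direct_fir` are hypothetical helpers introduced for illustration.

```python
def systolic_fir(weights, xs):
    """Simulate the broadcast-input, weights-stay, results-backward array.

    Processor j holds weights[j]. In each time step one input sample is
    broadcast to all processors; processor j adds weights[j] * x to the
    partial result that processor j + 1 produced in the previous step.
    Returns the values produced by processor 0, one per time step.
    """
    n = len(weights)
    prev = [0] * n  # partial results from the previous time step
    ys = []
    for x in xs:
        cur = [
            weights[j] * x + (prev[j + 1] if j + 1 < n else 0)
            for j in range(n)
        ]
        ys.append(cur[0])  # processor 0 delivers the finished output
        prev = cur
    return ys


def direct_fir(weights, xs):
    """Reference: y[i] = sum over k of weights[k] * xs[i - k], for i - k >= 0."""
    return [
        sum(w * xs[i - k] for k, w in enumerate(weights) if i - k >= 0)
        for i in range(len(xs))
    ]


if __name__ == "__main__":
    w = [2, -1, 3]
    x = [1, 4, -2, 5, 0]
    print(systolic_fir(w, x))
    print(direct_fir(w, x))
```

In this sketch the loop index plays the role of the time s^T I = i and the list position plays the role of the processor p^T I = j, so the two functions should agree sample by sample.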