Link Analysis PageRank Existence and Uniqueness a central result from the theory of random walks aka markov processes) For graphs that satisfy certain conditions the stationary distribution is unique and eventually will be reached no matter what the initial probability distribution at time t=o
Link Analysis 16 Existence and Uniqueness A central result from the theory of random walks (aka Markov processes): For graphs that satisfy certain conditions, the stationary distribution is unique and eventually will be reached no matter what the initial probability distribution at time t = 0. PageRank
Link Analysis PageRank Spider traps a group of pages is a spider trap if there are no links from within the group to outside the group Random surfer gets trapped Spider traps violate the conditions needed for the random walk theorem 17
Link Analysis 17 Spider traps ▪ A group of pages is a spider trap if there are no links from within the group to outside the group ▪ Random surfer gets trapped ▪ Spider traps violate the conditions needed for the random walk theorem PageRank
Link Analysis PageRank Microsoft becomes a spider trap Yahoo y a m /2120 a1/200 m01/21 mazo Soft 3/45/8 a 21/23/8 3/2742 003
Link Analysis 18 Microsoft becomes a spider trap Yahoo Amazon M’soft y 1/2 1/2 0 a 1/2 0 0 m 0 1/2 1 y a m y a = m 1 1 1 1 1/2 3/2 3/4 1/2 7/4 5/8 3/8 2 0 0 3 . . . PageRank
Link Analysis PageRank Random teleports The google solution for spider traps At each time step, the random surfer has two options With probability B follow a link at random With probability 1-B, jump to some page uniformly at random Common values for b are in the range 0.8 to 0.9 Surfer will teleport out of spider trap within a few time steps
Link Analysis 19 Random teleports ▪ The Google solution for spider traps ▪ At each time step, the random surfer has two options: ▪ With probability , follow a link at random ▪ With probability 1-, jump to some page uniformly at random ▪ Common values for are in the range 0.8 to 0.9 ▪ Surfer will teleport out of spider trap within a few time steps PageRank
Link Analysis PageRank Random teleports (B=0. 8) 2 Yahoo Po.8*1/2 2 0.8*1/2+0.2*1/3 0 1/3 0.8 0.2*1/3 0.2*1/3 /21/20 /31/31/3 mazo Softy081200+021/31/313 01/21 1/31/31/3 7/157/151/15 7/151/151/15 m1/157/1513/15
Link Analysis 20 Random teleports ( = 0.8) Yahoo Amazon M’soft 1/2 1/2 0.8*1/2 0.8*1/2 0.2*1/3 0.2*1/3 0.2*1/3 y 1/2 a 1/2 m 0 y 1/2 1/2 0 y 0.8* 1/3 1/3 1/3 y + 0.2* 1/2 1/2 0 1/2 0 0 0 1/2 1 1/3 1/3 1/3 1/3 1/3 1/3 1/3 1/3 1/3 y 7/15 7/15 1/15 a 7/15 1/15 1/15 m 1/15 7/15 13/15 0.8 + 0.2 PageRank