6.042/18.] Mathematics for Computer Science May5,2005 Srini devadas and Eric Lehman Lecture notes ted value li 1 The Number-Picking game Here is a game that you and i could play that reveals a strange property of expectation First, you think of a probability density function on the natural numbers. Your distri bution can be absolutely anything you like. For example, you might choose a uniform distribution on 1, 2,...,6, like the outcome of a fair die roll. Or you might choose a bi nomial distribution on 0, 1,...,n. You can even give every natural number a non-zero probability, provided that the sum of all probabilities is 1 Next, I pick a random number z according to your distribution. Then, you pick a random number y1 according to the same distribution. If your number is bigger than mine(y1 2), then the game ends. Otherwise, if our numbers are equal or mine is bigger (22 y1), then you pick a new number y2 with the same distribution, and keep picking values y3, y4, etc. until you get a value that is strictly bigger than my number, z. What is the expected number of picks that you must make? Certainly, you always need at least one pick, so the expected number is greater than one. An answer like 2 or 3 sounds reasonable, though one might suspect that the answer depends on the distribution. Let' s find out whether or not this intuition is correct 1.1 Analyzing the Game The number of picks you must make is a natural-valued random variable. And, as weve seen, there is a nice formula for the expectation of a natural-valued random variable Ex(# times you pick)=>Pr(# times you pick>k) Suppose that I've picked my number z, and you have picked k numbers y1, y2, ..., yu There are two possibilities If there is a unique largest number among our picks, then my number is as likely to be it ars. So with probability 1/(+1) my number is larger than all of yours, and you must pick again
� 6.042/18.062J Mathematics for Computer Science May 5, 2005 Srini Devadas and Eric Lehman Lecture Notes Expected Value II 1 The NumberPicking Game Here is a game that you and I could play that reveals a strange property of expectation. First, you think of a probability density function on the natural numbers. Your distribution can be absolutely anything you like. For example, you might choose a uniform distribution on 1, 2, . . . , 6, like the outcome of a fair die roll. Or you might choose a binomial distribution on 0, 1, . . . , n. You can even give every natural number a nonzero probability, provided that the sum of all probabilities is 1. Next, I pick a random number z according to your distribution. Then, you pick a random number y1 according to the same distribution. If your number is bigger than mine (y1 > z), then the game ends. Otherwise, if our numbers are equal or mine is bigger (z ≥ y1), then you pick a new number y2 with the same distribution, and keep picking values y3, y4, etc. until you get a value that is strictly bigger than my number, z. What is the expected number of picks that you must make? Certainly, you always need at least one pick, so the expected number is greater than one. An answer like 2 or 3 sounds reasonable, though one might suspect that the answer depends on the distribution. Let’s find out whether or not this intuition is correct. 1.1 Analyzing the Game The number of picks you must make is a naturalvalued random variable. And, as we’ve seen, there is a nice formula for the expectation of a naturalvalued random variable: ∞ Ex (# times you pick) = Pr (# times you pick > k) (1) k=0 Suppose that I’ve picked my number z, and you have picked k numbers y1, y2, . . . , yk. There are two possibilities: • If there is a unique largest number among our picks, then my number is as likely to be it as any one of yours. So with probability 1/(k + 1) my number is larger than all of yours, and you must pick again
Expected Value I Otherwise, there are several numbers tied for largest. My number is as likely to be one of these as any of your numbers, so with probability greater than 1/(k+ 1)you must pick again In both cases, with probability at least 1/(k+1), you need more than k picks to beat me other words Pr(# times you pick>k)≥ k+ This suggests that in order to minimize your rolls, you should chose a distribution such that ties are very rare. For example, you might choose the uniform distribution on (1, 2,..., 10003. In this case, the probability that you need more than k picks to beat me is very close to 1/(k+) for moderate values of k. For example, the probability that you need more than 99 picks is almost exactly 1%. This sounds very promising for you; intuitively, you might expect to win within a reasonable number of picks on average! Unfortunately for intuition, there is a simple proof that the expected number of picks that you need in order to beat me is infinite, regardless of the distribution! Let's plug(2) into(1) Ex(# times you pick) ∑; k+1 This phenomenon can cause all sorts of confusion! For example, suppose you have a communication network where each packet of data has a 1/k chance of being delayed by k or more steps. This sounds good; there is only a 1% chance of being delayed by 100 or more steps. But the expected delay for the packet is actually infinite e, There is a larger point here as well: not every random variable has a well-defined expectation. This idea may be disturbing at first, but remember that an expected value is just a weighted average. And there are many sets of numbers that have no conventional average either, such as {1,-2,3,-4,5,-6,…} Strictly speaking, we should qualify virtually all theorems involving expectation with phrases such as".Provided all expectations exist. But going to leave that sumption implicit. Fortunately, random variables without expectations don't arise too often in practice 2 The Coupon Collector Problem Every time I purchase a kid's meal at Taco Bell, I am graciously presented with a miniature Racin Rocket"car together with a launching device which enables me to project my new
2 Expected Value II • Otherwise, there are several numbers tied for largest. My number is as likely to be one of these as any of your numbers, so with probability greater than 1/(k + 1) you must pick again. In both cases, with probability at least 1/(k + 1), you need more than k picks to beat me. In other words: 1 Pr (# times you pick > k) ≥ (2) k + 1 This suggests that in order to minimize your rolls, you should chose a distribution such that ties are very rare. For example, you might choose the uniform distribution on {1, 2, . . . , 10100}. In this case, the probability that you need more than k picks to beat me is very close to 1/(k+1) for moderate values of k. For example, the probability that you need more than 99 picks is almost exactly 1%. This sounds very promising for you; intuitively, you might expect to win within a reasonable number of picks on average! Unfortunately for intuition, there is a simple proof that the expected number of picks that you need in order to beat me is infinite, regardless of the distribution! Let’s plug (2) into (1): � 1 ∞ Ex (# times you pick) = k + 1 k=0 = ∞ This phenomenon can cause all sorts of confusion! For example, suppose you have a communication network where each packet of data has a 1/k chance of being delayed by k or more steps. This sounds good; there is only a 1% chance of being delayed by 100 or more steps. But the expected delay for the packet is actually infinite! There is a larger point here as well: not every random variable has a welldefined expectation. This idea may be disturbing at first, but remember that an expected value is just a weighted average. And there are many sets of numbers that have no conventional average either, such as: {1, −2, 3, −4, 5, −6, . . .} Strictly speaking, we should qualify virtually all theorems involving expectation with phrases such as “...provided all expectations exist.” But we’re going to leave that assumption implicit. Fortunately, random variables without expectations don’t arise too often in practice. 2 The Coupon Collector Problem Every time I purchase a kid’s meal at Taco Bell, I am graciously presented with a miniature “Racin’ Rocket” car together with a launching device which enables me to project my new
Expected value Il vehicle across any tabletop or smooth floor at high velocity. Truly, my delight knows no bounds There are n different types of Racin Rocket car(blue, green, red, gray, etc. ) The type of car awarded to me each day by the kind woman at the Taco Bell register appears to be elected uniformly and independently at random. What is the expected number of kids meals that I must purchase in order to acquire at least one of each type of Racin Rocket car The same mathematical question shows up in many for example, what is the expected number of people you must poll in order to find at least one person with each possible birthday? Here, instead of collecting RacinRocket cars, you re collecting birth days. The general question is commonly called the coupon collector problem after yet another interpretation 2.1 A Solution Using Linearity of Expectation Linearity of expectation is somewhat like induction and the pigeonhole principle; it's a simple idea that can be used in all sorts of ingenious ways. For example, we can use linearity of expectation in a clever way to solve the coupon collector problem. Suppose there are five different types of Racin'Rocket, and I receive this sequence blue greengreen red blue orange blue orange gray Let's partition the sequence into 5 segments blue green green red blue orange blue orange gray The rule is that a segment ends whenever I get a new kind of car. For example, the middle segment ends when I get a red car for the first time. In this way, we can break the problem of collecting every type of car into stages. Then we can analyze each stage individually nd assemble the results using linearity of expectation. ya Let's return to the general case where I'm collecting n Racin Rockets. Let X, be the length of the k-th segment. The total number of kid's meals I must purchase to get all n Racin Rockets is the sum of the lengths of all these segments T=X0+X1+ Now let's focus our attention of the Xk, the length of the k-th segment. At the begin ning of segment k, I have k different types of car, and the segment ends when I acquire a new type. When I own k types, each kid's meal contains a type that I already have with probability k/n. Therefore, each meal contains a new type of car with probability k/n=(n-k)/n. Thus, the expected number of meals until i get a new kind of car
Expected Value II 3 vehicle across any tabletop or smooth floor at high velocity. Truly, my delight knows no bounds. There are n different types of Racin’ Rocket car (blue, green, red, gray, etc.). The type of car awarded to me each day by the kind woman at the Taco Bell register appears to be selected uniformly and independently at random. What is the expected number of kids meals that I must purchase in order to acquire at least one of each type of Racin’ Rocket car? The same mathematical question shows up in many guises: for example, what is the expected number of people you must poll in order to find at least one person with each possible birthday? Here, instead of collecting Racin’ Rocket cars, you’re collecting birthdays. The general question is commonly called the coupon collector problem after yet another interpretation. 2.1 A Solution Using Linearity of Expectation Linearity of expectation is somewhat like induction and the pigeonhole principle; it’s a simple idea that can be used in all sorts of ingenious ways. For example, we can use linearity of expecatation in a clever way to solve the coupon collector problem. Suppose there are five different types of Racin’ Rocket, and I receive this sequence: blue green green red blue orange blue orange gray Let’s partition the sequence into 5 segments: blue green green red blue orange blue orange gray ���� � �� � � �� � � �� � � �� � X0 X1 X2 X3 X4 The rule is that a segment ends whenever I get a new kind of car. For example, the middle segment ends when I get a red car for the first time. In this way, we can break the problem of collecting every type of car into stages. Then we can analyze each stage individually and assemble the results using linearity of expectation. Let’s return to the general case where I’m collecting n Racin’ Rockets. Let Xk be the length of the kth segment. The total number of kid’s meals I must purchase to get all n Racin’ Rockets is the sum of the lengths of all these segments: T = X0 + X1 + . . . + Xn−1 Now let’s focus our attention of the Xk, the length of the kth segment. At the beginning of segment k, I have k different types of car, and the segment ends when I acquire a new type. When I own k types, each kid’s meal contains a type that I already have with probability k/n. Therefore, each meal contains a new type of car with probability 1 − k/n = (n − k)/n. Thus, the expected number of meals until I get a new kind of car
Expected Value I is n/(n-k)by the"mean time to failure"formula that we worked out last time. So we have Ex(xk)= Linearity of expectation together with this observation solves the coupon collector obler (T)=Ex(Xo+X1 =Ex(X0)+Ex(X1)+…+Ex(Xn-1) nnn n-0n-1 111 =n(-+ n n-1 nh order. As you may recall, this sum is denoted Hn and is approximately lnn ns in reverse The summation on the next-to-last line is the n-th harmonic sum with the terms in reverse Let's use this general solution to answer some concrete questions. For example, the expected number of die rolls required to see every number from 1 to 6 is 6H6=14.7 And the expected number of people you must poll to find at least one person with each possible birthday is 365H365=2364.6 3 Expected Value of a Product Enough with sums! What about the expected value of a product of random variables? If Ri and R2 are independent, then the expected value of their product is the product of their expected values Theorem 1. For independent random variables Ri and R2 Ex(R1·B2)=Ex(R1)·Ex(B2) Proof. Well transform the right side into the left side Ex(B)Ex(B)=(∑xP(B1=)∑vP(B=0) r∈ Range(R1) x∈ Range(R1) y Pr(ri=aPr(ri=y r∈ Range(R1)y∈ Range(R2) ∑∑xyP(R1=∩R1=y) rERange(R1)yERange(R2)
� � � � � � � � 4 Expected Value II is n/(n − k) by the “mean time to failure” formula that we worked out last time. So we have: n Ex (Xk) = n − k Linearity of expecatation, together with this observation, solves the coupon collector problem: Ex (T) = Ex (X0 + X1 + . . . + Xn−1) = Ex (X0) + Ex (X1) + . . . + Ex (Xn−1) n n n n n = + + . . . + + + n − 0 n − 1 3 2 1 1 1 1 1 1 = n + + . . . + + + n n − 1 3 2 1 = nHn The summation on the nexttolast line is the nth harmonic sum with the terms in reverse order. As you may recall, this sum is denoted Hn and is approximately ln n. Let’s use this general solution to answer some concrete questions. For example, the expected number of die rolls required to see every number from 1 to 6 is: 6H6 = 14.7 . . . And the expected number of people you must poll to find at least one person with each possible birthday is: 365H365 = 2364.6 . . . 3 Expected Value of a Product Enough with sums! What about the expected value of a product of random variables? If R1 and R2 are independent, then the expected value of their product is the product of their expected values. Theorem 1. For independent random variables R1 and R2: Ex (R1 · R2) = Ex (R1) · Ex (R2) Proof. We’ll transform the right side into the left side: ⎛ ⎞ ⎛ ⎞ Ex (R1) · Ex (R2) = ⎝ x · Pr (R1 = x)⎠ ⎝· y · Pr (R2 = y)⎠ x∈Range(R1) x∈Range(R1) = xy Pr (R1 = x) Pr (R1 = y) x∈Range(R1) y∈Range(R2) = xy Pr (R1 = x ∩ R1 = y) x∈Range(R1) y∈Range(R2)
Expected value Il The second line comes from multiplying out the product of sums. Then we used the fact that R1 and R2 are independent. Now let's group terms for which the product y is the same ∑∑xP(B1=x∩R1=) 2(2 Pru,-n R, -D) Pr(R1·R2=x) =Ex(R1·R2) 3.1 The Product of Two Independent dice Suppose we throw two independent, fair dice and multiply the numbers that come up What is the expected value of this product? Let random variables R, and Ro be the numbers shown on the two dice. We can com pute the expected value of the product as follows Ex(Ri. R2)=Ex(r1). Ex(r2) 12 On the first line, were using Theorem 1. Then we use the result from last lecture that the cted value of one die is 3 3.2 The Product of Two Dependent dice Suppose that the two dice are not independent; in fact, suppose that the second die is always the same as the first. Does this change the expected value of the product? Is the dependence condition in Theorem 1 really necessary? As before, let random variables Ri and R2 be the numbers shown on the two dice. We
� � � � � � � Expected Value II 5 The second line comes from multiplying out the product of sums. Then we used the fact that R1 and R2 are independent. Now let’s group terms for which the product xy is the same: = xy Pr (R1 = x ∩ R1 = y) z∈Range(R1·R2) x,y: xy=z = z Pr (R1 = x ∩ R1 = y) z∈Range(R1·R2) x,y: xy=z = z · Pr (R1 · R2 = z) z∈Range(R1·R2) = Ex (R1 · R2) 3.1 The Product of Two Independent Dice Suppose we throw two independent, fair dice and multiply the numbers that come up. What is the expected value of this product? Let random variables R1 and R2 be the numbers shown on the two dice. We can compute the expected value of the product as follows: Ex (R1 · R2) = Ex (R1) · Ex (R2) 1 1 = 3 3 2 · 2 1 = 12 4 On the first line, we’re using Theorem 1. Then we use the result from last lecture that the expected value of one die is 3 1 . 2 3.2 The Product of Two Dependent Dice Suppose that the two dice are not independent; in fact, suppose that the second die is always the same as the first. Does this change the expected value of the product? Is the independence condition in Theorem 1 really necessary? As before, let random variables R1 and R2 be the numbers shown on the two dice. We