6.042/18.] Mathematics for Computer Science May3,2005 Srini devadas and Eric Lehman Lecture notes Expected value I The expectation or expected value of a random variable is a single number that tells you a lot about the behavior of the variable. Roughly, the expectation is the average value, where each value is weighted according to the probability that it comes up. Formally, the expected value of a random variable r defined on a sample space s is: (B)=∑R()Pr(o) To appreciate its signficance, suppose S is the set of students in a class, and we select a student uniformly at random. Let r be the selected student's exam score. Then Ex(r)is just the class average-the first thing everyone want to know after getting their test back In the same way, expectation is usually the first thing one wants to determine about any random variable Let's work through an example. Let r be the number that comes up on a fair, six-sided ie. Then the expected value of R is Ex 1+2·+3:+4+6. 1 6 variable might never actually take on that value. You cant roll a 3) on an ordinary die/? This calculation shows that the name"expected value"is a little misleading the rando 1 Betting on Coins Dan, Eric, and Nick decide to play a fun game. Each player puts $2 on the table secretly writes down either heads "or tails". Then one of them tosses a fair coin $6 on the table is divided evenly among the players who correctly predicted the outcome of the coin toss. If everyone guessed incorrectly, then everyone takes their money back. After many repetitions of this game, Dan has lost a lot of money- more than can b explained by bad luck. What's going on? A tree diagram for this problem is worked out below, under the assumptions that everyone guesses correctly with probability 1 /2 and everyone is correct independently
� � � 6.042/18.062J Mathematics for Computer Science May 3, 2005 Srini Devadas and Eric Lehman Lecture Notes Expected Value I The expectation or expected value of a random variable is a single number that tells you a lot about the behavior of the variable. Roughly, the expectation is the average value, where each value is weighted according to the probability that it comes up. Formally, the expected value of a random variable R defined on a sample space S is: Ex (R) = R(w) Pr (w) w∈S To appreciate its signficance, suppose S is the set of students in a class, and we select a student uniformly at random. Let R be the selected student’s exam score. Then Ex (R) is just the class average— the first thing everyone want to know after getting their test back! In the same way, expectation is usually the first thing one wants to determine about any random variable. Let’s work through an example. Let R be the number that comes up on a fair, sixsided die. Then the expected value of R is: � 6 1 Ex (R) = k 6 k=1 1 1 1 1 1 1 = 1 · + 2 · + 3 · + 4 · + 5 · + 6 · 6 6 6 6 6 6 7 = 2 This calculation shows that the name “expected value” is a little misleading; the random variable might never actually take on that value. You can’t roll a 3 1 2 on an ordinary die! 1 Betting on Coins Dan, Eric, and Nick decide to play a fun game. Each player puts $2 on the table and secretly writes down either “heads” or “tails”. Then one of them tosses a fair coin. The $6 on the table is divided evenly among the players who correctly predicted the outcome of the coin toss. If everyone guessed incorrectly, then everyone takes their money back. After many repetitions of this game, Dan has lost a lot of money— more than can be explained by bad luck. What’s going on? A tree diagram for this problem is worked out below, under the assumptions that everyone guesses correctly with probability 1/2 and everyone is correct independently
Expected value I Nick right? Dan's payoff probability Eric right? 1/8 112 1/2 Dan right? $1 118 112 18 112 1/2 1/2N 1/2 1/8 112 N 1/2 $2 1/8 112 112 In the"payoff"column, were accounting for the fact that Dan has to put in $2 just to play So, for example, if he guesses correctly and Eric and Nick are wrong, then he takes all $6 n the table, but his net profit is only $4. Working from the tree diagram, Dans expected payoff is: Ex( payoff=0·3+1·+1·+4·+(-2)·。+(-2)·+(-2)·3+0 So the game perfectly fair! Over time, he should neither win nor lose money. The trick is that Nick and Eric are collaborating, in particular, they always make oppo site guesses. So our assumption everyone is correct independently is wrong; actually the events that Nick is correct and Eric is correct are mutually exclusive! As a result, Dan can never win all the money on the table. When he guesses correctly, he always has to split his winnings with someone else. This lowers his overall expectation, as the correct ted tree diagram below shows
2 Expected Value I 1/8 1/8 1/8 1/8 1/8 1/8 1/8 1/8 probability Dan right? Eric right? Nick right? Dan’s payoff Y N Y Y Y Y Y Y N N N N N N $0 −$2 −$2 −$2 $4 $1 $1 $0 1/2 1/2 1/2 1/2 1/2 1/2 1/2 1/2 1/2 1/2 1/2 1/2 1/2 1/2 In the “payoff” column, we’re accounting for the fact that Dan has to put in $2 just to play. So, for example, if he guesses correctly and Eric and Nick are wrong, then he takes all $6 on the table, but his net profit is only $4. Working from the tree diagram, Dan’s expected payoff is: 1 1 1 1 1 1 1 1 Ex (payoff) = 0 · + 1 · + 1 · + 4 · + (−2) · + (−2) · + (−2) · + 0 · 8 8 8 8 8 8 8 8 = 0 So the game perfectly fair! Over time, he should neither win nor lose money. The trick is that Nick and Eric are collaborating; in particular, they always make opposite guesses. So our assumption everyone is correct independently is wrong; actually the events that Nick is correct and Eric is correct are mutually exclusive! As a result, Dan can never win all the money on the table. When he guesses correctly, he always has to split his winnings with someone else. This lowers his overall expectation, as the corrected tree diagram below shows:
Expected value I Nick right? Dan's payoff probabi Eric right? Y Dan right? $1 114 112 114 1/2 1/2 N 1/2 $2 114 112 114 From this revised tree diagram, we can work out Dan's actual expected payoff Ex( payoff=0:0+1·元+1·元+4.0+(-2)·0+(-2)+(-2)·元+00 1 1 So he loses an average of a half-dollar per game! Similar opportunities for subtle cheating come up in many betting games. For exam le, a group of friends once organized a football pool where each participant would guess the outcome of every game each week relative to the spread. This may mean nothing to you, but the upshot is that everyone was effectively betting on the outcomes of 12 or 13 coin tosses each week. The person who correctly predicts the most coin tosses won a lot of thinking in terms of the first tree diagram, swore up and down that there was no way to get an unfair"edge". But actually the number of participants was small enough that just two players betting oppositely could gain a substantial advantage Anothe aple involves a former MIT professor of statistics, Herman Chernoff State lotteries are the worst gambling games around because the state pays out only a fraction of the money it takes in. But Chernoff figured out a way to win! Here are rules for a typical lottery All players pay $1 to play and select 4 numbers from 1 to 36 The state draws 4 numbers from 1 to 36 uniformly at random
Expected Value I 3 probability Dan right? Eric right? Nick right? Dan’s payoff Y N Y Y Y Y Y Y N N N N N N $0 −$2 −$2 −$2 $4 $1 $1 $0 1/2 1/2 1/2 1/2 1/2 1/2 0 1 0 1/4 0 0 1/4 1/4 0 1/4 0 1 0 1 1 0 From this revised tree diagram, we can work out Dan’s actual expected payoff: 1 1 1 1 Ex (payoff) = 0 · 0 + 1 · + 1 · + 4 · 0 + (−2) · 0 + (−2) · + (−2) · + 0 · 0 4 4 4 4 1 = −2 So he loses an average of a halfdollar per game! Similar opportunities for subtle cheating come up in many betting games. For example, a group of friends once organized a football pool where each participant would guess the outcome of every game each week relative to the spread. This may mean nothing to you, but the upshot is that everyone was effectively betting on the outcomes of 12 or 13 coin tosses each week. The person who correctly predicts the most coin tosses won a lot of money. The organizer, thinking in terms of the first tree diagram, swore up and down that there was no way to get an unfair “edge”. But actually the number of participants was small enough that just two players betting oppositely could gain a substantial advantage! Another example involves a former MIT professor of statistics, Herman Chernoff. State lotteries are the worst gambling games around because the state pays out only a fraction of the money it takes in. But Chernoff figured out a way to win! Here are rules for a typical lottery: • All players pay $1 to play and select 4 numbers from 1 to 36. • The state draws 4 numbers from 1 to 36 uniformly at random
Expected value I The state divides 1/2 the money collected among the people who guessed correctly and spends the other half repairing the Big Dig This is a lot like our betting game, except that there are more players and more choices Chernoff discovered that a small set of numbers was selected by a large fraction of the population- apparently many people think the same way. It was as if the players were collaborating to lose! If any one of them guessed correctly, then theyd have to split the pot with many other players. By selecting numbers uniformly at random, Chernoff was unlikely to get one of these favored sequences. So if he won, hed likely get the whole pot! By analyzing actual state lottery data, he determined that he could win an average of 7 cents on the dollar this way 2 Equivalent Definitions of expectation There are some other ways of writing the definition of expectation. Sometimes using one of these other formulations can make computing an expectation a lot easier. One option is to group together all outcomes on which the random variable takes on the same value Theorem 1 (B)=∑xP(R Proof. We'll transform the left side into the right. Let[R=r] be the event that R=r Ex(B)=∑()Pr(u) ∑R()Pr(a) r∈ range(B)u∈[R=r ∑∑xP r∈ range(B)t∈[R=r ∑Pr(m) rE range(r) x·Pr(R=x rE range(r) On the second line, we break the single sum into two. The outer sum runs over all possible values a that the random variable takes on and the inner sum runs over all outcomes taking on that value. Thus, were still summing over every outcome in the sample space exactly once. On the last line, we use the definition of the probability of the event [R
� � � � � � � � 4 Expected Value I • The state divides 1/2 the money collected among the people who guessed correctly and spends the other half repairing the Big Dig. This is a lot like our betting game, except that there are more players and more choices. Chernoff discovered that a small set of numbers was selected by a large fraction of the population— apparently many people think the same way. It was as if the players were collaborating to lose! If any one of them guessed correctly, then they’d have to split the pot with many other players. By selecting numbers uniformly at random, Chernoff was unlikely to get one of these favored sequences. So if he won, he’d likely get the whole pot! By analyzing actual state lottery data, he determined that he could win an average of 7 cents on the dollar this way! 2 Equivalent Definitions of Expectation There are some other ways of writing the definition of expectation. Sometimes using one of these other formulations can make computing an expectation a lot easier. One option is to group together all outcomes on which the random variable takes on the same value. Theorem 1. � Ex (R) = x · Pr (R = x) x∈ range(R) Proof. We’ll transform the left side into the right. Let [R = x] be the event that R = x. Ex (R) = R(w) Pr (w) w∈S = R(w) Pr (w) x∈ range(R) w∈[R=x] = x Pr (w) x∈ range(R) w∈[R=x] ⎛ = ⎝x ⎞ · Pr (w)⎠ x∈ range(R) w∈[R=x] = x · Pr (R = x) x∈ range(R) On the second line, we break the single sum into two. The outer sum runs over all possible values x that the random variable takes on, and the inner sum runs over all outcomes taking on that value. Thus, we’re still summing over every outcome in the sample space exactly once. On the last line, we use the definition of the probability of the event [R = x]
xpected value I Corollary 2. If R is a natural-valued random variable, then B)=∑iPr(R= b There is another way to write the expected value of a random variable that takes on lues only in the natural numbers, N=10, 1, 2,.. Theorem 3. If R is a natural-valued random variable, then Proof. Consider the sum Pr(R=1)+Pr(R=2)+Pr(R=3)+ +Pr(R=2)+Pr(R=3)+ Pr(R=3)+… The columns sum to 1. Pr(R=1), 2. Pr(R=2),3. Pr(R=3), etc. Thus, the whole sum ∑iP(R=i)=Bx(F) Here, were using Corollary 2. On the other hand, the rows sum to Pr(R>O), Pr(R> 1 Pr(R> 2),etc. Thus, the whole sum is also equal to ∑P(R>) These two expressions for the whole sum must be equal, which proves the theorem. D 2.1 Mean Time to failure Lets look at a problem where one of these alternative definitions of expected value is particularly helpful. A computer program crashes at the end of each hour of use with probability p, if it has not crashed already. What is the expected time until the program crashes? If we let R be the number of hours until the crash, then the answer to our problem is Ex(R). This is a natural-valued variable, so we can use the formula Ex(B)=∑Pr(R>i
� � � � � Expected Value I 5 Corollary 2. If R is a naturalvalued random variable, then: ∞ Ex (R) = i · Pr (R = i) i=0 There is another way to write the expected value of a random variable that takes on values only in the natural numbers, N = {0, 1, 2, . . .}. Theorem 3. If R is a naturalvalued random variable, then: ∞ Ex (R) = Pr (R > i) i=0 Proof. Consider the sum: Pr (R = 1) + Pr (R = 2) + Pr (R = 3) + · · · + Pr (R = 2) + Pr (R = 3) + · · · + Pr (R = 3) + · · · + · · · The columns sum to 1 · Pr (R = 1), 2 · Pr (R = 2), 3 · Pr (R = 3), etc. Thus, the whole sum is equal to: ∞ i · Pr (R = i) = Ex (R) i=0 Here, we’re using Corollary 2. On the other hand, the rows sum to Pr (R > 0), Pr (R > 1), Pr (R > 2), etc. Thus, the whole sum is also equal to: ∞ Pr (R > i) i=0 These two expressions for the whole sum must be equal, which proves the theorem. 2.1 Mean Time to Failure Let’s look at a problem where one of these alternative definitions of expected value is particularly helpful. A computer program crashes at the end of each hour of use with probability p, if it has not crashed already. What is the expected time until the program crashes? If we let R be the number of hours until the crash, then the answer to our problem is Ex (R). This is a naturalvalued variable, so we can use the formula: ∞ Ex (R) = Pr (R > i) i=0