6.042/18.062] Mathematics for Computer Science February 24, 2005 Srini devadas and Eric Lehman Lecture notes Number theory ll Image of Alan Turing removed for copyright reasons s The man pictured above is Alan Turing, the most important figure in the history of mputer science. For decades, his fascinating life story was shrouded by government secrecy, societal taboo, and even his own deceptions At 24 Turing wrote a paper entitled On Computable Numbers, with an Application to the Entscheidungsproblem. The crux of the paper was an elegant way to model a computer in mathematical terms. This was a breakthrough, because it allowed the tools of mathemat ics to be brought to bear on questions of computation. For example, with his model in hand, Turing immediately proved that there exist problems that no computer can solve- no matter how ingenius the programmer. Turings paper is all the more remarkable be cause he wrote it in 1936, a full decade before any computer actually existed The word"Entscheidungsproblem"in the title refers to one of the 28 mathematical problems posed by David Hilbert in 1900 as challenges to mathematicians of the 20th century. Turing knocked that one off in the same paper. And perhaps you',ve heard of the"Church-Turing thesis"? Same paper. So Turing was obviously a brilliant guy who generated lots of amazing ideas. But this lecture is about one of Turings less-amazing deas. It involved codes. It involved number theory. And it was sort of stupid 1 Turing s Code Let's look back to the fall of 1937. Nazi germany was rearming under Adolf Hitler, world- shattering war looked imminent, and-like us- Alan Turing was pondering the useful
6.042/18.062J Mathematics for Computer Science February 24, 2005 Srini Devadas and Eric Lehman Lecture Notes Number Theory II Image of Alan Turing removed for copyright reasons. The man pictured above is Alan Turing, the most important figure in the history of computer science. For decades, his fascinating life story was shrouded by government secrecy, societal taboo, and even his own deceptions. At 24 Turing wrote a paper entitled On Computable Numbers, with an Application to the Entscheidungsproblem. The crux of the paper was an elegant way to model a computer in mathematical terms. This was a breakthrough, because it allowed the tools of mathematics to be brought to bear on questions of computation. For example, with his model in hand, Turing immediately proved that there exist problems that no computer can solve— no matter how ingenius the programmer. Turing’s paper is all the more remarkable because he wrote it in 1936, a full decade before any computer actually existed. The word “Entscheidungsproblem” in the title refers to one of the 28 mathematical problems posed by David Hilbert in 1900 as challenges to mathematicians of the 20th century. Turing knocked that one off in the same paper. And perhaps you’ve heard of the “ChurchTuring thesis”? Same paper. So Turing was obviously a brilliant guy who generated lots of amazing ideas. But this lecture is about one of Turing’s lessamazing ideas. It involved codes. It involved number theory. And it was sort of stupid. 1 Turing’s Code Let’s look back to the fall of 1937. Nazi Germany was rearming under Adolf Hitler, worldshattering war looked imminent, and— like us— Alan Turing was pondering the useful
Number Theory II ness of number theory. He forsaw that preserving military secrets would be vital in the coming conflict and proposed a way to encrypt communications using number theory. This is an idea that has ricocheted up to our own time. Today, number theory is the basis for numerous public-key cryptosystems, digital signature schemes, cryptographic hash functions, and digital cash systems. Every time you buy a book from Amazon, check your grades on WebsIs, or use a PayPal account, you are relying on number theoretic algorithms Soon after devising his code, Turing disappeared from public view, and half a centur ould pass before the world learned the full story of where he'd gone and what he did there. Well come back to Turings life in a little while; for now, let's investigate the code Turing left behind. The details are uncertain, since he never formally published the idea so well consider a couple possibilities 1.1 Turing s Code(version 1.0) The first challenge is to translate a text message into an integer so we can perform math ematical operations on it. This step is not intended to make a message harder to read, so the details are not too important. Here is one approach: replace each letter of the message with two digits(A=01, B=02,C=03, etc. and string all the digits together to form one huge number. For example, the message""could be translated this way c t o 90320151825 Turings code requires the message to be a prime number, so we may need to pad the result with a few more digits to make a prime. In this case, appending the digits 13 gives the number 2209032015182513, which is prime Now here is how the encryption process works. In the description below, m is the unencoded message(which we want to keep secret), m* is the encrypted message(which the Nazis may intercept), and k is the key Beforehand The sender and receiver agree on a secret key, which is a large prime k Encryption The sender encrypts the message m by computing m=m·k Decryption The receiver decrypts m* by computing k-m
2 Number Theory II ness of number theory. He forsaw that preserving military secrets would be vital in the coming conflict and proposed a way to encrypt communications using number theory. This is an idea that has ricocheted up to our own time. Today, number theory is the basis for numerous publickey cryptosystems, digital signature schemes, cryptographic hash functions, and digital cash systems. Every time you buy a book from Amazon, check your grades on WebSIS, or use a PayPal account, you are relying on number theoretic algorithms. Soon after devising his code, Turing disappeared from public view, and half a century would pass before the world learned the full story of where he’d gone and what he did there. We’ll come back to Turing’s life in a little while; for now, let’s investigate the code Turing left behind. The details are uncertain, since he never formally published the idea, so we’ll consider a couple possibilities. 1.1 Turing’s Code (Version 1.0) The first challenge is to translate a text message into an integer so we can perform mathematical operations on it. This step is not intended to make a message harder to read, so the details are not too important. Here is one approach: replace each letter of the message with two digits (A = 01, B = 02, C = 03, etc.) and string all the digits together to form one huge number. For example, the message “victory” could be translated this way: “v i c t o r y” → 22 09 03 20 15 18 25 Turing’s code requires the message to be a prime number, so we may need to pad the result with a few more digits to make a prime. In this case, appending the digits 13 gives the number 2209032015182513, which is prime. Now here is how the encryption process works. In the description below, m is the unencoded message (which we want to keep secret), m∗ is the encrypted message (which the Nazis may intercept), and k is the key. Beforehand The sender and receiver agree on a secret key, which is a large prime k. Encryption The sender encrypts the message m by computing: m∗ = m k· Decryption The receiver decrypts m∗ by computing: m∗ m k = · = m k k
Number Theory Il For example, suppose that the secret key is the prime number k= 22801763489 and the message m is"victory". Then the encrypted message is: =2209032015182513·22801763489 50369825549820718594667857 There are a couple questions one might naturally ask about Turings code 1. How can the sender and receiver ensure that m and k are prime numbers, as re- uired? The general problem of determining whether a large number is prime or compos ite has been studied for centuries, and reasonably good primality tests were known even in Turing s time. In 2002, Manindra Agrawal, Neeraj Kayal, and Nitin Sax ena announced a primality test that is guaranteed to work on a number n in about (log n)steps. This definitively placed primality testing in the class of"easy"com- putational problems at last. Amazingly, the description of their breakthrough algo- rithm was only thirteen lines long! 2. Is Turings code secure? The Nazis see only the encrypted message m=m-k, so recovering the original mes- sage m requires factoring m*. Despite immense efforts, no really efficient factoring algorithm has ever been found. It appears to be a fundamentally difficult problem, though a breakthrough someday is not impossible. In effect, Turing s code puts to practical use his discovery that there are limits to the power of computation. Thus, provided m and k are sufficiently large, the Nazis seem to be out of luck! This all sounds promising, but there is a major flaw in Turings code 1.2 Breaking Turings Code Let's consider what happens when the sender transmits a second message using Turings code and the same key. This gives the Nazis two encrypted messages to look at: m1=m1·k and m2=m2·k The greatest common divisor of the two encrypted messages, mi and mp, is the secret key k. And, as weve seen, the gcd of two numbers can be computed very efficiently. So after the second message is sent, the Nazis can read recover the secret key and read every It is difficult to believe a mathematician as brilliant as Turing could overlook such glaring problem. One possible explanation is that he had a slightly different system mind, one based on modular arithmetic
Number Theory II 3 For example, suppose that the secret key is the prime number k = 22801763489 and the message m is “victory”. Then the encrypted message is: m∗ = m k· = 2209032015182513 · 22801763489 = 50369825549820718594667857 There are a couple questions one might naturally ask about Turing’s code. 1. How can the sender and receiver ensure that m and k are prime numbers, as required? The general problem of determining whether a large number is prime or composite has been studied for centuries, and reasonably good primality tests were known even in Turing’s time. In 2002, Manindra Agrawal, Neeraj Kayal, and Nitin Saxena announced a primality test that is guaranteed to work on a number n in about (log n)12 steps. This definitively placed primality testing in the class of “easy” computational problems at last. Amazingly, the description of their breakthrough algorithm was only thirteen lines long! 2. Is Turing’s code secure? The Nazis see only the encrypted message m∗ = m·k, so recovering the original message m requires factoring m∗. Despite immense efforts, no really efficient factoring algorithm has ever been found. It appears to be a fundamentally difficult problem, though a breakthrough someday is not impossible. In effect, Turing’s code puts to practical use his discovery that there are limits to the power of computation. Thus, provided m and k are sufficiently large, the Nazis seem to be out of luck! This all sounds promising, but there is a major flaw in Turing’s code. 1.2 Breaking Turing’s Code Let’s consider what happens when the sender transmits a second message using Turing’s code and the same key. This gives the Nazis two encrypted messages to look at: m∗ = m1 · k and m∗ 1 2 = m2 · k The greatest common divisor of the two encrypted messages, m∗ and m∗ 2, is the secret 1 key k. And, as we’ve seen, the gcd of two numbers can be computed very efficiently. So after the second message is sent, the Nazis can read recover the secret key and read every message! It is difficult to believe a mathematician as brilliant as Turing could overlook such a glaring problem. One possible explanation is that he had a slightly different system in mind, one based on modular arithmetic
2 Modular arithmetic On page 1 of his masterpiece on number theory, Disquisitiones Arithmeticae, Gauss intro- duced the notion of"congruence". Now, Gauss is another guy who managed to cough up a half-decent idea every now and then, so let's take a look at this one. Gauss said that a is congruent to b modulo n if n (a-b). This is denoted a= b(mod n). For example 29= 15(mod 7) because 7(29-15) Intuitively, the symbol is sort of like an=sign, and the mod 7 describes the specific sense in which 29 is equal-ish to 15. Thus, even though(mod 7) appears over on the right side, it is in no sense more strongly associated with the 15 than the 29; in fact, it actually defines the meaning of the sign Here's another way to think about congruences: congruence modulo n defines a partition of the integers into n sets so that congruent numbers are all in the same set. For example, suppose that were working modulo 3. Then we can partition the integers into 3 sets as follows 0,3,6,9 1,4,7,10, Now integers in the same set are all congruent modulo 3. For example, 6 and-3 are both in the first set, and theyre congruent because their difference, 6 =9, is a multiple of 3 Similarly, 1l and 5 are both in the last set because 11-5=6 is a multiple of 3. On the other hand, numbers in different sets are not congruent. For example, 9 is in the first set and 11 in the last set, and theyre not congruent because 11-9=2 is not a multiple of 3. The upshot is that when arithmetic is done modulo n there are only n really different kinds of number to worry about. In this sense, modular arithmetic is a simplification of ordinary arithmetic and thus is a good reasoning tool There are many useful facts about congruences, some of which are listed in the lemma below. The overall theme is that congruences work a lot like equations, though ther re are a couple exceptions Lemma 1(Facts About Congruences). The following hold forn>1 1.a≡a(modn) 2.a≡b(modm) implies b≡a(modn) 3.a≡b(modm)andb≡c(modn) implies a≡c(modn) 4.a≡b(modn) implies a+c≡b+c(modn) 5.a≡b(modn) implies ac≡be(modn) 6.a≡b(modn)andc≡d(modn) imply a+c≡b+d(modm)
4 Number Theory II 2 Modular Arithmetic On page 1 of his masterpiece on number theory, Disquisitiones Arithmeticae, Gauss introduced the notion of “congruence”. Now, Gauss is another guy who managed to cough up a halfdecent idea every now and then, so let’s take a look at this one. Gauss said that a is congruent to b modulo n if n | (a − b). This is denoted a ≡ b (mod n). For example: 29 ≡ 15 (mod 7) because 7 (29 | − 15). Intuitively, the ≡ symbol is sort of like an = sign, and the mod 7 describes the specific sense in which 29 is equalish to 15. Thus, even though (mod 7) appears over on the right side, it is in no sense more strongly associated with the 15 than the 29; in fact, it actually defines the meaning of the ≡ sign. Here’s another way to think about congruences: congruence modulo n defines a partition of the integers into n sets so that congruent numbers are all in the same set. For example, suppose that we’re working modulo 3. Then we can partition the integers into 3 sets as follows: { . . . , −6, −3, 0, 3, 6, 9, . . . } { . . . , −5, −2, 1, 4, 7, 10, . . . . . . , −4, −1, 2, 5, 8, 11, . . . } { } Now integers in the same set are all congruent modulo 3. For example, 6 and 3 are both in the first set, and they’re congruent because their difference, 6 − (−3) = 9, is a multiple of 3. Similarly, 11 and 5 are both in the last set, because 11 − 5 = 6 is a multiple of 3. On the other hand, numbers in different sets are not congruent. For example, 9 is in the first set and 11 in the last set, and they’re not congruent because 11 − 9 = 2 is not a multiple of 3. The upshot is that when arithmetic is done modulo n there are only n really different kinds of number to worry about. In this sense, modular arithmetic is a simplification of ordinary arithmetic and thus is a good reasoning tool. There are many useful facts about congruences, some of which are listed in the lemma below. The overall theme is that congruences work a lot like equations, though there are a couple exceptions. Lemma 1 (Facts About Congruences). The following hold for n ≥ 1: 1. a ≡ a (mod n) 2. a ≡ b (mod n) implies b ≡ a (mod n) 3. a ≡ b (mod n) and b ≡ c (mod n) implies a ≡ c (mod n) 4. a ≡ b (mod n) implies a + c ≡ b + c (mod n) 5. a ≡ b (mod n) implies ac ≡ bc (mod n) 6. a ≡ b (mod n) and c ≡ d (mod n) imply a + c ≡ b + d (mod n)
Number Theory Il 7.a≡b(modm)andc≡d(modn) imply ac≡bd(modn) Proof. We prove only parts 1 and 7; the other parts are proved similarly (part 1)Every integer divides 0, so n (a-a), which means a= a(mod n) (part 7) The assumption a= b(mod n) implies that ac bc(mod n) by part 5. Similarly, the assumption c= d(mod n) implies bc bd(mod n). Therefore, ac bd(mod n)by art 3 There is a close connection between modular arithmetic and the remainder operation which we looked at last time. To clarify this link, lets reconsider the partition of the integers defined by congruence modulo 3 6.-3.0.3.6.9 5,-2,1,4,7,10 Notice that two numbers are in the same set if and only if they leave the same remainder hen divided by 3. The numbers in the first set all leave a remainder of 0 when divided by 3, the numbers in the second set leave a remainder of 1, and the numbers in the third leave a remainder of 2. Furthermore, notice that each number is in the same set as its own remainder. For example 11 and 1l rem 3=2 are both in the same set. Let' s bundle all this happy goodness into a lemma Lemma 2( Congruences and Remainders). The following assertions hold 1.a≡( a rem n)(modn) 2. a=b(mod n) if and only if (a rem n)=(b rem n Proof.(of part 2) By the division algorithm, there exist unique pairs of integers q1, rI and 92, r2 such that a=gin +rI ( where0≤71<m) b= g2n+ra ( where0≤n2<m) In these terms, (a rem n)=rI and(b rem n)=r2. Subtracting the second equation from the first gives: a-b=(qh-q2)n+(r1-r2) (where -n<r1-r2< n) Now a = b(mod n)if and only if n divides the left side. This is true if and only if n divides the right side, which holds if and only if r1-r2 is a multiple of n. Given the bounds onr r2, this happens precisely when r1=r2, which is equivalent to(a rem n)=(b rem n). D
Number Theory II 5 7. a ≡ b (mod n) and c ≡ d (mod n) imply ac ≡ bd (mod n) Proof. We prove only parts 1 and 7; the other parts are proved similarly. (part 1) Every integer divides 0, so n | (a − a), which means a ≡ a (mod n). (part 7) The assumption a ≡ b (mod n) implies that ac ≡ bc (mod n) by part 5. Similarly, the assumption c ≡ d (mod n) implies bc ≡ bd (mod n). Therefore, ac ≡ bd (mod n) by part 3. There is a close connection between modular arithmetic and the remainder operation, which we looked at last time. To clarify this link, let’s reconsider the partition of the integers defined by congruence modulo 3: { . . . , −6, −3, 0, 3, 6, 9, . . . } { . . . , −5, −2, 1, 4, 7, 10, . . . . . . , −4, −1, 2, 5, 8, 11, . . . } { } Notice that two numbers are in the same set if and only if they leave the same remainder when divided by 3. The numbers in the first set all leave a remainder of 0 when divided by 3, the numbers in the second set leave a remainder of 1, and the numbers in the third leave a remainder of 2. Furthermore, notice that each number is in the same set as its own remainder. For example, 11 and 11 rem 3 = 2 are both in the same set. Let’s bundle all this happy goodness into a lemma. Lemma 2 (Congruences and Remainders). The following assertions hold: 1. a ≡ (a rem n) (mod n) 2. a ≡ b (mod n) if and only if (a rem n) = (b rem n) Proof. (of part 2) By the division algorithm, there exist unique pairs of integers q1, r1 and q2, r2 such that: a = q1n + r1 (where 0 ≤ r1 < n) b = q2n + r2 (where 0 ≤ r2 < n) In these terms, (a rem n) = r1 and (b rem n) = r2. Subtracting the second equation from the first gives: a − b = (q1 − q2)n + (r1 − r2) (where −n < r1 − r2 < n) Now a ≡ b (mod n) if and only if n divides the left side. This is true if and only if n divides the right side, which holds if and only if r1−r2 is a multiple of n. Given the bounds on r1 = r2, this happens precisely when r1 = r2, which is equivalent to (a rem n) = (b rem n)