MODULAR ELLIPTIC CURVES AND FERMAT’S LAST THEOREM
ANDREW JOHN WILES

common ρ5 which is given in Chapter 5. Believing now that the proof was complete, I sketched the whole theory in three lectures in Cambridge, England on June 21-23. However, it became clear to me in the fall of 1993 that the construction of the Euler system used to extend Flach’s method was incomplete and possibly flawed.

Chapter 3 follows the original approach I had taken to the problem of bounding the Selmer group but had abandoned on learning of Flach’s paper. Darmon encouraged me in February, 1994, to explain the reduction to the complete intersection property, as it gave a quick way to exhibit infinite families of modular j-invariants. In presenting it in a lecture at Princeton, I made, almost unconsciously, a critical switch to the special primes used in Chapter 3 as auxiliary primes. I had only observed the existence and importance of these primes in the fall of 1992 while trying to extend Flach’s work. Previously, I had only used primes q ≡ −1 mod p as auxiliary primes. In hindsight this change was crucial because of a development due to de Shalit. As explained before, I had realized earlier that Hida’s theory often provided one step towards a power series ring, at least in the ordinary case. At the Cambridge conference de Shalit had explained to me that for primes q ≡ 1 mod p he had obtained a version of Hida’s results. But except for explaining the complete intersection argument in the lecture at Princeton, I still did not give any thought to my initial approach, which I had put aside since the summer of 1991, since I continued to believe that the Euler system approach was the correct one.

Meanwhile in January, 1994, R. Taylor had joined me in the attempt to repair the Euler system argument. Then in the spring of 1994, frustrated in the efforts to repair the Euler system argument, I began to work with Taylor on an attempt to devise a new argument using p = 2.
The attempt to use p = 2 reached an impasse at the end of August. As Taylor was still not convinced that the Euler system argument was irreparable, I decided in September to take one last look at my attempt to generalise Flach, if only to formulate more precisely the obstruction. In doing this I came suddenly to a marvelous revelation: I saw in a flash on September 19th, 1994, that de Shalit’s theory, if generalised, could be used together with duality to glue the Hecke rings at suitable auxiliary levels into a power series ring. I had unexpectedly found the missing key to my old abandoned approach. It was the old idea of picking q_i’s with q_i ≡ 1 mod p^(n_i) and n_i → ∞ as i → ∞ that I used to achieve the limiting process. The switch to the special primes of Chapter 3 had made all this possible.

After I communicated the argument to Taylor, we spent the next few days making sure of the details. The full argument, together with the deduction of the complete intersection property, is given in [TW].

In conclusion, the key breakthrough in the proof had been the realization in the spring of 1991 that the two invariants introduced in the appendix could be used to relate the deformation rings and the Hecke rings. In effect the η-invariant could be used to count Galois representations. The last step after the June, 1993, announcement, though elusive, was but the conclusion of a long process whose purpose was to replace, in the ring-theoretic setting, the methods based on Iwasawa theory by methods based on the use of auxiliary primes.

One improvement that I have not included but which might be used to simplify some of Chapter 2 is the observation of Lenstra that the criterion for Gorenstein rings to be complete intersections can be extended to more general rings which are finite and free as Zp-modules. Faltings has pointed out an improvement, also not included, which simplifies the argument in Chapter 3 and [TW]. This is however explained in the appendix to [TW].

It is a pleasure to thank those who read carefully a first draft of some of this paper after the Cambridge conference and particularly N. Katz who patiently answered many questions in the course of my work on Euler systems, and together with Illusie read critically the Euler system argument. Their questions led to my discovery of the problem with it. Katz also listened critically to my first attempts to correct it in the fall of 1993. I am grateful also to Taylor for his assistance in analyzing in depth the Euler system argument. I am indebted to F. Diamond for his generous assistance in the preparation of the final version of this paper. In addition to his many valuable suggestions, several others also made helpful comments and suggestions, especially Conrad, de Shalit, Faltings, Ribet, Rubin, Skinner and Taylor. I am most grateful to H. Darmon for his encouragement to reconsider my old argument. Although I paid no heed to his advice at the time, it surely left its mark.

Table of Contents

Chapter 1
  1. Deformations of Galois representations
  2. Some computations of cohomology groups
  3. Some results on subgroups of GL2(k)
Chapter 2
  1. The Gorenstein property
  2. Congruences between Hecke rings
  3. The main conjectures
Chapter 3 Estimates for the Selmer group
Chapter 4
  1. The ordinary CM case
  2. Calculation of η
Chapter 5 Application to elliptic curves
Appendix
References
Chapter 1

This chapter is devoted to the study of certain Galois representations. In the first section we introduce and study Mazur’s deformation theory and discuss various refinements of it. These refinements will be needed later to make precise the correspondence between the universal deformation rings and the Hecke rings in Chapter 2. The main results needed are Proposition 1.2, which is used to interpret various generalized cotangent spaces as Selmer groups, and (1.7), which later will be used to study them. At the end of the section we relate these Selmer groups to ones used in the Bloch-Kato conjecture, but this connection is not needed for the proofs of our main results. In the second section we extract from the results of Poitou and Tate on Galois cohomology certain general relations between Selmer groups as Σ varies, as well as between Selmer groups and their duals. The most important observation of the third section is Lemma 1.10(i), which guarantees the existence of the special primes used in Chapter 3 and [TW].

1. Deformations of Galois representations

Let p be an odd prime. Let Σ be a finite set of primes including p and let QΣ be the maximal extension of Q unramified outside this set and ∞. Throughout we fix an embedding of Q̄, and so also of QΣ, in C. We will also fix a choice of decomposition group Dq for all primes q in Z. Suppose that k is a finite field of characteristic p and that

(1.1)  ρ0 : Gal(QΣ/Q) → GL2(k)

is an irreducible representation. In contrast to the introduction we will assume in the rest of the paper that ρ0 comes with its field of definition k. Suppose further that det ρ0 is odd. In particular this implies that the smallest field of definition for ρ0 is given by the field k0 generated by the traces, but we will not assume that k = k0. It also implies that ρ0 is absolutely irreducible. We consider the deformation [ρ] to GL2(A) of ρ0 in the sense of Mazur [Ma1]. Thus if W(k) is the ring of Witt vectors of k, A is to be a complete Noetherian local W(k)-algebra with residue field k and maximal ideal m, and a deformation [ρ] is just a strict equivalence class of homomorphisms ρ : Gal(QΣ/Q) → GL2(A) such that ρ mod m = ρ0, two such homomorphisms being called strictly equivalent if one can be brought to the other by conjugation by an element of ker : GL2(A) → GL2(k). We often simply write ρ instead of [ρ] for the equivalence class.
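In symbols, the strict equivalence relation just described can be written as below. This is only a restatement of the definition; the names ρ′ and g, and the notation M_2(m) for the 2×2 matrices with entries in m, are introduced here for illustration and do not appear in the text.

```latex
\[
  \rho \sim \rho'
  \iff
  \exists\, g \in \ker\bigl(\mathrm{GL}_2(A) \to \mathrm{GL}_2(k)\bigr)
      = 1 + M_2(\mathfrak{m})
  \ \text{ such that }\
  \rho'(\sigma) = g\,\rho(\sigma)\,g^{-1}
  \ \text{ for all } \sigma \in \mathrm{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q}).
\]
```

Since g ≡ 1 mod m, conjugation by g does not change the reduction, so ρ′ mod m = ρ0 whenever ρ mod m = ρ0.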
We will restrict our choice of ρ0 further by assuming that either:

(i) ρ0 is ordinary; viz., the restriction of ρ0 to the decomposition group Dp has (for a suitable choice of basis) the form

\[ \rho_0|_{D_p} \approx \begin{pmatrix} \chi_1 & * \\ 0 & \chi_2 \end{pmatrix} \tag{1.2} \]

where χ1 and χ2 are homomorphisms from Dp to k∗ with χ2 unramified. Moreover we require that χ1 ≠ χ2. We do allow here that ρ0|Dp be semisimple. (If χ1 and χ2 are both unramified and ρ0|Dp is semisimple then we fix our choices of χ1 and χ2 once and for all.)

(ii) ρ0 is flat at p but not ordinary (cf. [Se1] where the terminology finite is used); viz., ρ0|Dp is the representation associated to a finite flat group scheme over Zp but is not ordinary in the sense of (i). (In general when we refer to the flat case we will mean that ρ0 is assumed not to be ordinary unless we specify otherwise.) We will assume also that det ρ0|Ip = ω where Ip is an inertia group at p and ω is the Teichmüller character giving the action on pth roots of unity.

In case (ii) it follows from results of Raynaud that ρ0|Dp is absolutely irreducible and one can describe ρ0|Ip explicitly. For extending a Jordan-Hölder series for the representation space (as an Ip-module) to one for finite flat group schemes (cf. [Ray1]) we observe first that the trivial character does not occur on a subquotient, as otherwise (using the classification of Oort-Tate or Raynaud) the group scheme would be ordinary. So we find by Raynaud’s results that

\[ \rho_0|_{I_p} \otimes_k \bar{k} \simeq \psi_1 \oplus \psi_2 \]

where ψ1 and ψ2 are the two fundamental characters of degree 2 (cf. Corollary 3.4.4 of [Ray1]). Since ψ1 and ψ2 do not extend to characters of Gal(Q̄p/Qp), ρ0|Dp must be absolutely irreducible.

We sometimes wish to make one of the following restrictions on the deformations we allow:

(i)(a) Selmer deformations. In this case we assume that ρ0 is ordinary, with notation as above, and that the deformation has a representative ρ : Gal(QΣ/Q) → GL2(A) with the property that (for a suitable choice of basis)

\[ \rho|_{D_p} \approx \begin{pmatrix} \tilde{\chi}_1 & * \\ 0 & \tilde{\chi}_2 \end{pmatrix} \]

with χ̃2 unramified, χ̃2 ≡ χ2 mod m, and det ρ|Ip = εω⁻¹χ1χ2 where ε is the cyclotomic character, ε : Gal(QΣ/Q) → Zp∗, giving the action on all p-power roots of unity, ω is of order prime to p satisfying ω ≡ ε mod p, and χ1 and χ2 are the characters of (i) viewed as taking values in k∗ ↪ A∗.
(i)(b) Ordinary deformations. The same as in (i)(a) but with no condition on the determinant.

(i)(c) Strict deformations. This is a variant on (i)(a) which we only use when ρ0|Dp is not semisimple and not flat (i.e. not associated to a finite flat group scheme). We also assume that χ1χ2⁻¹ = ω in this case. Then a strict deformation is as in (i)(a) except that we assume in addition that (χ̃1/χ̃2)|Dp = ε.

(ii) Flat (at p) deformations. We assume that each deformation ρ to GL2(A) has the property that for any quotient A/a of finite order ρ|Dp mod a is the Galois representation associated to the Q̄p-points of a finite flat group scheme over Zp.

In each of these four cases, as well as in the unrestricted case (in which we impose no local restriction at p), one can verify that Mazur’s use of Schlessinger’s criteria [Sch] proves the existence of a universal deformation

ρ : Gal(QΣ/Q) → GL2(R).

In the ordinary and restricted case this was proved by Mazur and in the flat case by Ramakrishna [Ram]. The other cases require minor modifications of Mazur’s argument. We denote the universal ring RΣ in the unrestricted case and R_Σ^se, R_Σ^ord, R_Σ^str, R_Σ^f in the other four cases. We often omit the Σ if the context makes it clear.

There are certain generalizations to all of the above which we will also need. The first is that instead of considering W(k)-algebras A we may consider O-algebras for O the ring of integers of any local field with residue field k. If we need to record which O we are using we will write RΣ,O etc. It is easy to see that the natural local map of local O-algebras

RΣ,O → RΣ ⊗_{W(k)} O

is an isomorphism because for functorial reasons the map has a natural section which induces an isomorphism on Zariski tangent spaces at closed points, and one can then use Nakayama’s lemma.
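One way to spell out the functorial reasons is sketched below. It is a variant of the argument just given, not the author's own formulation, and it rests on assumptions stated here: O is taken to be finite over W(k) (as for the ring of integers of a finite extension of the fraction field of W(k)), A ranges over complete Noetherian local O-algebras with residue field k, all homomorphisms are local, and D(A) is a name introduced here for the set of deformations of ρ0 to A of the type under consideration. Since every such A is in particular a W(k)-algebra, the two universal properties together with extension of scalars give bijections, natural in A:

```latex
\[
  \mathrm{Hom}_{\mathcal{O}}\bigl(R_{\Sigma,\mathcal{O}},\, A\bigr)
  \;\cong\; D(A)
  \;\cong\; \mathrm{Hom}_{W(k)}\bigl(R_{\Sigma},\, A\bigr)
  \;\cong\; \mathrm{Hom}_{\mathcal{O}}\bigl(R_{\Sigma} \otimes_{W(k)} \mathcal{O},\, A\bigr).
\]
```

Under these assumptions both rings represent the same functor on O-algebras. The natural map in the text corresponds to the deformation obtained from the universal one over RΣ by extending scalars to O.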
Note, however, that if we change the residue field via i : k ↪ k′ then we have a new deformation problem associated to the representation ρ0′ = i ∘ ρ0. There is again a natural map of W(k′)-algebras

R(ρ0′) → R ⊗_{W(k)} W(k′)

which is an isomorphism on Zariski tangent spaces. One can check that this is again an isomorphism by considering the subring R1 of R(ρ0′) defined as the subring of all elements whose reduction modulo the maximal ideal lies in k. Since R(ρ0′) is a finite R1-module, R1 is also a complete local Noetherian ring