MODULAR ELLIPTIC CURVES AND FERMAT'S LAST THEOREM
ANDREW WILES

common $\rho_5$, which is given in Chapter 5. Believing now that the proof was complete, I sketched the whole theory in three lectures in Cambridge, England on June 21-23. However, it became clear to me in the fall of 1993 that the construction of the Euler system used to extend Flach's method was incomplete and possibly flawed.

Chapter 3 follows the original approach I had taken to the problem of bounding the Selmer group but had abandoned on learning of Flach's paper. Darmon encouraged me in February, 1994, to explain the reduction to the complete intersection property, as it gave a quick way to exhibit infinite families of modular $j$-invariants. In presenting it in a lecture at Princeton, I made, almost unconsciously, a critical switch to the special primes used in Chapter 3 as auxiliary primes. I had only observed the existence and importance of these primes in the fall of 1992 while trying to extend Flach's work. Previously, I had only used primes $q \equiv -1 \bmod p$ as auxiliary primes. In hindsight this change was crucial because of a development due to de Shalit. As explained before, I had realized earlier that Hida's theory often provided one step towards a power series ring at least in the ordinary case. At the Cambridge conference de Shalit had explained to me that for primes $q \equiv 1 \bmod p$ he had obtained a version of Hida's results. But except for explaining the complete intersection argument in the lecture at Princeton, I still did not give any thought to my initial approach, which I had put aside since the summer of 1991, since I continued to believe that the Euler system approach was the correct one.

Meanwhile in January, 1994, R. Taylor had joined me in the attempt to repair the Euler system argument. Then in the spring of 1994, frustrated in the efforts to repair the Euler system argument, I began to work with Taylor on an attempt to devise a new argument using $p = 2$. The attempt to use $p = 2$ reached an impasse at the end of August. As Taylor was still not convinced that the Euler system argument was irreparable, I decided in September to take one last look at my attempt to generalise Flach, if only to formulate more precisely the obstruction. In doing this I came suddenly to a marvelous revelation: I saw in a flash on September 19th, 1994, that de Shalit's theory, if generalised, could be used together with duality to glue the Hecke rings at suitable auxiliary levels into a power series ring. I had unexpectedly found the missing key to my old abandoned approach. It was the old idea of picking $q_i$'s with $q_i \equiv 1 \bmod p^{n_i}$ and $n_i \to \infty$ as $i \to \infty$ that I used to achieve the limiting process. The switch to the special primes of Chapter 3 had made all this possible.

After I communicated the argument to Taylor, we spent the next few days making sure of the details. The full argument, together with the deduction of the complete intersection property, is given in [TW].

In conclusion the key breakthrough in the proof had been the realization in the spring of 1991 that the two invariants introduced in the appendix could be used to relate the deformation rings and the Hecke rings. In effect the $\eta$-invariant could be used to count Galois representations. The last step after the June, 1993, announcement, though elusive, was but the conclusion of a long process whose purpose was to replace, in the ring-theoretic setting, the methods based on Iwasawa theory by methods based on the use of auxiliary primes.

One improvement that I have not included but which might be used to simplify some of Chapter 2 is the observation of Lenstra that the criterion for Gorenstein rings to be complete intersections can be extended to more general rings which are finite and free as $\mathbf{Z}_p$-modules. Faltings has pointed out an improvement, also not included, which simplifies the argument in Chapter 3 and [TW]. This is however explained in the appendix to [TW].

It is a pleasure to thank those who read carefully a first draft of some of this paper after the Cambridge conference and particularly N. Katz who patiently answered many questions in the course of my work on Euler systems, and together with Illusie read critically the Euler system argument. Their questions led to my discovery of the problem with it. Katz also listened critically to my first attempts to correct it in the fall of 1993. I am grateful also to Taylor for his assistance in analyzing in depth the Euler system argument. I am indebted to F. Diamond for his generous assistance in the preparation of the final version of this paper. In addition to his many valuable suggestions, several others also made helpful comments and suggestions, especially Conrad, de Shalit, Faltings, Ribet, Rubin, Skinner and Taylor. Finally, I am most grateful to H. Darmon for his encouragement to reconsider my old argument. Although I paid no heed to his advice at the time, it surely left its mark.

Table of Contents

Chapter 1
  1. Deformations of Galois representations
  2. Some computations of cohomology groups
  3. Some results on subgroups of $\mathrm{GL}_2(k)$
Chapter 2
  1. The Gorenstein property
  2. Congruences between Hecke rings
  3. The main conjectures
Chapter 3  Estimates for the Selmer group
Chapter 4
  1. The ordinary CM case
  2. Calculation of $\eta$
Chapter 5  Application to elliptic curves
Appendix
References
Chapter 1

This chapter is devoted to the study of certain Galois representations. In the first section we introduce and study Mazur's deformation theory and discuss various refinements of it. These refinements will be needed later to make precise the correspondence between the universal deformation rings and the Hecke rings in Chapter 2. The main results needed are Proposition 1.2, which is used to interpret various generalized cotangent spaces as Selmer groups, and (1.7), which later will be used to study them. At the end of the section we relate these Selmer groups to ones used in the Bloch-Kato conjecture, but this connection is not needed for the proofs of our main results. In the second section we extract from the results of Poitou and Tate on Galois cohomology certain general relations between Selmer groups as $\Sigma$ varies, as well as between Selmer groups and their duals. The most important observation of the third section is Lemma 1.10(i), which guarantees the existence of the special primes used in Chapter 3 and [TW].

1. Deformations of Galois representations

Let $p$ be an odd prime. Let $\Sigma$ be a finite set of primes including $p$ and let $\mathbf{Q}_\Sigma$ be the maximal extension of $\mathbf{Q}$ unramified outside this set and $\infty$. Throughout we fix an embedding of $\overline{\mathbf{Q}}$, and so also of $\mathbf{Q}_\Sigma$, in $\mathbf{C}$. We will also fix a choice of decomposition group $D_q$ for all primes $q$ in $\mathbf{Z}$. Suppose that $k$ is a finite field of characteristic $p$ and that

(1.1)  $\rho_0 : \mathrm{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q}) \to \mathrm{GL}_2(k)$

is an irreducible representation. In contrast to the introduction we will assume in the rest of the paper that $\rho_0$ comes with its field of definition $k$. Suppose further that $\det \rho_0$ is odd. In particular this implies that the smallest field of definition for $\rho_0$ is given by the field $k_0$ generated by the traces, but we will not assume that $k = k_0$. It also implies that $\rho_0$ is absolutely irreducible. We consider the deformations $[\rho]$ to $\mathrm{GL}_2(A)$ of $\rho_0$ in the sense of Mazur [Ma1]. Thus if $W(k)$ is the ring of Witt vectors of $k$, $A$ is to be a complete Noetherian local $W(k)$-algebra with residue field $k$ and maximal ideal $m$, and a deformation $[\rho]$ is just a strict equivalence class of homomorphisms $\rho : \mathrm{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q}) \to \mathrm{GL}_2(A)$ such that $\rho \bmod m = \rho_0$, two such homomorphisms being called strictly equivalent if one can be brought to the other by conjugation by an element of $\ker : \mathrm{GL}_2(A) \to \mathrm{GL}_2(k)$. We often simply write $\rho$ instead of $[\rho]$ for the equivalence class.
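The notion of strict equivalence can be made concrete in a toy case. The following is only a sketch under stated assumptions, not part of the paper's argument: it takes $p = 3$ and $A = \mathbf{Z}/9\mathbf{Z}$ (a local ring with residue field $k = \mathbf{F}_3$), and it looks at the image of a single group element rather than a full homomorphism. The names `mat_mul`, `mat_inv`, `reduce_mod`, `in_kernel` and `conjugate`, and the particular matrices, are hypothetical choices made for illustration. The check is that conjugating by an element of the kernel of $\mathrm{GL}_2(A) \to \mathrm{GL}_2(k)$ changes the matrix over $A$ but not its reduction mod $m$.

```python
# Toy illustration of strict equivalence (assumptions: p = 3, A = Z/9Z).
# Requires Python 3.8+ for pow(x, -1, m).

P = 3        # residue characteristic (toy choice)
M = P * P    # A = Z/9Z, with maximal ideal (3) and residue field F_3

IDENTITY = ((1, 0), (0, 1))


def mat_mul(x, y, m):
    """Product of two 2x2 matrices with entries reduced mod m."""
    return tuple(
        tuple(sum(x[i][t] * y[t][j] for t in range(2)) % m for j in range(2))
        for i in range(2)
    )


def mat_inv(x, m):
    """Inverse of a 2x2 matrix mod m; raises ValueError if det is not a unit."""
    det = (x[0][0] * x[1][1] - x[0][1] * x[1][0]) % m
    det_inv = pow(det, -1, m)
    return (
        ((x[1][1] * det_inv) % m, (-x[0][1] * det_inv) % m),
        ((-x[1][0] * det_inv) % m, (x[0][0] * det_inv) % m),
    )


def reduce_mod(x, m):
    """Entrywise reduction of a 2x2 matrix mod m."""
    return tuple(tuple(e % m for e in row) for row in x)


def in_kernel(gamma):
    """True if gamma reduces to the identity in GL_2(F_P)."""
    return reduce_mod(gamma, P) == IDENTITY


def conjugate(gamma, x):
    """gamma * x * gamma^{-1}, computed in GL_2(Z/M)."""
    return mat_mul(mat_mul(gamma, x, M), mat_inv(gamma, M), M)


rho_g = ((1, 1), (0, 1))   # image of one group element under some lift (toy data)
gamma = ((4, 3), (6, 7))   # congruent to the identity mod 3
conj = conjugate(gamma, rho_g)
```

Here `conj` and `rho_g` differ as matrices over $\mathbf{Z}/9\mathbf{Z}$ but have the same reduction mod 3, which is the sense in which two strictly equivalent lifts deform the same $\rho_0$.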
We will restrict our choice of $\rho_0$ further by assuming that either:

(i) $\rho_0$ is ordinary; viz., the restriction of $\rho_0$ to the decomposition group $D_p$ has (for a suitable choice of basis) the form

(1.2)  $\rho_0|_{D_p} \approx \begin{pmatrix} \chi_1 & * \\ 0 & \chi_2 \end{pmatrix}$

where $\chi_1$ and $\chi_2$ are homomorphisms from $D_p$ to $k^*$ with $\chi_2$ unramified. Moreover we require that $\chi_1 \neq \chi_2$. We do allow here that $\rho_0|_{D_p}$ be semisimple. (If $\chi_1$ and $\chi_2$ are both unramified and $\rho_0|_{D_p}$ is semisimple then we fix our choices of $\chi_1$ and $\chi_2$ once and for all.)

(ii) $\rho_0$ is flat at $p$ but not ordinary (cf. [Se1] where the terminology finite is used); viz., $\rho_0|_{D_p}$ is the representation associated to a finite flat group scheme over $\mathbf{Z}_p$ but is not ordinary in the sense of (i). (In general when we refer to the flat case we will mean that $\rho_0$ is assumed not to be ordinary unless we specify otherwise.) We will assume also that $\det \rho_0|_{I_p} = \omega$ where $I_p$ is an inertia group at $p$ and $\omega$ is the Teichmüller character giving the action on $p$th roots of unity.

In case (ii) it follows from results of Raynaud that $\rho_0|_{D_p}$ is absolutely irreducible and one can describe $\rho_0|_{I_p}$ explicitly. For extending a Jordan-Hölder series for the representation space (as an $I_p$-module) to one for finite flat group schemes (cf. [Ray1]) we observe first that the trivial character does not occur on a subquotient, as otherwise (using the classification of Oort-Tate or Raynaud) the group scheme would be ordinary. So we find by Raynaud's results that $\rho_0|_{I_p} \otimes_k \bar{k} \simeq \psi_1 \oplus \psi_2$ where $\psi_1$ and $\psi_2$ are the two fundamental characters of degree 2 (cf. Corollary 3.4.4 of [Ray1]). Since $\psi_1$ and $\psi_2$ do not extend to characters of $\mathrm{Gal}(\overline{\mathbf{Q}}_p/\mathbf{Q}_p)$, $\rho_0|_{D_p}$ must be absolutely irreducible.

We will sometimes wish to make one of the following restrictions on the deformations we allow:

(i)(a) Selmer deformations. In this case we assume that $\rho_0$ is ordinary, with notation as above, and that the deformation has a representative $\rho : \mathrm{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q}) \to \mathrm{GL}_2(A)$ with the property that (for a suitable choice of basis)

$\rho|_{D_p} \approx \begin{pmatrix} \tilde{\chi}_1 & * \\ 0 & \tilde{\chi}_2 \end{pmatrix}$

with $\tilde{\chi}_2$ unramified, $\tilde{\chi}_2 \equiv \chi_2 \bmod m$, and $\det \rho|_{I_p} = \varepsilon \omega^{-1} \chi_1 \chi_2$ where $\varepsilon$ is the cyclotomic character, $\varepsilon : \mathrm{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q}) \to \mathbf{Z}_p^*$, giving the action on all $p$-power roots of unity, $\omega$ is of order prime to $p$ satisfying $\omega \equiv \varepsilon \bmod p$, and $\chi_1$ and $\chi_2$ are the characters of (i) viewed as taking values in $k^* \hookrightarrow A^*$.
(i)(b) Ordinary deformations. The same as in (i)(a) but with no condition on the determinant.

(i)(c) Strict deformations. This is a variant on (i)(a) which we only use when $\rho_0|_{D_p}$ is not semisimple and not flat (i.e. not associated to a finite flat group scheme). We also assume that $\chi_1 \chi_2^{-1} = \omega$ in this case. Then a strict deformation is as in (i)(a) except that we assume in addition that $(\tilde{\chi}_1/\tilde{\chi}_2)|_{D_p} = \varepsilon$.

(ii) Flat (at $p$) deformations. We assume that each deformation $\rho$ to $\mathrm{GL}_2(A)$ has the property that for any quotient $A/\mathfrak{a}$ of finite order, $\rho|_{D_p} \bmod \mathfrak{a}$ is the Galois representation associated to the $\overline{\mathbf{Q}}_p$-points of a finite flat group scheme over $\mathbf{Z}_p$.

In each of these four cases, as well as in the unrestricted case (in which we impose no local restriction at $p$), one can verify that Mazur's use of Schlessinger's criteria [Sch] proves the existence of a universal deformation

$\rho : \mathrm{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q}) \to \mathrm{GL}_2(R)$.

In the ordinary and unrestricted case this was proved by Mazur and in the flat case by Ramakrishna [Ram]. The other cases require minor modifications of Mazur's argument. We denote the universal ring $R_\Sigma$ in the unrestricted case and $R_\Sigma^{\mathrm{se}}$, $R_\Sigma^{\mathrm{ord}}$, $R_\Sigma^{\mathrm{str}}$, $R_\Sigma^{\mathrm{f}}$ in the other four cases. We often omit the $\Sigma$ if the context makes it clear.

There are certain generalizations to all of the above which we will also need. The first is that instead of considering $W(k)$-algebras $A$ we may consider $\mathcal{O}$-algebras for $\mathcal{O}$ the ring of integers of any local field with residue field $k$. If we need to record which $\mathcal{O}$ we are using we will write $R_{\Sigma,\mathcal{O}}$ etc. It is easy to see that the natural local map of local $\mathcal{O}$-algebras

$R_{\Sigma,\mathcal{O}} \to R_\Sigma \otimes_{W(k)} \mathcal{O}$

is an isomorphism because for functorial reasons the map has a natural section which induces an isomorphism on Zariski tangent spaces at closed points, and one can then use Nakayama's lemma. Note, however, that if we change the residue field via $i : k \hookrightarrow k'$ then we have a new deformation problem associated to the representation $\rho_0' = i \circ \rho_0$. There is again a natural map of $W(k')$-algebras

$R(\rho_0') \to R \otimes_{W(k)} W(k')$

which is an isomorphism on Zariski tangent spaces. One can check that this is again an isomorphism by considering the subring $R_1$ of $R(\rho_0')$ defined as the subring of all elements whose reduction modulo the maximal ideal lies in $k$. Since $R(\rho_0')$ is a finite $R_1$-module, $R_1$ is also a complete local Noetherian ring