458 ANDREW WILES

with residue field $k$. The universal representation associated to $\rho_0'$ is defined over $R_1$ and the universal property of $R$ then defines a map $R \to R_1$. So we obtain a section to the map $R(\rho_0') \to R \otimes_{W(k)} W(k')$ and the map is therefore an isomorphism. (I am grateful to Faltings for this observation.) We will also need to extend the consideration of $\mathcal{O}$-algebras to the restricted cases. In each case we can require $A$ to be an $\mathcal{O}$-algebra and again it is easy to see that $R^{\cdot}_{\Sigma,\mathcal{O}} \simeq R^{\cdot}_{\Sigma} \otimes_{W(k)} \mathcal{O}$ in each case.

The second generalization concerns primes $q \neq p$ which are ramified in $\rho_0$. We distinguish three special cases (types (A) and (C) need not be disjoint):

(A) $\rho_0|_{D_q} = \begin{pmatrix} \chi_1 & * \\ & \chi_2 \end{pmatrix}$ for a suitable choice of basis, with $\chi_1$ and $\chi_2$ unramified, $\chi_1\chi_2^{-1} = \omega$ and the fixed space of $I_q$ of dimension 1,

(B) $\rho_0|_{I_q} = \begin{pmatrix} \chi_q & 0 \\ 0 & 1 \end{pmatrix}$, $\chi_q \neq 1$, for a suitable choice of basis,

(C) $H^1(\mathbf{Q}_q, W_\lambda) = 0$ where $W_\lambda$ is as defined in (1.6).

Then in each case we can define a suitable deformation theory by imposing additional restrictions on those we have already considered, namely:

(A) $\rho|_{D_q} = \begin{pmatrix} \psi_1 & * \\ & \psi_2 \end{pmatrix}$ for a suitable choice of basis of $A^2$ with $\psi_1$ and $\psi_2$ unramified and $\psi_1\psi_2^{-1} = \varepsilon$;

(B) $\rho|_{I_q} = \begin{pmatrix} \chi_q & 0 \\ 0 & 1 \end{pmatrix}$ for a suitable choice of basis ($\chi_q$ of order prime to $p$, so the same character as above);

(C) $\det \rho|_{I_q} = \det \rho_0|_{I_q}$, i.e., of order prime to $p$.

Thus if $\mathcal{M}$ is a set of primes in $\Sigma$ distinct from $p$ and each satisfying one of (A), (B) or (C) for $\rho_0$, we will impose the corresponding restriction at each prime in $\mathcal{M}$.

Thus to each set of data $\mathcal{D} = \{\cdot, \Sigma, \mathcal{O}, \mathcal{M}\}$ where $\cdot$ is Se, str, ord, flat or unrestricted, we can associate a deformation theory to $\rho_0$ provided

$$\rho_0 : \operatorname{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q}) \to GL_2(k) \tag{1.3}$$

is itself of type $\mathcal{D}$ and $\mathcal{O}$ is the ring of integers of a totally ramified extension of $W(k)$; $\rho_0$ is ordinary if $\cdot$ is Se or ord, strict if $\cdot$ is strict and flat if $\cdot$ is fl (meaning flat); $\rho_0$ is of type $\mathcal{M}$, i.e., of type (A), (B) or (C) at each ramified prime $q \neq p$, $q \in \mathcal{M}$. We allow different types at different $q$'s. We will refer to these as the standard deformation theories and write $R_{\mathcal{D}}$ for the universal ring associated to $\mathcal{D}$ and $\rho_{\mathcal{D}}$ for the universal deformation (or even $\rho$ if $\mathcal{D}$ is clear from the context).

We note here that if $\mathcal{D} = (\mathrm{ord}, \Sigma, \mathcal{O}, \mathcal{M})$ and $\mathcal{D}' = (\mathrm{Se}, \Sigma, \mathcal{O}, \mathcal{M})$ then there is a simple relation between $R_{\mathcal{D}}$ and $R_{\mathcal{D}'}$. Indeed there is a natural map
MODULAR ELLIPTIC CURVES AND FERMAT'S LAST THEOREM 459

$R_{\mathcal{D}} \to R_{\mathcal{D}'}$ by the universal property of $R_{\mathcal{D}}$, and its kernel is a principal ideal generated by $T = \varepsilon^{-1}(\gamma)\det\rho_{\mathcal{D}}(\gamma) - 1$ where $\gamma \in \operatorname{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q})$ is any element whose restriction to $\operatorname{Gal}(\mathbf{Q}_\infty/\mathbf{Q})$ is a generator (where $\mathbf{Q}_\infty$ is the $\mathbf{Z}_p$-extension of $\mathbf{Q}$) and whose restriction to $\operatorname{Gal}(\mathbf{Q}(\zeta_{Np})/\mathbf{Q})$ is trivial for any $N$ prime to $p$ with $\zeta_N \in \mathbf{Q}_\Sigma$, $\zeta_N$ being a primitive $N$th root of 1:

$$R_{\mathcal{D}}/T \simeq R_{\mathcal{D}'}. \tag{1.4}$$

It turns out that under the hypothesis that $\rho_0$ is strict, i.e. that $\rho_0|_{D_p}$ is not associated to a finite flat group scheme, the deformation problems in (i)(a) and (i)(c) are the same; i.e., every Selmer deformation is already a strict deformation. This was observed by Diamond. The argument is local, so the decomposition group $D_p$ could be replaced by $\operatorname{Gal}(\bar{\mathbf{Q}}_p/\mathbf{Q}_p)$.

PROPOSITION 1.1 (Diamond). Suppose that $\pi : D_p \to GL_2(A)$ is a continuous representation where $A$ is an Artinian local ring with residue field $k$, a finite field of characteristic $p$. Suppose $\pi \approx \begin{pmatrix} \chi_1\varepsilon & * \\ 0 & \chi_2 \end{pmatrix}$ with $\chi_1$ and $\chi_2$ unramified and $\chi_1 \neq \chi_2$. Then the residual representation $\bar{\pi}$ is associated to a finite flat group scheme over $\mathbf{Z}_p$.

Proof (taken from [Dia, Prop. 6.1]). We may replace $\pi$ by $\pi \otimes \chi_2^{-1}$ and we let $\varphi = \chi_1\chi_2^{-1}$. Then $\pi \cong \begin{pmatrix} \varphi\varepsilon & t \\ 0 & 1 \end{pmatrix}$ determines a cocycle $t : D_p \to M(1)$ where $M$ is a free $A$-module of rank one on which $D_p$ acts via $\varphi$. Let $u$ denote the cohomology class in $H^1(D_p, M(1))$ defined by $t$, and let $u_0$ denote its image in $H^1(D_p, M_0(1))$ where $M_0 = M/\mathfrak{m}M$. Let $G = \ker\varphi$ and let $F$ be the fixed field of $G$ (so $F$ is a finite unramified extension of $\mathbf{Q}_p$). Choose $n$ so that $p^nA = 0$. Since $H^2(G, \mu_{p^r}) \to H^2(G, \mu_{p^s})$ is injective for $r \le s$, we see that the natural map of $A[D_p/G]$-modules $H^1(G, \mu_{p^n}) \otimes_{\mathbf{Z}_p} M \to H^1(G, M(1))$ is an isomorphism. By Kummer theory, we have $H^1(G, M(1)) \cong F^\times/(F^\times)^{p^n} \otimes_{\mathbf{Z}_p} M$ as $D_p$-modules. Now consider the commutative diagram

$$\begin{array}{ccccc}
H^1(G, M(1))^{D_p} & \xrightarrow{\ \sim\ } & \bigl(F^\times/(F^\times)^{p^n} \otimes_{\mathbf{Z}_p} M\bigr)^{D_p} & \longrightarrow & M^{D_p} \\
\downarrow & & \downarrow & & \downarrow \\
H^1(G, M_0(1)) & \xrightarrow{\ \sim\ } & \bigl(F^\times/(F^\times)^{p}\bigr) \otimes_{\mathbf{F}_p} M_0 & \longrightarrow & M_0
\end{array}$$

where the right-hand horizontal maps are induced by $v_p : F^\times \to \mathbf{Z}$. If $\varphi \neq 1$, then $M^{D_p} \subset \mathfrak{m}M$, so that the element $\operatorname{res} u_0$ of $H^1(G, M_0(1))$ is in the image of $(\mathcal{O}_F^\times/(\mathcal{O}_F^\times)^p) \otimes_{\mathbf{F}_p} M_0$. But this means that $\bar{\pi}$ is "peu ramifié" in the sense of [Se] and therefore $\bar{\pi}$ comes from a finite flat group scheme. (See [E1, (8.2)].)

Remark. Diamond also observes that essentially the same proof shows that if $\pi : \operatorname{Gal}(\bar{\mathbf{Q}}_q/\mathbf{Q}_q) \to GL_2(A)$, where $A$ is a complete local Noetherian
ring with residue field $k$, has the form $\pi|_{I_q} \cong \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}$ with $\bar{\pi}$ ramified then $\pi$ is of type (A).

Globally, Proposition 1.1 says that if $\rho_0$ is strict and if $\mathcal{D} = (\mathrm{Se}, \Sigma, \mathcal{O}, \mathcal{M})$ and $\mathcal{D}' = (\mathrm{str}, \Sigma, \mathcal{O}, \mathcal{M})$ then the natural map $R_{\mathcal{D}} \to R_{\mathcal{D}'}$ is an isomorphism.

In each case the tangent space of $R_{\mathcal{D}}$ may be computed as in [Ma1]. Let $\lambda$ be a uniformizer for $\mathcal{O}$ and let $U_\lambda \simeq k^2$ be the representation space for $\rho_0$. (The motivation for the subscript $\lambda$ will become apparent later.) Let $V_\lambda$ be the representation space of $\operatorname{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q})$ on $\operatorname{Ad}\rho_0 = \operatorname{Hom}_k(U_\lambda, U_\lambda) \simeq M_2(k)$. Then there is an isomorphism of $k$-vector spaces (cf. the proof of Prop. 1.2 below)

$$\operatorname{Hom}_k\bigl(\mathfrak{m}_{\mathcal{D}}/(\mathfrak{m}_{\mathcal{D}}^2, \lambda),\, k\bigr) \simeq H^1_{\mathcal{D}}(\mathbf{Q}_\Sigma/\mathbf{Q}, V_\lambda) \tag{1.5}$$

where $H^1_{\mathcal{D}}(\mathbf{Q}_\Sigma/\mathbf{Q}, V_\lambda)$ is a subspace of $H^1(\mathbf{Q}_\Sigma/\mathbf{Q}, V_\lambda)$ which we now describe and $\mathfrak{m}_{\mathcal{D}}$ is the maximal ideal of $R_{\mathcal{D}}$. It consists of the cohomology classes which satisfy certain local restrictions at $p$ and at the primes in $\mathcal{M}$. We call $\mathfrak{m}_{\mathcal{D}}/(\mathfrak{m}_{\mathcal{D}}^2, \lambda)$ the reduced cotangent space of $R_{\mathcal{D}}$.

We begin with $p$. First we may write (since $p \neq 2$), as $k[\operatorname{Gal}(\mathbf{Q}_\Sigma/\mathbf{Q})]$-modules,

$$V_\lambda = W_\lambda \oplus k, \quad\text{where } W_\lambda = \{f \in \operatorname{Hom}_k(U_\lambda, U_\lambda) : \operatorname{trace} f = 0\} \simeq (\operatorname{Sym}^2 \otimes \det{}^{-1})\rho_0 \tag{1.6}$$

and $k$ is the one-dimensional subspace of scalar multiplications. Then if $\rho_0$ is ordinary the action of $D_p$ on $U_\lambda$ induces a filtration of $U_\lambda$ and also on $W_\lambda$ and $V_\lambda$. Suppose we write these $0 \subset U_\lambda^0 \subset U_\lambda$, $0 \subset W_\lambda^0 \subset W_\lambda^1 \subset W_\lambda$ and $0 \subset V_\lambda^0 \subset V_\lambda^1 \subset V_\lambda$. Thus $U_\lambda^0$ is defined by the requirement that $D_p$ act on it via the character $\chi_1$ (cf. (1.2)) and on $U_\lambda/U_\lambda^0$ via $\chi_2$. For $W_\lambda$ the filtrations are defined by

$$\begin{aligned}
W_\lambda^1 &= \{f \in W_\lambda : f(U_\lambda^0) \subset U_\lambda^0\}, \\
W_\lambda^0 &= \{f \in W_\lambda^1 : f = 0 \text{ on } U_\lambda^0\},
\end{aligned}$$

and the filtrations for $V_\lambda$ are obtained by replacing $W$ by $V$. We note that these filtrations are often characterized by the action of $D_p$. Thus the action of $D_p$ on $W_\lambda^0$ is via $\chi_1/\chi_2$; on $W_\lambda^1/W_\lambda^0$ it is trivial and on $W_\lambda/W_\lambda^1$ it is via $\chi_2/\chi_1$. These determine the filtration if either $\chi_1/\chi_2$ is not quadratic or $\rho_0|_{D_p}$ is not semisimple. We define the $k$-vector spaces

$$\begin{aligned}
V_\lambda^{\mathrm{ord}} &= \{f \in V_\lambda^1 : f = 0 \text{ in } \operatorname{Hom}(U_\lambda/U_\lambda^0, U_\lambda/U_\lambda^0)\}, \\
H^1_{\mathrm{Se}}(\mathbf{Q}_p, V_\lambda) &= \ker\{H^1(\mathbf{Q}_p, V_\lambda) \to H^1(\mathbf{Q}_p^{\mathrm{unr}}, V_\lambda/W_\lambda^0)\}, \\
H^1_{\mathrm{ord}}(\mathbf{Q}_p, V_\lambda) &= \ker\{H^1(\mathbf{Q}_p, V_\lambda) \to H^1(\mathbf{Q}_p^{\mathrm{unr}}, V_\lambda/V_\lambda^{\mathrm{ord}})\}, \\
H^1_{\mathrm{str}}(\mathbf{Q}_p, V_\lambda) &= \ker\{H^1(\mathbf{Q}_p, V_\lambda) \to H^1(\mathbf{Q}_p, W_\lambda/W_\lambda^0) \oplus H^1(\mathbf{Q}_p^{\mathrm{unr}}, k)\}.
\end{aligned}$$
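The filtrations in the ordinary case can be made concrete in coordinates. The following is only an illustrative sketch and not part of the original argument: it assumes a basis $e_1, e_2$ of $U_\lambda$ with $e_1$ spanning $U_\lambda^0$, and the matrix units $E_{ij}$, the element $H = E_{11} - E_{22}$ and the entry $c(g)$ are notation introduced here for illustration. Under that assumption the subspaces are the usual triangular ones, and a direct conjugation gives the characters describing the action of $D_p$ on the graded pieces.

```latex
% Assumption (hypothetical basis): e_1 spans U_\lambda^0, so for g in D_p
%   \rho_0(g) = \begin{pmatrix} \chi_1(g) & c(g) \\ 0 & \chi_2(g) \end{pmatrix}.
% Elements of V_\lambda = Hom_k(U_\lambda, U_\lambda) are written as 2x2 matrices over k.
\begin{align*}
V_\lambda^1 &= \left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \right\}, &
V_\lambda^0 &= \left\{ \begin{pmatrix} 0 & b \\ 0 & d \end{pmatrix} \right\}, &
V_\lambda^{\mathrm{ord}} &= \left\{ \begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix} \right\}, \\
W_\lambda^1 &= \left\{ \begin{pmatrix} a & b \\ 0 & -a \end{pmatrix} \right\}, &
W_\lambda^0 &= \left\{ \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} \right\}.
\end{align*}
% Adjoint action of g in D_p on the basis E_{12}, H = E_{11} - E_{22}, E_{21} of W_\lambda:
\begin{align*}
\rho_0(g)\, E_{12}\, \rho_0(g)^{-1} &= \frac{\chi_1(g)}{\chi_2(g)}\, E_{12}, \\
\rho_0(g)\, H\, \rho_0(g)^{-1} &= H - \frac{2\,c(g)}{\chi_2(g)}\, E_{12}
  \equiv H \pmod{W_\lambda^0}, \\
\rho_0(g)\, E_{21}\, \rho_0(g)^{-1} &\equiv \frac{\chi_2(g)}{\chi_1(g)}\, E_{21} \pmod{W_\lambda^1}.
\end{align*}
```

In this basis $W_\lambda^0$, $W_\lambda^1/W_\lambda^0$ and $W_\lambda/W_\lambda^1$ are each one-dimensional, and the three congruences give the characters $\chi_1/\chi_2$, $1$ and $\chi_2/\chi_1$ on them, respectively.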
In the Selmer case we make an analogous definition for $H^1_{\mathrm{Se}}(\mathbf{Q}_p, W_\lambda)$ by replacing $V_\lambda$ by $W_\lambda$, and similarly in the strict case. In the flat case we use the fact that there is a natural isomorphism of $k$-vector spaces

$$H^1(\mathbf{Q}_p, V_\lambda) \to \operatorname{Ext}^1_{k[D_p]}(U_\lambda, U_\lambda)$$

where the extensions are computed in the category of $k$-vector spaces with local Galois action. Then $H^1_{\mathrm{fl}}(\mathbf{Q}_p, V_\lambda)$ is defined as the $k$-subspace of $H^1(\mathbf{Q}_p, V_\lambda)$ which is the inverse image of $\operatorname{Ext}^1_{\mathrm{fl}}(G, G)$, the group of extensions in the category of finite flat commutative group schemes over $\mathbf{Z}_p$ killed by $p$, $G$ being the (unique) finite flat group scheme over $\mathbf{Z}_p$ associated to $U_\lambda$. By [Ray1] all such extensions in the inverse image even correspond to $k$-vector space schemes. For more details and calculations see [Ram].

For $q$ different from $p$ and $q \in \mathcal{M}$ we have three cases (A), (B), (C). In case (A) there is a filtration by $D_q$ entirely analogous to the one for $p$. We write this $0 \subset W_\lambda^{0,q} \subset W_\lambda^{1,q} \subset W_\lambda$ and we set

$$H^1_{D_q}(\mathbf{Q}_q, V_\lambda) = \begin{cases}
\ker : H^1(\mathbf{Q}_q, V_\lambda) \to H^1(\mathbf{Q}_q, W_\lambda/W_\lambda^{0,q}) \oplus H^1(\mathbf{Q}_q^{\mathrm{unr}}, k) & \text{in case (A)} \\
\ker : H^1(\mathbf{Q}_q, V_\lambda) \to H^1(\mathbf{Q}_q^{\mathrm{unr}}, V_\lambda) & \text{in case (B) or (C).}
\end{cases}$$

Again we make an analogous definition for $H^1_{D_q}(\mathbf{Q}_q, W_\lambda)$ by replacing $V_\lambda$ by $W_\lambda$ and deleting the last term in case (A). We now define the $k$-vector space $H^1_{\mathcal{D}}(\mathbf{Q}_\Sigma/\mathbf{Q}, V_\lambda)$ as

$$H^1_{\mathcal{D}}(\mathbf{Q}_\Sigma/\mathbf{Q}, V_\lambda) = \{\alpha \in H^1(\mathbf{Q}_\Sigma/\mathbf{Q}, V_\lambda) : \alpha_q \in H^1_{D_q}(\mathbf{Q}_q, V_\lambda) \text{ for all } q \in \mathcal{M},\ \alpha_p \in H^1_{*}(\mathbf{Q}_p, V_\lambda)\}$$

where $*$ is Se, str, ord, fl or unrestricted according to the type of $\mathcal{D}$. A similar definition applies to $H^1_{\mathcal{D}}(\mathbf{Q}_\Sigma/\mathbf{Q}, W_\lambda)$ if $\cdot$ is Selmer or strict.

Now and for the rest of the section we are going to assume that $\rho_0$ arises from the reduction of the $\lambda$-adic representation associated to an eigenform. More precisely we assume that there is a normalized eigenform $f$ of weight 2 and level $N$, divisible only by the primes in $\Sigma$, and that there is a prime $\lambda$ of $\mathcal{O}_f$ such that $\rho_0 = \rho_{f,\lambda} \bmod \lambda$. Here $\mathcal{O}_f$ is the ring of integers of the field generated by the Fourier coefficients of $f$ so the fields of definition of the two representations need not be the same. However we assume that $k \supseteq \mathcal{O}_{f,\lambda}/\lambda$ and we fix such an embedding so the comparison can be made over $k$. It will be convenient moreover to assume that if we are considering $\rho_0$ as being of type $\mathcal{D}$ then $\mathcal{D}$ is defined using $\mathcal{O}$-algebras where $\mathcal{O} \supseteq \mathcal{O}_{f,\lambda}$ is an unramified extension whose residue field is $k$. (Although this condition is unnecessary, it is convenient to use $\lambda$ as the uniformizer for $\mathcal{O}$.) Finally we assume that $\rho_{f,\lambda}$
itself is of type $\mathcal{D}$. Again this is a slight abuse of terminology as we are really considering the extension of scalars $\rho_{f,\lambda} \otimes_{\mathcal{O}_{f,\lambda}} \mathcal{O}$ and not $\rho_{f,\lambda}$ itself, but we will do this without further mention if the context makes it clear. (The analysis of this section actually applies to any characteristic zero lifting of $\rho_0$ but in all our applications we will be in the more restrictive context we have described here.)

With these hypotheses there is a unique local homomorphism $R_{\mathcal{D}} \to \mathcal{O}$ of $\mathcal{O}$-algebras which takes the universal deformation to (the class of) $\rho_{f,\lambda}$. Let $\mathfrak{p}_{\mathcal{D}} = \ker : R_{\mathcal{D}} \to \mathcal{O}$. Let $K$ be the field of fractions of $\mathcal{O}$ and let $U_f = (K/\mathcal{O})^2$ with the Galois action taken from $\rho_{f,\lambda}$. Similarly, let $V_f = \operatorname{Ad}\rho_{f,\lambda} \otimes_{\mathcal{O}} K/\mathcal{O} \simeq (K/\mathcal{O})^4$ with the adjoint representation so that

$$V_f \simeq W_f \oplus K/\mathcal{O}$$

where $W_f$ has Galois action via $\operatorname{Sym}^2\rho_{f,\lambda} \otimes \det\rho_{f,\lambda}^{-1}$ and the action on the second factor is trivial. Then if $\rho_0$ is ordinary the filtration of $U_f$ under the $\operatorname{Ad}\rho$ action of $D_p$ induces one on $W_f$ which we write $0 \subset W_f^0 \subset W_f^1 \subset W_f$. Often to simplify the notation we will drop the index $f$ from $W_f^i$, $V_f$ etc. There is also a filtration on $W_{\lambda^n} = \{\ker \lambda^n : W_f \to W_f\}$ given by $W_{\lambda^n}^i = W_{\lambda^n} \cap W^i$ (compatible with our previous description for $n = 1$). Likewise we write $V_{\lambda^n}$ for $\{\ker \lambda^n : V_f \to V_f\}$.

We now explain how to extend the definition of $H^1_{\mathcal{D}}$ to give meaning to $H^1_{\mathcal{D}}(\mathbf{Q}_\Sigma/\mathbf{Q}, V_{\lambda^n})$ and $H^1_{\mathcal{D}}(\mathbf{Q}_\Sigma/\mathbf{Q}, V)$ and these are $\mathcal{O}/\lambda^n$ and $\mathcal{O}$-modules, respectively. In the case where $\rho_0$ is ordinary the definitions are the same with $V_{\lambda^n}$ or $V$ replacing $V_\lambda$ and $\mathcal{O}/\lambda^n$ or $K/\mathcal{O}$ replacing $k$. One checks easily that as $\mathcal{O}$-modules

$$H^1_{\mathcal{D}}(\mathbf{Q}_\Sigma/\mathbf{Q}, V_{\lambda^n}) \simeq H^1_{\mathcal{D}}(\mathbf{Q}_\Sigma/\mathbf{Q}, V)_{\lambda^n}, \tag{1.7}$$

where as usual the subscript $\lambda^n$ denotes the kernel of multiplication by $\lambda^n$. This just uses the divisibility of $H^0(\mathbf{Q}_\Sigma/\mathbf{Q}, V)$ and $H^0(\mathbf{Q}_p, W/W^0)$ in the strict case. In the Selmer case one checks that for $m > n$ the kernel of

$$H^1(\mathbf{Q}_p^{\mathrm{unr}}, V_{\lambda^n}/W_{\lambda^n}^0) \to H^1(\mathbf{Q}_p^{\mathrm{unr}}, V_{\lambda^m}/W_{\lambda^m}^0)$$

has only the zero element fixed under $\operatorname{Gal}(\mathbf{Q}_p^{\mathrm{unr}}/\mathbf{Q}_p)$ and the ord case is similar. Checking conditions at $q \in \mathcal{M}$ is done with similar arguments. In the Selmer and strict cases we make analogous definitions with $W_{\lambda^n}$ in place of $V_{\lambda^n}$ and $W$ in place of $V$ and the analogue of (1.7) still holds.

We now consider the case where $\rho_0$ is flat (but not ordinary). We claim first that there is a natural map of $\mathcal{O}$-modules

$$H^1(\mathbf{Q}_p, V_{\lambda^n}) \to \operatorname{Ext}^1_{\mathcal{O}[D_p]}(U_{\lambda^m}, U_{\lambda^n}) \tag{1.8}$$

for each $m \ge n$ where the extensions are of $\mathcal{O}$-modules with local Galois action. To describe this suppose that $\alpha \in H^1(\mathbf{Q}_p, V_{\lambda^n})$. Then we can associate to $\alpha$ a representation $\rho_\alpha : \operatorname{Gal}(\bar{\mathbf{Q}}_p/\mathbf{Q}_p) \to GL_2(\mathcal{O}_n[\varepsilon])$ (where $\mathcal{O}_n[\varepsilon] =$