1 SPECIAL RELATIVITY AND FLAT SPACETIME

we will therefore leave out factors of $c$ in all subsequent formulae. Empirically we know that $c$ is the speed of light, $3 \times 10^8$ meters per second; thus, we are working in units where 1 second equals $3 \times 10^8$ meters. Sometimes it will be useful to refer to the space and time components of $x^\mu$ separately, so we will use Latin superscripts to stand for the space components alone:

$$ x^i : \quad x^1 = x , \quad x^2 = y , \quad x^3 = z \tag{1.7} $$

It is also convenient to write the spacetime interval in a more compact form. We therefore introduce a $4 \times 4$ matrix, the metric, which we write using two lower indices:

$$ \eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{1.8} $$

(Some references, especially field theory books, define the metric with the opposite sign, so be careful.) We then have the nice formula

$$ s^2 = \eta_{\mu\nu} \Delta x^\mu \Delta x^\nu . \tag{1.9} $$

Notice that we use the summation convention, in which indices which appear both as superscripts and subscripts are summed over. The content of (1.9) is therefore just the same as (1.3).

Now we can consider coordinate transformations in spacetime at a somewhat more abstract level than before. What kind of transformations leave the interval (1.9) invariant? One simple variety are the translations, which merely shift the coordinates:

$$ x^\mu \rightarrow x^{\mu'} = x^\mu + a^\mu , \tag{1.10} $$

where $a^\mu$ is a set of four fixed numbers. (Notice that we put the prime on the index, not on the $x$.) Translations leave the differences $\Delta x^\mu$ unchanged, so it is not remarkable that the interval is unchanged. The only other kind of linear transformation is to multiply $x^\mu$ by a (spacetime-independent) matrix:

$$ x^{\mu'} = \Lambda^{\mu'}{}_{\nu} x^\nu , \tag{1.11} $$

or, in more conventional matrix notation,

$$ x' = \Lambda x . \tag{1.12} $$

These transformations do not leave the differences $\Delta x^\mu$ unchanged, but multiply them also by the matrix $\Lambda$. What kind of matrices will leave the interval invariant? Sticking with the matrix notation, what we would like is

$$ s^2 = (\Delta x)^T \eta (\Delta x) = (\Delta x')^T \eta (\Delta x') = (\Delta x)^T \Lambda^T \eta \Lambda (\Delta x) , \tag{1.13} $$
and therefore

$$ \eta = \Lambda^T \eta \Lambda , \tag{1.14} $$

or

$$ \eta_{\rho\sigma} = \Lambda^{\mu'}{}_{\rho} \Lambda^{\nu'}{}_{\sigma} \eta_{\mu'\nu'} . \tag{1.15} $$

We want to find the matrices $\Lambda^{\mu'}{}_{\nu}$ such that the components of the matrix $\eta_{\mu'\nu'}$ are the same as those of $\eta_{\rho\sigma}$; that is what it means for the interval to be invariant under these transformations.

The matrices which satisfy (1.14) are known as the Lorentz transformations; the set of them forms a group under matrix multiplication, known as the Lorentz group. There is a close analogy between this group and O(3), the rotation group in three-dimensional space. The rotation group can be thought of as $3 \times 3$ matrices $R$ which satisfy

$$ \mathbf{1} = R^T \mathbf{1} R , \tag{1.16} $$

where $\mathbf{1}$ is the $3 \times 3$ identity matrix. The similarity with (1.14) should be clear; the only difference is the minus sign in the first term of the metric $\eta$, signifying the timelike direction. The Lorentz group is therefore often referred to as O(3,1). (The $3 \times 3$ identity matrix is simply the metric for ordinary flat space. Such a metric, in which all of the eigenvalues are positive, is called Euclidean, while those such as (1.8) which feature a single minus sign are called Lorentzian.)

Lorentz transformations fall into a number of categories. First there are the conventional rotations, such as a rotation in the $x$-$y$ plane:

$$ \Lambda^{\mu'}{}_{\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{1.17} $$

The rotation angle $\theta$ is a periodic variable with period $2\pi$. There are also boosts, which may be thought of as “rotations between space and time directions.” An example is given by

$$ \Lambda^{\mu'}{}_{\nu} = \begin{pmatrix} \cosh\phi & -\sinh\phi & 0 & 0 \\ -\sinh\phi & \cosh\phi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{1.18} $$

The boost parameter $\phi$, unlike the rotation angle, is defined from $-\infty$ to $\infty$. There are also discrete transformations which reverse the time direction or one or more of the spatial directions. (When these are excluded we have the proper Lorentz group, SO(3,1).) A general transformation can be obtained by multiplying the individual transformations; the
explicit expression for this six-parameter matrix (three boosts, three rotations) is not sufficiently pretty or useful to bother writing down. In general Lorentz transformations will not commute, so the Lorentz group is non-abelian. The set of both translations and Lorentz transformations is a ten-parameter non-abelian group, the Poincaré group.

You should not be surprised to learn that the boosts correspond to changing coordinates by moving to a frame which travels at a constant velocity, but let’s see it more explicitly. For the transformation given by (1.18), the transformed coordinates $t'$ and $x'$ will be given by

$$ \begin{aligned} t' &= t \cosh\phi - x \sinh\phi \\ x' &= -t \sinh\phi + x \cosh\phi . \end{aligned} \tag{1.19} $$

From this we see that the point defined by $x' = 0$ is moving; it has a velocity

$$ v = \frac{x}{t} = \frac{\sinh\phi}{\cosh\phi} = \tanh\phi . \tag{1.20} $$

To translate into more pedestrian notation, we can replace $\phi = \tanh^{-1} v$ to obtain

$$ \begin{aligned} t' &= \gamma (t - v x) \\ x' &= \gamma (x - v t) \end{aligned} \tag{1.21} $$

where $\gamma = 1/\sqrt{1 - v^2}$. So indeed, our abstract approach has recovered the conventional expressions for Lorentz transformations. Applying these formulae leads to time dilation, length contraction, and so forth.

An extremely useful tool is the spacetime diagram, so let’s consider Minkowski space from this point of view. We can begin by portraying the initial $t$ and $x$ axes at (what are conventionally thought of as) right angles, and suppressing the $y$ and $z$ axes. Then according to (1.19), under a boost in the $x$-$t$ plane the $x'$ axis ($t' = 0$) is given by $t = x \tanh\phi$, while the $t'$ axis ($x' = 0$) is given by $t = x / \tanh\phi$. We therefore see that the space and time axes are rotated into each other, although they scissor together instead of remaining orthogonal in the traditional Euclidean sense. (As we shall see, the axes do in fact remain orthogonal in the Lorentzian sense.)
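As a quick numerical sanity check of the formulas above, here is a minimal Python sketch (it is not part of the original notes). It builds the boost (1.18) for an arbitrarily chosen boost parameter, confirms that it satisfies (1.14), compares its action on a sample event with (1.21), and checks that the directions of the $t'$ and $x'$ axes are orthogonal with respect to $\eta$. The helper names (`boost_x`, `matmul`, `eta_dot`, and so on), the value $\phi = 0.7$, and the sample event are illustrative assumptions, chosen only for this example.

```python
import math

# Metric (1.8), signature (-, +, +, +), in units where c = 1.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def boost_x(phi):
    """Boost along x with boost parameter phi, as in (1.18)."""
    ch, sh = math.cosh(phi), math.sinh(phi)
    return [[ch, -sh, 0.0, 0.0],
            [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


def apply(m, x):
    """Matrix acting on a column of components (t, x, y, z)."""
    return [sum(m[i][j] * x[j] for j in range(4)) for i in range(4)]


def eta_dot(u, w):
    """Inner product eta_{mu nu} u^mu w^nu."""
    return sum(ETA[i][j] * u[i] * w[j] for i in range(4) for j in range(4))


def matrices_close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) <= tol
               for i in range(4) for j in range(4))


phi = 0.7                       # arbitrary illustrative boost parameter
lam = boost_x(phi)
v = math.tanh(phi)              # velocity from (1.20)
gamma = 1.0 / math.sqrt(1.0 - v * v)

# (1.14): Lambda^T eta Lambda should reproduce eta.
preserves_metric = matrices_close(matmul(matmul(transpose(lam), ETA), lam), ETA)

# (1.21): compare the matrix action with gamma(t - v x) and gamma(x - v t).
t, x = 2.0, 0.5                 # arbitrary sample event with y = z = 0
t_prime, x_prime = apply(lam, [t, x, 0.0, 0.0])[:2]
matches_1_21 = (abs(t_prime - gamma * (t - v * x)) < 1e-12
                and abs(x_prime - gamma * (x - v * t)) < 1e-12)

# Directions of the new axes in the original coordinates:
# t' axis (x' = 0) lies along (1, v); x' axis (t' = 0) lies along (v, 1).
t_axis = [1.0, v, 0.0, 0.0]
x_axis = [v, 1.0, 0.0, 0.0]
axes_orthogonal = abs(eta_dot(t_axis, x_axis)) < 1e-12

print(preserves_metric, matches_1_21, axes_orthogonal)
```

The last check makes the parenthetical remark concrete: the Lorentzian inner product of $(1, v, 0, 0)$ and $(v, 1, 0, 0)$ is $-v + v = 0$ for any $v$, even though the two directions visibly close up toward each other on the diagram.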
That the axes scissor together rather than rotate rigidly should come as no surprise, since if spacetime behaved just like a four-dimensional version of space the world would be a very different place.

It is also enlightening to consider the paths corresponding to travel at the speed $c = 1$. These are given in the original coordinate system by $x = \pm t$. In the new system, a moment’s thought reveals that the paths defined by $x' = \pm t'$ are precisely the same as those defined by $x = \pm t$ (indeed, (1.19) gives $x' \mp t' = e^{\pm\phi}(x \mp t)$, so one side vanishes exactly when the other does); these trajectories are left invariant under Lorentz transformations. Of course we know that light travels at this speed; we have therefore found that the speed of light is the same in any inertial frame. A set of points which are all connected to a single event by
straight lines moving at the speed of light is called a light cone; this entire set is invariant under Lorentz transformations. Light cones are naturally divided into future and past; the set of all points inside the future and past light cones of a point $p$ are called timelike separated from $p$, while those outside the light cones are spacelike separated and those on the cones are lightlike or null separated from $p$. Referring back to (1.3), we see that the interval between timelike separated points is negative, between spacelike separated points is positive, and between null separated points is zero. (The interval is defined to be $s^2$, not the square root of this quantity.) Notice the distinction between this situation and that in the Newtonian world; here, it is impossible to say (in a coordinate-independent way) whether a point that is spacelike separated from $p$ is in the future of $p$, the past of $p$, or “at the same time”.

[Figure: spacetime diagram showing the original axes $t$, $x$, the boosted axes $t'$, $x'$, and the light-cone lines $x = t$ (also $x' = t'$) and $x = -t$ (also $x' = -t'$).]

To probe the structure of Minkowski space in more detail, it is necessary to introduce the concepts of vectors and tensors. We will start with vectors, which should be familiar. Of course, in spacetime vectors are four-dimensional, and are often referred to as four-vectors. This turns out to make quite a bit of difference; for example, there is no such thing as a cross product between two four-vectors.

Beyond the simple fact of dimensionality, the most important thing to emphasize is that each vector is located at a given point in spacetime. You may be used to thinking of vectors as stretching from one point to another in space, and even of “free” vectors which you can slide carelessly from point to point. These are not useful concepts in relativity. Rather, to each point $p$ in spacetime we associate the set of all possible vectors located at that point; this set is known as the tangent space at $p$, or $T_p$.
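Before going further with tangent spaces, the classification of separations described above (timelike, spacelike, null) can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function names `interval`, `classify`, and `boost_x`, the tolerance, and the sample events are all invented here, and the boost is the one from (1.21) in units with $c = 1$. The sketch computes $s^2$ with the sign convention of (1.8) and then boosts a spacelike separated pair of events at several velocities to see how the sign of $\Delta t'$ behaves.

```python
import math


def interval(p, q):
    """s^2 between events p and q, each given as (t, x, y, z), metric (-,+,+,+)."""
    dt, dx, dy, dz = (b - a for a, b in zip(p, q))
    return -dt * dt + dx * dx + dy * dy + dz * dz


def classify(p, q, tol=1e-12):
    """Label the separation by the sign of the interval."""
    s2 = interval(p, q)
    if s2 < -tol:
        return "timelike"
    if s2 > tol:
        return "spacelike"
    return "null"


def boost_x(event, v):
    """Coordinates of an event in a frame moving with velocity v along x, as in (1.21)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    t, x, y, z = event
    return (gamma * (t - v * x), gamma * (x - v * t), y, z)


p = (0.0, 0.0, 0.0, 0.0)
q = (1.0, 2.0, 0.0, 0.0)   # spacelike separated from p: s^2 = -1 + 4 = 3

for v in (0.0, 0.5, 0.8):
    p_new, q_new = boost_x(p, v), boost_x(q, v)
    dt_new = q_new[0] - p_new[0]
    # The label never changes, but the sign of dt_new depends on v.
    print(v, classify(p_new, q_new), dt_new)
```

For this pair, $\Delta t' = \gamma(1 - 2v)$, which is positive at $v = 0$, zero at $v = 0.5$, and negative at $v = 0.8$, while the interval stays equal to 3 in every frame. This is the sense in which a spacelike separated point cannot be assigned to the future, the past, or “the same time” in a coordinate-independent way.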
The name “tangent space” is inspired by thinking of the set of vectors attached to a point on a simple curved two-dimensional space as comprising a
plane which is tangent to the point. But inspiration aside, it is important to think of these vectors as being located at a single point, rather than stretching from one point to another. (Although this won’t stop us from drawing them as arrows on spacetime diagrams.)

[Figure: the tangent space $T_p$ drawn as a plane touching the manifold $M$ at the point $p$.]

Later we will relate the tangent space at each point to things we can construct from the spacetime itself. For right now, just think of $T_p$ as an abstract vector space for each point in spacetime. A (real) vector space is a collection of objects (“vectors”) which, roughly speaking, can be added together and multiplied by real numbers in a linear way. Thus, for any two vectors $V$ and $W$ and real numbers $a$ and $b$, we have

$$ (a + b)(V + W) = aV + bV + aW + bW . \tag{1.22} $$

Every vector space has an origin, i.e. a zero vector which functions as an identity element under vector addition. In many vector spaces there are additional operations such as taking an inner (dot) product, but this is extra structure over and above the elementary concept of a vector space.

A vector is a perfectly well-defined geometric object, as is a vector field, defined as a set of vectors with exactly one at each point in spacetime. (The set of all the tangent spaces of a manifold $M$ is called the tangent bundle, $T(M)$.) Nevertheless it is often useful for concrete purposes to decompose vectors into components with respect to some set of basis vectors. A basis is any set of vectors which both spans the vector space (any vector is a linear combination of basis vectors) and is linearly independent (no vector in the basis is a linear combination of other basis vectors). For any given vector space, there will be an infinite number of legitimate bases, but each basis will consist of the same number of