One of the things I like about the cultural climate I inhabit is that it understands that all manner of matters look different depending on how one comes at them; and, by listening to those who come at them in various ways, it learns more of the picture than any one person could see alone. There is a lively interest in discovering more general truths – that is, ones which remain true from many ways of seeing matters. We find it constructive to understand how a matter will appear to other folk: it aids us in expressing it to them, if nothing else. We must also, of necessity, remain clear about how it appears to ourselves – else we cannot contribute to, or fully comprehend, the discourse about how it appears to those about us.

The central theme of relativity is the pursuit of laws of physics that will be true for anyone, regardless of their state of movement. The focus is on what they may discover by physical experiments conducted in boxes (which hide the details of their surroundings from them): so the question of where they're seeing it from should be irrelevant to how they interpret the experiments they do. This idea that one can do experiments inside a box and expect to see the physics that goes on therein goes back a way. It lies at the root of Newton's laws: in which the characters of time and space were presumed independent of one another – an expectation now described as Galilean invariance.

How Maxwell showed Einstein The Way

The great triumph of Einstein's work springs from his understanding that Maxwell's description of electrodynamics was right. What Maxwell expressed in his formulation was built on data collected in laboratories: the effects of gravity being known or side-stepped – and, in any case, dwarfed by the electric and magnetic forces studied – the data were all effectively adjusted for what they would be in the ideal domain, far from any massive bodies or other sources of extraneous force, wherein Newton asserts that massive bodies move in fixed direction at fixed speed. Newton chose that ideal domain well: when Maxwell's description of electrodynamics is expressed as it would be there, it is elegant and clear.

Assumed Galilean invariance

Newton took as read an orthodox view of how things look to moving observers, which was simple and got right answers in so far as anyone was able to check (or, indeed, form any notion of how else position and time might relate to one another). This Galilean invariance amounts to the eminently sensible presumption that the shape, mass and size of a rigid body, as attributes of it, are wholly independent of its velocity: a metre ruler is a metre long regardless of what particular fixed speed and fixed direction Newton says it's moving in. This assertion turns out to be wrong, but Newton wasn't to know.

Newton's laws are expressed subject to an unwitting presumption of what we now call Galilean invariance – position and time are entirely independent questions. They give a nice clean model in which one can consider adding some fixed offset to the velocities of all bodies in a solution to the equations: the account of what is going on that it gives is just exactly what someone flying past with minus that velocity would see. It also turns out to be a solution of Newton's equations (as should be no surprise, given Newton's first law). We now read this as telling us that the laws of physics look the same to observers moving at constant velocity with respect to one another. The actual motions of bodies look a bit different, but the difference is expressible simply in terms of a fixed offset to all the velocities involved. At least formally, given the context of Newton's first law (far from any (other) gravitating body or other cause of extraneous force), this presumes that the only gravitating bodies involved are the observer flying past at some fixed velocity (presumably also distant from us) and whatever physical process we are observing.

Among the discoveries of electrodynamics was this pair: a wire in which an electric current is flowing will be subject to a force, in the presence of a magnetic field if either

In particular, if the source of the magnetic field were moving (hence causing variation of the local magnetic field seen by our wire), the force on the wire was exactly what one obtained by holding the magnet still and moving the wire in the opposite direction past it. This appeared experimentally and fitted with the Galilean presumption.

Now, 'appen, it turns out that Maxwell's equations of electrodynamics aren't Galilean invariant. Doubtless it was obvious they would be, but it turned out they aren't.

Proof of Galilean non-invariance

Intuitively, one naturally presumes that the permeability and permittivity of free space (which describe the strengths of magnetism and electricity, respectively) will be the same as seen by all observers – if only for want of any way of seeing how they might vary. From these two we may infer the speed at which light travels. But if light travels at that speed in two opposite directions as seen by me, and you're passing on a train, it's travelling faster one way than the other. This was as obvious as Galilean invariance: so obvious one wouldn't notice having presumed it. However, Maxwell's equations support free-field solutions travelling at the speed of light – and no other – which can only work for one observer, in the Galilean scheme.

To prove the contradiction more formally, one must look at Maxwell's equations and see how they transform when we change frames of reference. To this end I now introduce the impedance of free space, Z0, and the speed of light, c: these are obtained from the permeability, μ0, and permittivity, ε0, by taking the square root of their ratio and the inverse of the square root of their product:

Z0 = √(μ0/ε0), c = 1/√(μ0.ε0)

These, in turn, imply μ0 = Z0/c and 1/ε0 = Z0.c, which I'll substitute in place of the permeability and permittivity where the equations involve them.
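As a quick numerical check of these definitions, here is a minimal sketch. It takes Z0 as the square root of the ratio μ0/ε0 and c as the inverse square root of the product μ0.ε0, which is what the two substitutions just given require. The numerical values of μ0 and ε0 are standard reference values that I assume here; the text itself quotes none.

```python
import math

# Standard reference values, assumed for illustration (not quoted in the text).
MU_0 = 1.25663706212e-6       # permeability of free space, henry / metre
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, farad / metre


def impedance_and_speed(mu_0, epsilon_0):
    """Return (Z0, c) as sqrt(mu0/epsilon0) and 1/sqrt(mu0.epsilon0)."""
    z_0 = math.sqrt(mu_0 / epsilon_0)
    c = 1.0 / math.sqrt(mu_0 * epsilon_0)
    return z_0, c


Z_0, C = impedance_and_speed(MU_0, EPSILON_0)
print(f"Z0 = {Z_0:.3f} ohm, c = {C:.6e} m/s")

# The substitutions used in the text: mu0 = Z0/c and 1/epsilon0 = Z0.c
assert math.isclose(Z_0 / C, MU_0, rel_tol=1e-12)
assert math.isclose(Z_0 * C, 1.0 / EPSILON_0, rel_tol=1e-12)
```

The impedance comes out a little under 377 ohms and the speed a little under three hundred thousand kilometres per second.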

By choosing one spatial co-ordinate, x, parallel to the direction of movement, we can express any Galilean change of frame of reference as replacing (t,x,y,z) with (t,u,y,z) with u = x −v.t for some velocity v. Keeping t and x fixed implies keeping t and u fixed (and conversely), so partial differentiation with respect to y or z doesn't change. Keeping t,y,z fixed, any variation in x equates to an identical variation in u, and vice versa, so partial differentiation in x and u are equivalent. However, keeping u,y,z fixed and varying t, x varies as u+v.t and so ∂/∂t in the (t,u,y,z) frame is ∂/∂t +v.∂/∂x in the (t,x,y,z) frame.
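The rule for the time derivative is easy to test numerically. The following is only a sketch: the test function f, the velocity V and the sample point are arbitrary choices of mine, and derivatives are approximated by central differences.

```python
import math

V = 0.7      # frame velocity along x (arbitrary illustrative value)
H = 1e-5     # finite-difference step


def f(t, x):
    """An arbitrary smooth test field as seen in the (t, x) frame."""
    return math.sin(1.3 * x - 2.1 * t) + 0.5 * x * t


def f_moving(t, u):
    """The same field expressed in the (t, u) frame, where x = u + V*t."""
    return f(t, u + V * t)


def d_dt(func, t, s):
    """Central difference in time, with the spatial argument held fixed."""
    return (func(t + H, s) - func(t - H, s)) / (2 * H)


def d_ds(func, t, s):
    """Central difference in the spatial argument, with time held fixed."""
    return (func(t, s + H) - func(t, s - H)) / (2 * H)


t0, u0 = 0.4, 1.1
x0 = u0 + V * t0                               # the same event in the x frame

lhs = d_dt(f_moving, t0, u0)                   # d/dt at fixed u
rhs = d_dt(f, t0, x0) + V * d_ds(f, t0, x0)    # d/dt + v.d/dx at fixed x
spatial_u = d_ds(f_moving, t0, u0)             # d/du at fixed t
spatial_x = d_ds(f, t0, x0)                    # d/dx at fixed t

print(lhs, rhs)              # these two agree to rounding error
print(spatial_u, spatial_x)  # and so do these two
```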

Of course, the two frames see different currents, since a charge at rest for one frame is moving in the other, so constitutes a current in the latter but not the former; but charge was assumed to be the same for both frames of reference. So, if (t,x,y,z) sees charge density ρ and current density j, (t,u,y,z) will see charge density ρ and current density i(t,u,y,z) = j(t,u+v.t,y,z)−V.ρ, where V is a vector velocity of size v in the x direction.

The forces on objects are the same in both frames of reference (under Galilean invariance), from which we can infer how the electric and magnetic fields change. If (t,x,y,z) sees electric field E and magnetic field B, while (t,u,y,z) sees e and b, equality of force implies ρ.E+j^B =ρ.e+i^b =ρ(e−V^b)+j^b. We can have j non-zero somewhere with ρ=0 everywhere, implying that j^b = j^B, whence e = E +V^b. By symmetry, this gives e −V^B = E = e −V^b hence V^B = V^b. Together with j^B = j^b, with j potentially in any direction, assuming the relationship between b and B only depends on our change of coordinates, I conclude b = B.

We can state Maxwell's equations, following Oliver Heaviside, in the form

div(B) = 0

The divergence of the magnetic field, B, is zero. The divergence of a vector field is the result of differentiating each of its components with respect to the corresponding component of position (at fixed time and value of the other components of position) and summing the results; so div(B) means ∂Bx/∂x+∂By/∂y+∂Bz/∂z. It is orthodoxly spelled ∇·B.

Since b=B and differentiating with respect to spatial co-ordinates is unchanged by our Galilean change of frame of reference, if div(B) is zero in one frame of reference, then it will be zero in all frames of reference: so this equation is Galilean invariant.

curl(E) +∂B/∂t = 0

The antisymmetric outer derivative, or curl, of the electric field is equal and opposite to the rate at which the magnetic field changes with time. The curl of a vector field is obtained by taking the antisymmetric outer product of the spatial differential operator, construed as a vector (∂/∂x,∂/∂y,∂/∂z), and the vector field; thus curl(E) has components (∂Ez/∂y −∂Ey/∂z, ∂Ex/∂z −∂Ez/∂x, ∂Ey/∂x −∂Ex/∂y). It is orthodoxly spelled ∇^E.

Our transformed curl(e) is curl(E+V^B) = curl(E) +V.div(B) −v.∂B/∂x; our transformed ∂b/∂t is ∂B/∂t +v.∂B/∂x. Adding the two together, the two v.∂B/∂x terms cancel, and V.div(B) is zero (see above), so we get curl(E) +∂B/∂t again: this equation is also invariant.
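For readers who want the intermediate steps, the cancellation can be set out as a worked equation. This is my own restatement in conventional vector notation, writing × where the text writes ^ and marking with a subscript u the time derivative taken at fixed u; it assumes, as the text does, that V is constant and points along x.

```latex
\begin{align*}
\nabla \times (\mathbf{V} \times \mathbf{B})
  &= \mathbf{V}\,(\nabla \cdot \mathbf{B}) - (\mathbf{V} \cdot \nabla)\,\mathbf{B}
  && \text{(constant } \mathbf{V}\text{)} \\
  &= \mathbf{V}\,(\nabla \cdot \mathbf{B}) - v\,\frac{\partial \mathbf{B}}{\partial x}
  && \text{(} \mathbf{V} = (v, 0, 0) \text{)} \\[4pt]
\nabla \times \mathbf{e} + \left.\frac{\partial \mathbf{b}}{\partial t}\right|_{u}
  &= \nabla \times \mathbf{E} + \mathbf{V}\,(\nabla \cdot \mathbf{B})
     - v\,\frac{\partial \mathbf{B}}{\partial x}
     + \frac{\partial \mathbf{B}}{\partial t} + v\,\frac{\partial \mathbf{B}}{\partial x} \\
  &= \nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t}
  && \text{(using } \nabla \cdot \mathbf{B} = 0 \text{)}
\end{align*}
```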

div(E) = ρ.c.Z0

The divergence of the electric field, E, is the charge density divided by the permittivity of free space.

In our (t,u,y,z) system, div(e) is equal to (t,x,y,z)'s div(E+V^B), which is div(E) −V·curl(B) and not, generally, equal to div(E), while the charge density on the right-hand side is unchanged: it suffices to choose our V not perpendicular to curl(B), e.g. parallel to the direction of some current which is subject to a force.
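The extra term hangs on the vector identity div(V^B) = −V·curl(B) for constant V, and its sign is easy to get wrong, so here is a numerical sanity check. It is a sketch only: the field B below is an arbitrary smooth function I made up (it is not a solution of Maxwell's equations), and the derivatives are central differences.

```python
import math

H = 1e-5                 # finite-difference step
V = (0.6, 0.0, 0.0)      # constant velocity along x, as in the text


def B(p):
    """An arbitrary smooth test field, chosen only for illustration."""
    x, y, z = p
    return (math.sin(y) * z, x * x * z + math.cos(z), x * y + math.sin(x * z))


def cross(a, b):
    """The antisymmetric product the text writes as a^b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def partial(field, p, axis, component):
    """Central-difference derivative of one component of field along one axis."""
    up = list(p)
    up[axis] += H
    down = list(p)
    down[axis] -= H
    return (field(up)[component] - field(down)[component]) / (2 * H)


def div(field, p):
    return sum(partial(field, p, i, i) for i in range(3))


def curl(field, p):
    return (partial(field, p, 1, 2) - partial(field, p, 2, 1),
            partial(field, p, 2, 0) - partial(field, p, 0, 2),
            partial(field, p, 0, 1) - partial(field, p, 1, 0))


def v_cross_b(p):
    return cross(V, B(p))


point = (0.3, -0.8, 1.2)
lhs = div(v_cross_b, point)
v_dot_curl_b = sum(v * c for v, c in zip(V, curl(B, point)))
print(lhs, -v_dot_curl_b)   # the two numbers agree, and neither is zero
```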

curl(B).c = j.Z0 +∂E/∂t/c

The antisymmetric outer derivative (a.k.a. curl) of the magnetic field, divided by the permeability of free space, is equal to the physical current, j, plus the displacement current ε0.∂E/∂t.

When we perform our Galilean transformation, curl(B) remains unchanged: but we need to replace j with i = j−V.ρ, E with E+V^B and ∂/∂t with ∂/∂t +v.∂/∂x, giving us j.Z0 −V.ρ.Z0 +∂E/∂t/c +V^∂B/∂t/c +v.∂E/∂x/c +v.V^∂B/∂x/c, which again differs from the un-transformed equation, unless

  • V.ρ.Z0 = V^∂B/∂t/c +v.V^∂B/∂x/c +v.∂E/∂x/c

[Maxwell originally stated these equations in a form which required a separate name for each co-ordinate of each vector field; and a separate equation for each co-ordinate of the two vector equations above (the div(B) and div(E) ones are pure scalar equations), for a total of eight equations in ten functions of position and time. The footnotes to the third edition of Maxwell's classic text-book on the theory reveal that the vector notation was being developed, but was still somewhat poor. Oliver Heaviside ultimately cleaned up the mess, both putting the vector notation in its familiar form and expressing Maxwell's equations using it. The form

∇·B.c = 0        ∇^E +∂B.c/∂(c.t) = 0
∇·E = ρ.c.Z0     ∇^B.c −∂E/∂(c.t) = j.Z0

is more (nearly) usual for presentation of these equations.]

A compelling illusion

One way out of the bind was to suppose that light does indeed travel at a fixed speed with respect to observers who are at rest: in effect, to say that light's way of looking at the universe is privileged, in a sense we long ago gave up presuming for our own perspective. This implies a frame of reference which had universal applicability.

So, obviously, the first thing you try to do is measure your own velocity with respect to light's frame of reference. Measure the difference between observed physics and Maxwell's predictions for light's frame of reference. Funny thing: no trace of it. Are we, after all, the one reference frame the whole Universe bows to ? Then again, we know the Earth is spinning and that, in spinning frames of reference, gravity isn't as simple as Newton's laws (one needs centrifugal and Coriolis adjustments, at least) – nor, indeed, is electrodynamics – so obviously a spinning frame as the Universe's viewpoint is right out. But the labs in which the ingenious interferometry was done were sat on the Earth's crust, so spinning. Even if so slowly spinning that the effect was too small to be noticed, the velocity of a lab in one place on the Earth is different from another elsewhere: the experimenters would never have considered for a moment the absurd idea that they had happened to select, as the location for their lab, the one location on the Earth which the reference frame of light honoured. What would happen if they had an Earth-quake ?

So, obviously, if one lab sees light's velocity as independent of direction (where we might expect it to appear to move slower, as we catch up on it, if it is going in the same direction as us; but faster, as we rush apart, if it goes the other way) then so will any other lab. If that's true all over the Earth, it has to be true for any lab anywhere, in the sense that if you took the lab, sealed it up tight, dragged it to somewhere else and did experiments in the lab while it was there, you'd still find things to be the same.

So, if light has a reference frame of its own, in which Maxwell's equations take their neatest form, the Universe must be playing some trick which conceals our movement, relative to that frame, from us. Fortunately, as this idea began to take shape, we had Henri Poincaré on hand to point us away from the metaphysical trap. An illusion that convincing is real. In more modern terms, physics is about describing the illusion the Universe gives us that we know as reality. So, in fact, the speed of light is the same to all observers.

With hind-sight, of course, it's all obvious. The great thing Albert Einstein did was to grab hold of the beautiful truths (physics looks the same to any observer, Maxwell's equations should take their cleanest form) and hold firm to them while struggling free of the obvious presumption which permeated the very culture from which he had learned those truths. Once he was done, a new viewpoint entered that culture and has shaped the notions of obvious that spring readily to my mind.


Special Relativity came about from recognizing that Maxwell's equations must look the same for all observers and, in particular, the speed of light must appear the same to all observers, regardless of the velocity of the light's emitter or receiver.

Special Relativity is effectively an amended description of the ideal domain I attributed to Newton earlier: massive bodies do, indeed, move in straight lines at steady speed; though the size, shape and mass of a body turn out to depend on its velocity (in an understood way). It is what you get if you use Lorentz invariance (inferred from Maxwell's equations – it uses light as a yard-stick) in place of Galilean invariance.

Now, the ideal domain only actually matches Newton's description as seen by an observer who isn't accelerating or spinning. So, rather than discussing the domain itself, we discuss the frame of reference of an observer: and Newton's first law becomes an assertion that there are frames of reference in which (except in so far as Newton's second law comes into play) massive bodies move in straight lines at fixed speed. One of the things on which Galilean and Lorentz invariance agree (blessed be) is that if one observer sees massive bodies behaving like that, and sees a second observer as non-spinning and moving at constant velocity, then the second observer also agrees that each massive body moves at constant speed in fixed direction (Galileo and Lorentz disagree, however, about the relationship among the fixed velocities involved). It rapidly emerges that these frames of reference are exactly the ones in which one sees the law of conservation of momentum (a.k.a. inertia): this is inferred from Newton's laws. We therefore characterize these frames of reference as inertial.

In the presence of gravitating bodies, massive bodies are always subject to forces, so they accelerate. The total momentum of all massive bodies involved is still conserved – as seen by an inertial frame, albeit this has to be defined by reference to bodies far from the ones we are studying. One can study how physics then looks to someone inside a laboratory and discover that, if the laboratory is freely-falling (for example, in orbit about some massive body), the physics looks mostly like physics in an inertial frame far from all gravitating bodies: there are small errors due to tidal effects (the direction to and distance from the centre of the orbit vary minutely from one part of the lab to another) but these may be made as small as one cares (by making the orbit huge enough by comparison to the scale of our laboratory and the sensitivity of our ability to make measurements). Thus my freely-falling frame of reference sees physics as locally ideal even though my laboratory's velocity (as seen by an inertial frame of reference) is not constant – specifically, the condition for perfection of the local ideal is that the acceleration of my laboratory and the bodies at rest in it must exactly match the gravitational forces on them.

So the natural next question must be what form do Maxwell's equations take in the presence of gravitation ? Since we are no longer far from all gravitating bodies we aren't in Newton's ideal domain: we need some way to get information about the real world from what we know of the ideal domain. In order to do so, we need to look again at the ideal domain.

Suppose a team of observers watches events in the ideal domain, but each observer has a private little rocket-ship in which to zoom around. When an observer's rocket is off, the observer drifts at constant velocity and (unless the rocket-ship is spinning) sees the laws of physics in their simplest form. When a rocket is active, its observer is accelerating, so things will look different. We can, however, examine what this observer will see – by considering what her colleagues see who are coasting by at the time. At any given moment the accelerating observer has a particular velocity and a peer drifting by with that velocity will be seeing the laws of physics in their simple form: from this we may infer what the accelerated observer experiences. Slightly harder, but in a similar vein, we can consider how physics in the ideal domain looks to a spinning observer.

These thought-experiments had, even in Einstein's time, known results for the Newtonian formalism: but under the Galilean invariance, they didn't seem to say anything very constructive – and they gave more complicated physics. The approach does, however, allow one to describe how physics looks to an observer accelerating at a constant rate in a fixed direction. Within Newton's description of gravitation, what such an observer sees looks like what an observer at rest would see in the presence of a gravitational field whose strength matches the acceleration: there would be errors inversely proportional to the observer's distance from the source of that gravitational field (due to variation in the direction, to the centre of gravity of the massive body causing the field, from points in different parts of the observer's laboratory), but these may be made arbitrarily small by being sufficiently far from the massive body (whose mass has to increase as the square of the distance to attain the required field strength). Formally, the simply accelerating case coincides with the limiting case of a gravitational field of given strength due to an immensely distant immensely² massive body. (The scale of immensity involved is proportional to the scale of the laboratory, in which the observer is able to conduct experiments, divided by the sensitivity of the measurements the observer is able to make: for a laboratory a few metres across, and an observer only able to measure things to one part in a million, the Earth's radius is suitably immense.) This much was understood before Einstein's time.
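The two scalings in that argument (mass growing as the square of the distance, error shrinking as its inverse) can be illustrated with a few lines of arithmetic. This is a rough sketch under my own assumptions: a three-metre laboratory, a target field of 9.81 m/s², a standard value for Newton's constant, and a small-angle estimate (laboratory size over distance) for how much the direction to the source varies across the laboratory.

```python
G = 6.6743e-11  # Newton's constant, m^3 / (kg s^2); standard value, assumed here


def source_mass(field_strength, distance):
    """Mass needed at the given distance to produce the given field strength."""
    return field_strength * distance ** 2 / G


def direction_error(lab_size, distance):
    """Small-angle estimate of how much 'down' varies across the laboratory."""
    return lab_size / distance


g = 9.81    # target field strength, m/s^2 (illustrative)
lab = 3.0   # laboratory size, metres (illustrative)

# Start at the Earth's radius and move the source ten and a hundred times further off.
for d in (6.371e6, 6.371e7, 6.371e8):
    print(f"distance {d:.3e} m: mass {source_mass(g, d):.3e} kg, "
          f"error ~ {direction_error(lab, d):.1e}")
```

At the Earth's radius the required mass comes out close to the Earth's own, and the directional error for a three-metre laboratory is already below one part in a million.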

Einstein again applied Poincaré's rule and concluded that gravitation and acceleration were the same thing in so far as they created the illusion of being so. This led him to formalize the ideal domain as being any domain in which Newton's first law holds – massive bodies move at constant velocity except in so far as (second law) forces act on them. In the presence of gravitational fields, we may think of these as being freely falling laboratories: alternatively, reflecting the character of the first law, we may call them locally inertial frames of reference.

On the way there, physics learned an important lesson: sufficiently convincing illusions are indistinguishable from reality. As an immediate example of this, we may consider a case examined well before Einstein's time: if you're sat inside an accelerating box, what you'll experience inside the box is just the same as if the box were sat at rest in a suitable gravitational field (produced by a sufficiently distant body, so that we can't detect the tidal effects of it). Indeed, if you're sat in a laboratory on the surface of the Earth, you effectively experience both gravity, as Newton described it, and a centrifugal force (whose strength, at the equator, is about 0.35% of that of gravity) away from the Earth's spin axis due to the acceleration you experience due to going round and round in circles: the result feels like gravity only it doesn't quite point at the centre of mass of the Earth.
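The size of the centrifugal term is a one-line calculation. The sketch below treats the Earth as a sphere spinning once per sidereal day; the radius, day length and surface gravity are standard round figures I have assumed, and the function name is my own.

```python
import math

SIDEREAL_DAY = 86164.1        # seconds per rotation of the Earth (assumed value)
EQUATORIAL_RADIUS = 6.378e6   # metres (assumed value)
SURFACE_GRAVITY = 9.81        # m/s^2 (assumed value)


def centrifugal_fraction(latitude_degrees):
    """Centrifugal acceleration at the surface as a fraction of gravity."""
    omega = 2 * math.pi / SIDEREAL_DAY
    # Distance from the spin axis shrinks with latitude (spherical Earth assumed).
    distance_from_axis = EQUATORIAL_RADIUS * math.cos(math.radians(latitude_degrees))
    return omega ** 2 * distance_from_axis / SURFACE_GRAVITY


for latitude in (0, 45, 90):
    print(f"latitude {latitude:2d}: {100 * centrifugal_fraction(latitude):.3f}% of gravity")
```

The fraction is largest at the equator, where it is roughly a third of a percent, and falls to nothing at the poles.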

Newtonian physics can also describe the tidal effects of other bodies, principally Sun and Moon, which arise because the Earth is a largely rigid body, so responds to the ambient gravitational field, due to all other bodies, as a whole, effectively averaging the ambient field over the whole body of the Earth. Earth thus accelerates at roughly the ambient gravitational field strength at its centre; points on its surface thus experience small tidal forces due to the ambient field's strength varying enough to be different, over at least some of Earth's surface, from what it is at Earth's centre. However, the magnitude of these tidal effects perturbs effective gravity experienced at the surface by only about one part in nine million; although this is enough to deform the oceans of our planet, it is small enough to be largely ignorable in the laboratory.
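To see where a figure of that order comes from, one can estimate the tidal stretch along the line to the perturbing body as 2.G.M.r/d³, the leading-order difference between the field at the Earth's surface and at its centre. The sketch below does just that; the masses and distances are standard approximate values I have assumed, and the formula ignores everything beyond leading order.

```python
G = 6.6743e-11            # Newton's constant, m^3 / (kg s^2)
EARTH_RADIUS = 6.371e6    # metres
SURFACE_GRAVITY = 9.81    # m/s^2

# (mass in kg, mean distance in metres); approximate values, assumed here
BODIES = {
    "Moon": (7.342e22, 3.844e8),
    "Sun": (1.989e30, 1.496e11),
}


def tidal_fraction(mass, distance):
    """Leading-order tidal stretch along the line to the body, relative to gravity."""
    return 2 * G * mass * EARTH_RADIUS / distance ** 3 / SURFACE_GRAVITY


for name, (mass, distance) in BODIES.items():
    fraction = tidal_fraction(mass, distance)
    print(f"{name}: about one part in {1 / fraction:.2e}")
```

On these figures the Moon's contribution is close to one part in nine million and the Sun's is somewhat under half of that.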

Einstein had the presence of mind to see, in this, a way of extending Newton's ideal domain in which physical laws are at their simplest. Newton's characterization of this domain is far from all gravitating bodies: and his first law says that massive bodies, subject to no extraneous forces, move in straight lines at steady speed.

The obvious next question was what effects do gravitating bodies have ? and that necessarily involved studying gravitation. Now, in fact, the experiments by which we learned about electromagnetism were conducted on the Earth's surface, in a spinning frame of reference, in the presence of gravity. Einstein took the Lorentz invariance to heart (and now that it had been so named, folk could begin identifying the older presumption as Galilean invariance). His great leap was to accept the constancy of the speed of light as the way to infer how clocks and lengths in different frames of reference relate to one another – rather than accepting a naïve a priori assumption about such relations. Attempting to infer, from how things are in Newton's ideal domain, what the appropriate replacement for Galilean

The field equations

When Maxwell's equations of electrodynamics are put into their relativistic form (on a smooth manifold, M, with tangent bundle T and gradient bundle, G; these last two are dual to one another) we find ourselves dealing with:

Separating out the energy-momentum-stress tensor into the above term due to electromagnetism and a separate one due to matter, and naming the latter M, we obtain:

Now, κ.ε0/2 times c³ is 4.π.G.ε0 which, as may easily be seen by comparing Newton's force law for gravity and Gauss' for electrostatics, is the square of a charge-to-mass ratio, roughly 86.16 nano-Coulombs per tonne. Let f be the result of scaling F by this charge-to-mass ratio: then f·g is a linear map from momentum to force, with units 1/time; we can now re-write the field equations in terms of f. First, we have Maxwell's equations:

in which the scaling of μ(j) is c³ divided by a current, about 2.7689e24 Amps; divide j by that current (to get an inverse area quantity), name that J, and the last equation becomes d^(μ(f)) = μ(J).c³. The force on the current, F·g·j becomes (with a little rearrangement) 2.f·g·J/κ. With f in place of F, Einstein's gravitational equation becomes:

(in which g\R, pronounced g under R by analogy with R over g for R/g, means g's inverse contracted on the left of R.) Next, observe that the right-hand side is g·f·g·f·g +R minus some scalar field times the metric, g. The left-hand side can be zero – or at least, we must presume it can – so we can infer that, in the absence of matter, g·f·g·f·g +R is the metric multiplied by some scalar field. Dividing it by the metric, we get f·g·f·g +g\R as the identity linear map times the same scalar field: we can take its trace, knowing that the trace of the identity is 4 (or, rather, the dimension of space-time) and infer that the scalar field must be equal to trace(f·g·f·g +g\R)/4, which isn't quite what the above equation told us – the two differ by trace(g\R/4) +Λ.

This implies that one of the following is true:

An obvious thing to check at this point is that τ[*,0,*](D(T)) is zero, which should hold true for T = g·F·g·F −trace(g·F·g·F)/4, at least in the absence of matter.
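As a numerical aside, the two constants quoted in this section can be checked from G, ε0 and c. The charge-to-mass ratio is the square root of 4.π.G.ε0, as stated. The formula used below for the current, c³ times the square root of ε0/(4.π.G), is my own guess at a combination of the same constants with the right units; the text does not spell it out, so treat that part as an assumption. The values of the constants are standard reference values, and small differences from the quoted figures are what one would expect from using a slightly different value of G.

```python
import math

G = 6.6743e-11                # Newton's constant, m^3 / (kg s^2)
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, farad / metre
C = 299792458.0               # speed of light, m/s

# Charge-to-mass ratio: square root of 4.pi.G.epsilon0, in coulombs per kilogram.
charge_per_mass = math.sqrt(4 * math.pi * G * EPSILON_0)
# Convert C/kg to nano-coulombs per tonne (1000 kg per tonne, 1e9 nC per C).
nano_coulombs_per_tonne = charge_per_mass * 1e3 * 1e9

# ASSUMPTION: this combination is my inference, not a formula given in the text.
current = C ** 3 * math.sqrt(EPSILON_0 / (4 * math.pi * G))

print(f"charge-to-mass ratio: {nano_coulombs_per_tonne:.2f} nC per tonne")
print(f"current scale: {current:.4e} A")
```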

Written by Eddy.