One of the things I like about the cultural climate I inhabit is that it
understands that all manner of matters look different depending on how one comes
at them; and, by listening to those who come at them in various ways, learns more
of the picture than any one could see personally. There is a lively interest in
discovering more general truths – that is, ones which remain true
from many ways of seeing matters. We find it constructive to understand how a
matter will appear to other folk: it aids us in expressing it to them, if
nothing else. We must also, of necessity, remain clear about how it appears
to ourselves – else we cannot contribute to, or fully
comprehend, the discourse about how it appears to those about us.
The central theme of relativity is the pursuit of laws of physics that will
be true for anyone, regardless of their state of movement. The focus is
on what they may discover by physical experiments conducted in boxes (which hide
the details of their surroundings from them): so the question of where they're
seeing it from should be irrelevant to how they interpret the experiments
they do. This idea that one can do experiments inside a box and expect
to see the physics that goes on therein goes back a way. It lies at the root of
Newton's laws: in which the characters of time and space were presumed
independent of one another – an expectation now described as Galilean
invariance.
The great triumph of Einstein's work springs from his understanding that
Maxwell's description of electrodynamics was right. What Maxwell
expressed in his formulation was built on data collected in laboratories: the
effects of gravity being known or side-stepped – and, in any case, dwarfed
by the electric and magnetic forces studied – the data were all
effectively adjusted for what they would be in the ideal domain,
far from any massive bodies or other sources of extraneous force, wherein Newton
asserts that massive bodies move in fixed direction at fixed speed. Newton
chose that ideal domain well: when Maxwell's description of electrodynamics is
expressed as it would be there, it is elegant and clear.
Newton took as read an orthodox view of how things look to moving
observers, which was simple and got right answers in so far as anyone
was able to check: or, indeed, form any notion of how else position
and time might relate to one another. This Galilean invariance
amounts to the eminently sensible presumption that the shape, mass and size of
a rigid body, as attributes of it, are wholly independent of its
velocity: a metre ruler is a metre long regardless of what particular fixed
speed and fixed direction Newton says it's moving in. This assertion turns out
to be wrong, but Newton wasn't to know.
Newton's laws are expressed subject to an unwitting presumption of what we
now call Galilean invariance – position and time are entirely
independent questions. They give a nice clean model in which one can consider
adding some fixed offset to the velocities of all bodies in a solution
to the equations: the account of what is going on that it gives is just
exactly what someone flying past with minus that velocity would see. It also
turns out to be a solution of Newton's equations (as should be no surprise,
given Newton's first law). We now read this as telling us that the laws of
physics look the same to observers moving at constant velocity with respect
to one another. The actual motions of bodies look a bit different, but the
difference is expressible simply in terms of a fixed offset to all the
velocities involved. At least formally, given the context of Newton's first law
(far from any (other) gravitating body or other cause of extraneous force), this
presumed that the only gravitating bodies involved be the (presumably distant
also from us) observer flying past at some fixed velocity and whatever physical
process we are observing.
Among the discoveries of electrodynamics was this pair: a wire in which an electric current is flowing will be subject to a force, in the presence of a magnetic field if either
In particular, if the source of the magnetic field were moving (hence
causing variation of the local magnetic field seen by our wire), the force on
the wire was exactly what one obtained by holding the magnet still and moving
the wire in the opposite direction past it. This appeared experimentally and
fitted with the Galilean presumption.
Now, 'appen, it turns out that Maxwell's equations of
electrodynamics aren't Galilean invariant. Doubtless it was obvious
they would be, but it turned out they aren't.
Intuitively, one naturally presumes that the permeability and permittivity of free space (which describe the strengths of magnetism and electricity, respectively) will be the same as seen by all observers – if only for want of any way of seeing how they might vary. From these two we may infer the speed at which light travels. But if light travels at that speed in two opposite directions as seen by me, and you're passing on a train, it's travelling faster one way than the other. This was as obvious as Galilean invariance: so obvious one wouldn't notice having presumed it. However, Maxwell's equations support free-field solutions travelling at the speed of light – and no other – which can only work for one observer, in the Galilean scheme.
To prove the contradiction more formally, one must look at Maxwell's equations and see how they transform when we change frames of reference. To this end I now introduce the impedance of free space, Z0, and the speed of light, c: these are obtained from the permeability, μ0, and permittivity, ε0, by taking the square roots of their ratio and product (inverting the latter):
Z0 = √(μ0/ε0), c = 1/√(μ0.ε0)
These, in turn, imply μ0 = Z0/c and 1/ε0 = Z0.c, which I'll substitute in place of the permeability and permittivity where the equations involve them.
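As a quick numerical illustration of these two definitions, here is a minimal sketch. The numerical values of μ0 and ε0 are standard published figures that I have supplied (they are not given in the text), and the function name is merely illustrative.

```python
import math

# Assumed inputs: standard published values, not taken from the text.
MU_0 = 1.25663706212e-6       # permeability of free space, henry/metre
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, farad/metre

def impedance_and_speed(mu0, eps0):
    """Return (Z0, c): root of the ratio and inverse root of the product."""
    z0 = math.sqrt(mu0 / eps0)
    c = 1.0 / math.sqrt(mu0 * eps0)
    return z0, c

Z0, C = impedance_and_speed(MU_0, EPSILON_0)
print(f"Z0 = {Z0:.3f} ohm, c = {C:.0f} m/s")

# The substitutions used in the text: mu0 = Z0/c and 1/eps0 = Z0.c
print(Z0 / C, MU_0)
print(Z0 * C, 1.0 / EPSILON_0)
```

The last two lines print each pair side by side, so one can see the substitutions agree to rounding.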
By choosing one spatial co-ordinate, x, parallel to the direction of movement, we can express any Galilean change of frame of reference as replacing (t,x,y,z) with (t,u,y,z) with u = x −v.t for some velocity v. Keeping t and x fixed implies keeping t and u fixed (and conversely), so partial differentiation with respect to y and z doesn't change. Keeping t,y,z fixed, any variation in x equates to an identical variation in u, and vice versa, so partial differentiation in x and u are equivalent. However, keeping u,y,z fixed and varying t, x varies as u+v.t and so ∂/∂t in the (t,u,y,z) frame is ∂/∂t +v.∂/∂x in the (t,x,y,z) frame.
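The rule for the time derivative can be checked numerically. The sketch below is only an illustration: the sample field f, the velocity and the evaluation point are arbitrary choices of mine, and derivatives are estimated by central finite differences.

```python
import math

V = 0.7  # frame velocity along x (arbitrary illustrative value)

def f(t, x):
    """A sample smooth field as described in the (t, x) frame (hypothetical)."""
    return math.sin(2.0 * x - 3.0 * t) + x * t

def f_moving(t, u):
    """The same field described in the (t, u) frame, using x = u + V*t."""
    return f(t, u + V * t)

def d(func, args, index, h=1e-5):
    """Central finite difference of func with respect to args[index]."""
    up = list(args)
    dn = list(args)
    up[index] += h
    dn[index] -= h
    return (func(*up) - func(*dn)) / (2.0 * h)

t0, x0 = 0.4, 1.3
u0 = x0 - V * t0  # the same event, labelled in the moving frame

# d/dt at fixed u, versus d/dt + v.d/dx at fixed x
lhs = d(f_moving, (t0, u0), 0)
rhs = d(f, (t0, x0), 0) + V * d(f, (t0, x0), 1)
print(lhs, rhs)

# d/du and d/dx agree without any correction
print(d(f_moving, (t0, u0), 1), d(f, (t0, x0), 1))
```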
Of course, the two frames see different currents, since a charge at rest
for one frame is moving in the other, so constitutes a current in the
latter but not the former; but charge was assumed to be the same for both frames
of reference. So, if (t,x,y,z) sees charge density ρ and current density j,
(t,u,y,z) will see charge density ρ and current density i(t,u,y,z) =
j(t,u+v.t,y,z)−V.ρ, where V is a vector velocity of size v in the x
direction.
The forces on objects are the same in both frames of reference (under Galilean invariance), from which we can infer how the electric and magnetic fields change. If (t,x,y,z) sees electric field E and magnetic field B, while (t,u,y,z) sees e and b, equality of force implies ρ.E+j^B =ρ.e+i^b =ρ(e−V^b)+j^b. We can have j non-zero somewhere with ρ=0 everywhere, implying that j^b = j^B, whence e = E +V^B. By symmetry, this gives e −V^B = E = e −V^b hence V^B = V^b. Together with j^B = j^b, with j potentially in any direction, assuming the relationship between b and B only depends on our change of coordinates, I conclude b = B.
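To see the bookkeeping in this argument at work, here is a small sketch that evaluates the force density ρ.E +j^B in both frames, using e = E +V^B, b = B and i = j −V.ρ. All the numerical values are arbitrary illustrations of mine, not data from the text.

```python
def cross(a, b):
    """Vector (antisymmetric) product of two 3-vectors, written a^b in the text."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(p + q for p, q in zip(a, b))

def scale(k, a):
    return tuple(k * p for p in a)

def force_density(rho, j, E, B):
    """Force per unit volume on charge density rho and current density j."""
    return add(scale(rho, E), cross(j, B))

# Arbitrary illustrative values as seen in the (t,x,y,z) frame.
rho = 2.0
j = (0.3, -1.1, 0.8)
E = (1.0, 0.5, -2.0)
B = (0.2, 0.7, -0.4)
V = (0.6, 0.0, 0.0)  # relative velocity, along x

# The Galilean transformation rules inferred in the text.
e = add(E, cross(V, B))
b = B
i = add(j, scale(-rho, V))

F_rest = force_density(rho, j, E, B)
F_moving = force_density(rho, i, e, b)
print(F_rest)
print(F_moving)
```

The two printed triples agree to rounding, because the ρ.V^B contributed by e cancels the −ρ.V^b contributed by i.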
We can state Maxwell's equations, following Oliver Heaviside, in the form
- div(B) = 0
The divergence of the magnetic field, B, is zero. The divergence of a vector field is the result of differentiating each of its components with respect to the corresponding component of position (at fixed time and value of the other components of position) and summing the results; so div(B) means ∂Bx/∂x+∂By/∂y+∂Bz/∂z. It is orthodoxly spelled ∇·B.
Since b=B and differentiating with respect to spatial co-ordinates is unchanged by our Galilean change of frame of reference, if div(B) is zero in one frame of reference, then it will be zero in all frames of reference: so this equation is Galilean invariant.
- curl(E) +∂B/∂t = 0
The antisymmetric outer derivative, or curl, of the electric field is equal and opposite to the rate at which the magnetic field changes with time. The curl of a vector field is obtained by taking the antisymmetric outer product of the spatial differential operator, construed as a vector (∂/∂x,∂/∂y,∂/∂z), and the vector field; thus curl(E) has components (∂Ez/∂y −∂Ey/∂z, ∂Ex/∂z −∂Ez/∂x, ∂Ey/∂x −∂Ex/∂y). It is orthodoxly spelled ∇^E.
Our transformed curl(e) is curl(E+V^B) = curl(E) +V.div(B) −v.∂B/∂x; our transformed ∂b/∂t is ∂B/∂t +v.∂B/∂x. Adding the two together, the two v.∂B/∂x terms cancel, and V.div(B) is zero (see above), so we get curl(E) +∂B/∂t again: this equation is also invariant.
- div(E) = ρ.c.Z0
The divergence of the electric field, E, is the charge density divided by the permittivity of free space.
In our (t,u,y,z) system, div(e) is equal to (t,x,y,z)'s div(E+V^B), which (V being constant) is div(E) −V·curl(B) and not, generally, equal to div(E) – it suffices to choose our V not perpendicular to curl(B), e.g. parallel to the direction of some current which is subject to a force.
- curl(B).c = j.Z0 +∂E/∂t/c
The antisymmetric outer derivative (a.k.a. curl) of the magnetic field, divided by the permeability of free space, is equal to the physical current, j, plus the displacement current ε0.∂E/∂t.
When we perform our Galilean transformation, curl(B) remains unchanged: but we need to replace j with i = j−V.ρ, E with E+V^B and ∂/∂t with ∂/∂t +v.∂/∂x, giving us j.Z0 −V.ρ.Z0 +∂E/∂t/c +V^∂B/∂t/c +v.∂E/∂x/c +v.V^∂B/∂x/c, which again differs from the un-transformed equation, unless
- V.ρ.Z0 = V^∂B/∂t/c +v.V^∂B/∂x/c +v.∂E/∂x/c
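The transformations above lean on two standard identities for a constant vector V: curl(V^B) = V.div(B) −(V·∇)B, and div(V^B) = −V·curl(B). The sketch below checks both by finite differences. The sample field B is a hypothetical divergence-free choice of mine, and V lies along x, so that (V·∇)B is v.∂B/∂x.

```python
import math

H = 1e-5  # finite-difference step

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def partial(field, point, axis, comp):
    """Central difference of component `comp` of `field` along coordinate `axis`."""
    up = list(point)
    dn = list(point)
    up[axis] += H
    dn[axis] -= H
    return (field(*up)[comp] - field(*dn)[comp]) / (2.0 * H)

def div(field, p):
    return sum(partial(field, p, k, k) for k in range(3))

def curl(field, p):
    return (partial(field, p, 1, 2) - partial(field, p, 2, 1),
            partial(field, p, 2, 0) - partial(field, p, 0, 2),
            partial(field, p, 0, 1) - partial(field, p, 1, 0))

V_SIZE = 0.6
V = (V_SIZE, 0.0, 0.0)  # constant velocity along x

def B(x, y, z):
    """Sample divergence-free field (hypothetical, chosen only for this check)."""
    return (z * math.sin(y), x + math.sin(z), y * math.sin(x))

def V_cross_B(x, y, z):
    return cross(V, B(x, y, z))

p = (0.3, -0.8, 1.1)  # arbitrary point

div_lhs = div(V_cross_B, p)
div_rhs = -dot(V, curl(B, p))

curl_lhs = curl(V_cross_B, p)
curl_rhs = tuple(V[k] * div(B, p) - V_SIZE * partial(B, p, 0, k)
                 for k in range(3))

print(div_lhs, div_rhs)
print(curl_lhs, curl_rhs)
```

For this sample field div(V^B) is visibly non-zero, which is the term that spoils the invariance of the div(E) equation.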
[Maxwell originally stated these equations in a form which required a separate name for each co-ordinate of each vector field; and a separate equation for each co-ordinate of the two vector equations above (the div(B) and div(E) ones are pure scalar equations), for a total of eight equations in ten functions of position and time. The footnotes to the third edition of Maxwell's classic text-book on the theory reveal that the vector notation was being developed, but was still somewhat poor. Oliver Heaviside ultimately cleaned up the mess, both putting the vector notation in its familiar form and expressing Maxwell's equations using it. The form
∇·B.c = 0 | ∇^E +∂B.c/∂(c.t) = 0 |
∇·E = ρ.c.Z0 | ∇^B.c −∂E/∂(c.t) = j.Z0 |
is more (nearly) usual for presentation of these equations.]
One way out of the bind was to suppose that light does indeed travel at a
fixed speed with respect to observers who are at rest: in effect, to say
that light's way of looking at the universe is privileged, in
a sense we long ago gave up presuming for our own perspective. This implies a
frame of reference which had universal applicability.
So, obviously, the first thing you try to do is measure your own velocity with respect to light's frame of reference. Measure the difference between observed physics and Maxwell's predictions for light's frame of reference. Funny thing: no trace of it. Are we, after all, the one reference frame the whole Universe bows to ? Then again, we know the Earth is spinning and that, in spinning frames of reference, gravity isn't as simple as Newton's laws (one needs centrifugal and Coriolis adjustments, at least) – nor, indeed, is electrodynamics – so obviously a spinning frame as the Universe's viewpoint is right out. But the labs in which the ingenious interferometry was done were sat on the Earth's crust, so spinning. Even if so slowly spinning that the effect was too small to be noticed, the velocity of a lab in one place on the Earth is different from another elsewhere: the experimenters would never have considered for a moment the absurd idea that they had happened to select, as the location for their lab, the one location on the Earth which the reference frame of light honoured. What would happen if they had an Earth-quake ?
So, obviously, if one lab sees light's velocity as independent of direction (where we might expect it to appear to move slower, as we catch up on it, if it is going in the same direction as us; but faster, as we rush apart, if it goes the other way) then so will any other lab. If that's true all over the Earth, it has to be true for any lab anywhere, in the sense that if you took the lab, sealed it up tight, dragged it to somewhere else and did experiments in the lab while it was there, you'd still find things to be the same.
So, if light has a reference frame of its own, in which Maxwell's equations
take their neatest form, the Universe must be playing some trick which conceals
our movement, relative to that frame, from us. Fortunately, as this idea began
to take shape, we had Henri Poincaré on hand to point us away from the
metaphysical trap. An illusion that convincing is real. In more modern terms,
physics is about describing the illusion the Universe gives us that we know
as reality. So, in fact, the speed of light is the same to all observers.
With hind-sight, of course, it's all obvious. The great thing Albert
Einstein did was to grab hold of the beautiful truths (physics looks the same to
any observer, Maxwell's equations should take their cleanest form) and hold
firm to them while struggling free of the obvious presumption which
permeated the very culture from which he had learned those truths. Once he was
done, a new viewpoint entered that culture and has shaped the notions
of obvious that spring readily to my mind.
Special Relativity came about from recognizing that Maxwell's equations must look the same for all observers and, in particular, the speed of light must appear the same to all observers, regardless of the velocity of the light's emitter or receiver.
Special Relativity is effectively an amended description of the ideal
domain I attributed to Newton earlier: massive bodies do, indeed, move in
straight lines at steady speed; though the size, shape and mass of a body turn
out to depend on its velocity (in an understood way). It is what you get if you
use Lorentz invariance (inferred from Maxwell's
equations – it uses light as a yard-stick) in place of Galilean
invariance.
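To make the contrast concrete, here is a minimal sketch comparing how a velocity along the direction of travel looks to a passing observer under the two invariances. The Lorentz composition rule used, (u −v)/(1 −u.v/c²), is the standard Special Relativity result, which I am supplying: it is not derived in this text. The train speed and function names are illustrative.

```python
C = 299_792_458.0  # speed of light, metres per second

def seen_galilean(u, v):
    """Velocity u, as seen by an observer moving at v, under Galilean invariance."""
    return u - v

def seen_lorentz(u, v, c=C):
    """The same under Lorentz invariance (standard composition rule, assumed)."""
    return (u - v) / (1.0 - u * v / (c * c))

train = 50.0  # an observer passing on a train, in m/s (illustrative)

# Light going forwards and backwards along the track
for light in (C, -C):
    print(seen_galilean(light, train), seen_lorentz(light, train))

# At everyday speeds the two rules are practically indistinguishable
print(seen_galilean(30.0, -40.0), seen_lorentz(30.0, -40.0))
```

Under the Galilean rule the train sees light going faster one way than the other; under the Lorentz rule both directions come out at c, while everyday velocities still add in the familiar way to within a negligible correction.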
Now, the ideal domain only actually matches Newton's description as
seen by an observer who isn't accelerating or spinning. So, rather than
discussing the domain itself, we discuss the frame of reference of an
observer: and Newton's first law becomes an assertion that there are frames of
reference in which (except in so far as Newton's second law comes into play)
massive bodies move in straight lines at fixed speed. One of the things on
which Galilean and Lorentz invariance agree (blessed be) is that if one observer
sees massive bodies behaving like that, and sees a second observer as
non-spinning and moving at constant velocity, then the second observer also
agrees that each massive body moves at constant speed in fixed direction
(Galileo and Lorentz disagree, however, about the relationship among the fixed
velocities involved). It rapidly emerges that these frames of reference are
exactly the ones in which one sees the law of conservation of momentum
(a.k.a. inertia): this is inferred from Newton's laws. We therefore
characterize these frames of reference as inertial.
In the presence of gravitating bodies, massive bodies are always subject to
forces, so they accelerate. The total momentum of all massive bodies involved
is still conserved – as seen by an inertial frame, albeit this has to be
defined by reference to bodies far from the ones we are studying. One can study
how physics then looks to someone inside a laboratory and discover that, if the
laboratory is freely-falling (for example, in orbit about some massive body),
the physics looks mostly like physics in an inertial frame far from all
gravitating bodies: there are small errors due to tidal effects (the direction
to and distance from the centre of the orbit vary minutely from one part of the
lab to another) but these may be made as small as one cares (by making the orbit
huge enough by comparison to the scale of our laboratory and the sensitivity of
our ability to make measurements). Thus my freely-falling frame of reference
sees physics as locally ideal even though my laboratory's velocity (as
seen by an inertial frame of reference) is not constant – specifically,
the condition for perfection of the local ideal is that the acceleration of my
laboratory and the bodies at rest in it must exactly match the gravitational
forces on them.
So the natural next question must be what form do Maxwell's equations
take in the presence of gravitation ? Since we are no longer far
from all gravitating bodies we aren't in Newton's ideal domain: we need some
way to get information about the real world from what we know of the ideal
domain. In order to do so, we need to look again at the ideal domain.
Suppose a team of observers watches events in the ideal domain, but each
observer has a private little rocket-ship in which to zoom around. When an
observer's rocket is off, the observer drifts at constant velocity and
(unless the rocket-ship is spinning) sees the laws of physics in their simplest
form. When a rocket is active, its observer is accelerating, so things will
look different. We can, however, examine what this observer will see – by
considering what her colleagues see who are coasting by at the time. At any
given moment the accelerating observer has a particular velocity and a peer
drifting by with that velocity will be seeing the laws of physics in their
simple form: from this we may infer what the accelerated observer
experiences. Slightly harder, but in a similar vein, we can consider how physics
in the ideal domain looks to a spinning observer.
These thought-experiments had, even in Einstein's time, known results for
the Newtonian formalism: but under the Galilean invariance, they didn't seem to
say anything very constructive – and they gave more complicated physics.
The approach does, however, allow one to describe how physics looks to an
observer accelerating at a constant rate in a fixed direction. Within Newton's
description of gravitation, what such an observer sees looks like what an
observer at rest would see in the presence of a gravitational field whose
strength matches the acceleration: there would be errors inversely proportional
to the observer's distance from the source of that gravitational field (due to
variation in the direction, to the centre of gravity of the massive body causing
the field, from points in different parts of the observer's laboratory), but
these may be made arbitrarily small by being sufficiently far from the massive
body (whose mass has to increase as the square of the distance to attain the
required field strength). Formally, the simply accelerating case coincides with
the limiting case of a gravitational field of given strength due to an immensely
distant immensely² massive body. (The scale of immensity involved is
proportional to the scale of the laboratory, in which the observer is able to
conduct experiments, divided by the sensitivity of the measurements the observer
is able to make: for a laboratory a few metres across, and an observer only able
to measure things to one part in a million, the Earth's radius is suitably
immense.) This much was understood before Einstein's time.
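The scaling in that parenthesis can be sketched numerically. This is a rough illustration under my own assumptions: Newton's g = G.M/r² for the field strength, the ratio of laboratory size to distance as the estimate of the fractional error in the field's direction, and round values for G, g and the Earth's radius.

```python
G = 6.674e-11           # Newton's gravitational constant, SI units (assumed value)
FIELD = 9.81            # required field strength, m/s^2 (illustrative)
EARTH_RADIUS = 6.371e6  # metres (assumed round value)

def mass_for_field(g, r):
    """Mass needed to produce field strength g at distance r: M = g.r^2/G."""
    return g * r * r / G

def direction_error(lab_size, r):
    """Rough fractional error: the angle subtended by the lab at the source."""
    return lab_size / r

lab = 3.0  # a laboratory a few metres across

for multiple in (1, 10, 100):
    r = multiple * EARTH_RADIUS
    print(multiple, mass_for_field(FIELD, r), direction_error(lab, r))
```

Each tenfold increase in distance demands a hundredfold increase in mass and cuts the error tenfold. At one Earth radius, a three-metre laboratory already has an error below one part in a million.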
Einstein again applied Poincaré's rule and concluded that gravitation
and acceleration were the same thing in so far as they created the
illusion of being so. This led him to formalize the ideal domain as
being any domain in which Newton's first law holds – massive bodies move
at constant velocity except in so far as (second law) forces act on them. In
the presence of gravitational fields, we may think of these as being freely
falling laboratories: alternatively, reflecting the character of the first
law, we may call them locally inertial frames of reference.
On the way there, physics learned an important lesson: sufficiently
convincing illusions are indistinguishable from reality. As an immediate
example of this, we may consider a case examined well before Einstein's time: if
you're sat inside an accelerating box, what you'll experience inside the box is
just the same as if the box were sat at rest in a suitable gravitational field
(produced by a sufficiently distant body, so that we can't detect the tidal
effects of it). Indeed, if you're sat in a laboratory on the surface of the
Earth, you effectively experience both gravity, as Newton described it, and
a centrifugal force (whose strength is about 0.35% of that of gravity)
away from the Earth's spin axis due to the acceleration you experience due to
going round and round in circles: the result feels like gravity only it
doesn't quite point at the centre of mass of the Earth.
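The quoted proportion is easy to check on the equator, where the effect is largest. The sketch below uses round values that I have assumed (sidereal day, equatorial radius, surface gravity); none of them is given in the text.

```python
import math

SIDEREAL_DAY = 86164.1        # seconds per rotation of the Earth (assumed)
EQUATORIAL_RADIUS = 6.378e6   # metres (assumed)
SURFACE_GRAVITY = 9.81        # m/s^2 (assumed round value)

omega = 2.0 * math.pi / SIDEREAL_DAY             # angular speed, radians/second
centrifugal = omega * omega * EQUATORIAL_RADIUS  # acceleration at the equator
ratio = centrifugal / SURFACE_GRAVITY

print(f"centrifugal = {centrifugal:.4f} m/s^2, {100.0 * ratio:.2f}% of gravity")
```

Away from the equator the distance from the spin axis is smaller, so the proportion falls off towards the poles.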
Newtonian physics can also describe the tidal effects of other bodies, principally Sun and Moon, which arise because the Earth is a largely rigid body, so responds to the ambient gravitational field, due to all other bodies, as a whole, effectively averaging the ambient field over the whole body of the Earth. Earth thus accelerates at roughly the ambient gravitational field strength at its centre; points on its surface thus experience small tidal forces due to the ambient field's strength varying enough to be different, over at least some of Earth's surface, from what it is at Earth's centre. However, the magnitude of these tidal effects perturbs effective gravity experienced at the surface by only about one part in nine million; although this is enough to deform the oceans of our planet, it is small enough to be largely ignorable in the laboratory.
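For a sense of scale, here is a rough estimate under my own assumptions: the leading-order tidal term 2.G.M.R/d³ for a point on the Earth's surface directly facing the other body, with round values for each body's G.M and distance. On these figures the Moon's contribution alone comes out close to the proportion quoted above, with the Sun's somewhat under half of that.

```python
EARTH_RADIUS = 6.371e6   # metres (assumed)
SURFACE_GRAVITY = 9.81   # m/s^2 (assumed)

# G.M in m^3/s^2 and mean distance in metres; assumed round values.
BODIES = {
    "Moon": (4.9028e12, 3.844e8),
    "Sun": (1.32712e20, 1.496e11),
}

def tidal_fraction(gm, distance):
    """Leading-order tidal acceleration, 2.G.M.R/d^3, as a fraction of gravity."""
    return 2.0 * gm * EARTH_RADIUS / distance ** 3 / SURFACE_GRAVITY

moon_fraction = tidal_fraction(*BODIES["Moon"])
sun_fraction = tidal_fraction(*BODIES["Sun"])

for name, fraction in (("Moon", moon_fraction), ("Sun", sun_fraction)):
    print(f"{name}: one part in {1.0 / fraction:,.0f}")
```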
Einstein had the presence of mind to see, in this, a way of extending
Newton's ideal domain in which physical laws are at their
simplest. Newton's characterization of this domain is far from all
gravitating bodies: and his first law says that massive bodies, subject to
no extraneous forces, move in straight lines at steady speed.
The obvious next question was what effects do gravitating bodies
have ? and that necessarily involved studying gravitation.
Now, in fact, the experiments by which we learned about electromagnetism were
conducted on the Earth's surface, in a spinning frame of reference, in the
presence of gravity.
Einstein took the Lorentz invariance to heart (and now that it had been so
named, folk could begin identifying the older presumption as Galilean
invariance). His great leap was to accept the constancy of the speed of light
as the way to infer how clocks and lengths in different frames of reference
relate to one another – rather than accepting a
naïve a priori assumption about such relations.
Attempting to infer, from how things are in Newton's ideal domain, what the
appropriate replacement for Galilean
When Maxwell's equations of electrodynamics are put into their relativistic form (on a smooth manifold, M, with tangent bundle T and gradient bundle, G; these last two are dual to one another) we find ourselves dealing with:
direction is constant, though its size may vary. In the presence of sources, Einstein's field equation encodes the sources as the energy-momentum-stress tensor, which the equation asserts to be proportional to Ricci −g.(trace(Ricci/g)/2 −Λ), with Λ being the cosmological constant.
constant metric, g (i.e. D(g) is the zero of rank G⊗G⊗G = bulk(⊗, [G,G,G])), which is a tensor field on M of rank G⊗G; only the symmetric part of g is observable; the rank G⊗G is equivalent to the rank {linear map (G: |T)}; when g is construed in these terms, it is invertible and diagonalisable with the familiar signature [−,+,+,+] or the exact negative of this, depending whether you consider time or space to be real (the other is effectively imaginary).
source of the electromagnetic field; it is observable. Its time-like component is (spatial) charge density times speed of light; its space-like components are the usual 3-space current densities; it has the units of current per area. We obtain (G⊗T: D(j) |M) and G⊗T (which is synonymous with {linear map (G: |G)}, provided M is finite-dimensional) is amenable to the trace operator, τ[*,*]. The law of conservation of charge becomes: trace(D(j)) = 0. This can be restated (using the alternating derivative in preference to the covariant one) as d^(μ(j)) = 0, which implies that μ(j) is d^ some second rank alternating form, in any topologically trivial region.
source term for Einstein's field equation for gravitation. This implies that, for some constant K, g·F·g·F·g.K +Ricci is equal to the product of g with a scalar field, which is then trivially trace(K.g·F·g·F +Ricci/g)/4.
Separating out the energy-momentum-stress tensor into the above term due to electromagnetism and a separate one due to matter, and naming the latter M, we obtain:
Now, κ.ε0/2 times c³ is 4.π.G.ε0 which, as may easily be seen by comparing Newton's force law for gravity and Gauss' for electrostatics, is the square of a charge-to-mass ratio, roughly 86.16 nano-Coulombs per tonne. Let f be the result of scaling F by this charge-to-mass ratio: then f·g is a linear map from momentum to force, with units 1/time; we can now re-write the field equations in terms of f. First, we have Maxwell's equations:
in which the scaling of μ(j) is c³ divided by a current, about 2.7689e24 Amps; divide j by that current (to get an inverse area quantity), name that J, and the last equation becomes d^(μ(f)) = μ(J).c³. The force on the current, F·g·j becomes (with a little rearrangement) 2.f·g·J/κ. With f in place of F, Einstein's gravitational equation becomes:
(in which g\R, pronounced g under R by analogy with R over g
for R/g, means g's inverse contracted on the left of R.) Next, observe
that the right-hand side is g·f·g·f·g +R minus some
scalar field times the metric, g. The left-hand side can be zero – or at
least, we must presume it can – so we can infer that, in the absence of
matter, g·f·g·f·g +R is the metric multiplied by
some scalar field. Dividing it by the metric, we get
f·g·f·g +g\R as the identity linear map times the same
scalar field: we can take its trace, knowing that the trace of the identity is 4
(or, rather, the dimension of space-time) and infer that the scalar field must
be equal to trace(f·g·f·g +g\R)/4, which isn't quite what
the above equation told us – the two differ by trace(g\R/4) +Λ.
This implies that one of the following is true:
An obvious thing to check at this point is that τ[*,0,*](D(T)) is zero, which should hold true for T = g·F·g·F −trace(g·F·g·F)/4, at least in the absence of matter.
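As a check on the arithmetic for the charge-to-mass ratio quoted earlier: equating Newton's G.m.m/r² with Coulomb's q.q/(4.π.ε0.r²) gives q/m = √(4.π.G.ε0). The sketch below evaluates this with values of G and ε0 that I have assumed. The last digits of the result depend on the value taken for G, which is only known to a few significant figures.

```python
import math

G = 6.674e-11                 # Newton's constant, m^3/kg/s^2 (assumed value)
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m (assumed value)

# Charge-to-mass ratio at which gravitational and electrostatic forces match.
ratio_coulomb_per_kg = math.sqrt(4.0 * math.pi * G * EPSILON_0)

# Convert to nano-Coulombs per tonne: 1 tonne = 1000 kg, 1 C = 1e9 nC.
ratio_nC_per_tonne = ratio_coulomb_per_kg * 1000.0 * 1e9

print(f"{ratio_nC_per_tonne:.2f} nC per tonne")
```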
Written by Eddy.