In the course of the development of the science of physics, three intimately related differential operators emerged, with rôles pivotal to the abstract formalization of the laws of physics as they were understood before the ramifications of electromagnetism displaced the three-dimensional model of space which is the home of these three. They ultimately emerge as aspects of a unified differential operator, which exists in spaces of any dimension.
Any real three-dimensional vector space, V, is isomorphic to the simple
vector space {lists ({reals}:|3)} of lists of three real numbers. Our familiar
notion of length of vectors (construed as displacements in space) gives us
an inner
product on these, combining two vectors, u and v, to yield a
real number, which I'll write u*v. There is
also a closely-related
vector-valued outer
product, which I'll write as u^v (on this page;
elsewhere, I use ^ for the antisymmetrising tensor product). We can
characterize the former by a linear isomorphism (dual(V):g|V), a.k.a. a
positive-definite symmetric bilinear form, or metric; then u*v =
g(u,v) = g(v,u). We can characterize the outer
product by a square root, m, of g's
determinant; i.e. a measure, or alternating form on V of rank 3, satisfying
det(g) = m×m; then u^v is the result of contracting u and v with m, to
produce a member of dual(V), which is then fed to g's inverse to turn it into a
vector in V. When identifying V with {lists ({reals}:|3)}, it is usual to choose
(co-ordinates in V, i.e.) an isomorphism for which g and m take the forms

    g([a,b,c], [x,y,z]) = a×x + b×y + c×z,
    m([a,b,c], [p,q,r], [x,y,z]) = a×(q×z − r×y) + b×(r×x − p×z) + c×(p×y − q×x)
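In these co-ordinates, g acts as the ordinary dot product and m as the determinant of the matrix whose rows are its three arguments; contracting two vectors with m and feeding the result through g's inverse recovers the familiar cross product. A minimal numpy sketch (the sample vectors are my own, purely for illustration):

```python
import numpy as np

# Sample vectors, chosen arbitrarily for illustration.
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# In these co-ordinates g is the identity, so u*v is the familiar dot product:
g = np.eye(3)
inner = u @ g @ v                      # = np.dot(u, v)

# m is the determinant of the matrix whose rows are its three arguments;
# it is an alternating form of rank 3, and det(g) = 1 = m×m as required.
def m(a, b, c):
    return np.linalg.det(np.stack([a, b, c]))

# Contracting u and v with m leaves the co-vector w -> m(u, v, w); reading
# off its components against g's inverse (here the identity) recovers u^v,
# the familiar cross product.
outer = np.array([m(u, v, e) for e in np.eye(3)])
```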
In such a system of co-ordinates, the three differential operators used in Heaviside's characterization of Maxwell's equations of electrodynamics are:

grad, which maps a scalar field ({reals}:h|V) to a vector field
(V:|V) which describes the gradient
of h; grad(h) is also written
∇h; its co-ordinates at [x,y,z] are

    [∂h/∂x, ∂h/∂y, ∂h/∂z]
curl, which maps a vector field (V:f|V) to a vector field (V:|V)
which describes the extent to which f circulates
(i.e. goes round in
circles), more or less. It is usual to write curl(f) as ∇^f, as if it
were an outer product. If f's co-ordinates are [u,v,w] then curl(f)'s
co-ordinates at [x,y,z] are

    [∂w/∂y − ∂v/∂z, ∂u/∂z − ∂w/∂x, ∂v/∂x − ∂u/∂y]
div, which maps a vector field (V:f|V) to a scalar field ({reals}:|V)
which describes the extent to which f diverges
or points away from the
place where it is evaluated. It is usual to write div(f) as ∇*f, as if it
were an inner product. If f's co-ordinates are [u,v,w] then div(f) is

    ∂u/∂x + ∂v/∂y + ∂w/∂z
Thanks to the symmetry of second derivatives, we obtain two crucial results: for any scalar field h and vector field f = [u,v,w], ∇^(∇h) = curl(grad(h)) is
the zero vector field, and ∇*(∇^f) = div(curl(f)) is the zero scalar field.
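Both identities can be checked symbolically for arbitrary smooth fields; a sympy sketch:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
h = sp.Function('h')(x, y, z)
u, v, w = (sp.Function(n)(x, y, z) for n in ('u', 'v', 'w'))

def grad(s):
    return [sp.diff(s, c) for c in (x, y, z)]

def curl(f):
    a, b, c = f
    return [sp.diff(c, y) - sp.diff(b, z),
            sp.diff(a, z) - sp.diff(c, x),
            sp.diff(b, x) - sp.diff(a, y)]

def div(f):
    a, b, c = f
    return sp.diff(a, x) + sp.diff(b, y) + sp.diff(c, z)

# The symmetry of second derivatives makes both of these vanish identically.
curl_grad_h = [sp.simplify(t) for t in curl(grad(h))]
div_curl_f = sp.simplify(div(curl([u, v, w])))
```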
Furthermore, in any topologically trivial region of three-dimensional space
(e.g. one that can be continuously shrunk to a point within itself), any vector
field f for which ∇*f = 0 throughout the region is necessarily the curl of some
vector field; and any vector field f for which ∇^f is zero throughout the region
is necessarily the gradient of some scalar field. However, the same does not
hold true in a topologically non-trivial region (i.e. one with any holes
or cuts
in it); indeed, the exceptions provide a description
(the de Rham cohomology) of the topology of
the region.
The presence of an underlying unity among grad, div and curl was noticed fairly soon after they emerged as relevant to physics; it is fairly well expressed by the identification

    ∇ = [∂/∂x, ∂/∂y, ∂/∂z]

as a vector differential operator
which, combined with the use
of ^ and *, at least gives grad, div and curl formulaic coherence, even if some
might be uncomfortable with the conflation of scalar differential operators and
the vector notation. However, while equivalents of grad and div can thus be
described for vector spaces of other dimensions than 3, it is not clear what to
make of curl in these cases, since its definition only appears to make sense in
three dimensions – in 1+n dimensions, its natural analogue will combine n
vectors, rather than two of them. It turns out that grad and div are (albeit
somewhat disguised by use of the metric) the first and last in a unified chain
of differential operators; the chain's length is equal to the dimension of the
vector space and curl does indeed correspond to the middle item in the chain for
three dimensions.
Suppose we were to replace our co-ordinates with ones that vary only half as fast; [X,Y,Z] = [x/2, y/2, z/2]. We'd get [∂/∂X, ∂/∂Y, ∂/∂Z] = 2.∇ while an ordinary vector quantity [u,v,w] would want to be described by [u/2,v/2,w/2], scaling the same way as the co-ordinates. This contrast is ∇'s way of telling us it is really a co-vector entity, belonging to the dual of our vector space; the reason we were able to get away with treating it as a vector, above, is that we carefully chose co-ordinates in which our metric, i.e. *, has the particular simple form given above. A re-scaling of co-ordinates would oblige us to change the formula for * if we're to continue getting the same lengths for our vectors; that, in turn, would oblige us to change around the formula for ^ if we're to get the same vector (expressed in its modified co-ordinates) when we combine (the co-ordinates of) two vectors.
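The chain rule makes the contrast explicit; here is a small sympy sketch in one dimension, with a sample field h = x³ of my own choosing:

```python
import sympy as sp

x, X = sp.symbols('x X')

# A sample scalar field, and the half-speed co-ordinate X = x/2, i.e. x = 2*X.
h = x**3
H = h.subs(x, 2*X)          # the same field described in the new co-ordinate

dh_dx = sp.diff(h, x)       # 3*x**2
dh_dX = sp.diff(H, X)       # 24*X**2: twice dh_dx evaluated at x = 2*X

# Halving how fast the co-ordinate varies has *doubled* the gradient
# component: the opposite of how a vector's components transform, which
# is exactly the hallmark of a co-vector.
```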
What's happened is that we've systematically used our measure to identify antisymmetric third rank entities with scalars; and antisymmetric second rank entities with first rank ones, i.e. vectors and co-vectors; at the same time, we've been using our metric to identify co-vectors with vectors and, implicitly, to convert between co-vector and vector factors in second rank tensor entities. Thus several ranks of tensor entity have been conflated with one another, hiding what's really going on.
Now, a co-vector is just a linear map from vectors to scalars, i.e. a member
of dual(V) = {linear ({reals}:|V)}, the dual of our original vector space,
V. This fits naturally with what differentiation does: if I have a function
({reals}:f|V), its derivative at some point in V, f'(v), is supposed to be
something that I can multiply
a small displacement by to get a good
approximation to the change in f between v and a point thus-displaced from
v; f(v+e)−f(v) should just be proportional to
e, for small enough
e. This begs for f'(v) to be a linear map, which takes e as input and yields
f(v+e)−f(v) as output. Since e is a displacement in V and f produces real
outputs, this makes f'(v) a member of dual(V), i.e. a co-vector.
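A quick numerical illustration of this picture, with a sample scalar field of my own choosing: the co-vector f'(v), applied to a small displacement e, reproduces f(v+e)−f(v) to first order.

```python
import numpy as np

def f(p):                    # a sample scalar field ({reals}: f |V), V = R^3
    x, y, z = p
    return x*y + np.sin(z)

def f_prime(p):              # its derivative at p: a co-vector of partials
    x, y, z = p
    return np.array([y, x, np.cos(z)])

v = np.array([1.0, 2.0, 0.5])
e = 1e-6 * np.array([0.3, -0.7, 0.2])   # a small displacement in V

actual = f(v + e) - f(v)     # the change in f under the displacement e
linear = f_prime(v) @ e      # the co-vector f'(v) applied to e
```

The two agree up to terms of second order in e, i.e. to roughly the square of e's size.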
Indeed, if we have a function between two linear spaces, (U:f|V), we'll
likewise want its derivative, at any v in V, to be something to which we can
give a small displacement, e, in V and get out a proportional
small
change in the value of f, f(v+e)−f(v), in U. Thus f'(v) must be a linear
map (U:f'(v)|V), which is to say a tensor of rank U⊗dual(V). Thus
differentiating a scalar field should yield a co-vector field; differentiating
one of those should yield a second rank tensor of type dual(V)⊗dual(V);
differentiating one of those should yield a third rank tensor of type
dual(V)⊗dual(V)⊗dual(V) and so on.
The other ingredient we need to throw into the pot is antisymmetrization; this comes into play because second derivatives are symmetric, so antisymmetrization annihilates them. To do antisymmetrization, we need to rearrange the order of the tensor factors in a quantity, then add and subtract some combination of the thus-rearranged tensors; this is only possible if all rearrangements have the same tensor rank, which in turn requires the original tensor's rank to be simply some single vector space combined with itself repeatedly – like the dual(V)⊗…⊗dual(V) ranks that arise naturally from repeated application of differentiation, as just discussed.
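For the rank two case this is easy to see concretely: the second derivative of a scalar field is its matrix of second partials (the Hessian), a tensor of type dual(V)⊗dual(V), and antisymmetrizing it gives zero. A sympy sketch, with a sample field of my own choosing:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = (x, y, z)
h = sp.exp(x) * sp.sin(y) * z        # a sample scalar field

# Differentiating twice yields a dual(V)⊗dual(V) tensor: the Hessian.
hess = sp.Matrix(3, 3, lambda i, j: sp.diff(h, coords[i], coords[j]))

# Symmetry of second derivatives means antisymmetrization annihilates it.
antisym = (hess - hess.T) / 2
```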
… unfinished
Written by Eddy.