This page is undergoing upheavals, delegating most of what it used to say to a page on Leibniz operators which covers the relevant ground better.

Differentiation on a smooth manifold

It turns out that we have some freedom as to how we define differentiation on a smooth manifold. This arises out of the fact that our notion of `the same' as we go from the gradient space at one point on the manifold to that at another is somewhat arbitrary: we can only do as much as can be deduced from continuity, which only really suffices to say `not too different'. The freedom to choose our notion of constancy, ie sameness, leads to a freedom of choice of differential operator (which is what measures the rate of deviation from `same').

The good news is that we do have a well-defined differential operator on scalar fields, namely the gradient operator, d. A natural condition to impose on any differential operator we are to use will, therefore, be that it agrees with d on scalar fields. But first we must decide what a differential operator is ...

Differential operators

One can define differentiation on functions between vector spaces. Let U and V be two vector spaces with (V:f:U) some function between them. Let (dim:b:V) be some basis of V, with (dim:p:dual-V) the implied dual basis of V's dual: this is defined by p(i).b(j) being 1 when i=j but 0 otherwise. [Here, dim is some arbitrary index set: if it is finite we usually take it to be a natural number, called the dimension of V, but it could equally be a set of names, such as {left, forward, up, future}, without any implied ordering; in what follows, I assume we can define summation over the members of dim.] This makes sum(dim: i-> b(i)×p(i) :V⊗dual-V) equal to the identity on V, via the definitive isomorphism between V⊗dual-W and Linear(W,V) applied with W=V.
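
As a concrete check of the dual-basis construction, here is a minimal Python sketch for a two-dimensional V. The particular basis vectors, the helper names (dual_basis, pair, outer) and the use of exact fractions are illustrative assumptions, not part of the text above.

```python
from fractions import Fraction

def dual_basis(b):
    """Given a basis b = [b0, b1] of a 2-dimensional space (as coordinate pairs),
    return the dual basis p, for which p[i].b[j] is 1 when i == j and 0 otherwise."""
    (a, c), (e, f) = b  # b0 = (a, c), b1 = (e, f)
    det = Fraction(a * f - c * e)
    if det == 0:
        raise ValueError("the given vectors are not linearly independent")
    # The dual basis consists of the rows of the inverse of the matrix
    # whose columns are b0 and b1.
    return [(f / det, -e / det), (-c / det, a / det)]

def pair(s, v):
    """The scalar s.v obtained by applying a covector s to a vector v."""
    return sum(si * vi for si, vi in zip(s, v))

def outer(v, s):
    """v×s as a matrix: the linear map w -> v.(s.w)."""
    return [[vi * sj for sj in s] for vi in v]

b = [(2, 1), (1, 1)]   # an arbitrary (hypothetical) basis of V
p = dual_basis(b)

# sum over i of b(i)×p(i), which should be the identity on V
identity = [[sum(outer(b[i], p[i])[row][col] for i in range(2))
             for col in range(2)] for row in range(2)]
```

Here pair(p[i], b[j]) plays the role of p(i).b(j), and identity collects the entries of sum(dim: i-> b(i)×p(i) :).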

[Note: the × and ⊗ used here and following express the tensor product: given v in V and s in dual-W, v×s is the linear map (W: w-> v.(s.w) :V), in which s.w is a scalar; V⊗dual-W is the span, in {linear (W::V)}, of (: [v,s]-> v×s :). I'd ideally use &tensor; and &Tensor; where I've used × and ⊗, but &tensor; needs to appear many times in formulae, which will read better if it is a single character: and ⊗ gives a × inscribed in a circle (which is the symbol I'm used to) on at least one browser, Grail.]

The basic properties of a differential operator, as we are familiar with them from the world of (tensor, scalar and) vector spaces, are:


Co-vectorial nature

The derivative of a function (V:f:U) between vector spaces is a function (V: Df :dual-V ⊗ U). Thus D has `tensored on' a dual-V to the space of outputs of our function, f.

Note that dual-V⊗U is the transpose of U⊗dual-V, which is Linear(V,U) as above: also that sum(dim: i-> p(i)×b(i) :) is the identity on dual-V, so that Df is trivially equal to sum(dim: i-> p(i)×(b(i)Df) :), with each b(i)Df being in U.


Linearity

If I can add two functions, I can add their derivatives: and the answer will be the derivative of their sum: D(f+g) = Df + Dg. Derivatives also scale: D(c.f) = c.Df when c is a constant scalar.

The product rule

The derivative of a product is equal to a sum of two terms: each is the product of one multiplicand and the derivative of the other.

Now, doing this to a tensor product w×u gives us (Dw)×u and w×(Du): we need to add these to get D(w×u). With w= (V:w:W) and u= (V:u:U), D(w×u) is (V: :dual(V)⊗W⊗U), as is (Dw)×u. However, w×(Du) is (V: :W⊗dual(V)⊗U), so we need to massage this last into the same form as the other two if we're to be able to do addition. I'll write the appropriately massaged form of this as [D| w× |u] = sum(dim: i-> p(i)×w×(b(i)·Du) :) = τ([1,2,0], w⊗(Du)). The product rule then becomes

D(w×u) = (Dw)×u + [D| w× |u]
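
The rule D(w×u) = (Dw)×u + [D| w× |u] can be checked numerically in coordinates. The sketch below is only an illustration: the functions w and u, their hand-computed derivatives and the sample point are all hypothetical choices, and the dual-V index is stored first, so that D(w×u)[i][a][c] lies in dual(V)⊗W⊗U.

```python
def w(m):
    x, y = m
    return (x * y, x + y)

def u(m):
    x, y = m
    return (x * x, y ** 3)

def Dw(m):
    """Dw[i][a] = derivative of component a of w along coordinate i."""
    x, y = m
    return [[y, 1.0], [x, 1.0]]

def Du(m):
    """Du[i][c] = derivative of component c of u along coordinate i."""
    x, y = m
    return [[2 * x, 0.0], [0.0, 3 * y * y]]

def product(m):
    """Components of w×u at m, indexed [a][c]."""
    return [[wa * uc for uc in u(m)] for wa in w(m)]

def D_product_by_rule(m):
    """(Dw)×u + [D| w× |u]: in the second term the dual-V index of Du
    has been moved to the front, ahead of the index of w."""
    wm, um, dw, du = w(m), u(m), Dw(m), Du(m)
    return [[[dw[i][a] * um[c] + wm[a] * du[i][c] for c in range(2)]
             for a in range(2)] for i in range(2)]

def D_product_numeric(m, step=1e-6):
    """Central-difference estimate of D(w×u), indexed [i][a][c]."""
    result = []
    for i in range(2):
        hi, lo = list(m), list(m)
        hi[i] += step
        lo[i] -= step
        P, Q = product(hi), product(lo)
        result.append([[(P[a][c] - Q[a][c]) / (2 * step) for c in range(2)]
                       for a in range(2)])
    return result

m = (1.2, -0.7)
by_rule = D_product_by_rule(m)
numeric = D_product_numeric(m)
```

Without the re-ordering of indices in the second term, the two summands would live in different spaces and could not be added entry by entry.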

Quantification of variation

As one moves through a displacement δ in V, the value of the function, f, changes by f(m+δ) - f(m) = δ·Df(m) to first order in δ. One may eliminate the approximateness of this definition by converting it to something intrinsically differential: namely, the rate of change of f as one moves along a curve with tangent vector v is v·Df. (And it was for the sake of this notational convenience, v·Df being marginally more intelligible than (Df)·v, that I chose to use the transpose of the usual form of Linear(V,U), above.) In particular, this implies that the derivative of any constant function is zero.
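
To illustrate the reading of v·Df as a rate of change, the following sketch compares it with a finite-difference estimate for one function between two-dimensional spaces. The function f, its hand-computed derivative, the sample point and the helper names are assumptions made purely for illustration.

```python
def f(m):
    x, y = m
    return (x * x * y, x + y ** 3)

def Df(m):
    """Derivative of f at m, indexed as Df[i][a] = d f_a / d m_i,
    so the dual-V index comes first, matching dual-V⊗U."""
    x, y = m
    return [[2 * x * y, 1.0],
            [x * x, 3 * y * y]]

def contract(v, D):
    """v·Df: contract the vector v with the leading (dual-V) index of D."""
    return tuple(sum(v[i] * D[i][a] for i in range(len(v)))
                 for a in range(len(D[0])))

def rate_along(func, m, v, step=1e-6):
    """Central-difference estimate of the rate of change of func at m
    as one moves with tangent vector v."""
    plus = func(tuple(mi + step * vi for mi, vi in zip(m, v)))
    minus = func(tuple(mi - step * vi for mi, vi in zip(m, v)))
    return tuple((a - b) / (2 * step) for a, b in zip(plus, minus))

m, v = (1.5, -0.5), (0.3, 2.0)
exact = contract(v, Df(m))         # v·Df(m)
estimate = rate_along(f, m, v)     # should agree to first order
```

Applying rate_along to a constant function, such as lambda point: (7.0,), gives zero, in line with the last remark above.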

Symmetry of second derivatives

The derivative of the derivative of a function is symmetric: for any v, w in V, the values of v.(w.DDf) and w.(v.DDf) must be equal. Note that (V:f:U) yields (V: DDf :dual-V⊗dual-V⊗U), so both v.(w.DDf) and w.(v.DDf) are in U.
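
A rough numerical illustration of this symmetry, for a scalar-valued f on a two-dimensional space: differentiate along w and then along v, and compare with the opposite order. The function, the vectors and the step sizes below are hypothetical; different step sizes are used for the inner and outer differences so that the two orders are not the same arithmetic expression.

```python
import math

def f(m):
    x, y = m
    return x ** 3 * y ** 2 + math.sin(x * y)

def along(g, v, step):
    """Return the function m -> v·Dg(m), estimated by a central difference."""
    def derivative(m):
        plus = g(tuple(mi + step * vi for mi, vi in zip(m, v)))
        minus = g(tuple(mi - step * vi for mi, vi in zip(m, v)))
        return (plus - minus) / (2 * step)
    return derivative

m = (0.7, -1.2)
v, w = (1.0, 0.5), (-0.3, 2.0)

# v.(w.DDf): differentiate along w first, then along v
v_then_w = along(along(f, w, 1e-4), v, 2e-4)(m)
# w.(v.DDf): differentiate along v first, then along w
w_then_v = along(along(f, v, 1e-4), w, 2e-4)(m)
```

The two values agree only up to the accuracy of the finite differences, so any comparison needs a tolerance.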

Smooth Manifolds

When we come to look at analogous structure on a smooth manifold, M, one of the first kinds of examples (of something interesting to differentiate) that presents itself is a `section', f, of some tensor bundle, B, derived from the gradient, G, and tangent, T, bundles of M. For each point m of M, B gives us a vector space, B(m), of tensors at m: f is then a mapping (M: m->f(m) :B(m)) for which f(m) is in B(m) for every m. Sections, as these are known, arise naturally from the gradient and tangent bundles of the manifold: in the simplest case, B is either G or T. More generally, B is obtained from G and T using the standard tools of tensor algebra: so that a complete discussion of G, T and those standard tools suffices to cover B. Further, since G and T are dual to one another, at each point of M, discussion of either G or T covers most of the other.

When we come to look at gradient fields, we can use a basis at any point of M to obtain its dual, a basis of tangents at that point. This naturally leads to discussion of a `section' of bases of G: formally, a mapping bb= (M: m-> (dim: bb(m) :Gm) :{bases of Gm}), with (|bb) being the region of M on which this `local basis' of G is defined. One can turn bb inside-out (transpose it) to get b=(dim: i-> (M: m-> bb(m,i) :Gm) :gradient fields), so b(i,m) = bb(m,i), which is often more useful when gradient fields are principally under discussion, rather than gradients at a point.

As for gradients, so equally for tangent fields: in particular, we can obtain pp= (M: m-> (dim: pp(m) :Tm) :{bases of Tm}), dual to bb in the sense that, for each m in M and i,j in dim, pp(m,i).bb(m,j) is 0 unless i=j, in which case it is 1: call this when(i=j). This in turn implies p= (dim: i-> (M: m-> pp(m,i) :Tm) :tangent fields), so p(i,m) = pp(m,i) and p(i,m).b(j,m) = when(i=j), so that p(i).b(j) is the constant scalar field (M: m-> when(i=j) :{0,1}) for each i,j in dim. In this sense, p and b are mutually dual local bases of T and G respectively. Then a gradient field ((|p): w :) in G can be written as sum(dim: i-> b(i) (w.p(i)) :), with each w.p(i) being a scalar field. Likewise, a tangent field ((|b): v :) in T is sum(dim: i-> p(i) (b(i).v) :) with each b(i).v a scalar.
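
For a concrete instance of mutually dual local bases, take the plane with the origin removed and use polar coordinates. The sketch below assumes Cartesian components for both gradients and tangents, with bb(m) = [dr, dθ] and pp(m) the tangents along r and θ; these choices, the sample point and the helper names are illustrative only.

```python
import math

def local_bases(m):
    """Return (bb(m), pp(m)) at the point m = (x, y), in Cartesian components."""
    x, y = m
    r2 = x * x + y * y
    r = math.sqrt(r2)
    bb = [(x / r, y / r), (-y / r2, x / r2)]   # gradients dr and dθ
    pp = [(x / r, y / r), (-y, x)]             # tangents along r and along θ
    return bb, pp

def pair(s, v):
    """Contract a gradient with a tangent at the same point."""
    return sum(si * vi for si, vi in zip(s, v))

def expand(w, m):
    """Rebuild the gradient w at m as sum(dim: i-> b(i) (w.p(i)) :)."""
    bb, pp = local_bases(m)
    coefficients = [pair(w, pp[i]) for i in range(2)]
    return tuple(sum(bb[i][k] * coefficients[i] for i in range(2))
                 for k in range(2))

m = (1.0, 2.0)
bb, pp = local_bases(m)
duality = [[pair(bb[j], pp[i]) for j in range(2)] for i in range(2)]

x, y = m
w = (2 * x * y, x * x)      # the gradient of the scalar field x.x.y at m
rebuilt = expand(w, m)
```

Here duality[i][j] plays the role of p(i,m).b(j,m), and rebuilt should reproduce w up to rounding.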

Equally, this means that sum(dim: i-> b(i)×p(i) :G⊗T) is the identity linear map on G: that is, (M: m-> (Gm: w->w :Gm) :{linear (Gm: :Gm)}), or rather the restriction of this to m in (|p). [This depends on representing {linear (U::W)} as W⊗dual-U, with W=U=Gm so that dual-U = Tm.] Likewise, sum(dim: i-> p(i)×b(i) :T⊗G) is the identity on T, restricted to (|b).

Differential operators on a smooth manifold

The co-vectorial nature of the operator, its linearity and the product rule can be carried over without difficulty: they make the differential operator a Leibniz operator of rank G. Quantifying variation, though, is harder: and we cannot guarantee symmetry of second derivatives (which was a theorem in the flat world).

We can do a bit with quantification of variation - we know we can do it for scalar fields, since the definition of quantification of variation corresponds exactly to that of the gradient operator, d, on scalar functions. So we require that differential operators agree with d on scalar fields. Beyond that, to do quantification of variation we need to make some sense of f(m+δ) - f(m) when f(m+δ) is in S(m+δ) for some rank, S, of the tensor bundle, while f(m) is in S(m): which means we need, for each point, h, of some neighbourhood of m, a `constant' function F(h) = (M: n-> F(h,n) :S(n)) of the same rank as f with F(h,h)=f(h). This gives F(h,m) in S(m) alongside f(m) and, in a fairly reasonable sense, F(h,m) `=' f(h): so we can read f(m+δ)-f(m) as F(m+δ, m) -f(m).

Our differential operator will, like any Leibniz operator, annihilate the zero tensor field of every rank and all trace-permutation operators (when these last are considered as tensor fields), including the identity on each rank. These are, indeed, tensor fields we could sensibly have required to be constant: it is reassuring that they arise naturally (ie without having to be assumed). However, all the non-zero tensor fields this gives us are trace-permutation operators, whose ranks (when expressed as ⊗(n| :{T,G}) for some natural n) have as many T-factors as G-factors (so, in fact, n is even). Most ranks don't satisfy that (eg G, T, G⊗G), so in most ranks the only constant tensor field we have is zero. So we can't offer, except to a limited degree in some ranks, enough naturally constant tensor fields to construct (M:F:{(M::S)}), for arbitrary (M:f:S) and m in (|f), for which F(k,k) = f(k) and F(k) is constant for every k in some neighbourhood of m. Still, this just means we have some freedom to choose a differential operator, given which we express `F(k) is constant' as: D(F(k)) = (M: n-> 0 :G(n)⊗S(n)), the zero-section of (M::G⊗S). We can then, indeed, construct a differential operator out of this notion of constancy: we may sensibly expect this to be D.

We could proceed to describe the general differential operator and see what actions arise. This would require us to devise a physical theory in which, to obtain independence of choice of differential operator, we would have to identify the characteristics of the differential operator which must be compensated away to get physical processes independent of differential operator. While this will, doubtless, be most enlightening (once understood) and produce an interesting gauge theory, it looks much harder than I want to try just now. Besides, as we shall now see, there is an easier way forward.

Constancy of the metric

There is a natural sense in which we may think of our metric as the common ground on which all our physical theory will agree: this puts it in a natural rôle to be thought of as constant (even though, in the eyes of any particular chart, it will doubtless vary). It is therefore natural to look for the properties that a differential operator must possess if it is to annihilate the metric. If these lead to a contradiction, we'll have to abandon constancy of the metric: if they leave us, still, with some choice then we'll have to think of some other disambiguator. However, as we shall see, the constancy of the metric is attainable and, with one further assumption, by just one differential operator.

To simplify this examination we shall express everything we can in terms of the gradient operator, d, and the derivatives of the elements of a local basis of gradients. Given the above relationship between the derivatives of a basis and those of its dual, plus the product rule and the feasibility of expressing any tensor field as a sum of terms, each of which is the product of a scalar field and a tensor product of members of the basis and its dual: given all that, the derivatives of arbitrary tensor fields will follow from those of the basis, hence of its dual, and the coincidence of derivative with gradient for scalar fields.

So, let (dim: b :gradient fields) be a pointwise basis of the gradient fields in some neighbourhood and look, therein, at the zero derivative of the metric. The metric can be expressed in terms of the basis and its dual, so this is

0 = Dg

expand g in terms of b and g's action on p:

= sum(dim×dim: [i,j]->
D(((g.p(i)).p(j)).b(j)×b(i)) :)

apply the product rule:

= sum(dim×dim: [i,j]->
d((g.p(i)).p(j)) × b(j)×b(i) + ((g.p(i)).p(j)) (Db(j))×b(i)
+ ((g.p(i)).p(j)) sum(dim: k-> b(k)×b(j)×(p(k)Db(i)) :) :)

and expand in terms of the obvious basis of G⊗G⊗G:

= sum(dim×dim×dim: [n,m,l]-> b(n)×b(m)×b(l){
p(n)d(p(m).g.p(l)) + sum(dim: j-> (p(j).g.p(l)) (p(n)(Db(j))p(m)) :) + sum(dim: i-> (p(m).g.p(i)) (p(n)(Db(i))p(l)) :) } :)

Now, for this to be zero, each of its components must be zero: whence the value of the curly bracketed expression must be zero for every choice of l, m and n. We can now save ourselves a lot of trouble by looking into the value of

Gb(i,j,k) =
sum(dim: h-> (b(h).(g⁻¹).b(i)). {p(j).d(p(h).g.p(k)) + p(k).d(p(j).g.p(h)) - p(h).d(p(k).g.p(j))} :) / 2

in which b(h).(g⁻¹).b(i) is, for each i and h in dim, a scalar. The {bracketed} expression is also a scalar, for each j, k and h in dim. Thus, for each i, j and k in dim, Gb(i,j,k) is a scalar field on (|b). We can eliminate the g⁻¹ term by multiplying this by the gradient field g.p(h) for some fixed h in dim, to obtain the gradient field

g.p(h).Gb(i,j,k) =
b(i).{p(j).d(p(h).g.p(k)) + p(k).d(p(j).g.p(h)) - p(h).d(p(k).g.p(j))} / 2
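
To get a feel for Gb, here is a small numerical sketch under a simplifying assumption not made in the text: that b is the basis of coordinate gradients of a chart, so that each p(j).d(...) is just a partial derivative of a metric component. With the flat metric of the plane in polar coordinates (r, θ), Gb(i,j,k) should then reproduce the familiar Christoffel symbols, such as -r for Gb(r,θ,θ) and 1/r for Gb(θ,r,θ). The function names, the finite-difference step and the sample point are all hypothetical.

```python
def metric(m):
    """Components p(h).g.p(k) of the flat plane metric in polar coordinates."""
    r, theta = m
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def d_metric(g, m, step=1e-6):
    """dg[j][h][k]: central-difference estimate of p(j).d(p(h).g.p(k))."""
    dg = []
    for j in range(2):
        hi, lo = list(m), list(m)
        hi[j] += step
        lo[j] -= step
        P, Q = g(hi), g(lo)
        dg.append([[(P[h][k] - Q[h][k]) / (2 * step) for k in range(2)]
                   for h in range(2)])
    return dg

def Gb(g, m):
    """Gb[i][j][k] as defined above, in a coordinate basis."""
    ginv = inverse_2x2(g(m))
    dg = d_metric(g, m)
    return [[[sum(ginv[h][i] * (dg[j][h][k] + dg[k][j][h] - dg[h][k][j])
                  for h in range(2)) / 2
              for k in range(2)] for j in range(2)] for i in range(2)]

m = (2.0, 0.3)          # the point r = 2, θ = 0.3
G = Gb(metric, m)       # indices 0 and 1 stand for r and θ
```

For a basis that does not come from a chart, the p(j).d(...) terms are no longer plain partial derivatives, so this sketch does not apply as it stands.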


I haven't finished writing this page yet.

So, we've got a differential operator, D, on a manifold M. We've decided that its second derivative on scalar fields is symmetric: so any scalar field f has D^df = 0. Let's extend the definition of d to apply to each ×⊗(n| i-> G :) - that is G⊗...⊗G with n repeats of G - using (×⊗(n|:{G}): d : ×^(n|:{G})) = (: sum(I| i-> ⊗(1+n(i)| u(i) :G) :) = Dw, w-> sum(I| i-> ^(1+n(i)| u(i) :G) :) :).

maybe I need (^(N,G): d :) to map w to (1 - sum(1+N| i-> τ(i,N,...,i+1,1+N,i-1,...0) :)/N)·Dw where the τ's permutation means swap i with 1+N.

Now, suppose we have some ordinal N for which: for each n in N, (×^(n|:{G}): d∘d :×^(2+n|:{G})) is zero but, for each n in 1+N, span(×^(n|:{G}): d |) = ×^(1+n|:{G}). This is certainly true for N = 0 = empty: the only n in 1+N = 1 is then 0 = {}; so, for each m in M, ^({}|:{G(m)}) = R = ⊗({}|:{G(m)}) making ×^(n|:{G}) = {(M|:R)} = {scalar fields}. Consider any w in ×^(N|:{G}) with Dw = sum(I| i-> ⊗(1+n(i)| u(i) :G) :). Now, (×^(N| :G): d |) spans ×^(1+N| :G) by our inductive hypothesis: so each (1+N|u:G) in our sum for Dw can be written as f.dv for some scalar function f and some v in ×^(N|:G). This gives dw = Dw, since each dv is already wholly antisymmetric by definition: and (d∘d)(w) = sum(I| i-> d(f(i).dv(i)) :×⊗(1+N|:{G})). Now Df(i) is df(i), so d(f(i).dv(i)) is the (1+N)-antisymmetric part of D(f(i).dv(i)) = (df(i))⊗dv(i) + f(i).D(dv(i))

Maintained by Eddy.
$Id: differ.html,v 1.6 2009-08-09 14:37:08 eddy Exp $