Measure is what generalises, to whatever dimension your space has, the notion that is length on a line, area in a plane and volume in our familiar three-dimensional world. It has its simplest form on vector spaces, which makes it possible to define it on smooth manifolds; and it retains enough of its simple form in a Euclidean space that I'll take the trouble to describe it there. The fact that it's invariant under shears is a clue that it doesn't actually depend on what metric one uses – which determines distances and angles, thereby putting numbers to the measure – so I'll address myself to the Euclidean space stripped of its metric.

Measure is invariant under translations, rotations and shears; its magnitude is invariant under reflections, albeit the sign may change; and scaling in each of several directions independently scales measure by the product of the scalings applied to the different directions. Most of those transformations (even, indeed, the shear) would normally be expressed in terms of the metric that I've stripped away from my Euclidean space, so I'm going to need to do a little work to construct the things that take their place.

First, though, let me summarise what a Euclidean space looks like without its distances and angles. There are positions in it, called points. Between any two positions, there is a displacement. The displacements form a vector space (or at least a module over some division ring with characteristic zero), in which we can add displacements and scale them; the results are displacements. We can add any displacement to any point in our space; the result is a point in our space. We cannot add points or scale them; we can only add them to displacements and ask for the displacement from one to another; this displacement, when added to the former, gives the latter.

Formally, we have collections {points} and {displacements}; the latter forms a vector space, with scaling by number and an addition between displacements; and we have an addition between points and displacements for which:

- For any point P and displacement d, P +d = d +P is a point;
- For any displacement d, ({points}| P +d ←P |{points}) is iso; this mapping is called translation through d;
- For any point P, ({points}: P +d ←d |{displacements}) is iso; and
- For displacements d, e and point P, P +(d +e) = (P +d) +e.
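These rules can be sketched in code. What follows is only a minimal illustration, assuming a two-dimensional space with exact rational coordinates; the class names `Point` and `Displacement`, and the choice of coordinates at all, are assumptions made for the example rather than part of the structure described above.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Displacement:
    """A vector: displacements can be added together and scaled."""
    x: Fraction
    y: Fraction

    def __add__(self, other):
        if isinstance(other, Displacement):
            return Displacement(self.x + other.x, self.y + other.y)
        if isinstance(other, Point):
            return other + self  # d + P = P + d
        return NotImplemented

    def __rmul__(self, scalar):
        # scalar * displacement is again a displacement
        return Displacement(scalar * self.x, scalar * self.y)


@dataclass(frozen=True)
class Point:
    """A position: points cannot be added to each other or scaled."""
    x: Fraction
    y: Fraction

    def __add__(self, other):
        if isinstance(other, Displacement):
            return Point(self.x + other.x, self.y + other.y)
        return NotImplemented  # Point + Point is deliberately undefined

    def __sub__(self, other):
        # The displacement from `other` to `self`.
        if isinstance(other, Point):
            return Displacement(self.x - other.x, self.y - other.y)
        return NotImplemented


# Hypothetical sample values.
P, Q = Point(1, 2), Point(4, 1)
d, e = Displacement(3, -1), Displacement(0, 5)
```

Here `Q - P` plays the role of the displacement from P to Q, and `P + P` raises a `TypeError`, mirroring the fact that points cannot be added.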

As {displacements} is a vector space, it has a dual, the vector space of linear maps from it to {scalars} (the numeric values by which we can scale displacements), in the usual way; and mappings from it to other vector spaces (over the same scalars) are linear precisely if they respect addition and scaling, in the usual way. A function from {points} is described as linear, or as varying linearly, precisely if its differences between points depend only on the displacements between the points; that is, a function (: f :{points}) is linear if there is some linear (: df |{displacements}) for which, for any point P and displacement d, f(P) +df·d = f(P +d).

I'm using df for the induced map on displacements because, indeed, this is the derivative of f; however, for now, I don't want or need to involve differentiation. That the derivative is the same at every point (so that I can write it as df rather than as df(P) for some P) follows naturally from the specification just given. In what follows, d(T(f, u)) is in fact the identity plus u×df, since T(f, u) maps P +d to P +d +f(P).u +(df·d).u; but this is not relevant to the discussion. Note, in passing, that I'll tend to use capital letters as names for points, e.g. P, Q.

So now I have the tools to define a type of linear transformation that shall end up letting me characterise measure. For any linearly varying ({scalars}: f |{points}) and any displacement u, the transformation T(f, u) = ({points}: P +f(P).u ←P |{points}) scales measures by 1 +df·u.
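As a sanity check on that claim, here is a minimal sketch in two dimensions. It assumes coordinates in which f(P) = a·P + c (so df·u is the dot product a·u) and uses the usual determinant formula for a triangle's signed area as a stand-in for measure; the helper names `make_T`, `signed_area` and `measure_scaling` are made up for the illustration. Any other choice of measure would differ only by a constant factor, which cancels in the ratio.

```python
from fractions import Fraction


def make_T(a, c, u):
    """Build T(f, u) for f(P) = a[0]*P[0] + a[1]*P[1] + c, so df·u = a[0]*u[0] + a[1]*u[1]."""
    def T(P):
        k = a[0] * P[0] + a[1] * P[1] + c   # f(P)
        return (P[0] + k * u[0], P[1] + k * u[1])
    return T


def signed_area(A, B, C):
    """Signed area of triangle ABC, via the determinant of B - A and C - A."""
    return Fraction(1, 2) * ((B[0] - A[0]) * (C[1] - A[1]) - (B[1] - A[1]) * (C[0] - A[0]))


def measure_scaling(T, triangle):
    """Ratio of the image triangle's signed area to the original's."""
    A, B, C = triangle
    return signed_area(T(A), T(B), T(C)) / signed_area(A, B, C)


triangle = ((0, 0), (1, 0), (0, 1))
a, c = (2, 1), 3                         # hypothetical f(x, y) = 2x + y + 3

stretch = make_T(a, c, (1, -1))          # df·u = 2 - 1 = 1
shear = make_T(a, c, (1, -2))            # df·u = 2 - 2 = 0
translation = make_T((0, 0), 5, (1, 1))  # constant f, so df = 0

print(measure_scaling(stretch, triangle))      # 1 + df·u = 2
print(measure_scaling(shear, triangle))        # 1
print(measure_scaling(translation, triangle))  # 1
```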

When f is constant, so df is zero, this transformation is just a translation, since f(P).u is the same for all P; and df·u = 0, so it scales measures by 1, i.e. measure is invariant under translation. When we have more than one direction in our space, i.e. there are displacements that are not the results of scaling one another, every linearly varying ({scalars}: f |{points}) does have some non-zero displacements u for which df·u is zero; for these, T(f, u) is a shear, which also conserves measure.

What if df·u is −1 ? Then, for any point P, f(P +f(P).u) = f(P) +f(P).df·u = f(P) −f(P) = 0; every output of T(f, u) is in ({0}: f |), the set of points where f is zero. This is necessarily a space of co-dimension 1; we have projected our space down to it and collapsed the u direction out of it. Thus, indeed, measures are scaled by zero, as claimed.

Next stop, −2; this just means using twice the u that we used in the case of −1, so we're going to reflect in the sub-space ({0}: f |) on which f is zero. (Note that this won't always be a reflection in the common sense, where the invariant plane is perpendicular to the direction of movement; we have no notion of perpendicular and, indeed, this is generally a combination of a reflection and a shear. None the less, every reflection *is* one of these.) With Q = T(f, u, P) = P +f(P).u we get f(Q) = f(P) +f(P).df·u = f(P) −2.f(P) = −f(P) so T(f, u, Q) = Q +f(Q).u = Q −f(P).u = P; thus T(f, u)∘T(f, u) is the identity and T(f, u) is self-inverse.
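The two special cases, df·u = −1 and df·u = −2, can be checked on a small grid of points. This is a sketch under assumptions: two dimensions, coordinates, and a hypothetical f(x, y) = x + 2y − 4 chosen only for illustration.

```python
from fractions import Fraction


# Hypothetical example: f(x, y) = x + 2y - 4, so df = (1, 2).
def f(P):
    return P[0] + 2 * P[1] - 4


def T(u):
    """Return T(f, u) as a function on points (coordinate pairs)."""
    return lambda P: (P[0] + f(P) * u[0], P[1] + f(P) * u[1])


u = (1, -1)                         # df·u = 1 - 2 = -1
project = T(u)
reflect = T((2 * u[0], 2 * u[1]))   # twice u, so df·(2u) = -2

samples = [(Fraction(x), Fraction(y)) for x in range(-2, 3) for y in range(-2, 3)]

# df·u = -1: every image lies where f is zero.
lands_on_zero_set = all(f(project(P)) == 0 for P in samples)
# df·u = -2: f changes sign and applying the map twice restores P.
negates_f = all(f(reflect(P)) == -f(P) for P in samples)
self_inverse = all(reflect(reflect(P)) == P for P in samples)
```

All three flags come out true for this f and u; the grid is of course only a spot check, not a proof.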

So we have self-inverse reflections; and we can obtain rotations (and translations, for that matter) as composites of reflections, so this should take care of rotations, too.

If our ({scalars}: f |{points}) isn't constant, there is some point P and displacement h for which f(P +h) = f(P) +k for some non-zero scalar k = df·h; as it's non-zero, we can divide a displacement by it and scale by another scalar to get −h.f(P)/k; this is a displacement, so we can add it to P and find f(P −h.f(P)/k) = f(P) −df·h.f(P)/k = f(P) −f(P) = 0; thus, if our linearly varying scalar function isn't constant, there is some point at which it is zero. If our space has more than one independent direction, those displacements v with df·v = 0 can be added to this point to get an invariant sub-space. In any case, we can pick a basis (empty if there really are no non-zero v with df·v = 0) of the displacements within the invariant sub-space.

If f is constant, T(f, u) is a translation; otherwise, if df·u is zero we have a shear. Otherwise, df·u is non-zero and T(f, −u/df·u) is the projection discussed above, that maps each point to a point on the invariant sub-space and thus scales measures by a factor of zero. We can thus write any point P as (P −f(P).u/df·u) +f(P).u/df·u in which the first, bracketed, term is in the invariant plane and the last is parallel to u. When T(f, u) maps P to P +f(P).u, it just scales the last term by 1 +df·u, so is a scaling of the u-wards component by this factor (albeit the other components, that are unscaled, aren't perpendicular to u, as we have no metric to determine what perpendicular means). Indeed, given a metric, one could construct an orthonormal basis of the invariant sub-space's displacements, extend it to an orthonormal basis of the full space and re-write this as a scaling of the component in this perpendicular direction combined with a shear (or two) parallel to the invariant sub-space.
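The splitting of P into a part on the invariant sub-space and a part along u can be illustrated numerically. This is a minimal sketch, assuming two dimensions and coordinates in which f(P) = a·P + c; the function names `decompose` and `T`, and the sample values, are hypothetical.

```python
from fractions import Fraction


def decompose(a, c, u, P):
    """Split P into a point Z with f(Z) = 0 plus a multiple t of u.

    f(P) = a[0]*P[0] + a[1]*P[1] + c; requires df·u = a·u to be non-zero.
    Returns (Z, t) with P = Z + t.u, where t = f(P)/df·u.
    """
    df_u = a[0] * u[0] + a[1] * u[1]
    if df_u == 0:
        raise ValueError("df·u is zero: T(f, u) is a shear or a translation")
    t = Fraction(a[0] * P[0] + a[1] * P[1] + c) / df_u
    Z = (P[0] - t * u[0], P[1] - t * u[1])
    return Z, t


def T(a, c, u, P):
    """Apply T(f, u) to P: move P by f(P) times u."""
    k = a[0] * P[0] + a[1] * P[1] + c
    return (P[0] + k * u[0], P[1] + k * u[1])


a, c = (1, 2), -4          # hypothetical f(x, y) = x + 2y - 4
u = (3, 0)                 # df·u = 3
P = (5, 1)                 # f(P) = 3

Z, t = decompose(a, c, u, P)
scale = 1 + (a[0] * u[0] + a[1] * u[1])   # 1 + df·u
# T(f, u) leaves Z alone and scales the u-wards part t.u by 1 + df·u.
rebuilt = (Z[0] + scale * t * u[0], Z[1] + scale * t * u[1])
matches = T(a, c, u, P) == rebuilt
```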

So the T(f, u) provide us with a set of linear transformations of our space that scale measures in predictable ways. When we compose such transformations, the composite scales measures by the product of the scalings applied by the successive transformations; this lets us build more complex transformations out of the T(f, u), still scaling measures in predictable ways; in particular, it lets us select the measure-conserving transformations that result and deliberately construct such transformations (e.g. exploiting a carefully-chosen positive definite quadratic form on displacements, regardless of whether it accords with anyone's sense of distance).
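To see the product rule for composites at work, one can look at the action of T(f, u) on displacements, which sends d to d + (df·d).u and does not depend on the constant part of f. The sketch below assumes two dimensions and a basis, writes that action as a 2×2 matrix and uses the determinant as the measure-scaling factor; `linear_part`, `matmul` and `det` are illustrative helpers, not anything defined in the text.

```python
from fractions import Fraction


def linear_part(a, u):
    """Matrix of d -> d + (df·d).u on displacements, with df given by the row a."""
    return [[1 + u[0] * a[0], u[0] * a[1]],
            [u[1] * a[0], 1 + u[1] * a[1]]]


def matmul(M, N):
    """Product of two 2x2 matrices: apply N first, then M."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


stretch = linear_part((1, 0), (1, 0))                 # df·u = 1, scales by 2
squeeze = linear_part((0, 1), (0, Fraction(-1, 2)))   # df·u = -1/2, scales by 1/2
shear = linear_part((0, 1), (1, 0))                   # df·u = 0, scales by 1

composite = matmul(shear, matmul(squeeze, stretch))
print(det(stretch), det(squeeze), det(shear))   # 2 1/2 1
print(det(composite))                           # 1
```

The composite here stretches one direction, squeezes another and shears, yet conserves measure because the individual factors multiply to 1.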

Written by Eddy.