Linearity

Consider two uniform binary operators thought of as addition, a uniform binary operator parallel to one of them viewed as multiplication and an action of the domain of this multiplication on the domain of the other addition: (L×L|+:L), (S×S|+:S), (S×S|.:S) and (S×L|·:L), or (:+:L), (:+:S), (:.:S) and (:·:L) for short. (:.:S) may distribute over (:+:S), may be associative or may left-associate over (:·:L); this last may left-distribute over (:+:L) or right-distribute over (:+:S). I'll describe (:·:L) as linear, in terms of (:.:S), (:+:S) and (:+:L), precisely if:

(:.:S) left-associates over (:·:L), i.e. (a.b)·k = a·(b·k) for every a, b in S and k in L;
(:·:L) left-distributes over (:+:L), i.e. s·(k+h) = s·k + s·h; and
(:·:L) right-distributes over (:+:S), i.e. (a+b)·k = a·k + b·k.

Considering S=L with (:+:S)=(:+:L), I'll describe a multiplication, (:.:S), on S as self-linear (in terms of this addition) precisely if it is linear in terms of the given addition and itself: that is, it distributes over the addition and left-associates over itself (thus it is associative). Notice that we haven't had to assume associativity of +.
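
To make this concrete, here is a minimal Python sketch, assuming the three linearity conditions just listed; the names add_L and act, and the toy choice of S as a few positive rationals acting componentwise on pairs, are my own, not part of the definition.

    # Checking the three linearity conditions for a toy action of positive
    # rationals S on pairs L = S x S, with everything done componentwise.
    from fractions import Fraction
    from itertools import product

    def add_L(k, h):                 # (:+:L), componentwise addition on pairs
        return (k[0] + h[0], k[1] + h[1])

    def act(s, k):                   # (:·:L), the action of a scalar s on a pair k
        return (s * k[0], s * k[1])

    S = [Fraction(1, 2), Fraction(2), Fraction(3, 4)]   # a few sample scalars
    L = list(product(S, repeat=2))                      # a few sample members of L

    for a, b in product(S, repeat=2):
        for k, h in product(L, repeat=2):
            # (:.:S) left-associates over (:·:L)
            assert act(a * b, k) == act(a, act(b, k))
            # (:·:L) left-distributes over (:+:L)
            assert act(a, add_L(k, h)) == add_L(act(a, k), act(a, h))
            # (:·:L) right-distributes over (:+:S)
            assert act(a + b, k) == add_L(act(a, k), act(b, k))
    print("toy action satisfies all three conditions on the samples")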

Define a self-linear action to be a parallel pair of (uniform) binary operators, thought of as addition and multiplication, with:

the addition
cancellable and associative;
the multiplication
self-linear in terms of that addition, cancellable and equipped with an identity, 1.

A scalar domain is just the range of a continuous epic self-linear action.

Linear Spaces

I'll describe a linear action (S×L|·:L) as a linear action of S on L whenever the additions and S's own multiplication are taken for granted. When, in such a case, the linear action is also taken for granted, I'll describe L as an S-linear space or a linear domain of S. If I introduce, in some discourse, an S-linear space L, I am introducing (and thereafter asking you to take for granted) S, L and the binary operators on them thus implied (except insofar as they may have already been introduced).

I shall, by default, take the additions and multiplications involved in a linear space for granted and denote them with their usual symbols. I refer to the (L|:L) functions (S| s-> (L| k-> s·k :L) |) as scalings of L and to the action of applying one to L as scaling.
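
As a tiny illustration (the name scaling, and the choice of L as pairs of rationals, are my own assumptions), each s in S yields a single (L|:L) function:

    from fractions import Fraction

    def scaling(s):                            # s -> the (L|:L) map (L| k -> s·k :L)
        return lambda k: (s * k[0], s * k[1])  # L modelled as pairs of Fractions

    double = scaling(Fraction(2))
    print(double((Fraction(1, 3), Fraction(5))))   # (Fraction(2, 3), Fraction(10, 1))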

I'll call the range of a self-linear action a self-linear algebra, thereby introducing (if not already introduced) the addition and multiplication thus implied. From the definition of self-linear action, the range, S, of any epic self-linear action is then the range, also, of an epic linear action of S (on itself), so S is an S-linear space. Furthermore, for any set H, the homset {(H|:S)} supports a natural addition (induced by that on S, applied pointwise) and a natural S-linear multiplication (as may readily be verified): for a, b in {(H|:S)}, s in S,

a+b = (H| h-> a(h) + b(h) :S)
s·a = (H| h-> s.a(h) :S)

In an S-linear space R, consider any r, t for which r=r+t. For any s in S we now have r + s.r = r + t + s.r; thus s.r = t + s.r. Equally, s.r = s.(r+t) = s.r + s.t; so s.r + s.t = s.r + t, whence s.t = t = 1.t. Thus t is a fixed-point of S-multiplication. When S is not the trivial scalar domain, {1}, there are s in S other than 1 and the equation s.t = 1.t thus contradicts right-cancellability of scaling. In particular, any additive identity in R (t was only required to satisfy r=r+t for a single r, not for arbitrary r) is necessarily a fixed point of scaling and its presence implies that scaling is not right-cancellable.

Consider a self-linear algebra S (which therefore contains a multiplicative identity, 1) and any S-linear space R. If the linear action (of S on R) is epic, every member u of R is s.v for some s in S, v in R: thus 1.u=1.(s.v)=(1.s).v=s.v=u by left-associativity of the multiplication on S over the linear action. Thus the linear action of 1 is the identity on the range, which is as we want it. [Insisting that 1 acts as the identity is, in fact, equivalent to requiring the linear action to be epic.] Thus the identity is always a scaling in a self-linear algebra.

As defined elsewhere, we describe a linear space as a vector space if its addition is complete: as it has already been constrained to be associative and cancellable, this makes it a group (unless you object to the empty group). In particular, the minimal linear spaces, {} and {0}, are both vector spaces.

Difference Spaces

We can extend any linear space to a vector space by using the standard difference construction. Little work is required to show that the linear action of the scalar domain on the original linear space, L, induces a linear action on L×L, s.(h,k) = (s.h, s.k), which respects the equivalence relation and so induces a linear action on the resulting equivalence classes: we have (h,k) and (a,b) equivalent iff h+b=a+k, whence s.(h,k) and s.(a,b) are equivalent, so the results of scaling equivalent pairs are equivalent; thus, to scale an equivalence class, take any member, scale it, and the answer is the equivalence class containing the result of this scaling. A typical (difference-) equivalence class is written as [(h,k)], with (h,k) being one of its members.

The result is a vector space in which we have a standard embedding of our original linear space, via k -> [(k+k,k)], which preserves all linear properties. We call this the vector space of differences within (or between members of) our linear space: it will shortly emerge as the tangent bundle of the linear space. The space of differences within a vector space is isomorphic to that vector space. [Proof: each (h,k) has a well-defined h-k, as addition is a group; so any (a,b), having a-b likewise well-defined, has a-b=h-k ⇔ a+k=h+b ⇔ (a,b) and (h,k) are equivalent. Thus [(h,k)]->h-k is well-defined, monic, epic and inverse to v->[(v,0)], which is also monic and epic: these form the isomorphism.]
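
To make the construction concrete, here is a small Python sketch for one familiar case, natural numbers scaled by natural numbers; the helper names are mine and pairs stand in for the equivalence classes they generate (no canonical representative is chosen).

    def equivalent(p, q):            # (h,k) ~ (a,b) precisely if h + b == a + k
        (h, k), (a, b) = p, q
        return h + b == a + k

    def add_pairs(p, q):             # induced addition on pairs
        return (p[0] + q[0], p[1] + q[1])

    def scale_pair(s, p):            # induced scaling, s.(h,k) = (s.h, s.k)
        return (s * p[0], s * p[1])

    def embed(k):                    # the standard embedding k -> [(k+k, k)]
        return (k + k, k)

    assert equivalent((5, 2), (7, 4))                               # both stand for 3
    assert equivalent(scale_pair(3, (5, 2)), scale_pair(3, (7, 4))) # scaling respects ~
    assert equivalent(add_pairs(embed(2), embed(3)), embed(5))      # embedding respects +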

Linear Maps

We say that a function from one S-linear space to another, (|f:)

respects addition
⇔ every u,v in (|f) has f(u+v)= f(u)+f(v),
respects scaling
⇔ every u in (|f) and s in S have f(s.u)= s.f(u), and
is linear
⇔ it respects both addition and scaling: that is, for all s in S, u,v in (|f): f(u+s.v)= f(u) + s.f(v).

For any sets A, B, any function (A|i:B) induces one ({(B|:S)}| u-> (A| a-> u(i(a)) :S) = u o i :{(A|:S)}): this is epic if i is monic, monic if i is epic and iso if i is iso. This induced function is linear. [Proof: for u, v in {(B|:S)}, u+v = (B| b-> u(b)+v(b) :S) and (u o i) + (v o i) = (A| a-> u(i(a))+v(i(a)) = (u+v)(i(a)) :S) = (u+v)o i, so we respect addition; for v in {(B|:S)} and s in S, s.v = (B| b-> s.v(b) :S) and s.(v o i) = (A| a-> s.v(i(a)) = (s.v)(i(a)) :S) = (s.v)o i, so we respect scaling; QED.]
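
Here is a short Python sketch of the induced map u -> u o i on finite sets (the particular sets and the helper names are my own), checking that it respects the pointwise addition and scaling introduced above.

    from fractions import Fraction

    A = ["a1", "a2"]
    B = ["b1", "b2", "b3"]
    i = {"a1": "b2", "a2": "b3"}     # a (monic) function (A|i:B)

    def induced(u):                  # u in {(B|:S)} -> u o i in {(A|:S)}
        return {a: u[i[a]] for a in A}

    def add(u, v):                   # pointwise addition on a homset
        return {x: u[x] + v[x] for x in u}

    def scale(s, u):                 # pointwise scaling on a homset
        return {x: s * u[x] for x in u}

    u = {"b1": Fraction(1), "b2": Fraction(2), "b3": Fraction(3)}
    v = {"b1": Fraction(5), "b2": Fraction(7), "b3": Fraction(11)}
    assert induced(add(u, v)) == add(induced(u), induced(v))                        # respects addition
    assert induced(scale(Fraction(3, 2), u)) == scale(Fraction(3, 2), induced(u))   # respects scaling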

It follows from the definitions that any linear map respects differences, in the sense that it induces a map from the differences of its domain to those of its range: the equivalence relation defining differences is stated purely in terms of addition, which a linear map respects, so the linear map respects the equivalence relation. Thus, for linear f, f([(u,v)]) = [(f(u),f(v))]: any (h,k) in [(u,v)] has h+v=u+k, so f(h)+f(v)=f(u)+f(k) whence [(f(h),f(k))] = [(f(u),f(v))].

Given S-linear spaces L and M, {S-linear (L|:M)} is also S-linear. [Proof: for a, b in {S-linear (L|:M)}, define a+b = (L| k-> a(k)+b(k) :M); for any such a and any s in S, define s.a = (L| k-> s.a(k) :M). Associativity and distributivity at each point of L, and at its image in M, then give the corresponding properties of these operators. QED.]

Subspaces

A subset of a linear space which is closed under addition and under scaling is described as a subspace of the original linear space: equivalently, a subset is a subspace if it is itself a linear space under the restricted addition and scaling. It is worth noting that a subset closed under addition is necessarily closed under positive integer scalings. A subspace of a linear space is called a proper subspace if it is a proper subset of it - ie not the whole space. A subspace of a linear space is called trivial if it is the whole space or one of the trivial linear spaces, {} and {0}.

Some discussion is needed of closure under differences: the above allows {({0,1}|u:Positive): u(0) > u(1)} as a subspace, with which I'm not entirely at ease. Such ease would require the subset to contain any v in L for which some t, u in the subset have t=u+v: I'm not happy with that either. Then again, I want any linear space to be a subspace of its space of differences: perhaps that's what matters ?

The intersection of two subspaces of a linear space is necessarily also a subspace. For any linear space, L, we define the span of a function (:f:L) to be the minimal linear subspace of L of which (f|) is a subset. This is, equally, the intersection of all subspaces of L which have (f|) as a subset. We describe a function (:f:L) as spanning L precisely if span(f)=L. Since any subset, H, of L has a natural embedding (H|:L), the span of the latter is synonymously referred to as span(H), for brevity.

We can be more specific. Given (:f:L), consider W = {sum(n| i-> k(i).f(h(i)) :L): (n|h:f), (n|k:{scalars}) have n a natural number}, the collection of finite linear combinations of f's outputs. We can be certain that span(f) subsumes W, since every linear space containing all of f's outputs must contain every result of scaling one of its members and every result of summing finitely many of its members, hence every member of W. But W is manifestly a linear subspace of L and contains every output of f, so it subsumes span(f), making W = span(f) and giving us an explicit form for span(f). Consequently, every member of span(f) may be written as a finite sum of scaled outputs of f.
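
As a concrete instance of this explicit form, here is a Python sketch building a member of span(f) as a finite sum of scaled outputs; the helper name combine, and the choice of L as pairs of rationals, are my own assumptions.

    from fractions import Fraction

    def combine(scalars, vectors):   # sum(n| i -> k(i).f(h(i)) :L), componentwise in L
        total = (Fraction(0), Fraction(0))
        for s, v in zip(scalars, vectors):
            total = (total[0] + s * v[0], total[1] + s * v[1])
        return total

    outputs = [(Fraction(1), Fraction(0)), (Fraction(1), Fraction(1))]   # two outputs of some f
    print(combine([Fraction(2), Fraction(3, 2)], outputs))               # (7/2, 3/2), a member of span(f)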

The span of any linear map is just its range: the range of a linear map is necessarily a linear space, since every f(u) and f(v) in it have f(u)+f(v) in it, namely f(u+v), and every scaling s that can be applied in the range satisfies s.f(u) = f(s.u), which is again in the range. Thus the range is a linear space that subsumes itself and is subsumed by any other linear space which subsumes it: it is the minimal linear subspace containing the range, i.e. the span.

For any linear map (:f:L) and any linear subspace V of L, (|f:V) is a subspace of f's domain. [Proof: for any a, b in (|f:V), we have f(a) and f(b) in V; closure of V under addition then implies f(a)+f(b) is in V; this is f(a+b), so a+b is in (|f:V); this gives us additive closure. For any scaling, (L| h->s.h :L), and any a in (|f:V), we have s.f(a) in V by closure of V under scalings; this is f(s.a), so s.a is in (|f:V) and we have multiplicative closure; QED.] In particular, if L has a zero, 0, then {0} is a subspace of it and any linear map (:f:L) has (|f:{0}), known as the kernel of f, as a linear subspace of its domain.

Dual spaces and linear mapping spaces

For any S-linear space U, write dual(U) for {linear (U|:S)}, which (by the above) is itself an S-linear space; each u in U then yields a linear map (dual(U)| w-> w(u) :S), and u-> (dual(U)| w-> w(u) :S) embeds U in dual(dual(U)). Since this embedding of U in dual(dual(U)) turns up rather a lot, and is a natural and faithful representation of U, I'll generally treat it as synonymous with U. However, when I'm being fussy I'll distinguish the range from U by using a different font, bold, for the double-dual: so, for a given linear space U, I'll refer to (U: u-> (dual(U)| w-> w(u) :S) |) as bold U. Note that U and bold U are isomorphic.
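
A minimal Python sketch of the embedding, with U modelled as pairs of rationals and one sample member of dual(U) written out by hand; the names are mine.

    from fractions import Fraction

    def into_double_dual(u):             # u -> (dual(U)| w -> w(u) :S)
        return lambda w: w(u)

    u = (Fraction(3), Fraction(4))
    w = lambda p: 2 * p[0] + 5 * p[1]    # a sample linear (U|:S), i.e. a member of dual(U)
    print(into_double_dual(u)(w))        # Fraction(26, 1), i.e. w(u)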

Combining Linear Spaces

From any two S-linear spaces, U and V, we can make an S-linear space of their Cartesian product, U×V, by defining: [u,v] + [t,w] = [u+t, v+w] and s.[u,v] = [s.u, s.v].

Theorem: the S-linear space thus constructed out of the range and the kernel of a linear map is isomorphic to the domain of the linear map.

For S-linear (U|f:V) and any v in (f|), consider (|f:{v}). Members t, u of this have f(t) = v = f(u). We now work in the difference space of the range and see that f(t-u) is the additive identity there, whence t-u lies in the kernel of f's (linear) action on the differences in U.

When we come to look at the dimensions of linear spaces, we'll find that U×V's dimension is the sum of those of U and V separately. Indeed, we can embed U and V in U×V using u-> [u,0] and v-> [0,v], and project U×V by mapping [u,v] to u or v as appropriate. So, though the linear space is U×V, it is sometimes denoted U⊕V (with ⊕ read as a + in a circle), in contrast to the tensor product ...
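
Here is a brief Python sketch of the product construction with its embeddings and projections, taking U = V = rationals for simplicity; all names are my own.

    from fractions import Fraction

    def add(p, q):                   # [u,v] + [t,w] = [u+t, v+w]
        return (p[0] + q[0], p[1] + q[1])

    def scale(s, p):                 # s.[u,v] = [s.u, s.v]
        return (s * p[0], s * p[1])

    def embed_U(u):                  # u -> [u, 0]
        return (u, Fraction(0))

    def embed_V(v):                  # v -> [0, v]
        return (Fraction(0), v)

    def project_U(p):                # [u, v] -> u
        return p[0]

    p = add(embed_U(Fraction(2)), embed_V(Fraction(3)))   # the pair [2, 3]
    print(p, project_U(scale(Fraction(1, 2), p)))         # (2, 3), then 1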

Differentiation

We now have enough mathematical tools at our disposal to talk about differentiation of functions between linear spaces (over a shared scalar domain). The critical components are linearity and a topology.

Something not yet clear to me: how far does the sanity of what follows depend on the scalar domain including the rationals ?

From what follows, one can proceed to the definition of differentiation on smooth manifolds which, restricted to linear spaces, can be shown to coincide with the definition given here.

What follows can be done for any function from a linear space to a linear space: slopes are, however, differences between linear maps between these spaces.

Here's how to do differentiation:

First off, to minimise mess later, define a voluminous hull of a linear space, V, to be a function (:v:V) for which the differences between members of (v|) span V. (This implies that (|v) is bigger than the dimension of V.) In a one-dimensional vector space, two distinct points (or, rather, a function yielding at least two distinct points) constitute a voluminous hull; in two dimensions, you need three points and they mustn't be colinear; three dimensions require four non-coplanar points; and so on. Voluminous hulls are what's needed as the generalisation of the base-line of a chord.
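
In two dimensions, the condition is just that the three points are not colinear, which a determinant of differences detects; here is a small Python sketch (names mine, points as pairs of rationals).

    from fractions import Fraction

    def spans_plane(p, q, r):        # do the differences between these points span the plane?
        d1 = (q[0] - p[0], q[1] - p[1])
        d2 = (r[0] - p[0], r[1] - p[1])
        return d1[0] * d2[1] - d1[1] * d2[0] != 0    # non-zero determinant iff not colinear

    p, q, r = (Fraction(0), Fraction(0)), (Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))
    print(spans_plane(p, q, r))                          # True: a voluminous hull
    print(spans_plane(p, q, (Fraction(2), Fraction(0)))) # False: colinear, so not voluminous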

Now define a chord-slope, a, of a function (V:f:U) between linear spaces in an open subset, H, of (|f) in the topology of V to be a linear (V:a:U) for which some voluminous hull, (:v:V), with (v|) contained in H satisfies: for any w, x in (v|), a(x-w) = f(x) - f(w). For such an f and H, we define slopes(f,H) to be the intersection of all closed subsets of {linear (V::U)} which contain all chord-slopes of f in H: that is, slopes(f,H) is the topological closure of the set of chord-slopes of f in H.

We can thence define, for any (V:f:U) and any v in (|f), gradients(f,v) to be the intersection, over open neighbourhoods, H, of v, of slopes(f,H). We say f is differentiable at v precisely if this is terminal - ie it holds precisely one value, which we refer to as the gradient of f at v.

Note that gradients(f,v) = { a } implies that a is in slopes(f,H) for every open neighbourhood H of v and that for any b in {linear (V::U)} differing from a there is at least some open neighbourhood, H, of v for which b is not in slopes(f,H). Note, also, that we had to define slopes(f,H) to be the closure of the set of chord-slopes of f in H so that a point of inflexion (eg the point where a sinusoid crosses its centreline) can be differentiable: unless V contains infinitesimal displacements, there may be no actual voluminous hull which has, as its chord-slope, the gradient at the point of inflexion.

We say that (V:f:U) is differentiable throughout (or just on) some open subset, H, of (|f) precisely if it is differentiable at each v in H. In such a case, we define (H|f':{linear (V::U)}) to deliver, at each v in H, the gradient of f at v: f' is called the derivative of f.
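
As a rough numerical illustration (floating point on the real line, names mine): in one dimension a linear (V::U) is just multiplication by a number, so a chord-slope is the familiar difference quotient, and the chord-slopes of f(x) = x·x over shrinking neighbourhoods of v close in on the gradient 2v.

    def chord_slope(f, w, x):        # in one dimension, the a with a*(x - w) == f(x) - f(w)
        return (f(x) - f(w)) / (x - w)

    f = lambda x: x * x
    v = 1.5
    for radius in (1.0, 0.1, 0.01, 0.001):
        w, x = v, v + radius / 2     # a two-point hull inside the open interval (v - radius, v + radius)
        print(radius, chord_slope(f, w, x))
    # the printed slopes tend to 3.0 = 2*v, the gradient of f at v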

Written by Eddy.