]> Convergence

# Convergence

Consider some collection V of values on which we have a distance function ({mappings ({0}&unite;{positives}: |V)}: d |V) for which:

• for all u, v in V: d(u, v) = d(v, u) and d(u, v) = 0 precisely if u = v;
• for all u, v, w in V: d(u, v) +d(v, w) ≥ d(u, w).

The second condition is known as the triangle rule, as it corresponds to saying that the edge connecting two vertices of a triangle is a shorter path than the route via the other two edges, taking a detour via the third vertex. Note that this condition is not met by the square of the usual distance; indeed, the cosine rule says it fails (for the squared distance) whenever the detour is via an obtuse angle. The properties below are defined with respect to such a distance function; in contexts using these properties, the choice of distance function is usually implicit.

Now suppose we have an infinite sequence (V: f |{naturals}) in V; the values it takes in V may be all over the place. If there is a mapping ({naturals}: N |{positives}) for which d(f(n), f(m)) < h whenever N(h) is in both n and m, the sequence f is said to be Cauchy (or a Cauchy sequence). What this says is that – no matter how tiny h may be, as long as it's positive – all outputs of f from indices greater than N(h) are within h of one another.

Consider two infinite sequences f, g, each (V: |{naturals}); I'll describe f and g as mutually convergent if there are mappings N, M, each ({naturals}: |{positives}), for which, for each positive h, d(f(n), g(m)) < h whenever N(h) is in n and M(h) is in m. In such a case, with N(h) also in some natural i, d(f(n), f(i)) ≤ d(f(n), g(m)) +d(f(i), g(m)) < 2.h, so we can use (: N(h) ←2.h :) to show that f is Cauchy and, likewise, (: M(h) ← 2.h :) to show that g is Cauchy. Thus mutually convergent sequences are always Cauchy.

Consider an infinite sequence (V: f |{naturals}) and a v in V for which there is a mapping ({naturals}: N |{positives}) satisfying d(v, f(n)) < h whenever N(h) is in n. We can use (: N(h) ← 2.h :) to show that f is Cauchy; indeed, it is mutually convergent with the infinite constant sequence ({v}: |{naturals}). In such a case, f is said to converge to v. If a sequence f and value v have, for every positive h, infinitely many naturals n for which d(v, f(n)) < h, I'll describe v as a limit point for f; if f converges to some v, that v is necessarily a limit point of f. If a sequence f has two distinct limit points, u and v, we have only to consider some positive h with 2.h ≤ d(u, v) to see that f does not converge to either u or v (since the n with d(u, f(n)) < h don't have d(v, f(n)) < h; as there are infinitely many of them, any N(h) we try shall be in some such n, precluding convergence to v; and conversely for u). If some v is a limit point of a sequence f then we can extract a sub-sequence of f, f&on;n for some ({naturals}: n |{naturals}), which converges to v.

One collection, Q, is described as dense in another, R, precisely if, for every r in R, there is some Cauchy sequence in Q that converges to r. As an example, the diadic rationals {i/power(n, 2): natural n, integer i} are dense in the rationals and in the reals; as, indeed, are the p-adic rationals {i/power(n, p): natural n, integer i}, for any natural p > 1. The Cauchy sequence in question, for any given real or rational r, is simply the sequence of successive approximations to r obtained by truncating the base p representation of r after n digits, for each natural n in turn.

In a vector space V, given a sequence (V: f :), for any natural N the sequence (V: f(n +N) ←n :) = f&on;add(N) is a tail of f; we can ask which simplices of V all outputs of f&on;add(N) lie in; we can take the intersection of all such simplices for all natural N; this will give the convex hull of the set of f's limit points, including any that aren't there; but, if there is only one and it isn't there, we'll get empty. Alternatively, I could make a topology using any voluminous simplex; by translating it to be centred on each point and shrinking it in towards that point, I can have neighbourhoolds of that point of diverse scales; then stipulate the series as Cauchy precisely if, for each positive scale of the simplex, there is a point in the sequence I can centre it on and have all later values of the sequence lie within. That would mimic Cauchy closely enough to be equivalent to the form above.

### Example: Cauchy but not convergent

Not every Cauchy sequence converges to a value: consider the iterated sequence (: x |{naturals}) defined by x(0) = 1 and, for each natural i, x(i+1) = x(i)/2 +1/x(i). As long as x(i) is rational and positive, so is x(i+1), by the form of the specification; and x(0) is rational and positive so we may induce that every x(i) is a positive rational. Now consider

power(2, x(i+1)) = x(i+1).x(i+1)
= (x(i)/2 +1/x(i)).(x(i)/2 +1/x(i))
= x(i).x(i)/4 +1/2 +1/2 +1/x(i)/x(i)
= 1 +power(2, x(i)/2) +power(2, 1/x(i))
> 1

since each of the squared terms added to 1 is positive. Indeed, when we consider the sum of squares of t/2 and 1/t, for rational t: if t > 2 or t < 1, the sum is necessarily > 1 (because one of its terms is, with the other positive); for 2 > t with t.t > 2, the sum is > 1/2 +1/2 = 1. Although I'm left with a gap when 1 < t.t < 2, I don't mind: x(0) = 1, for which the sum of squares is 1/4 +1 > 1, thus ensuring x(1)'s square is > 2; which, inductively, ensures every later x(i) has its square > 2. If we subtract 2 from the square above, we get

power(2, x(i+1)) −2
= power(2, x(i))/4 +power(2, 1/x(i)) −1
= (power(2, x(i)) −2)/4 +1/power(2, x(i)) −1/2
< (power(2, x(i)) −2)/4

for positive i, since these have 1/power(2, x(i)) < 1/2. So each entry in x after the first has square > 2 and the difference above 2 diminishes by at least a factor of 4 for each successive term. (The differences actually diminish much faster than this, roughly squaring and dividing by 8 at each step.) Thus, after its initial [1, 3/2, …], our sequence is decreasing; so no entry lies outside the range between the first two. Observe that 3/2 − 1/8 = 11/8 has square 121/64 < 128/64 = 2 and power(2) is increasing on positives (as (x +h).(x +h) = x.x +2.h.x +h.h > x.x whenever h is positive), so all entries with square > 2 are > 11/8 and thus within 1/8 below 3/2. We can now define our ({naturals}: N |{positives}) by: N(h) = 0 for h > 1/2 (all entries are within 1/2 of one another); and N(h) = 1+i for power(2.i +1, 1/2) ≥ h > power(2.i +3, 1/2), for each natural i (all entries after x(1) are within 1/8 below x(1); each later entry narrows that range by a factor of 4). Since every rational h is greater than some odd power of 1/2, this does define N for all rational inputs. With this N we demonstrate that x is Cauchy. Were it to have a limit point, however, the limit point would necessarily have square 2; which means it isn't rational. As long as we're doing our iteration in {rationals}, this Cauchy sequence thus has no limit points.

### Example: an incomplete integral ringlet dense in {reals}

The integral sub-ringlet of {reals} comprising {n −m.√2: n, m natural with n +m > 0} doesn't contain 0 (or any negative integer). Its positive members are just those n −m.√2 for which n.n > 2.m.m. As 1 −√2 lies between −1 and 0 (indeed, even between −2/5 and −3/7), its successive powers alternate in sign while converging to 0 (which isn't a value of this ringlet). This gives us arbitrarily small values close to 0 on each side; scaling these by positive naturals gives us values arbitrarily close to any real you care for; so the values of this ringlet are dense in {reals}.

### Example: summing powers

Consider the simple power series sum(power), with unit coefficient for every power. While we can manipulate this formally using the rules for polynomial arithmetic, expressed purely in terms of the series of (unit) coefficients, its evaluation at any given input x only makes sense if the sum(: power(n, x) ←n :{naturals}) converges. To study such an infinite sum, we study the series of results we can get by truncating it at different naturals; each of these truncated results is known as a partial sum; in the present case, each is sum(: power(i, x) ←i |n) for some natural n. If the input, x, is 1 then this sum is just n and the sequence of partial sums definitely doesn't converge. Otherwise, multiplying this by 1 −n gives us 1 −power(n, x), from which we can infer that each partial sum is (1 −power(n, x))/(1 −x). When the successive powers of x (that we're indeed summing here) form a Caucy sequence – typically conerging on zero – we can duly infer that these partial sums form a Cauchy sequence likewise. If the limit of the sequence of powers of x is Z(x), we get sum(: power(i, x) ←i :{naturals}) = (1 −Z(x))/(1 −x). In the usual case, where all x smaller than 1 have Z(x) = 0, this gives us sum(power) = (: 1/(1 −x) ←x :{values smaller than 1}).

This special case frequently serves as a helpful bound on other infinite sums; if we can show that (beyond some point in the sequence summed) each term is less than (some constant times) a power of some value smaller than one, we can use this case to provide an upper bound on the tails of any such sum (that is, all the terms from naturals beyond each given, typically large, natural) and, thereby bound the difference between partial sums beyond some chosen value so as to establish that these differences are indeed less than some demanded tolerance, ensuring the infinite sum does converge.

### Example: the exponential function

In studying permutations we meet the factorial function ({naturals}: product(:successor:n) ←n |{naturals}) and the usual shorth-hand n! for factorial(n). Given the positive rationals, we can infer multiplicative inverses for these; and we can scale those by various things. In particular, we can consider the power series exp = sum(: power/factorial :) = sum(: power(n)/n! ←n |{naturals}). We can evaluate this at any real input; as the reals are ordered and contain a natural embedding of the naturals, with every real and its negation both less than at least some natural, there is some natural N for which −1 < x/n < 1 for all n > N and, beyond power(N, x)/N!, each term in exp(x)'s sum is thus smaller than the one before. Furthermore, for each natural j, −1/j < x/n < 1/j for all n > j.N so power(n, x)/n! shrinks faster for larger n, necessarily converging to zero fast. Thus the sequence being summed does at least converge to zero, although this isn't enough to ensure the same for the sequence of partial sums, (: sum(: power(i, x)/i! ←n |n) ←n :{naturals}).

So let's look at the rest of the infinite sum, after such a partial sum – a tail of the power series, sum(: power(i, x)/i! ←i :(|>:N)), for some N. If we chose N large enough that −1/j < x/n < 1/j for all n > N, for some natural j > 1, then each power(i, x)/i! with i > N falls between ±power(N, x)/N!/power(i − N, j) and the tail's sum falls between ±sum(: 1/power(n+N+1, j) ←n :{naturals}).power(N, x)/N = ±power(N, x)/N!/j/(1 −1/j) = ±power(N, x)/N!/(j −1). For large enough choices of N, we can have this true for any natural j > 1; and the leading power(N, x)/N! factor tends to zero for large N; so the tail beyond each partial sum tends to zero as we cut off the sum later; the sum duly converges.

The above reasoning relied on x being real or rational (I used the fact that our arithmetic is an ordered field); however, equivalent reasoning applies whenever we have a distance d on our arithmetic's values for which h(x) = d(x, 0) is a homomorphism from our arithmetic's non-zero multiplication to that of {positives}, for which ({positive x: x.N < 1}: h |) for smaller N always subsumes that for larger N and the intersection of these over all natural N is empty (so, if we &unite;{0} each, the intersection is {0}). These conditions are not infrequently met, so exp ends up converging at all finite inputs in a wide range of ringlets, including the automorphisms of real and complex vector spaces.

Now, the differentiation rule for polynomials maps sum(p.power) to sum((: (n+1).p(n+1) ←n :).power); with p(n) = 1/n! we have (n+1).p(n+1) = (n+1)/(n+1)! = 1/n! so exp' = exp is its own derivative – exp is a fixed point of differentiation. This, combined with the wide range of contexts in which it's defined, makes it one of the most powerful and useful tools of mathematics.

## Cauchy completion

In any V meeting the conditions above that let us define Cauchy sequences, we can consider the collection of Cauchy sequences reduced modulo an equivalence induced from mutual convergence. Each ({v}: |{naturals}) for v in V is trivially Cauchy and mutually convergent with any sequence that converges to v. Thus the collectin of Cauchy sequences (for the given distance function d) in V, understood modulo the equivalence of mutual convergence, contains an image of V. We'll need to show mutual convergence is an equivalence; then we'll see that, with a suitable distance induced from d, the equivalence-reduced space of Cauchy sequences does provide values to which its Cauchy sequencess converge.  Written by Eddy.