]> The Collatz conjecture

# The Collatz conjecture

There's a nice simple function we can define on the naturals, as the union of two partial functions, one from even naturals, the other from odd naturals:

• Collatz = ({naturals}: n ← 2.n :)&unite;(: 3.i +1 ←i :{odd nautrals}

Its output on odd inputs is always greater than its input; on even inputs, its output is less than its input except for 0, which is a fixed point. Its action on even inputss has exactly this one fixed point, as 0 +n = n = 2.n = n +n implies 0 = n by cancellation of addition. A given odd input, i, is 2.n +1 for some natural n; leading to output 3.i +1 = 6.n +3 +1 = 2.(3.n +2), which is even; so Collatz has no odd fixed points. Thus 0 is its only fixed point.

When we iterate Collatz, however, something interesting happens: its transitive closure, unite(: repeat(n, Collatz) ←n :{naturals}), relates every natural that's ever been tested to 1. We don't have a proof (that I've heard of in early 2017) that this works for all naturals, but folk have accumulated a lot of experimental evidence that fails to contradict this conjecture. Here's a diagram, showing what Collatz maps various naturals to:

(A slightly more extensive diagram can be viewed by running the graphviz package's dot program on this file describing the mapping.)

As noted above, every output from odd input is 6.n +4 for some natural n; it is thus equal to 4 modulo 6; these numbers get green bubbles above. Every natural is also the output from twice it; so each 6.n +4 is the output from two inputs; every other natural is the output from exactly one input, its double. In particular, every multiple of 6 can only be reached from twice it, which is also a multiple of 6, so the ancestry of multiples of six is boring; indeed, as a result, so is that of any multiple of three; so I use grey for the rings round these and leave out that boring ancestry. Other odd nodes get a blue ring and the remaining numbers (equal to 2, modulo 6) get black circles. Each black is reached from a green and leads to either a green or a blue; each grey or blue leads to a green. If I keep adding numbers to the diagram, every green would be reached from two others.

You may notice that some small numbers are quite distant from the {4, 2, 1} loop that everything (except 0) leads to; indeed, I've had to go quite a way to include 7 and 9 in this diagram. If you start at 27, you'll have to iterate Collatz more than 100 times to get to the usual loop. We can define a mapping ({naturals}: d |{naturals}) by d(n) = count(| repeat(i, Collatz, n) ←i :{naturals}), the number of distinct values we see upon iterating Collatz indefinitely, starting at n. We then get d(0) = 1, d(1) = d(2) = d(4) = 3 and all other naturals in the diagram above have larger values for d. Each p = power(2+n, 2) for natural n has d(p) = 3+n, while ({3}: d |{1, 2, 4}) and d(0) = 1; otherwise, for each natural i and odd n, d(n.power(i, 2)) = d(n) + i, so the odd numbers tell us everything interesting about d. Here's a graph of d(2.n +1) as a function of natural n:

It's pretty spiky !

If you extend Collatz to {integers}, its action on negative inputs does have some isolated cycles: for example, −5 → −14 → −7 → −20 → −10 → −5. However, this just amounts to replacing the +1 with a −1 in Collatz's 3.n +1 for odd n; the resulting iteration on positive naturals tells you what Collatz does on negative integers. The interesting puzzle is to see whether there's any proof that every positive integer's iteration of Collatz does eventually reach 1. Either that or exhibit a cycle, like the one above for −5; but folk have looked really hard for such a cycle and (so far) failed to find one.

### Constraining a minimal exception

Every natural is in a Collatz trajectory leading down through its multiples by powers of two. If the natural is a multiple of 3, so are all of these multiples, so none of them is reached from an odd value by trebling and adding 1; so this chain of multiples by powers of two is the only trajectory leading to any multiple of three and every odd multiple of 3 is the first odd value in its Collatz trajectory. For any natural k, we have partial trajectories

• 8.k +1 → 24.k +4 → 12.k +2 → 6.k +1
• 4.k +1 → 12.k +4 → 6.k +2
• 2.k +1 → 6.k +4
• 4.k +3 → 12.k +10 → 6.k +5

with the first step being a treble and add one to a value that is also reached via its usual chain of multiples by powers of two. In each case but the first, note that the chain shown starts at a natural less than the one at which it ends. Every natural is either a multiple of three (already considered) or of the form of the end-point of one of these four trajectories. So naturals that are not multiples of 3 are reached via at least two trajectories; and every natural equivalent to 2, 4 or 5 modulo 6 is reached via some value less than it.

Let the catchment of 1 be the collection of naturals whose Collatz iteration eventually reaches 1; it is well-established that there are some quite large naturals subsumed by the catchment of 1. (I have written and run software that has verified that all numbers up to M = 3131964 are in the catchment of 1; in the process, that software visited P = 3670775 other naturals (17% more) in the catchment of 1; the largest of these was 622717901620; that's 169642.08 times 3670775. The ratio of P to M has been 1.17 and a bit since M = 1655657, with only minor variation in the and a bit; I wonder whether this is a clue.)

Suppose some naturals are not in the catchment of 1. Any natural whose Collatz trajectory passes though a natural not in the catchment of 1 is likewise not in the catchment of 1; and every natural in the Collatz trajectory starting from a natural not in the catchment of 1 is likewise not in the catchment of 1. In particular, every power of two times any natural not in the catchment of 1 is also not in the catchment of 1. We know, by experiment, that all naturals not in the catchment of 1 are large (more than one million, from above).

Consider the least natural, N, not in the catchment of 1; its trajectory never passes through a lesser natural; nor does any trajectory from a lesser natural lead to it. In particular, N is necessarily odd (since an even that's not in the catchment of 1 has half of itself in its trajectory, which is less than it) and equal, modulo 6, to one of 0, 3 and 1 (as 2, 4 and 5 are reached from values less than them); being odd rules out 0, so N is 6.k +1 or 6.k +3, for some natural k. These give us trajectories

• 6.k +1 → 18.k +4 → 9.k +2
• 6.k +3 → 18.k +10 → 9.k +5

In the first case, even k would get us something less than N at the next step; in the second case, odd k would do the same; sinch this doesn't happen, our cases reduce to 6.odd +1 and 6.even +3, i.e. 12.h +7 or 12.h +3, so N is equivalent to 7 or 3 modulo 12. Our trajectories are now

• N = 12.h +7 → 36.h +22 → 18.h +11 → 54.h +34 → 27.h +17 = 9.(3.h +1) +8
• N = 12.h +3 → 36.h +10 → 18.h +5 → 54.h +16 → 27.h +8 = 9.(3.h) +8

Either way, N = 4.m −1 for some m that's not a multiple of three (m is 3.h +2 in the first case, 3.h +1 in the second), and its fourth successor is 9.m −1; were m equal to 1 modulo 4, m = 4.j +1, this would have second successor 9.j +2 < 16.j +3 = N, which can't happen as N is minimal. So N is 4.m −1, m isn't a multiple of 3 and m −1 isn't a multiple of 4. So, modulo 12, m is 2, 4, 7, 8, 10 or 11; modulo 48, N is 7, 15, 27, 31, 39 or 43.

We start by specifying N to be natural; then we examine its forwad and backward histories, constrained to never fall below it, and infer constraints on N's value modulo some natural (typically a multiple of various factors of 2 and 3). This constrains values a few steps the forward or backwards from the previously-examined ends and parameterises the possible values of N in terms of an offset relative to a multiple of some modulus (that's bigger than the offset). At each stage, we use that parameterisation to describe the previously-examined ends and examine which values can lead to them, or they can lead to; we explore this outwrds until the thus-parameterised bounds would imply a next step outwards that would be below N for some values of the parameter(s), thereby eleiminating those choices of the parameters for the given offset; this increases the modulus by a larger number than it increases the number of offsets and we can repeat. This gives us a refinement process, by which to narrow the options for N. In principle, this should elad (after some rather large number of refinement steps) to a constraint whose modulus exceeds the actual N; at this point, N must be one of the offsets that we still deem possible.

Written by Eddy.