Positional numeral denotations

Inventing an independent name and symbol for each natural number would not be practical (especially as there are infinitely many of them); so we need a system for denoting arbitrary naturals, using only a modest number of distinct symbols. Natural language generally provides one (albeit often with peculiar wrinkles and eccentricities); but it tends to be better at dealing with the early naturals than the late ones – once you get past a lot of millions, it can start to get unwieldy (albeit less rapidly so than the {{}, {{}}, {{}, {{}}}, {{}, {{}}, {{}, {{}}}}, …} denotations implied by the standard denotations for collections). Mathematics and science have various notations for denoting numbers; I'll here introduce the most straightforward, which deals only with the naturals, leaving others for contexts in which they are more appropriate.

Sufficiently General Base and Digits

For

any natural n, of which {{}} is a member (i.e. n > 1),
any choice of textual decomposition granularity (single character, word, etc.) and
any set of denotations for n's members, each by a text understood as a single token at the given granularity (for example, the names, at the word granularity, or single-character denotations, at the character granularity, for the members of ten),

one can give meaning to sequences of the given denotations of n's members as denotations for naturals, at the next coarser level of textual decomposition granularity (word, phrase, etc.). I'll first define some terms that'll let me specify how to do that, somewhat more intelligibly than that over-long sentence !

It is more usual to specify the meaning of positional numeral denotations in terms of sums of products, each of which multiplies a digit by a power of the base; however, for this, one needs arithmetic, where the present specification only requires the naturals and the ability to count using them. Specifying it this way requires me to prove, as a theorem of arithmetic, that the resulting denotations do in fact possess the properties obtained by the usual arithmetic specification; but it also lets me use the denotations before I have defined arithmetic. Once I have that arithmetic, I can extend this specification to include numeral denotations for fractional parts; one of the reasons I allow leading zeros in the following is that doing so simplifies the account of how to write fractional parts.

In this context, each given denotation, at the given granularity, for a member of n is known as a digit; we shall infer meanings for sequences of digits from the meanings of the given digits; the inferred meanings depend on the choice of n, which is known as the base of the system. The specification proceeds by building up texts at a coarser granularity of decomposition as sequences of digits. In particular, the requirement that 1 = {{}} be a member of the base implies that, as it is natural, it subsumes 1 and hence also has 0 = {} as a member; this also means it's non-empty, hence is the successor of one of its members, unite(n). Furthermore, we are given that there are digits available to denote the members of n, hence in particular unite(n), 1 and 0. Note that, in the case n = 2 (the least possible base), the first two of these are the same, unite(n) = 1.
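As a rough illustration of the set-theoretic claims above, here is a minimal Python sketch that models naturals as nested frozensets. The names successor_set and naturals are my own for this example (not from the text above), and unite is taken, by assumption, to mean the union of a collection's members.

```python
def successor_set(n: frozenset) -> frozenset:
    """The successor of a natural n: n together with n itself as a member."""
    return n | frozenset({n})


def unite(n: frozenset) -> frozenset:
    """Union of all the members of n (assumed reading of unite)."""
    return frozenset().union(*n)


# Build 0, 1, ..., ten by repeated succession from the empty set.
naturals = [frozenset()]
for _ in range(10):
    naturals.append(successor_set(naturals[-1]))

zero, one, two, ten = naturals[0], naturals[1], naturals[2], naturals[10]

# Any base has 0 and 1 as members, and is the successor of its union.
assert zero in two and one in two
assert successor_set(unite(ten)) == ten
# In the least possible base, two, the union coincides with 1.
assert unite(two) == one
```

In this model, unite(ten) comes out equal to the frozenset built for nine, matching the claim that a non-empty natural is the successor of its union.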

It can also be interesting to define numerals using digits of mixed sign, particularly when the base is odd; in base 2.n+1, for example, one could have digits for the members of n and variants on these marked in some way (e.g. with an overline) to represent their negations; either skipping the negated zero or accepting it as a synonym for the usual one. Such a system has various benefits of its own, particularly when rounding fractional quantities to the nearest integer (by truncation); however, to specify it I need the integers, negative as well as positive, and the arithmetic I'm studiously avoiding (for now). There are also interesting possibilities for the use of these in bosonic computation, where the boson's spin has an odd number of possible states, with just the right minus n through plus n range of values. Perhaps I'll write an extended definition that includes those, some day.

Numerals

A numeral is a non-empty finite sequence of digits, combined together to form a text at the next coarser granularity up from that of the digits themselves; that is, if the digits are single characters, a numeral has the form of a word (a sequence of digits with no spaces between them, e.g. 23140); or, if the digits are words, a numeral has the form of a clause or phrase (e.g. two three one four zero). (Note that the latter does not necessarily match natural language's analogous machinery for building names for larger numbers out of those for digits and those for assorted powers of a base; when using base ten, two three one four zero is a numeral text and, for all its similarities, a different text from twenty-three thousand, one hundred and forty.)

A numeral may also contain embedded grouping separators (typically punctuation or spacing characters), inserted to break its digits up into groups, in a manner specified by context (and generally dependent on the cultural conventions usual in conjunction with the natural language of the context); for example, in English text, it is usual, when there are more than three digits, to include a comma to the left of the right-most three digits and, wherever three digits appear immediately to the left of a comma, and some further digits appear to their left, to include a comma between these further digits and the three given (i.e. one comma every three digits, starting from the right, wherever there are more than three digits in a row). Such grouping separators, where context does specify their use, do not change the meaning of the sequence of digits: the numeral denotes the same natural as the corresponding numeral obtained by simply removing the separators. They are included solely as a visual aid to readers, to help keep track of positions within the sequence of digits.
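The English convention just described can be sketched in Python as follows. This is only an illustration, assuming character-granularity digits and a comma as separator; the function names group_digits and strip_separators are hypothetical, chosen for this example.

```python
def group_digits(digits: str, size: int = 3, separator: str = ",") -> str:
    """Insert a separator every `size` digits, counting from the right."""
    groups = []
    while len(digits) > size:
        groups.append(digits[-size:])  # peel off the right-most group
        digits = digits[:-size]
    groups.append(digits)  # whatever remains on the left
    return separator.join(reversed(groups))


def strip_separators(numeral: str, separators: str = ",") -> str:
    """Remove grouping separators; the sequence of digits is unchanged."""
    return "".join(ch for ch in numeral if ch not in separators)


print(group_digits("23140"))          # 23,140
print(strip_separators("1,234,567"))  # 1234567
```

Stripping the separators from a grouped numeral recovers the original sequence of digits, which is why the grouping makes no difference to what the numeral denotes.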

We can define an ordering on numerals, based on our order on naturals. First, our digits are associated with naturals, so describe one digit as less than another precisely if its associated natural is less than the other's. Next, to compare two numerals, first count the digits in each that either aren't the zero digit or have some non-zero digit to their left; if one numeral has more such digits than the other, the former is greater than the latter. If neither has any such digits, i.e. neither has any non-zero digits, they are equal; otherwise, starting with the left-most non-zero digit of each, compare them digit-by-digit: if they agree in a digit, compare the next digit to the right in each; if this brings you to equal right-most digits, then the two numerals were, as sequences of digits, identical after any zero-digits at the left, so we consider them equal. Otherwise, at the first digit (after leading zeros) in which they differ, the one with the greater digit is greater. Modulo the equivalence defined in the course of this, we can then count the numerals less than any given numeral; the numeral represents the natural result of that counting.
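The comparison just described might be sketched in Python like this, assuming character-granularity digits listed in increasing order in a string (the name DIGITS and the functions significant and compare are my own, for illustration). Digits are compared by their position in that string rather than by any arithmetic on their values.

```python
DIGITS = "0123456789"  # assumption: base ten, digits listed in increasing order


def significant(numeral: str, digits: str = DIGITS) -> str:
    """Drop leading zero digits, keeping exactly those digits that are
    non-zero or have some non-zero digit to their left."""
    return numeral.lstrip(digits[0])


def compare(a: str, b: str, digits: str = DIGITS) -> int:
    """Return -1, 0 or 1 as numeral a is less than, equal to or greater than b."""
    a, b = significant(a, digits), significant(b, digits)
    # More significant digits means a greater numeral.
    if len(a) != len(b):
        return 1 if len(a) > len(b) else -1
    # Same count: the first differing digit, from the left, decides.
    for x, y in zip(a, b):
        if x != y:
            return 1 if digits.index(x) > digits.index(y) else -1
    return 0  # identical after any leading zeros


print(compare("007", "7"))   # 0
print(compare("100", "99"))  # 1
```

Two numerals with no non-zero digits both reduce to the empty sequence, so they compare equal, as the text requires.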

Consequences

Each digit corresponds (as a digit) to a natural and the digits less than it are precisely the ones corresponding to each natural less than (i.e. in) the given natural; so the count of these is just the natural our original single digit corresponds to. When our digit is considered as a numeral, the only numerals less than it are the digits less than it, considered as numerals, and the numerals obtained by adding a prefix of zero digits to each; since these last numerals are equivalent to the single-digit numeral they end in, we don't count them separately from that digit and the numerals (modulo equivalence) less than our single digit are precisely the single digits less than our single digit, of which the count is precisely the natural corresponding to our single digit.

Given the above specification, there is a straightforward way, given a numeral, to obtain its successor – i.e. a numeral representing the natural that is the successor of the natural corresponding to the given numeral. Possibly by prefixing with a leading digit representing zero, which definitely isn't unite(n), any numeral's sequence of digits can be decomposed into a prefix (on the left), a single digit that doesn't represent unite(n) and a suffix (on the right) consisting entirely of the digit that represents unite(n); the prefix and suffix may each be empty. In this form, the single digit, that doesn't represent unite(n), represents some other natural in n; since unite(n) is the maximal natural in n, the natural our single digit represents is less than unite(n) so adding one to it gives a result in n, that is represented by some digit. The successor of our numeral is then the same (possibly empty) prefix followed by this successor digit followed by a suffix of zero-digits, one for each digit representing unite(n) in the original suffix.
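Here is a minimal Python sketch of this successor rule, assuming character-granularity digits supplied as a string in increasing order, so that its first character is the zero digit and its last is the digit for unite(n). The function name successor and its digits parameter are my own choices for the example.

```python
def successor(numeral: str, digits: str = "0123456789") -> str:
    """Return a numeral denoting the successor of what `numeral` denotes."""
    zero, top = digits[0], digits[-1]  # top is the digit for unite(n)
    body = numeral.rstrip(top)         # everything left of the suffix of top digits
    suffix = numeral[len(body):]       # the (possibly empty) run of top digits
    if not body:
        body = zero                    # prefix a zero digit, which is never top
    prefix, single = body[:-1], body[-1]
    next_digit = digits[digits.index(single) + 1]  # exists, as single is not top
    return prefix + next_digit + zero * len(suffix)


print(successor("19"))        # 20
print(successor("99"))        # 100
print(successor("11", "01"))  # 100 (base two)
```

Leading zeros in the input are preserved in the prefix, so successor("007") yields "008", which denotes the same natural as "8".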

I demonstrate elsewhere that the numerals thus specified do have various familiar properties, including that this successor operation works as claimed.

Illustration: base ten (a.k.a. decimal)

Let's now see what the above (which is deliberately general) says for a concrete (and, hopefully, familiar) case. In base ten, using the symbols 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, at character granularity, we can build up word-granularity numerals out of these decimal digits. (The ancient Latin for ten was decem, from which is derived the adjective decimal for things related to ten.) Each digit, construed as a numeral (i.e. a sequence of digits that happens to have only one digit in it), denotes the natural specified for that digit above. Prefixing any of them with arbitrarily many zeros doesn't change it, so 007 = 7. The union of our base, unite(ten), is 9; so any numeral not ending in 9 can be read as a prefix followed by its right-most digit, with an empty suffix of no 9s; its successor is the same prefix followed by the successor of that last digit, with an empty suffix of no 0s.

For the single-digit numerals, the prefix is empty and we simply replace each digit by its successor (when there is one) to get the numeral's successor; this is duly a denotation for the next natural – so far so boring. When we get to 9 = unite(ten), there is no digit denoting its successor, ten; 9's successor is our base. To put our numeral in the form for which successor is defined, we need some digit in it that isn't 9, so we prefix it with a 0, which makes no difference to its meaning; 09 is still nine. This is now in the right form – empty prefix, single digit 0, suffix of one 9 – and we replace the single digit 0 with its successor 1 and each 9 in the suffix with 0 to get 10 (one zero), which (in base ten) represents ten.

Indeed, no matter what base we choose, as long as 0 and 1 are used as digits with their usual meanings (as given above), the numeral 10 shall (in the chosen base) denote the natural in use as our base – so, if you want to tell your readers what base you're using, please use some other denotation than 10 for it; all use of this system is base one zero but saying so tells the reader nothing and writing base 10 says only that. If you are entirely confident that your readers shall understand 10 to mean ten, then you don't need to tell them what base you're using; otherwise, if a reader thought you were using some other base, they'd read base 10 as confirming that. If you mean base ten, write that, not base 10, unless (and this is occasionally appropriate) you mean to be utterly ambiguous !
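A quick way to check this claim is with Python's built-in int, which accepts a base argument from 2 to 36 and uses the usual arithmetic reading of numerals (so this relies on the arithmetic the present specification avoids). The helper name one_zero_in is hypothetical, introduced only for this sketch.

```python
def one_zero_in(base: int) -> int:
    """The natural that the numeral 10 denotes when read in the given base."""
    return int("10", base)


# In every base that Python's int supports, 10 denotes the base itself.
assert all(one_zero_in(base) == base for base in range(2, 37))

print(one_zero_in(2), one_zero_in(10), one_zero_in(16))  # 2 10 16
```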

Once we've got to 10, we can read it as a prefix 1 followed by a single digit 0 followed by a suffix of no 9s; its successor is thus the same prefix, 1, followed by the successor, 1, of our single digit, followed by no 0s; i.e. 11. We get the same prefix and suffix as before and continue to 12, and so on up to 18, 19; and we can, as before, prefix any of these with as many zeros as we like without changing its meaning, so 00016 = 16. Once we get to 19, its last digit is 9 = unite(ten) again; however, this time, there is some earlier digit that isn't 9, so we can read it as an empty prefix, a single digit 1 and a suffix of one 9; its successor is thus the same empty prefix, the successor, 2, of the single digit and a suffix of one 0, making 20 the successor of 19. We can now replace the last digit with its successor to obtain 21, 22, 23 and so on up to 28, 29; we have a suffix of one 9 again, preceded by 2, so we obtain 30 as 29's successor, followed by 31, 32 and so on up to 39. We can continue this in the usual manner until we get to 99 (nine nine), at which point we have a suffix of two 9s with nothing before it, so we prefix it with 0 as single digit (with an empty prefix before that) to get it into the right form; its successor is (our empty prefix followed by) our single digit's successor, 1, and as many 0s as we had 9s, so we get 100 (that is one zero zero) as 99's successor.

Hopefully the pattern is now clear (and, for most, familiar). We work with an empty suffix of no 9s for a while, replacing the last digit with its successor. We sporadically have a single 9 as suffix and a digit before it that we can replace with its successor, while replacing the 9 with a 0. Even more sporadically we have a suffix of two 9s, still with a digit before it, that we can replace with its successor followed by two 0s in place of the 9s. Eventually, we'll get to 999 and need to add a 0 prefix to get it into the form we can work with to get its successor, 1000. After that we resume as we did before, now even more sporadically getting to numerals ending in 999 but with a preceding digit that we can replace by its successor while replacing the suffix with 000, until we finally get to 9999 and need a leading 0 again. We can continue this indefinitely, with ever more sporadic longer sequences of 9s showing up.
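To put rough numbers on how sporadic the longer suffixes are, the following Python sketch tallies the length of the suffix of 9s in each numeral from 0 up to 999. It leans on Python's own decimal rendering of integers (via str) rather than on the specification above, and the name trailing_nines is my own.

```python
from collections import Counter


def trailing_nines(numeral: str) -> int:
    """Length of the suffix of 9s at the right-hand end of a decimal numeral."""
    return len(numeral) - len(numeral.rstrip("9"))


# Tally suffix lengths over the numerals 0, 1, ..., 999 (no leading zeros).
tally = Counter(trailing_nines(str(k)) for k in range(1000))

for length in sorted(tally):
    print(length, tally[length])
# 0 900
# 1 90
# 2 9
# 3 1
```

Each extra 9 in the suffix is ten times rarer; the single numeral with a suffix of three 9s is 999 itself, which is also one of the three (9, 99 and 999) that need a leading 0 before the successor rule applies.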

Written by Eddy.