# Computing the Fibonacci numbers

This page is really about how to tell a computer what to do in such a way that it goes about it efficiently; so it has to involve some mathematics (so that we can work out how much work the computer has to do to complete the computation), so it may as well address an entirely mathematical problem.

Fibonacci's sequence is defined by:

• f(1) = 1 = f(2)
• f(i+2) = f(i+1) + f(i)

Formally, this only specifies f(i) in so far as i is a whole number (i.e. one with no fractional part); but it does specify f(i) for every whole number. We can work forwards to see that f(3) = 2, f(4) = 3, f(5) = 5, f(6) = 8, f(7) = 13, f(8) = 21 and so on. We can turn the second rule around to say f(i) = f(i+2) −f(i+1) and work backwards to get f(0) = 0, f(−1) = 1, f(−2) = −1, f(−3) = 2, f(−4) = −3, f(−5) = 5, f(−6) = −8 and so on. None of this gives us any clue as to a value for f(1/2) or f(π), so Fibonacci's specification only tells us f's values at whole numbers (positive, zero and negative; collectively known as integers).

Notice that f(0) = 0, f(1) = 1 would work as an entirely adequate replacement for the first rule (and I could equally phrase it as (:f|2) = 2). Also, observe that f(−odd) = f(odd) but f(−even) = −f(even) for each odd or even input explored above. If we suppose this to be true for each even or odd j < i, for some positive whole number i (e.g., we've already shown this for i = 7), we can induce that:

if i is even
f(−i) = f(2−i) −f(1−i)

but i is even, so 2−i is also even and 1−i is odd, so

f(−i) = −f(i−2) −f(i−1) = −f(i)
otherwise, i is odd, yielding
f(−i) = f(2−i) −f(1−i)

with i odd, so 2−i is odd and 1−i is even, whence

f(−i) = f(i−2) +f(i−1) = f(i)

so the rule we supposed to hold for j < i also holds for j = i; hence for each j < i+1, whence for j = i+1 and so on; hence for all j. We can exploit this to simplify our computations.

I can use some simple theory to work out that any f satisfying the second rule (ignoring, for the moment, the first) must be a sum of terms, each of which is of form f(i) = k.power(i, q) with k arbitrary and q.q = q +1 – which implies that (2.q−1).(2.q−1) = 4.q.q −4.q +1 = 4.(q +1) −4.q +1 = 5, whence 2.q = 1 ±√5 – so that our actual f must be f(i) = k.power(i, (1+√5)/2) +h.power(i, (1−√5)/2) for some k and h. (That (1+√5)/2 is the golden ratio; I'll discuss this furthere, below.) We could apply the first rule to work out k and h; but it's easier to apply 0 = f(0) = k +h to get h = −k and then use 1 = f(1) = k.((1+√5)/2 −(1−√5)/2) = k.√5, so that f(i) = (power(i, (1+√5)/2) −power(i, (1−√5)/2))/√5. If I'm happy with the precision of floating point arithmetic this lets me compute f very efficiently; however, floating point arithmetic is all about adequate approximations and tends to produce mildly embarrasing results like 4 × 0.035 = 0.14000000000000001 (when it's easy to see that the right answer is 0.14 exactly). The first rule gives f(1) and f(2) as whole numbers (they're both 1, which is as obviously whole as a number can be) and the second rule gives any other f value by adding or subtracting f values for inputs closer to 1 and 2 than the new one we're obtaining, so we can safely induce that all f values are whole numbers, with no fractional part; which makes it silly to compute them via floating point numbers (i.e. ones with a fractional part). That those numbers in fact represent irrationals doesn't make it any nicer. So I'll be looking at how we can compute f's values precisely and ignore the short-cut via this nice piece of theory. None the less, notice that the value of f(i) grows exponentially with i's magnitude.

For this page's purposes, I'm going to use python as the programming language in my examples: however, the reader is welcome to read the code as pseudo-code (and I'll restrict myself to python that reads well as such), understanding indented lines as being subordinates controlled by the last less-indented line. The key-word def begins the definition of a function.

## The really obvious way

Most modern programming languages (for these purposes, ForTran isn't modern) allow one to write functions which call themselves; this is called recursion. A language can be clever enough to dodge the problem I'm about to point out, but it's instructive to see why simple-minded languages run into problems with the obvious way to implement Fibonacci's sequence. We see how it was originally specified, so we can write ourselves a function:

```
def Fibspec(i):
if i == 1 or i == 2: return 1
if i > 1: return Fibspec(i−2) +Fibspec(i−1)
return Fibspec(i+2) −Fibspec(i+1)

```

This simply states the rules given above: the first rule is the first line, f(1) = 1 = f(2) (the `==` operator tests for equality, as distinct from the `=` operator, assignment, which gives its left operand(s) the value(s) its right operand(s) previously had). The second line is just the second rule; and the third line is the reverse of it, needed to get answers for i < 1.

Unless the programming language you're using is very clever, it's going to do an awful lot of computation, if you use a definition like this. Let me explain why. First, observe that if I ask for `Fibspec(1)` or `Fibspec(2)`, I get my answer back immediately: it's 1. Thus far, `Fibspec` is doing well. Next, if I ask for `Fibspec(3)` it'll test that 3 isn't 1 or 2, notice that 3 > 1 and proceed to work out (as its answer) `Fibspec(1) +Fibspec(2)` – each of which it works out directly; so (aside from some over-head, which I'll ignore, in how the language lets a function call itself recursively) it has to do one addition to get the answer 2. (It also has to do two subtractions; indeed, the number of subtractions involved in computing f(i), for i > 0, is exactly twice the number of additions; but this – along with the two recursive function calls per addition – just means the total work involved is a simple constant times the work for one addition; so let's stick to counting the additions.) If I ask for `Fibspec(4)`, it'll likewise do one addition to combine `Fibspec(2)` with `Fibspec(3)`; computing the latter costs it one addition, as before, so it costs two additions to get the answer `Fibspec(4)` = 3. Notice that the number of additions we've needed to do, each time, is just one less than the answer we obtained. For `Fibspec(5)`, we work out the previous, costing us 2 +1 additions, and do one more addition, getting the answer 5 for 4 additions. Again, the number of additions is only one less than the answer.

In fact, if we can compute f(i) and f(i+1) at a cost of one addition less than their respective values, then computing f(1+2) costs us one addition to add them together plus f(i)−1 to compute f(i) and f(i+1)−1 to compute f(i+1); for a total of 1 +f(i)−1 +f(i+1)−1 = f(i) +f(i+1) −1 = f(i+2)−1. So the number of additions we have to do to compute `Fibspec(i)` is going to be only one less than the answer we get. It shouldn't be hard to see that similar applies for i < 1; computing `Fibspec(i)` requires slightly more (rather than less) subtractions (plus twice as many additions and recursive function calls) than the magnitude of the answer, in this case. Given that this answer grows exponentially with the magnitude of i, this means that `Fibspec` is a horribly expensive way to compute the Fibonacci numbers.

## The obviously better way

So let's look for a better way to compute f(i). If you look at what we were doing before, you should be able to see that the real problem is that, when computing f(i+2), we first compute f(i), then compute f(i+1); but, in the course of doing the latter, we re-compute f(i). That's obviously dumb: if we'd just remembered it from before, we could save that extra computation. Since the same applies at each step back towards 1 and 2, we're clearly better off simply working forwards:

```
def Fibiter(i):
if −1 < i < 1: return 0
old, ans = 0, 1
while i > 1:
old, ans = ans, ans + old
i = i − 1

while i < −1:
old, ans = ans, old − ans
i = i + 1

return ans

```

Note that python allows several operands on each side of assignment; `old, ans = 0, 1` sets `old` to 0 and `ans` to 1; and `old, ans = ans, ans + old` computes `old + ans` before doing the assignments, so `ans` gets the prior value of `old + ans`. Notice, also, that I've thrown in a piece of robustness for the computation: if i isn't a whole number, the function shall still return an answer (rather than, say, spinning forever in a loop that's been confused by an unexpected value); and that answer is what you would get by calling `Fibiter` with a whole number close to i.

This time, we're just keeping track of two values – initially `old` = f(0) and `ans` = f(±1) – and working our way up or down to i one step at a time, doing one addition and one subtraction on each step, so it should be clear that the number of additions and subtractions we have to do is equal to the magnitude of i, give or take one. So, with a little easy thinking, we've now got a way to compute f(i) at cost proportional to i rather than proportional to f(i); given that the latter grows much faster than i, this is clearly an improvement. (Furthermore, although this needn't be true in all languages, calling functions (recursively or otherwise) is more expensive, in imperative languages, than iterating a loop; so not only do we make fewer steps of computation, but the cost of each step is reduced, too. We incur some small overhead in having two new local variables, so computing f(i) for a few small positive i will be minutely slower, but f(0) is now very cheap and all other f(i) shall be cheaper to compute.)

The interested reader would do well to think for a while, before reading on, about how it might be possible to do even better than this.

## The deviously clever way

Some clever person noticed (many years ago) something marvelously subtle that we can exploit. However, before I can explain how that works, I need to introduce you to the efficient way to compute powers, given a multiplication. While you can compute a power by repeated multiplication

```
def powiter(n, x):
if n < 0: return powiter(−n, 1./x)
ans = 1
while n > 0:
ans = ans * x
n = n − 1
return ans

```

(in which the number of multiplications we do (and, incidentally, the number of subtractions we do) is equal to the power we're raising to), there's a much more efficient way to do it, in which the number of multiplications we do is just the number of binary digits it takes to write out n plus the number of non-zero digits among these (so at most twice the logarithm of n to base two). Once you understand that, I'll be ready to show you how to compute Fibonacci more efficiently.

Note that, if we're happy to use floating point arithmetic, we could also obtain power(n, x) as exp(n * log(x)), whose cost doesn't depend on n at all – although the costs of calling exp and log run to quite a lot of multiplications, divisions, additions and subtractions; so this only works out cheaper when n is quite big. So, for a (more-or-less) fixed large cost, we can compute power, hence f(i) via the golden ratio approach, at cost independent of i, provided we're happy with floating point. However, as noted above, I'm trying to avoid floating point – which, for large values of i, is apt to leave us having to round a big floating-point value to its nearest integer, which becomes imprecise once the output of f(i) is big enough that the floating point representation's mantissa has too few digits to represent it exactly.

### Efficient power

```
def powbits(n, x):
if n < 0: return powbits(−n, 1./x)
ans = 1
while n > 0:
if n % 2: # n is odd
ans = ans * x
n = n − 1
n = n / 2
x = x * x
return ans

```

I encourage the interested reader to think about why this is correct before reading on. The % operator computes a remainder: p − (p % q) is always a multiple of q; so n % 2 is one (which `if` considers to be true) when n is odd and zero (which `if` considers false) when n is even.

We can use this with floating point arithmetic, just as we can for whole number arithmetic; and, for modest values of n, it'll be faster than exp(n * log(x)), though with the same precision issues as I note above. As we'll soon see, however, this isn't faster than we can compute f(n); only via exp and log can floating point win, and then only for large n, at which point the precision problems are apt to matter. So, for precise computations, what I'll show soon is in fact as fast as the floating point approach, except possibly for a bounded range of values of i that are large enough for exp(i * log(x)) to be cheaper than powbits(i, x) but small enough that f(i) is faithfully represented by a floating point mantissa.

What `powbits` does is to express `n` in binary and multiply `ans` by power(b, x) for each power of two, b, that's represented by a non-zero digit of n, when written in binary. Computing the power(b, x) is done by repeatedly squaring x. To help you understand how `powbits` works, consider this recursive form of it:

```
def powrecurse(n, x):
if n < 0: return powrecurse(−n, 1./x)
if n % 2: return x * powrecurse((n − 1) / 2, x * x)
if n > 0: return powrecurse(n / 2, x * x)
return 1

```

The first line is as before. If n is zero, we'll fall all the way through to the end and get 1, which is just power(n, x) with n = 0. If n is even and positive, we get to the last but one line: power(n, x) is, in this case, just power(n/2, x*x), so that's what we return. For positive odd n, n−1 is even, so power(n−1, x) is just power((n−1)/2, x*x) and multiplying it by x yields power(n, x). So I hope you can see that `powrecurse` does compute power(n, x).

In a really clever programming language, `powrecurse` might even be compiled efficiently: but most imperative languages aren't that clever, so let's see why `powbits` is doing the same thing as `powerecurse`. Each time we go round the loop in `powbits` is equivalent to one layer of recursion in a call to `powrecurse`: the first time round the loop, we have our original x; if n is odd, we multiply `ans` by x and leave the subsequent iterations to multiply `ans` by pow((n−1)/2, x*x); otherwise, we simply leave those subsequent iterations to multiply `ans` by pow(n/2, x*x). By the time we've fallen off the end of the loop, we've worked out what we were originally asked for, but we've only gone round the loop once per binary digit of n (and we've done it without relying on our language to be clever enough to implement tail recursion with an accumulator). Before I move on, let me just throw in how I'd actually implement `powbits` in python:

```
from operator import mul, truediv
def powbits(n, x, v=1):
"""Returns v * x**n, for integer n; v defaults to 1."""
if n < 0: n, join = −n, truediv
else: join = mul
while n > 0:
n, b = divmod(n, 2)
if b: v = join(v, x)
x *= x
return v

```

Since we're going to want a variable in which to accumulate the answer, incrementally multiplying by powers of x, and the caller not infrequently wants to multiply the answer by something, we may as well let the caller pass that to the function, in place of the 1 we'll use by default; hence parameter `v` replaces local variable `ans`. This also makes `powbits` usable when x isn't a plain number, e.g. when it's a matrix, and doesn't support multiplication by plain numeric 1, requiring instead a suitable unit of its own kind, that we can pass in as v. In support of that, the negative power case is here handled by dividing v by x instead of by replacing x by its inverse, mediated by over-riding the `mul` function – the operator module's `mul(v, x)` just returns `v * x` but we replace this with ```truediv(v, x) = v / x``` for negative n (and `truediv` uses the division for which 1/2 is 0.5 instead of rounding to 0 because 1 and 2 were given as integers). Furthermore, this even lets us use ```powbits(n, p, q)``` with negative n to divide q repeatedly by p, even when p has no multiplicative inverse, provided q actually has p as a sufficiently repeated factor (and their type implements division to recognise that case; as, for example, `study.maths.polynomial.Polynomial` does).

The function `divmod` (a built-in of the language) returns a pair of values, the quotient and remainder from performing a division; so ```n, b = divmod(n, 2)``` halves n, rounding down if n was odd, and sets b to the remainder it got when doing so. So this version does the same as the previous one, but is a little terser and more flexible … and that first line, in triple-double-quotes, is documentation – because real code should always be documented.

### Fast Fibonacci

So now let's go back to looking at Fibonacci. In `Fibiter`, we have two variables, `old` and `ans`, that hold a pair of values of the Fibonacci sequence; at each iteration, in either loop, we compute new values for them from (only) their prior values. That computation is very simple: `old` gets `ans`'s prior value and `ans` gets the sum or difference of their prior values. If we used a vector, with two entries, to hold these two variables, each step of the iteration would thus be performing a simple linear transformation on the vector: this can be expressed as the contraction of a matrix with the vector. Since each iteration contracts the same matrix with the vector, overall we're simply contracting power(i) of the appropriate matrix with the start vector, [0,1]. So the loops in `Fibiter` are like the loop in `powiter`, but applying a matrix to a vector instead of multiplying numbers; and the net effect is to compute a power of a matrix and contract it with the vector. If we can compute the power of the matrix in `powbits`'s way, we can reduce the cost from of order the input to `Fibiter` to of order the number of bits needed to express it – i.e., its logarithm (to base two).

Now, the matrix for our forward iteration looks like F = [[0,1],[1,1]], with the second pair as the bottom row; contracting it with [`old`,`ans`], with `ans` below `old`, gives [`ans`,`old+ans`] as the new value for [`old`,`ans`]. When we multiply F by itself repeatedly, we get F·F = [[1,1],[1,2]], F·F·F = [[1,2],[2,3]], F·F·F·F = [[2,3],[3,5]]; in fact (as the interested reader should be able to verify easily) we get power(i, F) = [[f(i−1),f(i)],[f(i),f(i+1)]]; so we can compute f(i) as the second co-ordinate of power(i−1, F)·[0,1], for positive i. Indeed, python will actually let me define a class that implements matrix multiplication such that I can simply use `powbits` to implement this, with similar for F's inverse to handle negative i.

This sort of transformation, representing a function's internal state by a vector and each step of an iteration by a linear map applied to that vector, so as to replace the iteration by some more efficient computation derived from the linear map, can be applied to reasonably many contexts, making it worth checking one's code for opportunities to exploit such a transformation. Notice, however, that (in the present example) power(i, F) contains some redundancy: two of its entries are equal and one of the others is the sum of the last with one of the equal ones. We can surely do (a bit) better.

Remember, earlier, that we worked out that f(i) = k.power(i, q) would satisfy f(i+2) = f(i+1) +f(i) precisely if q.q = q +1; so let's look at polynomials modulo this equation, i.e. with square = power(2) = (: x.x ←x :) treated as if it were equal to (: x+1 ←x :). This reduces the space of polynomials to a two-dimensional space; each so-reduced polynomial is simply (: A.x +B ←x :) for some A, B. If we multiply it by power(1) = (: x ←x :), it becomes (: A.x.x +B.x = (A+B).x +A ←x :), so the pair of coefficients is subjected to exactly the transformation `Fibiter` applied to `ans` and `old`, with ans as A and old as B. Thus power(0) = (: 1 ←x :) represents a start-state, slightly different from that of Fibiter, with old = 1 and ans = 0; and each successive power(i, x) reduces to f(i).x +f(i−1). Thus taking powers of x in this space of reduced polynomials is equivalent to computing Fibonacci's sequence.

The reduced space of polynomials represents our iteration in a way I can easily motivate from the polynomial that appears in the analytic treatment of the series; that works generally for similar iterations. However, we're lucky in the details of how we use it to encode our iteration. In other cases, such as the Perrin iterator, the corresponding analysis can require more thought; none the less, the essential treatment of the problem is the same. Reduce polynomials modulo the one that mimics our iteration, then look at the resulting multiplication to find a way to represent tuples of iterated values and a polynomial to multipliy by to implement the iteration.

When we multiply two of our reduced polynomials, the cross-terms in x.x reduce away; (A.x +B).(C.x +D) = A.C.x.x +(B.C +A.D).x +B.D = (B.C +A.D +A.C).x +(A.C +B.D). The code doesn't actually need to know we're working with polynomials; it just sees us working with pairs of numbers and using an eccentric multiplication on such pairs: [A, B].[C, D] = [A.D +A.C +B.C, A.C +B.D]. The pairs [0, 1] and [1, 0] represent power(0) and power(1), respectively; the former is our multiplicative identity and the successive powers of the latter shall give us Fibonacci's sequence. We just need to code the equivalent of `powiter` for the multiplication of our reduced polynomials. So first we define (using python tuples `(a, b)` rather than lists [A, B], simply because the language provides convenient support for these):

```
def fibtimes(p, q):
(a, b), (c, d) = p, q # unpack as pairs
return a * (c + d) + b * c, a * c + b * d

```

which is just our peculiar multiplication. (We could, equally, subclass the built-in `tuple` type in python, of which ```(a, b)``` and `(c, d)` are instances, and define the methods on it to support this as multiplication in the ordinary way, rather than having to call `fibtimes` explicitly; we could then use `powbits`, above, rather than a custom implementation of power.) Then we implement the power operation for this multiplication restricted to only computing powers of `(1, 0)` (i.e. power(1) or x ←x), as

```
def Fibpow(i):
(a, b) = (1, 0) # x
(c, d) = (0, 1) # ans
while i > 0:
i, n = divmod(i, 2)
if n: c, d = fibtimes((a,b), (c,d))
(a, b) = fibtimes((a,b), (a,b))
return (c, d) # x**i

```

We can actually do a little better than computing `Fibpow`, extracting c from the pair it returns and munging to handle sign; since we only need one part of its return in any case, we can simply return that; and we can handle negative exponents by exploiting a multiplicative inverse for (: x ←x :), namely (: x−1 ←x :), since x.(x−1) = x.x −x which is equivalent to x+1 −x = 1. So we can use [1, −1] as (: 1/x ←x :), in place of [1, 0] as (: x ←x :), and flip the sign of i, if negative. At the same time, we can avoid squaring `(a, b)` the last time round the loop (where its value shalln't actually be used again) by moving the rest of the loop after it and unwinding a copy of it out into the preamble, where we can simplify it:

```
def Fibfast(i):
if i < 0: a, b, i = 1, −1, −i
else: a, b = 1, 0

i, n = divmod(i, 2)
if n: c, d = a, b
else: c, d = 0, 1

while i > 0:
a, b = fibtimes((a,b), (a,b))
i, n = divmod(i, 2)
if n: c, d = fibtimes((a,b), (c,d))
return c

```

(Alternatively, the unrolled first iteration could initialize slightly differently and record which of c, d to return at the end – details I leave as an exercise for the interested reader.) Since we've now reduced our computation to an analogue of power, the amount of computation needed to compute f(i) only grows, with i, as fast as the number of digits it takes to write i out in binary (i.e. the logarithm of i to base two), which grows much less rapidly than i. We are doing rather more computation at each of these fewer steps (rather than just one addition and one subtraction, we're doing: either four multplications, three additions and some easy binary stuff (taking remainder by two and dividing by two is really cheap); or eight multiplications, six additions and the cheap binary stuff), so the `Fibiter` solution is faster for small values of i; but the `Fibfast` solution wins for larger values of i. If you're always computing answers for small values of i, and doing so often enough that you care about performance, `Fibiter` is easily beaten by an implementation which remembers the answers it's worked out in the past; which can be just as efficiently implemented, using a variant of `Fibspec` which caches its answers, as by using `Fibiter`. In python, that would be written:

```
def Fibcache(i, cache=[0, 1]):
if i < 0:
if i % 2: return Fibcache(−i)
return −Fibcache(−i)

try: ans = cache[i]
except IndexError:
assert i > 1, "cache started with two entries"
ans = Fibcache(i−2) +Fibcache(i−1)
assert len(cache) == i
cache.append(ans) # ans is now cache[i]
return ans

```

(One could also chose to limit the cache size; before the `try` stanza, check for `i > BIG`, for some suitably big limit on cache size, and fall back on Fibfast(i), optionally optimising to exploit the knowledge `i > 0`.) Aside from the special handling of negative i, some of the cleverer (usually functional) programming languages might even compile their equivalents of `Fibspec` to something along these lines. On the other hand, if you were only interested in f(i) for a sparse few large i, `Fibfast` wins (strongly); and it'd be asking a lot to expect the compiler to spot that it can re-arrange `Fibspec`, or anything even remotely resembling it, into that. In fact, if you compiler is that clever, you should probably (politely) invite it to take the Turing test.

### Or in C++

One can, of course, do the same in any half-way decent programming language, so here's an illustration in C++:

```#include <stdint.h>

int64_t Fibspec(int8_t i) {
if (i < 0) return Fibspec(i + 2) − Fibspec(i + 1);
if (i > 1) return Fibspec(i − 2) + Fibspec(i − 1);
return i;
}

int64_t Fibiter(int8_t i) {
if (i == 0) return i;
int64_t old = 0, ans = 1;
while (i > 1) {
const int64_t next = old + ans;
old = ans;
ans = next;
−−i;
}
while (i < −1) {
const int64_t prev = old − ans;
old = ans;
ans = prev;
++i;
}
return ans;
}

template <typename F = double, typename C = unsigned>
F powiter(C n, F x) {
if (n < 0) return powiter(−n, F(1) / x);
F ans = 1;
while (n > 0) {
ans *= x;
n −= 1;
}
return ans;
}

template <typename F = double, typename C = unsigned>
F powbits(C n, F x, F v = F(1)) {
if (n < 0) return powiter(−n, F(1) / x, v);
while (n > 0) {
if (n & 1)
v *= x;
n >>= 1;
x *= x;
}
return v;
}

template <typename F = double, typename C = unsigned>
F powrecurse(C n, F x, F v = F(1)) {
if (n < 0)
return powrecurse(−n, F(1)/x, v);
if (n > 0)
return powrecurse(n >> 1, x * x, n & 1 ? x * v : v);
return v;
}
```

Note that fib(93) will overflow int64_t, so even int8_t suffices to represent all the inputs we can sensibly support, in the absence of big-number support such as python provides by default. The pair-arithmetic begs to be rewritten in class-structured form as:

```class FibPair {
int64_t lin, con;
FibPair(int64_t n, int64_t c) : lin(n), con(c) {}
FibPair(const FibPair &) = default;
FibPair &operator=(const FibPair &) = default;
FibPair operator*(FibPair other) const {
return FibPair(lin * (other.lin + other.con) + con * other.lin,
lin * other.lin + con * other.con);
}
static FibPair power(unsigned n, FibPair x, FibPair v) {
if (n == 0) return v;
return power(n >> 1, x * x, n & 1 ? x * v : v);
}
public:
static int64_t FibFast(int8_t n) {
FibPair x = n < 0 ? FibPair(1, −1) : FibPair(1, 0);
return power(n < 0 ? −n : n, x, n & 1 ? x : FibPair(0, 1)).lin;
}
};
```

providing FibPair::FibFast as the equivalent of Fibfast. It's also possible to write an equivalent of Fibcache, using a std::vector as cache in place of its list. Finally, here's an implementation of the floating-point solution, whose computational cost (nominally) doesn't increase with the input:

```#include <cmath>
int64_t Fibfloat(int8_t n) {
constexpr double lnRoot5 = log(5) / 2;
constexpr double lnGolden = log(.5 + sqrt(1.25));
const double lnRaised = n * lnGolden;
const double p = exp(lnRaised − lnRoot5);
const double q = exp(−lnRaised − lnRoot5);
return llround(n & 1 ? p + q : p − q);
}
```

## Golden Ratio

The sum of powers formula for the Fibonacci sequence used powers of the golden ratio, (1 +√5)/2, the positive root of q.q = q +1; this shows up in diverse other places (e.g. some artists like the sides of their canvas in this proportion). That equation can be rearranged as q = 1 +1/q, which leads to the golden ratio having the continued fraction representation

• 1 +1/(1 + 1/(1 + 1/(1 + 1/(…))))

If you truncate this (skip the 1/(…) at some depth) and expand it out, you start at the bottom with 1/1 and, at each step, invert the fraction and add one; if the fraction is a/b that gives you 1 +b/a = (a+b)/a which, inexorably, turns our 1/1 into 2/1, 3/2, 5/3, 5/8, 13/8 and generally f(i+1)/f(i) for successive i, yielding 1 +f(i)/f(i+1) = f(i+2)/f(i+1) at the next step. So truncating the continued fraction gives a fraction that's a ratio of successive terms of our series; however, there remains the question of whether that's actually any good as an approximation to the golden ratio.

The golden ratio is between 1 and 2; so successive powers of it get bigger. The other term in our sum for f(n) is a power of (1 −√5)/2; since 5 lies between 4 and 9, √5 lies between 2 and 3, so 1 −√5 lies between −2 and −1; half of this lies between −1 and −1/2, so successive powers of it (alternate in sign and) get smaller. Consequently, for large n, f(n) is dominated by power(n) of the golden ratio and the ratio of successive terms in the sequence, f(n+1)/f(n), approximates the golden ratio, doing so better for larger n. Thus, indeed, truncating our continued fraction representation of the golden ratio does give us a good approximation, better the deeper we go before truncating.

That ratio, furthermore, is always in coprime form: f(n+1) and f(n) have no common proper factor. This is easy to see if you simply run Euclid's algorithm to discover their highest common factor: the algorithm, at each step, reduces the larger value modulo the smaller and then repeats itself with the reduced value and the former smaller, which is now the larger value. For n>0, f(n)>0; thus, for n>1, f(1+n) = f(n) +f(n−1) > f(n); and, for n > 2, 2.f(n) = f(n+1) +f(n)− f(n−1) > f(n+1); so, for n > 2, f(n) < f(n+1) < 2.f(n) and f(n+1) reduced mod f(n) is just f(n+1) −f(n) = f(n−1). Thus each step of Euclid's algorithm just winds back down Fibonacci's sequence, replacing f(n+1) and f(n) with f(n) and f(n−1) until we get to f(2) = 1 and f(1) = 1, whose highest common factor is 1; from which, as Euclid's algorithm perserves the highest common factor of its pair of numbers, we can infer that f(n+1) and f(n) have 1 as highest common factor and thus are coprime.

This leads to successive entries in Fibonacci's sequence making good integers to use for a rectangle whose sides have to be whole numbers, if we want the aesthetically pleasing proportions of a golden rectangle. (This has ratio of sides equal to the golden ratio: cutting from it a square on its shorter side leaves a smaller rectangle in the same proportions; adding to it a square on its longer side yields another.) I use this, for example, in the default size of my grid for Conway's game of life. Since that also uses the Klein bottle's topology by default, with the shorter axis simply cycling and the longer one flipping the shorter each time round, a glider travelling round it will, by the time it's traversed the longer axis twice, be travelling parallel to how it started out and offset by 2.f(n+1) modulo f(n). Since 2.f(n+1) = 2.f(n) +2.f(n−1) = 2.f(n) +f(n−1) +f(n−2) +f(n−3) = 3.f(n) +f(n−3); so 2.f(n+1) modulo f(n) is just f(n−3).

So, modulo f(n), we now know f(n+1) = f(n−1) = −f(n−2) and 2.f(n+1) = f(n−3). Let's look next at 3.f(n+1) = f(n−1) +f(n−3) = f(n−3) −f(n−2) = −f(n−4). Then 4.f(n+1) = 2.f(n−3) = f(n−2) +f(n−5), which I don't immediately see an easy way to reduce to a single entry ! Speaking of the sequence modulo various things, let's look at it modulo assorted early naturals (with at the point – always where a 0 would follow a 1 – where it starts repeating itself):

1. 0, 1, 1, …
2. 0, 1, 1, 2, 0, 2, 2, 1, …
3. 0, 1, 1, 2, 3, 1, …
4. 0, 1, 1, 2, 3, 0, 3, 3, 1, 4, 0, 4, 4, 3, 2, 0, 2, 2, 4, 1, …
5. 0, 1, 1, 2, 3, 5, 2, 1, 3, 4, 1, 5, 0, 5, 5, 4, 3, 1, 4, 5, 3, 2, 5, 1, …
6. 0, 1, 1, 2, 3, 5, 1, 6, 0, 6, 6, 5, 4, 2, 6, 1, …
7. 0, 1, 1, 2, 3, 5, 0, 5, 5, 2, 7, 1, …
8. 0, 1, 1, 2, 3, 5, 8, 4, 3, 7, 1, 8, 0, 8, 8, 7, 6, 4, 1, 5, 6, 2, 8, 1, …  Written by Eddy.