This page is really about how to tell a computer what to do in such a way that it goes about it efficiently; that has to involve some mathematics (so that we can work out how much work the computer must do to complete the computation), so it may as well address an entirely mathematical problem.
Fibonacci's sequence is defined by two rules:
f(1) = 1 = f(2);
f(i+2) = f(i) +f(i+1).
Formally, this only specifies f(i) in so far as i is a whole number (i.e. one with no fractional part); but it does specify f(i) for every whole number. We can work forwards to see that f(3) = 2, f(4) = 3, f(5) = 5, f(6) = 8, f(7) = 13, f(8) = 21 and so on. We can turn the second rule around to say f(i) = f(i+2) −f(i+1) and work backwards to get f(0) = 0, f(−1) = 1, f(−2) = −1, f(−3) = 2, f(−4) = −3, f(−5) = 5, f(−6) = −8 and so on. None of this gives us any clue as to a value for f(1/2) or f(π), so Fibonacci's specification only tells us f's values at whole numbers (positive, zero and negative; collectively known as integers).
Notice that f(0) = 0, f(1) = 1 would work as an entirely adequate replacement for the first rule (and I could equally phrase it as (:f|2) = 2). Indeed, specifying f at any two distinct whole numbers would suffice to determine it at every whole number. Also, observe that f(−odd) = f(odd) but f(−even) = −f(even) for each odd or even input explored above; this is easy enough to prove inductively.
For this page's purposes, I'm going to use python as the programming language in my examples; however, the reader is welcome to read the code as pseudo-code (and I'll aim to restrict myself to python that reads well as such), understanding indented lines as being subordinates controlled by the last less-indented line. The key-word def begins the definition of a function.
It's possible to prove that any sequence f satisfying the second rule above is given, for some k and h, by f(n) = k.power(n, g) +h.power(−n, −g), where g = (1 +√5)/2 is known as the golden ratio, for which g.g = 1 +g, hence 1/g = g −1. The first rule above (in its shifted form giving f(0) = 0, f(1) = 1) then implies 0 = k +h, 1 = k.(g −1/(−g)) = k.(2.g −1) = k.√5, whence k = 1/√5 = −h and
f(n) = (power(n, g) −power(−n, −g))/√5.
This even lets us extend f to values other than whole numbers by interpreting power(i, q) as exp(i.log(q)), albeit the fact that (1−√5)/2 < 0 leads to its non-whole powers being multivalued (because log(−1) is any odd multiple of π.√−1); so fib becomes a relation rather than a function, when extended beyond the integers.
If I'm happy with the precision of floating point arithmetic this lets me compute f very efficiently; however, floating point arithmetic is all about adequate approximations and tends to produce mildly embarrassing results like 4 × 0.035 = 0.14000000000000001 (when it's easy to see that the right answer is 0.14 exactly) – indeed, using it to evaluate the formula (using IEEE floating-point arithmetic, with the currently usual double-precision 64-bit floating point type, which has 53 bits of mantissa) gives me f(4) as 3.0000000000000004 and f(−4) as −3.0000000000000004, with similar errors for bigger values on each side of zero. Worse yet, if I express the second term as −(−1/g)**n rather than −(−g)**(−n) in defining the function, f(−2) comes out as −1.0000000000000002; little details in how the same function is coded can matter.
>>> g = .5 * (1 + 5**.5)
>>> g * (1 - g)
-1.0000000000000002
>>> def fib(i): return (g**i - (-g)**(-i)) / 5**.5
...
>>> def f(i): return (g**i - (-1/g)**i) / 5**.5
...
>>> fib(4)
3.0000000000000004
>>> f(-2)
-1.0000000000000002
>>> fib(-2)
-1.0
>>> fib(-3)
2.0
>>> fib(-4)
-3.0000000000000004
I can of course fix that by rounding to the nearest integer (the python function int() rounds its input towards zero – which is down, for our positive values – so adding .5 to the input rounds to nearest, give or take handling of exact halves), but even that'll break down by around f(79), the first Fibonacci number bigger than power(53, 2), at which point the errors are comparable to the differences between whole numbers. Indeed, the first error shows up at input 71, whose computed value isn't equal to the sum of its predecessors (and, in particular, comes out even, which should only happen at inputs that are multiples of 3):
>>> int(fib(69) + .5), int(fib(70) + .5), int(fib(71) + .5)
(117669030460994, 190392490709135, 308061521170130)
>>> int(fib(69) + .5) + int(fib(70) + .5)
308061521170129
The first rule gives f(1) and f(2) as whole numbers (they're both 1, which is as obviously whole as a number can be) and the second rule gives any other f value by adding or subtracting f values for inputs closer to 1 and 2 than the new one we're obtaining, so we can safely induce that all f values are whole numbers, with no fractional part; which makes it silly to compute them via floating point numbers (i.e. ones with a fractional part). That those numbers in fact represent (using dyadic rational approximations) irrationals doesn't make it any nicer. So I'll be looking at how we can compute f's values precisely and ignore the short-cut via this nice piece of theory. None the less, notice that the value of f(i) grows exponentially with i's magnitude.
Most modern programming languages (for these purposes, Fortran 77 isn't modern, although it does have more modern dialects) allow one to write functions which call themselves; this is called recursion. A language can be clever enough to dodge the problem I'm about to point out, but it's instructive to see why simple-minded languages run into problems with the obvious way to implement Fibonacci's sequence. We've seen how it was originally specified, so we can write ourselves a function:
def Fibspec(i):
    if i == 1 or i == 2: return 1
    if i > 1: return Fibspec(i - 2) + Fibspec(i - 1)
    return Fibspec(i + 2) - Fibspec(i + 1)
This simply states the rules given above: the first rule is the first line, f(1) = 1 = f(2) (the == operator tests for equality, as distinct from the = operator, assignment, which gives its left operand(s) the value(s) its right operand(s) previously had). The second line is just the second rule; and the third line is the reverse of it, needed to get answers for i < 1.
Unless the programming language you're using is very clever, it's going to do an awful lot of computation, if you use a definition like this. Let me explain why. First, observe that if I ask for Fibspec(1) or Fibspec(2), I get my answer back immediately: it's 1. Thus far, Fibspec is doing well. Next, if I ask for Fibspec(3) it'll test that 3 isn't 1 or 2, notice that 3 > 1 and proceed to work out (as its answer) Fibspec(1) +Fibspec(2) – each of which it works out directly; so (aside from some over-head, which I'll ignore, in how the language lets a function call itself recursively) it has to do one addition to get the answer 2. (It also has to do two subtractions; indeed, the number of subtractions involved in computing f(i), for i > 0, is exactly twice the number of additions; but this – along with the two recursive function calls per addition – just means the total work involved is a simple constant times the work for one addition; so let's stick to counting the additions.) If I ask for Fibspec(4), it'll likewise do one addition to combine Fibspec(2) with Fibspec(3); computing the latter costs it one addition, as before, so it costs two additions to get the answer Fibspec(4) = 3. Notice that the number of additions we've needed to do, each time, is just one less than the answer we obtained. For Fibspec(5), we work out the previous, costing us 2 +1 additions, and do one more addition, getting the answer 5 from 4 additions. Again, the number of additions is only one less than the answer.

In fact, if we can compute f(i) and f(i+1) at a cost of one addition less than their respective values, then computing f(i+2) costs us one addition to add them together plus f(i)−1 to compute f(i) and f(i+1)−1 to compute f(i+1); for a total of 1 +f(i)−1 +f(i+1)−1 = f(i) +f(i+1) −1 = f(i+2)−1. So the number of additions we have to do to compute Fibspec(i) is going to be only one less than the answer we get. It shouldn't be hard to see that something similar applies for i < 1; computing Fibspec(i) requires slightly more (rather than fewer) subtractions (plus twice as many additions and recursive function calls) than the magnitude of the answer, in this case. Given that this answer grows exponentially with the magnitude of i, this means that Fibspec is a horribly expensive way to compute the Fibonacci numbers.
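To check that claim, here's a quick instrumented variant (my own illustration, not part of the page's development) that returns the value along with the number of additions performed; the assertion verifies that computing f(i), for positive i, costs exactly f(i) −1 additions:

def fibcount(i):
    """Returns (f(i), number of additions), for i >= 1."""
    if i == 1 or i == 2: return 1, 0
    a, na = fibcount(i - 2)
    b, nb = fibcount(i - 1)
    return a + b, na + nb + 1 # one more addition to combine them

for i in range(1, 15):
    value, adds = fibcount(i)
    assert adds == value - 1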
So let's look for a better way to compute f(i). If you look at what we were doing before, you should be able to see that the real problem is that, when computing f(i+2), we first compute f(i), then compute f(i+1); but, in the course of doing the latter, we re-compute f(i). That's obviously dumb: if we'd just remembered it from before, we could save that extra computation. Since the same applies at each step back towards 1 and 2, we're clearly better off simply working forwards:
def Fibiter(i):
    if -1 < i < 1: return 0
    old, ans = 0, 1
    while i > 1:
        old, ans = ans, ans + old
        i = i - 1
    while i < -1:
        old, ans = ans, old - ans
        i = i + 1
    return ans
Note that python allows several operands on each side of assignment; old, ans = 0, 1 sets old to 0 and ans to 1; and old, ans = ans, ans + old computes the two values on the right before doing the assignments, so ans gets the prior value of ans + old and old gets the prior value of ans. Notice, also, that I've thrown in a piece of robustness for the computation: if i isn't a whole number, the function shall still return an answer (rather than, say, spinning forever in a loop that's been confused by an unexpected value); and that answer is what you would get by calling Fibiter with a whole number close to i.
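By way of illustration (an interpreter session; note the non-whole inputs):

>>> Fibiter(7)
13
>>> Fibiter(-10)
-55
>>> Fibiter(0.5)
0
>>> Fibiter(2.5)
2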
This time, we're just keeping track of two values – initially old = f(0) and ans = f(±1) – and working our way up or down to i one step at a time, doing one addition and one subtraction on each step, so it should be clear that the number of additions and subtractions we have to do is equal to the magnitude of i, give or take one. So, with a little easy thinking, we've now got a way to compute f(i) at cost proportional to i rather than proportional to f(i); given that the latter grows much faster than i, this is clearly an improvement. (Furthermore, although this needn't be true in all languages, calling functions (recursively or otherwise) is more expensive, in imperative languages, than iterating a loop; so not only do we make fewer steps of computation, but the cost of each step is reduced, too. We incur some small overhead in having two new local variables, so computing f(i) for a few small positive i will be minutely slower, but f(0) is now very cheap and all other f(i) shall be cheaper to compute.)
The interested reader would do well to think for a while, before reading on, about how it might be possible to do even better than this.
Some clever person noticed (many years ago) something marvellously subtle that we can exploit. However, before I can explain how that works, I need to introduce you to the efficient way to compute powers, given a multiplication. While you can compute a power by repeated multiplication
def powiter(n, x):
    if n < 0: return powiter(-n, 1./x)
    ans = 1
    while n > 0:
        ans = ans * x
        n = n - 1
    return ans
(in which the number of multiplications we do (and, incidentally, the number of subtractions we do) is equal to the power we're raising to), there's a much more efficient way to do it, in which the number of multiplications we do is just the number of binary digits it takes to write out n plus the number of non-zero digits among these (so at most twice the logarithm of n to base two). Once you understand that, I'll be ready to show you how to compute Fibonacci more efficiently.
Note that, if we're happy to use floating point arithmetic, we could also obtain power(n, x) as exp(n * log(x)), whose cost doesn't depend on n at all – although the costs of calling exp and log run to quite a lot of multiplications, divisions, additions and subtractions; so this only works out cheaper when n is quite big. So, for a (more-or-less) fixed large cost, we can compute power, hence f(i) via the golden ratio approach, at cost independent of i, provided we're happy with floating point. However, as noted above (where I used python's built-in power operator, g**n being what I write as power(n, g), and this built-in probably does indeed use exp and log in this way), I'm trying to avoid floating point – which, for large values of i, is apt to leave us having to round a big floating-point value to its nearest integer, which becomes imprecise once the output of f(i) is big enough that the floating point representation's mantissa has too few digits to represent it exactly.
def powbits(n, x):
    if n < 0: n, x = -n, 1./x
    ans = 1
    while n > 0:
        if n % 2: # n is odd
            ans = ans * x
            n = n - 1
        n = n // 2
        x = x * x
    return ans
I encourage the interested reader to think about why this is correct before reading on. The % operator computes a remainder: p − (p % q) is always a multiple of q; so n % 2 is one (which if considers to be true) when n is odd and zero (which if considers false) when n is even. The // operator is whole-number division, rounding down towards minus infinity (whatever the sign of either operand) to always produce a whole-number result. The two are related by a % b +b * (a // b) == a.
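For instance (an illustrative session; note that rounding down is not the same as rounding towards zero, for a negative numerator):

>>> 7 // 2, 7 % 2
(3, 1)
>>> -7 // 2, -7 % 2
(-4, 1)
>>> -7 % 2 + 2 * (-7 // 2)
-7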
We can use this with floating point arithmetic, just as we can for whole number arithmetic; and, for modest values of n, it'll be faster than exp(n * log(x)), though with the same precision issues as I note above. As we'll soon see, however, this isn't faster than the way we can compute f(n); only via exp and log can floating point win, and then only for large n, at which point the precision problems are apt to matter. So, for precise computations, what I'll show soon is in fact as fast as the floating point approach, except possibly for a bounded range of values of i that are large enough for exp(i * log(x)) to be cheaper than powbits(i, x) but small enough that f(i) is faithfully represented by a floating point mantissa.
What powbits does is to express n in binary and multiply ans by power(b, x) for each power of two, b, that's represented by a non-zero digit of n, when written in binary. Computing the power(b, x) is done by repeatedly squaring x. To help you understand how powbits works, consider this recursive form of it:
def powrecurse(n, x):
    if n < 0: return powrecurse(-n, 1./x)
    if n % 2: return x * powrecurse((n - 1) // 2, x * x)
    if n > 0: return powrecurse(n // 2, x * x)
    return 1
The first line is as before. If n is zero, we'll fall all the way through to the end and get 1, which is just power(n, x) with n = 0. If n is even and positive, we get to the last but one line: power(n, x) is, in this case, just power(n/2, x*x), so that's what we return. For positive odd n, n−1 is even, so power(n−1, x) is just power((n−1)/2, x*x) and multiplying it by x yields power(n, x). So I hope you can see that powrecurse does compute power(n, x).
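For instance, tracing the recursion for n = 13 (1101 in binary, so 13 = 8 +4 +1; a worked example of my own) shows each binary digit at work:

powrecurse(13, x)
  = x * powrecurse(6, power(2, x))
  = x * powrecurse(3, power(4, x))
  = x * power(4, x) * powrecurse(1, power(8, x))
  = x * power(4, x) * power(8, x) * powrecurse(0, power(16, x))
  = x * power(4, x) * power(8, x)
  = power(13, x)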
In a really clever programming language, powrecurse might even be compiled efficiently: but most imperative languages aren't that clever, so let's see why powbits is doing the same thing as powrecurse. Each time we go round the loop in powbits is equivalent to one layer of recursion in a call to powrecurse: the first time round the loop, we have our original x; if n is odd, we multiply ans by x and leave the subsequent iterations to multiply ans by power((n−1)/2, x*x); otherwise, we simply leave those subsequent iterations to multiply ans by power(n/2, x*x). By the time we've fallen off the end of the loop, we've worked out what we were originally asked for, but we've only gone round the loop once per binary digit of n (and we've done it without relying on our language to be clever enough to implement tail recursion with an accumulator). Before I move on, let me just throw in how I'd actually implement powbits in python:
from operator import mul, truediv

def powbits(n, x, v=1):
    """Returns v * x**n, for integer n; v defaults to 1."""
    if n < 0: n, join = -n, truediv
    else: join = mul
    while n > 0:
        n, b = divmod(n, 2)
        if b: v = join(v, x)
        x *= x
    return v
Since we're going to want a variable in which to accumulate the answer, incrementally multiplying by powers of x, and the caller not infrequently wants to multiply the answer by something, we may as well let the caller pass that to the function, in place of the 1 we'll use by default; hence parameter v replaces local variable ans. This also makes powbits usable when x isn't a plain number, e.g. when it's a matrix, and might not support multiplication by plain numeric 1, requiring instead a suitable unit of its own kind, that we can pass in as v. In support of that, the negative power case is here handled by dividing v by x instead of by replacing x by its inverse, mediated by over-riding the mul function – the operator module's mul(v, x) just returns v * x but we replace this with truediv(v, x) = v / x for negative n (and truediv uses the division for which 1/2 is 0.5 instead of rounding to 0 because 1 and 2 were given as integers). Furthermore, this even lets us use powbits(n, p, q) with negative n to divide q repeatedly by p, even when p has no multiplicative inverse, provided q actually has p as a sufficiently repeated factor (and their type implements division to recognise that case; as, for example, my study.maths.polynomial.Polynomial does).
The function divmod (a built-in of the language) returns a pair of values, the quotient and remainder from performing a division; so n, b = divmod(n, 2) halves n, rounding down if n was odd, and sets b to the remainder it got when doing so. So this version does the same as the previous one, but is a little terser and more flexible … and that first line, in triple-double-quotes, is documentation – because real code should always be documented.
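By way of illustration (an interpreter session):

>>> divmod(13, 2)
(6, 1)
>>> powbits(10, 2)
1024
>>> powbits(-3, 2.)
0.125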
Implementations of power() commonly also accept an optional modulus parameter; when it is supplied, the arithmetic is performed modulo it. In such an implementation, it can arise that the repeated squaring of the value being raised to a power, x above, can become equal to the multiplicative identity of its type; in particular, when raising a whole number to a power modulo a prime, this is naturally prone to happening. So an implementation might well want to exit its loop early if that happens. In our case, however, the value we'll be taking a power of doesn't have any power that's an identity.
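Python's built-in pow() is such an implementation, accepting an optional third argument as the modulus; for example (by Fermat's little theorem, power(100, 3) is 1 modulo the prime 101, and 10**9 is a multiple of 100):

>>> pow(3, 10**9, 101)
1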
So now let's go back to looking at Fibonacci. In Fibiter, we have two variables, old and ans, that hold a pair of values of the Fibonacci sequence; at each iteration, in either loop, we compute new values for them from (only) their prior values. That computation is very simple: old gets ans's prior value and ans gets the sum or difference of their prior values. If we used a vector, with two entries, to hold these two variables, each step of the iteration would thus be performing a simple linear transformation on the vector: this can be expressed as the contraction of a matrix with the vector. Since each iteration contracts the same matrix with the vector, overall we're simply contracting power(i) of the appropriate matrix with the start vector, [0,1]. So the loops in Fibiter are like the loop in powiter, but applying a matrix to a vector instead of multiplying numbers; and the net effect is to compute a power of a matrix and contract it with the vector. If we can compute the power of the matrix in powbits's way, we can reduce the cost from of order the input to Fibiter to of order the number of bits needed to express it – i.e., its logarithm (to base two).
Now, the matrix for our forward iteration looks like F = [[0,1],[1,1]], with the second pair as the bottom row; contracting it with [old,ans], with ans below old, gives [ans,old+ans] as the new value for [old,ans]. When we multiply F by itself repeatedly, we get F·F = [[1,1],[1,2]], F·F·F = [[1,2],[2,3]], F·F·F·F = [[2,3],[3,5]]; in fact (as the interested reader should be able to verify easily) we get power(i, F) = [[f(i−1),f(i)],[f(i),f(i+1)]]; so we can compute f(i) as the second co-ordinate of power(i−1, F)·[0,1], for positive i. Indeed, python will actually let me define a class that implements matrix multiplication such that I can simply use powbits to implement this, with similar for F's inverse to handle negative i.
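For instance, a minimal sketch of that (my own illustration, handling only i ≥ 0; Mat2 is a hypothetical helper, not part of this page's code) which feeds a tiny matrix type to the powbits above:

class Mat2:
    """A minimal 2x2 matrix; just enough for powbits to multiply."""
    def __init__(self, a, b, c, d): # rows are (a, b) and (c, d)
        self.rows = (a, b, c, d)
    def __mul__(self, other):
        a, b, c, d = self.rows
        e, f, g, h = other.rows
        return Mat2(a * e + b * g, a * f + b * h,
                    c * e + d * g, c * f + d * h)

def fibmat(i):
    """Returns f(i), for i >= 0, as the top right entry of power(i, F)."""
    F = Mat2(0, 1, 1, 1)
    return powbits(i, F, Mat2(1, 0, 0, 1)).rows[1] # pass the identity as v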
This sort of transformation, representing a function's internal state by a vector and each step of an iteration by a linear map applied to that vector, so as to replace the iteration by some more efficient computation derived from the linear map, can be applied in reasonably many contexts, making it worth checking one's code for opportunities to exploit such a transformation. Notice, however, that (in the present example) power(i, F) contains some redundancy: two of its entries are equal and one of the others is the sum of the last with one of the equal ones. We can surely do (a bit) better.
Remember earlier, where we used a solution to g.g = g +1 to obtain solutions of form f(n) = k.power(n, g) +h.power(−n, −g); so let's look at polynomials modulo this equation, i.e. with square = power(2) = (: x.x ←x :) treated as if it were equal to (: x +1 ←x :). This reduces the space of polynomials to a two-dimensional space; each so-reduced polynomial is simply (: A.x +B ←x :) for some A, B. If we multiply it by power(1) = (: x ←x :), it becomes (: A.x.x +B.x = (A +B).x +A ←x :), so the pair of coefficients is subjected to exactly the transformation Fibiter applied to ans and old, with ans as A and old as B. Thus power(0) = (: 1 ←x :) represents a start-state, slightly different from that of Fibiter, with old = 1 and ans = 0; and each successive power(i, x) reduces to f(i).x +f(i−1). Thus taking powers of x in this space of reduced polynomials is equivalent to computing Fibonacci's sequence.
The reduced space of polynomials represents our iteration in a way I can easily motivate from the polynomial that appears in the analytic treatment of the sequence; that works generally for similar iterations. However, we're lucky in the details of how we use it to encode our iteration. In other cases, such as the Perrin iterator, the corresponding analysis can require more thought; none the less, the essential treatment of the problem is the same. Reduce polynomials modulo the one that mimics our iteration, then look at the resulting multiplication to find a way to represent tuples of iterated values and a polynomial to multiply by to implement the iteration.
When we multiply two of our reduced polynomials, the cross-terms in x.x reduce away; (A.x +B).(C.x +D) = A.C.x.x +(B.C +A.D).x +B.D = (B.C +A.D +A.C).x +(A.C +B.D). The code doesn't actually need to know we're working with polynomials; it just sees us working with pairs of numbers and using an eccentric multiplication on such pairs: [A, B].[C, D] = [A.D +A.C +B.C, A.C +B.D]. The pairs [0, 1] and [1, 0] represent power(0) and power(1), respectively; the former is our multiplicative identity and the successive powers of the latter shall give us Fibonacci's sequence. We just need to code the equivalent of powbits for the multiplication of our reduced polynomials. So first we define (using python tuples (a, b) rather than lists [A, B], simply because the language provides convenient support for these):
def fibtimes(p, q):
    (a, b), (c, d) = p, q # unpack as pairs
    return a * (c + d) + b * c, a * c + b * d
which is just our peculiar multiplication. (We could, equally, subclass the built-in tuple type in python, of which (a, b) and (c, d) are instances, and define the methods on it to support this as multiplication in the ordinary way, rather than having to call fibtimes explicitly; we could then use powbits, above, rather than a custom implementation of power.) Then we implement the power operation for this multiplication, restricted to only computing powers of (1, 0) (i.e. power(1) or x ←x), as
def Fibpow(i):
    (a, b) = (1, 0) # x
    (c, d) = (0, 1) # ans
    while i > 0:
        i, n = divmod(i, 2)
        if n: c, d = fibtimes((a, b), (c, d))
        (a, b) = fibtimes((a, b), (a, b))
    return (c, d) # x**i
We can actually do a little better than computing Fibpow, extracting c from the pair it returns and munging to handle sign; since we only need one part of its return in any case, we can simply return that; and we can handle negative exponents by exploiting a multiplicative inverse for (: x ←x :), namely (: x−1 ←x :), since x.(x−1) = x.x −x, which is equivalent to x+1 −x = 1. So we can use [1, −1] as (: 1/x ←x :), in place of [1, 0] as (: x ←x :), and flip the sign of i, if negative. At the same time, we can avoid squaring (a, b) the last time round the loop (where its value shalln't actually be used again) by moving the rest of the loop after it and unwinding a copy of it out into the preamble, where we can simplify it:
def Fibfast(i):
    if i < 0: a, b, i = 1, -1, -i
    else: a, b = 1, 0
    i, n = divmod(i, 2)
    if n: c, d = a, b
    else: c, d = 0, 1
    while i > 0:
        a, b = fibtimes((a, b), (a, b))
        i, n = divmod(i, 2)
        if n: c, d = fibtimes((a, b), (c, d))
    return c
(Alternatively, the unrolled first iteration could initialize slightly differently and record which of c, d to return at the end – details I leave as an exercise for the interested reader.) Since we've now reduced our computation to an analogue of power, the amount of computation needed to compute f(i) only grows, with i, as fast as the number of digits it takes to write i out in binary (i.e. the logarithm of i to base two), which grows much less rapidly than i. We are doing rather more computation at each of these fewer steps (rather than just one addition and one subtraction, we're doing: either four multiplications, three additions and some easy binary stuff (taking remainder by two and dividing by two is really cheap); or eight multiplications, six additions and the cheap binary stuff), so the Fibiter solution is faster for small values of i; but the Fibfast solution wins for larger values of i.
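To get a feel for where the cross-over falls, here's a rough benchmark sketch (my own, not from this page; the numbers shall vary with machine and python version):

from timeit import timeit
for i in (8, 64, 512, 4096):
    print(i,
          timeit(lambda: Fibiter(i), number=100),
          timeit(lambda: Fibfast(i), number=100))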
If you're always computing answers for small values of i, and doing so often enough that you care about performance, Fibiter is easily beaten by an implementation which remembers the answers it's worked out in the past; which can be just as efficiently implemented, using a variant of Fibspec which caches its answers, as by using Fibiter. In python, that would be written:
def Fibcache(i, cache=[0, 1]):
    if i < 0:
        return Fibcache(-i) if i % 2 else -Fibcache(-i)
    try: ans = cache[i]
    except IndexError:
        assert i > 1, "cache started with two entries"
        ans = Fibcache(i - 2) + Fibcache(i - 1)
        assert len(cache) == i
        cache.append(ans) # ans is now cache[i]
    return ans
(One could also choose to limit the cache size; before the try stanza, check for i > BIG, for some suitably big limit on cache size, and fall back on Fibfast(i), optionally optimising to exploit the knowledge i > 0.) Aside from the special handling of negative i, some of the cleverer (usually functional) programming languages might even compile their equivalents of Fibspec to something along these lines. On the other hand, if you were only interested in f(i) for a sparse few large i, Fibfast wins (strongly); and it'd be asking a lot to expect the compiler to spot that it can re-arrange Fibspec, or anything even remotely resembling it, into that. In fact, if your compiler is that clever, you should probably (politely) invite it to take the Turing test.
One can, of course, do the same in any half-way decent programming language, so here's an illustration in C++:
#include <stdint.h>
#include <utility>

int64_t Fibspec(int8_t i)
{
    if (i < 0) return Fibspec(i + 2) - Fibspec(i + 1);
    if (i > 1) return Fibspec(i - 2) + Fibspec(i - 1);
    return i;
}

int64_t Fibiter(int8_t i)
{
    if (i == 0) return i;
    int64_t old = 0, ans = 1;
    while (i > 1) {
        old = std::exchange(ans, old + ans);
        --i;
    }
    while (i < -1) {
        old = std::exchange(ans, old - ans);
        ++i;
    }
    return ans;
}

template <typename F = double, typename C = unsigned>
F powiter(C n, F x)
{
    if (n < 0) return powiter(-n, F(1) / x);
    F ans = 1;
    while (n > 0) {
        ans *= x;
        n -= 1;
    }
    return ans;
}

template <typename F = double, typename C = unsigned>
F powbits(C n, F x, F v = F(1))
{
    if (n < 0) return powbits(-n, F(1) / x, v);
    while (n > 0) {
        if (n & 1) v *= x;
        n >>= 1;
        x *= x;
    }
    return v;
}

template <typename F = double, typename C = unsigned>
F powrecurse(C n, F x, F v = F(1))
{
    if (n < 0) return powrecurse(-n, F(1)/x, v);
    if (n > 0) return powrecurse(n >> 1, x * x, n & 1 ? x * v : v);
    return v;
}
Note that fib(93) will overflow int64_t, so even int8_t suffices to represent all the inputs we can sensibly support, in the absence of big-number support such as python provides by default. The pair-arithmetic begs to be rewritten in class-structured form as:
class FibPair
{
    int64_t lin, con;
    FibPair(int64_t n, int64_t c) : lin(n), con(c) {}
    FibPair(int64_t c) : lin(0), con(c) {}
    FibPair(const FibPair &) = default;
    FibPair &operator=(const FibPair &) = default;

    FibPair operator*(FibPair other) const
    {
        return FibPair(lin * (other.lin + other.con) + con * other.lin,
                       lin * other.lin + con * other.con);
    }

    static FibPair power(unsigned n, FibPair x, FibPair v)
    {
        if (n == 0) return v;
        return power(n >> 1, x * x, n & 1 ? x * v : v);
    }

public:
    static int64_t FibFast(int8_t n)
    {
        // Use 1/x, i.e. FibPair(1, -1), in place of x for negative n:
        FibPair x = n < 0 ? FibPair(1, -1) : FibPair(1, 0);
        // power() consumes every bit of n itself, including the lowest,
        // so start v at the multiplicative identity, FibPair(0, 1):
        return power(n < 0 ? -n : n, x, FibPair(0, 1)).lin;
    }
};
providing FibPair::FibFast as the equivalent of Fibfast. It's also possible to write an equivalent of Fibcache, using a std::vector as cache in place of its list. Finally, here's an implementation of the floating-point solution, whose computational cost (nominally) doesn't increase with the input:
#include <math.h>

int64_t Fibfloat(int8_t n)
{
    constexpr double lnRoot5 = log(5) / 2;
    constexpr double lnGolden = log(.5 + sqrt(1.25));
    const double lnRaised = n * lnGolden;
    const double p = exp(lnRaised - lnRoot5);
    const double q = exp(-lnRaised - lnRoot5);
    return llround(n & 1 ? p + q : p - q);
}

Written by Eddy.