Conversation with my father in 2001/November got me thinking about the halting problem ...
A close relative of Gödel's theorem is
known as the halting problem
. Turing explored what, in theory, a
computer is capable of: he did so before anyone the era of electronic computers,
but his work remains decisive in describing the ones we have (albeit one
generally works with von Neumann's adaptation of Turing's ideas).
One of the problems Turing pointed out was that, even if you know a certain
computation would, upon termination, produce the correct answer to some problem:
you can't be sure whether the computation will ever terminate. It's all very
well having a computer program to compute (say) the meaning of life, the
universe and everything - but if it's never going to finish running, it's no use
to you. So Turing wondered whether there might be some way of writing a program
which would tell you whether a given computation would terminate - or
halt
, as the jargon has it.
The crucial issue here is to determine when to be patient - the computation
just takes a very long time to run - and when not to bother - it's never going
to halt. Of course, our program, H, to determine whether a computation will
halt needs to halt in order to tell us a result. We shalln't mind (much) if H
keeps running for ever when the computation it's studying does halt -
we can always run the two in parallel and interrupt H in the event of P halting
- what we really want is to be sure that, if P won't ever halt, then H will tell
us so (and, in particular, halt in order to do so). Turing proved that the
halting problem cannot be solved: he did so by a powerful and elegant use of a
diagonal argument
, which I outline below.
A computation
in the sense above comprises a program and the input
data it is given when run. In some cases, one can determine that a program will
halt regardless of its data; however, in general, one cannot determine whether
running a program will take forever unless one knows what data the program will
be given. So the inputs to our program to determine whether a computation
terminates will be representations of the computation's program and data. Now,
we can reduce both the data and the program (via some choice of encoding of the
text making up each, in whatever character set one used to encode each) to a
sequence of bits, and we can always interpret a sequence of bits as a natural
number
(i.e. 0, 1, 2, ... etc.; a.k.a. the counting numbers or finite
ordinals) by reading it as the binary representation of a number. Thus we can
encode the program as a natural number and each input as a natural number.
Now, let's introduce some notation. First, we can interpret any program as
a mapping from its inputs to its outputs. Thus, for a program P taking inputs
a,...,z, P(a,...,z) denotes the output P yields when given a,...,z as inputs.
I'll say that P(a,...,z) halts
as a short-hand for if P is run with
inputs a,...,z then it halts
.
We can encode any input as a natural (indeed, we could encode all the inputs to a given program as a single natural, but I shalln't do that universally - I want H taking two natural inputs), by encoding the relevant datum as a sequence of bits and reading this sequence as a natural. We don't need to represent this encoding for the case of data, but we'll need one special case of it - the encoding of a program as a natural - because we use that to prepare H's first input. So I'll first introduce our encoding of programs: ({naturals}: E |{program texts}) is a mapping which encodes a program as a natural number. E is monic (no two programs encode as the same natural number - indeed, two programs will encode as different natural numbers even if they are functionally equivalent) so we can obtain a mapping ({programs}: C :{naturals}) = reverse(E) = (: t ← E(t) :), albeit C might not accept every natural as an input. Since, when a natural n is the encoding of some program, C(n) is that program, say one taking inputs a,...,z, so we can write an invocation of this program with these inputs as C(n,a,...,z).
I'll use functions defined in the programming language python as my programs
in illustrations
- I chose python because it makes fairly intuitive pseudo-code. The reader
should note that indentation matters in python: subordinate statements (e.g. of
an if... or while... clause) can either be put after their primary (the if or
while), on the same line, or appear as an indented block. A triple-quoted
string at the start of a function is documentation for the function; text from a
#
character to the line's end is a comment, ignored by the python
interpreter. For illustrative purposes, here's a potential implementation of E:
def E(text): """Convert a program to a natural. Uses python'slongintegers as an implementation of the naturals. """ result = 0L # long zero: forces this type on subsequent arithmetic # ord() builtin implements ASCII encoding of text; output is between 0 and 256 for ch in text: result = result * 256 + ord(ch) return result
Next, let S = (| E :{programs taking one natural input}). Note that, by
encoding all the inputs as a single natural, every program can be encoded as a
member of S (not that this will affect our argument). For simplicity, instead
of the utterly general halting deteminer originally requested, let us merely ask
for its restriction to programs taking one natural argument: a program H = (:
({yes,no}: H(s,n) ←n |{naturals}) ←s |S) for which (when it halts)
H(s,n) tells us whether C(s,n) halts. For convenience, suppose that if C(s)
doesn't accept input n, or if s isn't a member of S, then the computation
denoted by C(s,n) aborts
, i.e. halts with some error
indicator
set, in which case H(s,n) either returns yes or doesn't halt.
As discussed above, the important property we want from H is its ability to say no: this is undamaged by packaging H inside a wrapper which only terminates if C doesn't:
def A(s, m): """Wrapper for H: at most one of A(s,m) and C(s,m) halts.""" # H(s,m) might not halt, in which case nor does this: if H(s,m) is 'yes': # C(s,m) halts while 1 > 0: pass # infinite loop return 'no' # C(s,m) doesn't halt
This gives us a program A which takes two naturals: if A(s,n)
halts, it says no
and C(s,n) doesn't halt. Furthermore, if C(s,n) does
halt, A(s,n) doesn't (because H(s,n) either doesn't halt or, on halting, returns
yes, so that A goes into its infinite loop). Next, we can consider the program
def B(n): """The Diagonal of A""" return A(n,n)
which gives us a program B which takes one natural input, whence j = E(B) is in S; i.e. B is C(j) for some j in S; so now ask whether A(j,j) halts. Clearly, C(j,j) is B(j) is A(j,j) - and they do exactly the same computations, give or take packaging which takes a finite amount of computation. However, if either C(n,m) or A(n,m) halts, the other doesn't (from A's construction) so at least one of C(j,j) and A(j,j) doesn't halt; but they're the same computation, so if one doesn't halt, nor does the other.
What we've now proved is that, whatever program we write to determine whether computations will complete, (either it'll be wrong some of the time or) there will be some computations which never finish and for which our program, attempting to determine whether the computation will finish, also never finishes. This means it's impossible for a Turing complete computer to guarantee that it'll never get stuck in a non-terminating computation.
The proof formally depends on your programming language being Turing
complete
- i.e. capable of expressing all mathematics
- though, in
practice, it didn't need the language to be capable of much. There are various
ways out of the trap - but they all involve giving up a fair amount of freedom
as to what one can write a computer program to do - in short, they require you
to work with a programming language which isn't Turing complete. There's also
various ways one can avoid the trap altogether - most obviously, by writing a
program which will usually
tell you whether some computation will
terminate, but admits that it can't tell in few enough cases that you don't care
- this is a useful tool, for all that it fails to live up to its Platonic
ideal.
My friend Conor is at work on a little language (with strong type checking,
among other properties) which isn't Turing complete and in which you can write a
halting-determiner (indeed, IIRC, you can't compile a program unless it halts).
He then embeds that in a Turing complete language whose type system is encoded
in the little language (and, indeed, any program in the little language can be
used in specifying a type for a program in the little language, hence in
either). This would encourage one to write as much of any program as possible
in the little language, only using the Turing completion extensions where one
must, which will probably make it much easier to write a useful enough
termination detector even for the Turing complete case.