As I've discussed elsewhere,
I prefer
constructive reasoning over use
of *reductio ad absurdum*,
a.k.a. argument by contradiction. My discomfort about that mode of reasoning,
at its sharpest, centres on doubts about its trustworthiness as a way of
arriving at truth, which are intimately related to the fact that double negation
isn't an identity (although it may well be a projection; indeed triple negation
may be the same as single negation). It should thus be no surprise that, just
as our minds are poor at handling double negation (partly because some languages
use it as an emphatic negation, rather than an affirmation), the arguments that
flow from *reductio* are harder to follow than their constructive
counterparts, where the same result can be shown positively. Admittedly,
the *reductio* argument may play out faster than the constructive one,
which does aid cognition, so sometimes, particularly when short, reductio
arguments are clearer than their constructive counterparts, at least until you
think of a comparably terse one.

Pause to consider a scatter-plot with two dimensions; one is the complexity
of a constructive proof of a result, the other the complexity of proof
when *reductio* is permitted; each point plotted is then a theorem, marked
at its values for the two parameters of it we're using as dimensions. There may
be simple results that flow so directly constructively that the tersest argument
open to those willing to use *reductio* doesn't actually exercise it; they
are, after all, allowed to use constructive proofs. Such theorems thus lie on
the diagonal line, in our scatter-plot, along which the two co-ordinates are
equal; and all points on our plot, that don't lie on the diagonal, are on the
same side of it, namely towards the constructive proof is more complex

side of the diagonal. So reparameterise our space by one co-ordinate along the
diagonal (roughly the average of the two originals) and another (roughly the
half difference of the originals) to the more constructive is complex

side of it.

We can, indeed, include each theorem at a diversity of
points, one for each pair of proofs, allowing or fobidding *reductio*, of
that result. Since we can chose independently between those, this will actually
mark each theorem at a lattice of points, the Cartesian product of two sets of
values, the sets of complexities of known proofs, respectively allowing or
forbidding *reductio*, of the given result. Of these two sets, the
constructive one is strictly contained in the other; so the least upper bound of
the constructive set is necessarily higher than that of the other; so there is a
pair of proofs that give our result a point, on the the diagonal or to its
constructive-is-complex side, that all other proofs are no less complex than in
either set. (I'm here quietly ignoring that our complexity measure might be
expressed in terms of our conceptual vocabulary; *reductio*'s is richer,
i.e. more complex, so may be penalised by it. This would give a constructive
proof a higher complexity score when considered as a potentially-*reductio*
proof; but I simplistically suppose a reparameterisation of the axes would get
you back to the picture I describe.) Our whole scatter plot is limited to the
region with both original parameters positive; the region in which there are
valid points representing a given theorem is the image of translating this
positive quadrant to put its zero corner at one point of the lattice; as such,
this point is the only interesting one on our plot, so it's the one we want in
order to faithfully represent that theorem.

However, we do sometimes think up new proofs of a result, that give it a new
lower bound in one of the two original parameters, so it's worth remembering
that the point we're actually using to represent our theorem, in a practical
realisation of the plot, might not actually be the ideal one that we can imagine
we would have, if the whole space of possible proofs were trawled
systematicslly; any point in the plot might get moved. The true representation
of the theorem is a pair of sets of co-ordinates, one of which subsumes the
other; for some theorems, the constructive sub-set is actually empty, so this
representation doesn't get to put any points on the scatter plot; but, for
theorems which are amenable to constructive proof, the pair of sets of
complexity values gives rise to a lattice in the plot that we can thus
characterise by a bottom left

corner point. We have a finite sample of
the set of possible proofs, among which we've tended to look for simple ones, so
we reasonably hope the best we've come up with give a bottom left corner
tolerably close to where the ideal point for the given theorem belongs. At the
very least, it seems reasonable to look at the scatter plot as a whole –
even allowing that many of the points on it may be a bit further from the
positive-co-ordinate axes than they should be – and suppose it is a
somewhat fair guide to the shape of the plot that the ideal point for each
theorem would give us; its points may each be shifted a little in
some constructive ≥

direction,
but we can assume this only slightly changes the shape of the cloud in the
scatter plot. *reductio* > zero complexity

The point we're using to represent each theorem's lattice is, in any case,
the one with lowest value for the average of the two original complexity
co-ordinates, within the lattice; selecting this point constrains the
half-difference co-ordinate, even though the lattice has other points (with
higher average co-ordinate) at half-difference values from zero up to at least
the value at this point (and typically well beyond that). The set of points to
which we can possibly ever revise our point for a given theorem lies in a
rectangle whose top right corner is the point we arere using to represent it and
whose bottom left corner is thea ideal point by which we might wish we could
represent it. The hope is that, for most points, this rectangle is small, so
the disturbance to the observed distribution, that we would get by correcting to
use ideal representations for all theorems, should not be too major. One
theorem is more constructive-is-complex

than another if the former's
simplest proof allowing *reductio* beats its simplest constructive proof by
a wider margin than arises for the latter.

For some simple results, *reductio* provides a very quick way to
convince yourself of a result, that takes significantly more labour to prove
without it. Some results require a more complex argument even
when *reductio* is used; we not uncommonly need more complex constructive
arguments to prove these. Sometimes, the complexity of uses of *reductio*
can unravel to expose a constructive proof with significantly
reduced constructive-is-complex

score; for example, when one proves a
positive directly by a simple transformation of the argument that
the *reductio* argument used to prove some double-negative. Otherwise,
among theorems with a constructive proof (which some results provable
with *reductio* indeed won't have; they're the ones constructivists
forswear and don't appear in our scatter-plot), the saving in complexity
that *reductio* can give (i.e. the half-difference parameter) don't grow as
fast as the average; i.e. the scatter-plot lies mostly close to its diagonal
bound.

Now, the high cognitive cost of double negatives and, kindred to it, of
understanding *reductio* proofs, though possibly ignored by our complexity
measure, grows with the extent to which *reductio* is used in the proof; so
a terser proof using *reductio* in a complex way may carry a higher
cognitive load than a carefully-chosen longer constructive argument. So if we
were to change from some information-theoretic complexity measure on our proofs
(as I've tacitly assumed above) to some measure of cognitive load, I am fairly
confident the scatter-plot moves closer to the diagonal. Which is all to say:
constructive arguments can be easier to understand and, even when
using *reductio*, limiting use of it helps treduce cognitive load.

I happened on a set
of logic
puzzles in which I see the given answers using *reductio* and can see
ways to do without it. So let's compare, and see how much *reductio* saves
us in brevity (a tolerable surrogate for simplicity) and think about how its use
affects cognitive load. First, though, I encourage you to follow that link and
enjoy some better writing than mine; I'm going to select my examples,
illustrating how to avoid using *reductio*, rather than dealing with every
example.

The puzzles involve a population partitioned into two sets;
for each set, its members make true statements to one another but false
statements to members of the other set. Calling the latter lying is in some
sense meaningless, because they all have perfect knowledge of who belongs in
which set, so what a member of the other set says to you, you can simply invert
to get a true statement; and the speaker thereby conveys a truth to the listener
(albeit by speaking its opposite). This isn't really lying

(the speaker
can't plausibly expect to deceive the listener); furthermore, neither party
gains any advantage over the other by it (precisely because they never succeed
in deceiving one another). True liars tell the truth often enough that you
believe them even when they don't. However, the formalism here is one in which
all parties have perfect knowledge of the subject matter of what any of them
say, so none of their utterances serve any purpose among themselves – the
only effect is upon the outsider, looking in, that we are invited to be as we
study the puzzles. The description in terms of opposing groups lying to one
another is merely a motivation for considering a formal system of rules and
exploring its consequences.

The first two explore limits on the system; an utterance from which we can
infer nothing; and a utterance that can't arise (i.e. there is no way, within
the formal system, for the situation described to arise). An
typical *reductio* argument leads, from one branch of a choice, into such
an impossible situations so as to eliminate that branch of the choice; by a
process of elimination, this is used to argue for believing a surviving branch,
from which to arrive constructively at some conclusion. The next two examples
show (constructively) how one can infer a simple fact from a simple utterance
(and the inferred fact, though not the statement uttered, is derived from it in
a straightforward way). Each does it by showing that both branches of a choice
lead to the same conclusion, so the conclusion must be true. Example 4 shows
that if one claims to be of a given party, the person to whom the claim is made
is in fact of that party.

We now come to example 5: one says to
another at least one of us is a democrat

, from which we can infer that
they're both democrats. The given proof partitions the possible affiliations of
the two parties into both republican, one of each or both democrat; it shows how
each of the first two choices leads to the given utterance not being spoken, so
discards them and leaves us with the third; this is a classic *reductio*
proof. So instead look at a different way of reasoning from a simplification of
the same fork: either the two are of one party, in which case the utterance is
true so one of them is a democrat, or the two are of different parties, in which
case (without looking at all at the utterance) one of them is a democrat. This
is a constructive step (both sides of a branch lead to a single conclusion); and
its conclusion is the utterance, which we thus know to be true, so we infer that
the two are of the same party, with one a democrat, hence both are democrats.
The fact that this second step combines with one branch of the first step to
produce a contradiction is ignored in this argument; we constructively infer an
intermediate fact, from which we then infer our conclusion. It tells us that
one of the branches we proved, in considering it, doesn't actually arise; but
we don't need that fact to reason from the branch to the conclusion. So now
compare the two arguments:

- If both were Republicans, they wouldn't say this, because it's false; and, if one were from each party, they wouldn't say this, because it's true! So they must both be Democrats.
- If they're of different parties, one of them is a democrat; otherwise, the utterance is true, i.e. one of them is a democrat. Either way, as one of them is a democrat, the utterance is true, so they're of the same party and the other one is also a democrat.

It takes a while (at least for one raised on *reductio*) to get
used to ignoring that the conclusion makes part of the reasoning by which we
reached it appear redundant; but it does so after the fact and needed that
reasoning to reach the fact that makes that reasoning appear redundant.
Exploring the fact that some false statement leads to a conclusion doesn't harm
the truth of that conclusion; what matters is that some true fact does lead to
the conclusion; and if, until we have the conclusion, we don't know some
statement is false, we may need to reason from that statement in order to
establish our conclusion. Anyway: constructive 50 words, *reductio* 31,
saving of 19, which is 38%. Form your own opinions about the cognitive
load.

Example 6 has one telling another At most one of us is a Democrat

;
this is equivalent to At least one of us is a Republican

, so we can apply
the preceding result (with the two names swapped) to conclude that both are in
fact Republicans. So much for simple two-party puzzles.

In example 7 A tells B C is from my
party

, B tells C A is not from my party

and C tells A I am a
Democrat

. Example 4 has shown that the last tells us that A is a Democrat.
The source uses one *reductio* step and one constructive step to conclude
that the other two are Republicans; contrast that with

- B's utterance, given A is a Democrat, amounts to saying B is a Republican; as this is addressed to C, we conclude C is a Republican (by example 4's reasoning, again).
- We thus see that A's utterance is false, so was addressed to someone not of A's party, so B is also a Republican (and told the truth to C).

In example 8, A tells B C is a
Democrat

, C tells A You, B and I are not all Democrats

. The given
reasoning (with a typo in its first inference, but a correct conclusion) uses
two *reductio* steps: compare to

- If C's utterance is false, A and C are of the same party; otherwise, C telling A a truth means they are of the same party. Either way, they're of the same party, so the utterance is true and someone is a Republican.
- As they're not all Democrats, C (and thus A) being a Democrat would require B to be a Republican, obliging A to lie to B; and C being a Republican would make A's utterance to B a lie; either way, we infer that B is of the other party to A and C.
- Thus A's utterance is a lie, A and C are Republicans and B is a Democrat.

In example 9, each of A through X tells the
next (in alphabetic order) Z is from my party

, ending with X telling Y
this; Y tells Z A is not from my party

and Z tells A I am a
Democrat

. Now, Z is from my party

is equivalent to I am from Z's
party

so will only be spoken *to* someone of Z's party (by example
4's reasoning); thus B through Z are all of the same party, Y tells the truth to
Z, so A is of the other party. Z's claim to A tells us A is a Democrat, so all
the others are Republicans (and, indeed, tell the truth to each other, while Z
lies to A, as A lied to B).

In example ten the parties are labelled by
numbers instead of letters: each labelled by an odd number less than some
natural N says to the next-numbered party (with an even number) N is from my
party

; while every party labelled by an even number less than N says to the
next-numbered party (with an odd number > 1) N is

; and N tells 1 *not* from my
partyBetween you and the person labelled 2, at least one
of you is a Democrat

. Now the first two utterances are equivalent to I'm
of N's party

and I'm not of N's party

and (by example 4's logic) we
conclude that the thing it claims about I

is in fact true of whoever it's
addressed to. So the even numbers up to (and possibly including) N are of N's
party and the odd numbers greater than 1 (and also possibly including N) are not
of N's party; since N necessarily is of N's party, it's actually not one of the
latter, so N is even. N's statement to 1 is now equivalent to at least one
of you and I is a Democrat

, which is example 5; so the parties with odd
numbers greater than 1 are Republican, while 1 and the evens, including N, are
Democrats.

Now, I'm not sure how solidly constructive my arguments are, but they do at
least avoid overt use of *reductio*. They replace it with reasoning from
both branches of an or

to a common result, from which we then reach our
conclusions. The validity of this depends on the or

being true; that is,
starting from an A or B

that's true. The habits of *reductio* take
every statement A to nave a negation, ~A, for which A or ~A

is presumed
true (this assumes A is *decidable*) and A and ~A

is presumed
false (i.e. A is not a witness to our system being *inconsistent*).
Thus, with those habits, one reasons from either of these assumed statements;
but one may also reason from some other or

statements given by one's
system, for example every islander is either a Democrat or a Republican

;
and we may need to supplement this with some other statements given by the
system, such as no islander is both a Republican and a Democrat

; albeit
we might restate these two as the islanders are partitioned into two disjoint
sets, one of Republicans, the other of Democrats.

Either way, as long as
the system gives us an or

statement, anything we can infer from both
branches of it must be true.

… all of which lead to me thinking about how to describe logic in
terms of relations; specifically, indeed, the relation implies

among
statements in a system of formal logic. I eventually moved that discussion
to a page of its own, since it's not
so much a thought of the moment as an idea I've been mulling
for some time..