]>
A binary operator
is
a denotational form used when combining two
values, particularly when the mode of combination is applicable to many more
values. The form interposes a symbol, representing the binary operator, between
two expressions denoting values; the combined text then denotes a value that
results from combining the given values. As such, it is an expression denoting
a value, so it can be combined by another binary operator with another
expression denoting a value, and so on. It is binary
in the sense that
it combines two values at a time; but, by repeated application, it can
thus combine many. The values combined by a binary
operator are known as
its operands
.
Context, in giving meaning to a symbol as a binary operator, must indicate what values it is capable of combining and how it combines them. Where more than two operands are so combined, context may specify the order of evaluation or establish that it makes no difference, in order to resolve the ambiguity that might otherwise arise. I allow that a binary operator may be ambiguous (for some a, b there may be several values that a*b can denote) but most are in fact unambiguous, at least modulo context's relevant equivalence construed as equality. Any unresolved ambiguity in evaluation order may also contribute ambiguity, independent of any inherent in the binary operators themselves, when many values are combined.
Two primary classes of binary operator shall emerge: those I'll
term categoric
(e.g. proper composition
of homomorphisms) may be fussy about their operands, restricting which they can
combine in what order; while the more classical binary operator
(e.g. arithmetic) has a set of operands and freely combines any operand with any
operand, in either order (albeit the result of combining them may depend on
order); I call such binary operators flat
to distinguish them from
categoric ones. In either case, it is usual for the composite values that arise
from combining two operands to be, themselves, operands; and for the binary
operator to introduce no ambiguity when combining several values. I'll now
formalise these two properties, before going into the details of the two classes
of binary operators.
Given a binary operator * and a collection V of its operands, I'll describe
V as closed under
* precisely if V subsumes {a*b: a, b in
V}. I'll describe a binary operator as, simply, closed
precisely if its {operands} is closed under it.
An intersection of collections, each of which is closed under *, is itself
necessarily closed under * since any a, b in it, that * knows how to combine,
are necessarily in each of the collections intersected, each of which is closed
so also contains a*b, which is thus a member of the intersection. For any
collection V of operands of *, V's closure under
* is the
intersection of all collections, closed under *, that subsume V; this is a
minimal collection of *'s operands that does subsume V, provided * itself is
closed (to avoid intersect(empty) as the closure).
One of the ways we can represent a binary operator * is as the embedding (: (: b*a ←a :) ←b :) which represents each left operand by a relation between values and right operands. When this embedding is iso (it is always a mapping; so it is enough to show it monic) we can use it to reason about the action of our binary operator in terms of the composition on relations. When * is closed on V, the embedding is a mapping ({relations (V: :V)}: |V); given that {relations (V: :V)} is closed under composition, it is not unreasonable to hope that our embedding's collection of left values is likewise closed under composition. When it is, if the embedding is also monic and the binary operator associative (see below), the embedding serves as a faithful representation of our binary operator.
For any collection U of operands of a binary operator *, we
can restrict the associated embedding and its left values on the right with U to
get (: (: a*b ←b :U) ←a |U) and thereby obtain the binary operator on
U whose associated embedding this is; I shall refer to this as
*'s restriction
to U. (In contrast, the normal
restriction of relations applied to *'s associated embedding gives (: (: a*b
←b :) ←a |U), which doesn't restrict its left values; the binary
operator associated with this mapping combines a in U with an arbitrary right
operand b of *, rather than also restricting to right operands b in U. For this
reason, I do not treat a binary operator as definitively synonymous with its
associated embedding; they support distinct meanings of restrict
.) Note
that this does not restrict the left values of the embedding's left values, so
every a*b with a, b in U is valid for this restricted binary operator, even if
a*b isn't in U; thus *'s restriction to U is closed (as a binary operator)
precisely if U is closed under * (hence also under its restriction to U). When
U is not closed under *, restricting (: (U: a*b ←b :U) ←a :U) would
give a closed binary operator; but some a*b with a, b in U would be valid for *
but (due to a*b not being in U) not valid for this restriction.
Distinguishing the binary operator from the mapping that
encodes it also allows me to define a distinct meaning
of subsume
for a binary operator. One binary operator, *,
subsumes another, @, precisely if: whenever @ knows how to combine a with b, so
does *; and every value @ admits for a@b is a value * admits for a*b. For
unambiguous *, @ is necessarily unambiguous also; if @ is flat, in this case, it
is then simply a restriction of *. However, in general, @ may be pickier than *
about which operands it choses to combine. When * is unambiguous, restriction
and such pickiness are the only options; but for ambiguous * we can also simply
have @ combine a@b to produce only some of the values a*b can take.
I'll describe an expression that combines values using binary operators
as valid
precisely if each use of each binary operator
has, as its operands, values the given binary operator can combine; the binary
operator template actually makes the expression assert this anyway, but overtly
saying it's valid makes the sentences below read more clearly – a*b is
valid
is only slightly clearer than * does know how to combine a and
b
but I encourage the interested reader to think about how the specification
of associativity, that follows, would end up without the validity
short-hand.
As already discussed, combining several values using binary operators has
the potential to produce different answers depending on
the order of evaluation. When
combining values using only a single binary operator, that operator is described
as associative
precisely if the order of evaluation makes
no difference. For a binary operator * this requires that, for any a, b, c for
which either (a*b)*c or a*(b*c) is valid, both of the given expressions are
valid and they agree. This is
sufficient to ensure associativity (in the sense just specified), as every order
of evaluation, of a given sequence of operands alternating with * as operator,
can be transformed to every other by such re-arrangements of such three-way
combinations.
Given a sequence of n operands alternating with repeats of a single operator, let the number of possible orders of evaluation be p(n). We clearly have p(1) = 1 = p(2). We can infer p(1+n) from (:p:1+n) by considering the outermost parenthesisation (i.e. the top-most node in the parse tree): this necessarily splits the list into two non-empty parts, whose lengths sum to 1+n, and we can parenthesise each part in any way we like; taking the length of the (non-empty) left part as 1+i for some natural i, we must have 1+i < 1+n so that the right part is non-empty, hence: p(1+n) = sum(: p(1+i).p(n−i) ←i :n), at least for n > 1. The first few are then: 1, 1, 2, 5, 14, 42, 132, 429, … and, generally, p(1+n) = (2.n).(2.n −1)…(n+2)/n/…/2/1 = (2.n)!/n!/(n+1)!; but I'm not about to try to prove that here !
I'll also say that a binary operator * associates with
a binary operator @ precisely if, whenever both (a*b)@c and a*(b@c) are valid
they agree and likewise for (a@b)*c and a@(b*c). Note that this is a weaker
condition than associativity, as it only applies where the expressions are given
to both be valid; so a binary operator * can associate with itself, in this
sense, even if it isn't associative due to some (a*b)*c being valid where
a*(b*c) isn't (or vice versa). When * and @ associate with one another, we can
freely write a*b@c or a@b*c without parentheses as long as at least one of the
given readings is valid – although including parentheses may be a kindness
to the reader when only one reading is valid, to make clear which that
is.
Associativity allows us to combine arbitrarily long lists of operands without introducing any ambiguity (aside from any inherent in our binary operator itself), which makes it natural to consider the binary operator in terms of its bulk action.
Closure ensures that the result of combining operands is an operand, so can
be combined with other values, raising exactly the question that associativity
answers; and associativity is only relevant in so far as a result of combining
operands is an operand, so is at its most relevant when the binary operator is
indeed closed. So I'll refer to a closed associative binary operator as
a combiner
; most binary operators of interest either are
combiners or complement them in some way (as subtraction and division complement
addition or multiplication, undoing
their respective effects) that
enables us to infer their action on operands from actions of the combiner from
which they're derived.
Note that formal logic and functional programming also use the
word combiner
in a highly technical sense, for some specific operators
that may or may not combine values; here, however, I'm using the word in a sense
closer to its natural language meaning (that I note some Haskell programmers
also use, when talking about things that combine values). A combiner combines
things while abiding by certain rules that are commonly taken for granted; the
above merely formalises those assumptions to facilitate precise reasoning about
their consequences and their interaction with the effects of other assumptions
that we shall, at least in some cases, find it natural to make.
In a ski resort it is usual to have some lifts providing access to
maintained runs. Sections of lift start at the ends of runs or of other lifts;
each run typically starts at the end of some lift and ends at the start of
another. If a run passes a start or end of a lift, we can split the run in two
so that one part of it ends here and the other starts here; if a section of lift
ends at a place from which there are no maintained runs, but from which just one
other lift takes off, we can interpret the two as a single lift. A guide to the
resort can then describe the start and end points (some of which may be the
sites of other ameneties, such as restaurants) and characterise the routes, both
runs and lifts, between them. The skier can then compose a route out of the
runs and lifts thus described, by combining runs and lifts subject only to the
constraint that each route fragment starts where the previous left off. The
composition of routes around the resort can thus be construed as a
combiner, and then
, which only allows combination of a left operand
ending in a given place with a right operand starting in the same place; the
composite starts where the left operand started and ends where the right operand
ends, which constrains what we can combine it with. This is in stark contrast
with the addition and multiplication of
everyday arithmetic on numbers, which
combine any number with any number freely.
I'll describe a binary operator * as flat
on some
collection V precisely if a*b is valid for every a, b in V. Aside from category
theory, orthodoxy mostly focuses on flat combiners. On {relations} composition,
&on;, is indeed a flat combiner; but it has
restrictions that aren't flat. These last are combiners on
{relations}; each also has flat restrictions, e.g. clean composition is a flat
combiner on {relations (V:|V)} or on {mappings (V:|V)} for any collection V.
The familiar addition and multiplication of numbers are both flat. For the most
part, classical binary operators are flat combiners.
I'll describe a combiner * as categoric
precisely
if:
The first two conditions generalise requiring an identity
at
each start-point and each end-point, that would make each b = e*b = b*i for some
e and i, each of which serves as a surrogate for an object
at which b
starts or ends. The last, albeit painfully technical, expresses the idea that
there are end-points and the only constraint on whether * can combine values is
the need to have a common end-point in between the operands to be combined.
The composition of homomorphisms, based on clean composition of relations, serves as the archetype for category theory, which is the study of categoric binary operators.
In particular, the last categoric condition ensures that associativity does something useful for us. Suppose we have u, v, w for which u*v and v*w are both valid; the second condition above lets us write v = r*s for some r, s; this makes u*(r*s) and (r*s)*w valid, hence so are (u*r)*s and r*(s*w); applying the third condition to either of these and r*s then implies (u*r)*(s*w) is valid, for example b = u*r and a = r with c = s, d = (s*w) have a*c and b*c valid, so validity of a*d implies that of b*d; associativity then gives us u*r*s*w = u*v*w. Thus, whenever u*v is valid and v*w is valid, we also have u*(v*w) and (u*v)*w valid. This makes it meaningful to combine (arbitrarily long) lists of values, subject only to * knowing how to combine each pair of adjacent entries in the given order; this suffices to ensure * can combine the whole list; and associativity ensures the result of this bulk action doesn't depend on order of evaluation.
Although flat combiners commonly are categoric (only the second condition above has any possibility of failing), in the given sense, it is usually clearer to describe them in terms that ignore this; the study of categoric combiners is burdened with all the constraints on which values the operator can combine, which merely gets in the way of the study of a flat combiners, where these constraints are trivial.
A homomorphism between two structures is a relation between the values involved in each which expresses the structure of one within the other. Thus a homomorphism from one binary operator, * on V, to another, @ on U, is a relation (U: f |V) for which, in so far as * knows how to combine v, w in V to obtain some value x as v*w (allowing that * may be ambiguous, so there may be more than one such x): for every u that f relates to v, t that f relates to w and s that f relates to x, @ does know how to combine u with t to get s as u@t. When f is a mapping and the binary operators are unambiguous, this simply says f(v*w) = f(v)@f(w).
When * is ambiguous, a mapping f might collapse out that ambiguity, by mapping all the values each v*w can take to the same member of U; if it does not, @ must be ambiguous to match the ambiguity in * enough to let f(v)@f(w) take all of the values that f(v*w) can take. In contrast, @ may be ambiguous even if * isn't, or may be more ambiguous than the foregoing requires when * is, provided each f(v)@f(w) can take all the values f(v*w) can; in the presence of ambiguity, but with f still a mapping, we thus require {f(v)@f(w)} to subsume {f(v*w)}, but the expressions f(v)@f(w) and f(v*w) need not entirely agree.
Equally, when f is a relation but not a mapping, what's required is still just that {f(v)@f(w)} subsumes {f(v*w)}, i.e. every member of U realisable as f(v*w), via the ambiguities in * and in f for given v and w, must also be realisable as f(v)@f(w), via the ambiguities in @ and in f.
When * and @ are unambiguous, every mapping (U: |V) that a homomorphism f subsumes is constrained by the above to be a homomorphism, which makes it more interesting to simply study the homomorphic mappings. Building up homomorphisms that are relations from these isn't necessarily trivial, but the study of mappings is far more straightforward. Consequently, it is most usual to study homomorphic mappings rather than general relations as homomorphisms.
Because binary operators potentially have a left↔right
asymmetry, that's swapped around by transposing the binary operator, we can also
have homomorphisms from one binary operator, *, to the transpose of
another, @; I'll describe these as cohomomorphisms
. Just
as homomorphisms from a binary operator to itself are called automorphisms,
a coautomorphism
is a cohomomorphism from a binary
operator to itself – i.e. a homomorphism from a binary operator to its
transpose. The homomorphisms of categoric binary operators get to be known as
functors between their categories; their cohomomorphisms are known as
cofunctors. The identity on operands isn't a coautomorphism unless the binary
operator is commutative; which, of course, makes the co- distinction largely
meaningless; and allows us to specify commuting in terms of the identity being a
coautomorphism.
Composing two cohomomorphisms gives a homomorphism as composite; f(g(x^y)) = f(g(y)*g(x)) = f(g(x))@f(g(y)) swaps the order then swaps it back again. In particular, coautomorphisms can be self-inverse; and just one such suffices to represent all other cohomomorphisms to or from a given binary operator as homomorphisms by composition before or after the given self-inverse coautomorphism. This tends to mean that, when a self-inverse cohomomorphism is natural to a structure-type, the theory gives a special rôle to this one cohomomorphism and thereby avoids talking about other cohomomorphisms as such.
Suppose we have a homomorphic map (U: f |V) from binary operator * on V to
binary operator @ on U. The action of * in V induces a binary operator in (| f
:V) that consumes outputs of f to give f's image of what * gets by combining the
inputs that produced them. For the present section, I'll refer to * as
the source
binary operator, this image of it as the image
binary
operator and to @ as the host
binary operator. As f is a homomorphism,
the host subsumes the image (in the special sense of subsume, specific to binary
operators, above). Typically, the image has all
properties of the source, regardless of whether the host shares those
properties; some properties of the host naturally impose themselves on the
image; in some cases, this may only be possible if the source shares the
property; in other cases, (: f |) can reduce the source to a binary operator
with the property.
When the source is flat, so is its image; if the host isn't, this restricts what mappings can be homomorphisms. When the source is closed, so is its image; but the host might be willing (either because the source isn't flat or because the host has more ambiguity) to combine some operands of this image to produce values not in the image of f; thus (| f :V) might not be closed under the host, even though it is closed under the image. When the source is associative, so is its image, again without constraining the host, save that it at least has some operands that it combines associatively. Thus any homomorphic image of a combiner is a combiner, even if its host isn't.
It is entirely possible for the source (hence the image) to be fussy about
which values it combines yet be homomorphically mapped to a host which isn't, or
is less so; indeed, the composition on T-homomorphisms, for any structure-type
T, forms a categoric binary operator and there's a forgetful functor
that
embeds it in the flat composition on general relations; the image is as fussy as
the source but the host combines the homomorphism without regard to whether they
morph between T-structures for which this makes sense; the host's composites
then typically aren't T-homomorphisms of anything, although many of them are apt
to be empty, which may be a T-homomorphism of an empty T-structure, if there is
one, regardless of how wildly inappropriate this is for the homomorphisms being
combined to make it. All of which is why homomorphic composition has to be
rather more tightly policed than the general composition of relations.
When the host is associative, so is the image; for the homomorphism to be monic, the source must also be associative; otherwise, the homomorphism's right-relation, (U: f |), can be used to induce an associative relation from the source. Allowing that the source might not be closed, it's necessary to specify that V is its collection of operands and combinations, so that f does map each combination to something in U and duly induces an image binary operator. Bearing that in mind, closure of the host does little to constrain the image, much less the source, unless the homomorphism is onto (U| f |V), in which case closure of the host does indeed impose closure on both image and source.
When the host's restriction to the image's operands isn't the image
(although it necessarily subsumes it), we can naturally pull back
a pre-image
of the host on the source's operands, that necessarily
subsumes the source. Indeed, if we have a mapping from some collection to the
operands of a host binary operator, we can use it to induce a binary operator on
the mapping's inputs, for which the mapping is trivially a homomorphism to the
host, although the result may be rife with ambiguities if the mapping doesn't
play nicely with the host.
As I mentioned while describing closure and associativity, we can also consider a combiner in terms of its bulk action or of the way it embeds its operands in {relations}; these model the combiner in terms of the concatenation of lists and the composition of relations, respectively. Each expresses any given property of a combiner in its own way; each thereby sheds light on different facets of the property. It proves useful to consider combiners from all three perspectives; this page and the two just linked to each cover one of these perspectives separately; but the full picture is clearest when these three perspectives are combined. Below, I define some properties applicable to binary operators generally and sketch others that, though always applicable, take on sufficiently different forms for flat and categoric combiners that I prefer to leave their formal definitions to the respective specialised discussions, rather than getting bogged down in an attempt at sufficient generality to cover both (when, even if I did, each branch would still need to explain what that meant for itself).
If we have two binary operators, * and @, we typically need
to formally parenthesise sub-expressions using either, with (a*b)@c and a*(b@c)
distinct, as mentioned above when discussing the possibility that they associate
with one another. An alternative to associating with one another is for one
to distribute over
the other, in the following sense. I'll say: that
* left-distributes over
@ precicely if: for any a, b, c for which either
a*(b@c) or (a*b)@(a*c) is valid, so is the other and they agree; that
* right-distributes over
@ precisely if: for any a, b, c for which
(a@b)*c or (a*c)@(b*c) is valid, so is the other and they agree; and that
* distributes over
@ precisely if it right-distributes
over @ and left-distributes over @. The familiar case of this is that
multiplication distributes over addition: a.(b+c) = (a.b) +(a.c). In boolean
logic, where combiners representing and
and or
act on truth
values
(taking value true or false), each distributes over the other.
One can equally characterise * distributing over @ in terms of the mappings (: k*v ←v :) and (: v*k ←v :), for fixed k, being autoprhisms of @; when * is associative, this makes (: (: u*v ←v :) ←u :) and its transpose a homomorphism and a cohomomorphism from * to automorphisms of @.
When a binary operator has one operand, e, which, when combined with
another, v, produces v as value, i.e. the binary operator swallows uses of e,
then e is described as an identity
. When two operands
combine to make an identity, those two operands are described
as inverse
to one another. The precise details, in each
case, are complicated by questions of which other values a given e can
be combined with; which is a non-issue in the flat
case but crucial in the categoric
case.
In general, a binary operator may care which operand is on its left and
which is on its right; a*b need not be the same as b*a, nor need the validity of
either even imply that of the other. For a given binary operator *, operands a
and b are said to commute
(with one another) precisely if
a*b = b*a (for which, of course, both must be valid). In such a case, each may
be said to commute with
the other (under the given binary operator).
The binary operator is described as commutative
or
(synonymously) abelian
(after Niels
Henrik Abel, who did important early work on group theory and died
tragically young) precisely if, whenever it knows how to combine two values,
either way round, it also knows how to combine them the other way round and they
commute. So * is abelian precisely if, for any a, b for which either a*b or b*a
is valid, so is the other and they agree, i.e. a*b = b*a whenever either is
valid.
The word commute
means change with
; you can
exchange the operands. The term commuter
, for a worker who travels daily
to and from a distant work-place, originates with train companies (back when
trains were a new and disruptive technology) selling return tickets to folk
travelling from out of town to look for work in big cities (e.g. Manchester), on
a deal that let the traveller commute
(i.e. exchange) the return half for
a discount on a season ticket if (typically because they found work) they chose
to do so before returning home. So, originally, the verb commute
meant
the act of exchanging the return half of the train ticket for a discount on a
season ticket, that the worker did once at the start of a period of employment;
'though it has now come to mean what workers do every day, to get to and from
work, regardless of their mode of transport.
Even when a binary operator isn't abelian, some of its operands
may commute with all others. If the binary operator is a combiner, the
collection of such operands is closed under the binary operator (for if a*x =
x*a for all x and b*x = x*b for all x, then (a*b)*x = a*(b*x) = a*(x*b) =
(a*x)*b = (x*a)*b = x*(a*b) also for all x), so the restriction of the combiner
to such operands is an abelian sub-combiner. Its collection of values is
orthodoxly known as the centre
of the combiner; I shall
also use that term to refer to the sub-combiner.
If a categoric combiner is abelian, its values can necessarily be decomposed
into a disjoint union of collections of values, on each of which the combiner is
flat, for which the combiner does not know how to combine values from disjoint
collections, just as addition can combine lengths, or combine masses, or combine
pure numbers, but cannot combine values of distinct kinds. (We may think of
such binary operators as piecewise flat
, even when they are not abelian
or even combiners.) This may be shown by considering the relation r that
relates a to b precisely if the combiner knows how to combine a with b; as the
combiner is abelian it, in that case, also knows how to combine b with a, so r
is symmetric. Furthermore, if r relates a to b and b to c, then the combiner
knows how to combine (a*b)*c and a*(b*c) which, furthermore, are equal; but it
is commutative, so a*b = b*a and, as the combiner can combine this with c,
combining (b*a)*c implies it can also combine b*(a*c) whence, in particular, it
can combine a with c; thus r relates a to c and is transitive. Transitive
symmetric relations are equivalences (each on its collection of values, on which
it is necessarily reflexive), so our combiner is flat on each equivalence class
of r.