The complex numbers

The complex numbers are an algebraic abstraction for several equivalent structures that one can arrive at independently. I'll here introduce them via the most intuitively tractable of these and then show the properties they provide.

Two-dimensional transformations

Given a ringlet R, we can read collection {[r, s]: values r, s of R} = {lists (R: |2)}, of lists of two of its values, as an R-module in the usual way. This formally embeds R's values in the additive automorphisms of {lists (R: |2)}, via (: (: [r.x, r.y] ←[x, y] :) ←r :) and uses the result as a multiplicative action of R on {lists (R: |2)}; the resulting module's automorphisms are exactly those additive automorphisms of {lists (R: |2)} that commute with the scalings this embedding produced as outputs. We can in fact construe {lists (R: |2)} as a module over any commutative subring of its ring of additive automorphisms; what I'll now do is exhibit a particular subring with some neat properties, then illustrate it with the case where the ringlet we started with is the (hopefully familiar) real continuum.

Now, to pull this trick off, I need a complete addition, so I won't actually be working with {lists (R: |2)} because I don't presume that R is additively complete; if it is, it's a ring, but I want to show that we can do this for any ringlet. Fortunately, as addition in R is cancellable, so is the addition in {lists (R: |2)} so we can complete its addition in a standard way. The result, which I'll call E, is formally an equivalence on pairs of formal differences of lists (R: |2); however, its members can faithfully be represented by lists whose entries I'll describe as scalar; these values are arithmetic expressions in terms of values of R, using 0 even if R lacks an additive identity and negation even if R doesn't support it, because our additive completion has supplied us with those. Formally, what I'll write as [−s, r] for r, s in R is a short-hand for (: [a, a +r]←[b +s, b]; a, b in R :) that may more helpfully be thought of as [0,r] −[s,0]; but the important thing is that I can write things like [−t, 0] for a value t of R without R needing to have an additive inverse for t or even an additive identity, 0; the expressions in my list don't actually need to be values of R, as long as our additive completion step can make them out of values of R.

Now, there are many automorphisms of our addition on E; the ones that commute with scaling by any value of R, (: (: [r.x, r.y] ←[x, y] |E) ←r |R), are of form (: [a.x +b.y, c.x +d.y] ←[x, y] |E) for assorted scalars a, b, c and d. Let I = (: [−y, x] ←[x, y] |E) and remember that the collection E is synonymous with its identity relation, so I can use E to denote (: [x, y] ←[x, y] :). Our typical scaling was then r.E with r a value of R; additive completion then lets us have scalings r.E for any scalar r. If we include I in our ringlet, we'll have to also include all results of multiplying it by such scalings and of adding things in our ringlet; this gets us C = {r.E +s.I: scalar r, s}. Our addition on C is simply the pointwise one arising from reading it as a mapping (E: |E) and its multiplication shall be composition of automorphisms. Since E is an identity, its composite with any additive automorphism of E is that automorphism; and any sum of scalars is a scalar, so adding two members of C gets us a member of C. It remains to consider products; we already know E.E = E and I.E = I = E.I, so let's first deal with I.I and then with a general product of members of C:

I.I
= (: [−y, x] ←[x, y] |E)∘(: [−u, v] ←[v, u] |E)
= (: [−v, −u] ←[v, u] |E)
= (: −e ←e |E)
= −E

(r.E +s.I).(u.E +v.I)
= r.u.E.E +s.u.I.E +r.v.E.I +s.v.I.I
= (r.u −s.v).E +(s.u +r.v).I

which is indeed in C, so C is closed under addition and multiplication. Notice that I.I = −E gives us I.I.I = −I and I.I.I.I = I.(−I) = E. That C's addition is commutative and cancellable follows from that of R being so; as long as R's multiplication is also commutative, so is C's. We have E as a multiplicative identity. Associativity of addition follows from that in R; that of multiplication requires us to consider a third factor and (thanks to R's multiplication being associative) we get:

( (r.E +s.I).(u.E +v.I) ).(x.E +y.I)
= ((r.u −s.v).E +(s.u +r.v).I).(x.E +y.I)
= (r.u.x −s.v.x −s.u.y −r.v.y).E +(r.u.y −s.v.y +s.u.x +r.v.x).I
= (r.(u.x −v.y) −s.(v.x +u.y)).E +(r.(v.x + u.y) +s.(u.x −v.y)).I
= (r.E +s.I).((u.x −v.y).E +(v.x +u.y).I)
= (r.E +s.I).( (u.E +v.I).(x.E +y.I) )

so we can write (r.E +s.I).(u.E +v.I).(x.E +y.I) without introducing ambiguity and our multiplication is associative. Our addition in C is complete, so we obtain a ring on C, induced from our ringlet R.
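As a sanity check on the algebra above, here is a minimal Python sketch that represents r.E +s.I by the pair (r, s) of integers and multiplies pairs by the product rule just derived. The names `mul`, `E` and `I` are mine, chosen for illustration, and the exhaustive check only covers a small grid of integer values, so it illustrates the identities rather than proving them.

```python
from itertools import product

def mul(p, q):
    """Product of r.E + s.I and u.E + v.I, each given as a pair (r, s)."""
    r, s = p
    u, v = q
    return (r * u - s * v, s * u + r * v)

E = (1, 0)  # the identity on E
I = (0, 1)  # the map [-y, x] <- [x, y]

I2 = mul(I, I)    # expected to be -E
I4 = mul(I2, I2)  # expected to be E

# Check associativity and commutativity on a small grid of integer pairs.
grid = list(product(range(-2, 3), repeat=2))
associative = all(mul(mul(p, q), w) == mul(p, mul(q, w))
                  for p in grid for q in grid for w in grid)
commutative = all(mul(p, q) == mul(q, p) for p in grid for q in grid)
print(I2, I4, associative, commutative)
```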

Now let's consider which automorphisms of addition on E, that commute with R's simple scalings of it, also commute with C; it suffices that they commute with I, so let's take the difference between I's products, in both orders, with a typical automorphism of addition on E that commutes with scaling:

I∘(: [a.x +b.y, c.x +d.y] ←[x, y] |E) −(: [a.x +b.y, c.x +d.y] ←[x, y] |E)∘I
= (: [−c.x −d.y, a.x +b.y] ←[x, y] |E) −(: [−a.v +b.u, −c.v +d.u] ←[u, v] |E)
= (: [−c.x −d.y +a.y −b.x, a.x +b.y +c.y −d.x] ←[x, y] |E)
= (: [−(b +c).x +(a −d).y, (a −d).x +(b +c).y] ←[x, y] |E)

which must be the zero mapping for the automorphism to commute with I; that requires d = a and c = −b, which is exactly the condition that the automorphism is in C. So the automorphisms of E, as a C-module, are precisely the scalings by C. The mapping (C| x.E +y.I ←[x, y] |E) is manifestly iso, identifying E with C; and represents C's multiplicative action on E as C's own multiplication (i.e. composition as automorphisms of the addition). The reverse of this mapping can be written (E| c([1, 0]) ←c |C) since (x.E +y.I)([1, 0]) = [x, 0] +[0, y] = [x, y]. So consider the mapping c([0, 1]) ←c; this gives (x.E +y.I)([0, 1]) = [−y, x] = I([x, y]). Scaling by −I, we thus get (y.E −x.I)([0, 1]) = [x, y] so the reverse of c([0, 1]) ←c is (C| y.E −x.I ←[x, y] |E), which is again iso. For fixed e in E, we can generally map (E: c(e) ←c |C) but it won't necessarily give us all of E; we don't necessarily have multiplicative inverses in R. However, we can induce a multiplication on E from any such embedding of C in E that does give us all of E; choosing the first illustration just given, we have [x, y].[u, v] = [x.u −y.v, x.v +y.u], making E itself a ringlet isomorphic to C.
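The commutation condition can be checked mechanically. The following Python sketch, with integer scalars and illustrative names of my own (`compose`, `commutes_with_I`, `in_C`, `times`), represents the map [a.x +b.y, c.x +d.y] ←[x, y] by the tuple (a, b, c, d) and confirms, on a small grid only, that commuting with I coincides with d = a and c = −b. It also spells out the induced multiplication on E.

```python
from itertools import product

def compose(m, n):
    """Composite, m after n, of maps [a.x + b.y, c.x + d.y] <- [x, y],
    each given as a tuple (a, b, c, d)."""
    a, b, c, d = m
    e, f, g, h = n
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

I = (0, -1, 1, 0)  # [-y, x] <- [x, y]

def commutes_with_I(m):
    return compose(I, m) == compose(m, I)

def in_C(m):
    """True when m has the form r.E + s.I, i.e. d = a and c = -b."""
    a, b, c, d = m
    return d == a and c == -b

# Compare the two conditions on every map with entries in -2..2.
grid = list(product(range(-2, 3), repeat=4))
agree = all(commutes_with_I(m) == in_C(m) for m in grid)

def times(p, q):
    """Induced multiplication on E: [x, y].[u, v] = [x.u - y.v, x.v + y.u]."""
    x, y = p
    u, v = q
    return (x * u - y * v, x * v + y * u)

print(agree, times((0, 1), (0, 1)), times((1, 0), (3, 4)))
```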

When a ringlet R meets the prerequisites for doing this construction, I refer to C as the complexified ring of R or, when R's values are thought of as numbers, the R-complex numbers or the complex version of whatever R's members are. Since the construction tacitly completes R, the naming of C conflates ringlets with their completions, so I'll tend to use generic terms for R's type. For example, I'll refer to the complexified version of the natural ringlet as the complex whole numbers and the result of doing the same to the positive ratios as the complex rationals.

Illustration: real-complex numbers

We can do this with the real continuum as our ringlet; the result is known as the complex numbers, with no real-qualification; it may be thought of as the canonical complex numbers.

In this case, as {reals} is additively complete, we have E = {lists ({reals}: |2)} in the obvious way. There's a standard metric g on E, defined by g([x, y], [u, v]) = x.u +y.v; a linear automorphism of E has the form (E: [a.x +b.y, c.x +d.y] ←[x, y] :E) and preserves g precisely if, for all real x, y:

x.x +y.y
= (a.x +b.y).(a.x +b.y) +(c.x +d.y).(c.x +d.y)
= a.a.x.x +2.a.b.x.y +b.b.y.y +c.c.x.x +2.c.d.x.y +d.d.y.y
= (a.a +c.c).x.x +2.(a.b +c.d).x.y +(b.b +d.d).y.y

(Since g is symmetric, we have 4.g(w, z) = g(w+z, w+z) −g(w−z, w−z); so preserving g's output when both inputs are equal suffices to preserve its outputs for distinct inputs.) Setting [x, y] = [0, 1] we discover 1 = b.b +d.d; setting [x, y] = [1, 0] we discover 1 = a.a +c.c; and (then) setting [x, y] = [1, 1] we discover a.b +c.d = 0. These conditions manifestly suffice to make the final form be equal to the initial form. From a.a +c.c = 1 we can infer that a and c lie in the interval {real t: −1 ≤ t ≤ 1}, which is the range of outputs of the trigonometric functions Sin and Cos; so we can infer that a is Cos(f) for some angle f, whence c = ±√(1 −Cos(f).Cos(f)) is the Sin of some angle with the same Cos as f; so, without loss of generality, f was that angle and we have a = Cos(f), c = Sin(f) for some angle f. We can apply the same reasoning to obtain b = Sin(g) and d = Cos(g) for some angle g and observe 0 = a.b +c.d = Cos(f).Sin(g) +Sin(f).Cos(g) = Sin(f+g), whence f+g is some whole multiple of the half turn. When it is a multiple of the turn, we obtain c = Sin(f) = −b, a = Cos(f) = d; when f+g is an odd multiple of the half turn, we have c = Sin(f) = b, a = Cos(f) = −d. The two cases correspond to a rotation through f and the result of composing this after the reflection (E: [x, −y] ←[x, y] :E). The latter's composite can be expressed as a reflection in a line at angle f/2 to [1, 0]. We thus obtain the usual classification of the isometries (linear maps that preserve length) of the two-dimensional Euclidean plane, as rotations and reflections.
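To illustrate the two cases numerically, here is a small Python sketch; the function names and the sample angle are my own choices, and floating-point comparison is used, so this is a spot check rather than a proof. It builds the rotation and reflection for a given angle f, checks that each preserves g on a few sample vectors, and checks that the reflection fixes the direction at angle f/2 to [1, 0].

```python
import math

def g(p, q):
    """The standard metric g([x, y], [u, v]) = x.u + y.v."""
    return p[0] * q[0] + p[1] * q[1]

def apply(m, p):
    """Apply the map [a.x + b.y, c.x + d.y] <- [x, y], given as (a, b, c, d)."""
    a, b, c, d = m
    x, y = p
    return (a * x + b * y, c * x + d * y)

def rotation(f):
    # Case c = Sin(f) = -b, a = Cos(f) = d.
    return (math.cos(f), -math.sin(f), math.sin(f), math.cos(f))

def reflection(f):
    # Case c = Sin(f) = b, a = Cos(f) = -d.
    return (math.cos(f), math.sin(f), math.sin(f), -math.cos(f))

def preserves_g(m, samples):
    return all(math.isclose(g(apply(m, p), apply(m, q)), g(p, q), abs_tol=1e-12)
               for p in samples for q in samples)

samples = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -3.0)]
f = 0.7  # an arbitrary angle, in radians
axis = (math.cos(f / 2), math.sin(f / 2))  # direction at angle f/2 to [1, 0]

print(preserves_g(rotation(f), samples), preserves_g(reflection(f), samples))
print(apply(reflection(f), axis), axis)  # these should agree, up to rounding
```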

The rotations form an abelian group: if we compose rotations through two angles, the composite is the rotation through the sum of angles, regardless of the order of the two rotations; rotating through angle zero is the identity; and we can reverse a rotation to obtain its inverse. There are also simple real scalings on E; (E: [s.x, s.y] ←[x, y] :E) scales by s for any real s. It is easy to see that such scalings commute with rotations: if you rotate and then scale, the result is the same as if you scaled first and then rotated. We can thus extend our abelian group of rotations to an abelian group of rotations and non-zero scalings (it wouldn't still be a group if we included the zero scaling, which has no inverse). Define

S(a, b) = (E: [a.x +b.y, a.y −b.x] ←[x, y] :E)

for a, b real; any composite of a rotation with a scaling is of form S(s.Cos(f), s.Sin(f)) for some angle f and real s; and any reals a, b can be obtained from some such f and s, albeit requiring s = 0 when a = 0 = b.

So now consider the collection C = {S(a, b): a, b are real} of linear maps on E expressible as composites of rotations and (potentially zero) scalings. This includes the zero linear map on E; but the rest of its members form a group under composition. Consider, then, what happens when we add two of these:

S(a, b) +S(c, d)
= (E: [a.x +b.y, a.y −b.x] ←[x, y] :E) +(E: [c.x +d.y, c.y −d.x] ←[x, y] :E)
= (E: [a.x +b.y, a.y −b.x] +[c.x +d.y, c.y −d.x] ←[x, y] :E)
= (E: [a.x +b.y +c.x +d.y, a.y −b.x +c.y −d.x] ←[x, y] :E)
= (E: [(a +c).x +(b +d).y, (a +c).y −(b +d).x] ←[x, y] :E)
= S(a +c, b +d)

Thus C is closed under addition, which is manifestly abelian (it always is for linear maps); and, when zero is set aside, forms a multiplicative group under composition. With the usual scaling on linear maps, we can now write S(a, b) as a.S(1, 0) +b.S(0, 1) and S(1, 0) is simply the usual identity on E, so let's write it (for now) as E (since a collection simply is its identity mapping); and let I = S(0, 1), so S(a, b) = a.E +b.I. We already know E∘E = E, E∘I = I = I∘E, so let's now look at

I∘I
= S(0, 1)∘S(0, 1)
= (E: [y, −x] ←[x, y] :E)∘(E: [v, −u] ←[u, v] :E)
= (E: [−u, −v] ←[u, v] :E)
= −E

the half turn; I is the quarter turn, so this is only to be expected. Using this, we can now work out the details of arbitrary sums and composites of members of C; we already know about sums, so let's look at:

(a.E +b.I)∘(c.E +d.I)
= S(a, b)∘S(c, d)
= (E: [a.u +b.v, a.v −b.u] ←[u, v] :E)∘(E: [c.x +d.y, c.y −d.x] ←[x, y] :E)
= (E: [a.(c.x +d.y) +b.(c.y −d.x), a.(c.y −d.x) −b.(c.x +d.y)] ←[x, y] :E)
= (E: [(a.c −b.d).x +(a.d +b.c).y, (a.c −b.d).y −(a.d +b.c).x] ←[x, y] :E)
= S(a.c −b.d, a.d +b.c)
= (a.c −b.d).E +(a.d +b.c).I
= a.c.E∘E +b.c.I∘E +a.d.E∘I +b.d.I∘I

in which we can see ∘, at least when its operands are E and I, behaving just the way a multiplication would, in its interaction with addition. Let's see whether that works with other operands than E and I: the crucial property (distributivity) is that z.(w+t) = z.w +z.t for all z, w, t, with ∘ in place of multiplication and each of z, w, t replaced by an S(,), so let's look at:

S(a, b)∘(S(c, d) +S(e, f))
= S(a, b)∘S(c+e, d+f)
= S(a.(c+e) −b.(d+f), a.(d+f) +b.(c+e))
= S(a.c +a.e −b.d −b.f, a.d +a.f +b.c +b.e)
= S(a.c −b.d, a.d +b.c) +S(a.e −b.f, a.f +b.e)
= S(a, b)∘S(c, d) +S(a, b)∘S(e, f)

and we already know both ∘ and + are commutative (the order of operands doesn't change the answer), so we now know they interact with themselves and each other, on the members of C, exactly as multiplication and addition do on numbers. So I now proceed, in C, to write ∘ as multiplication – writing z.w for z∘w when z, w are in C – and the members of C as if they were numbers. To this end, I'll interpret any real number s as s.E, since E is simply the unit of multiplication, and write i in place of I for S(0, 1); our typical member of C is then a+i.c with a and c real. Multiplication and addition are commutative and associative as usual; multiplication distributes over addition (i.e. w.(z+t) = w.z +w.t, as above) and i.i = −1 (repeating a quarter turn gets you a half turn). When C is so interpreted, its members are known as the complex numbers.
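One way to see that this realisation really is the familiar arithmetic is to compare it with a ready-made implementation. The sketch below is illustrative only: the tuple representation and the helper names are assumptions of mine. It encodes S(a, b) = [a.x +b.y, a.y −b.x] ←[x, y] as the tuple (a, b, −b, a), composes and adds such maps, and compares the results with Python's built-in `complex` type on a small grid of integers.

```python
from itertools import product

def S(a, b):
    """The map [a.x + b.y, a.y - b.x] <- [x, y], as a tuple (a, b, c, d)
    standing for [a.x + b.y, c.x + d.y] <- [x, y]."""
    return (a, b, -b, a)

def compose(m, n):
    """Composite, m after n, of two maps given as tuples (a, b, c, d)."""
    a, b, c, d = m
    e, f, g, h = n
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def add(m, n):
    """Pointwise sum of two maps."""
    return tuple(p + q for p, q in zip(m, n))

def as_complex(m):
    """Read S(a, b) as the number a + i.b."""
    a, b, _, _ = m
    return complex(a, b)

grid = list(product(range(-3, 4), repeat=4))
products_match = all(
    as_complex(compose(S(a, b), S(c, d))) == complex(a, b) * complex(c, d)
    for a, b, c, d in grid)
sums_match = all(
    as_complex(add(S(a, b), S(c, d))) == complex(a, b) + complex(c, d)
    for a, b, c, d in grid)
print(products_match, sums_match, compose(S(0, 1), S(0, 1)))
```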

So a complex number is (in this realisation) just the composite of a rotation and a scaling on the two-dimensional Euclidean plane; addition among them is simply their usual addition as linear maps; we implicitly embed the reals in them as the scalings; one of the quarter turns is named i and composition among complex numbers is construed as multiplication (since it has all the right properties, in relation to itself and addition, to be so construed). The half turn is simply scaling by −1 and is what you get by composing a quarter turn with itself, hence i.i = −1. The other quarter turn is −i and, likewise, (−i).(−i) = −1; so i and −i serve as square roots of −1. Notice that our choice of which quarter turn to label as i was arbitrary; if we systematically swap i and −i, we should end up saying all the same things, albeit possibly rearranging how we say them.

Given a complex number in the form x +i.y, we may want to express it in the form of a scaling and a rotation. It should be clear from above that rotation through angle a is just Cos(a) +i.Sin(a) and thus any x +i.y is simply a rotation precisely if x.x +y.y is 1; otherwise, unless it's zero, we can obviously scale it to make (x +i.y)/√(x.x +y.y) a rotation. Thus the general x +i.y is simply scaling by √(x.x +y.y) combined with a rotation through the angle whose Cos and Sin are obtained by dividing x and y, respectively, by this scaling.
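The decomposition just described can be sketched in a few lines of Python; the function name `scaling_and_angle` is hypothetical, and the angle is returned in radians, which is an assumption of the sketch rather than anything the text requires.

```python
import math

def scaling_and_angle(x, y):
    """Split x + i.y into a scaling s and a rotation angle f (in radians)."""
    s = math.sqrt(x * x + y * y)
    if s == 0:
        raise ValueError("zero is a scaling by 0; its angle is undetermined")
    # The angle whose Cos is x/s and whose Sin is y/s.
    f = math.atan2(y / s, x / s)
    return s, f

s, f = scaling_and_angle(3.0, 4.0)
# Recombining the scaling with Cos(f) + i.Sin(f) should recover x and y,
# up to rounding.
print(s, s * math.cos(f), s * math.sin(f))
```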


One may, as noted above, realise the complex numbers in diverse other ways; it suffices that we have a pair of square roots ±i of −1 and a copy of the reals, along with multiplication and addition that behave much the same way they do on reals – they're commutative (order of operands doesn't matter), multiplication distributes over addition (z.(w+t) = z.w +z.t), 0 is an additive identity (0+z = z), 1 is a multiplicative identity (1.z = z), everything has an additive inverse (−z +z = 0) and everything but 0 has a multiplicative inverse (z/z = 1). I'll say, for real x and y, that x +i.y is a complex number or simply is complex and write {complex} for {x +i.y: x, y are real}.

Speaking of multiplicative inverses, notice that 1 = −i.i so i and −i are mutually inverse. For real x, y we have (x +i.y).(x −i.y) = x.x −i.i.y.y +i.y.x −x.i.y = x.x +y.y, which is real, so we have 1/(x +i.y) = (x −i.y)/(x.x +y.y). This gives a first hint that we may be interested in the mappings:

conjugate = ({complex}: x −i.y ←x +i.y; x, y are real :{complex})
norm = ({reals}: √(x.x +y.y) ←x +i.y; x, y are real :{complex})

Indeed, these are so commonly relevant that each has a short-hand: for complex z, |z| stands for norm(z) and z* for conjugate(z). Note that conjugate is self-inverse; it is, furthermore, a homomorphism of {complex} (so (z.w)* = z*.w*, (z +w)* = z* +w*). The fixed-points of conjugate (those z with z* = z) are the reals; and we can recover x and y from z = x +i.y as x = (z* +z)/2 and y = i.(z* −z)/2. In this, x = (z* +z)/2 is known as the real part of z and i.y = (z −z*)/2 is known as its imaginary part.
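These identities are easy to try out. The following Python sketch takes conjugate(x +i.y) = x −i.y and norm(x +i.y) = √(x.x +y.y), matching the inverse formula above, and uses Python's built-in `complex` type; the helper names are mine and the sample values are arbitrary.

```python
import math

def conjugate(z):
    """x - i.y for z = x + i.y."""
    return complex(z.real, -z.imag)

def norm(z):
    """The square root of x.x + y.y for z = x + i.y."""
    return math.sqrt(z.real * z.real + z.imag * z.imag)

def inverse(z):
    """1/(x + i.y) = (x - i.y)/(x.x + y.y), for non-zero z."""
    return conjugate(z) / (z.real * z.real + z.imag * z.imag)

z, w = complex(3, 4), complex(1, -2)

print(norm(z))                                              # |z|
print(z * inverse(z))                                       # close to 1
print(conjugate(z * w), conjugate(z) * conjugate(w))        # (z.w)* and z*.w*
print((conjugate(z) + z) / 2, 1j * (conjugate(z) - z) / 2)  # x and y, as complex values
```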

Linear spaces over {complex}

Just as one can have linear spaces over {reals}, one can equally have linear spaces over {complex}, with {scalars} = {complex} in this case (and this is pretty much exactly why I discuss linear spaces in terms of abstract scalars instead of simply always using reals). A complex-linear space V then has dual(V) = {linear maps ({complex}: |V)} and all of the usual structure follows naturally. However, one extra feature arises, thanks to the presence of conjugation on {complex}.

Given complex-linear spaces U, V, a mapping (U: f |V) is described as antilinear precisely if it respects addition (as for a linear map) and, for all v in V and complex k, f(k.v) = k*.f(v); i.e. it conjugates scaling, rather than simply respecting it, as a linear map would. The composite of a list of linear and antilinear maps is linear if the list contained an even number of antilinear maps and antilinear if it contained an odd number of them.

We could introduce the concept of an anti-dual of a complex-linear space, but it's not actually useful: any antilinear map ({complex}: |V) is necessarily simply conjugate composed after some linear map ({complex}: |V) in dual(V), so the anti-dual of V would just be {conjugate∘f: f in dual(V)}, which scarcely seems worth naming separately.

Of more interest is the case of antilinear maps from a linear space to its dual; these provide a more satisfactory notion of length in a complex-linear space than the usual metric as linear map to dual. Given that {complex} isn't naturally ordered, like {reals}, it doesn't naturally have a sense of positive members; and any linear map (dual(V): g |V) for which some g(v, v) is non-zero necessarily has some w in V for which g(w, w) isn't real (if g(v, v) is real and non-zero, scaling v by a k with k.k = i gives g(k.v, k.v) = i.g(v, v), which isn't), so there's no natural equivalent of the positive-definite quadratic form on a complex-linear space. However, as we'll see, an antilinear metric works more nicely.

Sesquilinear and Hermitian

Given a complex-linear space V, an antilinear map (dual(V): f |V) is described as sesquilinear; if it furthermore satisfies conjugate(f(v, w)) = f(w, v) for all w, v in V, it is described as Hermitian. When we pass the same member of V as both inputs to a Hermitian form, swapping the inputs makes no difference; yet swapping the inputs and conjugating also makes no difference (as it's Hermitian), so the value must in fact be self-conjugate – i.e. real.

A sesquilinear map f effectively consumes two members of V and produces a scalar output; it consumes the first input antilinearly and the second linearly. If we transpose it (that is, swap the order in which it receives its two inputs), the one that was used linearly shall be used antilinearly, and vice versa. Thus, for given u in V, ({complex}: f(v, u) ←v |V) is antilinear, so not in dual(V); but its outputs are complex, so we can compose it with conjugate to obtain a linear map; so ({complex}: conjugate(f(v, u)) ←v |V) is in dual(V). Of course, this varies antilinearly with u, so (dual(V): ({complex}: conjugate(f(v, u)) ←v |V) ←u |V) is once more sesquilinear: it is of the same kind as f. This conjugated-transpose is known as the Hermitian conjugate of f.

For a sesquilinear map f to be equal to its own transpose would require f(u, v) = f(v, u) for all u, v in V; but this would also require f(i.u, v) = f(v, i.u) whence f(v, u) = f(u, v) = −f(v, u) can only be 0 and f is zero (which is indeed both linear and antilinear, but boringly so). That arose because f and its transpose used their inputs in inconsistent ways; in contrast, f's Hermitian conjugate uses its inputs the same way f does, so has the potential to be equal to f – in which case, f is Hermitian.

When we give the same v in V to both inputs of a sesquilinear form f, to get f(v, v), the effect of scaling is always a real scaling: if we scale v by complex k, we get f(k.v, k.v) = k*.k.f(v, v) and k*.k is simply the square of norm(k), which is real (and, for non-zero k, positive). For a Hermitian form, the same applies and f(v, v) was real already; scaling v by non-zero k scales f(v, v) by a positive k*.k, so doesn't change its sign. Thus we have the potential for a Hermitian form to satisfy f(v, v) ≥ 0, with equality only when v is 0, in which case we can describe it as positive definite exactly as in the real case.
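For a concrete instance, consider lists of complex numbers with the form that conjugates each entry of its first input, multiplies by the matching entry of the second, and sums. This particular form, and the names `f` and `scale`, are my choice of example, not something the discussion above depends on. A short Python sketch checks the properties described, on arbitrarily chosen sample values:

```python
def f(u, v):
    """A sample sesquilinear form on lists of complex numbers:
    antilinear in its first input u, linear in its second input v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def scale(k, v):
    """Scale every entry of the list v by the complex number k."""
    return [k * a for a in v]

u = [complex(1, 2), complex(3, -1)]
v = [complex(2, -1), complex(0, 1)]
k = complex(1, 1)

print(f(v, v))                                     # real and positive
print(f(scale(k, v), scale(k, v)), f(v, v))        # differ by the factor k*.k = 2
print(f(u, v), f(v, u))                            # mutually conjugate (Hermitian)
print(f(scale(k, u), v), k.conjugate() * f(u, v))  # antilinear in the first input
```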


Just as for (not necessarily linear) mappings between real-linear spaces, one can define differentiation for mappings between complex-linear spaces. Since the complex numbers are, themselves, a real-linear space, one can interpret the complex-linear spaces as real-linear ones and differentiate the mapping between them, understood as a mapping between real-linear spaces; the derivative at each point shall, indeed, be a real-linear map representing the complex-linear map obtained by differentiating the mapping, understood as a mapping between complex-linear spaces. However, it's possible for a mapping between complex-linear spaces to not be differentiable even though, when the spaces are read as real-linear, the mapping is real-differentiable; this arises because not every real-derivative is a representation of some complex derivative. This is simplest to see in the simple one-dimensional case of a mapping from {complex} to itself.

Given a mapping ({complex}: f :{complex}), we can represent it as the real mapping F = ({lists ({reals}: |2)}: [s,t] ←[x,y]; s+i.t = f(x+i.y) :{lists ({reals}: |2)}). It's real-differentiable at [x,y] if there are real a, b, c, d for which f(x+h +i.(y+k)) −f(x +i.y) is, for small h, k, well approximated by (a.h +b.k) +i.(c.h +d.k). In contrast, f is only complex-differentiable if there's some m+i.n for which the difference is well approximated by (m +i.n).(h +i.k) = (m.h −n.k) +i.(m.k +n.h), implying a = m = d and −b = n = c. So, when F is real-differentiable, we only get f complex-differentiable if a = d and b +c = 0. There are, in effect, only half as many free components in a complex derivative as there are in a real one; being complex-differentiable is a tighter constraint (hence says more about the function) than being real-differentiable. This leads to some very powerful properties for functions from {complex} to itself, if they're differentiable everywhere, or even when we allow exceptions to that at isolated points.
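The constraint a = d, b +c = 0 can be probed numerically. The sketch below estimates a, b, c and d by central differences and tests the two conditions at a single point; the step size, tolerance and function names are assumptions of mine, and a finite-difference test at one point is only a heuristic, not a proof of differentiability. Squaring passes the test, while conjugation (which is real-differentiable everywhere) fails it.

```python
def real_derivative(f, z, h=1e-6):
    """Estimate (a, b, c, d) with f(z + h + i.k) - f(z) close to
    (a.h + b.k) + i.(c.h + d.k), using central differences."""
    dx = (f(z + h) - f(z - h)) / (2 * h)            # change per unit step in x
    dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # change per unit step in y
    return dx.real, dy.real, dx.imag, dy.imag

def looks_complex_differentiable(f, z, tol=1e-6):
    """Check a = d and b + c = 0, up to a tolerance."""
    a, b, c, d = real_derivative(f, z)
    return abs(a - d) < tol and abs(b + c) < tol

def square(z):
    return z * z

def conj(z):
    return z.conjugate()

z0 = complex(1, 2)
print(real_derivative(square, z0))  # roughly (2, -4, 4, 2), i.e. m + i.n = 2.z0
print(real_derivative(conj, z0))    # roughly (1, 0, 0, -1), so a differs from d
print(looks_complex_differentiable(square, z0), looks_complex_differentiable(conj, z0))
```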

Written by Eddy.