Brute Force and Ignorance

Mathematics provides some powerful tools for solving many problems. This often means that a given problem can be solved in various ways. Some of those ways may be easier to think of, or to follow as they lead from problem to solution. Closely related to this, some solutions may be more illuminating than others – which is about how clearly the reader shall understand why the result is true. It is usually much easier to be confident of the truth of a clear argument than that of more convoluted reasoning – sometimes an apparently clear argument can be misleading or an obvious result can be false, but the more straightforward the reasoning and the fewer steps there are to scrutinise, the easier it is to detect flaws and hidden assumptions.

I tend to think of this as the mathematical analogue of the difference between neatly splitting logs by lining up an axe with the grain of the wood, as compared to hacking away at the log with a chainsaw. The chainsaw always works, where it may at times be hard (e.g. when there are knots in the wood) to find a good line to hit the log along, but when one can do the job with the axe the result is graceful and efficient, in ways the chainsaw can never approach. A phrase I've met in engineering circles (in particular among military engineers) sums up the chainsaw solution quite nicely: brute force and ignorance, or BF&I, is a well-established class of techniques for dealing with problems. Even though there may be a satisfyingly elegant way to solve a problem – that would be much quicker once one has thought of it, and likely cheaper, quieter and more efficient at the same time – it is usually easier to see how to solve a problem with BF&I than to see how to solve it gracefully.

It is almost always more satisfying to solve a problem in the more graceful way, and it'll impress one's peers (although not necessarily management, when they think "well, apparently that was easy" because they don't understand what they just saw), but there is great value in knowing how to solve problems by BF&I, too. If nothing else, BF&I can tell you what the answer is, which can help you to think about why it's that, which can be a great help in discovering the more elegant solution. Even when (as we'll see at least once below) that search gets stuck in an unsatisfying rut (a bit less brutal, but still not really elegant), it can lead to noticing interesting things along the way.

Geometric examples

For example, in geometry, one can reason out some results in ways that leave the reader with a clear sense of why the result is inevitable, even obvious once explained; yet the same result may sometimes be obtained by less obvious paths. In particular, the combination of algebra and trigonometry provides us with a systematic way to reason about geometric truths, that can usually be bludgeoned into answering questions in geometry: yet many problems in geometry are far clearer when solved geometrically than when one uses algebra and trigonometry to solve the problem. I'll include some examples below, and here list a few examples on other pages.

Finding the golden ratio in a regular pentagon's ratio of the distance between two non-adjacent vertices to the length of an edge, first by brute force (trigonometry and algebra), then more gracefully.
Shearer's split diagonal (just below)
One square tilted in relation to two others (below)
The exponential spiral is equiangular (below)

Shearer's split diagonal

So let's consider an example, posed by Catriona Shearer on 2019-10-11. A square's diagonal is split into a long part c and a short part a by a point that's distance b from one of the off-diagonal corners: if c = a +b, what's the angle between the diagonal and the line from that point to that off-diagonal corner ?

There's a simple and elegant way to solve that problem, but I couldn't initially see it, so I sat down and solved it by brute force – i.e. by using trigonometry and algebra.

Our diagonal's length is a +c = 2.a +b, so the side of our square is this over √2; the components of b parallel to the sides of the square are thus a/√2 in one direction and c/√2 in the other; squaring, summing and doubling, with c = a +b, we get 2.b.b = a.a +c.c = 2.a.a +2.a.b +b.b whence (b −a).(b −a) = b.b −2.a.b +a.a = 3.a.a and b/a = 1 ±√3; but both b and a are positive, so this must be 1 +√3, giving us b = (√3 +1).a, a = b.(√3 −1)/2 and c = b.(√3 +1)/2; the diagonal is a +c = b.√3, so the square's sides are b.√(3/2). (Intuition might now be screaming something about those factors of √3 meaning there's a turn/6 angle somewhere; and there's actually a nice little clue in the fact that the diagonal is 2.a +b; but let's press on with BF&I.) We can now apply the cosine rule with b and c coming out of the angle and b.√(3/2) opposite the angle, to get

3.b.b/2 +2.b.c.cos(?) = b.b +c.c
2.cos(?) = (c.c −b.b/2)/b/c = c/b −b/c/2 = (√3 +1)/2 −1/(√3 +1) = 1

whence cos(?) = 1/2 and the angle we were asked after is turn/6.
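
For anyone who'd rather check that algebra numerically than by hand, here's a minimal Python sketch. It is not part of the original working: the co-ordinate layout (diagonal along the x-axis, off-diagonal corner above its mid-point) and the function name are assumptions made purely for illustration.

```python
import math

def split_diagonal_check(a=1.0):
    """Return (b, measured distance, angle in degrees) for the split diagonal.

    Assumed layout: the diagonal runs along the x-axis from (0, 0) to (d, 0),
    so one off-diagonal corner of the square sits at (d/2, d/2).
    """
    b = (1 + math.sqrt(3)) * a      # positive root of (b - a).(b - a) = 3.a.a
    c = a + b                       # the condition given in the puzzle
    d = a + c                       # full length of the diagonal
    px, py = a, 0.0                 # the point that splits the diagonal
    qx, qy = d / 2, d / 2           # the off-diagonal corner
    dist = math.hypot(qx - px, qy - py)
    # Angle at the splitting point, between the long part c and the line to the corner.
    angle = math.degrees(math.atan2(qy - py, qx - px))
    return b, dist, angle

b, dist, angle = split_diagonal_check(1.0)
print(b, dist, angle)   # b and dist should agree; angle should be close to 60
```

The measured distance agreeing with b confirms that the root of the quadratic really does place the point at distance b from the corner, and the angle comes out as turn/6 to within rounding.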

Well, when the answer's as neat as that, there's probably a more elegant way to obtain it. Still, it was a Saturday morning and I'd finished my coffee, so it was time to get on with the day. While I was out shopping, I thought about it. I tried various ideas that didn't shed any light on the problem before the utterly neat solution suddenly popped into my head: just reflect the figure in the other diagonal. At this point, we split c into a piece of length a in the corner opposite our original a plus a middle part, between the two parts of length a: and that middle part's length is just c −a = b; each end of it is joined to our off-diagonal corner by a line also of length b, so we have an equilateral triangle and of course the angle we were asked about had to be turn/6 (a.k.a. 60°).
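
The reflection argument can also be checked in co-ordinates. The sketch below is only an illustration under the same assumed layout as before (diagonal along the x-axis); the function name is hypothetical. It measures the three sides of the triangle formed by the splitting point, its mirror image and the off-diagonal corner.

```python
import math

def reflected_triangle_sides(a=1.0):
    """Side lengths of the triangle: point, its reflection, off-diagonal corner."""
    b = (1 + math.sqrt(3)) * a      # value of b forced by the puzzle's conditions
    c = a + b
    d = a + c                       # length of the diagonal
    p = (a, 0.0)                    # splitting point, distance a from one end
    p_reflected = (d - a, 0.0)      # its image under reflection in the other diagonal
    q = (d / 2, d / 2)              # off-diagonal corner, on the mirror line
    return (math.dist(p, p_reflected), math.dist(p, q), math.dist(p_reflected, q))

print(reflected_triangle_sides(1.0))   # three near-equal lengths, each close to b
```

All three lengths agree with b, so the triangle is equilateral, as the reflection argument says.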

Notice that this second solution is direct and simple: anyone who knows the bare minimum of geometry – that the sum of angles within a triangle is always equal to a half turn, and must be shared equally among the angles if the sides are all equal – can work it out this way. In contrast, the other solution required me to use Pythagoras's theorem, solve a quadratic equation and then use the cosine rule (which can be viewed as a generalisation of Pythagoras's theorem). Even mathematicians to whom the BF&I solution is easy still prefer the elegant and graceful solution; yet the fact that we can fall back on the BF&I solution lets us solve problems even when we're too dumb to understand why our answers are right. That is both a blessing (yay ! solved :-) and a curse – because sometimes we know something is true, despite not really understanding why.

A more complex example

Yet another of Catriona's puzzles. This one can, in fact, be generalised: replace 18 with 2.a.a for any a and 32 with 2.b.b for any b; then replace 25 with a.a +b.b. My brute force solution was to use cartesian co-ordinates and solve for the location of the top left end of the red line, relative to the other end; this revealed that two corners (not constrained to be corners of the other two squares) of the tilted square are necessarily on the diagonals through the meeting-corner of the first two squares: one is distance |a−b| from the corner, on the shared diagonal of the two squares, the other is distance a+b away on the perpendicular diagonal. That puts the corner inside the bigger of the first two squares at the centre-point of their shared diagonal. So: how much of that can be worked out more elegantly ?
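
Since the brute force working itself isn't shown, here is a rough Python sketch of what a co-ordinate check of those claims might look like. The orientation is an assumption (the figure may well be turned differently): the shared corner is the origin, one square's side runs up the y-axis, the other's along the x-axis, and the ends of those sides are opposite corners of the tilted square. The function name is hypothetical.

```python
import math

def tilted_square_free_corners(a, b):
    """Return the two corners of the tilted square not shared with the other squares."""
    p = (0.0, a * math.sqrt(2))     # end of a side of the square of area 2.a.a
    q = (b * math.sqrt(2), 0.0)     # end of a side of the square of area 2.b.b
    mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2    # centre of the tilted square
    hx, hy = (q[0] - p[0]) / 2, (q[1] - p[1]) / 2    # half of the diagonal from p to q
    far = (mx - hy, my + hx)        # centre plus the half-diagonal turned a quarter turn
    near = (mx + hy, my - hx)       # centre minus the same
    return far, near

far, near = tilted_square_free_corners(3, 4)     # the 18, 32, 25 puzzle
print(math.hypot(*far), math.hypot(*near))       # close to a+b = 7 and |a-b| = 1
```

With this layout the far corner lands on the line y = x and the near one on y = −x, which are the two diagonals through the meeting-corner.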

I'll take the shared corner of the two parallel squares as origin and describe positions in relation to it. The first thing to notice is that the origin is at the right-angle corner of a triangle whose hypotenuse is the diagonal of the tilted square; so the circumcircle of that tilted square must actually pass through the origin. Now, the top edge of the tilted square is a chord of the circle and so subtends, at the origin, the same angle as it subtends at the bottom left corner of that square, as both angles are at points on the circle, on the same side of the chord. The latter angle is plainly half a right-angle, hence so is the former, whence we can infer that the red line is actually the angle bisector of the right angle at the origin. Now the other diagonal of the tilted square, connecting the corners that aren't shared with the parallel squares, is a diameter of the circle, so subtends a right angle at the origin; whence, indeed, the bottom right corner of the tilted square is on the shared diagonal of the two parallel squares.

That doesn't yet tell us the sought-after length, but it does establish some handy facts about how the parts are arranged. I'll now get the answer by applying a powerful tool before coming back to look for a more elegant approach.

The Ptolemaic sledge-hammer

The proof by Ptolemy's theorem looks at the quadrilateral within the tilted square's circumcircle that uses two sides of the tilted square and one side of each of the other two squares; we know the lengths of these. One of the diagonals of this quadrilateral is the line whose length we want to know; the other is a diagonal of the tilted square, whose length we know. Let s = √(a.a +b.b) be the length of the sides of the tilted square; then Ptolemy's theorem multiplies each side of one of the other squares by the opposite side of the quadrilateral, which has length s, and adds the results, asserting that the sum is equal to the product of the two diagonals; each side of this equation has a factor of s in it (either from the sides of the tilted square, or from its diagonal) and a factor of √2 (the diagonal of the tilted square has length s.√2, while the sides of the other squares are a.√2 and b.√2). Once we cancel these factors, we're left with the red line equal to a +b.
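
Spelled out as arithmetic, that application of Ptolemy's theorem looks like the following small sketch (the function and variable names are illustrative assumptions, nothing more).

```python
import math

def red_line_by_ptolemy(a, b):
    """Solve Ptolemy's relation for the unknown diagonal of the cyclic quadrilateral."""
    s = math.hypot(a, b)                  # side of the tilted square
    side_a = a * math.sqrt(2)             # side of the square of area 2.a.a
    side_b = b * math.sqrt(2)             # side of the square of area 2.b.b
    known_diagonal = s * math.sqrt(2)     # diagonal of the tilted square
    # Products of opposite sides sum to the product of the diagonals:
    #   side_a.s + s.side_b = red.known_diagonal
    return (side_a * s + s * side_b) / known_diagonal

print(red_line_by_ptolemy(3, 4))          # close to 7 for the 18, 32, 25 puzzle
```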

This is more elegant than my initial brute force solution (which is so ugly I've left it out entirely), but I still consider Ptolemy's theorem to be, if not quite brute force, at least a mighty powerful tool. So can we find a more elegant solution, using less powerful tools ? One thing we can do is to unpack the proof of Ptolemy's theorem and notice any parts of it that take a simpler form in our case; indeed, Ptolemy's theorem is really just a result of each chord's length being the circle's diameter times the sine of the angle the chord subtends at a point on the circle.

I've marked two of the angles, between diagonals and sides of the quadrilateral, with a single arc; these are equal as each is subtended by the a.√2 chord; let these angles have size u. I've marked two others with a double arc; these are the complement of u in a right angle, turn/4 −u, whose sine is Cos(u), each subtended by the b.√2 chord. The remaining angles between diagonals and sides of our quadrilateral are all half right-angles, turn/8. The right-angle triangle with u and its complement at the ends of a diagonal of the tilted square has sides s.√2 (the hypotenuse, a diameter of the circle), a.√2 and b.√2; so Sin(u) = a/s, Cos(u) = b/s. The length we were asked for subtends angle u +turn/8 on one side and its complement in half a turn on the other side; the sine of either, given Cos(turn/8) = 1/√2 = Sin(turn/8), is just Sin(u +turn/8) = (Sin(u) +Cos(u))/√2; when we multiply this by the diameter, s.√2, we duly get s.Sin(u) +s.Cos(u) = a +b.
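
The same computation, done via chord = diameter times sine of the subtended angle, can be sketched like this. The names are again hypothetical, and math.tau plays the role of a whole turn.

```python
import math

def red_line_by_chord(a, b):
    """Length of the chord subtending angle u + turn/8 in a circle of diameter s.sqrt(2)."""
    s = math.hypot(a, b)            # side of the tilted square
    diameter = s * math.sqrt(2)     # its diagonal, a diameter of the circumcircle
    u = math.asin(a / s)            # angle subtended by the a.sqrt(2) chord
    return diameter * math.sin(u + math.tau / 8)

print(red_line_by_chord(3, 4))      # close to 7, i.e. a + b
```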

Digression

So the result just reduces to the sine of a sum formula: and we're in the special case where one of the two angles being added is turn/8, whose sine and cosine are equal. So is there an elementary way to show that Sin(u +turn/8).√2 = Sin(u) +Cos(u) ? An obvious way to approach that is to construct a figure in which those quantities appear as lengths.

Consider a unit square, with √2 diagonal, within its circumcircle; this can give us turn/8 as the angles between its diagonals and edges. Construct a line outside the square, from one corner, at angle u to one of the sides out of that corner. This forms angle u +turn/8 with the diameter of the circle that's the diagonal of the square from our chosen corner to the opposite corner of the square. Where this line next cuts the circle, we can connect the other end of the resulting chord to the far end of that diagonal to complete a right-angle triangle; this connecting edge has length Sin(u +turn/8).√2. We can equally continue the chord to the point where it meets a line perpendicular to it through the square's corner off the given diagonal on the same side as the chord, forming another right-angle triangle, with u in one corner and a unit side of the square as hypotenuse. So the extended chord and the perpendicular that joins it to the other corner have sides Cos(u) and Sin(u).

If we now quarter-turn this last right-angle triangle about the off-diagonal corner, mapping its hypotenuse to the other unit side into that corner, its Cos(u) side forms part of our earlier Sin(u +turn/8).√2 side, which is parallel to our original Sin(u) side, whose rotated image is the perpendicular from the off-diagonal corner to where the image of Cos(u) ends along the Sin(u +turn/8).√2 side. This last is parallel to our original Cos(u) side and we now have a quadrilateral formed by the Sin(u) edge, its rotated image, the tail of Cos(u) past where Sin(u +turn/8).√2 cut it and the surplus of Sin(u +turn/8).√2 past the end of the image of Cos(u). Since all four sides are perpendicular and two perpendicular sides are Sin(u), this is a square and we have duly established that Sin(u +turn/8).√2 = Cos(u) +Sin(u).

So that has at least inspired me to find a nice proof of a special case of the angle sum formula (among other things, it only works for u < turn/8), and I suspect we could now work our way back through the parts before to tidy things up and maybe get a somewhat more graceful solution, but by this point I'm looking at the figure differently and it's given me an idea.
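
The geometric construction has its restrictions, but the identity itself, being an instance of the usual sine of a sum formula, holds for every u. A quick numerical scan (an illustrative sketch only, with a hypothetical function name) bears that out.

```python
import math

def identity_gap(u):
    """Absolute difference between the two sides of Sin(u + turn/8).sqrt(2) = Sin(u) + Cos(u)."""
    lhs = math.sin(u + math.tau / 8) * math.sqrt(2)
    rhs = math.sin(u) + math.cos(u)
    return abs(lhs - rhs)

# Scan a full turn in steps of a hundredth of a turn.
worst = max(identity_gap(k * math.tau / 100) for k in range(100))
print(worst)    # rounding error only
```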

Start Over

First notice that, if we drop a perpendicular onto the red diagonal, from each of the corners where the tilted square is anchored to the others, this perpendicular cuts the diagonal into a on one side and b on the other, since this perpendicular and the red diagonal form an isosceles right-angle triangle with a side of one of the parallel squares. The perpendiculars are, indeed, of lengths a and b; and each completes, with the part of the red diagonal that has the other as length, a right-angle triangle with u or its complement in the top left corner. Even without knowing the answer, we can drop those perpendiculars and know that their lengths are a and b; we also know the angles in the top left corner match the angles of our right-angle triangle whose hypotenuse is a diagonal of the tilted square and perpendicular sides are sides of the two squares that meet at the origin. Since the right-angle triangles formed by our perpendiculars and the top left corner are similar to this larger right-angle triangle, we can duly infer the lengths of their parts along the red diagonal; and that necessarily gives us the answer. So now let's rewind to the start and apply what we've learned.

First, notice that the two untilted squares only contribute one edge each; aside from saying that the one edge each square contributes is perpendicular to the other, the rest of each square is only present to, via its area, say how long that one edge is. The solution uses an isosceles right-angle triangle with that edge as hypotenuse; so we could replace each of the parallel squares by a square whose diagonal is that one edge (so that one half of it is the isosceles right-angle triangle just mentioned).

We thus have three squares whose diagonals form a right-angle triangle; the areas of the squares whose diagonals are the perpendicular sides are a.a and b.b, while the square whose diagonal is the hypotenuse has the sum of these as its area. Since the other two squares' diagonals connect ends of this hypotenuse to a right angle, their shared corner lies on the circle whose diameter is the hypotenuse; and that circle is the circumcircle of the square whose diagonal is the hypotenuse. So, as before, we infer that the red line bisects the right angle; but now each square on a perpendicular side of the triangle has an edge along that angle bisector. We also get the hypotenuse square's bottom right corner on the far edge of one of the others; their edges perpendicular to the angle bisector are colinear.

The portion of the hypotenuse-square that isn't inside the other two squares is cut, by the red line, into two triangles; each has a right angle where its boundary with one of the other squares meets the red line; each has an angle in the top left corner; their two angles there add up to a right angle, so the other angle in each is equal to the other triangle's angle at that corner. The two triangles thus have the same angles; and their hypotenuses are sides of the original tilted square, so equal; thus these two triangles are equal. Each has one side of a non-hypotenuse square as a side, so their two perpendicular sides indeed have lengths a and b; one of these is along the red line, the rest of which is a side of a square with the other as length. Thus we infer that the red line's length is the sum of the two non-hypotenuse squares' sides. Notice, in passing, that our diagram has in fact become one of the classic proofs of Pythagoras's theorem (albeit half-turned from the version linked).
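
To see the pieces of that argument as numbers, here's a hedged sketch. It assumes a particular layout (origin at the right angle, one anchoring corner on the y-axis, the red line along y = x, as the bisector argument implies) and a hypothetical function name. It drops the perpendicular from one anchoring corner and uses Pythagoras on the resulting triangle, whose hypotenuse is a side of the tilted square.

```python
import math

def red_line_pieces(a, b):
    """Return (piece of red line up to the foot, perpendicular length, remaining piece)."""
    p = (0.0, a * math.sqrt(2))         # anchoring corner at distance a.sqrt(2) from the origin
    t = (p[0] + p[1]) / 2
    foot = (t, t)                       # foot of the perpendicular from p onto y = x
    along = math.hypot(*foot)           # first piece of the red line
    perp = math.dist(p, foot)           # length of the perpendicular
    s = math.hypot(a, b)                # side of the tilted square, the triangle's hypotenuse
    rest = math.sqrt(s * s - perp * perp)   # remaining piece of the red line
    return along, perp, rest

along, perp, rest = red_line_pieces(3, 4)
print(along, perp, rest, along + rest)  # close to 3, 3, 4 and 7
```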

So sometimes, starting with a brute-force solution and working backwards, we can discover an elegant solution, lurking in its workings.

The equiangular or exponential spiral

Let's take another example. I met this when a teacher introduced my class to various spirals defined in various terms. One of these is the equiangular spiral, whose tangent meets the radius from its centre in the same angle at every point on the curve. Another is the exponential spiral, for which, as a ray from the centre to a point on the spiral sweeps round to track along the spiral, the mapping from angle turned to radius is a homomorphism from the additive structure on angles to the multiplicative structure on radii: that is, the radius r of the sweeping ray and angle a swept vary as r/R = exp(a/A) for some constant length R and angle A. (Just to confuse matters, I've seen some folk use the name logarithmic spiral for the exponential spiral. It shows up in nature as the growth pattern of a snail's shell, for example.) Our teacher (SDB) sent us away with the task of proving, by the next class, that these were the same spiral.

I started by sketching out in my head the steps one must take to do the BF&I solution – reduce the curve to local cartesian co-ordinates, differentiate one with respect to the other, interpret the derivative as tangent of the angle the curve makes to a co-ordinate axis, infer the angle it thus makes to a radius, verify that this is constant. It was quite obvious this would be ugly and take lots of hard work, with ample scope for making mistakes. So I thought about it at length in hopes of finding a less cumbersome solution – and found one.
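
A numerical stand-in for that BF&I route can be sketched in a few lines of Python. This is an illustration only, not the working anyone produced at the time: it replaces the symbolic derivative with a central difference, and the function name, the default values of R and A and the step size h are all assumptions.

```python
import math

def tangent_radius_angle(t, R=1.0, A=5.0, h=1e-6):
    """Angle (radians) between radius and tangent of r/R = exp(a/A) at swept angle t."""
    def point(angle):
        r = R * math.exp(angle / A)
        return r * math.cos(angle), r * math.sin(angle)   # cartesian co-ordinates

    x, y = point(t)
    x_after, y_after = point(t + h)
    x_before, y_before = point(t - h)
    dx = (x_after - x_before) / (2 * h)    # central-difference estimate of the tangent
    dy = (y_after - y_before) / (2 * h)
    cross = x * dy - y * dx                # proportional to sine of the angle
    dot = x * dx + y * dy                  # proportional to cosine of the angle
    return math.atan2(cross, dot)

for t in (0.0, 1.0, 2.5, 7.0):
    print(t, tangent_radius_angle(t))      # the same angle each time, up to rounding
```

For angles measured in radians the constant angle should come out as the arctangent of A, since the tangent of the angle between radius and curve is r divided by the rate of change of r with angle.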

In the next class, my teacher and a classmate (Ben) had solutions that Did This The Hard Way – i.e. the BF&I solutions I'd contemplated and opted to avoid – while I had a nice elegant solution. Sadly, aged 17, I hadn't yet learned how to communicate things that I understood. (This is a common problem. It's easy to articulate a truth that you understand in a way that someone else who understands it – say, an examiner – shall recognise as showing you understand it. One of the problems with most education systems – and, especially, the testing used to determine how well pupils have fared in them – is that an answer that persuades the examiner you knew what you were doing is considered good enough. What's much harder (but far more useful, and satisfying) is to articulate it in a way that enables someone who didn't understand it, before you told them, to join you in understanding it.) I utterly failed to communicate my proof, although Simon eventually managed to work out what I was gabbling about. Let me take a shot at it now.

Our exponential curve is characterised by r/R = exp(a/A), with R and A constant, r and a as polar co-ordinates. A rotation of the plane through angle b maps any point [r, a] to the point [r, a+b]; an enlargement by factor k maps [r, a] to [k.r, a]. Both rotations and enlargements preserve angles. So let's rotate and enlarge our curve; the point [r, a] = [R.exp(t/A), t] gets mapped to [k.R.exp(t/A), t+b] which has a = t+b and r = k.R.exp(t/A) = k.R.exp(a/A −b/A) = (k.R/exp(b/A)).exp(a/A). So if we rotate through an angle b and then scale by a constant k = exp(b/A), we map our curve to r = R.exp(a/A), i.e. itself. Of course, we've mapped the point [R.exp(t/A), t] to the point [R.exp(b/A).exp(t/A), t+b], but this is still a point on the same curve. In particular, for any two points on the curve, there is a mapping of this form that maps one of the points to the other, while mapping the whole curve to the whole curve. In the process, the same scale-and-rotate operation must map the tangent and radius at any point of the curve to the tangent and radius at the image of that point on the curve. Since neither scaling nor rotation changes angles, the tangent must then meet the radius in the same angle; and we can do this for any pair of points on the curve, so every point on the curve has the same angle between tangent and radius.
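
The key step, that rotating through b and enlarging by exp(b/A) maps the curve to itself, is easy to check numerically. The following is a minimal sketch with hypothetical function names; points are held as polar pairs (r, a), with angles in the same units as A.

```python
import math

def rotate_and_scale(r, a, b, A):
    """Rotate the polar point [r, a] through angle b, then enlarge by k = exp(b/A)."""
    return math.exp(b / A) * r, a + b

def on_spiral(r, a, R, A, rel_tol=1e-9):
    """Does the polar point [r, a] satisfy r/R = exp(a/A), to within rounding?"""
    return math.isclose(r, R * math.exp(a / A), rel_tol=rel_tol)

R, A, b = 2.0, 3.0, 1.25
for t in (-2.0, 0.0, 0.7, 5.0):
    r, a = rotate_and_scale(R * math.exp(t / A), t, b, A)
    print(t, on_spiral(r, a, R, A))    # True for each t
```

Each image point lands back on the same curve, at swept angle t+b, which is all the argument needs.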

Written by Eddy.