Mathematics provides some powerful tools for solving many problems. This
often means that a given problem can be solved in various ways. Some of those
ways may be easier to think of, or to follow as they lead from problem to
solution. Closely related to this, some solutions may be more illuminating than
others – which is about how clearly the reader shall understand why the
result is true. It is usually much easier to be confident of the truth of a
clear argument than that of more convoluted reasoning – sometimes an
apparently clear argument can be misleading or an obvious result can be
false, but the more straightforward the reasoning and the fewer steps there are
to scrutinise, the easier it is to detect flaws and hidden assumptions.
I tend to think of this as the mathematical analogue of the difference
between neatly splitting logs by lining up an axe with the grain of the wood, as
compared to hacking away at the log with a chainsaw. The chainsaw always
works, whereas it may at times be hard (e.g. when there are knots in the wood) to
find a good line to hit the log along; but when one can do the job with the axe
the result is graceful and efficient, in ways the chainsaw can never approach.
A phrase I've met in engineering circles (in particular among military
engineers) sums up the chainsaw solution quite nicely: brute force and
ignorance. BF&I is a well-established class of techniques for
dealing with problems. Even though there may be a satisfyingly elegant way to
solve a problem – that would be much quicker once one has thought of
it, and likely cheaper, quieter and more efficient at the same time –
it is usually easier to see how to solve a problem with BF&I than to see how
to solve it gracefully. It is almost always more satisfying to solve a problem
in the more graceful way, and it'll impress one's peers (although not
necessarily management, when they think "well, apparently that was easy"
because they don't understand what they just saw), but there is great value in
knowing how to solve problems by BF&I, too. If nothing else, BF&I can
tell you what the answer is, which can help you to think
about why it's that, which can be a great help in discovering the more
graceful solution.
For example, in geometry, one can reason out some results in ways that leave the reader with a clear sense of why the result is inevitable, even obvious once explained; yet the same result may sometimes be obtained by less obvious paths. In particular, the combination of algebra and trigonometry provides us with a systematic way to reason about geometric truths, that can usually be bludgeoned into answering questions in geometry: yet many problems in geometry are far clearer when solved geometrically than when one uses algebra and trigonometry to solve the problem.
So let's consider an example, posed by Catriona Shearer on 2019-10-11. A square's diagonal is split into a long part c and a short part a by a point that's distance b from one of the off-diagonal corners: if c = a +b, what's the angle between the diagonal and the line from that point to that off-diagonal corner ?
There's a simple and elegant way to solve that problem, but I couldn't
initially see it, so I sat down and solved it by
brute force –
i.e. by using trigonometry and algebra.
Our diagonal's length is a +c = 2.a +b, so the side of our square is this over √2; the components of b parallel to the sides of the square are thus a/√2 in one direction and c/√2 in the other; squaring, summing and doubling, with c = a +b, we get 2.b.b = a.a +c.c = 2.a.a +2.a.b +b.b whence (b −a).(b −a) = b.b −2.a.b +a.a = 3.a.a and b/a = 1 ±√3; but both b and a are positive, so this must be 1 +√3, giving us b = (√3 +1).a, a = b.(√3 −1)/2 and c = b.(√3 +1)/2; the diagonal is a +c = b.√3, so the square's sides are b.√(3/2). (Intuition might now be screaming something about those factors of √3 meaning there's a turn/6 angle somewhere; and there's actually a nice little clue in the fact that the diagonal is 2.a +b; but let's press on with BF&I.) We can now apply the cosine rule with b and c coming out of the angle and b.√(3/2) opposite the angle, to get
3.b.b/2 = b.b +c.c −2.b.c.cos(?) = b.b.(2 +√3/2 −(√3 +1).cos(?))
whence cos(?) = 1/2 and the angle we were asked after is turn/6.
Well, when the answer's as neat as that, there's probably a more elegant way to obtain it. Still, it was a Saturday morning and I'd finished my coffee, so it was time to get on with the day. While I was out shopping, I thought about it. I tried various ideas that didn't shed any light on the problem before the utterly neat solution suddenly popped into my head: just reflect the figure in the other diagonal. At this point, we split c into a piece of length a in the corner opposite our original a plus a middle part, between the two parts of length a: and that middle part's length is just c −a = b; each end of it is joined to our off-diagonal corner by a line also of length b, so we have an equilateral triangle and of course the angle we were asked about had to be turn/6 (a.k.a. 60°). I've hidden the diagram that shows this, as it'd be a spoiler for anyone reading the start of this example (on a screen big enough that they can see this far down): select the check-box at right to display it.
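As a sanity check on the brute-force working, here is a minimal Python sketch that rebuilds the figure in co-ordinates and measures the angle directly. The co-ordinate layout (one end of the diagonal at the origin, the square's sides along the axes) and the choice a = 1 are assumptions made purely for illustration; the puzzle itself fixes neither.

```python
import math

# Assumed scale: take a = 1; the brute-force working gives b = (1 + sqrt(3)).a.
a = 1.0
b = (1 + math.sqrt(3)) * a
c = a + b                           # the puzzle's condition, c = a + b
side = (a + c) / math.sqrt(2)       # diagonal a + c, divided by sqrt(2)

# Assumed layout: diagonal runs from (0, 0) to (side, side).
P = (a / math.sqrt(2), a / math.sqrt(2))   # the point, distance a along the diagonal
B = (side, 0.0)                            # one off-diagonal corner
C = (side, side)                           # far end of the diagonal

# Vectors out of P: along the long part of the diagonal, and to the corner B.
to_C = (C[0] - P[0], C[1] - P[1])
to_B = (B[0] - P[0], B[1] - P[1])

dist_PB = math.hypot(*to_B)                # should reproduce b
cos_angle = (to_C[0] * to_B[0] + to_C[1] * to_B[1]) / (
    math.hypot(*to_C) * math.hypot(*to_B))
angle_deg = math.degrees(math.acos(cos_angle))

print(dist_PB, b, angle_deg)
```

Up to floating-point rounding, the distance from the point to the corner comes out equal to b and the angle comes out as 60°, i.e. turn/6, matching both solutions.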
Notice that this second solution is direct and simple: anyone who knows the bare minimum of geometry – that the sum of angles within a triangle is always equal to a half turn, and must be shared equally among the angles if the sides are all equal – can work it out this way. In contrast, the other solution required me to use Pythagoras's theorem, solve a quadratic equation and then use the cosine rule (which can be viewed as a generalisation of Pythagoras's theorem). Even mathematicians to whom the BF&I solution is easy still prefer the elegant and graceful solution; yet the fact that we can fall back on the BF&I solution lets us solve problems even when we're too dumb to understand why our answers are right. That is both a blessing (yay ! solved :-) and a curse (because sometimes we know something is true, despite not really understanding why).
Yet another of Catriona's puzzles. This one can, in fact, be generalised: replace 18 with 2.a.a for any a and 32 with 2.b.b for any b; then replace 25 with a.a +b.b. My brute force solution was to use Cartesian co-ordinates and solve for the location of the top left end of the red line, relative to the other end; this revealed that two corners (not constrained to be corners of the other two squares) of the tilted square are necessarily on the diagonals through the meeting-corner of the first two squares: one is distance a−b from the corner, on the shared diagonal of the two squares, the other is distance a+b away on the perpendicular diagonal. That puts the corner inside the bigger of the first two squares at the centre-point of their shared diagonal. So: how much of that can be worked out more elegantly ?
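The puzzle's diagram isn't reproduced in the text, so the following Python sketch rests on an assumed layout reconstructed from the description: the two parallel squares share a corner at the origin, one contributes an edge of length a.√2 up the y-axis, the other an edge of length b.√2 along the x-axis, and the far ends of those two edges are opposite corners of the tilted square. The function name `tilted_square` and the orientation are hypothetical; only the lengths come from the text.

```python
import math

def tilted_square(a, b):
    """Corners of the tilted square under the assumed layout described above."""
    root2 = math.sqrt(2)
    p = (0.0, a * root2)          # far end of the first square's edge (y-axis)
    q = (b * root2, 0.0)          # far end of the second square's edge (x-axis)
    # p and q are opposite corners; the other two corners sit at the midpoint
    # plus or minus half of p->q turned through a quarter turn.
    mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    half = ((q[0] - p[0]) / 2, (q[1] - p[1]) / 2)
    far = (mid[0] - half[1], mid[1] + half[0])    # far end of the red line
    near = (mid[0] + half[1], mid[1] - half[0])   # corner inside the bigger square
    return p, q, far, near

# The original puzzle: areas 18, 32 and 25, so a = 3 and b = 4.
p, q, far, near = tilted_square(3, 4)
red_length = math.hypot(*far)
near_distance = math.hypot(*near)
tilted_area = (far[0] - p[0]) ** 2 + (far[1] - p[1]) ** 2

print(red_length, near_distance, tilted_area)
```

With a = 3 and b = 4 this gives, up to rounding, a red line of length 7 = a +b, a near corner at distance 1 from the origin on the shared diagonal, and a tilted square of area 25, in line with the brute-force findings described above.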
I'll take the shared corner of the two parallel squares as origin and describe positions in relation to it. The first thing to notice is that the origin is at the right-angle corner of a triangle whose hypotenuse is the diagonal of the tilted square; so the circumcircle of that tilted square must actually pass through the origin. Now, the top edge of the tilted square is a chord of the circle and so subtends, at the origin, the same angle as it subtends at the bottom left corner of that square, as both angles are at points on the circle, on the same side of the chord. The latter angle is plainly half a right-angle, hence so is the former, whence we can infer that the red line is actually the angle bisector of the right angle at the origin. Now the other diagonal of the tilted square, connecting the corners that aren't shared with the parallel squares, is a diameter of the circle, so subtends a right angle at the origin; whence, indeed, the bottom right corner of the tilted square is on the shared diagonal of the two parallel squares.
The proof by Ptolemy's theorem looks at the quadrilateral within the tilted square's circumcircle that uses two sides of the tilted square and one side of each of the other two squares; we know the lengths of these. One of the diagonals of this quadrilateral is the line whose length we want to know; the other is a diagonal of the tilted square, whose length we know. Let s = √(a.a +b.b) be the length of the sides of the tilted square; then Ptolemy's theorem multiplies each side of one of the other squares by the opposite side of the quadrilateral, which has length s, and adds the result, asserting that the sum is equal to the product of the two diagonals; each side has a factor of s in it (either from the sides of the tilted square, or from its diagonal) and a factor of √2 (the diagonal of the tilted square has length s.√2, while the sides of the other squares are a.√2 and b.√2). Once we cancel these factors, we're left with the red line equal to a +b. This is more elegant than my initial brute force solution, but I still consider Ptolemy's theorem to be brute force (its proof is not elementary: it depends on using trigonometry, notably including the formulae for sines and cosines of sums of angles, and algebraic rearrangement). So can we find a more elegant solution ?
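Written out as a worked equation, the Ptolemy argument amounts to the following. The symbol d for the unknown length of the red line is introduced here only for illustration; a, b and s are as defined in the text.

```latex
\[
  (a\sqrt{2})\,s + (b\sqrt{2})\,s = d\,(s\sqrt{2})
  \quad\Longrightarrow\quad
  d = a + b .
\]
```

The left side sums the products of opposite sides of the quadrilateral; the right side is the product of its diagonals, and the common factor s.√2 cancels.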
One thing we can do is to unpack the proof of Ptolemy's theorem and notice any parts of it that take a simpler form in our case; indeed, Ptolemy's theorem is really just a result of each chord's length being the circle's diameter times the sine of the angle the chord subtends at a point on the circle. I've marked two of the angles, between diagonals and sides of the quadrilateral, with a single arc; these are equal as each is subtended by the a.√2 chord; let these angles have size u. I've marked two others with a double arc; these are the complement of u in a right angle, turn/4 −u, whose sine is Cos(u), each subtended by the b.√2 chord. The remaining angles between diagonals and sides of our quadrilateral are all half right-angles, turn/8. The right-angle triangle with u and its complement at the ends of a diagonal of the tilted square has sides s.√2 (the hypotenuse, a diameter of the circle), a.√2 and b.√2; so Sin(u) = a/s, Cos(u) = b/s. The length we were asked for subtends angle u +turn/8 on one side and its complement in half a turn on the other side; the sine of either, given Cos(turn/8) = 1/√2 = Sin(turn/8), is just Sin(u +turn/8) = (Sin(u) +Cos(u))/√2; when we multiply this by the diameter, s.√2, we duly get s.Sin(u) +s.Cos(u) = a +b. So the result just reduces to the sine of a sum formula: and we're in the special case where one of the two angles being added is turn/8, whose sine and cosine are equal. So is there an elementary way to derive that Sin(u +turn/8) = (Sin(u) +Cos(u))/√2 ?
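The chord-length argument can be checked numerically with a short Python sketch. It assumes angles in radians, so that turn/8 becomes π/4; the function name `red_line_via_sines` is hypothetical, introduced only for this illustration.

```python
import math

def red_line_via_sines(a, b):
    """Chord length as diameter times sine of the subtended angle."""
    s = math.hypot(a, b)            # side of the tilted square, sqrt(a.a + b.b)
    u = math.asin(a / s)            # Sin(u) = a/s, hence Cos(u) = b/s
    diameter = s * math.sqrt(2)     # diagonal of the tilted square
    return diameter * math.sin(u + math.pi / 4)   # subtended angle u + turn/8

def sine_sum_special_case(u):
    """Right-hand side of the identity: (Sin(u) + Cos(u)) / sqrt(2)."""
    return (math.sin(u) + math.cos(u)) / math.sqrt(2)

length = red_line_via_sines(3, 4)
print(length)
```

For a = 3 and b = 4 the result agrees with a +b = 7 up to rounding, and `sine_sum_special_case(u)` agrees with `math.sin(u + math.pi / 4)` for any u tried; this is a numerical check, not the elementary derivation the text goes on to look for.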
One thing to notice is that, if we drop a perpendicular onto the red diagonal, from each of the corners where the tilted square is anchored to the others, this perpendicular cuts the diagonal into a on one side and b on the other, since this perpendicular and the red diagonal form an isosceles right-angle triangle with a side of one of the parallel squares. The perpendiculars are, indeed, of lengths a and b; and each completes, with the part of the red diagonal that has the other as length, a right-angle triangle with u or its complement in the top left corner. Even without knowing the answer, we can drop those perpendiculars and know that their lengths are a and b; we also know the angles in the top left corner match the angles of our right-angle triangle whose hypotenuse is a diagonal of the tilted square and perpendicular sides are sides of the two squares that meet at the origin. Since the right-angle triangles formed by our perpendiculars and the top left corner are similar to this larger right-angle triangle, we can duly infer the lengths of their parts along the red diagonal; and that necessarily gives us the answer.
So now let's re-wind to the point where the tilted square's circumcircle has told us the red diagonal bisects the right angle where the two parallel squares meet. First, notice that the two untilted squares only contribute one edge each; the rest of the square is only present to, via its area, say how long that one edge is. The solution uses an isosceles right-angle triangle with that edge as hypotenuse; so we could replace each of the parallel squares by a square whose diagonal is that one edge (so that one half of it is the isosceles right-angle triangle just mentioned).
We thus have three squares whose diagonals form a right-angle triangle; the areas of the squares whose diagonals are the perpendicular sides are a.a and b.b, while the square whose diagonal is the hypotenuse has the sum of these as its area. Since the other two squares' diagonals connect ends of this hypotenuse to a right angle, their shared corner lies on the circle whose diameter is the hypotenuse; and that circle is the circumcircle of the square whose diagonal is the hypotenuse. So, as before, we infer that the red line bisects the right angle; but now each square on a perpendicular side of the triangle has an edge along that angle bisector. We also get the hypotenuse square's bottom right corner on the far edge of one of the others; their edges perpendicular to the angle bisector are collinear.
The portion of the hypotenuse-square that isn't inside the other two squares is cut, by the red line, into two triangles; each has a right angle where its boundary with one of the other squares meets the red line; each has an angle in the top left corner; their two angles there add up to a right angle, so the other angle in each is equal to the other triangle's angle at that corner. The two triangles thus have the same angles; and their hypotenuses are sides of the original tilted square, so equal; thus these two triangles are equal. Each has one side of a non-hypotenuse square as a side, so their two perpendicular sides indeed have lengths a and b; one of these is along the red line, the rest of which is a side of a square with the other as length. Thus we infer that the red line's length is the sum of the two non-hypotenuse squares' sides. Notice, in passing, that our diagram has in fact become one of the classic proofs of Pythagoras's theorem (albeit half-turned from the version linked).
So sometimes, starting with a brute-force solution and working backwards, we can discover an elegant solution, lurking in its workings.
Let's take another example. I met this when a teacher introduced my class to
various spirals defined in various terms. One of these is the
equiangular spiral, whose tangent meets the radius
from its centre in the same angle at every point on the curve. Another is the
exponential spiral, for which, as a ray from the
centre to a point on the spiral sweeps round to track along the spiral, the
mapping from angle turned to radius is a homomorphism from the additive
structure on angles to the multiplicative structure on radii: that is, the
radius r of the sweeping ray and angle a swept vary as r/R = exp(a/A) for some
constant length R and angle A. (Just to confuse matters, I've seen some folk
use the name
logarithmic spiral for the exponential spiral. It shows up
in nature as the growth pattern of a snail's shell, for example.) Our teacher
(SDB) sent us away with the task of proving, by the next class, that these were
the same spiral.
I started by sketching out in my head the steps one must take to do the BF&I solution – reduce the curve to local cartesian co-ordinates, differentiate one with respect to the other, interpret the derivative as tangent of the angle the curve makes to a co-ordinate axis, infer the angle it thus makes to a radius, verify that this is constant. It was quite obvious this would be ugly and take lots of hard work, with ample scope for making mistakes. So I thought about it at length in hopes of finding a less cumbersome solution – and found one.
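For comparison, here is a numerical stand-in for that brute-force route, as a minimal Python sketch. It uses finite differences rather than symbolic differentiation, so it is a check and not a proof; the constants R = 1 and A = 2 (with angles in radians), the step size h and the function names are all arbitrary assumptions chosen for illustration.

```python
import math

R, A = 1.0, 2.0     # assumed constants: length R, angle A in radians

def point(a):
    """Cartesian co-ordinates of the spiral point at swept angle a."""
    r = R * math.exp(a / A)
    return r * math.cos(a), r * math.sin(a)

def tangent_radius_angle(a, h=1e-6):
    """Angle from the radius to the tangent at angle a, by central differences."""
    x, y = point(a)
    x0, y0 = point(a - h)
    x1, y1 = point(a + h)
    tx, ty = x1 - x0, y1 - y0       # approximate tangent direction
    cross = x * ty - y * tx
    dot = x * tx + y * ty
    return math.atan2(cross, dot)

angles = [tangent_radius_angle(a) for a in (-3.0, 0.0, 1.0, 5.0)]
print(angles, math.atan(A))
```

At every sampled point the angle agrees with atan(A) to within the accuracy of the finite differences, since dr/da = r/A makes the tangent of that angle r/(dr/da) = A.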
In the next class, my teacher and a classmate (Ben) had solutions that Did This The Hard Way – i.e. the BF&I solutions I'd contemplated and opted to avoid – while I had a nice elegant solution. Sadly, aged 17, I hadn't yet learned how to communicate things that I understood. (This is a common problem. It's easy to articulate a truth that you understand in a way that someone else who understands it – say, an examiner – shall recognise as showing you understand it. One of the problems with most education systems – and, especially, the testing used to determine how well pupils have fared in them – is that an answer that persuades the examiner you knew what you were doing is considered good enough. What's much harder (but far more useful, and satisfying) is to articulate it in a way that enables someone who didn't understand it, before you told them, to join you in understanding it.) I utterly failed to communicate my proof, although Simon eventually managed to work out what I was gabbling about. Let me take a shot at it now.
Our exponential curve is characterised by r/R = exp(a/A), with R and A constant, r and a as polar co-ordinates. A rotation of the plane through angle b maps any point [r, a] to the point [r, a+b]; an enlargement by factor k maps [r, a] to [k.r, a]. Both rotations and enlargements preserve angles. So let's rotate and enlarge our curve; the point [r, a] = [R.exp(t/A), t] gets mapped to [k.R.exp(t/A), t+b] which has a = t+b and r = k.R.exp(t/A) = k.R.exp(a/A −b/A) = (k.R/exp(b/A)).exp(a/A). So if we rotate through an angle b and then scale by a constant k = exp(b/A), we map our curve to r = R.exp(a/A), i.e. itself. Of course, we've mapped the point [R.exp(t/A), t] to the point [R.exp(b/A).exp(t/A), t+b], but this is still a point on the same curve. In particular, for any two points on the curve, there is a mapping of this form that maps one of the points to the other, while mapping the whole curve to the whole curve. In the process, the same scale-and-rotate operation must map the tangent and radius at any point of the curve to the tangent and radius at the image of that point on the curve. Since neither scaling nor rotation changes angles, the tangent must then meet the radius in the same angle; and we can do this for any pair of points on the curve, so every point on the curve has the same angle between tangent and radius.
Written by Eddy.