Monkey see, monkey program

One of the commonplaces of western culture is that, if enough monkeys were given typewriters and left to mess with them for long enough, one of them would type the complete works of Shakespeare. [I feel obliged to note that this is irrelevant except in so far as one detects that one of them has done so; which complicates the matter.] Since 1982 I have mostly been employed to maintain computer programs written by other people. In that time, it has become abundantly clear to me that there are two ways to write a computer program: work out what the program needs to do, then write code that does it; or clatter away at the keyboard, more or less at random, until what comes out appears to work.

Naturally, I refer to the latter as monkey programming (with sincere apologies to any monkeys who may be reading this – I realize you're smarter than that – it's just a reference to the cultural myth above). What I find most distressing about it is that I have had to fix the resulting gibberish – though the fact that it happens at all, and that the large primates who did it actually got paid for doing so, is also pretty distressing. Since many of those who are responsible for deciding who to hire for a programming job are utterly ignorant of what separates a monkey from a programmer, there are significant numbers of monkeys who've made a highly lucrative career of working in each job until a little before any blame can be pinned on them, then moving to a new job. The industry's pay-schemes are so ill-considered (we have to pay you significantly more than your last job to persuade you to come to us; but we don't feel any need to give our existing staff pay-rises even in line with inflation) that hopping from job to job raises one's income faster than staying in a job for any length of time, so these monkeys are often better paid than the folk who are fixing the abominations they have left behind.

Generally, I've seen the most simian programming in the jobs which have paid least well and/or had the least stringent interview processes. A job where you actually have to write code and make it solve some noddy problem, as part of the interview process, is less apt to inflict this on me. Since about 1993 I've been involved with the web, which has meant that I've had to cope with code written by people who don't work for the same employer as me. Now that I work on a user agent for the world-wide web, I have to deal with bugs in random web-sites which may incorporate programs (generally in JavaSpit) written by monkeys who work for what I should probably term fourth parties – we and our customer are the first and second parties, in some order, and the third party is whoever's web-site it is, but they've delegated some of their site's behaviour to someone else, whose code has been written by monkeys.

In reality, simian programmers don't simply clatter randomly on their keyboards: they frequently look for some piece of program that does something like what they have in mind, copy it, then randomly vary it until it does the thing they were asked to achieve (at least when tested with the noddy test-cases they've been given). Programmers who start by understanding can generally be relied on, if what they want to do is almost the same as what some existing piece of code does, to encapsulate the original in a function and arrange for the original code to call the function with one set of inputs (that prompt it to do what they used to want) while the new code calls the same function with subtly different inputs, prompting it to do the new task. The advantage of this approach is that, when you fix a bug in the shared function, it fixes the bug in both places that use the function – whereas the copy-and-paste solution lets you fix the bug in one place while leaving it still active in the other (and all the other others copied from it).
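The refactoring described above can be sketched in C. This is only an illustration under invented assumptions: the names scale_by_percent, price_with_tax and price_with_discount, and the tax and discount rates, are hypothetical and come from no real code-base.

```c
/* Hypothetical illustration: two tasks that differ only in one input. */

/* The shared function: scale an amount in pence by a rate in percent,
 * rounding to the nearest penny (halves round up).  Assumes that both
 * inputs are non-negative. */
long scale_by_percent(long pence, long percent)
{
    return (pence * percent + 50) / 100;
}

/* The original task, now a thin wrapper: add 20% tax. */
long price_with_tax(long pence)
{
    return scale_by_percent(pence, 120);
}

/* The new task: the same function, given a subtly different input
 * (here, a 15% discount). */
long price_with_discount(long pence)
{
    return scale_by_percent(pence, 85);
}
```

If the rounding rule in scale_by_percent turns out to be wrong, correcting it there corrects both callers at once; had price_with_discount been a pasted-and-tweaked copy of the original arithmetic, the fix would have had to be found and repeated in every copy.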

The other remarkable thing about simian programmers is that they don't even try to look for a document that says what they should be doing: if they bother to look at anything at all before starting to write, they first look for some existing code that shares some common keywords with what they find they need to write. I met a classic problem with this in the 1990s: the project I was working on had to talk to some standard system libraries (the X windowing system) and one of the things a programmer has to provide to some functions in these libraries is a call-back function – which the library shall call under certain well-defined circumstances, supplying it with information it needs to handle the relevant situation suitably. The relevant kind of function is known (IIRC) as an XtCallbackProc; this is a type defined to mean a function which returns void (i.e. nothing) and accepts, as its inputs, three parameters of certain specific types (which are not relevant to this discussion, and which I can't remember). One can write a function of this type without ever using the word XtCallbackProc: one need only mention the names of the input types and the (void) return type. At some point, a simian programmer had written a function which accepted inputs of the right types but returned XtCallbackProc instead of void. The tools used to turn the written program into what users actually run had doubtless complained bitterly about using the resulting function as an XtCallbackProc, but the monkey had a way to get round that (it's called a cast; it tells a thing of one type to forget that it's of that type and believe that it's of another type). After that, all the other monkeys, when they'd needed to write an XtCallbackProc, had searched for the type name and found the bad code – which they'd duly copied and adapted. When I first noticed the problem, I fixed all the examples I could find. Later, I was puzzled to see new examples spring up – and worked out what was happening (the monkeys had access to old versions of the code, so they could copy from those). When I fixed these, I took care to add a comment to each example saying "this is an XtCallbackProc" and explaining that any function whose return type was XtCallbackProc wasn't an XtCallbackProc. After that, the monkeys copied the code I'd fixed more than the code that was wrong.
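For the curious, the mistake looks roughly like this in C. It is a sketch, not the code from that project: the function names and the counter are hypothetical, and the three typedefs are stand-ins so that the fragment compiles without the X headers (real code gets them from X11/Intrinsic.h, where the callback's parameters are a Widget and two XtPointers).

```c
/* Stand-ins so this sketch compiles without the X headers; real code
 * gets these declarations from <X11/Intrinsic.h>. */
typedef struct WidgetStandIn *Widget;
typedef void *XtPointer;
typedef void (*XtCallbackProc)(Widget w, XtPointer client_data,
                               XtPointer call_data);

/* Right: this is an XtCallbackProc.  It returns void and takes the
 * three parameter types; the type's name appears nowhere in it. */
void count_press(Widget w, XtPointer client_data, XtPointer call_data)
{
    int *counter = client_data;   /* hypothetical: caller passes an int */
    (void)w;
    (void)call_data;
    ++*counter;
}

/* Wrong: this is NOT an XtCallbackProc.  It is a function that
 * returns one, which is a different type altogether. */
XtCallbackProc not_a_callback(Widget w, XtPointer client_data,
                              XtPointer call_data)
{
    (void)w;
    (void)client_data;
    (void)call_data;
    return (XtCallbackProc)0;
}

/* The right function can be used as a callback with no cast. */
XtCallbackProc good_callback = count_press;

/* Using the wrong one the same way draws a complaint from the compiler:
 *     XtCallbackProc bad = not_a_callback;
 * The cast silences the complaint without fixing anything:
 *     XtCallbackProc bad = (XtCallbackProc)not_a_callback;
 * Calling through such a pointer is undefined behaviour in C. */
```

The giveaway, for anyone reviewing such code, is the cast itself: a function that really has the right type never needs one to be passed as a callback.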

The scary thing about the world-wide web is that a whole new sub-culture is writing things that are, in varying degrees, computer programs – without any training in the lessons that the software industry has been learning since the 1950s (yes, there were programmers before that: but so much was idiosyncratic before then that the lessons intelligible to programmers of the present didn't start to crystallise until the 1950s). Those of us who first wrote computer programs under the supervision of computer programmers understand the idea of encapsulation (putting related facts into a common data structure so that they can be accessed together, while separating them from other facts, and ensuring that any piece of code can only see the facts it needs to see) and the abstraction of the function (an isolated piece of code which does just one thing, that's clearly (and ideally simply) specified) which make it possible to reduce complex problems to simple parts. Those who learned to create dynamic content on the web by monkeying around with things they found on the web, without reading about why those things should work, have recapitulated all the clumsy and stupid mistakes that programmers spent the 1960s through 1980s learning to not do.

The web-monkeys come to the game with one huge advantage over the programmers of the 1950s (ignoring, since monkeys don't read such things, the wealth of literature on how to not do it horribly wrong): the languages in which they are called upon to program (Java, if they're lucky, and ECMAScript – a.k.a. JavaScript – more usually) have been designed by people who'd learned some of the lessons of the intervening decades. These provide for structured data abstractions (and even object-oriented programming, albeit – in ECMAScript's case – in a rather inept form) and free the programmer of any need to think about how machine resources are actually being used. They also don't expect the broad mass of programmers to have brains, let alone actually engage them (as Lisp does, to its eternal credit). As a result, idiots can use them and, unless really determined, not make half such disastrous messes as were quite normal in For(mula)-Tran(slation). They can still write their code in abysmal ways – many of them think that deleting all optional spaces in their code either saves band-width on their servers (if it does so at all, it's negligible) or prevents others from stealing their clever ideas (this is utterly hopeless: first, because their clever ideas are half a century old; but even if they were genuinely new, programmers have had pretty-printing software for decades, which can take obscurified code and display it in a way that's easy to read) – but the languages they're using try quite hard (with occasional lapses) to seduce them into not doing it as ineptly as was all too easy for those working in less well-designed languages.

One of the huge ironies of the web is that the economic incentives actually militate in favour of the monkeys. If you're asked to build a web-site for someone, you can approach the problem two ways. The Good Way is to write a site that works by the web standards, then bodge it to code round the deficiencies and exploit the special features of the small number of browsers that are so widely used that your work shall be judged by how good it looks on them. The industry standard approach is to write a site that works very nicely in the most widely-used of those browsers; then ask for more money to make it work not entirely dreadfully in the next most widely-used; and make sure, at each stage, to rely on any extensions and deviations from the standards, in each, so as to ensure that your client shall find it all unusable with any other browser, so that you can bill them yet again for bodging it some more. If you do it The Good Way, you get paid once: the other way sets you up for a perpetual revenue stream (bear in mind that even the entrenched market leader has the occasional new release, which shall require upgrade work). It could be argued that the former shall do you good by gaining you a good reputation, but you're in a market where the customer is pig-ignorant and begging to be led up the garden path: in practice, the industry-standard solution is ubiquitous because the web-designers who practise it get more business than the ones who do even a half-way decent job.

It is worth pointing out, in this context, that there is an important difference between industry-standard and best-practice. In most engineering disciplines, these two notions draw steadily closer together: but, when it comes to software, they are so far from one another that the gulf is a compelling argument for refusing to allow software to be described as a branch of engineering. There is some confusion in the software industry between these two terms: I have seen at least one excellent diatribe against best-practice from an author normally remarkable for his good sense. When I substituted industry-standard for best-practice throughout his essay, it made perfect sense, but he'd been tricked – by the propaganda of those he's arguing against – into accepting their abuse of language; as a result, he bad-mouthed the wrong thing and undermined his own point by blaming the very body of expertise which he should have been commending. The single biggest problem with the software industry is that it is industry-standard practice to treat best-practice with contempt or, quite often, to simply be ignorant of it. It has been best-practice for several decades to write code which conforms to the specification of the language the code is meant to be in: but it is still standard practice to write code which only works for one compiler.

Written by Eddy.