Professionally our methods of transmitting and reviewing the results of research are generations old and by now are totally inadequate for their purpose.
from As we may think – Vannevar Bush.

Open Access

I advocate open publication of the results of research and openly publish my own writings (for what they may be worth). A great many academics now likewise argue for the public to have open access to their work. Below, I seek to explain why open access to research is so important to us all.

Our right to know

The public, through taxes, directly funds much of the research done by universities; it is only right and proper, therefore, that the public should have ready access to the results of such research. Other research done by universities may be funded by business: in so far as it is used as the basis for granting qualifications to researchers, or promoting them as public employees, the public's interest in the fairness and propriety of such consequences argues for the public to be able to scrutinise the research that's used to justify them.

Even where research done at universities is not covered by these cases, the public has just cause to better trust those universities that insist that the research done under their auspices is accessible to the widest possible audience; while there may be just cause to compromise on that to some modest degree when collaborating with other researchers, e.g. in industry, universities should push for the best deal they justly can and, in particular, should be very unwilling to embark on such collaborations without a contractual guarantee of some definite minimum time-period, from completion or cessation of the partnership, after which they may publish their researchers' account of the research they have participated in.

Regulatory authorities – such as those who control the market in drugs – base decisions on research: where (e.g. in any democracy) such authorities purport to act on behalf of the public, the research on which they base their decisions should properly be available for the public to scrutinise, alongside the regulators' explanation of their decisions (that carry the force of law). When businesses claim that their products have particular beneficial effects, advantages relative to their competitors or other objectively measurable properties, they should properly be obliged to back up such claims with (independently) peer-reviewed research which should be available for the public to scrutinise so as to assess the legitimacy of such claims. Without such guarantees, sponsors of research can create distorted impressions of the results of research, as outlined below.

There are, it must be granted, legitimate grounds why a business may wish to conduct research out of the public gaze; principally, where they do so in order to develop better products and services, that their competitors shall doubtless seek to emulate in due course. Allowing them to keep their research to themselves allows them to gain a first-mover advantage (and, where applicable, to apply for patents on the fruits of their labour) which enables them to afford to do such research. In so far as research does yield products and services which are genuinely better for the public, it is naturally in the public's interests to allow such private research. Yet, even in such a case, once the business does choose (if ever) to publish the results of such research (e.g. as part of making the public aware of the new and better products and services), both its own and the public's interest is best served by the public having full access to the results of that research.

Public access to research results, furthermore, enables educated folk to pursue private research which enables them to make contributions that academia and corporate research might miss. While this might only seldom be important, contributions from those not actually employed to do research – such as Albert Einstein in 1905, when he did some of his most important work – are also valuable to the progress of science.

Providing access

The traditional remedy for these and similar public interest grounds for the publication of research results has long been that the research is published by businesses and available for the public to purchase. When the most practical way to disseminate results was by distributing printed copies, the cost of doing so had to be defrayed somehow and a moderate profit on top of that was a reasonable thing to allow the publisher. Given that the publishers do not pay authors, peer reviewers or the editorial boards – all of this work is done by academics, as part of their duties, at no cost to the publisher – the publisher's costs in this venture are modest, which might fairly be considered grounds to moderate the profits they can fairly take for their part of the whole.

The market for such publications has, however, become grossly distorted, in large part by two facts: that researchers need access to their literature; and that the institutions that employ them are funded from the public purse. The latter ensures that any objectively justified need (such as the former) is met, hence that the publishers can increase their prices with little back-pressure from the customers who account for the vast majority of their revenues. (The customers may complain, but they do not stop paying.) As a result of this, coupled with the publishing industry's increasing rapacity, the journals in which research has traditionally been published are prohibitively expensive for private researchers (such as myself) and for institutions in poorer parts of the world.

One of the ways that publishers used to add significant value was in the copy-editing and typesetting of papers. Since roughly the 1970s, there has been a shift towards researchers submitting their papers in ready-to-print formats, that greatly reduce the work involved in typesetting. The computer programs used to write papers, furthermore, have steadily improved their ability to alert the author to spelling and even grammatical errors. While it remains necessary to have sub-editors, proof-readers and professional typesetters to review and tweak the researchers' submissions, the scale of this work has been greatly reduced; yet publishers have not lowered their prices to reflect their contribution to the whole – indeed, they have rather increased their prices dramatically, significantly faster than inflation.

At the same time, the internet has made available the means to publish research at vastly reduced cost, compared to the traditional method of physically distributing copies printed on paper. Furthermore, electronic publication opens up vastly improved potential for cataloguing, indexing and cross-referencing research results, which greatly increase the benefits humanity may hope to derive from research; and open access to published research offers by far the best scope for realising these new benefits.

Traditional publishers can, to be sure, publish on the internet: indeed, they do so, but presently only subject to expensive subscriptions. (To take one example: in June 2016, a sister sent me a link to an article her employment lets her access; the paywall told me I could view the article for a mere $54, or the whole issue of the journal for $490.) This not only excludes all but the researchers at wealthy institutions but also hampers cross-linking between papers from the journals of different publishers, while balkanising catalogues and indexes, greatly reducing the value of these benefits of the new medium. If they were to make research journals freely available to all after some modest period of time (for example, the first half year), they would satisfy enough of the public's right to know that they might reasonably hope to find the public willing to accept that as a viable compromise. They would still retain a market for subscriptions, as many research institutes would still be keen to have early access to new research; that market would be somewhat reduced, both in scale and in profit margin (the option of being patient would make the need to subscribe less acute), but there would still be a market, on which they might fairly expect to make an honest profit (albeit not the high margins they have been used to). Indexes and catalogues, at least of works more than a few months old, could achieve their full potential.

It remains that alternative models of publishing can reasonably take the internet as an opportunity to enter the market and compete with the traditional model. Various open access journals are doing just that; for example, some are shifting the point of payment from the reader to the research institutions who are so eager to see their staff's work published (with various provisions for those from poorer institutions, and for private researchers, to be subsidised by those from wealthier institutions). Various other business models are being explored. Such models can meet the needs of academia without needing to impose a time-delay on the rest of us.

Benefits even for those lost on the jargon

One may fairly object that the overwhelming majority of the public cannot make any sense of academic papers, so how do they benefit from their availability? The same objection has been raised in the case of free software, which trumpets the benefits of source code being available: and the answer to the objection is essentially the same in each case. The many who cannot directly benefit still do benefit thanks to scrutiny by those that can make more of the published work. At the same time, openness to a broader audience encourages authors to write more clearly, thus reducing the barriers that make their work hard for the public to read in the first place.

Broader Scrutiny

The general public's ability to benefit from the availability of research results starts with disinterested parties who evaluate and comment on research, provide synopses and translate results into plain terms. Open access enables those who do this, including journalists, to back up their interpretations of the primary research with references that anyone can follow. While such references may be of direct use only to a minority of readers, this minority is thus able to provide the public with a critique of such secondary accounts and to back up their critique with direct reference to relevant work. One may fairly hope that the broader portion of the public capable of searching out such critiques is sufficient to ensure that most of the public have access to some such searcher whose judgement they trust, by whom to obtain guidance on making sense of what they read. Even those who lack such a trusted adviser can still better trust what they read, if it includes references to primary material, simply because they know that others shall perform such review and kick up a fuss if they find fault with what they read.

By thus broadening the range of who can participate constructively in public discussion of research results, open access can enable a broader audience to gain a better understanding than is presently supported by highly exclusive publication to a narrow audience – albeit one with (hopefully) the fullest understanding of the work.

Improved Clarity

At the same time, open access would grant a premium to clarity of expression in academic papers. While only specialists in a field are in any position at all to read the published results of research, authors have little if any incentive to consider a broader audience in choosing how to express their work. To be sure, authors have sound reasons for using the specialist language and modes of expression proper to their fields, especially when their research deals with new material at the boundaries of knowledge. All the same, where their work can be put in plain terms without loss of clarity or precision in what they are saying, doing so widens the range of folk who have a decent chance of making sense of their work. While that work is invisible to the public, there is little pressure to do this; researchers learn little about how to communicate their ideas to the wider public; journalists honestly misunderstand, even when they are not wilfully doing so for sensationalist reasons, and this leads to miscommunication to the public.

Were the primary research available to the public, journalists would be under (at least some) pressure to refer to primary sources when reporting on it – and the public may, in any case, learn to search for them. Where the primary material is intelligible to the public, any errors in the journalists' reporting (whether innocent or otherwise) shall then be more readily discovered by their readers. Researchers who routinely bemoan the misrepresentation of their work by journalists shall thus have something concrete to gain by writing at least their introductions and summaries in terms intelligible to those outside their immediate specialisation.

By these and other (long-term, to be sure) mechanisms, open access to research may reasonably be expected to improve both researchers' ability to communicate to the public – who ultimately pay for their work, as the root of the research's sponsors' wealth, whether that wealth be derived from taxation or business profits – and the public's ability to critically evaluate what is reported to them about that research.

Tried and Tested

These benefits might seem tenuous or roundabout to some: yet, in the case of free software, the analogous mechanisms have led to substantial benefits (e.g. Ubuntu) even for those who cannot make sense of source code. The quality of free software has improved at a rate that impresses anyone familiar with software development; external scrutiny has ensured that errors come to light faster and are resolved more robustly than could have been hoped for in a closed development environment (whereof I speak from a position of three decades' experience). Knowing that their work shall be read by a broader audience also does lead authors of free (a.k.a. open source) software to write more clearly and take greater care to make their code easier for others to understand. The situation is merely analogous, not exactly identical, yet I have every reason to expect the corresponding benefits, described above, to follow naturally for open access.

Selective publication

Alongside the case for open publication wherever research is published, there is a complementary case for ensuring that all research, at least of certain kinds, does see the light of day. If research results are only published when they serve the interests of their sponsors, the resulting distortions can give a misleading picture of the actual results of research.

Tobacco propaganda

To take one famous example: for several decades in the mid-to-late twentieth century, tobacco companies obfuscated the real evidence of the harms tobacco causes. They did so by funding large numbers of studies and selectively favouring publication of the results that could be represented as contradicting the (at the time, steadily consolidating) case that their products caused significant and lasting harm to their customers and to others exposed to the use of those products.

In fairness to the researchers, the story is somewhat complex: typically, they had insisted on being allowed to publish their results, but the sponsors had included contractual terms that allowed them a degree of editorial oversight. They used this to selectively facilitate the publication of results they liked while obstructing publication of unfavourable results by repeatedly sending papers back for revisions until either the researchers were too busy with later projects (or worn down with the senseless to-and-fro) to argue further or the research was so stale that journals were not so interested in taking the papers anyway. Thus unfavourable results were less likely, if published at all, to be seen in the more prestigious journals. By these and similar means, honest researchers were prevented from providing an impartial picture of the results of their research.

That this can happen at all depends on the statistical nature of most clinical trials. A study examines (for example) the health effects of tobacco use by comparing the health of two groups of people, some using tobacco and others not; or some more exposed to it, others less. Results for the two groups are subsequently compared and any differences assessed relative to what would be expected if tobacco use made no difference, on the average. That expectation can be used to evaluate how probable the observed results would be, were the observed differences due only to the natural random variability in the groups studied. Although such random variation might in principle produce quite large differences, the likelihood of such outcomes decreases with the size of the difference. A study then reports whether the results it found were fairly likely or very unlikely: commonly, for example, results are reported as being significant at a 95% confidence level if, were chance alone at work, one could not reasonably expect a difference at least that large to happen even as often as one time in twenty (i.e. 5% of the time, the complement of 95%). A result this unlikely, given the assumption that tobacco makes no difference, can reasonably be construed as indicating that tobacco does make a difference.
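To make that calculation concrete, here is a minimal sketch in Python, under assumptions of my own: the outcome is a simple yes/no (ill or not), the groups are large enough for the usual normal approximation to serve, and the numbers are invented purely for illustration. The function name is hypothetical and real studies typically use more careful methods.

```python
import math


def two_proportion_p_value(ill_a, n_a, ill_b, n_b):
    """Two-sided probability of seeing a difference at least this large
    between two groups, were both drawn from one shared underlying rate
    (normal approximation; only reasonable for fairly large groups)."""
    rate_a = ill_a / n_a
    rate_b = ill_b / n_b
    # Best estimate of the shared rate if group membership made no difference.
    pooled = (ill_a + ill_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # no variation at all, hence no evidence of a difference
    z = (rate_a - rate_b) / std_err
    # Chance that a standard normal variate lands at least this far from zero.
    return math.erfc(abs(z) / math.sqrt(2))


# Invented figures: 60 of 500 tobacco users ill, against 35 of 500 others.
p = two_proportion_p_value(ill_a=60, n_a=500, ill_b=35, n_b=500)
significant_at_95 = p < 0.05
```

With these made-up figures the probability comes out well under 5%, so such a study would report a significant difference at the 95% level; two groups with identical counts would instead give a probability of one.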

Generally, such comparisons have found tobacco to make a difference; and the difference is clearly bad for its users. However, by the very statistical nature of the trials, some shall show less difference than the (high) threshold needed for a 95% confidence result. Alternatively, in a study examining effects where tobacco actually does make no difference, one can expect the results to appear to make a difference, at the 95% confidence level, about one time in twenty; and that difference shall appear favourable in about half of such studies. Thus, by selectively controlling publication among a large enough number of studies, sponsors could flood the journals with clinical trials that showed no harmful effects (while hiding trials that did show harm) and throw in occasional results that showed apparent (but entirely illusory) benefits. Those funding research were thus able to create a deceptive impression of the real state of affairs.
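The one-in-twenty figure is easy to check by simulation. The sketch below is illustrative only: it assumes a yes/no outcome, equal group sizes, an invented underlying illness rate and the normal-approximation test for a difference in proportions; none of these details come from any actual tobacco study. It runs many imaginary studies in which tobacco, by construction, makes no difference and counts how many nonetheless cross the 95% threshold, in each direction.

```python
import math
import random


def simulate_null_studies(n_studies, group_size, base_rate, seed):
    """Simulate studies where group membership makes no real difference;
    return how many look significantly beneficial and significantly harmful."""
    rng = random.Random(seed)
    apparent_benefit = 0
    apparent_harm = 0
    for _ in range(n_studies):
        # Both groups fall ill at the same underlying rate.
        ill_users = sum(rng.random() < base_rate for _ in range(group_size))
        ill_others = sum(rng.random() < base_rate for _ in range(group_size))
        pooled = (ill_users + ill_others) / (2 * group_size)
        std_err = math.sqrt(pooled * (1 - pooled) * 2 / group_size)
        if std_err == 0:
            continue  # nobody (or everybody) ill: nothing to compare
        z = (ill_users - ill_others) / group_size / std_err
        if abs(z) > 1.96:  # two-sided threshold for 95% confidence
            if z < 0:
                apparent_benefit += 1  # users looked healthier, purely by chance
            else:
                apparent_harm += 1
    return apparent_benefit, apparent_harm


benefit, harm = simulate_null_studies(
    n_studies=2000, group_size=200, base_rate=0.2, seed=1905)
share_significant = (benefit + harm) / 2000
```

One should expect `share_significant` to land somewhere near 0.05, with the spurious results split roughly evenly between apparent benefit and apparent harm; a sponsor who publishes only the former has a steady supply of illusory good news.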

Pre-registration

The remedy to such tactics is to insist that every research project (or, at least, all those which could reasonably be expected to produce results which might plausibly have repercussions for the sponsors of the research) registers, before it begins, with some public body whose records of such registrations are fully open. Any research project not so registered can then be judged suspect of partiality such as the above; and the public (or researchers more concerned with the public's interests than those of the funding bodies) can scrutinise the records of registered projects for any patterns in the research projects started but never published. If they then find that certain sponsors are funding many research projects of which only a few publish results, this can be used as evidence to criticise the impartiality of the ones that are getting published, particularly if these happen to predominantly give results favourable to the sponsor.
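The kind of scrutiny an open register permits can be sketched in a few lines. Everything here is hypothetical: the `Registration` record, its fields, the sponsor names and the sample data are my own invention for illustration, not the format of any real registry.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Registration:
    sponsor: str
    published: bool
    favourable: Optional[bool] = None  # known only once results are published


def publication_summary(registrations):
    """For each sponsor, report what share of registered projects were
    published and what share of those favoured the sponsor."""
    counts = defaultdict(lambda: {"registered": 0, "published": 0, "favourable": 0})
    for reg in registrations:
        entry = counts[reg.sponsor]
        entry["registered"] += 1
        if reg.published:
            entry["published"] += 1
            if reg.favourable:
                entry["favourable"] += 1
    summary = {}
    for sponsor, entry in counts.items():
        published = entry["published"]
        summary[sponsor] = {
            "published_share": published / entry["registered"],
            # Undefined (None) when the sponsor has published nothing at all.
            "favourable_share": entry["favourable"] / published if published else None,
        }
    return summary


# Invented register: sponsor A publishes little, and only what flatters it.
registry = (
    [Registration("Sponsor A", True, True)] * 2
    + [Registration("Sponsor A", False)] * 8
    + [Registration("Sponsor B", True, True)] * 4
    + [Registration("Sponsor B", True, False)] * 4
    + [Registration("Sponsor B", False)] * 2
)
summary = publication_summary(registry)
```

In this invented register, Sponsor A publishes a fifth of what it registers and every published result is favourable, while Sponsor B publishes four fifths with an even split; the first pattern is the one that should invite closer questions.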

Novelty bias

Another problem with the traditional publication system is that journals want to publish what's interesting, which needn't be what's representative. So the one-in-twenty trial that produces a result statistically significant at the 95% confidence level is more likely to be published than the other nineteen with boringly expected results, failing to contradict a null hypothesis. Thus an understandable preference of the publisher becomes a bias in what gets reported as science. Worse yet, review articles that survey published results thus draw from this biased sample and miss results from the boring trials that went as expected.
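A little arithmetic shows how strongly such a preference can skew the published record. The sketch below assumes, purely for illustration, that there is no real effect at all (so one trial in twenty is significant by chance) and that a journal accepts significant results far more readily than unremarkable ones; the two acceptance probabilities are invented, not measured.

```python
def significant_share_in_print(false_positive_rate, publish_if_significant,
                               publish_if_null):
    """Share of published trials reporting a significant result, when
    publication chances depend on whether the result was significant."""
    significant = false_positive_rate * publish_if_significant
    unremarkable = (1 - false_positive_rate) * publish_if_null
    if significant + unremarkable == 0:
        raise ValueError("nothing gets published under these assumptions")
    return significant / (significant + unremarkable)


# Invented acceptance chances: 90% for a striking result, 10% for a dull one.
skewed = significant_share_in_print(0.05, 0.9, 0.1)
# An even-handed journal reproduces the true one-in-twenty share.
even_handed = significant_share_in_print(0.05, 0.5, 0.5)
```

Under these assumed acceptance rates roughly a third of the trials in print report a significant effect, although only a twentieth of the trials conducted found one; a review drawing on the journals alone inherits that distortion.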

In fact, we need all trials published and all of their data made available, in a format that facilitates sharing with, and use by, other scientists. Fortunately, the web makes this easy; a natural adjunct of the pre-registration scheme would be for the registrar to also provide storage for data from trials, or at least to record completion of a trial by linking to where the project team has put their data up on the web. There may be a need to anonymise data, of course, where it relates to human subjects; and some trials might have confidentiality reasons to not publish at all; but most research projects should be able to publish the bulk of their data quite straightforwardly – which is, after all, what the web was invented for.