Substitution cipher tool

A cipher is a transformation that can be applied to text to hide its meaning – e.g. while being transmitted in contexts where it might be intercepted by folk the sender doesn't want to see it – yet which can later be undone to recover the text. When designing a cipher, one needs the author to be able to apply the cipher – to encipher the real message, known as the plain text, yielding the encoded message, known as the cipher-text – and the intended recipient to be able to undo the cipher – to decipher the message – but it should be hard for anyone else – known as an attacker – to decipher the message. The hard part lies in making it easy for the intended recipient to do something that's hard for the attacker: the intended recipient needs to have something the attacker lacks, usually knowledge of a secret, that makes the difference.

One of the oldest ciphers known (attributed to Julius Caesar) is to simply replace each letter of the alphabet with the one a fixed number of steps forward from it; if you write the alphabet evenly-spaced round a circle, this just corresponds to rotating the circle by that many steps. Caesar is said to have used a shift of three; the variant most used today replaces each letter with the one thirteen steps forward or backwards from it (the two come to the same thing, as the alphabet has 26 letters), so it's known as rotate 13 or just rot13. For rot13, one can simply write out the alphabet in two lines of thirteen letters; then each letter is converted to the one in the same position of the other line. One can do more complex substitutions – in general, any permutation of the letters is open to you – but, no matter how cleverly you permute the letters, this class of cipher – the substitution cipher – is extremely easy to break.
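As a minimal sketch of the rotation just described, here is one way to do it in Python. The function name `rotate` and its `steps` parameter are my own choices for illustration; it only handles the 26 unaccented ASCII letters and leaves everything else alone.

```python
import string

def rotate(text: str, steps: int = 13) -> str:
    """Rotate each ASCII letter `steps` places round the alphabet, keeping its case."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    k = steps % 26
    # Build a translation table: the alphabet on one line, its rotation on the other.
    table = str.maketrans(lower + upper,
                          lower[k:] + lower[:k] + upper[k:] + upper[:k])
    return text.translate(table)

secret = rotate("Hello, World!")
print(secret)          # Uryyb, Jbeyq!
print(rotate(secret))  # Hello, World!
```

With thirteen steps, applying the rotation twice gets you back where you started, so the same function both enciphers and deciphers; for any other number of steps you would need to rotate by 26 minus that number to undo it.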

Natural languages don't use all letters equally frequently – for example, in English, the letter e occurs a little over 9% of the time, t 7%, o a little less, i almost 6% and a a little less and so on down to q and z which occur less than once per thousand letters – so an attacker can count the frequencies of the substituted letters in the cipher-text; whichever letters show up most often are good candidates to be some of the more common letters in whichever language the underlying message uses. Likewise, in any given language, certain words are more common than others. Furthermore, even when you've substituted the letters, the patterns of letter re-use within a word are preserved; even after you've encoded forever as LGMHTHM (say), the re-use of r and e is still visible, as re-use of M and H; it's fairly easy (particularly nowadays, with computers at the attacker's disposal) to search a dictionary for the words with a given pattern of letter re-use. This kind of approach gives the attacker a small number of possibilities to try, with good likelihood that one of these shall work, instead of having to try all of the 26×25×…×3×2×1 ≈ 4e26 possibilities (this is as many as the number of atoms in about 670 grammes of Hydrogen). Once an attacker has worked out a few words, the same substitutions as worked for those shall apply to the rest of the text, making it easier to make good guesses and thereby discover more clues.
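The two lines of attack above, counting letters and matching patterns of letter re-use, might be sketched in Python as follows. The function names and the tiny word list are made up for illustration; a real attack would use a proper dictionary file and a much longer cipher-text.

```python
from collections import Counter
from math import factorial

def letter_frequencies(text):
    """Return (letter, fraction) pairs, most common first, ignoring non-letters."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    total = len(letters)
    return [(c, n / total) for c, n in Counter(letters).most_common()]

def reuse_pattern(word):
    """Number each letter by order of first appearance in the word."""
    seen = {}
    return tuple(seen.setdefault(c, len(seen)) for c in word.lower())

def candidates(cipher_word, dictionary):
    """Dictionary words whose pattern of letter re-use matches the cipher word's."""
    target = reuse_pattern(cipher_word)
    return [w for w in dictionary if reuse_pattern(w) == target]

# A made-up word list, standing in for a real dictionary.
words = ["forever", "believe", "however", "nowhere", "whoever"]

print(reuse_pattern("forever"))      # (0, 1, 2, 3, 4, 3, 2)
print(reuse_pattern("LGMHTHM"))      # (0, 1, 2, 3, 4, 3, 2)
print(candidates("LGMHTHM", words))  # ['forever']
print(f"{factorial(26):.1e}")        # 4.0e+26
```

Even among seven-letter words that all contain a repeated e, only one in this little list repeats a second letter in the right places; that is why the pattern narrows things down so quickly.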

More sophisticated ciphers (dating at least as far back as the 1600s) change the substitution as they go along, sometimes even in ways that depend on the text being enciphered. Modern ciphers go beyond that, but that's a topic for another day, and some other page.

It remains that substitution ciphers have their uses: rot13 is widely used in various internet contexts as a way to hide text from those who don't want to see it. Some folk, when publishing a joke-riddle, will supply the answer in rot13 so that readers can have the fun of trying to puzzle out the riddle before reading the punch-line. If someone is reviewing a book or movie and parts of their review would give away the plot (but are pertinent to discussion with those who already know the plot), encoding that part lets folk who haven't read or seen the work get as much out of the review as possible, without spoiling the work. I've seen one children's book use a variant on the theme to add a bit of a challenge for readers; and I was prompted to write this page by an enjoyable comic having a protracted interlude where one of its characters, as a result of a traumatic loss, lost the ability to speak normally – the author subjected her speech to a substitution cipher which changed from one strip to the next (with bonus errors in enciphering to make it harder to decipher).

The substitution tool

You can use the form below to perform substitution on a text: either to encode a text using a substitution cipher or as a helper in trying to decode one. Just type the text into the text area, in place of the example text, and fill in the substitutions you want to apply.

[Interactive form: a table with rows labelled In, Count and Out, followed by a status display.]
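For those who'd rather script it, applying a set of substitutions can be sketched in a few lines of Python. This is not the code behind the form, just an illustration under my own assumptions: the name `substitute` is invented, the substitutions are given as a dictionary keyed on upper-case letters, and any letter with no substitution yet is left as it is.

```python
def substitute(text, mapping):
    """Apply a (possibly partial) letter substitution to text.

    mapping is keyed on upper-case letters; letters with no entry,
    and all non-letters, are passed through unchanged.
    """
    return "".join(mapping.get(ch.upper(), ch) for ch in text)

# Partial guess while decoding: upper case is still cipher, lower case is decoded.
print(substitute("LGMHTHM", {"H": "e", "M": "r"}))  # LGreTer

# The full substitution for this one word.
print(substitute("LGMHTHM", {"L": "f", "G": "o", "M": "r", "H": "e", "T": "v"}))  # forever
```

Writing the cipher-text in capitals and your guesses in lower case makes it easy to see, at a glance, which letters you've yet to work out.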


Written by Eddy.