This post starts a series on writing programs to use and break different ciphers. I'm concentrating on ciphers used before World War II, as they're relatively easy to understand and relatively easy to program.
The Caesar cipher
We'll start by looking at the Caesar cipher.
This is a simple cipher where each plaintext letter is shifted some positions along the alphabet to give the ciphertext letter. It can be used with two concentric wheel, like in the photo. You align the inner wheel with the appropriate part of the outer wheel. You can then find the plaintext letter in the inner wheel and the ciphertext letter is the corresponding one in the outer wheel.
Encryption happens letter by letter. Non-letter characters generally stay the same.
Decryption works the other way, just in reverse. To find the plaintext letter, reverse the original shift from the ciphertext letter.
For example, a shift of four would make the following conversions:
Plaintext | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Ciphertext | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D |
The word test
would be enciphered as xiwx
, and lippk
would be deciphered to hello
.
Implementing the Caesar cipher
How should we go about writing a program to do this?
As each letter is enciphered independently, we can do a very simple function to encipher a message:
to ceasar-encipher (message, shift):
for each character in message:
caesar-encipher-letter (character, shift)
Similarly, we can exploit that the deciphering is just enciphering with a negative shift, and write a very simple function to decipher a message:
to caesar-decipher (message, shift):
caesar-encipher (message, -shift)
How do we encipher a character? We need to be a bit defensive here, and account for the fact that we may be passed non-letters. We also have to think about how we want to handle upper- and lower-case letters. Should non-letters be removed from the result, kept unchanged, cause an error, or something else? Should all letters be converted to upper-case, or all converted to lowercase?
I'll be conservative here, and say that non-letters are retained unchanged, and the case of letters should be retained.
This means I can write caesar-encipher-letter
as:
to caesar-encipher-letter(character, shift):
if character is a letter:
cipherletter = letter + shift
if character is uppercasea:
return uppercase(cipherletter)
else:
return cipherletter
else:
return character
The only complication is the letter + shift
bit: most programming languages will reject this statement as characters and numbers can't be added together.
As we'll be doing a lot of this sort of thing in future, let's define a couple of utility functions which convert letters to their positions in the alphabet, and back again. Because we'll be using these functions a lot, I give them deliberately short names. pos
finds the position of a letter in the alphabet, ignoring case, and raise
ing an error if it's given something other than a letter.
def pos(letter):
"""Return the position of a letter in the alphabet (0-25)"""
if letter in string.ascii_lowercase:
return ord(letter) - ord('a')
elif letter in string.ascii_uppercase:
return ord(letter) - ord('A')
else:
raise ValueError('pos requires input of {} to be an ascii letter'.format(letter))
This uses the string
module from Python's standard library. That module defines some useful constants, such as string.ascii_letters
, string.ascii_lowercase
and string.ascii_uppercase
.
unpos
does the reverse, convering a number into a (lowercase) letter. If the number is outside the range 0–25, the modulus operator, %
, trims the number into that range.
def unpos(number):
"""Return the letter in the given position in the alphabet (mod 26)"""
return chr(number % 26 + ord('a'))
We can use these to write the caesar_encipher_letter
function:
def caesar_encipher_letter(accented_letter, shift):
"""Encipher a letter, given a shift amount"""
letter = unaccent(accented_letter)
if letter in string.ascii_letters:
cipherletter = unpos(pos(letter) + shift)
if letter in string.ascii_uppercase:
return cipherletter.upper()
else:
return cipherletter
else:
return letter
And finally, we can write caesar_encipher
and caesar_decipher
:
def caesar_encipher(message, shift):
"""Encipher a message with the Caesar cipher of given shift"""
enciphered = ""
for character in message:
enciphered.append(caesar_encipher_letter(l, shift))
return enciphered
def caesar_decipher(message, shift):
"""Decipher a message with the Caesar cipher of given shift"""
return caesar_encipher(message, -shift)
However, all those append
calls in caesar_encipher
seem a bit wasteful. It's better if we use a list comprehension to walk along the characters in the message, enciphering each one as we go.
def caesar_encipher(message, shift):
"""Encipher a message with the Caesar cipher of given shift"""
enciphered = [caesar_encipher_letter(l, shift) for l in message]
return cat(enciphered)
Because this returns a list of characters where we want a string, let's define a little function do the conversion:
cat = ''.join
Now let's see if it works:
$ python3
Python 3.6.3 (default, Oct 3 2017, 21:45:48)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cipher.caesar import *
>>> caesar_encipher('This is a test message.', 4)
'Xlmw mw e xiwx qiwweki.'
>>> caesar_decipher('Xlmw mw e xiwx qiwweki.', 4)
'This is a test message.'
Success!
Code
See all the code on Github.
pos
, unpos
, and cat
are in support.utilities.py
.
caesar_encipher
, caesar_decipher
, and caesar_encipher_letter
are in ciphers.caesar.py
.
Acknowledgements
- Post image from Wikimedia Commons.
- Cipher wheel image from Wikimedia commons