Neil's musings

Snippets from a computing educator and researcher.

Caesar ciphers

March 15, 2018 in #codes and ciphers #python

This post starts a series on writing programs to use and break different ciphers. I'm concentrating on ciphers used before World War II, as they're relatively easy to understand and relatively easy to program.

The Caesar cipher

We'll start by looking at the Caesar cipher.

360px-Cipher_device

This is a simple cipher where each plaintext letter is shifted some positions along the alphabet to give the ciphertext letter. It can be used with two concentric wheel, like in the photo. You align the inner wheel with the appropriate part of the outer wheel. You can then find the plaintext letter in the inner wheel and the ciphertext letter is the corresponding one in the outer wheel.

Encryption happens letter by letter. Non-letter characters generally stay the same.

Decryption works the other way, just in reverse. To find the plaintext letter, reverse the original shift from the ciphertext letter.

For example, a shift of four would make the following conversions:

Plaintext A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Ciphertext E F G H I J K L M N O P Q R S T U V W X Y Z A B C D

The word test would be enciphered as xiwx, and lippk would be deciphered to hello.

Implementing the Caesar cipher

How should we go about writing a program to do this?

As each letter is enciphered independently, we can do a very simple function to encipher a message:

to ceasar-encipher (message, shift):
    for each character in message:
        caesar-encipher-letter (character, shift)

Similarly, we can exploit that the deciphering is just enciphering with a negative shift, and write a very simple function to decipher a message:

to caesar-decipher (message, shift):
    caesar-encipher (message, -shift)

How do we encipher a character? We need to be a bit defensive here, and account for the fact that we may be passed non-letters. We also have to think about how we want to handle upper- and lower-case letters. Should non-letters be removed from the result, kept unchanged, cause an error, or something else? Should all letters be converted to upper-case, or all converted to lowercase?

I'll be conservative here, and say that non-letters are retained unchanged, and the case of letters should be retained.

This means I can write caesar-encipher-letter as:

to caesar-encipher-letter(character, shift):
    if character is a letter:
        cipherletter = letter + shift
        if character is uppercasea:
            return uppercase(cipherletter)
        else:
            return cipherletter
    else:
        return character

The only complication is the letter + shift bit: most programming languages will reject this statement as characters and numbers can't be added together.

As we'll be doing a lot of this sort of thing in future, let's define a couple of utility functions which convert letters to their positions in the alphabet, and back again. Because we'll be using these functions a lot, I give them deliberately short names. pos finds the position of a letter in the alphabet, ignoring case, and raiseing an error if it's given something other than a letter.

def pos(letter): 
    """Return the position of a letter in the alphabet (0-25)"""
    if letter in string.ascii_lowercase:
        return ord(letter) - ord('a')
    elif letter in string.ascii_uppercase:
        return ord(letter) - ord('A')
    else:
        raise ValueError('pos requires input of {} to be an ascii letter'.format(letter))

This uses the string module from Python's standard library. That module defines some useful constants, such as string.ascii_letters, string.ascii_lowercase and string.ascii_uppercase.

unpos does the reverse, convering a number into a (lowercase) letter. If the number is outside the range 0–25, the modulus operator, %, trims the number into that range.

def unpos(number): 
    """Return the letter in the given position in the alphabet (mod 26)"""
    return chr(number % 26 + ord('a'))

We can use these to write the caesar_encipher_letter function:

def caesar_encipher_letter(accented_letter, shift):
    """Encipher a letter, given a shift amount"""
    letter = unaccent(accented_letter)
    if letter in string.ascii_letters:
        cipherletter = unpos(pos(letter) + shift)
        if letter in string.ascii_uppercase:
            return cipherletter.upper()
        else:
            return cipherletter
    else:
        return letter

And finally, we can write caesar_encipher and caesar_decipher:

def caesar_encipher(message, shift):
    """Encipher a message with the Caesar cipher of given shift"""
    enciphered = ""
    for character in message:
        enciphered.append(caesar_encipher_letter(l, shift))
    return enciphered

def caesar_decipher(message, shift):
    """Decipher a message with the Caesar cipher of given shift"""
    return caesar_encipher(message, -shift)

However, all those append calls in caesar_encipher seem a bit wasteful. It's better if we use a list comprehension to walk along the characters in the message, enciphering each one as we go.

def caesar_encipher(message, shift):
    """Encipher a message with the Caesar cipher of given shift"""
    enciphered = [caesar_encipher_letter(l, shift) for l in message]
    return cat(enciphered)

Because this returns a list of characters where we want a string, let's define a little function do the conversion:

cat = ''.join

Now let's see if it works:

$ python3
Python 3.6.3 (default, Oct  3 2017, 21:45:48)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cipher.caesar import *
>>> caesar_encipher('This is a test message.', 4)
'Xlmw mw e xiwx qiwweki.'
>>> caesar_decipher('Xlmw mw e xiwx qiwweki.', 4)
'This is a test message.'

Success!

Code

See all the code on Github.

pos, unpos, and cat are in support.utilities.py.

caesar_encipher, caesar_decipher, and caesar_encipher_letter are in ciphers.caesar.py.

Acknowledgements

Share on Google+
No Older Posts