Neil's musings

Snippets from a computing educator and researcher.

Affine ciphers

March 25, 2018 in #codes and ciphers #python

Finally, we get to look at a new cipher: the affine cipher[1]. This cipher is very similar to the Caesar cipher, but includes a multiplication step.

Basic enciphering

Just with the Caesar cipher can be written

\[ c = p + s \pmod{26} \]

where \(p\) is the plaintext letter, \(c\) is the ciphertext letter and \(s\) is the shift, an affine cipher can be written

\[ c = m \times p + a \pmod{26} \]

where \(m\) is the multiplier and \(a\) is the adder.

This gives an easy Python implementation, rather similar to the Caesar cipher:

def affine_encipher_letter(letter, multiplier=1, adder=0):
    if letter in string.ascii_letters:
        c = unpos(pos(letter) * multiplier + adder)
        if letter in string.ascii_uppercase:
            return c.upper()
        else:
            return c
    else:
        return letter

affine_encipher is pretty much the same as caesar_encipher.

But there's a problem. The pos function maps a to 0, b to 1, and so on. But another plausible (and often used) version of the affine cipher maps a to 1, b to 2, and so on, before doing the multiplication step.

Rather than making a fixed choice in the code, we should add another keyword parameter, one_based, and use that to modify the calculation as needed. If one_based is True, we add one to the result of pos and subtract one just before calling unpos. That gives this code:

def affine_encipher_letter(letter, multiplier=1, adder=0, 
        one_based=True):
    if letter in string.ascii_letters:
        letter_number = pos(letter)
        if one_based: letter_number += 1
        cipher_number = (letter_number * multiplier + adder) % 26
        if one_based: cipher_number -= 1
        if letter in string.ascii_uppercase:
            return unpos(cipher_number).upper()
        else:
            return unpos(cipher_number)
    else:
        return letter

Deciphering

Deciphering affine ciphers is, in principle, no more complicated than enciphering them: convert the ciphertext letter to a number, undo the enciphering process, then convert the result back to a letter. Undoing the calculation is just rearranging the forumla above:

\[ p = \frac{c - a}{m} \pmod{26} \]

But there's a problem. Modular division is not straightforward to do. There are ways to calculate modular division, and there are even some implementations of how to do it. But as we're only looking at a few possible values, let's just cache all the possible solutions and look them up.

When deciphering, we know the values of c and m and want to find p. But we can easily find the value of c for any particular p and m, as \( c = p \times m \pmod{26} \). If go through all the possible values of p and m and find the correct c for each combination, we can build a lookup table modular_division_table, indexed on m and c, that gives the correct value of p.

We can implement that in Python as a dict comprehension. The keys to the dict are the 2-tuple (pair) of m and c : the value for each key is the corresponding value of p.

modular_division_table = {(m, (m * p) % 26): p 
                          for p in range(26) 
                          for m in range(26)}

Let's see if it works. Using a multiplier of 7 and an adder of 0 (just to keep things simple), let's see if we can encipher then decipher a message:

>>> cat(unpos(7 * pos(l)) for l in 'testmessage')
'dcwdgcwwaqc'
>>> cat(unpos(modular_division_table[7, pos(l)]) for l in 'dcwdgcwwaqc')
'testmessage'

That works! We can use this to write affine_decipher_letter, and incorporate all the fiddling around to deal with one-based multiplication.

def affine_decipher_letter(letter, multiplier=1, adder=0, one_based=True):
    """Encipher a letter, given a multiplier and adder"""
    if letter in string.ascii_letters:
        cipher_number = pos(letter)
        if one_based: cipher_number += 1
        plaintext_number = ( 
            modular_division_table[multiplier, (cipher_number - adder) % 26]
            )
        if one_based: plaintext_number -= 1
        if letter in string.ascii_uppercase:
            return unpos(plaintext_number).upper()
        else:
            return unpos(plaintext_number) 
    else:
        return letter

Invalid keys

But there's a problem. What if we use a multiplier of 8?

>>> cat(unpos(8 * pos(l)) for l in 'testmessage')
'wgowsgooawg'
>>> cat(unpos(modular_division_table[8, pos(l)]) for l in 'wgowsgooawg')
'trstzrssntr'

That's not right. It's even worse if we use a multiplier of 13:

>>> cat(unpos(13 * pos(l)) for l in 'testmessage')
'naanaaaaaaa'
>>> cat(unpos(modular_division_table[13, pos(l)]) for l in 'naanaaaaaaa')
'zyyzyyyyyyy'

The problem is that modular multiplication doesn't always have a well-defined inverse. For instance t is mapped to 19, and

\[ \mathrm{t} = 19 ; 8 \times 19 = 152 = 22 \pmod{26} = \mathrm{w} \]

But, g is mapped to 6, and

\[ \mathrm{g} = 6; 6 \times 8 = 48 = 22 \pmod{26} = \mathrm{w} \]

So when the affine deciphering function sees w in the ciphertext, should it decipher it into t or g?

Simiarlary for e and r, which both encipher to g.

\[ \mathrm{e} = 4; 8 \times 4 = 32 = 6 \pmod{26} = \mathrm{g} \]
\[ \mathrm{r} = 17; 8 \times 17 = 136 = 6 \pmod{26} = \mathrm{g} \]

The affine cipher only gives unambiguous results when the multiplier is coprime to 26, that is when the multiplier doesn't have any common factors with 26, and 26 = 2 × 13. That means only odd numbers, apart from 13, can sensibly be used as multipliers with affine ciphers.

We could catch the use of these multipliers in the ciphering and deciphering steps, but I'm not sure it would gain much. It's just something to be aware of when it comes to breaking ciphers, as it halves the number of keys we need to look at.

Breaking

Breaking the affine cipher is exactly the same as breaking the Caesar cipher: try all the possible keys and see which one gives text that looks most like English. In this case, the key has three parts: the multiplier, the adder, and whether the cipher is one-based. That means there are three nested for loops to try all possible keys.

The only wrinkle is the set of multipliers used: the range function counts from 1 in steps of 2, so generates 1, 3, … 25. The if in the list comprehension removes 13.

def affine_break(message, fitness=Pletters):
    """Breaks an affine cipher using frequency analysis"""
    sanitised_message = sanitise(message)
    best_multiplier = 0
    best_adder = 0
    best_one_based = True
    best_fit = float("-inf")
    for one_based in [True, False]:
        for multiplier in [x for x in range(1, 26, 2) if x != 13]:
            for adder in range(26):
                plaintext = affine_decipher(sanitised_message,
                                            multiplier, adder, one_based)
                fit = fitness(plaintext)
                if fit > best_fit:
                    best_fit = fit
                    best_multiplier = multiplier
                    best_adder = adder
                    best_one_based = one_based
    return (best_multiplier, best_adder, best_one_based), best_fit

Testing deciphering

Let's see if this works, but looking at the National Cipher Challenge 2017 challege 3a:

>>> ciphertext = open('2017/3a.ciphertext').read()
>>> print(ciphertext)
ICROCI,
G FZWMSZF YWIE IWRE CJWMF AZCF OWM YCGX GP FZE ZWYDGFCB, CPX SGTEP AZCF ZCY
ZCDDEPEX G FZGPU G GF AWMBX JE SWWX FW AWRU FWSEFZER.

MPLWRFMPCFEBO GF YEEIY FZCF G CI PWF FZE WPBO WPE FROGPS FW LGPX FZE JWWU.
FZERE GY C FZRGTGPS JBCQU ICRUEF CIWPS JGJBGWDZGBEY LWR CPOFZGPS CY WBX CY
FZGY. GF GY TERO TCBMCJBE CPX GL FZE JWWU XGYCDDECRY GPFW FZE YZCXWAO AWRBX
WL DRGTCFE BGJRCRGEY AE IGSZF PWF YEE GF CSCGP LWR QEPFMRGEY YW GF GY
RECBBO GIDWRFCPF AE SEF FW GF LGRYF.

FCQGFMY QBECRBO ACPFEX FW ICUE GF XGLLGQMBF FW CYYEIJBE FZE EPFGRE
XWQMIEPF, YW ZE ZCX GF ZGXXEP CF C PMIJER WL YGFEY CRWMPX FZE CPQGEPF
AWRBX. FZE JRGFGYZ BGJRCRO XEQGXEX FW YEPX YWIEWPE FW FRO FW LGPX CBB FZE
DGEQEY. FZEGR EHDERFY LGSMREX FZE QZCDFERY AERE BGUEBO FW JE EPQRODFEX FWW
YW, SGTEP IO JCQUSRWMPX, FZEO CYUEX IE FW SW. FZEO AERE RGSZF WL QWMRYE,
FZE LGRYF QZCDFER ACY XGYSMGYEX MYGPS C QCEYCR YZGLF CPX FZE YEQWPX JO C
YQOFCBE QGDZER, CPX G CI JESGPPGPS FW AWPXER AZCF WFZER QGDZERY FCQGFMY
IGSZF QZCBBEPSE MY AGFZ. CY LCR CY AE UPWA FZE RWICPY AWMBX PWF ZCTE ZCX
FZCF ICPO FW QZWWYE LRWI. G YMDDWYE ZE IGSZF ZCTE UPWAP CJWMF YWIE LWRI WL
FZE DWBOJGMY QGDZER.

FZE XWQMIEPFY G ZCTE LWMPX YW LCR (FCQGFMY BCYF FEYFCIEPF CPX QZCDFERY WPE
CPX FAW WL ZGY ZGXXEP JWWU) ECQZ SGTE C QBME FW FZE BWQCFGWP WL FZE PEHF
WPE, CPX CY YWWP CY FZEO BEF IE WMF WL ZWYDGFCB G DBCP FW FRO FW BWQCFE FZE
FZGRX QZCDFER, AZGQZ FCQGFMY FEBBY MY GY ZGXXEP WP FZE GYBE WL FZWRPY.
FZERE CRE YETERCB DBCQEY GP FZE CPQGEPF AWRBX FZCF FZGY IGSZF RELER FW, JMF
G ZCTE C DREFFO SWWX GXEC WL AZERE GF IGSZF JE.

GL G FZGPU WL CPOFZGPS EBYE G AGBB SEF GF FW OWM, CPX CY REKMEYFEX G AGBB
QWDO GP OWMR LRGEPX ZCRRO. G YMSSEYF AE FGSZFEP YEQMRGFO C BGFFBE. ICOJE GL
AE JBWQU WMR QGDZER FEHFY BGUE FZE EHDERFY GF AGBB DMF WLL QCYMCB
GPFERQEDFY. FZE DEWDBE AE CRE MD CSCGPYF CRE MPBGUEBO FW JE XEFERREX JO
IMQZ, JMF AE XWP'F ACPF FW GPTGFE EHFRC GPFERBWDERY FW FZE DCRFO.

JO FZE ACO, IO LRGEPX GP FZE YWMU FWBX IE FZCF ZE IEF WPE WL OWMR CSEPFY
AZW ACY CYUGPS GL ZE UPEA AZERE G ACY, AZGQZ YEEIY C JGF WXX SGTEP FZCF OWM
UPWA G CI YFMQU ZERE. ACY FZCF NMYF FW FZRWA WMR RGTCBY WLL FZE YQEPF?

CBB FZE JEYF,

NWXGE

>>> (m_a, a_a, o_a), score = affine_break(ciphertext)
>>> (m_a, a_a, o_a), score
((7, 22, True), -2146.240223193519)
>>> print(affine_decipher(ciphertext, m_a, a_a, o_a))
MARYAM,
I THOUGHT SOME MORE ABOUT WHAT YOU SAID IN THE HOSPITAL, AND GIVEN WHAT HAS
HAPPENED I THINK I IT WOULD BE GOOD TO WORK TOGETHER.

UNFORTUNATELY IT SEEMS THAT I AM NOT THE ONLY ONE TRYING TO FIND THE BOOK.
THERE IS A THRIVING BLACK MARKET AMONG BIBLIOPHILES FOR ANYTHING AS OLD AS
THIS. IT IS VERY VALUABLE AND IF THE BOOK DISAPPEARS INTO THE SHADOWY WORLD
OF PRIVATE LIBRARIES WE MIGHT NOT SEE IT AGAIN FOR CENTURIES SO IT IS
REALLY IMPORTANT WE GET TO IT FIRST.

TACITUS CLEARLY WANTED TO MAKE IT DIFFICULT TO ASSEMBLE THE ENTIRE
DOCUMENT, SO HE HAD IT HIDDEN AT A NUMBER OF SITES AROUND THE ANCIENT
WORLD. THE BRITISH LIBRARY DECIDED TO SEND SOMEONE TO TRY TO FIND ALL THE
PIECES. THEIR EXPERTS FIGURED THE CHAPTERS WERE LIKELY TO BE ENCRYPTED TOO
SO, GIVEN MY BACKGROUND, THEY ASKED ME TO GO. THEY WERE RIGHT OF COURSE,
THE FIRST CHAPTER WAS DISGUISED USING A CAESAR SHIFT AND THE SECOND BY A
SCYTALE CIPHER, AND I AM BEGINNING TO WONDER WHAT OTHER CIPHERS TACITUS
MIGHT CHALLENGE US WITH. AS FAR AS WE KNOW THE ROMANS WOULD NOT HAVE HAD
THAT MANY TO CHOOSE FROM. I SUPPOSE HE MIGHT HAVE KNOWN ABOUT SOME FORM OF
THE POLYBIUS CIPHER.

THE DOCUMENTS I HAVE FOUND SO FAR (TACITUS LAST TESTAMENT AND CHAPTERS ONE
AND TWO OF HIS HIDDEN BOOK) EACH GIVE A CLUE TO THE LOCATION OF THE NEXT
ONE, AND AS SOON AS THEY LET ME OUT OF HOSPITAL I PLAN TO TRY TO LOCATE THE
THIRD CHAPTER, WHICH TACITUS TELLS US IS HIDDEN ON THE ISLE OF THORNS.
THERE ARE SEVERAL PLACES IN THE ANCIENT WORLD THAT THIS MIGHT REFER TO, BUT
I HAVE A PRETTY GOOD IDEA OF WHERE IT MIGHT BE.

IF I THINK OF ANYTHING ELSE I WILL GET IT TO YOU, AND AS REQUESTED I WILL
COPY IN YOUR FRIEND HARRY. I SUGGEST WE TIGHTEN SECURITY A LITTLE. MAYBE IF
WE BLOCK OUR CIPHER TEXTS LIKE THE EXPERTS IT WILL PUT OFF CASUAL
INTERCEPTS. THE PEOPLE WE ARE UP AGAINST ARE UNLIKELY TO BE DETERRED BY
MUCH, BUT WE DON'T WANT TO INVITE EXTRA INTERLOPERS TO THE PARTY.

BY THE WAY, MY FRIEND IN THE SOUK TOLD ME THAT HE MET ONE OF YOUR AGENTS
WHO WAS ASKING IF HE KNEW WHERE I WAS, WHICH SEEMS A BIT ODD GIVEN THAT YOU
KNOW I AM STUCK HERE. WAS THAT JUST TO THROW OUR RIVALS OFF THE SCENT?

ALL THE BEST,

JODIE

Code

The code is in cipher/affine.py on Github.

Acknowlegements


  1. The affine cipher is named after an affine transformation in geometry. Basically, an affine transformation of a point or shape is some scaling of the original (i.e. a multiplication) and a translation (i.e. an addition). But you don't need to know about that for this cipher. ↩︎

Share on Google+