In the previous post, I described how to use *hillclimbing* to break a substitution cipher when we don't know the key. But standard hillclimbing has weaknesses: it can get trapped in a local optimum (that is, trapped on the summit of a foothill in the fitness landscape); or it can be marooned on a fitness "plateau", where all changes give about the same result, so the search wanders aimlessly.

## Exploration and exploitation

To see how to fix this problem, have a think about restaurants.

Imagine you want to go out for dinner tonight. Do you try a new and different restaurant, or do you stick with one you already know is good?

The answer depends a lot on how long you've been in the area. If you've recently moved in, you don't know which are the good restaurants. In fact, you might not even know what counts as "good" around here! If you go to a new place, it might be rubbish, but it might also be better than anything you've visited in the area.

On the other hand, if you've been in the area for a long time, you know which places are good. You've got a list of your favourite restaurants and a much longer list of not-so-good ones. A new restaurant you visit *might* be a hidden gem, but it's more likely to be average and therefore worse than your favourites.

Generalising, these two situations illustrate the balance between *exploration* and *exploitation*. When you know little about your situation, *exploration* is the best option. As you become more familiar with what's good, you should invest more time and effort *exploiting* what you know is best.

This is similar to what we want to do with the cipher breaking. We start with some random proposal of what the key would be, but we have no real idea if it's any good. At this stage, we want to *explore* the space of possible keys, looking for something promising. As the search progresses, we get more understanding of what the key is, so we want to move to more *exploitation* of a key that we know is pretty good.
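For a substitution cipher, a random starting proposal and a small random change could be sketched as follows. The names `random_key` and `swap_two_letters` are hypothetical, and swapping two letters is an assumed choice of "small change"; it is only one plausible way to move between neighbouring keys.

```python
import random
import string

def random_key(rng=random):
    """Return a random substitution key: a shuffled copy of the alphabet."""
    letters = list(string.ascii_lowercase)
    rng.shuffle(letters)
    return ''.join(letters)

def swap_two_letters(key, rng=random):
    """Return a neighbouring key with two randomly chosen positions swapped."""
    i, j = rng.sample(range(len(key)), 2)
    letters = list(key)
    letters[i], letters[j] = letters[j], letters[i]
    return ''.join(letters)

key = random_key()                  # a starting point we know nothing about
neighbour = swap_two_letters(key)   # one small step through the space of keys
```

Early in the search, many such steps are worth trying even when they look unpromising; later on, most of them will only make a good key worse.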

## Simulated annealing

Simulated annealing is one method of handling this. The term *annealing* comes from metallurgy. It's the process of heating metal and then cooling it slowly; as it cools, large crystals form in the metal, relieving internal stresses and making the overall piece tougher and easier to work. The trick is that sometimes, some crystals get a bit smaller. This is a backward move in the short term; but it gives neighbouring crystals more room to grow, giving the metal a better structure in the long term.

That's the key idea of *simulated* annealing: sometimes we'll accept a worse solution than what we currently have, in the hope that it will lead to an even better solution in the long term. The simulated annealing algorithm is just a little bit different from the standard hillclimbing one:

```
current_solution ← some random solution
set a limit on number of iterations
set initial_temperature
while not reached iteration limit:
    new_solution ← random_change(current_solution)
    if goodness(new_solution) > goodness(current_solution):
        current_solution ← new_solution
    else:
        MAYBE current_solution ← new_solution
    decrease temperature a bit
```

The magic of the algorithm lies in that `MAYBE` statement. It sometimes chooses the `new_solution` even if it's worse than the `current_solution`. The probability it does so depends on both the goodness of the `new_solution` and an overall `temperature`. The better `new_solution` is, the more likely it is to be chosen: very poor solutions are unlikely to ever be chosen. The `temperature` is how the algorithm balances exploration with exploitation. When the `temperature` is high, `MAYBE` is more likely to accept a given `new_solution` and so explore; when the `temperature` is low, the algorithm is more likely to reject a poor `new_solution` and continue to exploit the `current_solution` it has.

The details of this algorithm can vary. Some variants don't automatically accept better solutions during the run. The method of varying the `temperature` can change between implementations, though they nearly always end with `temperature = 0`. And there is great variation in how the algorithm decides whether to accept the `new_solution` if it's worse than the `current_solution`.

The detail of how these factors are balanced is in this equation for the probability of accepting a worse `new_solution`, where T is the `temperature`:

\[
p(\mathrm{new}) = \exp \left( \frac{\mathrm{good}(\mathrm{new}) - \mathrm{good}(\mathrm{current})}{T} \right)
\]

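That gets implemented as Python code along these lines (the original is in `keyword_cipher.py` on Github):

```
import math
import random

# Always accept an improvement. Otherwise accept with a chance that shrinks
# as new_solution gets worse and as the temperature falls.
if new_fitness > current_fitness:
    current_solution = new_solution
elif temperature > 0:  # guard: the temperature reaches zero at the end of the run
    sa_chance = math.exp((new_fitness - current_fitness) / temperature)
    if random.random() < sa_chance:
        current_solution = new_solution
```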
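To get a feel for the numbers the equation produces, here is a small sketch that evaluates it for a made-up fitness difference at a few temperatures. The function name `acceptance_probability` and the sample values are illustrative assumptions, not part of the code on Github.

```python
import math

def acceptance_probability(new_fitness, current_fitness, temperature):
    """Chance of accepting new_solution, following the equation above."""
    if new_fitness > current_fitness:
        return 1.0  # better solutions are always accepted in this variant
    if temperature <= 0:
        return 0.0  # at zero temperature, worse solutions are never accepted
    return math.exp((new_fitness - current_fitness) / temperature)

# A solution that is 10 fitness units worse, at falling temperatures
for temperature in [100, 10, 1]:
    p = acceptance_probability(90, 100, temperature)
    print(f"T={temperature:>3}: p={p:.5f}")
```

With these values, the chance of accepting the worse solution falls from roughly 90% at a temperature of 100, to roughly 37% at 10, to practically nothing at 1. The same drop in fitness is tolerated early in the run and all but ruled out late on.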

Unfortunately, there are no hard-and-fast rules for how to set the initial `temperature` or how to decrease it over the course of the execution. It needs to be high enough to allow a good amount of initial exploration, but needs to decrease over enough time to allow the algorithm to find a reasonable solution mid-run and refine it during the exploitation phase. I do it by a linear decrease in temperature, so the temperature decreases by a little bit each iteration until it reaches zero on the final iteration.
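Putting the pieces together, a minimal sketch of the whole loop with a linear cooling schedule might look like this. The function `simulated_annealing`, its parameters and default values, and the toy problem are all assumptions for illustration; this is not the code from `keyword_cipher.py`.

```python
import math
import random

def simulated_annealing(initial, random_change, goodness,
                        iterations=10_000, initial_temperature=10.0):
    """Maximise `goodness` using simulated annealing with linear cooling."""
    current = initial
    current_fitness = goodness(current)
    for i in range(iterations):
        # Linear cooling: starts at initial_temperature, hits 0 on the last iteration
        temperature = initial_temperature * (1 - i / max(iterations - 1, 1))
        candidate = random_change(current)
        candidate_fitness = goodness(candidate)
        delta = candidate_fitness - current_fitness
        # Always accept improvements; accept other moves with chance exp(delta / T)
        if delta > 0 or (temperature > 0
                         and random.random() < math.exp(delta / temperature)):
            current, current_fitness = candidate, candidate_fitness
    return current

# Toy usage (hypothetical problem): find the integer that maximises a
# single-peaked function, moving one step left or right at a time
def toy_goodness(x):
    return -(x - 70) ** 2

def toy_change(x):
    return x + random.choice([-1, 1])

print(simulated_annealing(0, toy_change, toy_goodness))
```

On this toy problem the search should settle on 70, the peak of `toy_goodness`. The starting temperature of 10 is an arbitrary choice that happens to suit fitness differences of this size; a real cipher-breaking run would need its own tuning.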

The algorithm still suffers from the overall flaw of hillclimbing, which is that there's no guarantee that it will find the overall (global) best solution. But done well, simulated annealing should be more likely than plain hillclimbing to produce a sufficiently good solution.

In the next post, I'll test these claims by running both algorithms on some random enciphered texts.

### Credits

Cover image: I took it from Instructables, but I'm far from sure that's the original source.

Restaurant photo by Alexis Pineaud