An 'Enigma Cracker'
The Enigma Code is a well known example of what is now regarded as 'classical' cryptography. This cipher was used by
the Germans in the second world war, initially to secure communications traffic to and from submarines,
although other variants were later used elsewhere.
This example was chosen because of its historical interest, in addition, cryptanalysis is one of the areas in which parallel
computing methods are often relatively straightforward to apply: in a 'brute force and ignorance' cryptographic attack, a
large number of combinations of possible 'keys' are tried out, to see if any produce plausible decrypts.
Because it is possible to do these calculations largely independently from one another, this means that this is well suited
to a simple, highly decoupled, parallel computing solution, which we have implemented using our xml-space technology.
Historical Background
With some justification, the Germans had considerable faith in this cipher. The combination of the secret internal workings of
the mechanism, together with the large number of possible machine configurations, does appear to produce a cryptosystem which,
for all practical purposes, should have been unbreakable at the time. For more information, see:
Modelling the Enigma Machine
In the present example, the encoding/decoding process is done by a computer model written by
Andrew Gray,
which simulates the behaviour of the Enigma machine. The variant of the machine used here had three rotors, which were chosen
in arbitrary order from five possibilities; together with a fixed reflector.
This computer model includes the specific internal wiring of each of the different rotors, and of the reflector, and includes
the details of the mechanism used to advance the rotors.
To encode/decode a message, the model is configured with the appropriate rotor selections and initial positions. The
clear/cypher-text message is then passed in as a string, and is processed by the algorithm; the resulting cypher/clear-text
is then returned to the caller.
XML-Space Enigma Cracker Methodology
The method used here is a pure 'brute force and ignorance' approach, which is not generally recommended!
The cypher-text is passed through the model, and corresponding clear-text is computed for all possible rotor configurations;
each scenario producing a different candidate decrypt message from the same cyphertext input. The correct clear-text message will
only be recovered for the specific (secret) machine configuration that was originally used to encode the message.
As this process executes, each computed candidate clear-text message is assessed for its likelihood of being the correct solution.
Since the computer cannot 'read' the clear-text, a fast and simple method of automatically assessing each candidate solution is
required, to determine which is likely to be correct.
The technique used here is quite simple, a quality metric is computed for each candidate solution, which takes into
account the characteristics of the computed clear-text:
- The angle between the two 26-dimensional letter-frequency vectors - for the candidate clear-text, and empirical language use.
- How much of the message is composed of words which are recognised from an appropriate dictionary.
If suitable data structures and algorithms are used, then this metric can be computed very quickly. The set of possible solutions
are then ranked according to this quality metric, and the set of most likely solutions are returned for assessment.
Example - Decrypting a Secret Message
In this example, we have encoded a message using our Enigma machine simulator, with a 'secret' machine configuration. We then pass
the encoded message to our xml-space Enigma cracker. A compute-grid of worker machines then calculates and assesses all candidate
decrypts almost instantaneously. The xml-space maintains a list of the best decrypt candidates.
In this example, we then use an MS Office addin to extract the best decrypt candidates from the xml-space, and to display them
in a spreadsheet, as shown below.
Remembering that Enigma uses an 'X' characters for spaces, it can be seen that we have successfuly decoded the 'top secret' message.
The 'quality metric' of the solution (~ 0.83) being significantly greater than that of the next best candidate.