Frequency Analysis
Firstly, what is frequency analysis and how will it help you?
Frequency analysis is basically the process of examining the occurrence of characters, but when combined with other skills it becomes an essential tool for cryptanalysis. Linguistics, imagination and visual awareness are just some of the skills that, when used in combination with frequency analysis, can break most simple ciphers with surprising ease.
The introduction of frequency analysis, circa 1000 CE from the Arab world, forced the world of cryptography to change and marked the transition from simple ciphers to mathematics, science and mechanics. The early ciphers could no longer stand against such an adversary. The use of frequency analysis highlighted a major weakness in the simple substitution ciphers (e.g. Pigpen, Atbash, monoalphabetic). These ciphers replace each letter with another letter or symbol, but that does not conceal the 'behaviour' of the letters or their frequency.
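As a rough illustration of that weakness, the sketch below enciphers a message with Atbash and checks that the sorted letter counts are unchanged. The function names and the sample message are made up for this example; it is only a minimal demonstration, not a full cipher tool.

```python
from collections import Counter
import string

def atbash(text: str) -> str:
    """Encipher with Atbash (A<->Z, B<->Y, ...). Non-letters pass through."""
    upper = string.ascii_uppercase
    table = str.maketrans(upper, upper[::-1])
    return text.upper().translate(table)

def frequency_profile(text: str) -> list[int]:
    """Letter counts sorted from most to least common, ignoring which letter is which."""
    counts = Counter(ch for ch in text if ch.isalpha())
    return sorted(counts.values(), reverse=True)

plaintext = "MEET ME AT THE SECRET GATE"  # hypothetical sample message
ciphertext = atbash(plaintext)
print(ciphertext)  # NVVG NV ZG GSV HVXIVG TZGV
# The letters have changed, but the shape of the distribution has not.
print(frequency_profile(plaintext) == frequency_profile(ciphertext))  # True
```

The most common cipher letter (V here) simply stands in for the most common plaintext letter (E), which is exactly what frequency analysis exploits.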
Each letter of the alphabet (on average) has a frequency that it is likely to turn up, so the common vowel E is much more likely to appear than the letter Z. From the study of many pieces of text, we could compile a frequency chart. We could then perform frequency analysis on our Cipher-text and then compare those results to a standard frequency chart. It is important to consider what Cipher-text you have e.g. Military text, private letter, and also the language it's written in. Caesar enhanced the security of his cipher by sending the message in Greek, which was not well known at that time.
In the above paragraph, the letter 'E' appears 66 times, 'T' 54 times and poor old 'Z' once, and that was just to say it didn't appear much.
[Frequency chart: the standard letter distribution in a draft version of this essay]
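The counting itself is easy to automate. Here is a minimal sketch, assuming plain A-Z text; the names `letter_frequencies`, `guess_mapping` and `ENGLISH_ORDER` are invented for this example, and the English ordering used is only one commonly quoted approximation (sources differ on the exact order).

```python
from collections import Counter

# Commonly quoted approximate ordering of English letters, most frequent first.
# Assumption: exact orderings vary between sources and kinds of text.
ENGLISH_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ"

def letter_frequencies(text: str) -> list[tuple[str, int, float]]:
    """Return (letter, count, percentage) rows, most common first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    counts = Counter(letters)
    return [(ch, n, 100.0 * n / total) for ch, n in counts.most_common()]

def guess_mapping(ciphertext: str) -> dict[str, str]:
    """Naive first guess: n-th most common cipher letter -> n-th most common English letter."""
    ranked = [ch for ch, _, _ in letter_frequencies(ciphertext)]
    return dict(zip(ranked, ENGLISH_ORDER))

for letter, count, percent in letter_frequencies("Hello, World"):
    print(f"{letter}: {count} ({percent:.1f}%)")
```

The mapping from `guess_mapping` is only a starting point. On short messages the ranking rarely matches the standard chart exactly, so expect to swap a few letters by hand.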
The other weakness is an inability to disguise the rules of language. The English language has only a few letters that are commonly found in pairs (EE TT FF LL SS), a few words with two letters (AN AT IN IT IF IS WE OF ON TO SO GO) and even fewer with one (I A). This can be a good starting point for breaking into a cipher, if the spacing is intact. Sometimes cipher messages are broken down into groups of five letters, making the cryptanalyst's task slightly trickier. However, if the spacing remains, we can already see the 'shape' of the plaintext even if we can't translate it. By attacking the small words with the aid of frequency analysis, we should start to see parts of the plaintext come through.
A feasible order to attack a Cipher text (with spacing) could be:
- Frequency Analysis
- One and two letter words
- Pairs and repetition
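The second and third steps can be sketched in a few lines, assuming the ciphertext uses the letters A-Z and has kept its spacing. The function names and the sample ciphertext are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter

def short_words(ciphertext: str) -> tuple[Counter, Counter]:
    """Count the one-letter and two-letter words in a ciphertext that kept its spacing."""
    words = re.findall(r"[A-Z]+", ciphertext.upper())
    ones = Counter(w for w in words if len(w) == 1)
    twos = Counter(w for w in words if len(w) == 2)
    return ones, twos

def doubled_letters(ciphertext: str) -> Counter:
    """Count adjacent repeated letters (the cipher equivalents of EE, TT, LL, ...)."""
    return Counter(m.group(0) for m in re.finditer(r"([A-Z])\1", ciphertext.upper()))

sample = "R HVV GSV YLLP RH LM"  # hypothetical ciphertext
ones, twos = short_words(sample)
print(ones)                     # candidates for I or A
print(twos)                     # candidates for AN, AT, IN, IT, ...
print(doubled_letters(sample))  # candidates for EE, TT, FF, LL, SS
```

A one-letter cipher word is almost certainly I or A, which immediately narrows down any two-letter word that shares that letter.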
If the spaces have been removed, you lose the opportunity to attack the smaller words specifically. However, you may well see many two- and three-letter repetitions that correspond to those smaller words. Looking for patterns is an important part of breaking ciphers, as they often reveal a weakness in the strength of the cipher.
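One way to hunt for those repetitions is to count every two- or three-letter sequence after stripping the spacing. This is a small sketch under that assumption; `repeated_ngrams` and the sample text are invented for this example.

```python
from collections import Counter

def repeated_ngrams(ciphertext: str, n: int, min_count: int = 2) -> list[tuple[str, int]]:
    """List every n-letter sequence occurring at least min_count times, most frequent first."""
    # Drop spaces and punctuation so five-letter grouping does not hide repeats.
    letters = "".join(ch for ch in ciphertext.upper() if ch.isalpha())
    grams = Counter(letters[i:i + n] for i in range(len(letters) - n + 1))
    return [(g, c) for g, c in grams.most_common() if c >= min_count]

grouped = "GSVXZ GRMGS VSZG"  # hypothetical ciphertext sent in groups of five
print(repeated_ngrams(grouped, 3))  # [('GSV', 2)]
```

A repeated three-letter sequence in English ciphertext is often worth testing as THE or AND, although a repeat in a short message can also be pure coincidence.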
It is also worth noting the unimportance of the substitute characters used. For example, I could replace each occurrence of the letter 'A' with a picture, a symbol or a colour. It would not change the difficulty of the cipher, or the approach used to break it. This is particularly true with binary, as you only need to be able to substitute two characters.
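To see why the choice of symbols makes no difference, one can relabel every symbol by the order in which it first appears. The sketch below does this; the `pattern` helper and the symbol names are made up purely for illustration.

```python
def pattern(symbols) -> list[int]:
    """Relabel each symbol by the order of its first appearance: 0, 1, 2, ..."""
    seen = {}
    # setdefault returns the existing label, or assigns the next free number.
    return [seen.setdefault(s, len(seen)) for s in symbols]

as_letters = "NVVG"                          # hypothetical letter-for-letter ciphertext
as_pictures = ["sun", "moon", "moon", "star"]  # the same message drawn as pictures

print(pattern(as_letters))                           # [0, 1, 1, 2]
print(pattern(as_letters) == pattern(as_pictures))   # True
print(pattern("MEET") == pattern(as_letters))        # True
```

Once reduced to numbers, the letter version and the picture version are the same puzzle, so the same frequency counts and pattern searches apply to both.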
Always keep your eyes open and look for those patterns ;)
Monty 2004