## My Simple Cipher

Here is a message, encrypted with my simple cipher:

```
WATAE TXLAL PDIBI MGTBU YBTMM FJBJW KBEIZ SGLZQ CCOCT XMQHM QSLFS AOBFX
QQRCK CJSME XFHLT GYSJN QEQUM FNPIZ ZMCWE PXQOD UIDLO SXLCV CGIWI UIVYC
HLYAE TKYKY RVOVZ FJWAR RCYCN SSJNN ADVZB BMQJQ UVGKC UCPVN BGRZA EVOMF
THBSK OZUYQ QDGUO FUIAT XOWPN QESJM MUHHU XBTMM NYGYF YFNFH VIATB UOHPD
QVJAT AEYLT MQTLE EXBTH MMYCT BDDAX
```

You may want to solve the above puzzle. There is one fairly strong clue hidden in the encrypted text.

Some hints:

The first thing you should notice is that the letters are in groups of five. So, one would assume that the spaces (and probably punctuation) have been removed, and meaningless spaces have been added after every fifth letter, for readability. This makes any cipher harder to decrypt. Many standard clues (see Solving Cryptograms) are of little or no help. Here are a few hints concerning my cipher:

• It is not a simple substitution cipher.
• Both the original plain text message and the encrypted message contain 270 characters (maybe with meaningless filler characters at the end).
• Only one rather simple method of encryption has been used.
• The encryption method is fairly robust. In other words, an error (while encrypting or decrypting) does not scramble the entire remaining message. Some very complicated ciphers have this problem.
• M and T appear most often, and R appears least often. I do not think these facts help at all.
• Both the original plain text and the encrypted message read from left to right.

If you haven't solved the puzzle yet, you may want to try again before reading the solution.

Solution:

The strong clue that I mentioned above is that you can make a list of the differences between consecutive letters (AB would be a difference of 1, CA would be a difference of -2 or +24), and these differences follow a pattern. You would find that a difference of four occurs most often. Instead of using numbers for these differences, we can use letters (0=A, 1=B, and so on, up to 25=Z), so that a difference of four becomes E. Written that way, the differences spell out the plain text, and that solves the puzzle.
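The difference list described above can be sketched in a few lines of Python. This is only an illustration; the function name `differences` and the choice to try it on the first three groups of the message are my own assumptions, not part of the original puzzle.

```python
from collections import Counter

def differences(ciphertext: str) -> list[int]:
    """Return (next - previous) mod 26 for each pair of consecutive letters."""
    # Keep only the letters A-Z, so the grouping spaces are ignored.
    nums = [ord(c) - ord("A") for c in ciphertext.upper() if "A" <= c <= "Z"]
    return [(b - a) % 26 for a, b in zip(nums, nums[1:])]

# First three groups of the encrypted message.
sample = "WATAE TXLAL PDIBI"
diffs = differences(sample)

# Write each difference as a letter (0=A ... 25=Z).
as_letters = "".join(chr(d + ord("A")) for d in diffs)

print(Counter(diffs).most_common(3))  # which differences occur most often
print(as_letters)
```

Running the same function over the whole message gives the full list of differences, which can then be tallied and written as letters in the same way.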

My method of encryption is to add each plain text letter to the previous encrypted letter, modulo 26 (see Modular Arithmetic), with A=0 and Z=25. For example, S+N=F because S=18, N=13, and 18+13=31, which is congruent to 5 (F) mod 26. The first letter of the message, "W", has no previous letter, so it remains "W". Decrypting involves subtraction: the previous encrypted letter is subtracted from the current encrypted letter to get the plain text letter. In the above message, the first letter is W, the second is A-W=E, the third is T-A=T, then A-T=H, etc. The entire message is the Preamble to the U.S. Constitution (see it on the WWW somewhere), with a couple of X's added at the end, to fill out the last five-letter group.
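Here is a minimal Python sketch of the method as described above. The names `encrypt` and `decrypt` are mine, and I assume that anything other than the letters A to Z is simply dropped; treat it as one possible rendering, not the only one.

```python
A = ord("A")

def encrypt(plaintext: str) -> str:
    """Add each plain text letter to the previous encrypted letter, mod 26."""
    out = []
    prev = 0  # no previous letter yet, so the first letter is left unchanged
    for ch in plaintext.upper():
        if not ("A" <= ch <= "Z"):
            continue  # assumption: spaces and punctuation are removed
        prev = (prev + ord(ch) - A) % 26
        out.append(chr(prev + A))
    return "".join(out)

def decrypt(ciphertext: str) -> str:
    """Subtract the previous encrypted letter from the current one, mod 26."""
    out = []
    prev = 0
    for ch in ciphertext.upper():
        if not ("A" <= ch <= "Z"):
            continue
        cur = ord(ch) - A
        out.append(chr((cur - prev) % 26 + A))
        prev = cur
    return "".join(out)

cipher = encrypt("We the people")
print(cipher)                          # first letters of the encrypted message
print(decrypt("WATAE TXLAL PDIBI"))    # first three groups, decrypted

# Robustness check: damage one encrypted letter and see how much is lost.
damaged = cipher[:4] + "Q" + cipher[5:]
wrong = sum(a != b for a, b in zip(decrypt(cipher), decrypt(damaged)))
print(wrong)
```

The last few lines illustrate the robustness hint: each plain text letter depends on only two encrypted letters, so one damaged encrypted letter spoils just two letters of the decrypted message.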

Further encryption:

At first, it would seem that we can make the above method fairly secure (difficult to decrypt without a key) if our encrypted message is further encrypted with a simple substitution cipher. Even if the method is known, without the 26-letter key, many such keys must be tried. There are a lot of possible 26-letter keys, over 400 septillion (American septillion by the way; see Million, Billion, Trillion...) of them. Consider the letter B. In the first phase of encryption, B may produce MN. Somewhere else, it may produce ST. This is fairly distinctive. The difference is one (B). But after a second phase of substitution cipher, MN and ST would seem to have no relationship whatsoever. Have I found a really secure cipher?
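The key count and the MN/ST example can be checked with a short Python sketch. The particular substitution key below is an arbitrary shuffle I made up for illustration (the seed 0 has no significance), and `substitute` is a hypothetical helper name.

```python
import math
import random
import string

ALPHABET = string.ascii_uppercase

# Number of possible 26-letter substitution keys.
keys = math.factorial(26)
print(keys)

def substitute(text: str, key: str) -> str:
    """Simple substitution: replace A with key[0], B with key[1], etc."""
    return text.translate(str.maketrans(ALPHABET, key))

# An arbitrary key, made by shuffling the alphabet (illustration only).
key = "".join(random.Random(0).sample(ALPHABET, 26))

# After the first phase, B can show up as MN in one place and ST in another.
for pair in ("MN", "ST"):
    hidden = substitute(pair, key)
    diff = (ord(hidden[1]) - ord(hidden[0])) % 26
    print(pair, "->", hidden, "difference", diff)
```

With most keys, the two substituted pairs no longer share the difference of one, which is why the second phase seems to hide the pattern.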

My advice to myself is, "Guess again." For longer messages, this cipher (my simple cipher + simple substitution) is not very secure. Every time a pair of encrypted letters (QB, for example) repeats, it represents the same plain text letter. Frequency analysis of pairs should be very helpful in breaking this cipher. We have 26x26=676 possible pairs, and our frequency table is a 26x26 table instead of a 1x26 list. If a simple substitution cipher of 100 letters can be solved fairly easily, then a text of 100x26=2600 letters, using this cipher (simple cipher + simple substitution), should be about as easy to break.
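The claim about repeated pairs can be tested with a small Python sketch. Everything named here (`chain_encrypt`, `substitute`, `pair_table`, the shuffled key, the sample text) is my own choice for illustration. The sketch encrypts in two phases and then records, for every pair of adjacent encrypted letters, which plain text letters it stood for.

```python
import random
import string
from collections import defaultdict

ALPHABET = string.ascii_uppercase

def chain_encrypt(plaintext: str) -> str:
    """First phase: add each letter to the previous encrypted letter, mod 26."""
    out, prev = [], 0
    for ch in plaintext:
        prev = (prev + ALPHABET.index(ch)) % 26
        out.append(ALPHABET[prev])
    return "".join(out)

def substitute(text: str, key: str) -> str:
    """Second phase: simple substitution with a 26-letter key."""
    return text.translate(str.maketrans(ALPHABET, key))

def pair_table(plaintext: str, ciphertext: str) -> dict:
    """Map each encrypted pair to the set of plain text letters it stood for."""
    table = defaultdict(set)
    for i in range(1, len(ciphertext)):
        table[ciphertext[i - 1:i + 1]].add(plaintext[i])
    return table

# Arbitrary key and sample text, for illustration only.
key = "".join(random.Random(0).sample(ALPHABET, 26))
plain = "WETHEPEOPLEOFTHEUNITEDSTATESINORDERTOFORMAMOREPERFECTUNION"
cipher = substitute(chain_encrypt(plain), key)

table = pair_table(plain, cipher)
consistent = all(len(letters) == 1 for letters in table.values())
print(consistent)
```

Because the substitution is one-to-one, an encrypted pair pins down the two first-phase letters, and their difference is the plain text letter. So a pair never stands for two different letters, and a 26x26 table of pair counts plays the role that a 1x26 letter count plays for a simple substitution cipher.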

Repetitions of three or more letters would be even more informative. For example, if an encrypted letter is N, and the next plain text words are "the constitution is", then a second encrypted N followed by "the constitution is" will result in an identical sequence of 18 letters, unlikely to be a coincidence. Of course it is unlikely that two copies of "the constitution is" will be preceded by the same encrypted letter. But entire repeated phrases can happen.
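The example above can be checked with a short Python sketch. The function `continue_from` is a hypothetical helper of mine: it carries on the first-phase encryption from a given previous encrypted letter. The second-phase substitution is left out, since it replaces letters one-for-one and cannot break up a repetition.

```python
def continue_from(prev_letter: str, plaintext: str) -> str:
    """Encrypt plaintext, starting after the encrypted letter prev_letter.

    The returned string includes prev_letter itself as its first character.
    """
    prev = ord(prev_letter) - ord("A")
    out = [prev_letter]
    for ch in plaintext.upper():
        if "A" <= ch <= "Z":  # assumption: spaces are dropped
            prev = (prev + ord(ch) - ord("A")) % 26
            out.append(chr(prev + ord("A")))
    return "".join(out)

first = continue_from("N", "the constitution is")
second = continue_from("N", "the constitution is")
other = continue_from("Q", "the constitution is")

print(first, len(first))
print(first == second)  # same previous letter: identical sequences
print(first == other)   # different previous letter: different sequences
```

The phrase has 17 letters, so together with the leading N the repeated run is 18 letters long.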