By Randy Nichols (LANAKI) President of the American Cryptogram Association from 1994-1996. Executive Vice President from 1992-1994
CLASSICAL CRYPTOGRAPHY COURSE BY LANAKI December 05, 1995
LECTURE 4

SUBSTITUTION WITH VARIANTS Part III

MULTILITERAL SUBSTITUTION

SUMMARY

Welcome back from the Thanksgiving holiday break. The good news is that this lecture will come to you about Christmas; therefore, no homework. The not so good news is that this concluding Lecture 4 on Substitution with Variants covers some difficult material of wide practicality in the field. In Lecture 4, we complete our look into English monoalphabetic substitution ciphers by describing multiliteral substitution with difficult variants. The Homophonic and GrandPre Ciphers will be covered. The use of isologs is demonstrated. A synoptic diagram of the substitution ciphers described in Lectures 1-4 will be presented.

MULTILITERAL SUBSTITUTION WITH MULTIPLE-EQUIVALENT CIPHER ALPHABETS - aka "MONOALPHABETIC SUBSTITUTION WITH VARIANTS"

Each English letter in plain text has a characteristic frequency which affords definite clues in the solution of simple monoalphabetic ciphers. Associations which individual letters form in combining to make up words, and the peculiarities which certain of them manifest in plain text, afford further direct clues by means of which ordinary monoalphabetic substitution encipherments of such plain text may be readily solved. [FR1]

Cryptographers have devised methods for disguising, suppressing, or eliminating the foregoing characteristics in the cryptograms produced by the methods described in Lectures 1-3. One category of methods, called "variants" or "variant values," is that in which the letters of the plain component of a cipher alphabet are assigned two or more cipher equivalents.

Systems involving variants are generally multiliteral. In such systems, a large number of equivalents is made available by combinations and permutations of a limited number of elements, so each letter of the plain text may be represented by several multiliteral cipher equivalents, which may be selected at random.
For example, if 3-letter combinations are employed as multiliteral equivalents, there are 26^3 or 17,576 available equivalents for the 26 letters of the plain text. They may be assigned in equal numbers to the 26 letters, in which case each letter would be representable by 676 different 3-letter equivalents, or they may be assigned on some other basis, for example proportionately to the relative frequencies of the plain text letters. [FR1]

The primary object of substitution with variants is again to provide several values which may be employed at random in a simple substitution of cipher equivalents for the plain text letters.

As a slight diversion, the reader may ask about uniliteral substitution with variants. It is possible, but not very practical. Note the following cipher alphabet constructed in French by Captain Roger Baudouin in reference [BAUD]:

| Plain | a | b | c | d | e | f | g | h | i | l | m | n | o | p | q | r | s | t | u | v | x | z |
| Cipher | L | G | O | R | F | Q | A | H | C | M | B | T | I | D | N | P | U | S | Y | E | W | J |
| | V | | | | K | | | | | | | X | | | | | Z | | | | | |
(Note that the Captain was not an ACA member. The H=H combination is not allowed.) Baudouin proposed that J plain and Y plain be replaced by I plain, K plain by C plain or Q plain, and W plain by VV plain. Four cipher letters would then be available as variants for the high-frequency plain text letters in French.

Mixed alphabets formed by including all repeated letters of the key word or key phrase in the cipher component were common in Edgar Allan Poe's day but are impractical because they are ambiguous, making decipherment difficult; for example:

Enciphering Alphabet:
| Plain | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z |
| Cipher | N | O | W | I | S | T | H | E | T | I | M | E | F | O | R | A | L | L | G | O | O | D | M | E | N | T |
Inverse form for deciphering:
| Cipher | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
| Plain | p | | | v | h | m | s | g | d | | | q | k | a | b | | | o | e | f | | | c | | | |
| | | | | | l | | | | j | | | r | w | y | n | | | | | i | | | | | | |
| | | | | | x | | | | | | | | | | t | | | | | z | | | | | | |
| | | | | | | | | | | | | | | | u | | | | | | | | | | | |
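The ambiguity of such an alphabet can be illustrated with a short sketch. It rebuilds the inverse (deciphering) form from the enciphering alphabet above and lists every plain text reading a clerk would have to consider for a single group. The function names are illustrative only and do not come from the lecture.

```python
from itertools import product

PLAIN = "abcdefghijklmnopqrstuvwxyz"
# Cipher component from the example above: the key phrase with repeats kept.
CIPHER = "NOWISTHETIMEFORALLGOODMENT"

def inverse_alphabet(plain, cipher):
    """Map each cipher letter to the list of plain letters it may stand for."""
    inverse = {}
    for p, c in zip(plain, cipher):
        inverse.setdefault(c, []).append(p)
    return inverse

def candidate_readings(group, inverse):
    """Every plain text string a cipher group could decipher to."""
    choices = [inverse[c] for c in group]
    return ["".join(letters) for letters in product(*choices)]

inverse = inverse_alphabet(PLAIN, CIPHER)
readings = candidate_readings("TOOET", inverse)
print(len(readings))                      # 3 * 4 * 4 * 3 * 3 = 432 readings
print([r for r in readings if r in ("inthi", "ftthi", "itthi")])
```

Five cipher letters already give 432 possible readings, which is why such alphabets are impractical for the legitimate recipient.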
The average cipher clerk would have difficulty in decrypting a cipher group such as TOOET, each letter having 3 or more equivalents, from which plain text fragments (n)inth, ft thi(s), it thi, etc. can be formed on decipherment. [FR1]

THEORETICAL DISTINCTIONS

In simple or single-equivalent monoalphabetic substitution, two points are evident: 1) The same letter of the plain text is invariably represented by but one and always the same character or cipher unit of the cryptogram. 2) The same character or cipher unit of the cryptogram invariably represents one and always the same letter of the plain text.

In multiple-equivalent monoalphabetic substitution (substitution with variants), two points are also evident: 1) The same letter of the plain text may be represented by one or more different characters or cipher units of the cryptogram. But, 2) the same character or cipher unit of the cryptogram nevertheless invariably represents one and always the same letter of the plain text.

SIMPLE TYPES OF CIPHER ALPHABETS WITH VARIANTS

Figure 4-1
| | | | 6 | 7 | 8 | 9 | 0 |
| | | | 1 | 2 | 3 | 4 | 5 |
| * | * | * | * | * | * | * | * |
| 6 | 1 | * | A | B | C | D | E |
| 7 | 2 | * | F | G | H | IJ | K |
| 8 | 3 | * | L | M | N | O | P |
| 9 | 4 | * | Q | R | S | T | U |
| 0 | 5 | * | V | W | X | Y | Z |
Figure 4-2
| | | | | V | W | X | Y | Z |
| | | | | Q | R | S | T | U |
| * | * | * | * | * | * | * | * | * |
| L | F | A | * | A | B | C | D | E |
| M | G | B | * | F | G | H | IJ | K |
| N | H | C | * | L | M | N | O | P |
| O | I | D | * | Q | R | S | T | U |
| P | K | E | * | V | W | X | Y | Z |
Figure 4-3
| | | | | | A | E | I | O | U |
| | | | | * | * | * | * | * | * |
| T | N | H | B | * | A | B | C | D | E |
| V | P | J | C | * | F | G | H | IJ | K |
| W | Q | K | D | * | L | M | N | O | P |
| X | R | L | F | * | Q | R | S | T | U |
| Z | S | M | G | * | V | W | X | Y | Z |
Figure 4-4
| | | | | | | V | W | X | Y | Z |
| | | | | | | Q | R | S | T | U |
| | | | | | | L | M | N | O | P |
| | | | | | | F | G | H | I | K |
| | | | | | | A | B | C | D | E |
| | | | | | * | * | * | * | * | * |
| V | Q | L | F | A | * | A | B | C | D | E |
| W | R | M | G | B | * | F | G | H | IJ | K |
| X | N | S | H | C | * | L | M | N | O | P |
| Y | T | O | I | D | * | Q | R | S | T | U |
| Z | U | P | K | E | * | V | W | X | Y | Z |
Figure 4-5
| | | | | | | O | | | | |
| | | | | | | M | N | | | |
| | | | | | | J | K | L | | |
| | | | | | | F | G | H | I | |
| | | | | | | A | B | C | D | E |
| | | | | | * | * | * | * | * | * |
| O | M | J | F | A | * | E | N | A | L | U |
| | N | K | G | B | * | T | R | S | F | W |
| | | L | H | C | * | O | IJ | H | Y | X |
| | | | I | D | * | D | C | M | V | K |
| | | | | E | * | P | G | B | Q | Z |
Figure 4-6
| | | | | | Z | | | | |
| | | | | | W | X | Y | | |
| | | | | | S | T | U | V | |
| | | | | | N | O | P | Q | R |
| | | | | * | * | * | * | * | * |
| M | J | F | A | * | E | N | A | L | U |
| | K | G | B | * | T | R | S | F | W |
| | L | H | C | * | O | IJ | H | Y | X |
| | | I | D | * | D | C | M | V | K |
| | | | E | * | P | G | B | Q | Z |
Figure 4-7
| | | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 |
| | | | * | * | * | * | * | * | * | * | * | * | * |
| 7 | 4 | 1 | * | A | B | C | D | E | F | G | H | I | J |
| 8 | 5 | 2 | * | K | L | M | N | O | P | Q | R | S | T |
| 9 | 6 | 3 | * | U | V | W | X | Y | Z | . | , | : | ; |
Figure 4-8
| | | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| | | | * | * | * | * | * | * | * | * | * | * |
| 7 | 4 | 1 | * | A | B | C | D | E | F | G | H | I |
| 8 | 5 | 2 | * | J | K | L | M | N | O | P | Q | R |
| 9 | 6 | 3 | * | S | T | U | V | W | X | Y | Z | * |
Figure 4-9
| | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| | | * | * | * | * | * | * | * | * | * | * |
| 5 | 1 | * | A | B | C | D | E | F | G | H | I |
| 6 | 2 | * | J | K | L | M | N | O | P | Q | R |
| 7 | 3 | * | S | T | U | V | W | X | Y | Z | 1 |
| 8 | 4 | * | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 |
Figure 4-10
| | | | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| | | | | * | * | * | * | * | * | * | * | * | * |
| 0 | 8 | 5 | 1 | * | T | E | R | M | I | N | A | L | S |
| | 9 | 6 | 2 | * | B | C | D | F | G | H | J | K | O |
| | | 7 | 3 | * | P | Q | U | V | W | X | Y | Z | 1 |
| | | | 4 | * | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 |
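As a minimal sketch of how a variant matrix is used, the following takes Figure 4-1, where each row and each column carries two interchangeable digit coordinates, and picks one of each at random for every letter. The names SQUARE, ROW_COORDS, COL_COORDS, encipher, and decipher are illustrative and not part of the lecture; the row-by-column reading order is assumed.

```python
import random

# Figure 4-1: a 5 x 5 square (I and J share a cell) with two
# interchangeable digit coordinates for every row and every column.
SQUARE = ["ABCDE", "FGHIK", "LMNOP", "QRSTU", "VWXYZ"]
ROW_COORDS = ["61", "72", "83", "94", "05"]
COL_COORDS = ["61", "72", "83", "94", "05"]

def encipher(plaintext, rng=random):
    """Replace each letter by a row digit plus a column digit chosen at random."""
    out = []
    for ch in plaintext.upper():
        if ch == "J":
            ch = "I"
        for r, row in enumerate(SQUARE):
            c = row.find(ch)
            if c >= 0:
                out.append(rng.choice(ROW_COORDS[r]) + rng.choice(COL_COORDS[c]))
                break                      # characters outside the square are skipped
    return " ".join(out)

def decipher(ciphertext):
    """Read each dinome row by column; every variant leads to the same cell."""
    out = []
    for pair in ciphertext.split():
        r = next(i for i, v in enumerate(ROW_COORDS) if pair[0] in v)
        c = next(i for i, v in enumerate(COL_COORDS) if pair[1] in v)
        out.append(SQUARE[r][c])
    return "".join(out)

print(decipher("10 15 60 65"))             # the four variants of E
print(decipher(encipher("MEET ME")))
```

Because rows and columns here share the same digits, the dinomes must always be read row first; a matrix such as Figure 4-2 avoids that restriction by using disjoint letter sets for the two coordinates.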
The matrices in Figures 4-1 to 4-10 represent some of the simpler means for accomplishing monoalphabetic substitution with variants. The matrices are extensions of the basic ideas of multiliteral substitution presented in Lecture 3. The variant equivalents for any plain text letter may be chosen at will; thus, in Figure 4-1, e = 10, 15, 60, or 65; in Figure 4-2, e = AU, AZ, FU, FZ, LU, or LZ.

Encipherment by means of the matrices shown in Figures 4-2, 4-3, and 4-6 is commutative: because the row and column coordinates share no symbols, the coordinates may be read row by column or vice versa without cryptographic ambiguity. The remaining matrices are noncommutative; the general convention is to read row by column.

In Figures 4-5 and 4-6, the letters in the square have been inscribed in such a manner that, coupled with the particular arrangement of the row and column coordinates, the number of variants available for each plain text letter is roughly proportional to the frequencies of the letters in the plain text. Figure 35 incorporates a keyword on top of this idea. [FR1]

HOMOPHONIC

The Homophonic Cipher is a simple variant system. It is a 4-level (4-alphabet) dinome cipher. Consider Figure 4-11.

Figure 4-11
| A | B | C | D | E | F | G | H | IJ | K | L | M | N |
| 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
| 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 |
| 68 | 69 | 70 | 71 | 72 | 73 | 74 | 75 | 51 | 52 | 53 | 54 | 55 |
| 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 |

| O | P | Q | R | S | T | U | V | W | X | Y | Z |
| 21 | 22 | 23 | 24 | 25 | 01 | 02 | 03 | 04 | 05 | 06 | 07 |
| 48 | 49 | 50 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 |
| 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 |
| 00 | 76 | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 |
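Figure 4-11 can be regenerated from a four-letter keyword: each keyword letter marks the plain letter under which one of the sequences 01-25, 26-50, 51-75, and 76-00 begins. The sketch below assumes that construction and the 25-letter alphabet with I and J combined; build_table, encipher, and decipher are illustrative names, not the lecture's.

```python
import random

ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"     # 25 letters, J merged with I

def build_table(keyword):
    """Return {letter: [dinome per level]}; each level starts under a keyword letter."""
    table = {letter: [] for letter in ALPHABET}
    for level, key_letter in enumerate(keyword.upper().replace("J", "I")):
        start = ALPHABET.index(key_letter)
        for offset in range(25):
            letter = ALPHABET[(start + offset) % 25]
            number = (level * 25 + offset + 1) % 100    # 100 is written 00
            table[letter].append(f"{number:02d}")
    return table

def encipher(plaintext, table, rng=random):
    """Pick one of the four variants at random for every letter."""
    letters = [c for c in plaintext.upper().replace("J", "I") if c in table]
    return " ".join(rng.choice(table[c]) for c in letters)

def decipher(ciphertext, table):
    """Each dinome belongs to exactly one letter, so decipherment is unambiguous."""
    inverse = {d: letter for letter, ds in table.items() for d in ds}
    return "".join(inverse[d] for d in ciphertext.split())

table = build_table("TRIP")
print(table["A"])                           # ['08', '35', '68', '87']
print(table["O"])                           # ['21', '48', '56', '00']
print(decipher(encipher("ATTACK AT DAWN", table), table))
```

Changing the keyword shifts all four sequences at once, which is the mnemonic convenience the keyworded arrangement offers.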
The keyword TRIP is found by inspecting dinomes 01, 26, 51, and 76. (The lowest number in each of the four sequences.) [FR1] [FR5]

The Russians added an interesting gimmick called the Disruption Area. Consider Figure 4-12 and note the slashes under U - X for the fourth level of dinomes. The famous VIC cipher used this feature very effectively. [NIC4]

Figure 4-12
| A | B | C | D | E | F | G | H | I | J | K | L | M |
| 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
| 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 |
| 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 |
| 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 |

| N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
| 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 |
| 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 |
| 71 | 72 | 73 | 74 | 75 | 76 | 77 | 78 | 53 | 54 | 55 | 56 | 57 |
| 94 | 95 | 96 | 97 | 98 | 99 | 00 | // | // | // | // | 79 | 80 |
The keyword NAVY is represented by dinomes 01, 27, 53, and 79. Security for Homophonic systems is greatly improved if the dinomes and the four sequences are assigned randomly. However, the easy mnemonic feature of the keyworded four sequences is then lost. The Mexican Cipher device is a Homophonic consisting of five concentric disks, the outer disk bearing 26 letters and the other four bearing the sequences 01-26, 27-52, 53-78, and 79-00. The cipher disk facilitates frequent key changes. Figure 4-12, leaving aside the disruption area, shows the corresponding matrix. [FR5] [NIC4]

HOMOPHONIC CRYPTANALYSIS

Let's solve the following cryptogram.
68321 09022 48057 65111 88648 42036 45235 09144
05764 22684 00225 57003 97357 14074 82524 40768
51058 93074 92188 47264 09328 04255 06186 79882
85144 45886 32574 55136 56019 45722 76844 68350
45219 71649 90528 65106 11886 44044 89669 70553
18491 06985 48579 33684 50957 70612 09795 29148
56109 08546 62062 65509 32800 32568 97216 44282
34031 84989 68564 53789 12530 77401 68494 38544
11368 87616 56905 20710 58864 67472 22490 09136
62851 24551 35180 14230 50886 44084 06231 12876
05579 58980 29503 99713 32720 36433 82689 04516
52263 21175 06445 72255 68951 86957 76095 67215
53049 08567 9730
Assuming we did not know that the above cryptogram was a HOMOPHONIC, we might make a preliminary analysis to see if we are dealing with a cipher or a code. We will cover code systems later in the course, but a few introductory remarks might be in order.

The five-digit groups could indicate either a cipher or a code. If the cryptogram contains an even number of digits, as for example the 494 in the previous message, this leaves open the possibility that the message is a cipher containing 247 pairs of digits; were the number of digits an exact odd multiple of five, such as 125, 135, etc., the possibility that the cryptogram is in code of the 5-figure group type must be considered.

We next study the repetitions in the message and their characteristics. If the cipher text is of the 5-figure code type, then such repetitions as appear should generally be in whole groups of five digits, and they should be visible in the text just as the message stands, unless the code message has been superenciphered. If the cryptogram is a cipher, then repetitions should extend beyond the 5-digit groupings; if they conform to any definite pattern at all, they should for the most part contain even numbers of digits, since each letter is probably represented by a pair (dinome) of digits.

We start with a 4-part frequency distribution. We next assume a 25-character alphabet, with the dinomes running from 01 to 00 in four sequences of 25 (01-25, 26-50, 51-75, 76-00). This is the common scheme of drawing up the alphabets.
Breaking the text into dinomes (2-digit pairs) yields:

| 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 |
| /// | | /// | / | ///// | ////// | /// | | //// | //// | ///// | /// | / | / | / | /// | | ////// | | / | // | ///// | // | | / |

| 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 |
| /// | | / | / | / | | ////// | / | / | / | ///// | / | | / | /// | | //// | / | ////// | ////// | /// | | /// | ///// | ///// |

| 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | 75 |
| ///// | ///// | /// | | //// | ///// | ////// | // | | | | // | | ////// | | / | // | /////// | // | / | / | //// | | //// | / |

| 76 | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 | 00 |
| ////// | / | | / | /// | | //// | / | ////// | ////// | /// | | //// | ///// | ////// | /// | / | / | / | /// | | ////// | / | | // |
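The tallying step can be sketched in a few lines. The code below copies the cryptogram from above, cuts it into dinomes, and counts each dinome within its level, on the assumption of four sequences of 25 with 00 standing for 100. The helper names are illustrative; the script prints whatever it counts and makes no claim about individual tallies.

```python
from collections import Counter

CRYPTOGRAM = """
68321 09022 48057 65111 88648 42036 45235 09144
05764 22684 00225 57003 97357 14074 82524 40768
51058 93074 92188 47264 09328 04255 06186 79882
85144 45886 32574 55136 56019 45722 76844 68350
45219 71649 90528 65106 11886 44044 89669 70553
18491 06985 48579 33684 50957 70612 09795 29148
56109 08546 62062 65509 32800 32568 97216 44282
34031 84989 68564 53789 12530 77401 68494 38544
11368 87616 56905 20710 58864 67472 22490 09136
62851 24551 35180 14230 50886 44084 06231 12876
05579 58980 29503 99713 32720 36433 82689 04516
52263 21175 06445 72255 68951 86957 76095 67215
53049 08567 9730
"""

def dinomes(text):
    """Strip the spacing and cut the digit stream into 2-digit pairs."""
    digits = "".join(ch for ch in text if ch.isdigit())
    if len(digits) % 2:
        raise ValueError("odd number of digits: not a pure dinome cipher")
    return [digits[i:i + 2] for i in range(0, len(digits), 2)]

def level_of(dinome):
    """Level 1 = 01-25, 2 = 26-50, 3 = 51-75, 4 = 76-00 (00 counts as 100)."""
    number = int(dinome) or 100
    return (number - 1) // 25 + 1

def tally(pairs):
    """Return {level: [count for each of the 25 positions in that level]}."""
    counts = Counter(pairs)
    table = {}
    for level in range(1, 5):
        first = (level - 1) * 25 + 1
        table[level] = [counts[f"{n % 100:02d}"] for n in range(first, first + 25)]
    return table

pairs = dinomes(CRYPTOGRAM)
print(len(pairs), "dinomes")
for level, row in tally(pairs).items():
    first = (level - 1) * 25 + 1
    for position, count in enumerate(row):
        print(f"{(first + position) % 100:02d} {'/' * count}")
```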
What we have before us are four simple, monoalphabetic frequency distributions similar to those involved in a monoalphabetic substitution cipher using standard cipher alphabets. The next step is to fit the distribution to the normal. Since I=J for the 25 letter alphabet, we find that the Keyword is JUNE and the following alphabets result:
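| 01 | I-J | 26 | U | 51 | N | 76 | E |
| 02 | K | 27 | V | 52 | O | 77 | F |
| 03 | L | 28 | W | 53 | P | 78 | G |
| 04 | M | 29 | X | 54 | Q | 79 | H |
| 05 | N | 30 | Y | 55 | R | 80 | IJ |
| 06 | O | 31 | Z | 56 | S | 81 | K |
| 07 | P | 32 | A | 57 | T | 82 | L |
| 08 | Q | 33 | B | 58 | U | 83 | M |
| 09 | R | 34 | C | 59 | V | 84 | N |
| 10 | S | 35 | D | 60 | W | 85 | O |
| 11 | T | 36 | E | 61 | X | 86 | P |
| 12 | U | 37 | F | 62 | Y | 87 | Q |
| 13 | V | 38 | G | 63 | Z | 88 | R |
| 14 | W | 39 | H | 64 | A | 89 | S |
| 15 | X | 40 | IJ | 65 | B | 90 | T |
| 16 | Y | 41 | K | 66 | C | 91 | U |
| 17 | Z | 42 | L | 67 | D | 92 | V |
| 18 | A | 43 | M | 68 | E | 93 | W |
| 19 | B | 44 | N | 69 | F | 94 | X |
| 20 | C | 45 | O | 70 | G | 95 | Y |
| 21 | D | 46 | P | 71 | H | 96 | Z |
| 22 | E | 47 | Q | 72 | IJ | 97 | A |
| 23 | F | 48 | R | 73 | K | 98 | B |
| 24 | G | 49 | S | 74 | L | 99 | C |
| 25 | H | 50 | T | 75 | M | 00 | D |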
The first groups of the cryptogram decipher as follows:
| 68 | 32 | 10 | 90 | 22 | 48 | 05 | 76 | 51 | 11 | 88 | 64 | 84 | 20 | 36 | 45 | 23 |
| e | a | s | t | e | r | n | e | n | t | r | a | n | c | e | o | f |
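Both the fitting step and the decipherment can be sketched as follows. Fitting is approximated here by trying all 25 starting letters for each level and keeping the one whose deciphered letters score highest against English frequencies; the frequency figures are rough values assumed for this sketch, not taken from the lecture, and with only about sixty dinomes per level the best score is likely, but not guaranteed, to mark the true alignment. All function names are illustrative.

```python
ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"     # 25 letters, I and J combined
# Approximate English frequencies in percent, aligned with ALPHABET (assumed values).
ENGLISH = [8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.2, 0.8, 4.0, 2.4, 6.7,
           7.5, 1.9, 0.1, 6.0, 6.3, 9.1, 2.8, 1.0, 2.4, 0.2, 2.0, 0.1]

CRYPTOGRAM = """
68321 09022 48057 65111 88648 42036 45235 09144
05764 22684 00225 57003 97357 14074 82524 40768
51058 93074 92188 47264 09328 04255 06186 79882
85144 45886 32574 55136 56019 45722 76844 68350
45219 71649 90528 65106 11886 44044 89669 70553
18491 06985 48579 33684 50957 70612 09795 29148
56109 08546 62062 65509 32800 32568 97216 44282
34031 84989 68564 53789 12530 77401 68494 38544
11368 87616 56905 20710 58864 67472 22490 09136
62851 24551 35180 14230 50886 44084 06231 12876
05579 58980 29503 99713 32720 36433 82689 04516
52263 21175 06445 72255 68951 86957 76095 67215
53049 08567 9730
"""

def dinomes(text):
    digits = "".join(ch for ch in text if ch.isdigit())
    return [digits[i:i + 2] for i in range(0, len(digits) - 1, 2)]

def level_and_offset(dinome):
    """Split a dinome into its level (0-3) and position (0-24) within the level."""
    number = int(dinome) or 100             # 00 stands for 100
    return (number - 1) // 25, (number - 1) % 25

def decipher(pairs, keyword):
    """Each keyword letter is the plain letter at the start of one level."""
    starts = [ALPHABET.index(k) for k in keyword.upper().replace("J", "I")]
    return "".join(ALPHABET[(starts[lv] + off) % 25]
                   for lv, off in map(level_and_offset, pairs))

def fit_keyword(pairs):
    """For each level, keep the starting letter with the best frequency score."""
    keyword = ""
    for level in range(4):
        offsets = [off for lv, off in map(level_and_offset, pairs) if lv == level]
        def score(start):
            return sum(ENGLISH[(start + off) % 25] for off in offsets)
        keyword += ALPHABET[max(range(25), key=score)]
    return keyword

pairs = dinomes(CRYPTOGRAM)
print(fit_keyword(pairs))                   # a J in the keyword would appear as I
print(decipher(pairs, "JUNE")[:17])         # EASTERNENTRANCEOF
```

If the printed keyword differs from the one found by inspection, the analyst would compare the runner-up alignments for the doubtful level by eye, exactly as with a hand-drawn distribution.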
If a 26-element alphabet had been used, only the distribution analysis would change, being made on a basis of 26; the process of fitting the distribution to the normal would be the same.

PLAIN COMPONENT COMPLETION METHOD

Suppose we know that two correspondents have been using the same variant system as in the previous Homophonic. The message intercepted is:
48226 88423 52099 93604 76059 05651 36683 52267
97114 54466 76
A variation of the plain-component completion method can be used to crack the new message. We copy the message into dinomes and separate by levels.
| 48 | 22 | 68 | 84 | 23 | 52 | 09 | 99 | 36 | 04 | 76 | 05 | 90 | 56 | 51 | 36 | 68 | 35 | 22 | 67 | 97 | 11 | 45 | 44 | 66 | 76 |
| 2 | 1 | 3 | 4 | 1 | 3 | 1 | 4 | 2 | 1 | 4 | 1 | 4 | 3 | 3 | 2 | 3 | 2 | 1 | 3 | 4 | 1 | 2 | 2 | 3 | 4 |
Levels:
(1) 22 23 09 04 05 22 11
(2) 48 36 36 35 45 44
(3) 68 52 56 51 68 67 66
(4) 84 99 76 90 97 76
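The separation by levels and the completion of the plain component can be sketched as below. For each level the dinomes are first converted at an arbitrary alignment (the first dinome of the level set against A), and then all 25 shifts of the 25-letter alphabet are written out; the analyst picks, for each level, the row that reads like good plain text letters. The function names are illustrative and the 25-letter alphabet with I and J combined is assumed.

```python
ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"     # 25 letters, I and J combined
MESSAGE = "48226 88423 52099 93604 76059 05651 36683 52267 97114 54466 76"

def dinomes(text):
    digits = "".join(ch for ch in text if ch.isdigit())
    return [digits[i:i + 2] for i in range(0, len(digits) - 1, 2)]

def split_levels(pairs):
    """Group the dinomes by sequence: 01-25, 26-50, 51-75, 76-00."""
    levels = {1: [], 2: [], 3: [], 4: []}
    for d in pairs:
        number = int(d) or 100              # 00 stands for 100
        levels[(number - 1) // 25 + 1].append(d)
    return levels

def generatrices(level_pairs):
    """All 25 candidate readings of one level, one per alignment of the alphabet."""
    base = [((int(d) or 100) - 1) % 25 for d in level_pairs]
    return ["".join(ALPHABET[(b + shift) % 25] for b in base)
            for shift in range(25)]

levels = split_levels(dinomes(MESSAGE))
for level, level_pairs in levels.items():
    print(f"Level {level}: {' '.join(level_pairs)}")
    for row in generatrices(level_pairs):
        print("   ", row)
```

Once one row has been chosen for each level, its letters are written back into the positions their dinomes occupied in the message to recover the plain text.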
These dinomes are converted into terms of plain component by setting each of the cipher sequences against the plain component at an arbitrary point of coincidence, such as the following: | A | B | C | D | E | F | G | H | IJ | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | |
|