Re: Newbie - How can I figure out which cipher was used...

From: Jim Gillogly (jim_at_acm.org)
Date: 08/26/04


Date: Wed, 25 Aug 2004 23:45:56 GMT

On Wed, 25 Aug 2004 17:25:04 -0400, X wrote:
> I'm rather new at crypto and I have a few questions.
>
> When given a ciphertext with little other information, how can I
> determine which cipher was used to create the CT?

Diagnosis is hard and an art rather than a science, but it looks
like you've got the right ideas for getting started.
>
> I've got this cipher:
>
> MSSZI KCUZU BUEFI JDHOF SJIDP UVZAA HOCNU
> BOHSK SSHSJ BIZFO SDJAZ POKUZ UOZKA DOEHU
> BSVFA EIJSB IZAHH SMDAQ DNAIY TGBJU DYSXO
> FUNYO KSDSE EFOGC OBMPA AHFOH TAVOZ UBEHJ
> UBHOU JEUBM EFUBG... and so on for over 700 characters.
>
> I'm trying to figure out how it was encoded. All I know is that it's a
> "known" cipher (i.e., not something the author invented especially for
> this occasion). And the number 3 (or maybe "three"?) may or may
> not have something to do with it.
>
> I've determined that it's not a mono-alphabetic substitution (Caesar,
> Atbash, Rot-whatever).
>
> I've read about Vigenere ciphers and I don't think it's one of those
> (or, if it is, I think it's an "autokey"...)

Chances are it's not one of these: the index of coincidence (IC)
is too high. A Vig (or autokey) with a reasonable-length key would
give a much lower overall IC, closer to the 0.038 or so of random
text than to the 0.066 or so of English.
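
For anyone who wants to compute the overall IC themselves, here is a
minimal Python sketch. The function name is just illustrative, and
SAMPLE is only the short excerpt quoted above, not the full 700+
characters, so the number it prints is a rough guide at best.

```python
from collections import Counter

def index_of_coincidence(text):
    """Chance that two letters picked at random from text are identical."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    # Count ordered pairs of equal letters, divide by all ordered pairs.
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))

# The excerpt from the post (spaces and newlines are ignored).
SAMPLE = """
MSSZI KCUZU BUEFI JDHOF SJIDP UVZAA HOCNU
BOHSK SSHSJ BIZFO SDJAZ POKUZ UOZKA DOEHU
BSVFA EIJSB IZAHH SMDAQ DNAIY TGBJU DYSXO
FUNYO KSDSE EFOGC OBMPA AHFOH TAVOZ UBEHJ
UBHOU JEUBM EFUBG
"""

print(f"IC of sample: {index_of_coincidence(SAMPLE):.4f}")
```

Compare the result against roughly 0.038 for uniformly random
letters and roughly 0.066 for ordinary English.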
>
> I've tried an "enigma" cracker with no results, so it's probably not
> that either. I've also tried Playfair...

It's not Enigma, because again the IC is too high. With Enigma,
700 characters would give you a chance at a diagnosis, because a
letter can never encrypt to itself: the high-frequency letters
ETAOINSHRDLU, taken as a whole, will show up less often (perhaps
significantly less) in the ciphertext than in the underlying
language. I'm not sure if 700 is long enough for that, though.
And besides, it's not Enigma. :)

It's not traditional Playfair, because it has digraphs with two
of the same letter. Of course, one can always worry that the
kind used for the JFK PT109 message was used here, which *does*
allow the doubled letters, but that's unusual. It doesn't
smell very digraphic, though.
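
The doubled-letter check is easy to automate. A minimal sketch,
assuming the text is split into digraphs starting from the first
letter (the function name is made up for illustration):

```python
def doubled_digraphs(text):
    """Return (pair_index, digraph) for every digraph like 'AA' or 'SS'."""
    letters = "".join(c for c in text.upper() if "A" <= c <= "Z")
    # Non-overlapping pairs: letters 0-1, 2-3, 4-5, ...
    pairs = [letters[i:i + 2] for i in range(0, len(letters) - 1, 2)]
    return [(i, p) for i, p in enumerate(pairs) if p[0] == p[1]]

# Example on a toy string: the pairs are AB, CC, DE.
print(doubled_digraphs("ABCCDE"))
```

Any hit at all argues against traditional Playfair, which never
produces a digraph made of the same letter twice.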

> I found some interesting information on the "Index of Coincidence" and
> I get peaks at 5, 7 and 35... but I'm not sure how to proceed from
> there. The highest peak is at 35, would that indicate a 35 character
> key?

I doubt it, but there may be *something* in there. The IC at 35
is indeed high in the short sample you gave. I'd put it on
the list of "interesting phenomena to try to explain".
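
One common way to get per-period IC figures is to write the text
into N columns and average the IC of the columns. That is only an
assumption about how the peaks above were computed, since the method
wasn't stated. A minimal sketch, with illustrative names:

```python
from collections import Counter

def ic(letters):
    """Index of coincidence of a sequence of letters."""
    n = len(letters)
    if n < 2:
        return 0.0
    return sum(k * (k - 1) for k in Counter(letters).values()) / (n * (n - 1))

def periodic_ic(text, max_period=40):
    """Map each candidate period to the average IC of its columns."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    result = {}
    for period in range(1, max_period + 1):
        # Column i holds letters i, i+period, i+2*period, ...
        columns = [letters[i::period] for i in range(period)]
        result[period] = sum(ic(col) for col in columns) / period
    return result

# Toy example with an obvious period of 2.
for period, value in periodic_ic("ABABABAB", max_period=4).items():
    print(period, round(value, 3))
```

Keep in mind that the columns get short as the period grows. With
a 170-letter excerpt and a period of 35 each column has only about
five letters, so a high value there can easily be noise.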

>
> I've also seen ciphers named "Gronsfeld" and "fractionated morse" and
> from the output, this could be one of those... but is there a way to
> determine if it is? There's no shortage of ciphers to choose from.

Gronsfeld will break the same way as Vigenere. Fractionated Morse
has its own oddities -- I don't have a very specific test for it,
but it's always worth a try.

> Do those ciphers have "signatures", something like "well if it's a
> Vigenere autokey, you'll notice this, this and that" or "note that a
> fractionated morse cipher will tend to exhibit this and this, but not
> this...". You get the idea.

If it's a substitution of some sort, with 700 characters it's
not unlikely that there are some good long repeats that will be
helpful. These can indicate repeated words or phrases. The
3-letter repeats in the sample aren't convincing, though.
Looking at repeats is in general interesting and enlightening
with an unknown, and their absence helps you rule things out.
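
A minimal sketch of a repeat finder, assuming you just want every
repeated sequence of a given length range and where it occurs (the
function name and default lengths are arbitrary choices):

```python
from collections import defaultdict

def find_repeats(text, min_len=3, max_len=8):
    """Map each repeated n-gram to the list of positions where it starts."""
    letters = "".join(c for c in text.upper() if "A" <= c <= "Z")
    repeats = {}
    for n in range(min_len, max_len + 1):
        positions = defaultdict(list)
        for i in range(len(letters) - n + 1):
            positions[letters[i:i + n]].append(i)
        for gram, where in positions.items():
            if len(where) > 1:
                repeats[gram] = where
    return repeats

# Toy example: ABCD occurs at positions 0 and 7.
for gram, where in find_repeats("ABCDXYZABCD", 3, 4).items():
    print(gram, where)
```

The distances between the positions are worth a look too: if they
share a common factor, that factor is a candidate period.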

> I don't want someone to break it for me; I want to understand how to
> start the attack, or how to at least confirm that I'm on the right
> path.

You're on the right path. You might look further at the "3" hint.
Is the length of the full ciphertext a multiple of three? Does a
frequency count of the whole thing as trigraphs show any
interesting phenomena?
Do the first, second and third letters of the trigraphs have any
particular consistency or oddity? The IC has peaks at 6 and 12
also (for the sample you gave), which could indicate a factor of
three being involved.
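
The trigraph questions above can be sketched in a few lines of
Python. This assumes the trigraphs start at the first letter and
that any leftover letters at the end are ignored; the function name
and the shape of the report are invented for illustration.

```python
from collections import Counter

def trigraph_report(text):
    """Summarize the text as non-overlapping trigraphs."""
    letters = "".join(c for c in text.upper() if "A" <= c <= "Z")
    remainder = len(letters) % 3
    usable = len(letters) - remainder
    trigraphs = [letters[i:i + 3] for i in range(0, usable, 3)]
    return {
        "length": len(letters),
        "remainder": remainder,              # 0 means a multiple of three
        "trigraphs": Counter(trigraphs),     # frequency of each trigraph
        # Letter counts for the 1st, 2nd and 3rd position separately.
        "by_position": [Counter(t[p] for t in trigraphs) for p in range(3)],
    }

# Toy example: two full trigraphs plus two leftover letters.
report = trigraph_report("ABCABCAB")
print(report["length"], report["remainder"])
print(report["trigraphs"].most_common(5))
for p, counts in enumerate(report["by_position"], start=1):
    print(p, counts.most_common(5))
```

If one position uses a much smaller or much more lopsided set of
letters than the others, that is the kind of oddity worth chasing.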

Good luck with it!

-- 
	Jim Gillogly

