Re: What is the best way to break polyalphabetic cipher?
From: Fred Limouzin (firstname.lastname@example.org)
Date: Tue, 15 Apr 2003 11:18:54 +0100
> But my cipher is not Vigenere but a polyalphabetic substitution,
> which may be 3 or 4 monoalphabetic substitutions. My question is
> how to determine how many monoalphabetic substitutions are used.
If it were a simple polyalphabetic cipher, I guess you could use the
same method as for Vigenere: split the ciphertext into 2 groups (every
second character) and compute the IC of each group. If both ICs come
out around 0.07 (plain English is about 0.067), then your group count
is a multiple of the key length; otherwise try 3 groups (every third
letter), and so on (assuming there's no column transposition). However,
the letters in the sub-keys won't be ordered as in Vigenere. Once you
have the key length, you have to recover a full monoalphabetic
substitution for each group instead of a Caesar shift, if you will.
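
To make the grouping test above concrete, here is a minimal sketch in
Python. The function names (index_of_coincidence, column_ics,
average_ic_by_period) are my own for illustration, and it assumes the
ciphertext is plain letters with no transposition layered on top.

```python
from collections import Counter

def index_of_coincidence(text):
    """IC = sum(n_i * (n_i - 1)) / (N * (N - 1)) over the letters in text."""
    letters = [c for c in text.upper() if c.isalpha()]
    n = len(letters)
    if n < 2:
        return 0.0  # IC is undefined for fewer than 2 letters; report 0
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))

def column_ics(ciphertext, period):
    """Split the letters into `period` groups (every period-th letter) and
    return the IC of each group."""
    letters = [c for c in ciphertext.upper() if c.isalpha()]
    return [index_of_coincidence("".join(letters[i::period]))
            for i in range(period)]

def average_ic_by_period(ciphertext, max_period=8):
    """Average group IC for each candidate period from 1 to max_period."""
    return {p: sum(column_ics(ciphertext, p)) / p
            for p in range(1, max_period + 1)}
```

Periods whose average IC jumps toward the plaintext-language value are
candidates for the key length or a multiple of it. With a short
ciphertext the per-group ICs get noisy, so treat the numbers as a hint
rather than proof.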
However, is it really polyalphabetic?
> But with my cipher, the IC=0.1078 WHICH IS VERY HIGH.
> The text is English text and substitutions are used (100% sure)...
The reason it's high is that only 13 distinct characters are used, and
used heavily; more to the point, M has a very high frequency. What that
tells you is that it's unlikely to be a simple monoalphabetic
substitution or a transposition cipher.
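
As a rough sanity check on that IC, here is a small Python sketch of
the expected IC for a given letter distribution (the sum of the squared
probabilities, valid for long texts). The function name expected_ic and
the skewed distribution are made up for illustration; the 0.2 share for
one letter is NOT measured from the actual ciphertext.

```python
def expected_ic(probs):
    """Expected IC of a long text whose letters follow `probs`
    (large-sample approximation: sum of squared probabilities)."""
    return sum(p * p for p in probs)

# 26 equally likely letters versus 13 equally likely letters.
flat_26 = expected_ic([1 / 26] * 26)   # 1/26, about 0.038
flat_13 = expected_ic([1 / 13] * 13)   # 1/13, about 0.077

# Hypothetical skew: one letter takes 20% of the text and the other 12
# share the rest evenly. Illustration only, not real data.
skewed_13 = expected_ic([0.2] + [0.8 / 12] * 12)

print(flat_26, flat_13, skewed_13)
```

So even a perfectly flat 13-letter ciphertext would already sit near
0.077, above English plaintext, and any dominant letter pushes the IC
higher still.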
Now, are you sure it's polyalphabetic, or could it be digraphic or
polygraphic substitution, which ain't quite the same ;-)?
I mean, let us say you have 4 polyalphabetic substitutions. Since only
13 chars are used (every other letter of the alphabet, by the way), you
cannot get all the letters you want for a given offset. So if it were
in French and only two of the "keys" have E (for instance), how would
you encipher the plaintext "ELLEFUTCREEEENMAI" (Elle fut creee en
Mai - It was made in May)? See the 4 E's in a row. (Sorry, I couldn't
find a better example.)
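
To spell out the objection, this little Python sketch lists which of
the 4 alphabets would have to encipher an E in that plaintext. It
assumes the alphabets are simply used in rotation (position mod 4); the
function name alphabets_needed is mine.

```python
def alphabets_needed(plaintext, letter, period):
    """Return the sorted alphabet indices (position mod period) at which
    `letter` occurs in the plaintext, counting letters only."""
    letters = [c for c in plaintext.upper() if c.isalpha()]
    target = letter.upper()
    return sorted({i % period for i, c in enumerate(letters) if c == target})

print(alphabets_needed("ELLEFUTCREEEENMAI", "E", 4))  # -> [0, 1, 2, 3]
```

All four alphabets have to handle E here, so a scheme where only two of
them contain E cannot encipher this text.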
I've run frequency analysis on it but I haven't tried deciphering it
yet, so I don't want to throw you off track by tossing out useless
hypotheses. But could it be digraphic substitution (hopefully not
homophonic), or an ADFGVX kind of cipher, or else a +/- offset kind of
thing to recover the 22-26 letters normally used from the 13 in the CT?
Why is M used so much? Why only every other letter, with a bigger
concentration of letters around M?
I think those are the kinds of things you have to look at. Again, I
haven't tried it, so I may well be way off.