# Faster key search in SecurID

**From:** Scott Contini (*contini_at_matmail.com*)

**Date:** 09/07/03

**Next message:**M.S. Bob: "Re: cryptology graduate school advice"**Previous message:**Paul Rubin: "Re: Proposal for a new PKI model (At least I hope it's new)"**Next in thread:**David Wagner: "Re: Faster key search in SecurID"**Reply:**David Wagner: "Re: Faster key search in SecurID"**Reply:**Joe Lano: "Re: Faster key search in SecurID"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Date: 6 Sep 2003 17:07:40 -0700

Here's a simple result on SecurID. I'm posting it here since a better

result is coming out soon. I'd be interested in feedback. Again,

the attack is very simple so don't have high expectations!

Scott

===============================================================================

The Effect of a Single Vanishing Differential on ASHF

Scott Contini

Sept 3, 2003

In [1], Biryukov, Lano, and Preneel present the first known

real security design flaw in the securid device. The device

is susceptible to "vanishing differentials" (when two related

plaintexts hash to the same value) with relatively high

probability. In terms of the way the device actually works,

this almost always happens when a one-bit time difference gets

formatted into a two-bit plaintext difference. This note assumes

two-bit plaintext differentials, but the general concept of the

attack we describe will apply to larger bit differentials as well.

We also use the same acronym as [1], ASHF, to refer to the alleged

securid hash function.

The main question we are interested in is:

Given that a two-bit vanishing differential occurred

in the full ASHF with time format, how can we use it

to speed up key search?

In [1], "partial key recovery" is talked about on the bottom

of page 11. They assume at least a few vanishing differentials.

They mention that the vanishing differentials can give some

information about the sum of key bits. They give no further details

about what is learned.

Here, we show that a single vanishing differential allows one

to almost always find the key in approximately 2^60 steps,

which is a factor of 2^4 less than expected. Further, these steps

are about 4 times faster than the full ASHF. Thus, the

net speedup in key search time is a factor of about 2^6.

1. ASHF Key'ed Permutation

We first give a more insightful description of how the ASHF

key'ed permutation really works. The original code, obtained

by I.C. Wiener [2] (apparently by reverse engineering the

ACE server code), is quite cryptic. Even after cleaning

it up, it is not obvious what the algorithm is really doing.

An equivalent but more insightful description of the ASHF is

given now. It first takes the key and separates it into

16 nibbles in the most natural way. We refer to these key

nibbles by k[0] .. k[15] . The data bits are indexed

between 0 and 63. These data bits will be taken 4 at a time,

copied to the permuted_data array from right to left (i.e.

higher indexes are filled in first), and then removed from the

original data array. Every time 4 bits are removed from the

original data array, the size shrinks by 4. Indexes within

that array are always modulo the number of bits remaining.

A pointer m is first initialized to the index k[0] . The

first 4 bits that are taken are those right before the index of m .

For example, if k[0] is 0xa, then bits 6, 7, 8, and 9 are taken.

If k[0] is 0x2, then bits 62, 63, 0, and 1 are taken. As

these bits are removed from the array, the index m is adjusted

accordingly so that it continues to point at the same bit it

pointed to before the 4 bits were removed.

The pointer m is then increased by a value of k[1] , and

the 4 bits prior to this are taken, as before. The process

is repeated until all bits have been taken.

Sample C code is given below.

void

securid_permute_data( OCTET *data, const OCTET key )

{

unsigned char bit_data[64], permuted_bit[64];

unsigned char hex_key[16];

int i, j, k, m;

int bits_remaining = 64;

memset( bit_data, 0, sizeof (bit_data) );

securid_expand_data_to_1_bit_per_byte( *data, bit_data );

securid_expand_key_to_4_bit_per_byte( key, hex_key );

m = 0;

for (k = 0; k < 16; k++)

{

m += hex_key[k];

m = m % bits_remaining;

/* copy the 4 bits before index m into permuted_bit */

for (i = 0; i < 4; i++)

{

/* j is index of the bit that is being copied */

j = m - 4 + i;

if (j < 0)

/* wrap around to the end of the array */

j += bits_remaining;

/* fill in the right side of permuted_bit first */

permuted_bit[bits_remaining - 4 + i] = bit_data[j];

}

/* take out the 4 bits just copied from the bit_data array */

j = m - 4;

if (j < 0)

/* wrap around */

j += bits_remaining;

remove_4_bits( bit_data, j, bits_remaining );

/* there are 4 less bits in the array */

bits_remaining -= 4;

/* decrease m by 4 since the previous 4 bits have been

chopped

out */

m -= 4;

/* note that bits that are wrapped around do not effect the

index */

if (m < 0)

m = 0;

}

securid_reassemble_64_bit_from_64_byte( permuted_bit, data );

}

An alternative description which is not needed for our result but

is cleaner to code is given here. It is essentially the same idea

as above, except rather than removing bits from the array, instead

they are crossed off. This simplifies the task of keeping track

of m , and keeps all indexes modulo the constant 64.

void

securid_permute_data( OCTET *data, const OCTET key )

{

unsigned char bit_data[64], permuted_bit[64];

unsigned char hex_key[16];

int i, j, k, m;

memset( bit_data, 0, sizeof (bit_data) );

securid_expand_data_to_1_bit_per_byte( *data, bit_data );

securid_expand_key_to_4_bit_per_byte( key, hex_key );

m = 0;

for (k = 0; k < 16; k++)

{

/* skip past the next hex_key[k] values, ignoring crossed off

entries */

for (i=0; i < hex_key[k]; ++i) {

m = (m + 1) & 0x3f;

/* if the entry is crossed off, continue until we find one

that is not */

while (bit_data[m] == 255)

m = (m + 1) & 0x3f;

}

/*

* copy the 4 bits before index m into permuted_bit

*/

j = (m-1) & 0x3f;

for (i = 0; i < 4; i++)

{

/* ignore bits that are crossed off */

while (bit_data[j] == 0xff)

j = (j - 1) & 0x3f;

/* fill in the right side of permuted_bit first */

permuted_bit[63 - k*4 - i] = bit_data[j];

/* cross off the bit we just took */

bit_data[j] = 0xff;

}

}

securid_reassemble_64_bit_from_64_byte( permuted_bit, data );

}

The ASHF code that Wiener gave yields the same results as

these alternatives, but does so in a very compact, clever way.

It requires careful study to understand why Wiener's code

really is a permutation.

2. Key Searching after a Single Two-Bit Vanishing Differential

Please refer to [1] for a description of ASHF and the concept

of vanishing differential. In this section, when we say "vanishing

differential", we assume a two-bit difference.

Assume a vanishing differential has occurred. What can we say

about it? First, almost always the internal collision will happen

within 32 subrounds of the first round. If the collision happens

after 32 subrounds, this means that each differential bit has

passed through either B[0] or B[4] without disappearing. It

will then be an additional large number of subrounds before the

differential gets to the other byte (i.e. to B[4] if it just

passed through B[0] and vice versa) and has another chance

to disappear. Heuristically, this is extremely unlikely to

happen. If the difference does not disappear within the first

round (= 64 subrounds), then it gets mixed into the key, which

makes it almost impossible to disappear. After 405 simulation

experiments of vanishing differentials, we found 11 where the

differential lasted more than 32 subrounds. So it appears that

less than 3% of the time the assumption is wrong which will

cause our algorithm to fail.

We will be doing a key search where we test a key to see if it

produces a vanishing differential on our plaintext pairs within

32 subrounds. If it does happen, this indicates a candidate key

which we test further to see if it is the right one. Otherwise,

we advance to the next candidate. Since we only have to try

32 subrounds, this is 8 times faster than the full ASHF. On the

other hand, we have to do it for 2 plaintexts, so each trial

is 4 times faster than the full ASHF.

Note that in 32 subrounds, only the first 32 key bits come into

play. We take advantage of this in our attack.

The attack works as follows. Guess the first 6 bytes of the key.

There are 2^48 possibilities. For each guess, we perform the

key'ed permutation only for those first 6 bytes. After this step,

we have 6 bytes in the permuted_data array and 2 bytes remaining.

Although there are 2^16 possible keys remaining, this is far more

than we need to try since many of the resulting permutations of

the remaining 2 bytes of data overlap. Thus, rather than guessing

the last 2 key bytes, we try all possible permutations that could

have come from some key. There are at most 16*12*8*4 ~=

2^12.58 permutations (see our description of the ASHF key'ed

permutation to understand this). If the collision does indeed

happen, all possibilities for the remaining 2 key bytes are

explored with the full ASHF to see if any is the right key. Since

collisions are extremely rare, the time for testing candidates is

negligible in the big picture. The algorithm as stated gives at

least a reduction factor of 2^3.41 work in terms of total number

of keys studied. Taking into consideration that each step is 4

times faster than ASHF, the speedup is at least 2^5.41 over

exhaustive key search.

In reality, we often can get away with less than the 2^12.58

permutation trials since many of them overlap. For instance,

if both difference bits have been placed in the 6 bytes

already determined by the guessed part of the key (which happens

heuristically with probability about 56%), and if the hamming

weight of the remaining 2 bytes is 0, then there is only 1

possibility for the last 2 bytes. According to an exhaustive

search, if both differential bits have already been placed,

then the average number of possibilities for the last two

bytes is about 2^11.3. If we assume 56% of the time we can

get away with this shortened search and the other 44% of the

time we do the full search of 2^12.58 work (which is not

optimal), then the expected work for the last 2 bytes is

2^12.0. Hence, the modified key search is 2^6.0 times faster

than full key search. The reader can independently verify

that the most number of trials for the last 2 bytes when

there is no difference is 4670, which comes from any bit

rotation of the word 0x09bd.

It is not necessary to precompute and store all unique

permutations for each 2 byte word. An algorithm exists to

traverse them without duplication. The algorithm works as

follows. First, all of the 16 possible choices for the

first 4 bits are listed along with their corresponding

shrunken 12 bit arrays. Each shrunken 12 bit array is

rotated until the least integer representation of that

array is obtained (where one is free to choose the endianness

they prefer). Any duplications among the 16 choices are

removed, where a duplication means that both the 4 bit

and the 12 bit arrays are the same. For each remaining

entry, the same process is repeated recursively on the 12

bit arrays (i.e. the list of all 12 possible choices for

the next 4 bits are obtained along with their shrunken

8 bit arrays, etc....). At each recursive step, the

duplication check must not only be done among the locally

computed lists, but also those lists obtained from other

recursive branches of the same depth. This can be done

in constant time using 2^18 storage bits.

Alternatively, one can use the precomputed memory lookup

table method but switch to guessing 6.5 bytes of the key

instead of 6. The search space becomes size 2^60.17, and

the speedup factor becomes 2^5.83.

3. Intelligent Key Searches

Another characteristic about vanishing differentials is that the

two differential bits usually enter B[0] or B[4] within a few

subrounds of each other. This is because the farther apart they

are from entering these bytes, the more difficult it becomes for

the differentials not to propagate, eliminating any reasonable

chance of them disappearing.

Below is a histogram from 405 trials, showing how many iterations

apart it took for both differences to enter B[0] or B[4] .

Thus, we see that 65.43% of the time the differences enter

B[0] or B[4] within 4 subrounds of each other and 74.07% of

the time within 5 subrounds.

iteration number of sum of sum of occurrences

difference occurrences occurrences divided by 405

----------- ----------- ----------- -----------

0 44 44 0.1086

1 36 80 0.1975

2 61 141 0.3481

3 44 185 0.4568

4 80 265 0.6543

5 35 300 0.7407

6 37 337 0.8321

7 19 356 0.8790

8 9 365 0.9012

9 4 369 0.9111

10 5 374 0.9235

11 3 377 0.9309

12 2 379 0.9358

17 1 380 0.9383

18 1 381 0.9407

19 5 386 0.9531

20 2 388 0.9580

21 4 392 0.9679

22 3 395 0.9753

23 5 400 0.9877

24 5 405 1.0000

One can increase the speed of key search by ordering the

keys according to the difference of the number of subrounds

before each differential enters B[0] or B[4] . For

example, if we initially search only among the keys that

put the differences in B[0] or B[4] within 4 subrounds

of each other, there is a 65.43% chance that we will discover

the key 3.27 times faster than doing the full exhaustive

search. If that search fails, one can try repeating the

search for keys that put the differences in B[0] or B[4]

between 5 and 7 iterations of each other, and so on.

It is assumed that a strategy like this can be combined, to

some extent, with the strategy in Section 2.

Bibliography.

[1] A. Biryukov, J. Lano, and B. Preneel, "Cryptanalysis of the Alleged

SecurID Hash Function," from http://eprint.iacr.org/2003/162/ .

[2] I.C. Wiener, "Sample SecurID Token Emulator with Token Secret

Import,"

from http://archives.neohapsis.com/archives/bugtraq/2000-12/0428.html .

**Next message:**M.S. Bob: "Re: cryptology graduate school advice"**Previous message:**Paul Rubin: "Re: Proposal for a new PKI model (At least I hope it's new)"**Next in thread:**David Wagner: "Re: Faster key search in SecurID"**Reply:**David Wagner: "Re: Faster key search in SecurID"**Reply:**Joe Lano: "Re: Faster key search in SecurID"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ] [ attachment ]