a new very fast hash algorithm (160 bits), with a technique of "overlapping sums"

From: Claudio (cgonzale_at_numericable.fr)
Date: 09/21/04


Date: Tue, 21 Sep 2004 23:41:46 +0200

Hello,

This is a method for a hash function, based on a new technique (I
think it's new, but this should be validated).

#define UL unsigned long
#define LEFT 13
#define RIGHT 19

void round()
{
        hash[4] += hash[0] >> RIGHT;
        hash[0] += hash[0] << LEFT;
        hash[0] += hash[1] >> RIGHT;
        hash[1] += hash[1] << LEFT;
        hash[1] += hash[2] >> RIGHT;
        hash[2] += hash[2] << LEFT;
        hash[2] += hash[3] >> RIGHT;
        hash[3] += hash[3] << LEFT;
        hash[3] += hash[4] >> RIGHT;
        hash[4] += hash[4] << LEFT;
}

void processBlock(const UL in[25])
{
        int i, j, r, s;
        for (i = 0; i < 5; i++)
        {
                s = 0;
                for (r=0; r<5; r++)
                {
                        for (j = 0; j < 5; j++)
                        {
                                hash[j] ^= in[j+s];
                                hash[j]++;
                        }
                        round();
                        s += 5;
                }
        }
}

It processes blocks of 800 bits. The same block is processed 3 times.
Each time, a sub-block of 160 bits is XORed with the hash block, and the
result is mixed with itself (round() function), by making the 32 bits
integers overlap with themselves (overlapping, and then addition modulo
  2^32).

Can anybody tell me whether this idea is completely bad or very
interesting ??

The facts:
- The avalanche effect seems OK after 5 rounds. This has been
empirically checked with the tests programs: ENT, NIST , DIEHARD (I
generated a set of hashvalues, from a given 800-bits block where a
32-bits integer at a given position was incremented before generating
the next hashvalue. I chosed the following integers: in[0] and in[24]).

- fast algorithm. In Assembly, the inner loop will take about 40 cycles
by using MMX instructions, and by storing the hash block in MMX
registers (6 cycles per DWORD processed). The processBlock() function
will therefore take about 40*5*5 = 1000 cycles, for 100 bytes => about
10 cycles / byte, which is just a little bit slower than MD5.

Thanks a lot in advance for your remarks (sorry for my non-perfect english).