Re: Expanding hash function bit width

From: Matthijs Hebly (heeb_at_iname.com)
Date: 06/24/04


Date: Thu, 24 Jun 2004 17:29:53 GMT

Gregory G Rose wrote:

> In article <heDCc.10174$3N6.6353@amsnews05.chello.com>,
> Matthijs Hebly <heeb@iname.com> wrote:
> >I have an idea to expand the bit width of any hash function H.
> >What I want to know is whether my idea is:
> >a) totally rediculous, because...
> >b) not bad, but not usable, because...
> >c) OK, OK... let's continue this...
> >d) whatever.
>
> I think it's B.
Thank you. Not tooooo bad.

> >For argument's sake let's say I'm using SHA-1, which has a 160-bit
> >output. I have a message M consisting of N bytes from which I want to
> >derive a 2*160=320 bit hash value, because for some reason I find
> >160-bit too little (I'm considering message-BYTES rather than BITS here,
> >because that happens to be what most computers nowadays calculate with.
> >Any message ending with some bits that do not form a whole byte should
> >be padded somehow, I guess with 10000....).
> >
> >Now let's say I would send every byte of my message M alternating to one
> >of two hashes, H0 and H1, every even numbered byte to H0 and every odd
> >byte to H1. After sending all bytes, I would have a resulting H0 and H1,
> >both consisting of 160 bits.
>
> I think that you want to send every byte of input
> to both hash functions, and not just to alternate
> them. To compensate for the equivalence of input
> you could do either, or both, of:
>
> a) start the hashes by hashing different input
> blocks (eg. H0 = SHA("This is H0" | input))
>
> b) changing the input bytes for one of them, for
> example H1 = SHA(input ^ PAD) where PAD is as many
> 0xC5 bytes as you need. (0xC5 is one of the values
> from HMAC, and it changes half the bits.)
>
> Hmmm, on second thoughts, (b) is useless -- you
> could arrange for the input to have the
> relationship so that it still duplicates the
> other input. So I take that back; (a) is
> necessary and sufficient, (b) is unnecessary and
> insufficient. At least I think so :-). But I
> left in the discussion above to show how hard it
> is to get this stuff right.
>
> Now you can just use H0 and H1 directly, or if you
> want, you could further hash them together.
>
> The rationale for this is that it seems dangerous
> to only give half the information to the partial
> hashes. And it is. See below.
>
> >Because H0 and H1 COULD be the same (especially if M would be something
> >like AABBCCDD...), I negate (change 0-bits into 1's and vice-versa) the
> >most significant half of H0 (the 80 most significant bits), as well as
> >the least significant half of H1 (the 80 least significant bits).
> >
> >Now I concatenate:
> >A) the negated most significant bits from H0 with the most significant
> >bits from H1, and send this to a new hash function H'0;
> >B) the least significant bits from H0 with the negated least significant
> >bits from H1, and send this to a new hash function H'1;
> >C) the outputs from H'0 and H'1, resulting in a 320-bit hash.
>
> So, here's a sort-of attack.
>
> Take a bunch of short, equal-length input
> messages to SHA, and hash them all, looking for
> certain coincidences. You want to find P1 and P2
> such that the first 80 bits of SHA(P1) is the
> same as the first 80 bits of SHA(P2). At the same
> time you want to find messages Q1 and Q2 with the
> same property except that we want the *last* 80
> bits of their hashes to be equal. You should be
> able to find a pair of each by hashing 2^40 short
> texts.
>
> Now form
> M1 = interleave(P1, Q1)
> M2 = interleave(P1, Q2)
> M3 = interleave(P2, Q1)
> M4 = interleave(P2, Q2)
> where "interleave" is just the operation of taking
> one byte from the first message, followed by one
> byte from the second, etc. (In other words, the
> opposite of your splitting algorithm. If you
> change the way you split messages, just change the
> way you interleave them to compensate.)
>
> You now have 4 messages, all of which have the
> property that that the first 160 bits of the
> output hash are equal, and you've done it with
> 2^40 work. You shouldn't be able to do this. Note
> that just further hashing the outputs from (C)
> above together somehow doesn't really solve the
> problem; it will still be the case that full
> collisions can be produced more efficiently
> (2^120) than should be possible for a 320-bit
> output.
Thanx, I guess you are right (on first glance).
So my question as put to Francois Grieu repeated:
Is there some way to construct a remarkably stronger hash from existing
hash building blocks?

Gr,

Matthijs Hebly



Relevant Pages

  • Re: "index" efficiency... any help or ideas?
    ... no general purpose hashing method, no matter how good, is good enough ... to prevent the possibility that everything hash into the same place. ... > Downside of hashes; if the data is stored externally, ... > perfect hashing are techniques that require analysis of the data in advance, ...
    (alt.lang.asm)
  • Re: "index" efficiency... any help or ideas?
    ... > dealing with different searching algorithms. ... > cons of hashing vs. binary searching. ... That's the killer reason for using hashes in this application. ... Searching a hash table is ...
    (alt.lang.asm)
  • Re: Some comments on "super fast hash"
    ... SFH seems reasonably good and certainly is fast. ... > a hash, and SFH does not. ... The latest versions of each hash function which leverages this ... it must behave worse on other key sets. ...
    (comp.programming)
  • Re: basic info on hashes?
    ... The basic idea of hashing is that you compute a hash value ... >of finding a good hash function easier to solve. ... >longer being a simple array. ...
    (comp.lang.pascal.delphi.misc)
  • Re: Why unhashing is not possible?
    ... then it is not a hash. ... (which is where perfect hashing becomes useful) ... Types of hashes and their properties exist apart from why ... ice axes in ice, or digging axes when dealing with a forest fire); ...
    (comp.security.misc)

Loading