Re: Are there problems with Merkle-Damgaard and SHA-256?

From: D. J. Bernstein (
Date: 08/29/05

  • Next message: "Re: Fast implementation of AES, Rijndael"
    Date: Mon, 29 Aug 2005 08:31:02 +0000 (UTC)

    The length-extension bug is that, given (x,SHA-256(m),length of m), one
    can easily compute SHA-256(m,padded length of m,x).

    One consequence of the length-extension bug is that it's not safe to use
    SHA-256(k,m) as an authenticator of m under a secret key k. The attacker
    can use this authenticator to forge an authenticator of any message of
    the form (m,padded length of (k,m),x).

    Applications can work around the length-extension bug without very much
    trouble, but these workarounds lose speed. For example, the bug doesn't
    hurt the HMAC-SHA-256 authenticator, but HMAC-SHA-256 also has terrible
    overhead for short messages.

    If SHA-256(message) had been defined as h(h(h(IV,0mes),0sag),1e10), with
    the final input block visibly distinct from all other blocks, then this
    bug would have disappeared, and would not have required any effort for
    applications to fix. Does anyone know why compressing an extra block
    became more popular than tweaking the final block?

    One way to view the length-extension bug is as as follows. An attacker
    given oracle access to a uniform random compression function h, and to
    the function H defined by H(message) = h(h(h(h(IV,mes),sag),e10),007)
    where IV is a uniform random string (kept secret), can see that

       H(message10007) = h(h(H(message),100),012).

    The same equation is almost never true for independent uniform random
    functions h and H. From this perspective, there may be many other ways
    to distinguish the pair (h,H) from uniform, and one can reasonably ask
    to eliminate all of those bugs, not just the length-extension bug.

    Anyway, the techniques of should
    allow an easy proof that the pair (h,H) is indistinguishable from
    uniform if H(message) is defined as h(h(h(IV,0mes),0sag),1e10). An easy
    corollary is that H is a secure authenticator.

    Nobody should be using this authenticator, or any other SHA-based
    authenticator---the Wegman-Carter framework has faster authenticators
    with stronger security bounds; see there
    are other applications of hash functions, and other ways to see the
    damage caused by the length-extension bug.

    Suppose, for example, that we use H as the hash function in the
    Rabin-Williams signature system. Can we prove that any generic attack
    against the system---any attack that works for all pairs (h,IV), given
    access to h and IV---can be turned into a factorization algorithm? What
    people normally prove is that an attack that works for _all_ functions H
    can be turned into a factorization algorithm; this isn't enough.

    I think that the answer for h(h(h(IV,0mes),0sag),1e10) is again yes,
    although I haven't worked out the details. The answer for a hash
    function with the length-extension bug is surely no.

    I also think that this is what the Coron-Dodis-Malinaud-Puniya CRYPTO
    2005 paper is trying to say, but I haven't deciphered their definitions
    yet, and unfortunately the paper doesn't contain proofs for most of its
    claimed theorems. Hopefully a subsequent paper will do better.

    ---D. J. Bernstein, Professor, Mathematics, Statistics,
    and Computer Science, University of Illinois at Chicago

  • Next message: "Re: Fast implementation of AES, Rijndael"