Re: Suggestion for an AES Based Hash Function
 From: Douglas Eagleson <eaglesondouglas@xxxxxxxxx>
 Date: Fri, 28 Mar 2008 04:50:06 0700 (PDT)
On Mar 27, 10:59 am, rossum <rossu...@xxxxxxxxxxxx> wrote:
Spring is sprung and young people's thoughts turn to... Well I am no
longer young so my thoughts have turned to hash functions. Other
amateurs posting here seem to favour cyphers, I go for hash functions.
There are already a number of ways of turning a block cypher into a
hash function, MiyaguchiPreneel and so forth. However the problem is
that generally these methods are slow, requiring both a key schedule
and the required number of rounds for each hash block. I was thinking
about ways to speed this process up.
Rather than trying to use a strong oneway compression function would
it be possible to use a weaker, and faster, compression function with
a strong oneway function added at the end of the hash as a final
computation? With this structure the compression function would just
mix up and compress the incoming message and the oneway nature of the
hash would be added in the final processing stage. The round function
would not be completely weak, but it would not be as cryptographically
strong as MiyaguchiPreneel or whatever.
In order to deny an attacker, who has full control over the initial
message, the ability to fully control all of the internal state of the
cypher I have used two strategies. Firstly I try to use the message
input in two different ways so that it is difficult for the attacker
to get the one message to do two different things. Secondly I have
delayed allowing each incoming message block to impact half of the
state until the processing of the following block. This means that
the attacker is working indirectly when trying to control that half of
the state. It is impossible to prevent an attacker having some
control over the internal state of the hash. The point is to minimise
what control is allowed.
I have used a fixed set of keys in two places. This means that for
messages up to one block (after padding) the block cypher key schedule
need not be run at all. For longer messages the key schedule will
have to be run at least once but less than once per block.
I have limited the design to hash lengths of 128, 256 and 512 bits.
It would not be difficult to extend the design for other hash lengths
if they were required.
My design goals were:
 the compression function should emphasise mixing its inputs over
being oneway.
 the state of the compression function should be twice the size of
the maximum hash. (I had a reference for this, but I cannot now find
it.)
 the final oneway function must emphasise being oneway over mixing
its inputs.
 incoming message blocks must be used in more than one way to make
it more difficult for an attacker to influence the internal state.
 delaying the impact of incoming message blocks on part of the
internal state.
 the last compression function output needs to be well hidden by the
final oneway function so any attacker cannot determine the final
compression output from just the hash.
 shorter hashes should not be truncations of the longer hashes.
 the block cypher key schedule should be run less frequently than
once per block.
 use elements of an existing block cypher: AES.
1 Inputs and Outputs
The two inputs are:
L  the bit length of the hash. This may be any one of 128, 256 or
512 bits. Other values will cause the algorithm to fail.
M  the message to be hashed. The length of the message cannot
exceed 2**64  1 bits. Messages that are too long will cause the
algorithm to fail.
The output is the L bit hash of the message.
2 Padding
The message will be divided into 512 bit blocks. Padding will be as
for MD5: add a 1 bit, some zero bits and 64 bits giving the unpadded
bit size of the original input. Enough zeros are included to pad to
the next 512 bit block boundary. The padded message is divided into n
512 bit blocks: P1, P2, ... Pn.
3 Initial values
The algorithm uses an internal 1024 bit state divided into two 512
state blocks: S and R. These are both initialised to all zeros: S0 =
0x000 ... 00, R0 = 0x000 ... 00
The key schedule uses a 256 bit key mask, KM. This is initialised
using L, the hash length:
L KM

128 0x01280128 ... 0128
256 0x02560256 ... 0256
512 0x05120512 ... 0512
Other possible values of L could be calculated similarly: zero extend
to four digits and treat the result as hexadecimal rather than
decimal, repeating 16 times to get a total of 256 bits. Deriving the
key mask in this way ensures that each different length hash is its
own value and not simply a truncation of a longer hash.
4 Key Schedule
Keys are treated differently for the first and last runs of the
compression function. In both these cases a fixed set of keys is
used. At the start either the key is fixed, derived from the fixed S0
and R0 or else includes input from the initial message block. This
last option hands control of the keys to an attacker. The other two
options give an attacker knowledge of the keys but not control over
them. Given that I use the generated keys over a number of rounds I
preferred to have the first round keys fixed and to start deriving
keys from input at the start of the second round. The keys for the
final run are fixed in order to deny an attacker control over the keys
just before entry to the final one way function. Fixed keys also mean
that a one block message never needs to run the key schedule at all as
it will always use the fixed keys.
4.1 Fixed Keys
These are used for round 1 of the compression function and for the
last round that mixes in the extra block. There are eight of them,
derived from the key mask KM.
KF1 = KM eg 0x05120512 ... 0512
KF2 = ~KF1 eg 0xFAEDFAED ... FAED
KF3 = KF1 <<< 4 eg 0x51205120 ... 5120
KF4 = ~KF3 eg 0xAEDFAEDF ... AEDF
KF5 = KF3 <<< 4 eg 0x12051205 ... 1205
KF6 = ~KF5 eg 0xEDFAEDFA ... EDFA
KF7 = KF5 <<< 4 eg 0x20152051 ... 2051
KF8 = ~ KF7 eg 0xDFAEDFAE ... DFAE
Here ~ is bitwise negation and <<< 4 is a left rotation by four bits.
Even numbered keys are the bitwise negation of the previous odd
numbered key. Key 1 is the key mask, and each further odd numbered
key is the previous odd numbered key rotated left by four bits.
4.2 Variable Keys
The AES key schedule is used to generate a set of 14 round keys.
These are then used in different combinations to cover the next 14
blocks of the hash. After 14 blocks a new set of round keys is
generated. The key schedule will need to be run before processing the
second block and every fourteenth subsequent block. The first block
uses the fixed keys.
Split the current S and R each into two halves and xor the four halves
together, then xor the result with the key mask. The 256 bit result
is used as input to the AES key schedule for a 256 bit key, producing
14 128 bit round keys. These are then shifted left by 8 positions so
that key 9 moves to position 1, key 10 moves to position 2 and so
forth until key 8 moves to position 14. The result is the final list
of keys: RK1, RK2, ... RK14.
(Shi  Slo) < S
(Rhi  Rlo) < R
temp[1 .. 14] < ASE256KS(Shi xor Slo xor Rhi xor Rlo xor KM)
RK[1 .. 14] < (temp[9 .. 14]  temp[1 .. 8])
Each round of the compression function needs eight keys, so block 2 of
the hash will use keys 1 to 8; block 3 will use keys 2 to 9 and so on
with the numbering wrapping round to the start so block 14 will use
keys 14 to 7.
The AES key schedule returns its input at the start of the round keys.
The shift by eight keys avoids the input being used with the S and R
values used in its derivation.
5 Compression Function
As indicated above I have kept this pretty lightweight, doing enough
to mix the inputs up but not really thinking about making it one way.
The message block is mixed with only one half of the state (S) on its
first appearance. When processing the next block the previous S is
mixed with R. This delay makes it more difficult for an attacker to
modify R by changing the message. Changes to R can only be made
inditectly by first changing S.
5.1 Inputs
Pi  the i'th block of the padded message. i = 1, ... n
S(i1), R(i1)  the current state blocks.
RKj ... RK(j+7)  eight round keys, where j = (i1) mod 14 or else
for block 1 (i = 1) and the last run ( i = n+1) the eight fixed keys
are used.
5.2 Output
S(i), R(i)  the new state blocks.
5.3 Compression Function Description
S and P, are xor'ed together to give the SP block. The SP block and
the first four round keys are passed to the Mix() function which
outputs the M block which is then rotated left by one bit. The final
output S block is SP xor M.
S and R are xor'ed together to give the RS block. The RS block and
the last four round keys are passed to the Mix() function which
outputs the N. The final output R block is RS xor N.
// Calculate new S(i)
SP < S(i1) xor Pi
M < Mix(SP, RKj, RK(j+1), RK(j+2), RK(j+3))
M < M <<< 1
S(i) < SP xor M
// Calculate new R(i)
RS < R(i1) xor S(i1)
N < Mix(RS, RKj+4, RK(j+5), RK(j+6), RK(j+7))
R(i) < RS xor N
The SP block is used in two different ways, once as input to the Mix()
function and once to xor with the output from Mix(). SP can be
completely determined by an attacker chosing Pi so by making X perform
two very different functions it makes it more difficult for the
attacker to fully determine the output S(i+1). It is not possible to
avoid the attacker having some influence over S(i+1). The treatment
of the RS block is similar though this is not so critical as it is
less easy for an attacker to fully control the inputs R and S.
The 1 bit rotation of M is to break up the alignment on byte
boundaries.
5.4 Mix Function
This takes AA (SP or RS) and four round keys as inputs. It uses an
matrix of 16 AES round functions to mix the inputs so all the input
can influence all the output.
First the AA block is split into four 128 bit chunks: AAa, AAb, AAc,
AAd:
(AAa  AAb  AAc  AAd) < AA
Together with the four round keys these are input to the first four
AES round functions of the matrix. The output from each AES round
function is passed to its neighbours to the right and to the lower
right with output from the fourth row being input to the top row of
the next column.
A B C
++  ++  ++  ++
RKj   +>  +>  +> 
 R11   R12   R13   R14 > Ma
AAa  +> +> +> 
++  ++  ++  ++
  
++  ++  ++  ++
RKj+1  +>  +>  +> 
 R21   R22   R23   R24 > Mb
AAb  +> +> +> 
++  ++  ++  ++
  
++  ++  ++  ++
RKj+2  +>  +>  +> 
 R31   R32   R33   R34 > Mc
AAc  +> +> +> 
++  ++  ++  ++
  
++  ++  ++  ++
RKj+3  +>  +>  +> 
 R41   R42   R43   R44 > Md
AAd  +> +> +> 
++  ++  ++  ++
A B C
Here the R## are standard AES round functions and A, B and C are link
labels indicating the transfer of output from row four into the input
of row one of the next column. There are four 128 bit outputs: Ma,
Mb, Mc and Md. These are concatenated to give the 512 bit output M.
M < (Ma  Mb  Mc  Md)
6 One Way Function
Having kept the compression function relatively lightweight, this one
is probably overengineered. However it will only be run once per
hash as opposed to many times for the compression fumction.
6.1 Inputs
S(n), R(n)  the state blocks output from the n'th compression
function.
L  the bit length of the required hash.
KM  the key mask.
KF[1 .. 8]  the fixed keys.
6.2 Output
Hash  the L bit hash of the message.
6.2 One Way Function Description
At this point the final message block Pn has only been mixed into S(n)
and not into R(n). In order to mix the last message block properly a
further dummy block of all zeros is compressed, using the fixed keys.
The resulting values of R and S are concatenated and saved as RSpre.
A 256 bit key is derived from RSpre and KM. RSpre is encrypted
using AES256 in counter mode giving RSpost. RSpre is shuffled
using RSpost, and RSpost is shuffled using the changed RSpre. The
actual hash values are derived by repeatedly xoring halves of the
final RS.
// Compress Extra Block
P = 0x000 ... 00
(R(n_1), S(n+1)) < Compress(R(n), S(n), P, KF[1 .. 8])
// Encrypt
RSpre < R(n+1)  S(n+1)
(Shi  Slo) < S(n+1)
(Rhi  Rlo) < R(n+1)
key = (Shi xor Slo xor Rhi xor Rlo xor KM)
IV = 0
RSpost < AES256CTR(key, IV, RSpre)
// Shuffle
T < shuffle RSpre using RSpost
RSfinal < shuffle RSpost using T
// Return hash
(lo  hi) < RSfinal // Split into two 512 bit pieces
H512 < lo xor hi
if (L = 512) return H512 as Hash
(hi  lo) < H512 // Split into two 256 bit pieces
H256 = hi xor lo
if (L = 256) return H256 as Hash
(hi  lo) < H256 // Split into two 128 bit pieces
H128 = hi xor lo
return H128 as Hash
The extra compression block is to mix the last message block into
state block R. AES and the two shuffles are used to obscure the final
values of R and S. The shuffling and AES encryption allows RSpre to
impact RSfinal in two different ways making it more difficult for an
attacker to completely control the final hash output.
6.3 Shuffling
Shuffling uses the elements of one array to reorder the elements of
another array. The instruction "shuffle A using B" uses a stable sort
of the elements of array B to determine an ordering, and then applies
that ordering to array A. For example:
A = 105, 104, 101, 101, 103
B = 202, 204, 205, 201, 204
Using a stable sort for B gives 201, 202, 204, 204, 205 = B[3], B[0],
B[1], B[4], B[2]. Note that using a stable sort requires the ordering
of B[1] before B[4]. The ordering of the B array gives indexes 3, 0,
1, 4, 2 in order. Rearranging A to this ordering gives A[3], A[0],
A[1], A[4], A[2] = 101, 105, 104, 103, 101. Using a stable sort is
important to ensure that the ordering defined by the array is unique
and hence reproducible.
In this case the 1024 bit RSpre and RSpost are to be considered as
arrays of 128 bytes for shuffling. After shuffling they are to be
reassembled into blocks. Byte 0 is at the high end and byte 63 at the
low end of the blocks.
7 Questions
 is the basic idea of a weaker compression function with a stronger
oneway function at the end reasonable?
 have I underegged the compression function? It would be possible
to double up the Mix() function for either S or R or both if required
at a cost in speed.
 have I overegged the oneway function?
 am I right to use a double sized (1024 bit) internal state rather
than just a 512 bit internal state?
 am I right to use fixed keys for the first block, given that I am
otherwise deriving the keys in part from the input?
 am I right to use fixed keys to compress the extra block or should
I use normally derived variable keys?
 any other obvious points I have missed.
8 Bloodsports
You can now all tear this latest idea to pieces. :)
rossum
You make fun of alot of issues here. A selected key string as part of
the relative input appears the design goal.
I would have just made a hidden key.
You take the input and schedule. Allowing the applied hash function.
A technically correct usage of the schedule was all in evidence.
Schedule was the hash's theorectical seqence of strings as a function
of the algorithm, and not a sequnece of algorithm content. Algorithm
in hash was to always be a selected number of the hash output, slected
by sequence location only. A input string was always therfor
unrelatable to the hash content.
A property of functional completeness was always a hash, making it
nontrivial to generate. A basic hash generator would enable common
people to cause new hash functions.
I will post the fourth hash function soon. It will simplify hash
creation fo the common man.
.
 FollowUps:
 Re: Suggestion for an AES Based Hash Function
 From: rossum
 Re: Suggestion for an AES Based Hash Function
 References:
 Suggestion for an AES Based Hash Function
 From: rossum
 Suggestion for an AES Based Hash Function
 Prev by Date: Re: statistical randomness of hash function
 Next by Date: Re: statistical randomness of hash function
 Previous by thread: Suggestion for an AES Based Hash Function
 Next by thread: Re: Suggestion for an AES Based Hash Function
 Index(es):
Relevant Pages
