Re: 64-bit AES



Stian Karlsen wrote:

[snip]
Thanks! This is what I'm looking for, yes. But do you have this
available on your webpage? Haven't seen it there. One thing I needed is
to see these numbers, but I also need something I can refer to.

It will be there shortly so unless you are planning to complete your
thesis in the next couple of days you will have something to refer to.

From these results it looks like there is speed to be gained from a 64-bit
implementation. Why is that? Do you use some SSE instructions to gain

No, it is the 8 extra _ordinary_ registers that make the difference.

There are too few registers on the x86 to do the memory accesses
efficiently because it is necessary to hold the four 32-bit words of the
AES state in them and this prevents their use for other things.

On the 64-bit AMD64/EM64T architecture the 8 extra registers r8..r15 are
available, and this allows static table addressing to be replaced by
register-based addressing, which is much more efficient in terms of the
instruction stream generated (more compact code is often the faster code).

The AES state takes 4 32-bit registers. The extra ordinary registers on
the 64-bit systems allow the next state to be compiled into a second
register based state without disturbing parts of the current state that
are still needed. Hence it is no longer necessary to spill registers
onto the stack or shift bytes around from one register to another to
obtain an empty register for the new state.
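
To make the point about the two register-based states concrete, here is a
minimal C sketch of one table-driven AES round. It is not the assembler code
discussed in this thread. The names aes_init_tables, aes_round and Te, and the
big-endian packing of the state words, are assumptions made for illustration.
The tables are generated at start-up only to keep the example self-contained.

```c
#include <stdint.h>

static uint8_t  sbox[256];
static uint32_t Te[4][256];   /* Te[k] is Te[0] rotated right by 8*k bits */

static uint8_t rotl8(uint8_t x, int n)
{
    return (uint8_t)((x << n) | (x >> (8 - n)));
}

/* Multiply by 2 in GF(2^8) with the AES polynomial 0x11b. */
static uint8_t xtime(uint8_t x)
{
    return (uint8_t)((x << 1) ^ ((x & 0x80) ? 0x1b : 0));
}

/* Only called with n = 8, 16 or 24. */
static uint32_t ror32(uint32_t x, int n)
{
    return (x >> n) | (x << (32 - n));
}

/* Build the S-box and the four combined SubBytes/MixColumns tables. */
void aes_init_tables(void)
{
    uint8_t p = 1, q = 1;                 /* invariant: p * q == 1 in GF(2^8) */
    do {
        p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1b : 0));   /* p *= 3 */
        q ^= (uint8_t)(q << 1);                                  /* q /= 3 */
        q ^= (uint8_t)(q << 2);
        q ^= (uint8_t)(q << 4);
        q ^= (q & 0x80) ? 0x09 : 0;
        /* affine transformation of the inverse */
        sbox[p] = (uint8_t)(q ^ rotl8(q, 1) ^ rotl8(q, 2) ^
                            rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63);
    } while (p != 1);
    sbox[0] = 0x63;                       /* 0 has no inverse */

    for (int i = 0; i < 256; i++) {
        uint8_t  s  = sbox[i];
        uint8_t  s2 = xtime(s);
        uint8_t  s3 = (uint8_t)(s2 ^ s);
        uint32_t w  = ((uint32_t)s2 << 24) | ((uint32_t)s << 16) |
                      ((uint32_t)s << 8)  |  (uint32_t)s3;
        Te[0][i] = w;
        Te[1][i] = ror32(w, 8);
        Te[2][i] = ror32(w, 16);
        Te[3][i] = ror32(w, 24);
    }
}

/* One full round: s is the current state, t receives the next state,
   rk is the round key. Every s word is read while computing every t word,
   so no s word can be overwritten until all four t words exist. */
void aes_round(const uint32_t s[4], uint32_t t[4], const uint32_t rk[4])
{
    uint32_t s0 = s[0], s1 = s[1], s2 = s[2], s3 = s[3];   /* current state */
    uint32_t t0, t1, t2, t3;                               /* next state    */

    t0 = Te[0][s0 >> 24] ^ Te[1][(s1 >> 16) & 0xff] ^
         Te[2][(s2 >> 8) & 0xff] ^ Te[3][s3 & 0xff] ^ rk[0];
    t1 = Te[0][s1 >> 24] ^ Te[1][(s2 >> 16) & 0xff] ^
         Te[2][(s3 >> 8) & 0xff] ^ Te[3][s0 & 0xff] ^ rk[1];
    t2 = Te[0][s2 >> 24] ^ Te[1][(s3 >> 16) & 0xff] ^
         Te[2][(s0 >> 8) & 0xff] ^ Te[3][s1 & 0xff] ^ rk[2];
    t3 = Te[0][s3 >> 24] ^ Te[1][(s0 >> 16) & 0xff] ^
         Te[2][(s1 >> 8) & 0xff] ^ Te[3][s2 & 0xff] ^ rk[3];

    t[0] = t0; t[1] = t1; t[2] = t2; t[3] = t3;
}
```

With s0..s3 and t0..t3 all live at once, plus a key pointer and scratch
registers for the extracted bytes, eight general registers are quickly
exhausted, whereas sixteen leave room to spare. How a particular compiler
actually allocates them is not something this sketch can show.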

this or what? As Tom pointed out there is nothing to gain from the
fact that one can calculate with larger numbers, so by a direct
translation to a 64-bit machine one should get a similar amount of
instructions - each having the same cost as in a 32-bit machine, and
hence get the same speed.
(unless I've misunderstood..)

The additional SSE registers look attractive for some encryption
algorithms but not for AES. The reason is that a primary operation in
AES is to take a byte out of each of four 32-bit words and use this byte
to index into a table in memory. There is no easy way to do this using
the SSE registers and the cost of moving these registers into and out of
ordinary registers gets too high.
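
A rough C sketch of why the byte-indexed lookups sit badly with the SSE
registers follows. It uses SSE2 intrinsics from <emmintrin.h> and so only
builds for x86/x86-64 targets. The function names column_from_gprs and
column_from_xmm and the table parameter T are hypothetical, and T stands for
any four 256-entry lookup tables rather than a specific AES table set.

```c
#include <stdint.h>
#include <emmintrin.h>   /* SSE2 intrinsics */

/* State words already in ordinary registers: each lookup is a shift,
   a mask and an indexed load. */
uint32_t column_from_gprs(uint32_t s0, uint32_t s1, uint32_t s2, uint32_t s3,
                          const uint32_t T[4][256])
{
    return T[0][s0 >> 24] ^ T[1][(s1 >> 16) & 0xff] ^
           T[2][(s2 >> 8) & 0xff] ^ T[3][s3 & 0xff];
}

/* State held in one XMM register: an XMM register cannot be used as a
   memory index, so every word has to be shifted into the low lane and
   copied to an ordinary register before any table lookup can happen. */
uint32_t column_from_xmm(__m128i state, const uint32_t T[4][256])
{
    uint32_t s0 = (uint32_t)_mm_cvtsi128_si32(state);
    uint32_t s1 = (uint32_t)_mm_cvtsi128_si32(_mm_srli_si128(state, 4));
    uint32_t s2 = (uint32_t)_mm_cvtsi128_si32(_mm_srli_si128(state, 8));
    uint32_t s3 = (uint32_t)_mm_cvtsi128_si32(_mm_srli_si128(state, 12));

    return column_from_gprs(s0, s1, s2, s3, T);
}
```

Both functions return the same value. The second simply does extra shifting
and copying first, and a full round needs four such columns.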

So, in case you use some SSE instructions to speed up; is this also used
in a 32-bit Assembly implementation? But there are more such registers
available on a 64-bit machine and hence a speedup may be gained when
moving over to a 64-bit machine.

I have _x86_ code that uses the MMX and SSE registers for AES and it is
faster than code that is based on the ordinary registers in some
situations. But this is because of the small number of registers on the
x86. On AMD64 the much better register set and the ability to schedule
up to three operations in parallel (for some operations) make it
unlikely that the SSE registers would bring advantages.

I don't intend to try this as there are more interesting things to do -
getting to 6 cycles/byte by adding threading support for dual core
systems for example.

Brian Gladman