Re: Tiny table AES implementation

tomstdenis@xxxxxxxxx wrote:
karl malbrain wrote:
Of note is the table organization: 32 or 8 bit. Large tables cause
problems. Tiny table implementations are immune because the whole
table fits in 1/4 of the cache lines needed by large tables.
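
To put a number on the "1/4 of the cache lines" figure, here is a rough sketch of the arithmetic. It assumes 64-byte cache lines and compares one 256-entry table of bytes against one 256-entry table of 32-bit words; the line size and the helper name lines_needed are assumptions for illustration, not something taken from the code under discussion.

```c
#include <stddef.h>

/* Assumed cache line size; check the actual CPU before relying on it. */
enum { CACHE_LINE_BYTES = 64 };

/* Hypothetical helper: number of cache lines an aligned table occupies. */
static size_t lines_needed(size_t entries, size_t entry_bytes)
{
    size_t bytes = entries * entry_bytes;
    /* Round up so a partially used line still counts as one line. */
    return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

Under those assumptions lines_needed(256, 1) gives 4 lines for the byte table and lines_needed(256, 4) gives 16 lines for the 32-bit table, which is where the factor of four comes from.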

The problem [as I understand it] has to do with cache-bank conflicts.
Because the memory is dual-ported, you can perform [up to] two 64-bit
reads at a time. But they cannot be in the same bank on a cache line.
[I'm going from memory, I have the specs around somewhere at the

Right. On Intel machines, back-to-back loads to the same bank stall for
one cycle. Your AES encryption code loads a dword from TE0 and a
byte from s1 back-to-back.
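
A quick way to check whether two loads would land in the same bank is to compare the bank index of their addresses. The sketch below assumes a cache line interleaved across 16 banks of 4 bytes each; the real bank width and count depend on the microarchitecture, so both constants and the helper names bank_of and same_bank are assumptions for illustration only.

```c
#include <stdint.h>

/* Assumed bank geometry: 16 banks, 4 bytes wide (one 64-byte line).
 * These numbers vary by CPU; look them up for the target machine. */
enum { BANK_BYTES = 4, NUM_BANKS = 16 };

/* Hypothetical helper: which bank an address maps to under the
 * assumed interleaving. */
static unsigned bank_of(const void *p)
{
    return (unsigned)(((uintptr_t)p / BANK_BYTES) % NUM_BANKS);
}

/* Nonzero when both addresses map to the same bank, i.e. the pair of
 * loads is a candidate for the one-cycle stall described above. */
static int same_bank(const void *a, const void *b)
{
    return bank_of(a) == bank_of(b);
}
```

Calling same_bank(&TE0[i], &s1[j]) on the two addresses loaded back-to-back would show whether that pair is a candidate for a conflict, provided the assumed geometry matches the CPU being tested.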

We have a Linux box running here that I would like to test on. Can you
post your rd_clock version for Linux? Thanks, karl m

