Re: Release 1.1 (beta) of my AES implementation
From: Tom St Denis (tomstdenis_at_iahu.ca)
Date: Wed, 25 Jun 2003 03:32:53 GMT
Mok-Kong Shen wrote:
> Tom St Denis wrote:
>>Tons of comments, cool, a bit complicated API though... I mean LTC
>>sports a fully functional AES implementation [with more functionality
>>than yours] and it only has five functions in it :-)
> Honestly, I haven't (for diverse very odd personal reasons)
> yet studied your stuff, though I did have a very quick
> glance through one of the files. It's true that I
> only provide the minimum of functionalities. But that
> was in fact also the goal at the start, the work
> being initiated only for the purpose to clear up a topic
> in the other thread concerning the document of AES as
> such. So nothing but ECB mode is done. (Or do I
> misunderstand you here?)
What I mean is my aes.c gives a key setup, ECB encrypt, decrypt, test
vectors and a key_size adjuster function. Your source, on the other hand,
only has setup and ECB encrypt/decrypt, yet it has more functions.
What I was hinting at is you should break up the code into separate files.
>>I still would wave flags at your unions. Not only because I don't trust
>>it but because it honestly is *NOT* faster. When I was porting Rijmen's
>>code to LTC I experimented with various things like that.
> I have a request: Could you do a timing comparison on
> your computers (I have the impression on reading your
> past posts that you have several)? I don't think that
> my code is better, since you have apparently done
> quite some work in optimizing your code particularly
> for certain hardware. On the contrary, I'll like to
> know whether mine is very much poorer in comparison.
I give out free code for this very reason. Write an LTC-compliant
interface to your code, register your cipher and test it with x86_prof.
At the moment I'm touching up the draft of my LTM book [due to be
released tomorrow, hopefully] and don't have time to hack your code.
>>Your gf multiplier is horribly slower than it has to be. If you want to
>>provide a small variant with no 8x32 tables you're going to need gf
>>mults at runtime. Your code is not the way to do it. See my GF
>>multiplier in Twofish for an example of fairly efficient code.
> I suppose there is a misunderstanding here. In the
> application runs (i.e. not the installation run), no
> explicit GF multiplication is done. (There is GF
> multiplication code in the non-optimized version of a
> function, which is however not actually used.)
Might want to rethink that. Having a smaller/slower variant is not
always a bad idea. That being said you can trivially speed up the GF
multiplier without increasing its size too badly.
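
For illustration only, here is a minimal sketch of a compact runtime GF(2^8) multiplier of the kind a table-free variant would need. It is not the Twofish multiplier from LTC referred to above, and the names gf_xtime and gf_mul are hypothetical; it assumes the AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) and uses the plain shift-and-add method.

```c
#include <stdint.h>

/* Multiply by x (i.e. by {02}) in GF(2^8), reducing modulo the AES
 * polynomial x^8 + x^4 + x^3 + x + 1. Only the low byte 0x1B of 0x11B
 * is needed because the result is truncated to 8 bits.
 * Hypothetical helper name. */
uint8_t gf_xtime(uint8_t a)
{
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0x00));
}

/* Shift-and-add multiplication: for every set bit of b, XOR in the
 * matching multiple of a. At most eight iterations, no tables.
 * Hypothetical helper name. */
uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;

    while (b != 0) {
        if (b & 1) {
            r ^= a;          /* addition in GF(2^8) is XOR */
        }
        a = gf_xtime(a);     /* next power-of-x multiple of a */
        b >>= 1;
    }
    return r;
}
```

As a sanity check, the worked example in FIPS-197 gives {57} * {83} = {c1}, so gf_mul(0x57, 0x83) should return 0xC1.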
>>Another suggestion is to put _ in your longer names... e.g.
>>is insane to look at, try:
> You are right. On the other hand, the functions that
> a user actually calls don't have such ugly long names.
Yes, but it makes it easier for people who are auditing the code :-)
>>is slightly easier to read. your enum for PROCESS is in the middle of a
>>source file is it not? That should be in a header so other 3rd party
>>apps can make use of it.
> If another piece of code includes my code as a header
> file, then the location of that definition within my
> code shouldn't matter, if I don't err. (I chose to put
> that nearer to the function of mine that first needs
> that definition so that one doesn't need to look
> far away.)
Yes, but including source with #include is a bad idea in general. You
should really think of writing your code as standalone .C and .H files.
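
A rough sketch of the suggested split follows, shown as one listing for brevity. The file names, the enumerator names and the process_is_valid helper are all hypothetical; the post only says that an enum for PROCESS exists, not what it contains.

```c
/* ---- would live in aes_process.h (hypothetical file name) ---- */
#ifndef AES_PROCESS_H
#define AES_PROCESS_H

/* Hypothetical enumerators: the real enum's members are not shown
 * in the thread. */
typedef enum {
    PROCESS_ENCRYPT = 0,
    PROCESS_DECRYPT = 1
} process_t;

/* Declaration only; third-party code includes the header and links
 * against the compiled .c file. */
int process_is_valid(int p);

#endif /* AES_PROCESS_H */

/* ---- would live in aes_process.c, after #include "aes_process.h" ---- */
int process_is_valid(int p)
{
    return p == PROCESS_ENCRYPT || p == PROCESS_DECRYPT;
}
```

With this layout a third-party application sees the enum by including only the header, and the implementation is compiled once rather than being pulled into every file that needs the type.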
>>Using global variables to store your PT and CT is not only useless in
>>threaded applications but cheating. In any benchmark your app will be
>>faster [well assuming equal code] since you don't push anything on the
>>stack.
> I suppose on the other hand you should do the same to speed
> up your code. I don't yet see a reason why you find it
> desirable to have stuffs on the stack.
Because my code [and LTM too] is thread safe as far as low end stuff is
concerned. For the curious, the only thread-dangerous functions in LTC
are the registry functions [e.g. register_cipher] which wouldn't
normally be called after an application has split into threads.
This means my code can be used in an application that has threads
without worrying. Your code, on the other hand, must have a mutex or
semaphore on the routine to prevent two threads from using the same buffers.
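
The difference between the two styles can be sketched as below. This is a toy stand-in (a bytewise XOR with the key), not AES and not code from either implementation; every name in it is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_LEN 16

/* Style 1: plaintext and ciphertext live in shared globals. Two threads
 * calling toy_encrypt_global() at once would overwrite each other's
 * data, so callers need a mutex or semaphore around every use. */
uint8_t g_pt[BLOCK_LEN];
uint8_t g_ct[BLOCK_LEN];

void toy_encrypt_global(const uint8_t key[BLOCK_LEN])
{
    size_t i;

    for (i = 0; i < BLOCK_LEN; i++) {
        g_ct[i] = g_pt[i] ^ key[i];   /* placeholder for the real cipher */
    }
}

/* Style 2: the caller passes its own buffers. The function touches no
 * shared state, so each thread can use its own buffers with no locking. */
void toy_encrypt(const uint8_t pt[BLOCK_LEN], uint8_t ct[BLOCK_LEN],
                 const uint8_t key[BLOCK_LEN])
{
    size_t i;

    for (i = 0; i < BLOCK_LEN; i++) {
        ct[i] = pt[i] ^ key[i];       /* same placeholder transform */
    }
}
```

The second form costs a few pointer arguments per call, which is the overhead the benchmark remark above refers to, but it can be called from several threads at once as long as each thread uses its own buffers.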
> Thanks for the very quick comments and critiques.
Keep in mind I have neither compiled your code nor tested it against test
vectors. I'm merely commenting on the code style itself.