Re: question about 4.2 Multiplication in AES document(FIPS 197)

From: Mok-Kong Shen (mok-kong.shen_at_t-online.de)
Date: 05/30/03


Date: Fri, 30 May 2003 16:38:49 +0200


Jean-Luc Cooke wrote:
>
> Mok-Kong Shen <mok-kong.shen@t-online.de> wrote:
> > Brian Gladman wrote:
> >> My compiler makes a much better job of optimising the following:
> >>
> >> typedef unsigned char byte;
> >> byte mult(byte a, byte b)
> >> { byte r = 0;
> >> b &= 0xff;
> >> do
> >> {
> >> r ^= (b & 1 ? a : 0);
> >> a = (a << 1) ^ (a & 0x80 ? 0x1b : 0);
> >> }
> >> while
> >> (b >>= 1);
> >> return r & 0xff;
> >> }
> >>
> >> I have no doubt that this can be improved on.
>
> > Isn't it clear that '& 0xff' could be dropped and
> > 'r ^= (b & 1 ? a : 0);' could be replaced by
> > 'if (b & 1) r ^= a;' ? Your compiler had certainly
> > recognized these redundancies and optimized them away.
> > That resulted in the better efficiency you observed.
> > (I happen to know of cases where compilers looked
> > even at some more global contexts, i.e. larger regions,
> > and were able to do certain smart optimizations.)
>
> that would be true if there wern't any conditional assign
> operations at the ASM level. Since there are, a
> r ^= (b&1 ? a : 0);
> explicitly says "conditional assign" while
> if (b&1) r ^= a;
>
> portable code is not nessasarily platform ignorant code.

I don't yet understand what you meant. Codes here under
consideration, both the one with the indicated redundancy
and the one without are portable, right? Now, if one
codes oneself in assembler strictly acccoridng to the
stated algorithm with the redundancies and the compiler
generates code without, because it discovers the
optimization, then there should be a timing difference,
shouln't it? (We assume that the quality of code of the
human programmer and the compiler can otherwise be put
on a comparable bases.)

M. K. Shen



Relevant Pages

  • Re: WaitForSingleObject() will not deadlock
    ... One is to hijack the semantics of volatile to disable compiler optimizations ... and otherwise let the compiler to agressive optimization. ... Agressive optimizations are the ones that work on the edge of the semantics of the ... Because the compiler can see into lock and unlock, it is able to reduce f ...
    (microsoft.public.vc.mfc)
  • Re: WaitForSingleObject() will not deadlock
    ... represent an incorrect implementation of the language. ... the *compiler* does not guarantee this. ... but to state it in terms of the execution instead of the formal semantics of the language ... as long as the optimizations do not change the semantics of the language). ...
    (microsoft.public.vc.mfc)
  • Re: optimized code
    ... > loop invariants are handled by the JIT not by the compiler fron-ends. ... generates the best optimized MSIL of any of the .NET languages. ... standard native code optimizations on MSIL code. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: c compilation - gcc vs visual c
    ... I recently compiled a numerically intensive c project under cygwin gcc ... MS focuses a lot more on specific optimizations, ... the simplest approach, however (and the one I currently use in my compiler), ... (silly code), are ones I focus on fixing. ...
    (comp.lang.c)
  • Re: Migrating ARM9E codebase to ARM11
    ... > optimizations done using hand-coding, only the control code is left to ... You should then consider upgrading to a newer compiler. ... wish to target might be faster than the assembler optimised for the 9E. ... I do not believe that the ARM ARM for architecture version 6 has been ...
    (comp.sys.arm)