Re: Q: Fast computation of parity

From: Michael J. Fromberger (Michael.J.Fromberger_at_Clothing.Dartmouth.EDU)
Date: 12/31/03


Date: Wed, 31 Dec 2003 11:01:23 -0500

In article
<6%AIb.234860$ea%.129109@news01.bloor.is.net.cable.rogers.com>,
 "Tom St Denis" <tomstdenis@iahu.ca> wrote:

> "Foo Bar" <foobar965@hotmail.com> wrote in message
> news:YkwIb.40486$mU6.150890@newsb.telia.net...
> > "Tom St Denis" <tomstdenis@iahu.ca> writes:
> >
> > > "Foo Bar" <foobar965@hotmail.com> wrote in message
> > > news:CViIb.44244$dP1.176220@newsc.telia.net...
> > > > Mok-Kong Shen <mok-kong.shen@t-online.de> writes:
> > > >
> > > > > Could someone please give an optimized C code to get the
> > > > > parity of bits in a 32-bit word? Could xor-ing the bytes
> > > > > and then using a lookup table be a good idea? Thanks.
> > > >
> > >
> > <SNIP xor-and-shift based code>
> > >
> > > My guess is on any decent modern processor Gregs is faster.
> >
> > Try it. I think you'll be surprised.
>
> Actually no I wasn't.
>
> http://iahu.ca:8080/pars.c
>
> $ ./pars
> Greg: 5513474
> Xor: 11888907
> y = 1048576
>
> Greg's method is 2.15 times faster on my Athlon. Not surprised.

I'm not surprised either. The X86 instruction set is not well suited to
a register-based computation of this type. I can't run the exact same
comparison, because I can't use your rtdsc() function, but here is a
similar output from my G4 (compiled with gcc -O2):

% ./pars
Lookup: 50
XOR: 6

This is consistent with my expectations for a register-rich RISC
processor as compared with the register-poor x86. On my Pentium III
machine at work, my results are at least directionally consistent with
yours:

% ./pars
Lookup: 680000
XOR: 860000

The modified source: http://thayer.dartmouth.edu/~sting/misc/pars2.c

I basically replaced rtdsc() with calls to clock(), and increased the
number of tests.

-M

-- 
Michael J. Fromberger             | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/  | Dartmouth College, Hanover, NH, USA


Relevant Pages

  • Re: Algorithm to compare two binary numbers
    ... which processors might have such an instruction) and the compiler is ... > 10,000,000 XOR operations, and need to count the bits of every XOR result, ... > either by lookup table or by a real-time counting function. ... and ended with a CPU time of between 0.8 and 2 seconds for all of it ...
    (comp.programming)
  • Re: Q: Fast computation of parity
    ... So I'd think the lookup would be faster... ... x86 doesn't have enough general-purpose registers, so I suspect the XOR ... into] takes longer than one such sequence plus an access to the cache. ...
    (sci.crypt)
  • Re: Q: Fast computation of parity
    ... > Lookup: 2 ... > XOR: 1 ... This once again shows that the relative efficiency of ... algorithms/codes could be fairly hardware dependent. ...
    (sci.crypt)
  • Re: Q: Fast computation of parity
    ... I had to increase the number of tests again to get the resolution ... > Lookup: 2 ... > XOR: 1 ... Tom ...
    (sci.crypt)