Re: WAS Frobenius, so good?

From: Marcel Martin (mm_at_nowhere.net)
Date: 01/14/05


Date: Fri, 14 Jan 2005 17:00:15 +0100

Cristiano a écrit :
>
> Marcel Martin wrote:
> > Cristiano a écrit :
> >>
> >> Marcel Martin wrote:
> >>> Assuming b=1 for FR (the fastest case), both FR
> >>> and BW make use of the same algorithm to compute the Lucas sequence.
> >>> Which program do you use for BW?
> >>
> >> I translated an UltraBasic implementation (which implements the
> >> extra strong Lucas pseudoprime test with method A*) in C++ code.
> >
> > So you compared the running times of FR with "frobenius.c" (which
> > is written in C and makes use of gmp which is also written in C and
> > which is one the fastest existing C libraries) with a program written
> > in C++. Which library does your C++ program use?
>
> I took the times of the BW test running on Commodore 64 Basic V2. :-)
>
> C'mon, Marcel; I done the comparison using *exactly* the same "parameters":
> all the tests are in the same program, they are invoked in the same way and
> they have been written using the same "technique" (language, structure,
> optimizations, ...).

Great. So now, it just remains to know which programs you are
comparing.
 
> Some years ago I have found the BW implementation in UB. My first
> translation was for MIRACL with its C++ wrapper, but few weeks ago I
> translated that code in C code for GMP (optimized for the Athlon).

No, you have not found _the_ BW implementation but _a_ BW
implementation. Is it the file "bwppt1.ub" that you translated? If
so, of course, it is slower than "frobenius.c" (notice that if you
translated "bwppt2.ub", that's even worst).

 1180 Wn=((N+1)\2)
 1190 for I=J to 0 step -1
 1200 Vs=(2*Ub-Us)@N:Vb=((Vs+D*Us)*Wn)@N
 1210 Us=(Us*Vs)@N:Ub=(Ub*Vb)@N
 1220 Um=(Ub+Q*Us)@N
 1230 if Bit%(I) then Us=Um else Ub=Um endif
 1240 next I

With Q=1, an optimal version shouldn't contain more than 1 MUL,
1 SQR and 2 MOD's [*], whereas, here, the "I" loop contains 3 MUL's
and 3 MOD's:

  Vb=((__)*Wn)@N
  Us=(Us*Vs)@N
  Ub=(Ub*Vb)@N

My prefered one is "Vb=((__)*Wn)@N". The goal of this multiplication
by Wn is to divide (__) by 2 modulo N. I hope you didn't translate
it verbatim. But, anyway, even if you rewrote it more efficiently, it
remains 2 MUL's and 2 MOD's, which is sufficient to make _your_ BW
slower than FR (as it is written in "Frobenius.c").

[*] If S is the size of N, I am talking of a multiplication of 2
numbers of size S and of a division of a number of size 2S by a number
of size S. I don't take in account the operations with D or with Q
since they are supposed to be small integers.

-- 
mm
http://www.ellipsa.net/