Re: ECC Point Multipliers



tomstdenis@xxxxxxxxx <tomstdenis@xxxxxxxxx>:

I did a profile of the LTC code and it seems 45% of the time [at least
on my 885] is spent in Montgomery reduction. If I unroll the reduction
code from TFM I get that down to about 41% or so.

Unfortunately the moduli aren't 64-bit friendly (because NIST can't
choose a prime to save their lives...) so the reduction techniques they
have aren't efficient. [...]

2. k-ary sliding double/add, what I use now
3. w-NAF and w-FAN, nice on paper, not faster in practice, at least
not on desktops

You are spending most of the time on point doubling, and there's
probably not much you can do about this.

Is your doubling code already using all the tricks? Most standardized
curves over prime fields use coefficient a = -3, so where you'd be
computing 3 * X^2 + a * Z^4 in the general code (using three squarings
and one multiplication since factor three is a simple and fast case),
you can compute 3 * (X + Z^2) * (X - Z^2) instead (using one squaring
and one multiplication), which is 3 * X^2 - 3 * Z^4, i.e. the above
for a = -3.
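
To make the operation count concrete, here is a rough Python sketch of the
two ways to compute that term for Jacobian coordinates. The function names
are made up for illustration, and the P-256 prime is only an example
modulus; real code would of course use the library's own field routines
rather than Python integers.

```python
# Example modulus: the NIST P-256 field prime (assumption for illustration).
P = 2**256 - 2**224 + 2**192 + 2**96 - 1
A = P - 3  # curve coefficient a = -3, reduced mod P


def slope_general(x, z, a=A, p=P):
    """3*X^2 + a*Z^4 for arbitrary a: three squarings, one multiplication."""
    xx = x * x % p        # squaring 1: X^2
    zz = z * z % p        # squaring 2: Z^2
    zzzz = zz * zz % p    # squaring 3: Z^4
    return (3 * xx + a * zzzz) % p  # a * Z^4 is the one multiplication


def slope_a_minus_3(x, z, p=P):
    """3*(X + Z^2)*(X - Z^2) for a = -3: one squaring, one multiplication."""
    zz = z * z % p                 # the only squaring: Z^2
    t = (x + zz) * (x - zz) % p    # the only full multiplication
    return 3 * t % p               # factor three is cheap (two additions)
```

Both functions return the same value whenever a = -3, since
(X + Z^2) * (X - Z^2) = X^2 - Z^4.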