Two-dimensional Montgomery ladders and ECDSA

From: D. J. Bernstein (
Date: 10/24/05

Date: Mon, 24 Oct 2005 07:38:15 +0000 (UTC)

The current speed records for Diffie-Hellman key exchange, and
specifically for elliptic-curve point multiplication, use two ideas that
Montgomery introduced in 1987:

   * You can easily compute x(iP+jP) given x(iP), x(jP), and x(iP-jP),
     where x(P) means the x-coordinate of point P. This is particularly
     fast for curves of the form y^2 = x^3 + Cx^2 + x.

   * Starting from P, you can reach nP for a b-bit n with a sequence of
     b double+add steps, where each addition iP+jP has iP-jP = P. For
     example, you can reach 51P using 2P,3P,4P,6P,7P,12P,13P,25P,26P.

Putting these ideas together you obtain x(nP) from x(P) at high speed.
Some side advantages: public keys are transmitted as just x-coordinates;
if the curve is chosen carefully then you don't have to spend any time
verifying keys; the computation is easily made immune to timing attacks.
See for more details.

Now, what happens if you want to compute x(mP+nQ), as required for
verification of ECDSA signatures?

One answer (by, e.g., Okeya and Sakurai) is to compute mP via x(mP),
compute nQ via x(nQ), and then add. Computing x(mP) and x(nQ) by
Montgomery's technique takes 2b x-coordinate double+add steps.

I've noticed that a different technique takes 2b x-coordinate additions
and only b x-coordinate doublings. Specifically, one can reach x(mP+nQ)
with a simple two-dimensional addition chain that uses b double+add+add
steps, where each addition has difference P or Q or P+Q or P-Q. Example:

   P+Q, 2P, 2P+Q,
   2P+2Q, 3P+Q, 3P+2Q,
   4P+4Q, 5P+3Q, 5P+4Q,
   9P+7Q, 9P+8Q, 10P+8Q,
   18P+14Q, 18P+15Q, 19P+15Q,
   36P+30Q, 37P+29Q, 37P+30Q,
   73P+59Q, 74P+58Q, 74P+59Q.

Each line has three of the four pairs aP+bQ, (a+1)P+bQ, aP+(b+1)Q,
(a+1)P+(b+1)Q. The missing element of (a+{0,1},b+{0,1}) is always chosen
as either (even,odd) or (odd,even), where the choice can be described
recursively using the (A,B) for the next line:

   * if (a+A,b+B) is (even,odd) then the choice is (odd,even);
   * if (a+A,b+B) is (odd,even) then the choice is (even,odd);
   * if (a+A,b+B) is (even,even) then the lines have the same choices;
   * if (a+A,b+B) is (odd,odd) then the lines have opposite choices.

For small-C Montgomery-shape curves, x-coordinate addition (using affine
difference) takes 5 field operations, and x-coordinate doubling takes 4
field operations, so this two-dimensional addition chain takes just 14
field operations per bit, quite possibly the state of the art in ECDSA
verification even if we don't require timing-attack protection.

Is this a new observation?

---D. J. Bernstein, Professor, Mathematics, Statistics,
and Computer Science, University of Illinois at Chicago