Re: Surrogate factoring demonstrated

From: Tim Peters (tim.one_at_comcast.net)
Date: 02/25/05


Date: Fri, 25 Feb 2005 01:56:23 -0500


[JSH]
[...]
> Given a target M, you pick a j, where I now recommend picking a j as
> close to M as you can where j is *odd*, as I am now thinking that's
> important, again.
>
> Once you've picked j, you find T, using
>
> T = M^2 - j^2
>
> and my recommendation is that T have 2, 3 and 5 as factors, and there's
> a theoretical reason but I'm keeping things simple here.

Combining all the advice about j, I picked the largest odd integer < M for
which 2, 3, and 5 all divide T. Of course 2 divides T for every odd j, so
there's really nothing to check there.

> Then you just iterate through integers f_1 and f_2 such that
>
> f_1 f_2 = T^6
>
> so you need to factor T, and you use that f_1 and f_2 in
>
> w_num = +/-(f_1 +/- f_2) - 2T^2 (2j^2 + T)
>
> and you take the gcd of w_num with M, and if it's not 1, then you're
> done.

Not quite: the gcd can also be M, and that doesn't help. You're not done
until you've run of <f_1, f_2> pairs (failure), or until 1 < gcd < M
(success).

> For the sake of full disclosure I *do* force in factors of 2 by
> considering some rational f_1 and f_2, basically like
>
> 2 f_1 (f_2/2)
>
> as you have to balance out factors forced in that way, of course.
>
> Many of my solutions come with the forced in factors of 2.

I skipped all that, because it's not clear enough to me to write a program.

...

> That is the full algorithm. I know it looks too simple to be worth
> anything, but if you try it, you will be amazed at the high factoring
> percentage--for small numbers i.e. under 40 bits.

This is my testing procedure: as explained earlier, but less precisely, on
sci.crypt, it's parameterized by N. For a given N, it repeats these steps:

     p1 = a random integer with 2**(N-1) <= p1 < 2**N
     q1 = a random integer in the same range
     # Note that p1 and q1 are both N-bit integers now.
     if p1 == q1:
         # forget it -- try another pair
         # The algorithm specified above doesn't say it shouldn't be used
         # when M is the square of a prime, but JSH has said that in other
         # posts.
     p = the smallest prime >= p1
     q = the smallest prime >= q1
     # Note that, rarely, p and/or q may be a little bigger than N-bit ints
     # now.
     M = p*q
     # Note that this never tries a composite with a small factor (where
     # "small" means less than N bits).
     Use the JSH algorithm above on M, counting the number of gcds
performed;
     as soon as a gcd with 1 < gcd < M is found, the algorithm stops without
     computing any more gcds.
     # Note: I use different methods to factor T. There's no need for
     # anything fancy with numbers this small.

Starting small, with N=10, so these composites are the product of two 10-bit
primes, and doing 1000 repetitions of the steps above. These are composites
like 739 * 821:

    nbits 10, 1000 of 1000 factored, 100%
    gcds: actual 552,579 expected for random gcd 375,476

So it factored all of them, but required more gcds to do so than a method
picking gcd candidates purely at random from 1..M-1 would be expected to
take.

Trying again, with the same N:

    nbits 10, 1000 of 1000 factored, 100%
    gcds: actual 535,567 expected for random gcd 376,601

Roughly the same, but there was a noticeable drop in the actual # of gcds
computed. One more time:

    nbits 10, 1000 of 1000 factored, 100%
    gcds: actual 535,860 expected for random gcd 376,547

A lot like the last one.

Let's try a run with 15-bit primes (unfortunately, it's already a lot slower
here; I don't mean that as a comment on the algorithm, I mean that as a
pragmatic constraint on how much testing I can afford to do this time of
night). These are composites like 21599 * 17509:

   nbits 15, 872 of 1000 factored, 87.2%
   gcds: actual 14,974,536 expected for random gcd 11,986,429

It would be prudent to run that at least twice more, but for now I'm just
going to skip to 20-bit primes and call it a night; these are composites
like 1017371 * 807749 ... ewww, I can't stay up that late tonight, calling
it quits after 125 trials:

    nbits 20, 68 of 125 factored, 54.4%
    gcds: actual 43,765,571 expected for random gcd 47,694,836

So the success rate fell dramatically, but the # of gcds tried is actually
less than expected from a purely-random gcd approach.

I found that very interesting at first, but-- alas --suspect it's really due
to a mistake in a last-second change I made to the test driver. I'll sort
that out tomorrow.

Here's what I suspect: As everyone other than JSH who has tried these
algorithms has found, the probability of finding a factor seems strongly
correlated with the number of possible f_1 f_2 splittings. When a number
fails to factor, it's likely that it had unusually few f_1 f_2 splittings to
try. In the didn't-factor case, the test scaffolding adds the number of
gcds it actually tried to the "actual" count, but still charges the
"expected for random gcd" count with the number of gcds a random method is
expected to take to _succeed_. When that's larger than the number of gcds
the method actually tried before failing to factor, that makes the method
look _better_ than random-gcd despite that it didn't factor at all.

Quick sanity check on that: the last composite it didn't factor was 640993
* 847139, where it tried 314,680 gcds. The number expected for random-gcd
is M/(p+q-2) ~= 364,894. So the method scored a relative "win" of 364894 -
314680 = 50,214 gcds in that case despite that it didn't find a factor.

If I do this again, I'll change the scaffolding to add to the gcds counts
only in the cases where the method finds a factor. An earlier version of
the test driver _did_ do that. I changed it to add to the gcds counts in
all cases in a fuzzy-headed attempt to "be fairer" to the algorithm being
tested. LOL -- it certainly was more than fair <wink>:

   nbits 50, 0 of 2 factored, 0%
   gcds: actual 2,487,440 expected for random gcd 895,946,188,997,481



Relevant Pages

  • Re: Surrogate factoring demonstrated
    ... Use the JSH algorithm above on M, counting the number of gcds ... Starting small, with N=10, so these composites are the product of two 10-bit ... actual 552,579 expected for random gcd 375,476 ...
    (sci.math)
  • Re: Simple answer, surrogate factoring
    ... [JSH] ... >> the gcd of the denominator with M, and for at least one case for any ... "dividing out factors in common between them" isn't the same as "dividing ... Here are the gcds: ...
    (sci.crypt)
  • Re: Simple answer, surrogate factoring
    ... [JSH] ... >> the gcd of the denominator with M, and for at least one case for any ... "dividing out factors in common between them" isn't the same as "dividing ... Here are the gcds: ...
    (sci.math)

Loading