Re: SF: Back to theory

From: Nora Baron (norabaron_at_hotmail.com)
Date: 02/26/05


Date: 26 Feb 2005 12:15:16 -0800

Tim Peters wrote:
> [Nora Baron, to JSH]
> > You have made this so much more complicated than it needs to
> > be. Here is a simpler approach that may accomplish much the
> > same and it is also much more general.
> >
> > Assume M is the number to be factored. Pick a (small)
> > integer j. Let T = M - j. Thus T is a function of both
> > M and j.
> >
> > Factor T. Assume you have split it into two factors,
> > f and g. Thus T = f*g.
>
> As for James's methods, the test I'll talk about below tries all
possible
> ways of splitting T=f*g s.t. f and g are integers >= 1 and f >= g.
I'm
> skipping f < g cases since gcd(f-g, whatever) = gcd(g-f, whatever)
(someone
> may wish to sue me over this, but in my universe gcd always returns a

> non-negative result).
>

  Rest assured that at least I will not be filing a lawsuit
over this!

> > Now let X be some rational function of f and g. One
> > possible choice might be,
> >
> > X = (f - g)/(f + g).
> >
> > Finally, let Y = M / X. Thus M = X * Y.
> >
> > Note that X and Y are both functions of the factors of
> > T. Also both are rational numbers. There is some chance
> > that the numerator of X has a factor in common with M.
>
> It's always ambiguous in these cases whether it's assumed that
rationals are
> or aren't expressed in normalized form, where by "normalized" I mean
> gcd(numerator, denominator) = 1. I'm not sure it matters here to the

> results. It matters _pragmatically_ quite a bit, because another gcd

> calculation is required to create a normalized rational. The program
below
> doesn't do that, so doesn't even bother to compute f+g.

  I agree - it might result in a little wasted work, but not a lot,
and it could actually go the other way.

> It just tries
> gcd(f-g, M) for all <f, g> splittings as mentioned above. If it had
to
> compute a normalized X, the count of the number of gcds performed
reported
> below would double.
>
> > This is, certainly, surrogate factoring. Really, your
> > underlying idea is not that different. What you are doing
> > essentially is finding a more complex function to define
> > X as a function of f and g.
>
> Of course I agree, but I'm almost as insignificant as you are these
days
> <wink>.
>

  Hey, you're competing with a serial liar here!

> > Here is how this might work with M = 15. Let j = 1.
> > Then T = M - j = 14. You factor T as f * g = 7 * 2. Then
> > you note that
> >
> > X = (7 - 2) / (7 + 2) = 5/9.
> >
> > Right there, in the numerator, you have a factor that
> > divides M.
> >
> > Interestingly, the denominator, 9, also has a factor in
> > common with M.
> >
> > Let's try it on a bigger number. Say M = 77. This time let
> > j = 2. T = 75 = 3* 5 * 5. Let f = 25 and g = 3. Then
> >
> > X = (f - g)/(f + g) = (25 - 3)/(25 + 3) = 22/28 = 11/14.
> >
> > Note that the numerator of X, namely 11, is a factor of 77.
> > And again, the denominator, 14, also has a factor in common
> > with 77.
> >
> > Pretty amazing, eh?
>
> Heh. Heh heh.
>

  Right. Well, actually I was a little surprised it was that
easy.

> > I am a little surprised myself. I typed out this whole
> > message in the last 15 minutes. I just made up the function
> > that defines X, X = (f - g)/(f + g). I could have chosen an
> > infinity of other functions. Also, quite honestly, I just
> > made up the two examples. I must confess that I did try j = 1
> > with the second example mentally, and saw it was not going
> > to work, so I tried j = 2. No other trial-and-error.
>
> OK, I plugged this into the same test scaffolding I used to test
assorted
> JSH algorithms. In all cases, I used the sequence j=1, j=2, j=3, ...
until
> a factor was found.
>
> Despite that it's using much smaller integers than JSH's methods, it
runs
> much slower than those. I expected that. Despite
sometimes-hysterical
> posts about this point, factoring James's T=M^2-j^2 truly isn't hard,
and
> then iterating over all splittings of T^6 can generate an enormous
number of
> gcds to try. Iterating over the splittings of the comparatively
teensy
> T=M-j in _this_ method yields far fewer splittings to try. Since
most j
> "don't work" here either as M gets larger, we end up needing to
_factor_
> much more often in this method, and factoring is more expensive than
doing
> gcds (I noted once in sci.crypt that I've never hit a case in James's

> assorted algorithms that required using a j larger than log2(M)
except when
> M was the square of a prime; you already know that j much larger than
that
> is sometimes needed in this method; but note that JSH methods can
burn thru
> hundreds of millions of gcds before j even reaches 10).
>

  Yes, good points, and I agree that there are advantages to
using M^2 - j^2. I was trying to make things as simple as
possible.

> So it's slow for that reason, but apparently for another too. Here's
"the
> standard test" output after picking 1000 pairs of 15-bit primes at
random (I
> gave a precise account of what that means in an earlier post), and
using
> this algorithm to factor their product:
>
> nbits 15: 1000 of 1000 factored, 100%
> gcds: actual 23,150,579 expected random 12,122,784 no factor 0
>
> So all of them factored, which you expected. "actual" is the total
number
> of gcds it actually computed (and would be twice this if it
normalized X
> first). "expected random" is the expected (mean) number of gcds that
would
> have been needed to factor the same products by a method that
repeatedly
> picks k purely at random from 1..M-1 until 1 < gcd(k, M) < M. "no
factor"
> is the number of gcds wasted on products that failed to factor (of
which
> there are none here, so that's 0).
>
> It's not surprising that it's doing worse than random-gcd, as
measured by
> number of gcds required. It's possibly surprising that it's doing
worse in
> this respect than JSH's latest specified algorithm, but not if you've
tested
> a lot of these things <wink>. I can slash the number of gcds needed,
and
> speed it by a factor > 3, just by iterating over all the ways to
split T^6
> (instead of plain T) as f*g. But that leads to a much bigger
surprise!
> Like so:
>
> nbits 15: 1000 of 1000 factored, 100%
> gcds: actual 7,765,039 expected random 12,139,989 no factor 0
>

  Hmm. That is a little surprising.

  The 100% factored part is NOT surprising. Here's why. Say
M = p * q, where p and q are prime. Let j = p - 1. Thus
T = M - p + 1. Hence let f = M - p + 1 and g = 1. Then
f - g = M - p + 1 - 1 = M - p, which factors as p*(q - 1), hence
gcd(f - g, M) = p.

  So can one do better with T = M^2 - j^2? Certainly one can do
no worse since M - 1 divides M^2 - 1.

> Congratulations! This is the first "surrogate factoring" algorithm
I've
> tested that performs better than random-gcd (well, it's still much
slower
> than random-gcd, but time isn't the measure here). I suggest we
christen it
> the Baron-Harris method <LOL>.

  I was hoping to patent it. Oh well. Anyway, your using it with T^6
makes it Baron-Harris-Peters.

> I don't have time to think about "why" now,
> but it sure was a surprise to me -- all I had in _mind_ by moving to
T^6 was
> to increase the quality of the pseudorandom number generator "f-g".
>
> Looks like that holds for products of 20-bit primes too (these take a
_lot_
> longer to run, so I stopped after 75 -- out of time for tonight):
>
> nbits 20: 75 of 75 factored, 100%
> gcds: actual 13,320,947 expected random 28,033,396 no factor 0
>
> Just one example from that run, as a sanity check:
>
> M = 533237 * 755309 = 402758705233
> j = 23
> T = M-j = 402758705210 = 2 5 19 4931 429889
> T^6 = obvious
> f = 501359048354607032677714075483363995707002203040
> g = 8513747362502192556250
> check for yourself that T^6 = f*g (it does)
> gcd(f-g, M) = 533237 bingo
> number of gcds tried = 156,337
>
> Sanity checked.
>

  Amazing! And reassuring, regarding the sanity.

> > What's the point?
> >
> > The point is, I have modified and greatly simplified your
> > central idea - and at the same time, greatly generalized it -
> > and on these two simple little examples - the only ones
> > I have tried - it works. It might work on lots of other
> > examples. If it doesn't, I can try changing the function
> > that defines X. I can add parameters, like your A, y, and z.
> > Your central theme, surrogate factoring of the number
> > T = M - j, is still there, and it might even explain why
> > it works (if it does).
>
> Not to mention that Baron-Harris probably always finds a factor,

  See above.

> and seemingly needs fewer gcds than poke-and-hope.
>
> > After all, the factors of M ought to be related somehow to the
factors
> > of T.
>
> There doesn't currently seem to be a good reason to believe that's
the case,
> right?

  I don't see any.

> For example, if you could deduce the factors of M from those of M-j
> or M+j efficiently, you could pick up a pile of RSA challenge checks
> tomorrow. Here's a start: if you know all the primes dividing N-1
or N+1,
> you don't have to bother checking whether they divide N <wink -- but
all the
> results I've read about this are as trivial and unexploitable as that
one>.
>
> > In the case of RSA numbers, in general you expect T to be easier
> > to factor than M.
>
> Yup. The other half is that if M _could_ be factored from a
factorization
> of T efficiently, it's more than a little puzzling that nobody yet
has found
> a way to do that. Well, maybe tomorrow's Baron-Harris-Decker-Filho
method
> will finally crack that nut.

  Baron-Harris-Peters-Decker-Filho. I'm not holding my breath.

  Nora B.



Relevant Pages

  • Re: SF: Back to theory
    ... >> This is, certainly, surrogate factoring. ... > gcds to try. ... It's possibly surprising that it's doing ... > tested that performs better than random-gcd (well, ...
    (sci.math)
  • Re: SF: Back to theory
    ... gcds to try. ... much more often in this method, and factoring is more expensive than doing ... It's not surprising that it's doing worse than random-gcd, ...
    (sci.crypt)
  • Re: SF: Back to theory
    ... gcds to try. ... much more often in this method, and factoring is more expensive than doing ... It's not surprising that it's doing worse than random-gcd, ...
    (sci.math)
  • Re: Proof factoring solution is closed form
    ... In an RSA-class composite, M is so large compared to p+q this is ... JSH) has reported details about so far, the probability of factoring by pure ... rather than # of gcds, poking at random would have been _much_ more ... In other words, in this special case, the method succeeds at exactly the ...
    (sci.crypt)
  • Re: Proof factoring solution is closed form
    ... In an RSA-class composite, M is so large compared to p+q this is ... JSH) has reported details about so far, the probability of factoring by pure ... rather than # of gcds, poking at random would have been _much_ more ... In other words, in this special case, the method succeeds at exactly the ...
    (sci.math)

Loading