Re: Surrogate factoring explained

From: Tim Peters (tim.one_at_comcast.net)
Date: 02/25/05


Date: Thu, 24 Feb 2005 22:11:55 -0500


[Tim Peters]
>> JSH sometimes claims that his implementations do much better than
>> that, but is so short on necessary detail it can't be replicated. AFAIK,
>> he has not specified the algorithm he actually uses these days (else it
>> would be easy to confirm or refute independently).

[JSH]
> That's just not true. I've given out the basic algorithm, which
> depends on factoring T only.
>
> If you can show it factors as badly as you claim then I'd find that
> information to be very important.

Sorry, it doesn't work that way: either specify a specific algorithm, or
point to a previous message that specifies the exact algorithm you would
like to see tested. There are so many variations now, and I can't read your
mind. If you point to one specific algorithm, I'll try to make time to test
that specific one.

> However, I'm back to thinking I should ignore you, as it seems to me
> your posting is just a deliberate attempt to guide people away.

Suit yourself. I'm not going to explain why I think that's a foolish thing
for you to say, but know that I believe it is (if you can't see the obvious
reasons for yourself, explanation is futile).

> What would be more useful from you, would be factoring percentages with
> the algorithm--which I have posted--that depends only on factoring T.

You've posted at least 3 different algorithms that iterate over splittings
of T^6. Be specific about exactly which one you have in mind and I'll try
to help. "A specific algorithm" includes a deterministic rule for how to
pick j. I don't care whether it makes T easy to factor (I already have code
to factor T, using other methods). If making T easy to factor is part of
the rule you actually use to pick a j, then you have to specify that part of
your rule too -- or nobody else _can_ test the rule you actually use.

> And, remember, surrogate factoring is a *concept* where the idea is to
> use some surrogate to factor a target, and the particulars are, what
> works works.

Obviously, I can't report factoring precentages for "a concept", I can only
report factoring percentage for a specific algorithm.

The kind of test I do, given parameter N:

    num_gcds_actually_used = 0
    num_gcds_for_random_gcd = 0
    num_trials = 0
    num_successes = 0
    repeat:
        p = pick a random N-bit prime
        q = pick a random N-bit prime
        if p == q:
            # JSH said that the square of a prime shouldn't be used
            continue
        num_trials += 1
        num_gcds_for_random_gcd += the expected number given p and q
        M = p*q
        if JSH_factor(M) found p or q:
            num_successes += 1
        num_gcds_actually_used += number of gcds JSH_factor(M) performed

I'll note that, in this kind of test, when JSH_factor(M) finds a factor, it
exits at once, without computing more gcds. So the total number of gcds
"charged against" JSH_factor(M) _can_ be very much smaller than the total
number of splittings. When JSH_factor(M) doesn't find a factor, then every
gcd it tried _is_ charged against it.

By construction, this never tries a case with a small factor. I don't know
how you test things, because you haven't explained that. If, for example,
you test random composites, then you'll almost certainly see a success rate
much higher than I'll see by trying cases where small factors never exist.

Since cracking RSA-class composites is your stated goal, the tests I run
only try products of two "large" primes (where "large" is parameterized by
N).

> If what I have now doesn't work, then I move on.
>
> If it does work at some level, but falls apart at another, then I want
> to know it, while the propaganda posting is just so old, and
> unnecessary.

What propaganda? I stand by every result I've posted from every experiment
I've run on every algorithm you've specified here. It is simply a fact that
the numbers generated by those experiments said that random-gcd does at
least as well, and probably better, than those specific algorithms, based on
comparing number of gcds tried before a factor is found.

As I've explained before (as others have too), when you moved to T^6 the
number of possible splittings grew so large in some cases that it became
very painful to run large-scale experiments: when it picks a j such that a
factor isn't found, and such that the number of possible splittings is
large, it can require hundreds of millions (or more) of bigint gcd
computations to determine that no factor is found. That can take a long
time. You see that too, right? And do remember that I've been using _all_
splittings of T (T^6, j^2 T^3, ...) from the start -- I never underestimated
how many of those there are, and my code has always tried all of them.

> From what I'm seeing it doesn't work as well as I need, so you don't
> have to work to convince people of what I don't mind saying myself!

I report on the results of my experiments, just as you report on the results
of yours. You can consider my experiments to be interesting or not, as you
please, but that's of no real consquence to me -- I'll report on what I find
either way. That means bad outcomes and good outcomes: whatever the
program computes is what I report.

If you think you have a better-performing algorithm now, what you need to do
is specify it. For example, I just saw this in a different msg of yours:

> For the sake of full disclosure I *do* force in factors of 2 by
> considering some rational f_1 and f_2, basically like
>
> 2 f_1 (f_2/2)
>
> as you have to balance out factors forced in that way, of course.

You never mentioned that in any of the algorithms you specified here before,
so (and *of course*) I've never tested anything like that. If you now think
that's important, spell it out. Right now, I have no idea what "basically
like" means, so couldn't even start to incorporate that.

Complain all you like, but this fact is obvious: people test the algorithms
you specify. If you don't like the results, specify different algorithms.
Reporting actual results isn't "propaganda", it's part of the _purpose_ of
testing.

All that said, yes, I don't see a mathematical reason for why your algorithm
would be likely to produce factors at better than chance rates. I've been
wholly open about that from day one. Sometimes I write about-- and
from --that perspective too, just as you write about-- and from --your
perspective. I make no apology for that. You can think of me as an object,
and that's fine by me, but don't imagine I'm obligated to think of myself
that way too. I have my own interests here. To the extent that testing
algorithms plays to my interests (I'd have no other reason to bother
<wink>), that can be useful to you. Your part in that, if you _want_
independent testing, is specifying the precise algorithm you want to see
tested. I'll say it again:

>> AFAIK, he has not specified the algorithm he actually uses these days
>> (else
>> it would be easy to confirm or refute independently).

That means every stinkin' detail, James, not just "the concept".



Relevant Pages

  • Re: FP-style map over set
    ... advice on how to implement an algorithm, you have to specify the data structure that you want the algorithm to operate on. ... Or are you also asking for advice on how to implement the set ADT? ...
    (comp.lang.functional)
  • Re: fmincon for empirical likelihood-type optimization
    ... now I cannot specify the upper and lower bound for the p's. ... Length of upper bounds is> length; ... When I suggested using the interior point algorithm, I forgot that you had equality constraints. ... Is this simply a convergence tolerance issue? ...
    (comp.soft-sys.matlab)
  • Re: Evolution by Computer: 1000aa in 1163 generations.
    ... reproduced Dawkins' "Methinks it is like a weasel" algorithm all over ... That's correct -- because you didn't specify a fitness function, ... That's an improved function in your toy model of evolution, ... targets sequences with unknown locations in sequence space. ...
    (talk.origins)
  • Re: Proof factoring solution is closed form
    ... > in a thread where I'm paying attention, as if you're right I'd like to ... and cycling through all possible ways of splitting the latter into ... Now you've steadfastly refused to specify an algorithm, ...
    (sci.crypt)
  • Re: Proof factoring solution is closed form
    ... > in a thread where I'm paying attention, as if you're right I'd like to ... and cycling through all possible ways of splitting the latter into ... Now you've steadfastly refused to specify an algorithm, ...
    (sci.math)