Re: Factoring and rationals
From: Tim Peters (tim.one_at_comcast.net)
Date: 04/18/05
- Next message: Nora Baron: "Re: Talking Rationally About Surrogate Factoring"
- Previous message: Joseph Ashwood: "Re: md5 checksum"
- In reply to: Robert Maas, see http://tinyurl.com/uh3t: "Re: Factoring and rationals"
- Next in thread: Bryan Olson: "Re: Factoring and rationals"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Date: Sun, 17 Apr 2005 22:36:14 -0400
[Tim Peters]
>> do you think Dirichlet was deluded in some way when he proved that
>> 6/pi^2 is the probability that two integers chosen at random are
>> coprime?
[Robert Maas]
> No, I think if he did what you say (I am not going to spend the next
> half hour doing a Google search to verify that you got the facts
> correct, I'll just trust you for sake of argument)
It's a famous result from the 1800s, and there are enough unusual words
(like Dirichlet and coprime) that Googling for it is easier than you fear.
If you have Knuth's TAoCP handy, there are both a heuristic "plausibility
argument" and a formal proof in Volume 2, in the "Greatest Common Divisor"
section.
Concretely, write a little program using the best approximation to "picking
integers at random uniformly" you can dream up (you'll have to suspend
disbelief for the duration, trying your best despite that you know it's
impossible to succeed), and _compute_ the empirical probability that the
pairs you so pick are coprime. Don't be too surprised if you get a result
close to 6/pi^2 ...
> he carefully defined what he meant by that probability, and then
> proved that such probability defined in that way was indeed 6/pi^2.
Yes -- although I would have guessed you would be able to guess _how_
"probability" was defined here based on the rest of my msg.
>> In number theory, a claim that this-or-that probability holds for an
>> integer "chosen at random" has a conventional, rigorous meaning: you
>> consider the claim applied to an appropriate finite set of integers
>> (1 through N is most common), and compute the probability wrt the
>> uniform distribution on that finite set. Then you look at what
>> happens as N (the size of appropriate sets) goes to infinity. If a
>> limit (in the ordinary sense of the word) exists, then "the
>> probability for an integer chosen at random" is, by agreement, that
>> limit.
> That's no good, because that method covers only the positive integers.
Yes, number theory is predominantly concerned with the naturals, in which
case "1 through N is most common" and also "appropriate". I qualified both,
because "1 through N" isn't always used, nor is it always appropriate.
Of _course_ "1 through N" is an inappropriate choice if you're interested in
negative integers too. Then pick something more appropriate, like -N
through N. BTW, it's also conventional in number theory that "pick an
integer at random" refers to the set of naturals; if Z is intended and an
important distinction, "Z" is used instead. Mentally translate to "pick a
natural at random" if you like.
> By that method you could conclude that 100% of the integers are greater
> than zero.
You could conclude that the probability that an "integer chosen at random"
is greater than 0 is 1.0, where from context "integer" means "natural
number" and "probability" means the limit of [etc].
> So in this case I don't trust what you wrote because it's clearly
> wrong. Try correcting your mistake and I'll consider your revised
> definition.
Suit yourself. I'm not here to win debating points <wink>.
> Note that in general, you define an arbitrary enumeration of the
> non-finite set of interest, which works only if the set is countable in
> the first place, then you apply the limit process you outlined above.
> (Of course you must have an enumeration of the *whole* set, not just
> part of it as you mistakenly did.) But note that this method is
> specific to a given specified enumeration, and wouldn't apply to other
> enumerations that are possible or even plausable or even rational or
> even equally reasonable as the given enumeration. In the case of
> positive integers, the interval [1..n] where n gets larger seems
> reasonable as a "canonical" enumeration, but it's not the only
> enumeration, not by a long shot.
See the word "convention" in my original msg. It explains what people
generally mean, in the absence of further qualification, when they talk
about the probability of something obtaining when picking "an integer at
random". Conventions ease communication. If you feel that a convention is
inappropriate for some reason in some context, that's fine, spell out the
assumptions you need to make. I was explaining "the default" convention.
> If you wanted to enumerate all the integers, not just the positive
> integers, there are several equally plausable enumerations to choose
> from. For example, start with 0, then alternate 1 -1 2 -2 etc. Or go
> the other way 0 -1 1 -2 2 etc. Or go back and forth symmetrically: 0 -1
> 1 2 -2 -3 3 4 -4 etc. Or do them in blocks by number of bits in the
> magnitude, alternating entire blocks: 0 1 -1 2 3 -2 -3 4 5 6 7 -4 -5 -6
> -7 etc. Except for that last one, I believe all of them would produce
> the same result whenever any one of them converges.
Sorry, I don't understand why you're talking about enumerations.
Probabilities are generally defined over the elements of a set, order
doesn't matter in sets, and I didn't use the word "set" in my original msg
capriciously. If you're looking at finite sets of integers {-N, 1-N,
... -1, 0, 1, ..., N-1, N} parameterized by N, it just doesn't matter which
of the (2N+1)! ways of listing its elements you like -- it's the same _set_
regardless, and the probability of some property obtaining when picking an
element uniformly at random from the set is also the same regardless.
> When you try to enumerate the rationals, however, there's no particular
> method that is rightly canonical,
Note that I didn't say anything about "rightly" -- a convention is just a
convention. A given convention may or may not be useful for a specific
purpose. The convention used in number theory is usually useful for
number-theoretic investigations -- and of course that's why it became a
convention to begin with.
That said, I'm not aware of a common convention for defining probability
over rationals "chosen at random", and it looks messier to me too.
> i.e. no way to choose among the various likely candidates, and I
> don't believe the equally-good choices give the same limit even where
> they all give some limit, but I'm not sure about that. Note that the
> Subject of this thread is something about rationals, not integers, not
> positive integers.
While my reply, to which you're replying here, was not. It's Usenet -- live
with it <wink>.
> Note from the original poster:
You must be going all the way back to JSH here, right (the author of the
post to which Bruce Stephens replied, to which latter you replied)?
Attributions are appreciated, just to keep things a little straighter.
[reads like JSH]
>>> There are an infinite number of factorizations into two factors that
?>> are rational for j^2 T.
> I.e. he's talking about an infinite set of rationals.
Hey, I can be pedantic too <wink>: no, he's talking about a subset of RxR
there. But to make progress, yes, he's really talking about R-{0} (although
he's rarely clear about that).
>>> Now considering those infinities, you can break them up into what we
>>> call trivial and non-trivial factorizations easily enough,
> I.e. he's splitting a set of rationals into two subsets.
Yes, but JSH is notorious for not defining his idiosyncratic terminology --
it's not clear _how_ this split is defined. I usually guess that he means a
rational i/j is expressed in canonical form (i and j are coprime, j >= 1,
and 0 is represented uniquely by 0/1), and that i/j "is a non-trivial
factor" of natural M iff 1 < gcd(i, M) < M. He doesn't have the vocabulary
to say what he means precisely, so all comers have to fill in the gaps
somehow. The way I filled a gap there, j turns out to be irrelevant to the
split. I find it wholly reasonable then to assume that the numerator (i) is
equally likely to fall into any of the M congruence classes (mod M). Then
since gcd(i, M) = gcd(i mod M, M), computing probabilities is
straightforward.
A key reason for my finding that "wholly reasonable" is that the
probabilities so computed were a close match to empirically observed
probabilities in computer experiments in earlier rounds of this general
"factoring and rationals" topic. The denominators of rationals were
explictly ignored in those rounds, and the empirical results were a good
match to what was predicted by computing probabilities based on the usual
convention applied to numerators alone. Of course the usual convention is
especially well suited to computer experiments that in fact do pick
candidates from finite (albeit "large") sets.
>>> One poster made a telling reply where he claimed that there were more
>>> trivial factorizations than non-trivial ones.
> See, there's the problem: What particular enumeration of this infinite
> set of rationals is canonical or just taken for granted when estimating
> which fraction of the totality is in one subset or the other? If one
> set is finite then it's clear, 100% are in the infinite set and 0% are
> in the finite set. But if both are countably infinite, what??
I'm still not clear on what enumerations have to do with this. Sets yes,
enumerations no. In any case, I agree that lots of what JSH writes is
unclear, and it's really his job to explain what he means. Right now, he's
stuck on the argument that because the two sets have the same cardinality,
"the probability" of landing in either is 0.5. He hasn't defined what he
means by probability here, and I believe he is (unlike you) wholly unaware
of the pitfalls that abound here.
If he ever specifies an _algorithm_ for factoring (which is putatively the
real goal), then order of enumeration becomes important, and fuzzy arugments
about infinite sets can (probably) go away. In the meantime, that the
conventional meaning of probability over N says:
the probability a random integer is divisible by a prime p is 1/p
and not:
because the set of integers divisible by a prime p has the same
cardinality as the set of integers not divisible by p, the
probability that a random integer is divisible by p is 50%
may be suggestive <wink>.
- Next message: Nora Baron: "Re: Talking Rationally About Surrogate Factoring"
- Previous message: Joseph Ashwood: "Re: md5 checksum"
- In reply to: Robert Maas, see http://tinyurl.com/uh3t: "Re: Factoring and rationals"
- Next in thread: Bryan Olson: "Re: Factoring and rationals"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Relevant Pages
|