Re: can anyone help me with the calculation of statistical probability?
- From: "Jane Jian" <dd34e@xxxxxxxxx>
- Date: Tue, 18 Mar 2008 14:31:15 -0500
<flame.dawn@xxxxxxxxx> wrote in message
news:6d354577-a5ce-4f75-84f0-d48e21c3140d@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Here is the question. This concerns a claim of plagiarism. There are
two indexes of a similar text numbering about 750,000 words. The first
index has 27,740 terms in it, while the second index has 3,500 terms
in it. The authors of the first index claim that the authors of the
second plagiarized their index, but it turns out the indexes are
mostly different, and only a few terms are similar. Can anyone
calculate what the random similarity would be, i.e., if we assume that
there was no plagiarism and that index 1 (27740 terms) and index 2
(3500 terms) were independently derived, what would be the probability
that some of the terms would still be identical if the text to which
the indexes refer is 80%-90% similar?
The authors of the first index would have to prove the plagiarism to a jury.
You are missing the boundary condition: what is the subject field? Is it
narrow (say, blue plants in Alaska), or is it wide open like a dictionary?
If such a field had only 30,000 terms in it, then index 1 already covers
nearly all of them, so heavy overlap is expected even if both indexes are
independent.
If such a field had 3,000,000 terms, chance overlap would be small, so a large
number of shared terms might mean the second index borrowed too much from the
first.
There are more conditions too, such as whether the entries match word for word.
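
To put rough numbers on the two cases, here is a minimal sketch in Python. It assumes (and this is a big simplification, not something given in the question) that each index is a uniform random sample, without replacement, from one common pool of candidate terms. Under that assumption the number of shared terms follows a hypergeometric distribution. The function names and the pool sizes 30,000 and 3,000,000 are only for illustration.

```python
def expected_overlap(pool_size, index1_terms, index2_terms):
    """Mean number of shared terms if both indexes are drawn
    uniformly at random from the same pool (hypergeometric mean)."""
    return index2_terms * index1_terms / pool_size


def prob_no_overlap(pool_size, index1_terms, index2_terms):
    """Probability that the two indexes share no term at all,
    under the same uniform-sampling assumption."""
    outside = pool_size - index1_terms  # terms not used by index 1
    if index2_terms > outside:
        return 0.0  # index 2 cannot fit entirely outside index 1
    p = 1.0
    for i in range(index2_terms):
        # Each term of index 2 must land outside index 1.
        p *= (outside - i) / (pool_size - i)
    return p


if __name__ == "__main__":
    for pool in (30_000, 3_000_000):  # hypothetical field sizes
        mean = expected_overlap(pool, 27_740, 3_500)
        p0 = prob_no_overlap(pool, 27_740, 3_500)
        print(f"pool={pool}: expected shared terms ~ {mean:.1f}, "
              f"P(no shared terms) = {p0:.3g}")
```

With a pool of 30,000 terms the expected chance overlap is about 3,236 of the 3,500 terms; with 3,000,000 it is about 32. Real index terms are not equally likely, so treat these only as an order-of-magnitude guide.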