Can anyone help me calculate a statistical probability?



Here is the question. It concerns a claim of plagiarism. There are
two indexes to similar texts of about 750,000 words. The first index
has 27,740 terms in it, while the second index has 3,500 terms in it.
The authors of the first index claim that the authors of the second
plagiarized their index, but it turns out the indexes are mostly
different, and only a few terms are the same.

Can anyone calculate what the random similarity would be? That is, if
we assume that there was no plagiarism and that index 1 (27,740 terms)
and index 2 (3,500 terms) were derived independently, what is the
probability that some of the terms would still be identical, given
that the texts to which the indexes refer are 80%-90% similar?
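
One rough way to frame the calculation, offered only as a sketch: treat
both indexes as random draws from a common pool of candidate terms, so
that the overlap follows a hypergeometric model. The pool size is not
given anywhere in the question, so POOL below is a made-up placeholder,
and the function names and the way the 80%-90% text similarity enters
(as a simple scaling factor on the expected overlap) are assumptions of
this sketch. Real indexers do not pick terms uniformly at random, so
the numbers are at best a baseline.

```python
from math import lgamma, exp


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def expected_overlap(n1, n2, pool, shared=1.0):
    """Expected number of identical terms under the random-draw model.

    Assumption: a fraction `shared` of the candidate pool is common to
    both texts, and each index picks its terms uniformly at random.
    """
    return shared * n1 * n2 / pool


def prob_no_overlap(n1, n2, pool):
    """P(zero identical terms) when both indexes draw from one pool."""
    if n1 + n2 > pool:
        return 0.0  # the two indexes cannot both fit without overlapping
    return exp(log_comb(pool - n1, n2) - log_comb(pool, n2))


N1, N2 = 27_740, 3_500
POOL = 100_000  # hypothetical pool size, NOT taken from the question

for s in (0.8, 0.9):
    print(f"shared={s}: expected overlap ~ "
          f"{expected_overlap(N1, N2, POOL, s):.0f} terms")
print("P(no identical terms):", prob_no_overlap(N1, N2, POOL))
```

Under this model the answer depends almost entirely on the pool size,
which is the number that would have to be estimated from the texts
themselves before the observed overlap can be called large or small.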