mailman email harvester

From: Bernhard Kuemel
Date: 02/07/05

    Date: Mon, 07 Feb 2005 23:48:44 +0100


    Tons of email addresses on Mailman mailing lists are vulnerable to
    collection by spammers.

    They are "protected" by obfuscation (the "@" is replaced by " at "),
    and access to the subscriber list can be restricted to subscribers.
    The obfuscation is trivially reversed, and harvester scripts can
    subscribe to gain access to restricted lists.

    I suggested a graphical Turing test that would bar scripts, but the
    Mailman developers argued that spammers might hire a couple of temps
    to solve the test, as has already happened for the creation of
    email accounts. The only solution would be not to make the desired
    information available at all. This is already possible by restricting
    access to the member list to the list administrator.
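
    As an illustration of that admin-only restriction, here is a minimal
    sketch of a Mailman 2.1 `config_list` input file (such files are plain
    Python assignments). The attribute names and values are given as I
    understand them for Mailman 2.1 and should be checked against your
    installation; the file and list names in the note below are
    placeholders.

```python
# Sketch of a Mailman 2.1 config_list input file (assumed attribute names;
# verify against the output of "config_list -o" on your own installation).

# Who may view the subscriber roster:
#   0 = anyone, 1 = list members only, 2 = list administrator only
private_roster = 2

# Display addresses in obscured form ("user at host").
# This alone is not real protection, since it is trivially reversed.
obscure_addresses = 1
```

    Assuming the file is saved as roster.cfg and the list is called
    mylist (both hypothetical), it would be applied with something like
    `bin/config_list -i roster.cfg mylist` from the Mailman installation
    directory.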

    However, many lists still either publish the member list openly or
    make it available to all list members. To raise awareness of this
    issue I wrote a script that collects addresses from openly
    accessible lists. It stops after processing 1000 search results
    (the maximum allowed) from Google and collects 76772 email
    addresses (61124 unique). It is attached as mmxp1.

    An improved version that collects addresses that are restricted to
    subscribers, processes more lists and works more parallelized is

    Bye, Bernhard


    #!/usr/bin/perl -w

    #2.1.4 "current archive" "private list which" mailman/listinfo site:org

    for ($i=0;1;$i+=10) {
            $google=`wget -qO - -U 'any browser' '$i'`;
    # print $google;
            @urls=($google=~m*<p class=g><a href=(http://\S+?)>*g);
    # print join("\n",@urls);
            if ($#urls==-1) {last;}
    # print "\naoeu $#urls\n";
            foreach $url (@urls) {
                    print STDERR "$url...\n";
                    $roster=`lynx -connect_timeout=10 -dump $url`;
            # print $roster;
                    @mails=$roster=~/^ +\* \(?\[\d+\](.* at .*?)\)?$/mgo;
                    foreach $mail (@mails) {
                            $mail=~s/ at /@/;
                            print "$mail\n";
                    } #foreach mail
                    $n+=@mails;   # running total of addresses printed so far
                    print STDERR "mails=".($#mails+1).", total=$n, url=$url, google=$i\n";
    # exit;
            } #foreach url

    } #while google
