mailman email harvester

From: Bernhard Kuemel
Date: 02/07/05

  • Next message: morning_wood: "[Full-Disclosure] netdde during update"
    Date: Mon, 07 Feb 2005 23:48:44 +0100


    Tons of email addresses on Mailman mailing lists are vulnerable to
    collection by spammers.

    They are "protected" by obfuscation ("@" is replaced by " at "), and
    access to the subscriber list can be restricted to subscribers. The
    obfuscation is trivially reversed, and harvester scripts can
    subscribe to gain access to restricted lists.

    I suggested a graphical Turing test that would bar scripts, but the
    Mailman developers argued that spammers might hire a couple of temps
    to solve the test, as has already happened for the creation of
    email accounts. The only real solution is not to make the desired
    information available at all. This is already possible by
    restricting access to the member list to the list administrator.
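
    As a rough illustration of that last option, the fragment below
    restricts the roster to the list administrator. It is only a
    sketch and assumes Mailman 2.1, where list settings are plain
    Python assignments and private_roster takes 0 (anyone), 1 (list
    members) or 2 (list admin only); check the names and values against
    your own installation before relying on them.

```python
# roster.cfg (hypothetical file name): input for Mailman 2.1's config_list.
# Assumed values for private_roster: 0 = anyone, 1 = list members,
# 2 = list admin only.
private_roster = 2
```

    Such a file would typically be applied with something like
    "bin/config_list -i roster.cfg mylist", where "mylist" is a
    placeholder list name.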

    However, many lists still have the member list either openly
    published or available to all list members. To raise awareness of
    this issue I wrote a script that collects addresses from openly
    accessible lists. It stops after processing 1000 search results
    from Google (the maximum allowed) and collected 76772 email
    addresses (61124 unique). It is attached as mmxp1.

    An improved version that collects addresses restricted to
    subscribers, processes more lists, and runs with more parallelism is

    Bye, Bernhard


    #!/usr/bin/perl -w

    #2.1.4 "current archive" "private list which" mailman/listinfo site:org

    for ($i=0;1;$i+=10) {
            $google=`wget -qO - -U 'any browser' '$i'`;
    # print $google;
            @urls=($google=~m*<p class=g><a href=(http://\S+?)>*g);
    # print join("\n",@urls);
            if ($#urls==-1) {last;}
    # print "\naoeu $#urls\n";
            foreach $url (@urls) {
                    print STDERR "$url...\n";
                    $roster=`lynx -connect_timeout=10 -dump $url`;
            # print $roster;
                    @mails=$roster=~/^ +\* \(?\[\d+\](.* at .*?)\)?$/mgo;
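                    foreach $mail (@mails) {
                            $mail=~s/ at /@/;   # undo the " at " obfuscation
                            print "$mail\n";
                    } #foreach mail
                    $n+=@mails;                 # running total of addresses
                    $u++;                       # number of rosters processed
                    print STDERR "mails=".($#mails+1).", total=$n, url=$u, google=$i\n";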
    # exit;
            } #foreach url

    } #while google
