[imp] SpamAssassin

Jonathan Soong jon.soong at imvs.sa.gov.au
Tue Mar 2 22:06:36 PST 2004


I am using the 'Report as Spam' button to feed email to the Bayes 
classifier in SpamAssassin.

I found that headers were being added to my mail, so I wrote a quick 
script that strips the unnecessary headers from a Maildir full of 
'Report as Spam' messages, leaving them ready to be fed into sa-learn.

I have attached the script in case it is useful to anyone.

Regards

Jon
-------------- next part --------------
#!/usr/bin/perl
#
# iisjsoo
# 3 March 2004
#
# This script is used to clean up mail forwarded into a Maildir directory by Horde/IMP through
# the 'Report as Spam' link.
# After running this script you can then run 'sa-learn' on the directory to learn all the mail
# as spam or ham.
# This script can be safely run over the same directory again and again (useful if you're just
# going to let more mail accumulate).

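use strict;
use warnings;
use File::Find;
@ARGV = ('.') unless @ARGV;    # default to the current directory

# Accept exactly one directory; print usage and stop otherwise.
if (@ARGV != 1)
{
  print "Usage: $0 [directory-name]\n";
  exit 1;
}

sub process_file {
  # File::Find chdir()s into each directory, so $_ is the bare file name.
  my $filename = $_;
  my $outfile  = $filename . ".out";
  my $count    = 0;    # number of 'Return-Path' lines seen so far

  return unless -f $filename;

  # Three-argument open copes with odd characters in file names.
  open( my $in,  '<', $filename ) or die "could not open $filename: $!";
  open( my $out, '>', $outfile )  or die "could not open $outfile: $!";

  # Read into a lexical so that File::Find's $_ is not clobbered.
  while ( my $line = <$in> )
  {
    $count++ if $line =~ m/^Return-Path/;

    # Copy everything from the second 'Return-Path' line onwards,
    # dropping whatever came before it.
    print $out $line if $count >= 2;
  }
  close($in);
  close($out) or die "could not write $outfile: $!";

  if ($count == 2) # This means there were the two 'Return-Path's
  {
    print "Moving $outfile to $filename\n";
    rename( $outfile, $filename ) or die "could not rename $outfile: $!";
  }
  else
  {
    print "Did not receive two Return-Paths: $filename\n";
    unlink($outfile) or warn "could not remove $outfile: $!";
  }
}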

find(\&process_file, @ARGV);
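
The stripping rule the script applies to each file can be tried on an in-memory message before running it over a real Maildir. The following is only a sketch: `strip_wrapper` is a hypothetical helper that is not part of the attached script, and the sample message is invented, so the real layout of mail forwarded by IMP may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the attached script): returns the text from the
# second 'Return-Path' line onwards, or undef unless exactly two are found.
sub strip_wrapper {
  my ($text) = @_;
  my $count = 0;
  my $kept  = '';
  for my $line ( split /^/m, $text ) {    # split into lines, keeping newlines
    $count++ if $line =~ /^Return-Path/;
    $kept .= $line if $count >= 2;
  }
  return $count == 2 ? $kept : undef;
}

# Invented sample: added headers first, then the reported message.
my $sample = join '',
  "Return-Path: <user\@example.org>\n",
  "Subject: Fwd: spam report\n",
  "\n",
  "Return-Path: <spammer\@example.com>\n",
  "Subject: Buy now\n",
  "\n",
  "Message body\n";

my $stripped = strip_wrapper($sample);
print defined $stripped ? $stripped : "left unchanged\n";
```

Passing the stripped text through `strip_wrapper` a second time returns undef, because only one 'Return-Path' line remains. This mirrors why the script leaves already-cleaned files alone when it is run over the same directory again.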

