[svlug] sa-learn

Rick Moen rick at linuxmafia.com
Thu Jan 22 10:35:18 PST 2015

Quoting Scott DuBois (rhcom.linux at gmail.com):

> Does anyone have an already well established spamassassin list built that I can
> plug into the EBLUG SA we installed last night?
> We would be most grateful. =)

Typically, one best populates the Bayesian database by feeding it mboxes
of known ham and spam.

# su - Debian-exim
Debian-exim at linuxmafia:~$ sa-learn --ham --mbox /tmp/ham
Debian-exim at linuxmafia:~$ sa-learn --spam --mbox /tmp/spam
Debian-exim at linuxmafia:~$ exit
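
A minimal sketch of a pre-flight check for the sa-learn runs above, offered
as an illustration rather than anything from the original post.  By default,
SpamAssassin does not use its Bayes rules until the database has learned at
least 200 spam and 200 ham messages (the bayes_min_spam_num and
bayes_min_ham_num settings), so it can help to count what is in each mbox
before training.  The function names count_mbox_messages and train_if_enough
are hypothetical helpers; only sa-learn itself is a real command.

```shell
#!/bin/sh
# Hypothetical helpers; only sa-learn itself comes from SpamAssassin.

# Count messages in an mbox by counting its "From " separator lines.
# Approximate: an unescaped body line starting with "From " is counted too.
count_mbox_messages() {
    # grep -c prints 0 but exits non-zero when nothing matches; keep the 0.
    grep -c '^From ' "$1" || true
}

# Train from an mbox only if it holds at least MIN messages (default 200).
# Usage: train_if_enough ham|spam /path/to/mbox [MIN]
train_if_enough() {
    kind=$1
    mbox=$2
    min=${3:-200}
    if [ ! -r "$mbox" ]; then
        echo "skip $kind: cannot read $mbox" >&2
        return 1
    fi
    n=$(count_mbox_messages "$mbox")
    if [ "$n" -lt "$min" ]; then
        echo "skip $kind: only $n messages in $mbox (want at least $min)" >&2
        return 1
    fi
    sa-learn "--$kind" --mbox "$mbox"
}
```

Run as the same user as above, this might be called as
'train_if_enough ham /tmp/ham' and 'train_if_enough spam /tmp/spam'.
Afterwards, 'sa-learn --dump magic' prints the nspam and nham counters, which
shows whether the database actually grew.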

I will caution you in advance against relying on autolearn.  The
spammers send out spam with large amounts of Project Gutenberg and
similar text to de-tune people's Bayesian classifiers.  You need to
ensure that you are feeding mbox files consisting of typical spam
without the public-domain-text chaff for the 'sa-learn --spam'
invocations, and clear non-spam for the 'sa-learn --ham' ones.
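
One way to act on the autolearn caution, sketched here as an assumption
rather than something from the original post: SpamAssassin's bayes_auto_learn
option defaults to 1 (on), and a line reading 'bayes_auto_learn 0' in
local.cf turns it off.  The helper below only reports what a given config
file says.  Its name, autolearn_setting, is hypothetical, and
/etc/spamassassin/local.cf is assumed as the usual Debian location.

```shell
#!/bin/sh
# Hypothetical helper: print the last bayes_auto_learn value set in a
# SpamAssassin config file, or 1 (the default) if the file never sets it.
# Lines starting with "#" are skipped naturally, since their first field
# is not "bayes_auto_learn".
autolearn_setting() {
    awk '$1 == "bayes_auto_learn" { v = $2 }
         END { print (v == "" ? 1 : v) }' "$1"
}
```

For example, 'autolearn_setting /etc/spamassassin/local.cf' printing 1 would
mean autolearn is still active for that file's settings.  Other files in the
same directory can also set the option, so this is a quick check, not a full
audit.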

A more-fundamental point:  Don't rely primarily on SA for your antispam.
SA, even when run daemonised as a system facility, is big and slow
(being in Perl).  Most of the heavy lifting is best done by Exim ACL
sets, as provided by J.P. Boggis's prepackaged 'Eximconfig' bundle.

See:  'Eximconfig' on http://linuxmafia.com/kb/Mail/

Unpacking the Eximconfig bundle into your system and then following
J.P.'s (Jonathan's) instructions gives you an intelligently designed
antispam system with a number of optional features worth considering
over time.  I would urge going slow on the optional features such as
greylisting:  I decided, in particular, that I would defer any feature
that would require making my MTA dependent on a running MySQL.

Here's one important tweak I suggested to J.P.  (Note that I do _not_
recommend doing what Jonathan says in the '>'-indented text.)

Date: Wed, 30 Nov 2005 22:55:15 -0800
From: Rick Moen <rick at linuxmafia.com>
To: Jonathan Boggis <jpb at jcdigita.com>
Subject: Re: Mailing list posts, and EximConfig

Hi, Jonathan.  Responding to your mail from last February:

Quoting Jonathan Boggis (jpb at jcdigita.com):

> Thanks for your comments.
> The problem is with SPF and forgery checks being performed on the
> header From: address.  These help reject the forgeries generated by
> spammers and viruses, but have the downside of potentially blocking
> messages sent via mailing lists.
> The easiest way to address this in the short term is simply to
> whitelist the mailing list's domain in accept/sender_domain (I.e:  Add
> lists.svlug.org to this.)  This will prevent the SPF and forgery
> checks on mail sent by this mailing list.  However, it will also
> disable spam checks, so if spam is sent to the mailing list, it won't
> be trapped by SA.
> In the long term, SPF and forgery checks probably need to be a little
> bit more configurable, i.e:  So it's possible to specify whether to do
> SPF and forgery checks on the envelope sender, header From: and SPF on
> return path (Not presently done in EximConfig, but should be.)

I found that the best remedy was to disable "spf_from_acl" in
/etc/exim4/eximconfig/config/spf.conf .  This bit:

  #    # Check header From:
  #    warn     set acl_m8  = ${address:$h_From:}
  #    deny     !acl        = spf_check
  #    warn     message     = Received-SPF-From: $acl_m8 ($acl_m7)
  #    accept

I've left the envelope-sender check, etc., intact:


      # Check envelope sender
      warn     set acl_m8  = $sender_address
      deny     !acl        = spf_check
      warn     message     = Received-SPF: $acl_m8 ($acl_m7)


      warn     set acl_m2  = ${readsocket{/tmp/spfd}\
                           helo=${if def:sender_helo_name\
                           \nsender=$acl_m8\n\n}{20s}{\n}{socket failure}}
      # Defer on socket error

      defer    condition   = ${if eq{$acl_m2}{socket failure}{yes}{no}}
               message     = Cannot connect to spfd

      # Prepare answer and get results

      warn     set acl_m2  = ${sg{$acl_m2}{\N=(.*)\n\N}{=\"\$1\" }}
               set acl_m8  = ${extract{result}{$acl_m2}{$value}{unknown}}
               set acl_m7  = ${extract{header_comment}{$acl_m2}{$value}{}}

      # Check for fail

      deny     condition   = ${if eq{$acl_m8}{fail}{yes}{no}}
               message     = ${extract{smtp_comment}{$acl_m2}{$value}{}}
               log_message = Not authorized by SPF


Unless I'm missing something, that really _is_ the only logical
solution, since SPF, by its creator's definition and intent, _is_ meant
to validate the envelope "From " and _not_ the header "From:".

I really would urge that you consider disabling "spf_from_acl"
in post-2.2 versions.
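
To illustrate the envelope-versus-header distinction, here is a rough sketch
(not part of EximConfig) that pulls the domain out of a named header.  A
mailing list normally re-sends a post with its own envelope sender, which the
receiving MTA records in Return-Path:, while leaving the poster's header
From: untouched.  The two domains therefore differ, and an SPF check against
header From: tests the list server's IP against the poster's domain.  The
function name header_domain and the addresses in the usage note are
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the domain of the first address found in the
# named header of a message read on stdin.  Sketch only: it ignores folded
# (multi-line) headers and anything clever in display names.
header_domain() {
    awk -v h="$1" '
        /^$/ { exit }    # a blank line ends the header block
        tolower(substr($0, 1, length(h) + 1)) == (tolower(h) ":") {
            if (match($0, /@[A-Za-z0-9.-]+/)) {
                print tolower(substr($0, RSTART + 1, RLENGTH - 1))
                exit
            }
        }'
}
```

Fed a list post whose headers include
'Return-Path: <list-bounces@lists.example.org>' and
'From: Some Poster <poster@example.com>', 'header_domain Return-Path' yields
the list's domain and 'header_domain From' yields the poster's.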

Cheers,                 "Due to circumstances beyond our control, we regret to
Rick Moen               inform you that circumstances are beyond our control."
rick at linuxmafia.com                                              --Paul Benoit
