spamassassin-users: Re: Training Bayes on outbound mail

From: David B Funk <dbfunk_at_nospam>
Date: Fri Jan 28 2011 - 19:24:50 GMT

On Fri, 28 Jan 2011, David F. Skoll wrote:

> On Fri, 28 Jan 2011 18:10:08 +0000
> Dominic Benson wrote:
> > Recently, in order to balance the ham/spam ratio given to sa-learn, I
> > have started to pass mail submitted by authenticated users to
> > sa-learn --ham.
> > I haven't seen any mention of this strategy on-list or on the web, so
> > I'm interested in whether (a) anyone else does this, and (b) is there
> > a good reason not to do it that I haven't thought of?
> It's possibly a good idea, but you want to be really careful of one
> thing: Make sure your users are savvy enough not to have their
> accounts phished. It'll take just one compromised account that blasts
> out a spam run to destroy the usefulness of your Bayes data.

Amen to that. Sad how many supposedly educated people (say engineering
professors ;) fall for phishes and get their accounts pwned. 419 spammers
love to target university systems, semi-clueless users and fat pipes.

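One other semi-issue with that strategy: half of Bayes is based upon
header contents, and your outgoing messages are not going to have headers
that are representative of incoming messages. So the header tokens learned
from them may not carry over well to the inbound mail you actually score.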
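
To make the header mismatch concrete, here is a rough sketch (Python,
standard library only) that compares the header field names and Received
hop counts of one outbound submission and one inbound delivery. Both
sample messages, the example.org/example.net hosts, and the
header_profile() helper are made up for illustration; this is not how
SpamAssassin tokenizes headers, just a way to see how different the two
header sets tend to look.

```python
from email import message_from_string

# Hypothetical message as submitted by an authenticated user (invented sample).
OUTBOUND = """\
Received: from [192.0.2.10] (authenticated) by mail.example.org; Fri, 28 Jan 2011 18:00:00 +0000
From: user@example.org
To: friend@example.net
Subject: lunch
Message-ID: <1@example.org>

See you at noon.
"""

# Hypothetical message as delivered from the outside (invented sample).
INBOUND = """\
Return-Path: <friend@example.net>
Received: from mx.example.net ([198.51.100.7]) by mail.example.org; Fri, 28 Jan 2011 18:05:00 +0000
Received: from client.example.net by mx.example.net; Fri, 28 Jan 2011 18:04:58 +0000
Received-SPF: pass
DKIM-Signature: v=1; a=rsa-sha256; d=example.net
From: friend@example.net
To: user@example.org
Subject: Re: lunch
Message-ID: <2@example.net>

Noon works.
"""


def header_profile(raw):
    """Return (set of lowercased header names, number of Received hops)."""
    msg = message_from_string(raw)
    names = {name.lower() for name in msg.keys()}
    hops = len(msg.get_all("Received", []))
    return names, hops


out_names, out_hops = header_profile(OUTBOUND)
in_names, in_hops = header_profile(INBOUND)

# Header fields that only ever show up on the inbound side.
only_inbound = sorted(in_names - out_names)

if __name__ == "__main__":
    print("Received hops, outbound vs inbound:", out_hops, in_hops)
    print("Headers seen only on inbound mail:", only_inbound)
```

Running something like this over a real sample of your own outbound and
inbound mail would give a feel for how much of the header side of the ham
corpus is actually comparable.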

