spamassassin-users January 2011 archive
spamassassin-users: Re: Training Bayes on outbound mail

From: David B Funk <dbfunk_at_nospam>
Date: Fri Jan 28 2011 - 19:24:50 GMT
To: users@spamassassin.apache.org

On Fri, 28 Jan 2011, David F. Skoll wrote:

> On Fri, 28 Jan 2011 18:10:08 +0000
> Dominic Benson <dominic@lenny.cus.org> wrote:
>
> > Recently, in order to balance the ham/spam ratio given to sa-learn, I
> > have started to pass mail submitted by authenticated users to
> > sa-learn --ham.
>
> > I haven't seen any mention of this strategy on-list or on the web, so
> > I'm interested in whether (a) anyone else does this, and (b) is there
> > a good reason not to do it that I haven't thought of?
>
> It's possibly a good idea, but you want to be really careful of one
> thing: Make sure your users are savvy enough not to have their
> accounts phished. It'll take just one compromised account that blasts
> out a spam run to destroy the usefulness of your Bayes data.

Amen to that. It's sad how many supposedly educated people (say, engineering
professors ;) fall for phishes and get their accounts pwned. 419 spammers
love to target university systems: semi-clueless users and fat pipes.
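
One way to limit the damage from a compromised account is to stop training on any account whose send rate suddenly looks like a spam run. The following is only a minimal sketch of that idea, not something described in this thread: the class name `SendRateGuard`, its thresholds, and the sliding-window approach are all assumptions chosen for illustration. A caller would pipe a message to `sa-learn --ham` only when `should_train` returns True.

```python
from collections import defaultdict, deque


class SendRateGuard:
    """Hypothetical guard: skip Bayes training for accounts that send unusually fast."""

    def __init__(self, max_messages=50, window_seconds=3600):
        # Both thresholds are illustrative assumptions, not recommended values.
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._sent = defaultdict(deque)  # account -> timestamps of recent sends

    def should_train(self, account, now):
        """Record one send at time `now` (in seconds) and report whether to train on it."""
        recent = self._sent[account]
        recent.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while recent and recent[0] <= now - self.window_seconds:
            recent.popleft()
        return len(recent) <= self.max_messages


guard = SendRateGuard(max_messages=3, window_seconds=60)
decisions = [guard.should_train("alice", t) for t in (0, 1, 2, 3)]
```

Note that a guard like this only stops further training once the threshold is crossed; messages learned before that point would still have to be dealt with separately.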

One other semi-issue with that strategy: half of Bayes is based upon
header contents. Your outgoing messages are not going to have headers that
are representative of incoming messages, so the header tokens learned from
them may not help much when classifying incoming mail.
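
To see how different the headers can be, it may help to compare an inbound message with one captured at submission time. The sketch below uses Python's standard `email` package; the two sample messages and the helper name `header_profile` are made up for illustration, and which headers Bayes actually tokenizes is not something this sketch tries to model.

```python
from email import message_from_string


def header_profile(raw_message):
    """Summarize the headers of one message: relay hop count and header names present."""
    msg = message_from_string(raw_message)
    return {
        "received_hops": len(msg.get_all("Received", [])),
        "header_names": sorted({name.lower() for name in msg.keys()}),
    }


# Hypothetical inbound message: it has passed through two relays.
inbound = "\n".join([
    "Return-Path: <sender@example.org>",
    "Received: from mx.example.org by mail.example.com; Fri, 28 Jan 2011 18:00:00 +0000",
    "Received: from host.example.org by mx.example.org; Fri, 28 Jan 2011 17:59:58 +0000",
    "From: sender@example.org",
    "Subject: hello",
    "",
    "body text",
])

# Hypothetical outbound message as handed over by an authenticated user.
outbound = "\n".join([
    "From: user@example.com",
    "To: someone@example.net",
    "Subject: hello",
    "",
    "body text",
])

inbound_profile = header_profile(inbound)
outbound_profile = header_profile(outbound)
```

In this made-up pair, the inbound message carries relay and envelope headers that the outbound one lacks, which is the kind of mismatch described above.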

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{