https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6400
--- Comment #8 from Darxus <Darxus@ChaosReigns.com> 2011-11-28 19:39:46 UTC ---
(In reply to comment #7)
> I'd recommend the entire MSPIKE kit and kaboodle. I'm running with these
> scores and recommend them:
Really? The ranks of the _WL rule and its components are kind of bad. And I'm
concerned that including all the components of the _BL rule will cause the
rescorer to behave suboptimally with our relatively limited corpora. Huh,
although effectively it looks like it's just two components, _L4 and _L5, since
the other two are empty, so that's probably fine. But if I did have a vote I
certainly wouldn't vote against using the score set you recommended. I'm just
curious what your reasoning is.

MSECS    SPAM%     HAM%    S/O  RANK  SCORE  NAME WHO/AGE
    0  74.6530   0.0067  1.000  0.99   0.00  T_RCVD_IN_MSPIKE_BL
    0   0.0251   7.0172  0.004  0.83   0.00  T_RCVD_IN_MSPIKE_WL
    0   0.8738        0  1.000  0.79   0.00  T_RCVD_IN_MSPIKE_ZBI

Components of _BL:
    0        0        0  0.500  0.48   0.00  T_RCVD_IN_MSPIKE_L2
    0        0        0  0.500  0.48   0.00  T_RCVD_IN_MSPIKE_L3
    0  48.1830   0.0059  1.000  0.98   0.00  T_RCVD_IN_MSPIKE_L4
    0  25.5962   0.0007  1.000  0.97   0.00  T_RCVD_IN_MSPIKE_L5

Components of _WL:
    0   0.1684  13.3764  0.012  0.72   0.00  T_RCVD_IN_MSPIKE_H2
    0   0.0241   6.9795  0.003  0.84   0.00  T_RCVD_IN_MSPIKE_H3
    0   0.0010   0.0355  0.029  0.50   0.00  T_RCVD_IN_MSPIKE_H4
    0        0   0.0022  0.000  0.48   0.00  T_RCVD_IN_MSPIKE_H5

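
For anyone reading the S/O column: as far as I understand it, S/O is SPAM% / (SPAM% + HAM%), with 0.500 shown when a rule hits nothing at all. A minimal sketch of that, assuming this definition (the function name is mine, not something from the rule-QA tooling):

```python
def s_over_o(spam_pct, ham_pct):
    """Return S/O = SPAM% / (SPAM% + HAM%).

    Assumption: a rule with no hits at all is reported as 0.5,
    which matches the _L2 and _L3 rows above.
    """
    total = spam_pct + ham_pct
    if total == 0:
        return 0.5
    return spam_pct / total


# Values copied from the table above.
print(round(s_over_o(0.0251, 7.0172), 3))   # T_RCVD_IN_MSPIKE_WL
print(round(s_over_o(48.1830, 0.0059), 3))  # T_RCVD_IN_MSPIKE_L4
```

Rows with very few hits (such as _H4) may not reproduce exactly from the printed percentages, since those are already rounded to four places.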
Somebody should create a graph, with number of randomly sampled emails from the
corpora on one axis, and accuracy rate on the other axis. Get some actual
numbers related to how much email we need for what accuracy.
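
A rough sketch of how the data for such a graph could be produced. Big assumptions: the corpus is available as a list of (is_spam, rule_hit) pairs, and "accuracy" is approximated by how far a rule's S/O on a random sample drifts from its S/O on the full corpus. That is only a stand-in for rescorer accuracy, and all the names here are hypothetical.

```python
import random


def s_over_o(messages):
    """S/O for one rule over (is_spam, rule_hit) pairs; 0.5 if no hits."""
    spam = [hit for is_spam, hit in messages if is_spam]
    ham = [hit for is_spam, hit in messages if not is_spam]
    spam_rate = sum(spam) / len(spam) if spam else 0.0
    ham_rate = sum(ham) / len(ham) if ham else 0.0
    total = spam_rate + ham_rate
    return spam_rate / total if total else 0.5


def mean_error_by_sample_size(corpus, sizes, trials=200, seed=0):
    """Map each sample size to the mean absolute S/O error vs. the full corpus."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    reference = s_over_o(corpus)
    result = {}
    for n in sizes:
        errors = [
            abs(s_over_o(rng.sample(corpus, n)) - reference)
            for _ in range(trials)
        ]
        result[n] = sum(errors) / trials
    return result
```

The returned mapping (sample size to mean error) is what would go on the two axes. Sample sizes must not exceed the corpus size, since the sampling is done without replacement.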