spamassassin-users: Re: Using ZMI_GERMAN ruleset

From: Michael Monnerie <michael.monnerie_at_nospam>
Date: Wed Dec 14 2011 - 20:50:39 GMT

On Tuesday, 13 December 2011 Axb wrote:
> patterns with >120 characters are not really efficient, in terms of
> speed and hit rate. They are very specific to certain campaigns and
> minimal template changes will render them useless as in:
> body __ZMIde_STOCK34 /Wir sind .{0,2}berzeugt, dass der
> Zeitpunkt sich an einem Unternehmen zu beteiligen, welches
> erfolgreich im Edelmetallhandel t.{0,2}tig ist, nicht besser sein
> k.{0,2}nnte/
> or
> body __ZMIde_SALE5 /In den letzten 5 Jahren hatte ich .{0,2}ber
> drei dutzend gut funktionierende Strategien, um die Zahl meiner
> Webseitenbesucher drastisch zu erh.{0,2}hen und dadurch meinen
> Umsatz anzukurbeln/

Since they still get hits, there is no need to change them. Once I get a
report that a sentence has been modified, I apply the change to the
rules. I need feedback for that, of course.
They should also still be cheap in terms of CPU: if (for rule
__ZMIde_SALE5) there is no "In den l" in the message, the regex
shouldn't have to search very far, right? At least I'd guess it's an
optimized search that compares in 64-bit steps, i.e. 8 chars at once?
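
To make the reasoning above concrete, here is a minimal Python sketch. It
does not show how Perl's regex engine actually scans, and it says nothing
about 64-bit steps; it only illustrates that a message without the literal
start "In den l" can never match, and that a small template change defeats
the long pattern. The pattern text is taken from the quoted rule, assuming
the line breaks in the mail are plain spaces in the rule file. The sample
messages and the name sale5_hits are made up for illustration.

```python
import re

# Pattern text from the quoted __ZMIde_SALE5 rule. Assumption: the line
# breaks in the mail are single spaces in the real rule file.
SALE5 = re.compile(
    r"In den letzten 5 Jahren hatte ich .{0,2}ber drei dutzend gut "
    r"funktionierende Strategien, um die Zahl meiner Webseitenbesucher "
    r"drastisch zu erh.{0,2}hen und dadurch meinen Umsatz anzukurbeln"
)

# The pattern starts with a fixed literal, so any match must contain it.
PREFIX = "In den l"


def sale5_hits(text):
    """Return True if text matches the rule pattern (hypothetical helper)."""
    # Cheap substring test first: without the literal start, no match is
    # possible and the full pattern never has to be tried.
    if PREFIX not in text:
        return False
    return SALE5.search(text) is not None


# Made-up sample messages.
spam = (
    "Hallo! In den letzten 5 Jahren hatte ich über drei dutzend gut "
    "funktionierende Strategien, um die Zahl meiner Webseitenbesucher "
    "drastisch zu erhöhen und dadurch meinen Umsatz anzukurbeln."
)
# A minimal template change: "5" written out as a word.
modified = spam.replace("letzten 5 Jahren", "letzten fünf Jahren")
ham = "Danke für die Rechnung, ich überweise den Betrag morgen."

for label, text in (("spam", spam), ("modified", modified), ("ham", ham)):
    print(label, sale5_hits(text))
```

The modified message still contains the literal start, so it reaches the
full pattern and fails there; that is the case where the rule needs an
update from a report.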

Look at the "Krankenkassa" ruleset, which has been very active these
last few weeks. The senders keep modifying their text; I get reports and
modify the rules accordingly.

And not to forget: long sentences mean the chance of a false positive
drops.
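
A small Python sketch of the false-positive point, under stated
assumptions: the long pattern is the quoted __ZMIde_STOCK34 text (line
breaks assumed to be spaces), while the short pattern and both sample
messages are invented here for comparison and are not part of the ruleset.

```python
import re

# Long pattern: text from the quoted __ZMIde_STOCK34 rule.
LONG = re.compile(
    r"Wir sind .{0,2}berzeugt, dass der Zeitpunkt sich an einem "
    r"Unternehmen zu beteiligen, welches erfolgreich im Edelmetallhandel "
    r"t.{0,2}tig ist, nicht besser sein k.{0,2}nnte"
)
# Hypothetical short pattern: only the first three words of the sentence.
SHORT = re.compile(r"Wir sind .{0,2}berzeugt")

# Made-up messages: one spam sentence, one legitimate business mail.
spam = (
    "Wir sind überzeugt, dass der Zeitpunkt sich an einem "
    "Unternehmen zu beteiligen, welches erfolgreich im Edelmetallhandel "
    "tätig ist, nicht besser sein könnte."
)
ham = "Wir sind überzeugt, dass Ihnen unser neues Angebot gefallen wird."

for name, pattern in (("long", LONG), ("short", SHORT)):
    print(name,
          "spam:", pattern.search(spam) is not None,
          "ham:", pattern.search(ham) is not None)
```

Both patterns hit the spam sentence, but only the short one also hits the
legitimate mail.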

-- With kind regards, Michael Monnerie, Ing. BSc