On Thu, 2011-12-15 at 10:57 -0500, firstname.lastname@example.org wrote:
> On 12/15, Martin Gregorie wrote:
> > The problem that needs addressing is that the ok_locales configuration
> > parameter doesn't work. This appears to be because it thinks the
> > sender's choice of (in Windows terms) the character translation code
> > page is a reliable indication of the sender's locale. I accept that this
> I'd argue that ok_locales is defined by the way it functions, which was
> dependent on the fact that at one time it was useful to differentiate
> languages by character set. And TextCat's functionality is basically
> exactly what you're looking for. So it would make less sense to redefine
> ok_locales, and more sense to fix TextCat.
In that case I'm missing some information: how to write a rule that can
interpret the value(s) returned by TextCat.
Why wouldn't it be sensible to rewrite ok_locales to compare TextCat
return value(s) against its list of OK codes?
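For what it's worth, the stock TextCat plugin already exposes something close to this: it provides an `ok_languages` option (distinct from `ok_locales`) and an eval rule, `check_language()`, that fires when the detected body language isn't on the list. A minimal local.cf sketch, based on the plugin's documentation (the score here is illustrative, not a recommendation):

```
loadplugin Mail::SpamAssassin::Plugin::TextCat

ok_languages    en
body            UNWANTED_LANGUAGE_BODY  eval:check_language()
describe        UNWANTED_LANGUAGE_BODY  Message written in an undesired language
score           UNWANTED_LANGUAGE_BODY  2.0
```

So the rule-writing half of the question may already be answered; the open part is whether `ok_locales` itself should be rewired on top of TextCat's result.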
> I don't think your comment will help either way. Cyrillic character sets
> aren't hard to find, and all the devs are aware of the problem.
Then why has ok_locales not been fixed already? This is not a criticism,
just a request for information. Is it something that's difficult to do
efficiently? I'd imagine that language recognition by looking at codepoint
values is possible, but neither necessarily fast nor unambiguous.
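To illustrate the ambiguity: a purely illustrative sketch (Python, not how TextCat actually works) of codepoint-range detection. It can tell you the *script* cheaply, but every language written in that script produces the same answer, which is why n-gram methods like TextCat exist:

```python
def dominant_script(text):
    """Guess the dominant script of `text` by Unicode codepoint ranges."""
    counts = {"latin": 0, "cyrillic": 0, "other": 0}
    for ch in text:
        cp = ord(ch)
        if 0x0041 <= cp <= 0x024F:      # Basic Latin through Latin Extended-B
            counts["latin"] += 1
        elif 0x0400 <= cp <= 0x04FF:    # Cyrillic block
            counts["cyrillic"] += 1
        elif ch.isalpha():
            counts["other"] += 1
    return max(counts, key=counts.get)

print(dominant_script("Hello, how are you?"))   # latin
print(dominant_script("Привет, как дела?"))     # cyrillic (Russian)
print(dominant_script("Добрий день, як справи?"))  # cyrillic (Ukrainian)
# Russian, Ukrainian, Bulgarian, Serbian... all collapse to "cyrillic",
# so the script tells you nothing about which locale to allow or block.
```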
> If, on the other hand, you want to fix TextCat, or otherwise implement a
> solution to the problem, and attach a patch to a bugzilla comment, that
> would be awesome.
I've no time ATM, and in any case I'm a middling-to-poor Perl coder. Now,
if SA were written in C or Java...