spamassassin-users January 2011 archive

Re: eval:html_tag_balance - short tags not accepted?

From: Per Jessen <per_at_nospam>
Date: Fri Jan 28 2011 - 07:43:41 GMT
To: users@spamassassin.apache.org

Lawrence @ Rogers wrote:

> On 27/01/2011 5:36 PM, Per Jessen wrote:
>>
>>> I believe that the behavior of HTML_TAG_BALANCE_HEAD is valid in
>>> this case, as <head/> is invalid HTML (despite what the validator
>>> says) and should not be used by anyone.
>>
>> True, but html_eval_tag() will fire on _any_ short tag.
>>
>>
>> /Per Jessen, Zürich
>>
>>
>
> If it's firing on <head/> with no content, that's completely valid
> don't you think? It's invalid HTML and contains no content.

Yes, I agree.

> Could you provide an example of a site using <div/> or <p/> shorthand
> tags? I've never seen them before anywhere.

I could point you to one of mine :-), but otherwise I don't know of any
offhand. I don't do much website programming, I'm more focused on
filtering out spam.

> Previously, my understanding has always been that shorthanded closing
> was only allowed for tags that didn't have a closing tag before (such
> as <meta>). The HTML recommendations support this.

In XML, they all have a closing tag though? XSLT will certainly complain
about a missing tag:

XSLT warning: Fatal Error at (file notify-email.html.xsl, line 54, column 51):
Expected end of tag 'meta' (notify-email.html.xsl, line 54, column 51)
SAXParseException: Expected end of tag 'meta' (notify-email.html.xsl, line 54, column 51)

(I asked for XHTML Transitional output).

> Perhaps there is further work to be done in SA regarding handling HTML
> balancing, but <head/> is pointless to test for as it has no reason or
> possible use in the real world.

Agree.

> If html_eval_tag() is firing on any short tag, and not just the
> invalid example code, that would signal a possible bug and
> investigation.

Actually no - whilst html_eval_tag() does fire on any short tag, you
have to specify which tag you want to check for. I hadn't noticed that
yesterday. The 3.2.5 ruleset only checks on body and head. Problem
solved. Thanks for the chat anyway.
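
For anyone reading this later, here is a rough Python model of why a
per-tag balance check trips on a short tag. It is only a sketch, not
SpamAssassin's actual Perl code: the class TagBalance and the function
tag_balance() are made-up names, and it assumes the parser treats <tag/>
as a plain opening tag with no matching close. As in the ruleset, the
caller has to name the tag to check.

```python
from collections import Counter
from html.parser import HTMLParser


class TagBalance(HTMLParser):
    """Hypothetical helper: tracks opens minus closes for each tag name."""

    def __init__(self):
        super().__init__()
        self.balance = Counter()

    def handle_starttag(self, tag, attrs):
        self.balance[tag] += 1

    def handle_endtag(self, tag):
        self.balance[tag] -= 1

    def handle_startendtag(self, tag, attrs):
        # Assumption: a non-XML-aware parser sees <head/> as an opening
        # <head> that is never closed, so only the "open" count goes up.
        self.handle_starttag(tag, attrs)


def tag_balance(html, tag):
    """Return opens minus closes for one named tag (0 means balanced)."""
    parser = TagBalance()
    parser.feed(html)
    parser.close()
    return parser.balance[tag]


if __name__ == "__main__":
    sample = "<html><head/><body><p/>hello</body></html>"
    for name in ("head", "body", "p"):
        print(name, tag_balance(sample, name))
```

With this model, <head/> and <p/> both come out unbalanced, but a check
that only asks about body and head never looks at the p count.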

/Per Jessen, Zürich