[Seaside] [BUG] in IAHtmlParser
Julian Fitzell
julian@beta4.com
Thu, 28 Mar 2002 22:38:05 -0800
Avi Bryant wrote:
> On Fri, 29 Mar 2002, Alain Fischer wrote:
>
>
>>Hi Avi,
>>
>>I have tried to inspect or explore each of the following lines:
>>
>>IAHtmlParser parse: '<span><table></table></span>'
>>IAHtmlParser parse: '<table><tr></tr></table>'
>
>
> According to the HTML 4.0 spec, <span> can only contain inline tags, and
> so having a <table> inside a <span> is not legal (<div> is the equivalent
> intended to contain block tags like <table>). This forces the <span> to
> close before the <table> tag.
>
> The HTML parser Julian wrote for Seaside is pretty strictly conformant -
> this lets it be smart about not requiring close tags everywhere, but it
> does mean that it can do somewhat surprising things with illegal markup.
> One way it could be improved would be to actually throw an error when a
> tag (like span) that requires a close tag doesn't get one (or, as in this
> case, apparently doesn't).
>
> I imagine this cost you some time, and I apologize - if you stick to
> conformant HTML4, you should be ok in the future.

Yeah, sorry about that. I wouldn't say the parser is strictly compliant,
but it ended up being necessary to make it somewhat compliant. The
reason is that if we allowed all the invalid cases that people use all
the time, we would essentially not be able to support some valid HTML
(even though that valid HTML is probably never used).

The problem is, frankly, that the HTML spec is insane! There are
ridiculous combinations of only allowing certain tags within others and
implicitly closing tags for you. This implicit closing is most of the
reason why I had to enforce some of the rules about what tags can be
contained inside others. Implicit closing is also why most people never
use </p> or </li>, allowing them to be closed implicitly by the next
non-inline tag (usually the next <p> or <li> in these cases).
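
To make the implicit-closing behaviour concrete, here is a minimal
sketch in Python (not the Squeak code that IAHtmlParser is written in,
and not its actual algorithm). The INLINE and BLOCK sets, must_close and
parse are all hypothetical names, and the content model is a tiny
assumption rather than the real HTML 4 tables. It only shows why a
<table> inside a <span> ends up as a sibling of the <span>.

```python
import re

# Hypothetical, heavily simplified content model (NOT the full HTML 4 rules).
INLINE = {"span", "em", "b", "a"}
BLOCK = {"p", "div", "table", "ul", "li"}


def must_close(open_tag, new_tag):
    """True if new_tag cannot live inside open_tag, so open_tag closes first."""
    if open_tag in INLINE or open_tag == "p":
        return new_tag in BLOCK          # inline elements and <p> hold no blocks
    return open_tag == "li" and new_tag == "li"  # a new <li> ends the previous one


def parse(html):
    """Return the top-level elements as nested (tag, children) tuples.

    Text and attributes are ignored; only the tag structure is tracked.
    """
    root = ("root", [])
    stack = [root]                       # currently open elements, innermost last
    for m in re.finditer(r"<(/?)([a-zA-Z]+)[^>]*>", html):
        closing, tag = m.group(1) == "/", m.group(2).lower()
        if closing:
            # Close back to the innermost matching open element; a close tag
            # whose element was already closed implicitly is simply ignored.
            if tag in [node[0] for node in stack[1:]]:
                while stack.pop()[0] != tag:
                    pass
            continue
        # Implicitly close anything that is not allowed to contain this tag.
        while len(stack) > 1 and must_close(stack[-1][0], tag):
            stack.pop()
        node = (tag, [])
        stack[-1][1].append(node)
        stack.append(node)
    return root[1]


print(parse("<span><table></table></span>"))
print(parse("<ul><li>a<li>b</ul>"))
```

Under these assumptions the first call prints [('span', []), ('table', [])]:
the <span> is closed before the <table> opens, and the later </span> has
nothing left to close. The second prints [('ul', [('li', []), ('li', [])])],
which is the "nobody writes </li>" case.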

But it sucks. XML is often overused, but in this case HTML so wants to
be XML anyway that I wish browser developers would hurry up and start
adding support for XHTML so I can start writing my webpages with it.

Again, sorry for the problems. It certainly isn't my goal to have a web
application server enforce the HTML spec for you (that should be the
browser's job), but unfortunately a loose spec and loose, loose, loose
browser implementations have made writing a parser rather difficult. I
don't want to build in complete knowledge of every way every tag
could be used. :( I tried to keep it loose where possible but... what
can I say?

Julian
--
julian@beta4.com
Beta4 Productions (http://www.beta4.com)