[ANN] HTML and CSS Parser on SqueakSource
Todd Blanchard
tblanchard at mac.com
Sat Apr 15 05:39:20 UTC 2006
I've released the underlying technology behind http://www.badpage.info
and placed it on squeaksource.
http://www.squeaksource.com/htmlcssparser
Project Description
This is an HTML and CSS parser and DOM that handles rotten HTML and
broken CSS quite well. I wrote it to provide validation of web pages
and it is the underlying technology behind http://www.badpage.info.
The tag nesting and attribute rules are determined by interpreting
the DTDs at the W3C. Hopefully this will make it fairly
future-proof. The CSS parser understands most of CSS 2 and some CSS 3 and
the CSS selectors can tell if they match a DOM node. There is no
visual rendering and no calculation of layout.
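
To illustrate what "selectors can tell if they match a DOM node" means, here is a minimal sketch in Python. It is not the package's API or its Smalltalk code. Every name in it (Node, LenientDOMBuilder, parse, matches, find_all) is hypothetical. It assumes a much simpler model than the real thing: nesting is recovered by walking up to the nearest open element rather than by interpreting the W3C DTDs, and only type, #id, .class and descendant selectors are handled.

```python
import re
from html.parser import HTMLParser


class Node:
    """A very small DOM element: tag name, attributes, parent, children."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []


class LenientDOMBuilder(HTMLParser):
    """Builds a tree from sloppy HTML without raising on bad nesting."""

    # Assumption: a hard-coded list of empty elements stands in for DTD rules.
    VOID = {"br", "img", "hr", "meta", "link", "input"}

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in self.VOID:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent


def parse(html):
    builder = LenientDOMBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


_COMPOUND = re.compile(r"([A-Za-z][\w-]*|\*)?((?:[.#][\w-]+)*)")


def _matches_compound(node, compound):
    """Match one compound selector such as 'p', '.note' or 'div#main.big'."""
    m = _COMPOUND.fullmatch(compound)
    if not compound or m is None:
        raise ValueError("unsupported selector: %r" % compound)
    tag, rest = m.group(1), m.group(2)
    if tag not in (None, "*") and tag.lower() != node.tag:
        return False
    classes = (node.attrs.get("class") or "").split()
    for kind, name in re.findall(r"([.#])([\w-]+)", rest):
        if kind == "#" and node.attrs.get("id") != name:
            return False
        if kind == "." and name not in classes:
            return False
    return True


def matches(node, selector):
    """True if node matches a selector made of descendant-combined compounds."""
    parts = selector.split()
    if not parts or not _matches_compound(node, parts[-1]):
        return False
    ancestor = node.parent
    for part in reversed(parts[:-1]):
        # Walk upward until some ancestor satisfies this part of the selector.
        while ancestor is not None and not _matches_compound(ancestor, part):
            ancestor = ancestor.parent
        if ancestor is None:
            return False
        ancestor = ancestor.parent
    return True


def find_all(node, selector):
    """Return matching descendants of node in document order."""
    found = []
    for child in node.children:
        if matches(child, selector):
            found.append(child)
        found.extend(find_all(child, selector))
    return found


if __name__ == "__main__":
    # Deliberately broken input: unclosed <p> tags and an unquoted attribute.
    doc = parse('<div id="main"><p class="note big">one<p>two<br></div>'
                '<span class=note>three</span>')
    print([n.tag for n in find_all(doc, ".note")])
```

Because this sketch has no DTD knowledge, the second unclosed <p> above ends up nested inside the first one. A DTD-driven parser such as the one described here would presumably close the first <p> instead.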
I hereby license it free for almost any use with the understanding
that it may not be used to provide website QA software or services
such as might compete with http://badpage.info.
Otherwise, do whatever you like with it. I think it would make a
dandy base for a real web browser. I also find it quite useful for
scraping web pages.
-----
SqueakMap is not presently responding to requests to send me a new
password and I can't remember my old one. When it regains its
senses, I'll put it up there as well.
-Todd Blanchard