RSS Reader in 10 lines of code
James Robertson
jarober at gmail.com
Tue Oct 4 13:08:31 UTC 2005
It's even worse than that, Markus. There are often character encoding
issues and illegal characters to deal with. The bottom line: you can't
use a fully strict parser and expect to handle syndicated content. I've
done a fair bit of work in this area in BottomFeeder...
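
To make the point concrete, here is a rough sketch in Python (not what BottomFeeder actually does, which is Smalltalk) of the "try strict, then clean up and retry" approach. The function name parse_feed_leniently, the Latin-1 fallback, and the choice to simply delete illegal control characters are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Control characters that XML 1.0 forbids (everything below 0x20
# except tab, line feed, and carriage return).
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def parse_feed_leniently(raw: bytes) -> ET.Element:
    """Hypothetical helper: strict parse first, then sanitize and retry."""
    try:
        return ET.fromstring(raw)
    except ET.ParseError:
        pass

    # Assumption: feeds that are not valid UTF-8 are treated as Latin-1,
    # which can decode any byte sequence.
    try:
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        text = raw.decode("latin-1")

    # Drop the XML declaration so a wrong encoding label is not applied again.
    text = re.sub(r"^\s*<\?xml[^>]*\?>", "", text)
    # Assumption: illegal characters are deleted rather than replaced.
    text = _ILLEGAL_XML_CHARS.sub("", text)

    # This can still raise ET.ParseError if the markup itself is broken.
    return ET.fromstring(text)


# A feed with a Latin-1 byte and a stray BEL character, both of which
# make a strict parser give up.
raw_feed = b"<rss><channel><title>Caf\xe9 \x07news</title></channel></rss>"
title = parse_feed_leniently(raw_feed).find("channel/title").text
```

This only covers encoding problems and illegal characters. Feeds whose tags are not well formed need a more forgiving parser than this sketch provides.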
<snip>
>It's even worse: inside the <description> there cannot be any HTML,
>as RSS is not a superset of HTML.
>RSS readers are very forgiving (nobody checks the DTD, and they even
>tolerate non-well-formed XML in many cases).
>
>But in general, to make this really work, the HTML inside the RSS
>needs to be encoded as CDATA:
>
><description>
><![CDATA[
>
>now the html
>
>]]>
></description>
>
> Marcus
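
For anyone who wants to try the CDATA approach described in the quoted text, here is a small Python sketch. The helper name cdata and the sample HTML are made up for illustration. The one subtlety is that the HTML itself must not contain "]]>", so the sketch splits that sequence across two CDATA sections.

```python
import xml.etree.ElementTree as ET


def cdata(html: str) -> str:
    """Hypothetical helper: wrap HTML in CDATA, splitting any embedded ']]>'."""
    return "<![CDATA[" + html.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# Unclosed <br> is fine in HTML but would not be well-formed XML
# if it were placed directly inside <description>.
html = "<p>Fish &amp; chips<br>now the html</p>"

item = "<item><description>" + cdata(html) + "</description></item>"

# A strict XML parser accepts the item and hands back the HTML unchanged.
description = ET.fromstring(item).find("description").text
```

The feed still has to be well-formed XML outside the CDATA section, so this helps a publisher but does nothing for a reader facing someone else's broken feed.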
<Talk Small and Carry a Big Class Library>
James Robertson, Product Manager, Cincom Smalltalk
http://www.cincomsmalltalk.com/blog/blogView
More information about the Squeak-dev
mailing list