RSS Reader in 10 lines of code

James Robertson jarober at
Tue Oct 4 13:08:31 UTC 2005

It's even worse than that, Markus.  Feeds often have character encoding
problems and illegal characters to deal with.  The bottom line: you can't
use a fully strict parser and expect to handle real syndicated content.
I've done a fair bit of work in this area in BottomFeeder...
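
To illustrate the point about strict parsers, here is a minimal sketch in
Python (not BottomFeeder's actual code, which is Smalltalk and not shown
in this thread).  It tries a strict parse first and, if that fails, falls
back to a cleaned-up copy of the input.  The function name
parse_feed_leniently and the choice to assume UTF-8 are assumptions made
for the example only.

```python
import re
import xml.etree.ElementTree as ET

# Control characters that XML 1.0 forbids (everything below 0x20 except
# tab, newline, and carriage return).
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def parse_feed_leniently(raw: bytes) -> ET.Element:
    """Try a strict parse; on failure, clean the input and retry once."""
    try:
        return ET.fromstring(raw)
    except ET.ParseError:
        pass
    # Assumption for this sketch: treat the feed as UTF-8 and replace any
    # bytes that do not decode, rather than rejecting the whole feed.
    text = raw.decode("utf-8", errors="replace")
    # Drop the control characters a strict parser refuses to accept.
    text = _ILLEGAL_XML_CHARS.sub("", text)
    return ET.fromstring(text)


if __name__ == "__main__":
    broken = b"<rss><channel><title>Bad\x01 title</title></channel></rss>"
    root = parse_feed_leniently(broken)
    print(root.find("channel/title").text)
```

This only covers two of the failure modes mentioned above (bad bytes and
illegal characters).  Feeds that are not well formed in other ways, such
as unclosed tags, would still raise ParseError on the second attempt.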


>It's even worse: inside the <description> there cannot be any HTML,
>as RSS is not a superset of HTML.
>RSS readers are very forgiving (nobody checks the DTD, and they even
>tolerate non-well-formed XML in many cases).
>But in general, to make this really work, the HTML inside the RSS
>needs to be encoded as CDATA:
><![CDATA[ now the html ]]>
>    Marcus
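
The CDATA approach described in the quoted message can be sketched as
follows, again in Python and only as an illustration.  The helper name
description_element is hypothetical.  The sketch shows that raw HTML
inside <description> breaks a strict XML parser, while the same HTML
wrapped in a CDATA section comes back unchanged as character data.

```python
import xml.etree.ElementTree as ET


def description_element(html: str) -> str:
    """Wrap HTML in a CDATA section inside a <description> element."""
    # "]]>" cannot appear inside a CDATA section, so split any occurrence
    # across two adjacent sections.
    safe = html.replace("]]>", "]]]]><![CDATA[>")
    return "<description><![CDATA[" + safe + "]]></description>"


html = "<p>now the <b>html</b><br>second line</p>"

# Raw HTML is not well-formed XML here (<br> is never closed), so a
# strict parser rejects it.
try:
    ET.fromstring("<description>" + html + "</description>")
    raw_html_parsed = True
except ET.ParseError:
    raw_html_parsed = False

# Wrapped in CDATA, the markup is plain character data to the parser.
element = ET.fromstring(description_element(html))
recovered_html = element.text

if __name__ == "__main__":
    print(raw_html_parsed)
    print(recovered_html == html)
```

A reader would then hand the recovered text to an HTML renderer, which
is where the forgiving parsing has to happen anyway.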

<Talk Small and Carry a Big Class Library>
James Robertson, Product Manager, Cincom Smalltalk

More information about the Squeak-dev mailing list