[squeak-dev] Re: Extracting data from web pages using Squeak

cdrick cdrick65 at gmail.com
Mon Jun 16 16:34:03 UTC 2008


>
>
> Thanks for the hint.


No problem.

>
> Are there parsers available to get say table data into some kind of
> collection?


Not directly, as far as I know.

Have a look at XMLDOMParser addressBookXMLWithDTD, but that's for XML files...

Maybe just cut out the part of the stream you're interested in, e.g. from an
opening tag such as <td> up to its closing </td>. Something like:

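string := (HTTPClient httpGet: 'http://url.com') contents.
"Start just after the opening tag, if this is the first table.
 The literal match only works for a bare <table> tag without attributes."
a := (string indexOfSubCollection: '<table>') + '<table>' size.
"End just before the closing tag."
b := (string indexOfSubCollection: '</table>') - 1.
table := string copyFrom: a to: b.
...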

Then you work on that string to build your collection (#copyReplaceAll:with:
can help). It's quite hacky though ;) I'm sure there are better options,
but that's all I can think of right now.
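
To give an idea of the "work on the string" step, here is a rough sketch that
turns the extracted table text into a collection of rows, each row being a
collection of cell strings. It is only an illustration under some assumptions:
the variable names (tableString, rows, cells, ...) are made up, the sample
string stands in for whatever you cut out of the page, and it only copes with
bare lowercase <tr> and <td> tags (no attributes, no nested tables). It relies
on the stream messages #match: and #upToAll:, which should be available on
ReadStream in Squeak.

```smalltalk
"Sketch only: split the text between <table> and </table> into rows of cells.
 Assumes bare <tr>/<td> tags; tableString here is hypothetical sample data."
| tableString rows rowStream cellStream cells |
tableString := '<tr><td>a</td><td>b</td></tr><tr><td>c</td><td>d</td></tr>'.
rows := OrderedCollection new.
rowStream := ReadStream on: tableString.
"match: skips past the next <tr> and answers false when there is none left."
[rowStream match: '<tr>'] whileTrue: [
    "Take everything up to the closing </tr> as one row."
    cellStream := ReadStream on: (rowStream upToAll: '</tr>').
    cells := OrderedCollection new.
    [cellStream match: '<td>'] whileTrue: [
        cells add: (cellStream upToAll: '</td>')].
    rows add: cells].
rows
```

Evaluate it in a workspace with "inspect it" to look at the resulting rows.
Real pages usually have attributes on the tags and extra markup inside the
cells, so this would need adapting.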

Cédrick

>
>
> Lou
> -----------------------------------------------------------
> Louis LaBrunda
> Keystone Software Corp.
> SkypeMe callto://PhotonDemon
> mailto:Lou at Keystone-Software.com http://www.Keystone-Software.com
>
>
>