[squeak-dev] Re: [Pharo-project] HTML parser (again)

Andrei Stebakov lispercat at gmail.com
Wed Aug 18 18:10:53 UTC 2010


As for Scamper, when I try to evaluate the following (in Pharo 1.1):
tok := HtmlTokenizer on: '<html />'.

There is an error:

Error: My subclass should have overridden #contents
HtmlTokenizer(Object)>>error:
HtmlTokenizer(Object)>>subclassResponsibility
HtmlTokenizer(Stream)>>contents
HtmlTokenizer(Stream)>>printOn:
[] in HtmlTokenizer(Object)>>printStringLimitedTo:
String class(SequenceableCollection class)>>streamContents:limitedTo:
HtmlTokenizer(Object)>>printStringLimitedTo:
HtmlTokenizer(Object)>>printString
TextMorphForShoutEditor(ParagraphEditor)>>printIt
[] in TextMorphForShoutEditor(ParagraphEditor)>>printIt:
TextMorphForShoutEditor(ParagraphEditor)>>terminateAndInitializeAround:
TextMorphForShoutEditor(ParagraphEditor)>>printIt:
TextMorphForShoutEditor(ParagraphEditor)>>dispatchOnKeyEvent:with:
TextMorphForShoutEditor(TextMorphEditor)>>dispatchOnKeyEvent:with:
TextMorphForShoutEditor(ParagraphEditor)>>keystroke:
TextMorphForShoutEditor(TextMorphEditor)>>keystroke:
[] in [] in TextMorphForShout(TextMorph)>>keyStroke:
TextMorphForShout(TextMorph)>>handleInteraction:
TextMorphForShout(TextMorphForEditView)>>handleInteraction:
[] in TextMorphForShout(TextMorph)>>keyStroke:
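
Reading the trace, the error appears to come from print-it rather than from
creating the tokenizer: printIt sends printString, which reaches
Stream>>printOn: and then Stream>>contents, and HtmlTokenizer does not
override #contents. A minimal sketch of a workaround, assuming that reading
of the trace is right, is to print something other than the stream itself
(the temporary variable name tok is only illustrative):

```smalltalk
| tok |
"Creating the tokenizer is not the step that fails in the trace above."
tok := HtmlTokenizer on: '<html />'.
"Print the class name instead of the tokenizer itself, so that
 Stream>>printOn: (and therefore #contents) is never sent to it."
tok class name
```

Evaluating the assignment with do-it instead of print-it should avoid the
error in the same way, since do-it does not ask for a printString.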



On Wed, Aug 18, 2010 at 2:34 AM, laurent laffont
<laurent.laffont at gmail.com> wrote:
>
>
> On Wed, Aug 18, 2010 at 7:50 AM, Andrei Stebakov <lispercat at gmail.com>
> wrote:
>>
>> I've been looking for a nice and fast HTML parser.
>> I've found Zulq Alam's Soup
>> (http://www.squeaksource.com/@vHckXt8_6gVtXFxy/XMrjDbIs). It looks nice,
>> but it's way too slow for me: it takes 5 sec to parse the page, while my
>> current Lisp parser takes about 1 sec for the same page.
>> I found another one, Todd Blanchard's HTML and CSS parser
>> (http://www.squeaksource.com/@iMgHmTKVxU00wEdz/A0jkqk71) but I
>> couldn't load it into Pharo 1.1 or Squeak 4.1.
>> It complains about a syntax error and leaves a progress bar that
>> I can't kill...
>> I wonder if anyone (Todd?) can take a look at the parser and figure
>> out how to fix it?
>>
>> What other options do I have for an HTML parser?
>> Looking at Pharo's speed, I wonder whether there is any way to optimize it. Is
>> a JIT or some other speed optimization planned for Pharo/Squeak?
>
>
> What do you need to do?
> There's XMLSupport: http://www.squeaksource.com/XMLSupport.html
> Scamper might have a standalone HTML parser:
> http://www.squeaksource.com/Scamper.html
> The CogVM has JIT.
> Laurent.
>
>>
>> Thank you,
>> Andrei
>>
>> _______________________________________________
>> Pharo-project mailing list
>> Pharo-project at lists.gforge.inria.fr
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>
>
>
>
>


