[PWS] PWS only meant for Swiki?

Mark Guzdial guzdial at cc.gatech.edu
Tue May 4 15:54:08 UTC 1999


>I don't think either the Swiki file format, or the <> check in itself, are
>inherently slow.  It's just that the current implementations that deal
>with those things are slow:
>
>	1. The SwikiPage>>text routine does things similar to upTo: on a
>FileStream, which results in character-by-character processing.  One of
>these days, someone should fix all that stuff up on FileStream to use
>buffers, but it's not in place right now.  (And who has the time to work
>on it?).  In particular, I think reading in a Squeak string with the
>doubled ' marks is the biggest hit.

I think that Bolot's new PWS implementation is all stream-based rather than
passing strings around -- both for speed and to avoid memory hits.  (Try to
serve a 25M QuickTime Star Wars trailer in the current PWS :-)  That was
one of the tests Bolot ran against the current system.
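
To make the stream-versus-string point concrete, here is a minimal sketch in
Python rather than Squeak. It is not Bolot's code; the name copy_stream and
the 64K chunk size are assumptions chosen for illustration. The idea is that
the server never holds more than one chunk in memory, however large the file:

```python
import io


def copy_stream(source, sink, chunk_size=64 * 1024):
    """Copy source to sink one chunk at a time; return the byte count.

    Only chunk_size bytes are held in memory at once, so a 25M file
    costs no more memory than a 25K one.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty result means end of stream
            break
        sink.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    # In-memory stand-ins for a file on disk and a client socket.
    movie = io.BytesIO(b"x" * 200_000)
    client = io.BytesIO()
    print(copy_stream(movie, client, chunk_size=4096))
```

Building the whole response as one string first would need the full 25M (or
more, with copies) before the first byte goes out.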

>	2. The <> check currently works by doing an initial scan to find
>all ranges of <> pairs, and then checks at each line end whether the
>current text position falls within one of those ranges.  If there are 20
>HTML tags on this page, then this means going through 20 calls to
>between:and: and 20 block invocations AT EACH LINE END.  A better way is
>simply to keep a flag which reflects whether the current position is
>within a <> pair or not; seeing a < turns it on, and seeing a > turns it
>off; the check at each end of line then becomes extremely cheap.

Hmm, I just wrote a tiny-and-still-incomplete HTML tag scanner for my class
as a demonstration
(http://www.cc.gatech.edu/classes/cs2390_99_spring/slides/parse/outline.html).
Maybe I can modify that for this purpose. A hand-built scanner will
probably be faster than a regular expression system.
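
For what it's worth, the flag approach suggested in point 2 can be sketched
in a few lines. This is Python rather than Squeak, it is not the scanner
from my class, and the function name is made up for illustration. It also
assumes that every < opens a tag and every > closes one, which ignores
things like > inside attribute values:

```python
def line_ends_inside_tag(text):
    """Return one boolean per newline in text.

    True means that newline falls inside a <...> pair. A single flag
    is flipped on '<' and off on '>', so the check at each line end is
    just a read of that flag, with no range list to search.
    """
    inside = False
    results = []
    for ch in text:
        if ch == "<":
            inside = True
        elif ch == ">":
            inside = False
        elif ch == "\n":
            results.append(inside)
    return results


if __name__ == "__main__":
    page = "plain line\n<a href='x'\n  name='y'>link</a>\nend\n"
    print(line_ends_inside_tag(page))
```

The cost is one pass over the text, independent of how many tags are on the
page, instead of 20 between:and: calls at each line end.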

Mark

--------------------------
Mark Guzdial : Georgia Tech : College of Computing : Atlanta, GA 30332-0280
(404) 894-5618 : Fax (404) 894-0673 : guzdial at cc.gatech.edu
http://www.cc.gatech.edu/gvu/people/Faculty/Mark.Guzdial.html




