[Seaside-dev] speedup WAEncoder
Paolo Bonzini
bonzini at gnu.org
Sat Oct 3 10:22:55 UTC 2009
The attached changeset speeds up WAEncoder by noticing that all of its
subclasses are mostly outputting the input value for all characters, and
that a conversion from a character X to a character Y is never found.
With this change I got a 15% improvement on this test:
(s := ((1 to: 100000) collect: [ :e | Random between: 32 and: 126 ])
asByteArray asString) size
Time millisecondsToRun: [ 100 timesRepeat: [ ws := String new
writeStream. (SWAHtmlEncoder on: ws) nextPutAll: s ] ]
before: 5319 (best of 4 runs, restarting the VM each time)
after: 4606 (best of 4 runs, restarting the VM each time)
Consistent with this, I measured the cost per character of
WAEncoder>>nextPutAll: and the (new) WASimpleEncoder>>nextPutAll: as
respectively 24 and 21 bytecodes. This was measured with GNU
Smalltalk's profiler.
The reason is that on GST #notNil is much faster than #isString.
I could get even better speedups by using SequenceableCollection's
#at:ifAbsent: method. GST has it implemented as a primitive on
SequenceableCollection (with the failure code invoking the absentBlock),
which explains the reason for the speed. But that would be a speedup
only when all characters are in the 0-255 range, so I did not do that.
In general, the undisputed hotspot is WriteStream>>#nextPut: (15%),
which is heavily used under GST by both Swazoo and Seaside. I am
starting to think it was not such a bad idea to make it a primitive in
the Blue Book... :-)
While unportable, even a C-coded String>>#htmlEncoded (to be used by
String>>#encodeOn:) would not be a bad idea actually. 10% of execution
time is spent there, and while this would have a higher GC cost because
of possibly big strings returned by the C function, #nextPutAll: boils
down to a single memcpy so...
Paolo
-------------- next part --------------
SystemOrganization addCategory: #'Seaside-Core-Document'!
WAEncoder subclass: #WASimpleEncoder
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Seaside-Core-Document'!
!WASimpleEncoder commentStamp: '<historical>' prior: 0!
I encode everything that is written to myself using #nextPut: and #nextPutAll: onto the wrapped stream. I am optimized for the case of being neutral to most characters. The convertion of a single character is defined in #encode:on: on the class-side of my subclasses.!
!WASimpleEncoder class methodsFor: 'initialization'!
initializeBMP
"Initializes the BMP, the Basic Multilingual Plane of UTF characters, using the encoding strategy of the receiver. This caching strategy ensures that most commonly used characters can be encoded as efficient as possible."
| stream characterLimit ch |
characterLimit := self maximumCharacterValue.
table := Array new: characterLimit.
stream := WriteStream on: (String new: 6).
1 to: characterLimit
do: [:index |
ch := Character value: index - 1.
self encode: ch on: stream reset.
table at: index
put: (stream contents = (String with: ch)
ifTrue: [nil]
ifFalse: [stream contents])]
! !
!WASimpleEncoder methodsFor: 'accessing'!
nextPut: aCharacter
| value encoded |
value := aCharacter asInteger.
value < table size
ifFalse: [stream nextPut: aCharacter]
ifTrue:
[encoded := table at: value + 1.
encoded notNil
ifTrue: [stream nextPutAll: encoded]
ifFalse: [stream nextPut: aCharacter]]
!
nextPutAll: aString
"uses #to:do: for speed reasons (on Squeak)
this is not premature optimization, this is a hotspot method method
and #to:do: shows measurable speed improvements for rendering seaside pages"
| character value encoded |
1 to: aString size
do: [:index |
character := aString at: index.
value := character asInteger.
value < table size
ifFalse: [self class encode: character on: stream]
ifTrue:
[encoded := table at: value + 1.
encoded notNil
ifTrue: [stream nextPutAll: encoded]
ifFalse: [stream nextPut: character]]]
! !
WASimpleEncoder subclass: #WAHtmlEncoder
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Seaside-Core-Document'!
!WAHtmlEncoder class methodsFor: 'initialization'!
maximumCharacterValue
^255
! !
WASimpleEncoder subclass: #WAUrlEncoder
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'Seaside-Core-Document'!
!WAUrlEncoder class methodsFor: 'initialization'!
maximumCharacterValue
^255
! !
WAHtmlEncoder initialize!
WAUrlEncoder initialize!
More information about the seaside-dev
mailing list