[Seaside-dev] 3.1 repo?

Philippe Marschall philippe.marschall at gmail.com
Sat Oct 23 17:47:57 UTC 2010


2010/10/5 Philippe Marschall <philippe.marschall at gmail.com>:
> 2010/10/5 Julian Fitzell <jfitzell at gmail.com>:
>> I've seen a couple of commits come in to the 3.0 repo... should we be
>> using the 3.1 repo now or...?
>
> I believe so but:
>  - When that happens all deprecated methods need to go
>  - We need to decide whether we want to change codec semantics
> (they're currently in 3.1)

Bump, let me summarize what that would probably mean:
- GRCodec >> #encode: would take a String and answer a ByteArray
- GRCodec >> #decode: would take a ByteArray and answer a String

The advantage of that would be a better, clearer separation of
byte oriented data (ByteArray) and character oriented data (String).
This would potentially help to catch encoding errors earlier.
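
To illustrate the proposed semantics, here is a minimal sketch in Python
(not Smalltalk), whose str/bytes split is analogous to String/ByteArray.
The class name StrictCodec is hypothetical and not part of Grease; it only
shows how a strict encode/decode pair would reject the wrong kind of data
at the call site instead of letting it through.

```python
class StrictCodec:
    """Hypothetical stand-in for a codec with the proposed semantics."""

    def __init__(self, encoding):
        self.encoding = encoding

    def encode(self, text):
        # Character-oriented data in, byte-oriented data out.
        if not isinstance(text, str):
            raise TypeError("encode expects character data (str)")
        return text.encode(self.encoding)

    def decode(self, data):
        # Byte-oriented data in, character-oriented data out.
        if not isinstance(data, (bytes, bytearray)):
            raise TypeError("decode expects byte data (bytes)")
        return bytes(data).decode(self.encoding)


codec = StrictCodec("utf-8")
encoded = codec.encode("Zürich")   # bytes
decoded = codec.decode(encoded)    # str again
```

Encoding already encoded data (codec.encode(encoded)) raises a TypeError
here, which is the kind of early failure the stricter semantics aim for.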

Here's a guess at what the implications would be for the individual
dialects:

Pharo:
All TextConverter (WAPharoGenericCodec) based converters would stop
working (anything but UTF-8 and ISO-8859-1). If needed, they could be
rewritten.

VW:
AFAIK this would work better with their existing infrastructure.

GemStone:
Would probably have to change their UTF-8 code. AFAIK some of it is
primitive based so that might get a bit awkward.

VAST:
AFAIK they're not supporting UTF-8 anyway so it shouldn't be that big
of a change.

GST:
dunno

Dolphin:
dunno

There's still a place left where there's a mix of byte and character
oriented data and that's URLs. Encoding works like this:
 1. encode using the URL codec
 2. escape URL unsafe characters
 3. escape HTML unsafe characters
 4. encode with the page encoding
The problem is that step 1 will now deliver a ByteArray (which is
technically correct) but steps 2 and 3 expect a String. Step 2 could be
changed to work with a ByteArray but step 3 can't. So we still need a
way to go from ByteArray to String without an encoding.
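
As a rough sketch of one possible way around this, again in Python rather
than Smalltalk: if step 2 consumes bytes and emits only percent-escaped
ASCII, its result can become a string without choosing an encoding, and
steps 3 and 4 continue as before. The function name encode_url_for_page
and the set of characters left unescaped are assumptions made for the
example, not Seaside's actual behaviour.

```python
import html
from urllib.parse import quote_from_bytes


def encode_url_for_page(text, url_encoding="utf-8", page_encoding="utf-8"):
    # 1. Encode using the URL codec: characters -> bytes.
    raw = text.encode(url_encoding)
    # 2. Escape URL-unsafe bytes. The percent-escaped result is pure ASCII,
    #    so it can be handed on as a string without picking an encoding.
    escaped = quote_from_bytes(raw, safe="/?&=")
    # 3. Escape HTML-unsafe characters (operates on a string).
    html_safe = html.escape(escaped, quote=True)
    # 4. Encode with the page encoding: characters -> bytes.
    return html_safe.encode(page_encoding)


result = encode_url_for_page("/search?q=Zürich&x=1")
```

Here only step 2 crosses from byte data to character data, and it does so
through escaping rather than through a character encoding.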

Decoding, OTOH, should be quite straightforward.

Guessed performance implications on Pharo:
We'll likely lose performance for the ISO-8859-1 and UTF-8 encodings
because we'll no longer be able to write from one collection to
another without conversion. We could theoretically get around this if
we write a custom WriteStream that doesn't check the collection class.

Depending on the solution chosen for URL encoding we might end up
losing our URL encoding shortcut (#includesUnsafeUrlCharacter:) which
is likely going to degrade performance on link intensive pages. The
corresponding primitive could easily be written for ByteArray but I'm
not sure we want to go that way.
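
For illustration only, the logic of such a shortcut over byte data could
look like the Python sketch below. The names includes_unsafe_url_byte and
escape_url_bytes are hypothetical, and the safe set (the RFC 3986
unreserved characters) is an assumption. In Smalltalk the scan would be a
primitive; the plain loop here only shows the control flow.

```python
# Bytes that may pass through unescaped (RFC 3986 unreserved characters).
SAFE_BYTES = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def includes_unsafe_url_byte(data):
    # Cheap scan: is there anything that needs escaping at all?
    return any(b not in SAFE_BYTES for b in data)


def escape_url_bytes(data):
    # Fast path: nothing to escape, so the bytes are plain ASCII already.
    if not includes_unsafe_url_byte(data):
        return data.decode("ascii")
    # Slow path: percent-escape every byte outside the safe set.
    return "".join(
        chr(b) if b in SAFE_BYTES else "%%%02X" % b for b in data
    )
```

Most links on a typical page would take the fast path, which is why losing
the shortcut would hurt on link-intensive pages.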

Cheers
Philippe

