[Seaside-dev] Encoding problems with binary vs ascii
Michael Lucas-Smith
mlucas-smith at cincom.com
Mon Jun 15 19:46:36 UTC 2009
Hi All,
I'm having some trouble with the new WACodec behavior.
The tests assume there'll be a #converter selector on the WACodec
subclass (testCodecLatin1). This seems a bit heavy-handed if you want
to let the platforms decide how to achieve their conversion.
Binary conversions are still an issue - take the following code:
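| codec binary encoder |
codec := WACodec forEncoding: 'utf-8'.
binary := self utf8String asByteArray.
"The inner stream is created over a String, i.e. non-binary..."
encoder := codec encoderFor: (WriteStream on: String new).
"...and only afterwards is the encoder told to become binary."
encoder binary.
encoder nextPutAll: binary.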
The encoder is initialized with a non-binary write stream, then it's
told to become binary. You can't do that: the encoder has no way of
knowing what's inside its inner stream, nor should it. If you intend to
put bytes into the stream, start it with a ByteArray.
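As a rough sketch of that suggestion, the same snippet can start from a
binary inner stream so that #binary is never sent. Whether a given
platform's encoder accepts a ByteArray-backed stream here is an
assumption, not something the tests confirm:

```smalltalk
| codec binary encoder |
codec := WACodec forEncoding: 'utf-8'.
binary := self utf8String asByteArray.
"Assumption: the inner stream is binary from the start, so there is no later switch."
encoder := codec encoderFor: (WriteStream on: ByteArray new).
encoder nextPutAll: binary.
```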
Likewise, if you're going to the effort of fixing up encoding issues at
this point, why not get rid of all senders of #binary completely?
From what I've understood, the API is "encoding in, encoding out", which
means you expect to go from strings to strings. This is okay, I guess,
except that I'd also like to be able to go only halfway: put strings
in and get bytes out. This would remove any unnecessary conversions
taking place.
I've heard plenty of times before that this can't be done because of
various levels of support, but you're already pushing the
boundaries of what can be done "out of the box" with WACodec, so why not
go the whole way and do it right? String<->ByteArray conversions only?
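To make the "strings in, bytes out" idea concrete, here is a minimal
sketch of what such an API could look like. The selectors
#encodeToBytes: and #decodeFromBytes: are names invented purely for
illustration; they are not taken from WACodec:

```smalltalk
| codec bytes string |
codec := WACodec forEncoding: 'utf-8'.
"Hypothetical selector: a String goes in, a ByteArray comes out."
bytes := codec encodeToBytes: 'some text'.
"Hypothetical selector: a ByteArray goes in, a String comes out."
string := codec decodeFromBytes: bytes.
```

With an interface shaped like this, nothing would need to send #binary,
since the direction of each conversion fixes the type on either side.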
Next, WACodec is expected to implement #name, which returns the name
that was used to create it. To clarify, the tests assume that that is
the behavior. If that's the expected behavior, there's no reason why the
subclasses of WACodec need to implement that particular behavior, as it
can never change.
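A minimal sketch of keeping that in the superclass, assuming WACodec
had an instance variable called name that is set once at creation time.
Both the variable and the #setName: initializer are hypothetical:

```smalltalk
"Hypothetical: called once by whatever creates the codec."
WACodec >> setName: aString
    name := aString

"Subclasses inherit this and never need to override it."
WACodec >> name
    ^ name
```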
Finally, I don't entirely understand the motivation of WACodec>>url.
As far as I know, there's no situation where URL-encoded strings encoded
as shift-jis are going to work. URL-encoding is just another codec in my
mind. What gives with this API?
Cheers,
Michael