[Seaside-dev] Seaside 3.0a6ish
Philippe Marschall
philippe.marschall at gmail.com
Fri May 21 19:21:04 UTC 2010
2010/5/20 Philippe Marschall <philippe.marschall at gmail.com>:
> 2010/5/19 Paolo Bonzini <bonzini at gnu.org>:
>> On 05/19/2010 06:58 PM, Michael Lucas-Smith wrote:
>>>
>>> Can someone speak to the platforms that have trouble with #= here?
>>
>> GNU Smalltalk has problems comparing an encoded string with #latin1String.
>> The problem is that the GRCodecTest>>#asString: method does not store the
>> encoding of the string in its result, so GNU Smalltalk assumes it is in the
>> default encoding (typically UTF-8). Then when "self latin1String" has to be
>> compared with an ISO-8859-1 string (the output of "codec encode: self
>> decodedString"), GNU Smalltalk fails because it finds an invalid UTF-8
>> sequence in "self latin1String".
>
> Why does there have to be an encoding present? It concatenates
> characters from known code points. There are no bytes involved so no
> mapping or mapping information is required.
>
>> Comparing bytearrays instead takes encodings out of the picture and works.
>>
>> VisualWorks seems to have the opposite problem. #encode: needs to know what
>> encoding was applied in order to convert to raw bytes. This seems to be a
>> bug to me. The #encode:-d representation should contain the raw bytes, not
>> the Unicode characters.
>>
>> So, I could fix it by adding a platform-specific hack to #asString:, but it
>> seems wrong. Can you check what breaks if you return a ByteArray from your
>> codec's #encode: method?
>
> I have a train ride today. I can give it a shot. It might actually
> work because of a recent stream change.
It does work [1]. We lose the ability to handle MacRoman and UTF-16,
but that could be added if needed. Everything else seems to be working
just fine.
[1] http://www.squeaksource.com/Seaside31
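For anyone following along, here is a rough sketch of the failure Paolo describes and of why comparing raw bytes sidesteps it. It is plain Python, not Seaside or GNU Smalltalk code; the sample text and the helper name decodes_as_utf8 are made up for illustration, and it only assumes that the consumer of the bytes defaults to UTF-8.

```python
# Illustration only: plain Python, not Seaside or GNU Smalltalk code.
# The sample text and the helper name are hypothetical.

text = "caf\u00e9"  # "café", whose last character is code point U+00E9

# The same characters encoded two ways give different byte sequences.
latin1_bytes = text.encode("latin-1")  # one byte per character
utf8_bytes = text.encode("utf-8")      # U+00E9 becomes two bytes


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Bytes that carry no encoding tag and are assumed to be UTF-8 fail to
# decode when they are really ISO-8859-1.
print(decodes_as_utf8(latin1_bytes))
print(decodes_as_utf8(utf8_bytes))

# Comparing raw bytes needs no decoding step, so no encoding has to be
# guessed on either side.
print(latin1_bytes == text.encode("latin-1"))
```

The last comparison is the bytearray approach from the quoted mail: both sides stay as bytes, so the default encoding never comes into play.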
Cheers
Philippe