[squeak-dev] Re: [ANN] WebClient and WebServer 1.0 for Squeak

Levente Uzonyi leves at elte.hu
Mon May 10 17:42:11 UTC 2010


On Mon, 10 May 2010, Hannes Hirzel wrote:

> On 5/10/10, Levente Uzonyi <leves at elte.hu> wrote:
>> On Mon, 10 May 2010, Hannes Hirzel wrote:
>>
>>> Unfortunately UTF8TextConverter cannot deal with non-Latin1
>>> characters. So its usefulness is limited.
>>
>> UTF8TextConverter can deal with non-Latin1 characters. I
>> think you're trying to pass a WideString to #encodeByteString: which
>> obviously doesn't work.
>>
>>
>> Levente
>>
>
> Yes I am passing aWideString to
>    #encodeByteString:
>
> as this is the only conversion method in UTF8TextConverter.
>
> And you're right I should pass a ByteString.
>
> However as the case
>   ('ä', 8220 asCharacter asString) asByteString   "A"
> shows in comparison to
>  ('ä', 65 asCharacter asString) asByteString      "B"
>
> I get a ByteString only in case "B"; in case "A" it remains a WideString.
>
> So the question is: How do I convert a WideString to UTF8 as
> UTF8TextConverter is limited to code points from 0...255 and I want
> the full Unicode range?

There are various possibilities:
'äbc' squeakToUtf8.
'äbc' convertToEncoding: 'utf-8'.
'äbc' convertToWithConverter: UTF8TextConverter new.
UTF8TextConverter new encodeString: 'äbc'.
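
Applied to the WideString from case "A" above, the first of these can be
tried in a workspace. This is only a minimal sketch: the variable names are
made up for illustration, and it assumes an image in which String responds
to #squeakToUtf8 and #utf8ToSqueak (current Squeak images should).

```smalltalk
| wide utf8 |
"Case 'A' from the thread: code point 8220 does not fit in a ByteString,
 so the concatenation yields a WideString."
wide := 'ä', (Character value: 8220) asString.

"Encode the whole string to UTF-8. The result holds the encoded bytes,
 one byte per character."
utf8 := wide squeakToUtf8.

"Decoding again should give back the original WideString."
utf8 utf8ToSqueak = wide.
```

Evaluating the last line with "print it" should answer true if the round
trip preserved the string.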


Levente

>
> Or put the question otherwise: Is there a textconverter which
> implements the following algorithm
> http://dsc.sun.com/dev/gadc/technicalpublications/articles/utf8.html
>
> -Hannes
>
>

