[squeak-dev] ASN1 encoding of UTF8

Alan Pinch alan.c.pinch at gmail.com
Mon Sep 18 08:32:34 UTC 2017


I found the same Stack Overflow question. It is the only place I have 
found that mentions that 0x0C is the tag for it.

I am currently encoding thus:

aString squeakToUtf8 asByteArray.

and decoding:

bytes asByteArray asString utf8ToSqueak.

Do you think this lays out the bytes as described on the page below? I 
gather from the Stack Overflow question that these UTF-8 bytes would be 
the encoded form of the string for ASN.1.

https://en.wikipedia.org/wiki/UTF-8#Description
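
To make the question concrete, here is a minimal sketch of how such a value 
might be framed, assuming DER-style tag-length-value encoding: the tag byte 
0x0C, a definite length, then the raw UTF-8 bytes. It is written in Python 
purely for illustration, not Squeak; the function names are hypothetical and 
not part of any Squeak or ASN.1 library.

```python
UTF8STRING_TAG = 0x0C  # universal tag 12, the UTF8String tag mentioned in this thread


def encode_length(n: int) -> bytes:
    """Definite length octets: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_utf8string(text: str) -> bytes:
    """Hypothetical helper: tag, length, then the UTF-8 bytes of the text."""
    content = text.encode("utf-8")
    return bytes([UTF8STRING_TAG]) + encode_length(len(content)) + content


def decode_utf8string(data: bytes) -> str:
    """Hypothetical helper: inverse of encode_utf8string for a single value."""
    if len(data) < 2 or data[0] != UTF8STRING_TAG:
        raise ValueError("not a UTF8String")
    first = data[1]
    if first < 0x80:
        # Short form: the length fits in this one byte.
        length, offset = first, 2
    else:
        # Long form: the low 7 bits give the number of length bytes that follow.
        count = first & 0x7F
        if count == 0:
            raise ValueError("indefinite length is not handled in this sketch")
        length = int.from_bytes(data[2:2 + count], "big")
        offset = 2 + count
    content = data[offset:offset + length]
    if len(content) != length:
        raise ValueError("truncated content")
    return content.decode("utf-8")
```

Under these assumptions, encode_utf8string("é") yields the bytes 0C 02 C3 A9: 
the tag, a length of two, and the two-byte UTF-8 sequence for that character. 
The content bytes are what "aString squeakToUtf8 asByteArray" is expected to 
produce; only the tag and length are added around them.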

Alan

On 09/18/2017 01:46 AM, Jakob Reschke wrote:
> I just did a quick search on the web and it seems like ASN.1 has a 
> UTF8String type (with tag 12) that just contains the sequence of bytes 
> of the UTF-8-encoded string. Can you use that? See also this question 
> on stackoverflow: https://stackoverflow.com/q/28929809 
>
> In Squeak, you can convert between UTF-8-encoded byte strings and 
> decoded (Squeak-encoded) character strings with the help of 
> UTF8TextConverter. Have a look at its class-side methods. Also, there 
> are conversion methods in String, IIRC. Try to filter its 
> instance-side methods by "utf8".
>
> Does this answer your question or are you in search of something else?
>
> Kind regards,
> Jakob
>
> On 18.09.2017 at 03:49, "Alan Pinch" <alan.c.pinch at gmail.com> wrote:
>
>     I am trying to map UTF-8 into an ASN.1 encoding, where a UTF-8
>     character may be specified to extend past one byte. I am also
>     interested in retaining these UTF-8 characters in Squeak to
>     interoperate well. What would be my best approach to this, mapping
>     to/from these bytes on a stream?
>
>     Alan
>
