I had found the same Stack Overflow question. It is the only place I have found that mentions that 0x0C is the tag for UTF8String.
I am currently encoding thus:
aString squeakToUtf8 asByteArray.
and decoding:
bytes asByteArray asString utf8ToSqueak.
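
For what it's worth, here is a minimal sketch of how those two conversions could be wrapped in a DER-style tag-length-value for a UTF8String, assuming tag 0x0C and definite-length encoding. The variable names and the sample string are only for illustration, and this is not taken from any existing ASN.1 package; it only relies on squeakToUtf8, utf8ToSqueak, and ordinary stream protocol.

```smalltalk
"Sketch only: DER-style TLV for an ASN.1 UTF8String (tag 16r0C).
 All names here are illustrative assumptions."
| text utf8 encoded in tag first length decoded |
text := 'Grüße'.

"Encode: tag byte, length, then the raw UTF-8 bytes."
utf8 := text squeakToUtf8 asByteArray.
encoded := ByteArray streamContents: [:out |
	out nextPut: 16r0C.
	utf8 size < 128
		ifTrue: ["Short form: the length fits in one byte."
			out nextPut: utf8 size]
		ifFalse: ["Long form: 16r80 + number of length bytes, then the length, big-endian."
			| lengthBytes n |
			lengthBytes := OrderedCollection new.
			n := utf8 size.
			[n > 0] whileTrue: [
				lengthBytes addFirst: (n bitAnd: 16rFF).
				n := n bitShift: -8].
			out nextPut: (16r80 bitOr: lengthBytes size).
			lengthBytes do: [:each | out nextPut: each]].
	out nextPutAll: utf8].

"Decode: check the tag, read the length, then convert the payload back."
in := ReadStream on: encoded.
tag := in next.
tag = 16r0C ifFalse: [self error: 'Expected UTF8String tag 16r0C'].
first := in next.
length := first < 128
	ifTrue: [first]
	ifFalse: [| n |
		n := 0.
		(first bitAnd: 16r7F) timesRepeat: [n := (n bitShift: 8) + in next].
		n].
decoded := (in next: length) asString utf8ToSqueak.
```

The length counts UTF-8 bytes, not characters: the five-character sample string occupies seven bytes, so the length byte is 7 and decoded should equal text again.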
I just did a quick search on the web, and it seems that ASN.1 has a UTF8String type (tag 12, i.e. 0x0C) whose content is simply the sequence of bytes of the UTF-8-encoded string. Can you use that? See also this question on Stack Overflow: https://stackoverflow.com/q/28929809
In Squeak, you can convert between UTF-8-encoded byte strings and decoded (Squeak-encoded) character strings with the help of UTF8TextConverter. Have a look at its class-side methods. Also, there are conversion methods in String, IIRC. Try to filter its instance-side methods by "utf8".
Does this answer your question or are you in search of something else?
Kind regards,
Jakob
On 18.09.2017 at 03:49, "Alan Pinch" <alan.c.pinch@gmail.com> wrote:
I am trying to map UTF-8 into an ASN.1 encoding, where a UTF-8
character may take more than one byte. I am also interested
in retaining these UTF-8 characters in Squeak so that they interoperate well. What
would be my best approach to mapping to/from these bytes on a stream?
Alan