I meant that these bytes:


'ä' squeakToUtf8 squeakToUtf8 asByteArray => #[195 131 194 164]


are the characters which are printed instead of 'ä' in the debug output, i.e. the string has been UTF-8 encoded twice.
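
For anyone who wants to check the byte values outside of an image, here is a minimal sketch in Python (an analogue, not Squeak code). It assumes that Latin-1 is a fair stand-in for reading each UTF-8 byte back as one character, which is what the double squeakToUtf8 above appears to do. The function names are made up for this illustration.

```python
# Python analogue of the Squeak expressions in this thread (not Squeak code).
# Assumption: Latin-1 stands in for reading each UTF-8 byte as one character.

def encode_utf8_as_chars(text: str) -> str:
    """Encode text as UTF-8, then read each byte back as one character."""
    return text.encode("utf-8").decode("latin-1")

def decode_utf8_from_chars(text: str) -> str:
    """Inverse step: treat each character as one byte and decode as UTF-8."""
    return text.encode("latin-1").decode("utf-8")

original = "\u00e4"  # the character 'ä'
once = encode_utf8_as_chars(original)   # encoded once
twice = encode_utf8_as_chars(once)      # encoded twice (the mojibake case)

print(list(once.encode("latin-1")))     # [195, 164]
print(list(twice.encode("latin-1")))    # [195, 131, 194, 164]

# Decoding twice recovers the original character.
print(decode_utf8_from_chars(decode_utf8_from_chars(twice)) == original)  # True
```

So getting back to 'ä' from the debug output would take two decoding steps, not one.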


I will look into this again tomorrow. I have not yet traced the concrete path from the FileList to the ChangeList (so far I have only opened a ChangeList directly).


From: Squeak-dev <squeak-dev-bounces@lists.squeakfoundation.org> on behalf of Bert Freudenberg <bert@freudenbergs.de>
Sent: Wednesday, July 19, 2017 15:15
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] Parsing privateAuthorsRaw for a changes browser
 
On Wed, Jul 19, 2017 at 2:22 PM, Rein, Patrick <Patrick.Rein@hpi.de> wrote:

Well, as feared, it did not come through. Let me try this again: The string 'ä' would be 'Ã' when interpreted as bytes which encode UTF-8. In turn, 'Ã' as bytes encoding UTF-8 is 'ä', which is what we actually want. The rest is as described below.

In my image (updated from some trunk version) the method looks fine. As for the weird encodings, I think you mean:

'ä' squeakToUtf8
=> 'Ã¤'
'ä' squeakToUtf8 asByteArray
=> #[195 164]

'Ã¤' utf8ToSqueak
=> 'ä'

#[195 164] asString utf8ToSqueak
=> 'ä'

I assume this is a copy-paste error? E.g. I cannot copy+paste the result of

'ä' squeakToUtf8 squeakToUtf8

- Bert -