I noticed that #setConverterForCode still relies on a BOM, but my current .changes file does not have one...
Note that there are not many senders of #writeBOMOn:, mainly those that fileOut a class/method/etc...
So that explains why I do not have a BOM...

Though (SourceFiles at: 2) has a UTF8TextConverter... Why?
That could be a direct send of #converter:, but I rather think that UTF8 is the default converter when we open the file.
So things work only because we don't send #setConverterForCode to the .changes or the .sources file...
Except that the path that you used does...
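
To check whether a file really starts with a BOM, a workspace sketch like the following might do. It assumes the .changes file is reachable via Smalltalk changesName and simply compares the first three bytes with the UTF-8 BOM (EF BB BF); the temporaries bom, stream and head are made-up names for illustration.

```smalltalk
"Sketch only: peek at the first three bytes of the .changes file."
| bom stream head |
bom := #[239 187 191]. "UTF-8 encoding of U+FEFF"
stream := FileStream readOnlyFileNamed: Smalltalk changesName.
stream binary.
head := [stream next: 3] ensure: [stream close].
head = bom "true only if the file starts with a UTF-8 BOM"
```

If this answers false, #setConverterForCode finds no BOM to rely on for that file.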

IMO, it's not related to condenseChanges; it should equally fail if you pretend you are an author named Stéphane, modify a method, and browse recent changes from the file list...


2017-07-19 18:55 GMT+02:00 Rein, Patrick <Patrick.Rein@hpi.de>:

I meant that this:


'ä' squeakToUtf8 squeakToUtf8 asByteArray => #[195 131 194 164]


are the characters which are printed instead of 'ä' in the debug output.
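
For reference, the double encoding and its reversal can be reproduced in a workspace using only the selectors already mentioned in this thread. This is just a sketch; the temporaries once and twice are hypothetical names.

```smalltalk
| once twice |
once := 'ä' squeakToUtf8.      "bytes #[195 164]"
twice := once squeakToUtf8.    "bytes #[195 131 194 164], the garbled form"
twice asByteArray.
"Decoding twice should undo the double encoding."
twice utf8ToSqueak utf8ToSqueak
```

Each #squeakToUtf8 treats its receiver's characters as code points, so the second send re-encodes the two bytes of the first result as if they were two separate characters.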


I will look into this tomorrow again. I have not yet investigated the concrete trace to the ChangeList coming from the FileList (so far I have directly opened a ChangeList).


From: Squeak-dev <squeak-dev-bounces@lists.squeakfoundation.org> on behalf of Bert Freudenberg <bert@freudenbergs.de>
Sent: Wednesday, July 19, 2017 15:15
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] Parsing privateAuthorsRaw for a changes browser
 
On Wed, Jul 19, 2017 at 2:22 PM, Rein, Patrick <Patrick.Rein@hpi.de> wrote:

Well, as feared, it did not come through. Let me try this again: The string 'ä' would be 'Ã' when interpreted as bytes which encode UTF-8. In turn 'Ã' as bytes encoding UTF-8 is 'ä' which is what we actually want. The rest is as described below.

In my image (updated from some trunk version) the method looks fine. As for the weird encodings, I think you mean:

'ä' squeakToUtf8
=> 'Ã¤'

'ä' squeakToUtf8 asByteArray
=> #[195 164]

'Ã¤' utf8ToSqueak
=> 'ä'

#[195 164] asString utf8ToSqueak
=> 'ä'

I assume this is a copy-paste error? E.g. I cannot copy+paste the result of

'ä' squeakToUtf8 squeakToUtf8

- Bert -