[squeak-dev] Parsing privateAuthorsRaw for a changes browser

Rein, Patrick Patrick.Rein at hpi.de
Wed Jul 19 12:18:12 UTC 2017


Hi Eliot,

I started looking into this. So far I have not been able to reproduce it
locally, neither with a fresh trunk image nor with a trunk image from May
that I then updated. It looks like a mixture of double encoding and wrong
decoding. The character sequence 'Ã¤' further down (in Volker Bäcker)
becomes 'Ã¤' when interpreted as UTF-8, which in turn becomes 'ä' when
interpreted as UTF-8 again, and 'ä' is what we would expect in the
string. To arrive at 'Ã¤', though, one would have to interpret the UTF-8
bytes of 'ä' as CP1252, encode the result in UTF-8 again, and then decode
it once more using CP1252.
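
The round trip described above can be sketched outside the image. The
following is a minimal Python illustration, not Squeak code; the helper
names misdecode and repair are hypothetical, and it assumes that CP1252
really is the wrong codec involved, which is still only a guess.

```python
# Sketch only: reproduces the suspected double encoding of a-umlaut.
# Assumption: the wrong decoder is CP1252 (Windows-1252).

def misdecode(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode those bytes as CP1252."""
    return text.encode("utf-8").decode("cp1252")

def repair(text: str) -> str:
    """Undo one misdecode step: re-encode as CP1252, decode as UTF-8."""
    return text.encode("cp1252").decode("utf-8")

original = "\u00e4"          # a-umlaut, the character expected in the string
once = misdecode(original)   # two characters: U+00C3 U+00A4
twice = misdecode(once)      # four characters: U+00C3 U+0192 U+00C2 U+00A4

# Show code points and UTF-8 bytes for each stage.
for label, s in [("original", original), ("once", once), ("twice", twice)]:
    print(label, [hex(ord(c)) for c in s], s.encode("utf-8").hex())

# Two repair steps bring the doubly damaged text back to the original.
assert repair(repair(twice)) == original
```

If the damaged text in the changes file matches the four-character stage,
that would support the double-encoding theory; a match with the
two-character stage would point to a single wrong decode instead.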

Sanity check before I continue: Does the source code in the method look
right in that image?

(I hope all these weird characters will come through to you :) )

Bests
Patrick

________________________________
From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Eliot Miranda <eliot.miranda at gmail.com>
Sent: Wednesday, July 12, 2017 18:51
To: The general-purpose Squeak developers list
Subject: [squeak-dev] Parsing privateAuthorsRaw for a changes browser

Hi All,

    I had reason to condense changes and then was curious to look for older versions.  But when I came to open a changes browser on the newly condensed changes file the UTF-8 decoder failed to parse the source for SystemNavigation class>>privateAuthorsRaw.  Something breaks the string at the e acute in Stéphane, and then the decoder gets hopelessly confused.
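
The failure mode can be illustrated in a few lines of Python (a sketch, not
the Squeak decoder). It assumes that the chunk handed to the decoder starts
at the second byte of the UTF-8 encoded e acute, which is consistent with
the "byte1: 169" value in the debug log below; the variable names are made
up for the example.

```python
# Sketch only: a UTF-8 decoder started in the middle of a two-byte character.
name = "St\u00e9phane"            # "Stephane" with e acute (U+00E9)
data = name.encode("utf-8")       # e acute becomes the two bytes C3 A9

split_at = data.index(b"\xa9")    # offset of the second byte of e acute
tail = data[split_at:]            # what a decoder sees if the chunk starts here

try:
    tail.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc
    # 0xA9 (169) is a continuation byte and cannot start a character.
    print(f"cannot decode byte {exc.object[exc.start]} at offset {exc.start}")
```

A strict decoder has to reject such input, because 169 is only valid as the
continuation of a multi-byte sequence whose lead byte is missing here.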

To reproduce:
In a trunk 6.x image do
    Smalltalk condenseChanges
then open a file list, select the changes file, and then click the recent changes button.

Here's the SqueakDebug.log:

InvalidUTF8: Invalid utf8: ©phane Rollandin#spfa!Stephane Schitter#stefs!Stephanie Hamburg#MUTTLYSTEPHANIE!Stephen Smith#sst!Stephen Travis Pope#stp!Stephen Vincent Pair#svp!Steve Davies#sld!Steve Elkins#sge!Steve Fuller#snf!Steve Gilbert#slg!Steve Hunter#skh!Steve Knight#knighty!Steve Mccusker#smcc!Steve Messamore#slm!Steve Sanderson#sms!Steve Wart#swart!Steve Wessels#!Steven Darcy#SMD!Steven Greenberg#greenbes!Steven Rodriguez#optionshiftk!Steven Swerling#sps!Sudheendra Hangal#hangal!Sungjin Chun#chunsj!Suzuki Tetsuya#tetsuya!Syed Abid#taxman!Syed Masoodahmad#masden56!Sylvia Sharma#sharma!Symon Chalk#symonc!Takashi Yamamiya#tak!Tansel Ersavas#mte#MTE!Tarek Demiati#TD!Ted Bracht#TB#TB1!Ted Kaehler#tk!Terry Jenkins#TCJ!Thierry Reignier#TREG!Thijs Janssen#TJ!Thomas Bernitt#tber!Thomas Fröb#thf!Thomas Hemme#Namamazu!Thomas J Keller#TJK!Thomas Kowark#tk!Thomas M. Breuel#tmb!Thomas Mahler#ThMa!Thomas Stambaugh#tms!Thomas Zimmermann#TZ!Tim Cuthbertson#tec!Tim Felgentreff#tfel!Tim Lewis#TimLewis!Tim Olson#tao!Tim Rowledge#TPR#tpr!Timm Knape#tik!Timothy Falconer#teefal!Timothy M#tty!Timothy Retz#tgr!Tobias Isenberg#ti!Tobias Pape#topa!Todd Blanchard#tb!Tom Counsell#tamc!Tom Dailey#td!Tom Koenig#tlk!Tom Plick#tap!Tom Rushworth#tbr!Tommy Thorn#tt!Tomohiro Oda#TO!Tony Garnock-Jones#tonyg!Tony Zampogna#zamp!Torge Husfeldt#th!Torsten Bergmann#tbn#TBN!Torsten Sadowski#ts!Travis Kay#tkay#tlk!Trygve Reenskaug#TRee!Tyler Coumbes#mtc!Tzaddi Beltaine#tsb!Udo Schneider#udos!Vaidotas Didžbalis#vd!Vassili Bykov#vb!Vernon Marsden#vmars!Vijay Mathew Pandyalakal#vmp!Vladimir Janousek#vj!Volker Bäcker#volker!Wally Cash#wac!Walter Wilhelm#ww!Ward Cunningham#ward!Wayne Braun#wb!Wayne D. 
Elias#wdelias!Webb Mcdonald#wxm!Wilkes Joiner#dwj!Willem van Asperen#wva!William Hess#WFH!William Hidden#whidden!Wolfgang Eder#edw!Wolfgang Helbig#whg!Woon Yeo#!Wuilmer Olaya Bardales#wob!Yagendra Dutt Tripathi#yd!Yang Ha Nguyen#yhm!Yann Monclair#YM!Yanni Chiu#yj!Yasuji Nakayama#yasuji!Yoshiki Ohshima#yo!Yuji Ichikawa#ich!Yunhee Lee#yhl!Yutaka Kamite#yk!Zdenek Novy#Zdenye#ZN!Zeljko Nesic#Poparasan!Zeynep Besen#zeyno'
12 July 2017 9:42:40.918319 am

VM: Mac OS - Smalltalk
Image: Squeak6.0alpha [latest update: #17347]

SecurityManager state:
Restricted: false
FileAccess: true
SocketAccess: true
Working Dir /Users/eliot/Squeak/Squeak5.1
Trusted Dir /foobar/tooBar/forSqueak/bogus/
Untrusted Dir /Users/eliot/Library/Preferences/Squeak/Internet/My Squeak/

UTF8TextConverter class>>errorMalformedInput:
Receiver: UTF8TextConverter
Arguments and temporary variables:
aString: '©phane Rollandin#spfa!Stephane Schitter#stefs!Stephanie Hamburg#MUTTL...etc...
Receiver's instance variables:
superclass: TextConverter
methodDict: a MethodDictionary(#backFromStream:->(UTF8TextConverter>>#backFromS...etc...
format: 65538
instanceVariables: nil
organization: ('conversion' backFromStream: decodeString: encodeString: errorMalformedInput:...etc...
subclasses: nil
name: #UTF8TextConverter
classPool: a Dictionary(#StrictUtf8Conversions->nil )
sharedPools: nil
environment: Smalltalk
category: #'Multilingual-TextConversion'
latin1Map: #[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...etc...
latin1Encodings: #(nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil ...etc...

UTF8TextConverter class>>decodeByteString:
Receiver: UTF8TextConverter
Arguments and temporary variables:
aByteString: '©phane Rollandin#spfa!Stephane Schitter#stefs!Stephanie Hamburg#M...etc...
outStream: a WriteStream
lastIndex: 1
nextIndex: 1
byte1: 169
byte2: nil
byte3: nil
byte4: nil
unicode: nil
Receiver's instance variables:
superclass: TextConverter
methodDict: a MethodDictionary(#backFromStream:->(UTF8TextConverter>>#backFromS...etc...
format: 65538
instanceVariables: nil
organization: ('conversion' backFromStream: decodeString: encodeString: errorMalformedInput:...etc...
subclasses: nil
name: #UTF8TextConverter
classPool: a Dictionary(#StrictUtf8Conversions->nil )
sharedPools: nil
environment: Smalltalk
category: #'Multilingual-TextConversion'
latin1Map: #[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...etc...
latin1Encodings: #(nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil ...etc...

UTF8TextConverter>>decodeString:
Receiver: an UTF8TextConverter
Arguments and temporary variables:
aString: '©phane Rollandin#spfa!Stephane Schitter#stefs!Stephanie Hamburg#MUTTL...etc...
result: nil
Receiver's instance variables:
latin1Map: #[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...etc...
latin1Encodings: #(nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil ...etc...

UTF8TextConverter>>nextChunkFromStream:
Receiver: an UTF8TextConverter
Arguments and temporary variables:
input: MultiByteFileStream: '/Users/eliot/Squeak/Squeak5.1/trunk6projectLoad.ch...etc...
Receiver's instance variables:
latin1Map: #[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...etc...
latin1Encodings: #(nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil ...etc...

MultiByteFileStream>>nextChunk
Receiver: MultiByteFileStream: '/Users/eliot/Squeak/Squeak5.1/trunk6projectLoad.changes'
Arguments and temporary variables:

Receiver's instance variables:


ChangeList class>>browseRecentLogOn:
Receiver: ChangeList
Arguments and temporary variables:
origChangesFile: MultiByteFileStream: '/Users/eliot/Squeak/Squeak5.1/trunk6proj...etc...
end: 13286751
done: false
block: 7195999
pos: 7198297
changesFile: MultiByteFileStream: '/Users/eliot/Squeak/Squeak5.1/trunk6projectL...etc...
position: nil
prevBlock: 7197023
chunk: #('privateAuthorsRaw

^ ''Aaron Reichow#ajr!Abigail Sanchez#as!Adam Eng...etc...
Receiver's instance variables:
superclass: CodeHolder
methodDict: a MethodDictionary(#acceptFrom:->(ChangeList>>#acceptFrom: "a CompiledMethod...etc...
format: 65548
instanceVariables: #('changeList' 'list' 'listIndex' 'listSelections' 'file' 'l...etc...
organization: ('accessing' changeList changes:file: currentChange file listHasSingleEntry...etc...
subclasses: {ChangeListForProjects . VersionsBrowser}
name: #ChangeList
classPool: nil
sharedPools: nil
environment: nil
category: #'Tools-Changes'

ChangeList class>>browseRecentLogOnPath:
Receiver: ChangeList
Arguments and temporary variables:
fullName: '/Users/eliot/Squeak/Squeak5.1/trunk6projectLoad.changes'
Receiver's instance variables:
superclass: CodeHolder
methodDict: a MethodDictionary(#acceptFrom:->(ChangeList>>#acceptFrom: "a CompiledMethod...etc...
format: 65548
instanceVariables: #('changeList' 'list' 'listIndex' 'listSelections' 'file' 'l...etc...
organization: ('accessing' changeList changes:file: currentChange file listHasSingleEntry...etc...
subclasses: {ChangeListForProjects . VersionsBrowser}
name: #ChangeList
classPool: nil
sharedPools: nil
environment: nil
category: #'Tools-Changes'
_,,,^..^,,,_
best, Eliot