[squeak-dev] Zero bytes in Multilingual package

Bert Freudenberg bert at freudenbergs.de
Sun Aug 16 15:07:57 UTC 2009


On 16.08.2009, at 05:15, Andreas Raab wrote:

> Ian Trudel wrote:
>> Another issue but with the trunk. I have tried to update code from the
>> trunk into my image but there's a proxy error with source.squeak.org
>> right at this minute, which causes Squeak to freeze for a minute or
>> two trying to reach the server.
>
> It seems to be fine now. Probably just a temporary issue.

There were three debuggers open in the squeaksource image when I
looked today. The problem comes from the source server trying to parse
Multilingual-ar.38 and Multilingual-sn.38. They contain sections of
code where each character is stored as a long instead of a byte (that
is, three null bytes followed by the char code). I've copied the
relevant portion out of the .mcz's source.st, see attachment. If you
try to open a changelist browser on that file, you get the same parse
error.
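
For anyone who wants to check a file outside the image, here is a minimal Python sketch of the pattern described above: runs of "three null bytes, then the char code". The function name and the min_chars threshold are hypothetical, and the sketch assumes the widening is big-endian, as the description suggests.

```python
def find_widened_runs(data: bytes, min_chars: int = 4):
    """Return (start, end) byte offsets of runs of widened characters.

    A widened character is assumed to be three null bytes followed by
    one non-null byte. Runs shorter than min_chars are ignored so that
    an isolated group of nulls is not reported.
    """
    runs = []
    i = 0
    n = len(data)
    while i + 4 <= n:
        j = i
        # Extend the run while each 4-byte group looks like 0 0 0 <char>.
        while j + 4 <= n and data[j:j + 3] == b"\x00\x00\x00" and data[j + 3] != 0:
            j += 4
        if (j - i) // 4 >= min_chars:
            runs.append((i, j))
            i = j
        else:
            i += 1
    return runs
```

Run against the raw bytes of a source.st file, an empty result would suggest there are no such runs; it does not prove the file parses.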

I have no idea how these widened characters made it into the mcz's
source.st file, particularly since the widening starts in the middle
of a method (and of a class comment). It extends over a few chunks,
then reverts to the regular encoding. Strange.

JapaneseEnvironment class>>isBreakableAt:in: looks suspicious, though
I'm not sure whether it is actually broken.

I then looked into the trunk's changes file. It has this problem too,  
though apparently only in the class comment of LanguageEnvironment.

"LanguageEnvironment comment string asByteArray" contains this:

116 104 114 101 101 32 99 97 110 32 104 97 118 101 32 40 97 110 100 32  
100 111 101 115 32 104 97 118 101 41 32 100 105 102 102 101 114 101  
110 116 32 101 110 99 111 100 105 110 103 115 46 32 32 83 0 0 0 111 0  
0 0 32 0 0 0 119 0 0 0 101 0 0 0 32 0 0 0 110 0 0 0 101 0 0 0 101 0 0  
0 100 0 0 0 32 0 0 0 116 0 0 0 111 0 0 0 32 0 0 0 109 0 0 0 97 0 0 0  
110 0 0 0 97 0 0 0 103 0 0 0 101 0 0 0 32 0 0 0 116 0 0 0 104 0 0 0  
101 0 0 0 109 0 0 0 32 0 0 0 115 0 0 0 101 0 0 0 112 0 0 0 97 0 0 0  
114 0 0 0 97 0 0 0 116 0 0 0 101 0 0 0 108 0 0 0 121 0 0 0 46 0 0 0 32  
0 0 0 32 0 0 0 78 0 0 0 111 0 0 0 116 0 0 0 101 0 0 0 32 0 0 0 116 0 0  
0 104 0 0 0 97 0 0 0 116 0 0 0 32 0 0 0 116 0 0 0 104 0 0 0 101 0 0 0  
32 0 0 0 101 0 0 0 110 0 0 0 99 0 0 0 111 0 0 0 100 0 0 0 105 0 0 0  
110 0 0 0 103 0 0 0 32 0 0 0 105 0 0 0 110 0 0 0 32 0 0 0 97 0 0 0 32  
0 0 0 102 0 0 0 105 0 0 0 108 0 0 0 101 0 0 0 32 0 0 0 99 0 0 0 97 0 0  
0 110 0 0 0 32 0 0 0 98 0 0 0

Increasingly strange. So I removed the null bytes from the class  
comment and published as Multilingual-bf.39. After updating they are  
indeed gone from the comment. But looking at the source.st in that mcz  
shows the encoding problem again. Bummer.
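
Removing the null bytes can also be sketched outside the image. This is not the in-image edit described above, only an illustration of the same idea on raw bytes. The function name is hypothetical, and it assumes every "0 0 0 <char>" group is a widened single-byte character; text that legitimately contains wide characters would be damaged by it.

```python
import re

# Three null bytes followed by one non-null byte (assumed big-endian widening).
WIDENED_CHAR = re.compile(rb"\x00\x00\x00([^\x00])")


def collapse_widened(data: bytes) -> bytes:
    """Replace each 0 0 0 <char> group with the single byte <char>."""
    return WIDENED_CHAR.sub(rb"\1", data)
```

Applied to the bytes 83 0 0 0 111 0 0 0 32 0 0 0 119 0 0 0 101 from the dump above, this yields the plain bytes for "So we".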

Something very strange is going on. I'm out of ideas (short of  
debugging into the MCZ save process).
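
Short of debugging the save process, a published .mcz could at least be checked before or after upload. The sketch below relies on an .mcz being a zip archive that holds a source.st member; it matches any member whose name ends in "source.st" rather than assuming an exact path. The function name and the threshold of four consecutive groups are hypothetical.

```python
import re
import zipfile

# Four or more consecutive "three null bytes + one non-null byte" groups.
WIDENED = re.compile(rb"(?:\x00\x00\x00[^\x00]){4,}")


def widened_spans_in_mcz(mcz):
    """Map each *source.st member name to a list of (start, end) offsets.

    mcz may be a file path or a file-like object opened in binary mode.
    """
    report = {}
    with zipfile.ZipFile(mcz) as archive:
        for name in archive.namelist():
            if name.endswith("source.st"):
                data = archive.read(name)
                report[name] = [m.span() for m in WIDENED.finditer(data)]
    return report
```

For example, widened_spans_in_mcz("Multilingual-bf.39.mcz") would list the byte ranges to inspect, assuming the file is in the current directory; an empty list for a member means no such runs were found in it.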

- Bert -

-------------- next part --------------
A non-text attachment was scrubbed...
Name: BuggyMultilingual-ar.38.st.zip
Type: application/zip
Size: 4277 bytes
Desc: not available
Url : http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20090816/3f2a085e/BuggyMultilingual-ar.38.st.zip

