[squeak-dev] [BUG] Timestamps don't work for classes with special character names
Tobias Pape
Das.Linux at gmx.de
Sat Dec 21 18:22:38 UTC 2019
> On 21.12.2019, at 19:11, Tobias Pape <Das.Linux at gmx.de> wrote:
>
>>
>> On 21.12.2019, at 17:36, Thiede, Christoph <Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:
>>
>> Hi Tobias,
>>
>> what do you mean in detail?
>>
>> If I create the class via System Browser and add the method, my change file ends with:
>>
>> Object subclass: #CTTéstClass
>> instanceVariableNames: ''
>> classVariableNames: ''
>> poolDictionaries: ''
>> category: 'CT-Experiments'!
>> !CTTéstClass methodsFor: 'no messages' stamp: 'ct 12/21/2019 17:18'!
>> foo! !
>
>
> Good. That was what I thought was important.
>
>
>>
>> However, CompiledMethod >> #timeStamp returns ''.
>
> What is the result of the following?
>
> (CTTéstClass compiledMethodAt: #foo) preamble
>
>
>>
>> Here is a snapshot of the #timeStamp stackframe:
>>
>>
>>
>> Please note that "tokens at: tokenCount" returns the correct timestamp, yet stamp is nil. What is going on here?
>
>
> I see what the problem is. The .changes file is apparently written as UTF-8 but read back as Latin-1.
> This is BAD.
Oh, and we were warned:
CompiledMethod
getPreambleFrom: aFileStream at: endPosition
    "This method is an ugly hack. This method assumes that source files have ASCII-compatible encoding and that preambles contain no non-ASCII characters."
    | chunkSize chunk |
    chunkSize := 160 min: endPosition.
    [
        | index |
        chunk := aFileStream
            position: (endPosition - chunkSize + 1 max: 0);
            basicNext: chunkSize.
        (index := chunk lastIndexOf: $! startingAt: chunk size) ~= 0 ifTrue: [
            ^chunk copyFrom: index + 1 to: chunk size ].
        chunkSize := chunkSize * 2.
        chunkSize <= endPosition ] whileTrue.
    ^chunk
I have the feeling that the problematic send is #basicNext: in line 10 or so. This seems to circumvent the conversion done by MultiByteFileStream.
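To make the mismatch concrete, here is a minimal sketch in Python (not Squeak code). The names PREAMBLE and rough_tokens are made up for illustration, and the regex is only a crude stand-in for the real Scanner. It assumes the preamble sits in the file as UTF-8 and shows what happens when those bytes are taken one byte per character, which is what I suspect #basicNext: ends up doing:

```python
import re

# The method preamble from the .changes file (\u00e9 is the accented e).
PREAMBLE = "CTT\u00e9stClass methodsFor: 'no messages' stamp: 'ct 12/21/2019 17:18'"

# Assumption: the file holds the preamble as UTF-8 bytes.
on_disk = PREAMBLE.encode("utf-8")

# Taking the raw bytes one per character is the same as decoding as Latin-1.
misread = on_disk.decode("latin-1")

# Decoding with the encoding that was used for writing restores the text.
decoded = on_disk.decode("utf-8")

# Crude tokenizer (hypothetical, NOT Squeak's Scanner): quoted strings,
# identifiers or keywords made of word characters, or any single other symbol.
TOKEN = re.compile(r"'[^']*'|\w+:?|\S")


def rough_tokens(text):
    """Split a preamble into rough tokens."""
    return TOKEN.findall(text)


print(rough_tokens(decoded)[:1])   # class name stays in one piece
print(rough_tokens(misread)[:3])   # class name split at the copyright sign
print(len(rough_tokens(decoded)), len(rough_tokens(misread)))  # 5 7
```

With the bytes read raw, the class name falls apart into three pieces and the token count goes from 5 to 7, which matches the 7 tokens discussed in the quoted part below. Decoding as UTF-8 first keeps it at 5.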
Best regards
-Tobias
>
> You end up with 7 tokens, because you have three for the class name instead of one. This is because the Latin-1 copyright symbol is classified as a binary selector and thus separates the first part of the class name from the second part. This happens only because of the UTF-8 vs. Latin-1 mismatch.
>
> But the code path for 7-element tokens is different, and it looks for the #stamp: at a different position.
>
> Hence stamp is nil.
>
> A wrong but easy fix would be to call #utf8ToSqueak on the preamble.
>
> Best regards
> -Tobias
>
>
>>
>> I'm not sure if I understand you correctly, but if you told me to search the hex of my change file for a "zero word", the only occurrence I could find is:
>>
>> Which led me to this:
>>
>> Does not seem related, but still looks somehow wrong ^^
>>
>> Best,
>> Christoph
>>
>> From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Tobias Pape <Das.Linux at gmx.de>
>> Sent: Saturday, December 21, 2019, 15:44
>> To: The general-purpose Squeak developers list
>> Subject: Re: [squeak-dev] [BUG] Timestamps don't work for classes with special character names
>>
>>
>>> On 21.12.2019, at 15:16, Thiede, Christoph <Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:
>>>
>>> Hi all, I just found another bug. If you get tired of them, just tell me :-)
>>>
>>> Steps to reproduce:
>>> Print it:
>>> class := Object subclass: #CTTèstClass "sic (with accent in name)!"
>>> instanceVariableNames: ''
>>> classVariableNames: ''
>>> poolDictionaries: ''
>>> category: 'CT-Experiments'.
>>> class compile: 'foo ^ #foo'.
>>> (class >> #foo) timeStamp
>>>
>>> Expected output:
>>> Something like 'ct 12/21/2019 15:13'.
>>>
>>> Actual output:
>>> ''.
>>>
>>> Please note that everything would have worked fine if we had named the class #CTTestClass (without accent) instead.
>>>
>>> Do we want to support special class names in general? If yes, this is a bug in my opinion. If no, we should raise an error in the first statement.
>>>
>>> Cause of infection not yet investigated.
>>
>> Please check whether \00 bytes appear at some point in your .changes file.
>>
>> Best regards
>> -Tobias