[squeak-dev] [BUG] Timestamps don't work for classes with special character names

Thiede, Christoph Christoph.Thiede at student.hpi.uni-potsdam.de
Sat Dec 21 19:23:47 UTC 2019


Hi Tobias, thanks for the pointers!


> (CTTéstClass compiledMethodAt: #foo) preamble

Like you said:
[inline image scrubbed from the archive]

I made the following change:
[inline image scrubbed from the archive]
This seems to fix the conversion issues.

The outputs are:
[inline image scrubbed from the archive]

The next problem is the trailing ! in the preamble for CTTéstClass.
Here, the integer returned by expandedSourceFileArray >> #filePositionFromSourcePointer: is too large by one.
I have no idea where these constants come from, but as this is a constant method, I don't see how this calculation could be wrong.
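
One possibility that would at least be consistent with an off-by-one (an assumption, not a diagnosis): é takes two bytes in UTF-8, so a byte offset into the file is one larger than the matching character offset once the text has been decoded. The Python sketch below only illustrates that arithmetic. It is not Squeak code, and the names `text`, `data`, `byte_end`, and `char_end` are made up for illustration.

```python
# Sketch only: byte offsets vs. character offsets around a two-byte character.
text = "!CTTéstClass methodsFor: 'no messages' stamp: 'ct 12/21/2019 17:18'!\nfoo! !"
data = text.encode("utf-8")      # assumed on-disk form of the chunk

char_end = text.index("!\n")     # character index of the "!" closing the preamble
byte_end = data.index(b"!\n")    # byte index of the same "!"

print(byte_end - char_end)       # difference caused by the two-byte "é"

# Using the byte offset to slice already-decoded text overshoots by one
# character and drags the closing "!" into the preamble.
with_byte_offset = text[1:byte_end]
with_char_offset = text[1:char_end]
print(with_byte_offset[-1], with_char_offset[-1])
```

If something like this is happening, the position itself would be correct in bytes and only wrong when applied to converted characters.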

I also tried the following:
[inline image scrubbed from the archive]
which correctly yields:
[inline image scrubbed from the archive]
But that seems hacky again.


Looking forward to your reply!


Best,

Christoph

________________________________
From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Tobias Pape <Das.Linux at gmx.de>
Sent: Saturday, December 21, 2019 19:22:38
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] [BUG] Timestamps don't work for classes with special character names


> On 21.12.2019, at 19:11, Tobias Pape <Das.Linux at gmx.de> wrote:
>
>>
>> On 21.12.2019, at 17:36, Thiede, Christoph <Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:
>>
>> Hi Tobias,
>>
>> what do you mean in detail?
>>
>> If I create the class via System Browser and add the method, my change file ends with:
>>
>> Object subclass: #CTTéstClass
>> instanceVariableNames: ''
>> classVariableNames: ''
>> poolDictionaries: ''
>> category: 'CT-Experiments'!
>> !CTTéstClass methodsFor: 'no messages' stamp: 'ct 12/21/2019 17:18'!
>> foo! !
>
>
> Good. that was what I thought was important.
>
>
>>
>> However, CompiledMethod >> #timeStamp returns ''.
>
> What is the result of the following?
>
>        (CTTéstClass compiledMethodAt: #foo) preamble
>
>
>>
>> Here is a snapshot of the #timeStamp stackframe:
>>
>> [inline image scrubbed from the archive]
>>
>> Please note that "tokens at: tokenCount" returns the correct timestamp, but stamp is nil. What is this???
>
>
> I see what the problem is. The .changes file is apparently written as UTF-8 but read as Latin-1.
> This is BAD.

Oh, and we were warned:

CompiledMethod
getPreambleFrom: aFileStream at: endPosition
        "This method is an ugly hack. This method assumes that source files have ASCII-compatible encoding and that preambles contain no non-ASCII characters."

        | chunkSize chunk |
        chunkSize := 160 min: endPosition.
        [
                | index |
                chunk := aFileStream
                        position: (endPosition - chunkSize + 1 max: 0);
                        basicNext: chunkSize.
                (index := chunk lastIndexOf: $! startingAt: chunk size) ~= 0 ifTrue: [
                        ^chunk copyFrom: index + 1 to: chunk size ].
                chunkSize := chunkSize * 2.
                chunkSize <= endPosition ] whileTrue.
        ^chunk


I have the feeling that the problematic send is #basicNext: in line 10 or so. This seems to circumvent the conversion done by MultiByteFileStream.
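
To make the suspected mismatch concrete outside the image, here is a minimal Python sketch. It is not Squeak code. It assumes the preamble bytes on disk are UTF-8 and that a raw byte read (as #basicNext: appears to do) hands them over without decoding, which amounts to reading them as Latin-1. The names `preamble`, `on_disk`, `raw_read`, and `decoded_read` are made up for illustration.

```python
# Sketch only: models the suspected UTF-8 / Latin-1 mismatch, not Squeak's streams.
preamble = "CTTéstClass methodsFor: 'no messages' stamp: 'ct 12/21/2019 17:18'"

# Assumption: the .changes file stores the preamble UTF-8 encoded.
on_disk = preamble.encode("utf-8")

# Reading the bytes without conversion amounts to treating each byte as one
# Latin-1 character, so the two bytes of "é" (0xC3 0xA9) become "Ã" and "©".
raw_read = on_disk.decode("latin-1")

# Decoding the same bytes as UTF-8 restores the original text.
decoded_read = on_disk.decode("utf-8")

print(raw_read.split()[0])      # class name as seen by a raw read
print(decoded_read.split()[0])  # class name after proper decoding
```

Re-encoding the misread string as Latin-1 and decoding it as UTF-8 also recovers the name, which is roughly the effect one would expect from calling #utf8ToSqueak on the preamble.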

Best regards
        -Tobias

>
> You end up with 7 tokens, because you have three for the class name instead of one. The é is stored as the two UTF-8 bytes C3 A9; read as Latin-1, these become Ã and the copyright symbol ©. The copyright symbol is classified as a binary selector and thus separates the first part of the class name from the second part. This happens only because of the UTF-8 vs. Latin-1 mismatch.
>
> But the code path for 7-element tokens is different, and it looks for the #stamp: at a different position.
>
> Hence stamp is nil.
>
> A wrong but easy fix would be to call #utf8ToSqueak on the preamble.
>
> Best regards
>        -Tobias
>
>
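
The token count described in the quoted message above can be reproduced with a toy tokenizer. This is a Python sketch with a hypothetical `tokens` helper built on a simple regular expression. It is not Squeak's Scanner and only mimics the one property that matters here: a letter such as Ã continues an identifier, while © does not.

```python
import re

# Hypothetical stand-in for a Smalltalk scanner: a quoted string, an identifier
# or keyword (word characters plus an optional colon), or any other single
# non-space character as its own token.
TOKEN = re.compile(r"'[^']*'|\w+:?|\S")

def tokens(source):
    """Split a preamble-like string into a flat list of tokens."""
    return TOKEN.findall(source)

preamble = "CTTéstClass methodsFor: 'no messages' stamp: 'ct 12/21/2019 17:18'"
misread = preamble.encode("utf-8").decode("latin-1")  # UTF-8 bytes read as Latin-1

good = tokens(preamble)
bad = tokens(misread)

print(len(good), good[:1])  # token count and class name when decoded correctly
print(len(bad), bad[:3])    # token count and class name pieces when misread
```

With correct decoding the class name is a single token. When misread, it falls apart into three, which accounts for the two extra tokens.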
>>
>> I'm not sure if I understand you correctly, but if you told me to search the hex of my change file for a "zero word", the only occurrence I could find is:
>>
>> [inline image scrubbed from the archive]
>>
>> Which led me to this:
>>
>> [inline image scrubbed from the archive]
>>
>> Does not seem related, but still looks somehow wrong ^^
>>
>> Best,
>> Christoph
>>
>> From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Tobias Pape <Das.Linux at gmx.de>
>> Sent: Saturday, December 21, 2019, 15:44
>> To: The general-purpose Squeak developers list
>> Subject: Re: [squeak-dev] [BUG] Timestamps don't work for classes with special character names
>>
>>
>>> On 21.12.2019, at 15:16, Thiede, Christoph <Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:
>>>
>>> Hi all, I just found another bug. If you get tired of them, just tell me :-)
>>>
>>> Steps to reproduce:
>>> Print it:
>>> class := Object subclass: #CTTèstClass "sic (with accent in name)!"
>>> instanceVariableNames: ''
>>> classVariableNames: ''
>>> poolDictionaries: ''
>>> category: 'CT-Experiments'.
>>> class compile: 'foo ^ #foo'.
>>> (class >> #foo) timeStamp
>>>
>>> Expected output:
>>> Something like 'ct 12/21/2019 15:13'.
>>>
>>> Actual output:
>>> ''.
>>>
>>> Please note that everything would have worked fine if we had named the class #CTTestClass (without accent) instead.
>>>
>>> Do we want to support special class names in general? If yes, this is a bug in my opinion. If no, we should raise an error in the first statement.
>>>
>>> Cause of infection not yet investigated.
>>
>> Please check whether \00 bytes appear at some point in your .changes file.
>>
>> Best regards
>>        -Tobias



-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedImage.png
Type: image/png
Size: 28942 bytes
Desc: pastedImage.png
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20191221/d6fd87bf/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedImage.png
Type: image/png
Size: 37267 bytes
Desc: pastedImage.png
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20191221/d6fd87bf/attachment-0007.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedImage.png
Type: image/png
Size: 63724 bytes
Desc: pastedImage.png
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20191221/d6fd87bf/attachment-0008.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedImage.png
Type: image/png
Size: 230103 bytes
Desc: pastedImage.png
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20191221/d6fd87bf/attachment-0009.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedImage.png
Type: image/png
Size: 22208 bytes
Desc: pastedImage.png
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20191221/d6fd87bf/attachment-0010.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedImage.png
Type: image/png
Size: 233098 bytes
Desc: pastedImage.png
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20191221/d6fd87bf/attachment-0011.png>

