Hi all, I just found another bug. If you get tired of them, just tell me :-)
Steps to reproduce:
Print it:
class := Object subclass: #CTTèstClass "sic (with accent in name)!"
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'CT-Experiments'.
class compile: 'foo ^ #foo'.
(class >> #foo) timeStamp
Expected output:
Something like 'ct 12/21/2019 15:13'.
Actual output:
''.
Please note that everything would have worked fine if we had named the class #CTTestClass (without accent) instead.
Do we want to support special class names in general? If yes, this is a bug in my opinion. If no, we should raise an error in the first statement.
Cause of infection not yet investigated.
Best,
Christoph
On 21.12.2019, at 15:16, Thiede, Christoph Christoph.Thiede@student.hpi.uni-potsdam.de wrote:
> Cause of infection not yet investigated.
Please check your .changes file to see whether \00 bytes appear at some point.
Best regards -Tobias
Hi Tobias,
what do you mean in detail?
If I create the class via System Browser and add the method, my change file ends with:
Object subclass: #CTTéstClass
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'CT-Experiments'!
!CTTéstClass methodsFor: 'no messages' stamp: 'ct 12/21/2019 17:18'!
foo! !
However, CompiledMethod >> #timeStamp returns ''.
Here is a snapshot of the #timeStamp stackframe:
[screenshot]
Please note that "tokens at: tokenCount" returns the correct timestamp; however, stamp is nil. What is this???
I'm not sure if I understand you correctly, but if you told me to search the hex of my change file for a "zero word", the only occurrence I could find is:
[screenshot]
Which led me to this:
[screenshot]
It does not seem related, but still looks somehow wrong ^^
Best,
Christoph
________________________________
From: Squeak-dev squeak-dev-bounces@lists.squeakfoundation.org on behalf of Tobias Pape Das.Linux@gmx.de
Sent: Saturday, 21 December 2019 15:44
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] [BUG] Timestamps don't work for classes with special character names
> Please look at your .changes file whether at some point \00 bytes appear.
Ah OK, the latter was already fixed in Multilingual-nice.249 from the Inbox, never mind :)
On 21.12.2019, at 17:36, Thiede, Christoph Christoph.Thiede@student.hpi.uni-potsdam.de wrote:
> If I create the class via System Browser and add the method, my change file ends with:
Good. That was what I thought was important.
> However, CompiledMethod >> #timeStamp returns ''.
What is the result of the following?
(CTTéstClass compiledMethodAt: #foo) preamble
> Here is a snapshot of the #timeStamp stackframe:
> Please note that "tokens at: tokenCount" returns the correct timestamp, but however, stamp is nil. What is this???
I see what the problem is. The .changes file is apparently written UTF-8 encoded, but read Latin-1 encoded. This is BAD.
You end up with 7 tokens, because you have three for the class name instead of one. This is because the Latin-1 copyright symbol (©, the second byte of the UTF-8 encoded é) is classified as a binary selector character, and thus separates the first part of the class name from the second part. This happens only because of the UTF-8 vs. Latin-1 mismatch.
But the code path for 7-element tokens is different, and it looks for the #stamp: at a different position.
Hence stamp is nil.
A wrong but easy fix would be to call #utf8ToSqueak on the preamble.
Best regards -Tobias
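The diagnosis above can be reproduced outside Squeak. The following is a minimal Python sketch, not Squeak code: split_tokens is a hypothetical stand-in that only separates runs of letters and digits from other non-space characters, which is a deliberate simplification of the real Squeak scanner rules. It shows how the UTF-8 bytes of the class name, when decoded as Latin-1, yield three tokens instead of one, and how re-interpreting the misread string as UTF-8 (roughly the idea behind the #utf8ToSqueak suggestion) restores a single token.

```python
import re

def split_tokens(text):
    """Hypothetical stand-in for a tokenizer: a run of letters/digits is
    one token, any other non-space character is a token of its own."""
    return re.findall(r"[^\W_]+|\S", text)

class_name = "CTT\u00e9stClass"  # CTTéstClass, with é as a single code point

# What ends up in the .changes file: UTF-8 bytes (é becomes 0xC3 0xA9).
raw = class_name.encode("utf-8")

# Reading those bytes as Latin-1 turns é into the two characters Ã and ©.
misread = raw.decode("latin-1")

# © is not a letter, so it splits the class name into three tokens.
tokens_misread = split_tokens(misread)
tokens_correct = split_tokens(raw.decode("utf-8"))

# Re-interpreting the misread string as UTF-8 undoes the damage.
repaired = misread.encode("latin-1").decode("utf-8")

print(tokens_misread, tokens_correct, repaired)
```

With the two extra tokens, a preamble that normally has five tokens (class name, methodsFor:, category, stamp:, timestamp) ends up with seven, which matches the count reported above.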
On 21.12.2019, at 19:11, Tobias Pape Das.Linux@gmx.de wrote:
> I see what the problem is. The .changes file is apparently written UTF-8 coded, but read Latin-1 coded. This is BAD.
Oh, and we were warned:
CompiledMethod getPreambleFrom: aFileStream at: endPosition
    "This method is an ugly hack. This method assumes that source files have ASCII-compatible encoding and that preambles contain no non-ASCII characters."
    | chunkSize chunk |
    chunkSize := 160 min: endPosition.
    [ | index |
        chunk := aFileStream
            position: (endPosition - chunkSize + 1 max: 0);
            basicNext: chunkSize.
        (index := chunk lastIndexOf: $! startingAt: chunk size) ~= 0 ifTrue: [
            ^chunk copyFrom: index + 1 to: chunk size ].
        chunkSize := chunkSize * 2.
        chunkSize <= endPosition ] whileTrue.
    ^chunk
I have the feeling that the problematic send is #basicNext: in line 10 or so. This seems to circumvent the conversion done by MultiByteFileStream.
Best regards -Tobias
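For readers without a Squeak image at hand, the backward search in #getPreambleFrom:at: can be sketched in Python as follows. This is an illustration under assumptions, not the Squeak implementation: the function name get_preamble and its encoding parameter are hypothetical, end_position is treated as an exclusive byte offset (the Squeak method uses different, 1-based arithmetic), and the explicit decode step is only one possible way to avoid the Latin-1 misreading, not necessarily the fix that belongs in Squeak.

```python
import io

def get_preamble(stream, end_position, encoding="utf-8"):
    """Return the text between the last '!' before end_position and
    end_position itself, reading raw bytes backwards in growing chunks."""
    chunk_size = min(160, end_position)
    while True:
        start = max(end_position - chunk_size, 0)
        stream.seek(start)
        chunk = stream.read(end_position - start)  # raw bytes, not decoded yet
        index = chunk.rfind(b"!")
        if index != -1:
            # Decode only the part after the ASCII '!', so the slice
            # always starts on a character boundary.
            return chunk[index + 1:].decode(encoding)
        if start == 0:
            # Reached the beginning of the file without finding a '!'.
            return chunk.decode(encoding)
        chunk_size *= 2

changes = (
    "Object subclass: #CTT\u00e9stClass category: 'CT-Experiments'!\n"
    "!CTT\u00e9stClass methodsFor: 'no messages' stamp: 'ct 12/21/2019 17:18'!\n"
    "foo! !"
).encode("utf-8")

# Byte offset of the '!' that closes the preamble line (exclusive end).
end = changes.index(b"'!\nfoo") + 1
preamble = get_preamble(io.BytesIO(changes), end)
print(preamble)
```

The design point is that the search runs on bytes, where '!' can never be part of a multi-byte UTF-8 sequence, and the conversion to characters happens exactly once, on the extracted slice.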
Hi Tobias, thanks for the pointers!
> (CTTéstClass compiledMethodAt: #foo) preamble
Like you said: [screenshot]
I made the following change: [screenshot] This seems to fix the conversion issues.
Outputs are: [screenshot]
The next problem is the trailing ! for the CTTéstClass preamble. Here, the integer returned by expandedSourceFileArray >> #filePositionFromSourcePointer: is too large by one. I have no idea where these constants come from, but as this is a constant method, I don't see how this calculation could be wrong.
I also tried the following: [screenshot] yielding correctly: [screenshot] But that seems hacky again.
Looking forward to your reply!
Best,
Christoph
On 21.12.2019, at 20:23, Thiede, Christoph Christoph.Thiede@student.hpi.uni-potsdam.de wrote:
> The next problem is the trailing ! for the CTTéstClass preamble. Here, the integer returned by expandedSourceFileArray >> #filePositionFromSourcePointer: is too large by one. I have no idea where these constants come from, but as this is a constant method, I don't see how this calculation could be wrong.
Because of UTF-8: it counts raw bytes, but gets returned in a count of Unicode code points, hence the + 1...
> I also tried the following:
> yielding correctly:
Seems lucky..
> But that seems hacky again.
> Looking forward to your reply!
Best regards -Tobias
PS: Maybe copy the code instead of images? It's easier to see things then, for me at least :)
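The off-by-one described above follows directly from how UTF-8 encodes é. Here is a small Python sketch of the effect. It is an illustration only: the variable names are made up, and the sample line is abbreviated from the .changes excerpt quoted earlier in this thread. It shows that a position counted in raw bytes points one character too far once the text is handled as decoded characters.

```python
line = "!CTT\u00e9stClass methodsFor: 'no messages'! foo"  # é written as \u00e9
raw = line.encode("utf-8")

# Position of the closing '!' counted in bytes (as a file position would be)
# and counted in characters (as an index into the decoded string would be).
byte_pos = raw.index(b"!", 1)
char_pos = line.index("!", 1)

# é occupies two bytes in UTF-8 but is a single code point.
offset = byte_pos - char_pos

# Using the byte position as a character index overshoots by that offset,
# so the slice drags the trailing '!' along.
wrong = line[1:byte_pos]
right = line[1:char_pos]
print(offset, repr(wrong), repr(right))
```

Each additional non-ASCII character before the position would widen the gap further, which is why a fixed correction of one only happens to work for this particular class name.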
Hi Tobias, sorry for the long delay!
> PS: maybe copy the code instead of images? its easier to see things then, for me at least :)
Sorry, you're right. Code is bad for showing the diffs, screenshots are bad for editability :( Please find the attachment.
Best, Christoph
Is anyone willing to look into this? I have been testing this for the last few months and have not seen any errors from it. :)
-- Sent from: http://forum.world.st/Squeak-Dev-f45488.html