[squeak-dev] Daily Commit Log

commits at source.squeak.org
Thu Feb 9 23:55:06 UTC 2012


Changes to Trunk (http://source.squeak.org/trunk.html) in the last 24 hours:

http://lists.squeakfoundation.org/pipermail/packages/2012-February/005186.html

Name: Compiler-nice.223
Ancestors: Compiler-nice.222

Fix a bug (see http://code.google.com/p/pharo/issues/detail?id=4650). The following now holds:

self should: [Compiler evaluate: '$'] raise: Error.

I also removed the redundant guards of the form (aheadChar == DoItCharacter and: [source atEnd]) and replaced them with aheadChar == DoItCharacter.
Indeed, these were ugly, incomplete guards trying to distinguish a DoItCharacter marking the end of the stream from a (Character value: 30 "ASCII character RS means Record Separator") encountered in the source...
But they would not even handle the case where DoItCharacter is the last source character, except maybe through the ugliest contortions in xDigit...
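
To make the ambiguity concrete, here is a minimal workspace sketch. It is not the actual Scanner code: doIt is a local stand-in for the old DoItCharacter (assumed here to be Character value: 30), and the stream plays the role of the scanner's source.

```smalltalk
| doIt source aheadChar guardFires |
"Assumption: the old end-of-stream marker was the RS character (code 30)."
doIt := Character value: 30.
"A source whose last real character happens to be RS."
source := ReadStream on: (String with: $a with: doIt).
source next.                "consume $a"
aheadChar := source next.   "a genuine RS read from the source"
"The old-style guard: marker seen and nothing left to read."
guardFires := aheadChar == doIt and: [source atEnd].
Transcript show: guardFires printString; cr.
```

The guard fires even though the RS was a real source character, so the extra atEnd test did not remove the ambiguity in this case.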

Instead, I replaced the DoItCharacter marking the end of the stream with a character that we should never encounter in source.
I have chosen 16r10FFFF, which is the last Unicode code point and will never be used to encode a character (like all code points ending in FFFE and FFFF).
A different strategy would be to use a value greater than the last Unicode code point, like 16r110000, which would also work...
Or use a different Object. In this latter case, the object would have to understand charCode or we would have to change more Scanner methods (at least typeTableAt:).
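
The arithmetic behind these candidate values can be checked in a workspace. This is only an illustrative sketch; lastCodePoint and endsInFFFEorFFFF are hypothetical local names, not part of the Scanner.

```smalltalk
| lastCodePoint endsInFFFEorFFFF chosenIsReserved alternativeIsBeyond |
lastCodePoint := 16r10FFFF.
"True when the low 16 bits of the code are FFFE or FFFF."
endsInFFFEorFFFF := [:code | (code bitAnd: 16rFFFE) = 16rFFFE].
"The chosen marker falls in that reserved family..."
chosenIsReserved := endsInFFFEorFFFF value: lastCodePoint.
"...while the alternative 16r110000 lies past the last code point."
alternativeIsBeyond := 16r110000 > lastCodePoint.
Transcript show: chosenIsReserved printString; cr;
	show: alternativeIsBeyond printString; cr.
```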

Note that with current Character implementation, (Character value: 16r10FFFF) ~~ (Character value: 16r10FFFF).
Since all tests are written as identity tests (aheadChar or hereChar == DoItCharacter), even if such a Character were encountered in source, it would not be interpreted as an end-of-stream mark. Thus any Character code > 255 could have been used, but this would be more fragile.
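
The difference between equality and identity can be tried in a workspace with the sketch below. The two characters compare equal by value; whether the identity test answers false depends on the Character implementation of the image, and the claim above is about the implementation current at the time of this commit.

```smalltalk
| a b sameValue sameObject |
a := Character value: 16r10FFFF.
b := Character value: 16r10FFFF.
sameValue := a = b.     "value comparison"
sameObject := a == b.   "identity comparison, implementation dependent"
Transcript show: sameValue printString; cr;
	show: sameObject printString; cr.
```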

Consequently, I also modified the character type table to not interpret (Character value: 30) as end of source (#doIt).
I think that it was previously possible to insert a (Character value: 30) in source, and everything after it would have been ignored (if not between quotes), so that part could potentially store meta information. But this was both undocumented and AFAIK unused. It's easy to go back if we want to, by restoring the previous version of #initializeTypeTable.

While doing this, I noticed that (Character value: 30) will now be interpreted as binary, like many other characters including invisible control characters...
It could thus be used in a binary selector! That's crazy and I suggest using xIllegal in the Character typeTable, since I had previously prepared this method...
But one change at a time...

=============================================

