Hi All,
this quick hack (below) is perhaps wrong. Perhaps we should simply accept underscores in selectors period, and not query the preference. If so
    Scanner prefAllowUnderscoreSelectors ifFalse:
        [self class isBytes
            ifTrue: [(self findSubstring: '~' in: self startingAt: 1 matchTable: Tokenish) > 0 ifTrue: [^ -1]]
            ifFalse: [2 to: self size do: [:i | (self at: i) tokenish ifFalse: [^ -1]]]].
should read
    Scanner prefAllowUnderscoreSelectors ifFalse:
        [2 to: self size do: [:i |
            ((self at: i) tokenish or: [(self at: i) == $_]) ifFalse: [^ -1]]].
Instead perhaps tokenish should be redefined to allow underscores? I'd appreciate input from those that have thought about this area carefully (I have not).
On Thu, Sep 29, 2011 at 12:42 PM, commits@source.squeak.org wrote:
Eliot Miranda uploaded a new version of Collections to project The Trunk: http://source.squeak.org/trunk/Collections-eem.460.mcz
==================== Summary ====================
Name: Collections-eem.460
Author: eem
Time: 29 September 2011, 12:42:36.39 pm
UUID: 0b90a58f-9354-4927-91f0-3432359541fe
Ancestors: Collections-ul.459
Fix String>numArgs for prefAllowUnderscoreSelectors regime.
=============== Diff against Collections-ul.459 ===============
Item was changed:
  ----- Method: String>>numArgs (in category 'accessing') -----
  numArgs
  	"Answer either the number of arguments that the receiver would take if considered a selector. Answer -1 if it couldn't be a selector. Note that currently this will answer -1 for anything begining with an uppercase letter even though the system will accept such symbols as selectors. It is intended mostly for the assistance of spelling correction."

  	| firstChar numColons excess start ix |
  	self size = 0 ifTrue: [^ -1].
  	firstChar := self at: 1.
  	(firstChar isLetter or: [firstChar = $:]) ifTrue:
  		["Fast reject if any chars are non-alphanumeric
  		NOTE: fast only for Byte things - Broken for Wide"
+ 		Scanner prefAllowUnderscoreSelectors ifFalse:
+ 			[self class isBytes
+ 				ifTrue: [(self findSubstring: '~' in: self startingAt: 1 matchTable: Tokenish) > 0 ifTrue: [^ -1]]
+ 				ifFalse: [2 to: self size do: [:i | (self at: i) tokenish ifFalse: [^ -1]]]].
- 		self class isBytes
- 			ifTrue: [(self findSubstring: '~' in: self startingAt: 1 matchTable: Tokenish) > 0 ifTrue: [^ -1]]
- 			ifFalse: [2 to: self size do: [:i | (self at: i) tokenish ifFalse: [^ -1]]].
  		"Fast colon count"
  		numColons := 0.
  		start := 1.
  		[(ix := self indexOf: $: startingAt: start) > 0] whileTrue:
  			[numColons := numColons + 1.
  			start := ix + 1].
  		numColons = 0 ifTrue: [^ 0].
  		firstChar = $:
  			ifTrue: [excess := 2 "Has an initial keyword, as #:if:then:else:"]
  			ifFalse: [excess := 0].
  		self last = $:
  			ifTrue: [^ numColons - excess]
  			ifFalse: [^ numColons - excess - 1 "Has a final keywords as #nextPut::andCR"]].
  	firstChar isSpecial ifTrue:
  		[self size = 1 ifTrue: [^ 1].
  		2 to: self size do: [:i | (self at: i) isSpecial ifFalse: [^ -1]].
  		^ 1].
  	^ -1.!
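For orientation, here is a small workspace sketch of what the method in the diff above answers for a few receivers. The values in the comments were traced by hand from the method text in this commit, not captured from a running image, and the last line assumes that $_ is not tokenish (which is what the rest of this thread implies).

```smalltalk
"Workspace sketch; evaluate each line with 'print it'."
#size numArgs.       "0: no colons, so a unary selector"
#at:put: numArgs.    "2: two colons, last character is a colon"
#+ numArgs.          "1: a single special character, so a binary selector"
'' numArgs.          "-1: the empty string cannot be a selector"
'foo_bar' numArgs.   "-1 while the preference is off; 0 once it is on and the tokenish check is skipped"
```

The last line is the case the commit is about: with the preference on, the fast-reject step is bypassed entirely, so nothing after the first character is checked at all.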
On Thu, 29 Sep 2011, Eliot Miranda wrote:
> Hi All,
> this quick hack (below) is perhaps wrong. Perhaps we should simply accept underscores in selectors period, and not query the preference. If so
> Scanner prefAllowUnderscoreSelectors ifFalse: [self class isBytes ifTrue: [(self findSubstring: '~' in: self startingAt: 1 matchTable: Tokenish) > 0 ifTrue: [^ -1]] ifFalse: [2 to: self size do: [:i | (self at: i) tokenish ifFalse: [^ -1]]]].
> should read
> Scanner prefAllowUnderscoreSelectors ifFalse: [2 to: self size do: [:i | ((self at: i) tokenish or: [(self at: i) == $_]) ifFalse: [^ -1]]].
> Instead perhaps tokenish should be redefined to allow underscores? I'd appreciate input from those that have thought about this area carefully (I have not).
I tried to fix the broken support for underscores in selectors in the past, but it's not as easy as it seems at first sight. The biggest problem is the per-class option for them. The current implementation just scratches the surface; if you try to use it, you'll find that it's broken.
What are the problems:
- only instance-side methods can use selectors and class names with underscores in them
- lack of support in the kernel (e.g. String >> #numArgs, Parser >> #parseSelector:, Parser >> #parseParameterNames, their senders, etc)
- the tools lack support for them too, so even if you succeed in writing such code, you can't load your code into another image. This problem is kinda impossible to solve in the current model if you have a class with an underscore in its name.
So IMHO we should just drop support for the per-class option for underscores in selectors, keep the global preference, and maybe enable it by default in the long term*. Not because we're about to use such selectors/class names in the core image, but to ease loading of external code.
Back to the original question, I changed #tokenish to return true for $_ when I tried to solve this problem.
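A minimal sketch of what that #tokenish change amounts to, written as workspace blocks so that no system method is modified. The names tokenishWithUnderscore and couldBeSelectorBody are hypothetical, and the stock definition of Character>>tokenish is assumed here to be "letter, digit, or colon"; this is not the code from the actual change.

```smalltalk
| tokenishWithUnderscore couldBeSelectorBody |

"Hypothetical stand-in for a redefined Character>>tokenish:
 letter, digit, colon, or underscore."
tokenishWithUnderscore := [:c |
	c isLetter or: [c isDigit or: [c = $: or: [c = $_]]]].

"Mirrors the per-character loop in numArgs: every character from
 the second one onward has to pass the test."
couldBeSelectorBody := [:s |
	((2 to: s size)
		detect: [:i | (tokenishWithUnderscore value: (s at: i)) not]
		ifNone: [nil]) isNil].

couldBeSelectorBody value: 'foo_bar:'.   "true"
couldBeSelectorBody value: 'foo-bar'.    "false"
```

Note that for byte strings numArgs consults the precomputed Tokenish match table rather than calling #tokenish per character, so a real change would presumably also have to rebuild that table.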
Levente
* For those who are worried about losing _ as an assignment operator: this preference is compatible with underscore assignments as long as you separate the operator from the variable and the operand with whitespace. You can also change the value of the preference in your image if you need support for old-style code.
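To illustrate the footnote, a short sketch of the two spellings, assuming the underscore-selectors preference is on and underscore assignment itself is still enabled. The variable names are made up, and the comments restate the behaviour described above rather than output checked in an image.

```smalltalk
| x foo_bar |
x _ 3.          "whitespace on both sides of _ : read as the assignment x := 3"
foo_bar := 4.   "no whitespace around _ : foo_bar is one identifier"
```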
P.S.: Pharo is not affected by this issue, because they dropped support for underscore assignments and enabled underscores in selectors by default.
On 04.10.2011, at 16:51, Levente Uzonyi wrote:
> I tried to fix the broken support for underscores in selectors in the past, but it's not as easy as it seems at first sight. The biggest problem is the per-class option for them. The current implementation just scratches the surface; if you try to use it, you'll find that it's broken.
> What are the problems:
> - only instance-side methods can use selectors and class names with underscores in them
> - lack of support in the kernel (e.g. String >> #numArgs, Parser >> #parseSelector:, Parser >> #parseParameterNames, their senders, etc)
> - the tools lack support for them too, so even if you succeed in writing such code, you can't load your code into another image. This problem is kinda impossible to solve in the current model if you have a class with an underscore in its name.
> So IMHO we should just drop support for the per-class option for underscores in selectors, keep the global preference, and maybe enable it by default in the long term*. Not because we're about to use such selectors/class names in the core image, but to ease loading of external code.
+1
- Bert -
squeak-dev@lists.squeakfoundation.org