Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Bert Freudenberg bert at freudenbergs.de
Thu Sep 5 13:41:56 UTC 2013


On 2013-09-04, at 21:41, Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com> wrote:

> A rapid review of senders indicates that Character asSymbol seems rather rare in a trunk image.
> 
> I found 
> #initTextConstants
> #inOutdent:delta: (Character tab asSymbol asText which is rather crooked way of avoiding an allocation - String tab would allocate a new Object).

Scanner>>scanToken appears to be the most often used sender, and may well be the original reason for caching the 1-char symbols.

It might be a good idea to time compilation of complex methods with and without OneCharacterSymbols. (my guess is it wouldn't make much of a difference these days)

- Bert -


> 
> 
> 2013/9/4 Levente Uzonyi <leves at elte.hu>
> There's another user of OneCharacterSymbols: Character >> #asSymbol. I don't know how often it's used (my guess is that it's rarely used), or how much slower it would be, if we were removing OneCharacterSymbols. The implementation would change from
> 
>         ^Symbol internCharacter: self
> 
> to
> 
>         ^Symbol intern: self asString
> 
> Which means that an extra String will be created, and the lookup is done in the WeakSets, instead of an Array/WeakArray.
> 
> 
> Levente
> 
> On Wed, 4 Sep 2013, Nicolas Cellier wrote:
> 
> Well, as you were the last modifier, I prefer to have your advice :)
> 
> Nicolas
> 
> 
> 2013/9/4 Levente Uzonyi <leves at elte.hu>
>       On Wed, 4 Sep 2013, commits at source.squeak.org wrote:
> 
>             A new version of Collections was added to project The Inbox:
>             http://source.squeak.org/inbox/Collections-nice.530.mcz
> 
>             ==================== Summary ====================
> 
>             Name: Collections-nice.530
>             Author: nice
>             Time: 4 September 2013, 2:21:28.852 am
>             UUID: d732aa23-1c77-4c1e-abc0-d8d4506b7f9f
>             Ancestors: Collections-ul.529
> 
>             Fix this:
>              self assert: (allSymbols select: [:s | s = '+']) size = 1.
> 
>             1) It is un-necessary to enumerate the 1-char symbols, they already are in SymbolTable (the first time they were created via #findIntern: or by virtue of last #rehash) !
> 
>             2) We can use a WeekArray rather than an Array because there is no point in keeping a reference to unused 1-char symbols.
> 
>             Maybe we should get rid of OneCharacterSymbols class var, but I didn't dare...
> 
> 
> I say we should nuke it. The only user is Symbol class >> #selectorsContaining:, but that method is pretty much broken:
> - it uses OneCharacterSymbols only for infix selectors
> - it can't find infix selectors longer than 2 characters
> - it tries to optimize some stuff, but wastes cycles on other stuff
> 
> 
> Levente
> 
> 
>       =============== Diff against Collections-ul.529 ===============
> 
>       Item was changed:
>        ----- Method: Symbol class>>allSymbols (in category 'access') -----
>        allSymbols
>               "Answer all interned symbols"
>               ^Array streamContents:[:s|
>                       s nextPutAll: NewSymbols.
>       -               s nextPutAll: OneCharacterSymbols.
>                       s nextPutAll: SymbolTable.
>               ].
>        !
> 
>       Item was changed:
>        ----- Method: Symbol class>>initialize (in category 'class initialization') -----
>        initialize
> 
>               "Symbol initialize"
> 
>               Symbol rehash.
>       +       OneCharacterSymbols := WeakArray new: 256.
>       +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
>       -       OneCharacterSymbols := nil.
>       -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1) asCharacter asSymbol].
>               Smalltalk addToShutDownList: self.
>        !
> 
>       Item was changed:
>        ----- Method: Symbol class>>internCharacter: (in category 'instance creation') -----
>        internCharacter: aCharacter
>               aCharacter asciiValue > 256 ifTrue:[^self intern: aCharacter asString].
>       +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1) ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self intern: aCharacter asString)]
>       -       OneCharacterSymbols ifNil: [^self intern: aCharacter asString].
>       -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
>        !
> 
>       Item was changed:
>        ----- Method: Symbol class>>selectorsContaining: (in category 'access') -----
>        selectorsContaining: aString
>               "Answer a list of selectors that contain aString within them. Case-insensitive.  Does return symbols that begin with a capital letter."
> 
>               | size selectorList ascii |
> 
>               selectorList := OrderedCollection new.
>               (size := aString size) = 0 ifTrue: [^selectorList].
> 
>               aString size = 1 ifTrue:
>                       [
>                               ascii := aString first asciiValue.
>       +                       ascii < 128 ifTrue: [(OneCharacterSymbols at: ascii+1) ifNotNil: [:s | selectorList add: s]]
>       -                       ascii < 128 ifTrue: [selectorList add: (OneCharacterSymbols at: ascii+1)]
>                       ].
> 
>               (aString first isLetter or: [aString first isDigit]) ifFalse:
>                       [
>                               aString size = 2 ifTrue:
>                                       [Symbol hasInterned: aString ifTrue:
>                                               [:s | selectorList add: s]].
>                               ^selectorList
>                       ].
> 
>               selectorList := selectorList copyFrom: 2 to: selectorList size.
> 
>               self allSymbolTablesDo: [:each |
>                       each size >= size ifTrue:
>                               [(each findSubstring: aString in: each startingAt: 1
>                                       matchTable: CaseInsensitiveOrder) > 0
>                                                       ifTrue: [selectorList add: each]]].
> 
>               ^selectorList reject: [:each | "reject non-selectors, but keep ones that begin with an uppercase"
>                       each numArgs < 0 and: [each asString withFirstCharacterDownshifted numArgs < 0]].
> 
>        "Symbol selectorsContaining: 'scon'"!
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 



-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20130905/9e59f8fb/attachment.htm


More information about the Squeak-dev mailing list