Removing OneCharacterSymbols (was: Re: [squeak-dev] The Inbox: Collections-nice.530.mcz)

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Thu Sep 5 23:58:20 UTC 2013


Thanks Bert, I just published a Compiler version that does not...


2013/9/5 Bert Freudenberg <bert at freudenbergs.de>

>
> On 2013-09-04, at 21:41, Nicolas Cellier <
> nicolas.cellier.aka.nice at gmail.com> wrote:
>
> A rapid review of senders indicates that Character asSymbol seems rather
> rare in a trunk image.
>
> I found
> #initTextConstants
> #inOutdent:delta: (Character tab asSymbol asText which is rather crooked
> way of avoiding an allocation - String tab would allocate a new Object).
>
>
> Scanner>>scanToken appears to be the most often used sender, and may well
> be the original reason for caching the 1-char symbols.
>
> It might be a good idea to time compilation of complex methods with and
> without OneCharacterSymbols. (my guess is it wouldn't make much of a
> difference these days)
>
> - Bert -
>
>
>
>
> 2013/9/4 Levente Uzonyi <leves at elte.hu>
>
>> There's another user of OneCharacterSymbols: Character >> #asSymbol. I
>> don't know how often it's used (my guess is that it's rarely used), or how
>> much slower it would be, if we were removing OneCharacterSymbols. The
>> implementation would change from
>>
>>         ^Symbol internCharacter: self
>>
>> to
>>
>>         ^Symbol intern: self asString
>>
>> Which means that an extra String will be created, and the lookup is done
>> in the WeakSets, instead of an Array/WeakArray.
>>
>>
>> Levente
>>
>> On Wed, 4 Sep 2013, Nicolas Cellier wrote:
>>
>>  Well, as you were the last modifier, I prefer to have your advice :)
>>>
>>> Nicolas
>>>
>>>
>>> 2013/9/4 Levente Uzonyi <leves at elte.hu>
>>>       On Wed, 4 Sep 2013, commits at source.squeak.org wrote:
>>>
>>>             A new version of Collections was added to project The Inbox:
>>>             http://source.squeak.org/**inbox/Collections-nice.530.mcz<http://source.squeak.org/inbox/Collections-nice.530.mcz>
>>>
>>>             ==================== Summary ====================
>>>
>>>             Name: Collections-nice.530
>>>             Author: nice
>>>             Time: 4 September 2013, 2:21:28.852 am
>>>             UUID: d732aa23-1c77-4c1e-abc0-**d8d4506b7f9f
>>>             Ancestors: Collections-ul.529
>>>
>>>             Fix this:
>>>              self assert: (allSymbols select: [:s | s = '+']) size = 1.
>>>
>>>             1) It is un-necessary to enumerate the 1-char symbols, they
>>> already are in SymbolTable (the first time they were created via
>>> #findIntern: or by virtue of last #rehash) !
>>>
>>>             2) We can use a WeekArray rather than an Array because there
>>> is no point in keeping a reference to unused 1-char symbols.
>>>
>>>             Maybe we should get rid of OneCharacterSymbols class var,
>>> but I didn't dare...
>>>
>>>
>>> I say we should nuke it. The only user is Symbol class >>
>>> #selectorsContaining:, but that method is pretty much broken:
>>> - it uses OneCharacterSymbols only for infix selectors
>>> - it can't find infix selectors longer than 2 characters
>>> - it tries to optimize some stuff, but wastes cycles on other stuff
>>>
>>>
>>> Levente
>>>
>>>
>>>       =============== Diff against Collections-ul.529 ===============
>>>
>>>       Item was changed:
>>>        ----- Method: Symbol class>>allSymbols (in category 'access')
>>> -----
>>>        allSymbols
>>>               "Answer all interned symbols"
>>>               ^Array streamContents:[:s|
>>>                       s nextPutAll: NewSymbols.
>>>       -               s nextPutAll: OneCharacterSymbols.
>>>                       s nextPutAll: SymbolTable.
>>>               ].
>>>        !
>>>
>>>       Item was changed:
>>>        ----- Method: Symbol class>>initialize (in category 'class
>>> initialization') -----
>>>        initialize
>>>
>>>               "Symbol initialize"
>>>
>>>               Symbol rehash.
>>>       +       OneCharacterSymbols := WeakArray new: 256.
>>>       +       (0 to: 255) do: [ :byte | byte asCharacter asSymbol].
>>>       -       OneCharacterSymbols := nil.
>>>       -       OneCharacterSymbols := (1 to: 256) collect: [ :i | (i - 1)
>>> asCharacter asSymbol].
>>>               Smalltalk addToShutDownList: self.
>>>        !
>>>
>>>       Item was changed:
>>>        ----- Method: Symbol class>>internCharacter: (in category
>>> 'instance creation') -----
>>>        internCharacter: aCharacter
>>>               aCharacter asciiValue > 256 ifTrue:[^self intern:
>>> aCharacter asString].
>>>       +       ^(OneCharacterSymbols at: aCharacter asciiValue + 1)
>>> ifNil: [OneCharacterSymbols at: aCharacter asciiValue + 1 put: (self
>>> intern: aCharacter asString)]
>>>       -       OneCharacterSymbols ifNil: [^self intern: aCharacter
>>> asString].
>>>       -       ^OneCharacterSymbols at: aCharacter asciiValue + 1
>>>        !
>>>
>>>       Item was changed:
>>>        ----- Method: Symbol class>>selectorsContaining: (in category
>>> 'access') -----
>>>        selectorsContaining: aString
>>>               "Answer a list of selectors that contain aString within
>>> them. Case-insensitive.  Does return symbols that begin with a capital
>>> letter."
>>>
>>>               | size selectorList ascii |
>>>
>>>               selectorList := OrderedCollection new.
>>>               (size := aString size) = 0 ifTrue: [^selectorList].
>>>
>>>               aString size = 1 ifTrue:
>>>                       [
>>>                               ascii := aString first asciiValue.
>>>       +                       ascii < 128 ifTrue: [(OneCharacterSymbols
>>> at: ascii+1) ifNotNil: [:s | selectorList add: s]]
>>>       -                       ascii < 128 ifTrue: [selectorList add:
>>> (OneCharacterSymbols at: ascii+1)]
>>>                       ].
>>>
>>>               (aString first isLetter or: [aString first isDigit])
>>> ifFalse:
>>>                       [
>>>                               aString size = 2 ifTrue:
>>>                                       [Symbol hasInterned: aString
>>> ifTrue:
>>>                                               [:s | selectorList add:
>>> s]].
>>>                               ^selectorList
>>>                       ].
>>>
>>>               selectorList := selectorList copyFrom: 2 to: selectorList
>>> size.
>>>
>>>               self allSymbolTablesDo: [:each |
>>>                       each size >= size ifTrue:
>>>                               [(each findSubstring: aString in: each
>>> startingAt: 1
>>>                                       matchTable: CaseInsensitiveOrder)
>>> > 0
>>>                                                       ifTrue:
>>> [selectorList add: each]]].
>>>
>>>               ^selectorList reject: [:each | "reject non-selectors, but
>>> keep ones that begin with an uppercase"
>>>                       each numArgs < 0 and: [each asString
>>> withFirstCharacterDownshifted numArgs < 0]].
>>>
>>>        "Symbol selectorsContaining: 'scon'"!
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20130906/1cf99f2d/attachment.htm


More information about the Squeak-dev mailing list