Nicolas Cellier uploaded a new version of Collections to project The Trunk: http://source.squeak.org/trunk/Collections-nice.775.mcz
==================== Summary ====================
Name: Collections-nice.775 Author: nice Time: 1 December 2017, 1:25:19.882602 am UUID: 5ef7a80c-6213-4e8e-8f0c-b45e110ce15e Ancestors: Collections-nice.774
Rename CharacterSet -> ByteCharacterSet
This is step 1: - create a parallel ByteCharacterSet - then mutate CharacterSet instances -> ByteCharacterSet in postscript
=============== Diff against Collections-nice.774 ===============
Item was added: + ----- Method: AbstractCharacterSet>>species (in category 'private') ----- + species + ^CharacterSet!
Item was added: + Collection subclass: #ByteCharacterSet + instanceVariableNames: 'byteArrayMap tally' + classVariableNames: 'CrLf NonSeparators Separators' + poolDictionaries: '' + category: 'Collections-Support'! + + !ByteCharacterSet commentStamp: '<historical>' prior: 0! + A set of characters. Lookups for inclusion are very fast.!
Item was added: + ----- Method: ByteCharacterSet class>>allCharacters (in category 'instance creation') ----- + allCharacters + "return a set containing all characters" + + | set | + set := self empty. + 0 to: 255 do: [ :ascii | set add: (Character value: ascii) ]. + ^set!
Item was added: + ----- Method: ByteCharacterSet class>>cleanUp: (in category 'initialize-release') ----- + cleanUp: aggressive + + CrLf := NonSeparators := Separators := nil!
Item was added: + ----- Method: ByteCharacterSet class>>crlf (in category 'accessing') ----- + crlf + + ^CrLf ifNil: [ CrLf := self with: Character cr with: Character lf ]!
Item was added: + ----- Method: ByteCharacterSet class>>empty (in category 'instance creation') ----- + empty + "return an empty set of characters" + ^self new!
Item was added: + ----- Method: ByteCharacterSet class>>newFrom: (in category 'instance creation') ----- + newFrom: aCollection + | newCollection | + newCollection := self new. + newCollection addAll: aCollection. + ^newCollection!
Item was added: + ----- Method: ByteCharacterSet class>>nonSeparators (in category 'accessing') ----- + nonSeparators + "return a set containing everything but the whitespace characters" + + ^NonSeparators ifNil: [ + NonSeparators := self separators complement ]!
Item was added: + ----- Method: ByteCharacterSet class>>separators (in category 'accessing') ----- + separators + "return a set containing just the whitespace characters" + + ^Separators ifNil: [ Separators := self newFrom: Character separators ]!
Item was added: + ----- Method: ByteCharacterSet class>>withAll: (in category 'instance creation') ----- + withAll: aCollection + "Create a new ByteCharacterSet containing all the characters from aCollection." + + ^self newFrom: aCollection!
Item was added: + ----- Method: ByteCharacterSet>>= (in category 'comparing') ----- + = anObject + + self species == anObject species ifFalse: [ ^false ]. + anObject size = tally ifFalse: [ ^false ]. + ^self byteArrayMap = anObject byteArrayMap!
Item was added: + ----- Method: ByteCharacterSet>>add: (in category 'adding') ----- + add: aCharacter + "I automatically become a WideByteCharacterSet if you add a wide character to myself" + + | index | + (index := aCharacter asInteger + 1) <= 256 ifFalse: [ + | wide | + wide := WideCharacterSet new. + wide addAll: self. + wide add: aCharacter. + self becomeForward: wide. + ^aCharacter ]. + (byteArrayMap at: index) = 1 ifFalse: [ + byteArrayMap at: index put: 1. + tally := tally + 1 ]. + ^aCharacter!
Item was added: + ----- Method: ByteCharacterSet>>asString (in category 'conversion') ----- + asString + "Convert the receiver into a String" + + ^String new: self size streamContents:[:s| + self do:[:ch| s nextPut: ch]. + ].!
Item was added: + ----- Method: ByteCharacterSet>>byteArrayMap (in category 'private') ----- + byteArrayMap + "return a ByteArray mapping each ascii value to a 1 if that ascii value is in the set, and a 0 if it isn't. Intended for use by primitives only" + ^byteArrayMap!
Item was added: + ----- Method: ByteCharacterSet>>byteComplement (in category 'conversion') ----- + byteComplement + "return a character set containing precisely the single byte characters the receiver does not" + + | set | + set := ByteCharacterSet allCharacters. + self do: [ :c | set remove: c ]. + ^set!
Item was added: + ----- Method: ByteCharacterSet>>complement (in category 'conversion') ----- + complement + "return a character set containing precisely the characters the receiver does not" + + ^ByteCharacterSetComplement of: self copy!
Item was added: + ----- Method: ByteCharacterSet>>do: (in category 'enumerating') ----- + do: aBlock + "evaluate aBlock with each character in the set" + + | index | + tally >= 128 ifTrue: [ "dense" + index := 0. + [ (index := index + 1) <= 256 ] whileTrue: [ + (byteArrayMap at: index) = 1 ifTrue: [ + aBlock value: (Character value: index - 1) ] ]. + ^self ]. + "sparse" + index := 0. + [ (index := byteArrayMap indexOf: 1 startingAt: index + 1) = 0 ] whileFalse: [ + aBlock value: (Character value: index - 1) ]. + !
Item was added: + ----- Method: ByteCharacterSet>>findFirstInByteString:startingAt: (in category 'zap me later') ----- + findFirstInByteString: aByteString startingAt: startIndex + "Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver." + ^ByteString + findFirstInString: aByteString + inSet: self byteArrayMap + startingAt: startIndex!
Item was added: + ----- Method: ByteCharacterSet>>hasWideCharacters (in category 'testing') ----- + hasWideCharacters + ^false!
Item was added: + ----- Method: ByteCharacterSet>>hash (in category 'comparing') ----- + hash + ^self byteArrayMap hash!
Item was added: + ----- Method: ByteCharacterSet>>includes: (in category 'testing') ----- + includes: anObject + + | index | + anObject isCharacter ifFalse: [ ^false ]. + (index := anObject asInteger + 1) > 256 ifTrue: [ ^false ]. + ^(byteArrayMap at: index) > 0!
Item was added: + ----- Method: ByteCharacterSet>>initialize (in category 'private') ----- + initialize + + byteArrayMap := ByteArray new: 256. + tally := 0!
Item was added: + ----- Method: ByteCharacterSet>>isEmpty (in category 'testing') ----- + isEmpty + ^tally = 0!
Item was added: + ----- Method: ByteCharacterSet>>occurrencesOf: (in category 'zap me later') ----- + occurrencesOf: anObject + "Answer how many of the receiver's elements are equal to anObject. Optimized version." + + (self includes: anObject) ifTrue: [ ^1 ]. + ^0!
Item was added: + ----- Method: ByteCharacterSet>>postCopy (in category 'copying') ----- + postCopy + super postCopy. + byteArrayMap := byteArrayMap copy!
Item was added: + ----- Method: ByteCharacterSet>>remove: (in category 'removing') ----- + remove: aCharacter + + ^self remove: aCharacter ifAbsent: aCharacter!
Item was added: + ----- Method: ByteCharacterSet>>remove:ifAbsent: (in category 'removing') ----- + remove: aCharacter ifAbsent: aBlock + + | index | + (index := aCharacter asciiValue + 1) <= 256 ifFalse: [ ^aBlock value ]. + (byteArrayMap at: index) = 0 ifTrue: [ ^aBlock value ]. + byteArrayMap at: index put: 0. + tally := tally - 1. + ^aCharacter!
Item was added: + ----- Method: ByteCharacterSet>>removeAll (in category 'removing') ----- + removeAll + + byteArrayMap atAllPut: 0. + tally := 0!
Item was added: + ----- Method: ByteCharacterSet>>size (in category 'accessing') ----- + size + + ^tally!
Item was added: + ----- Method: ByteCharacterSet>>species (in category 'zap me later') ----- + species + ^CharacterSet!
Item was added: + ----- Method: ByteCharacterSet>>union: (in category 'enumerating') ----- + union: aCollection + (self species = aCollection species or: [aCollection isString or: [aCollection allSatisfy: [:e | e isCharacter]]]) ifFalse: [^super union: aCollection]. + (self species = aCollection species and: [self class ~= aCollection class]) ifTrue: [^aCollection union: self]. + ^self copy addAll: aCollection; yourself!
Item was added: + ----- Method: ByteCharacterSet>>wideCharacterMap (in category 'private') ----- + wideCharacterMap + "used for comparing with WideByteCharacterSet" + + | wide | + wide := WideByteCharacterSet new. + wide addAll: self. + ^wide wideCharacterMap!
Item was changed: ----- Method: CharacterSet class>>crlf (in category 'accessing') ----- crlf
+ ^CrLf ifNil: [ CrLf := ByteCharacterSet with: Character cr with: Character lf ]! - ^CrLf ifNil: [ CrLf := self with: Character cr with: Character lf ]!
Item was changed: ----- Method: CharacterSet class>>empty (in category 'instance creation') ----- empty "return an empty set of characters" + ^ByteCharacterSet new! - ^self new!
Item was changed: ----- Method: CharacterSet class>>newFrom: (in category 'instance creation') ----- newFrom: aCollection | newCollection | + newCollection := ByteCharacterSet new. - newCollection := self new. newCollection addAll: aCollection. ^newCollection!
Item was removed: - ----- Method: CharacterSetComplement>>findFirstInByteString:startingAt: (in category 'enumerating') ----- - findFirstInByteString: aByteString startingAt: startIndex - "Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver." - ^ByteString - findFirstInString: aByteString - inSet: self byteArrayMap - startingAt: startIndex!
Item was removed: - ----- Method: WideCharacterSet>>findFirstInByteString:startingAt: (in category 'enumerating') ----- - findFirstInByteString: aByteString startingAt: startIndex - "Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver." - - ^ByteString - findFirstInString: aByteString - inSet: byteArrayMap - startingAt: startIndex!
Item was removed: - ----- Method: WideCharacterSet>>species (in category 'comparing') ----- - species - ^self hasWideCharacters - ifTrue: [WideCharacterSet] - ifFalse: [CharacterSet]!
Item was changed: + (PackageInfo named: 'Collections') postscript: 'CharacterSet allInstancesDo: [:e | ByteCharacterSet adoptInstance: e ]'! - (PackageInfo named: 'Collections') postscript: 'CharacterSet allInstancesDo: #size'!
Sorry for introducing a few Undeclared at this stage... They should disappear at next update.
2017-12-01 1:25 GMT+01:00 commits@source.squeak.org:
Nicolas Cellier uploaded a new version of Collections to project The Trunk: http://source.squeak.org/trunk/Collections-nice.775.mcz
==================== Summary ====================
Name: Collections-nice.775 Author: nice Time: 1 December 2017, 1:25:19.882602 am UUID: 5ef7a80c-6213-4e8e-8f0c-b45e110ce15e Ancestors: Collections-nice.774
Rename CharacterSet -> ByteCharacterSet
This is step 1:
- create a parallel ByteCharacterSet
- then mutate CharacterSet instances -> ByteCharacterSet in postscript
=============== Diff against Collections-nice.774 ===============
Item was added:
- ----- Method: AbstractCharacterSet>>species (in category 'private') -----
- species
^CharacterSet!
Item was added:
- Collection subclass: #ByteCharacterSet
instanceVariableNames: 'byteArrayMap tally'
classVariableNames: 'CrLf NonSeparators Separators'
poolDictionaries: ''
category: 'Collections-Support'!
- !ByteCharacterSet commentStamp: '<historical>' prior: 0!
- A set of characters. Lookups for inclusion are very fast.!
Item was added:
- ----- Method: ByteCharacterSet class>>allCharacters (in category
'instance creation') -----
- allCharacters
"return a set containing all characters"
| set |
set := self empty.
0 to: 255 do: [ :ascii | set add: (Character value: ascii) ].
^set!
Item was added:
- ----- Method: ByteCharacterSet class>>cleanUp: (in category
'initialize-release') -----
- cleanUp: aggressive
CrLf := NonSeparators := Separators := nil!
Item was added:
- ----- Method: ByteCharacterSet class>>crlf (in category 'accessing')
- crlf
^CrLf ifNil: [ CrLf := self with: Character cr with: Character lf
]!
Item was added:
- ----- Method: ByteCharacterSet class>>empty (in category 'instance
creation') -----
- empty
"return an empty set of characters"
^self new!
Item was added:
- ----- Method: ByteCharacterSet class>>newFrom: (in category 'instance
creation') -----
- newFrom: aCollection
| newCollection |
newCollection := self new.
newCollection addAll: aCollection.
^newCollection!
Item was added:
- ----- Method: ByteCharacterSet class>>nonSeparators (in category
'accessing') -----
- nonSeparators
"return a set containing everything but the whitespace characters"
^NonSeparators ifNil: [
NonSeparators := self separators complement ]!
Item was added:
- ----- Method: ByteCharacterSet class>>separators (in category
'accessing') -----
- separators
"return a set containing just the whitespace characters"
^Separators ifNil: [ Separators := self newFrom: Character
separators ]!
Item was added:
- ----- Method: ByteCharacterSet class>>withAll: (in category 'instance
creation') -----
- withAll: aCollection
"Create a new ByteCharacterSet containing all the characters from
aCollection."
^self newFrom: aCollection!
Item was added:
- ----- Method: ByteCharacterSet>>= (in category 'comparing') -----
- = anObject
self species == anObject species ifFalse: [ ^false ].
anObject size = tally ifFalse: [ ^false ].
^self byteArrayMap = anObject byteArrayMap!
Item was added:
- ----- Method: ByteCharacterSet>>add: (in category 'adding') -----
- add: aCharacter
"I automatically become a WideByteCharacterSet if you add a wide
character to myself"
| index |
(index := aCharacter asInteger + 1) <= 256 ifFalse: [
| wide |
wide := WideCharacterSet new.
wide addAll: self.
wide add: aCharacter.
self becomeForward: wide.
^aCharacter ].
(byteArrayMap at: index) = 1 ifFalse: [
byteArrayMap at: index put: 1.
tally := tally + 1 ].
^aCharacter!
Item was added:
- ----- Method: ByteCharacterSet>>asString (in category 'conversion') -----
- asString
"Convert the receiver into a String"
^String new: self size streamContents:[:s|
self do:[:ch| s nextPut: ch].
].!
Item was added:
- ----- Method: ByteCharacterSet>>byteArrayMap (in category 'private')
- byteArrayMap
"return a ByteArray mapping each ascii value to a 1 if that ascii
value is in the set, and a 0 if it isn't. Intended for use by primitives only"
^byteArrayMap!
Item was added:
- ----- Method: ByteCharacterSet>>byteComplement (in category
'conversion') -----
- byteComplement
"return a character set containing precisely the single byte
characters the receiver does not"
| set |
set := ByteCharacterSet allCharacters.
self do: [ :c | set remove: c ].
^set!
Item was added:
- ----- Method: ByteCharacterSet>>complement (in category 'conversion')
- complement
"return a character set containing precisely the characters the
receiver does not"
^ByteCharacterSetComplement of: self copy!
Item was added:
- ----- Method: ByteCharacterSet>>do: (in category 'enumerating') -----
- do: aBlock
"evaluate aBlock with each character in the set"
| index |
tally >= 128 ifTrue: [ "dense"
index := 0.
[ (index := index + 1) <= 256 ] whileTrue: [
(byteArrayMap at: index) = 1 ifTrue: [
aBlock value: (Character value: index - 1)
] ].
^self ].
"sparse"
index := 0.
[ (index := byteArrayMap indexOf: 1 startingAt: index + 1) = 0 ]
whileFalse: [
aBlock value: (Character value: index - 1) ].
!
Item was added:
- ----- Method: ByteCharacterSet>>findFirstInByteString:startingAt: (in
category 'zap me later') -----
- findFirstInByteString: aByteString startingAt: startIndex
"Double dispatching: since we know this is a ByteString, we can
use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
^ByteString
findFirstInString: aByteString
inSet: self byteArrayMap
startingAt: startIndex!
Item was added:
- ----- Method: ByteCharacterSet>>hasWideCharacters (in category
'testing') -----
- hasWideCharacters
^false!
Item was added:
- ----- Method: ByteCharacterSet>>hash (in category 'comparing') -----
- hash
^self byteArrayMap hash!
Item was added:
- ----- Method: ByteCharacterSet>>includes: (in category 'testing') -----
- includes: anObject
| index |
anObject isCharacter ifFalse: [ ^false ].
(index := anObject asInteger + 1) > 256 ifTrue: [ ^false ].
^(byteArrayMap at: index) > 0!
Item was added:
- ----- Method: ByteCharacterSet>>initialize (in category 'private') -----
- initialize
byteArrayMap := ByteArray new: 256.
tally := 0!
Item was added:
- ----- Method: ByteCharacterSet>>isEmpty (in category 'testing') -----
- isEmpty
^tally = 0!
Item was added:
- ----- Method: ByteCharacterSet>>occurrencesOf: (in category 'zap me
later') -----
- occurrencesOf: anObject
"Answer how many of the receiver's elements are equal to anObject.
Optimized version."
(self includes: anObject) ifTrue: [ ^1 ].
^0!
Item was added:
- ----- Method: ByteCharacterSet>>postCopy (in category 'copying') -----
- postCopy
super postCopy.
byteArrayMap := byteArrayMap copy!
Item was added:
- ----- Method: ByteCharacterSet>>remove: (in category 'removing') -----
- remove: aCharacter
^self remove: aCharacter ifAbsent: aCharacter!
Item was added:
- ----- Method: ByteCharacterSet>>remove:ifAbsent: (in category
'removing') -----
- remove: aCharacter ifAbsent: aBlock
| index |
(index := aCharacter asciiValue + 1) <= 256 ifFalse: [ ^aBlock
value ].
(byteArrayMap at: index) = 0 ifTrue: [ ^aBlock value ].
byteArrayMap at: index put: 0.
tally := tally - 1.
^aCharacter!
Item was added:
- ----- Method: ByteCharacterSet>>removeAll (in category 'removing') -----
- removeAll
byteArrayMap atAllPut: 0.
tally := 0!
Item was added:
- ----- Method: ByteCharacterSet>>size (in category 'accessing') -----
- size
^tally!
Item was added:
- ----- Method: ByteCharacterSet>>species (in category 'zap me later')
- species
^CharacterSet!
Item was added:
- ----- Method: ByteCharacterSet>>union: (in category 'enumerating') -----
- union: aCollection
(self species = aCollection species or: [aCollection isString or:
[aCollection allSatisfy: [:e | e isCharacter]]]) ifFalse: [^super union: aCollection].
(self species = aCollection species and: [self class ~=
aCollection class]) ifTrue: [^aCollection union: self].
^self copy addAll: aCollection; yourself!
Item was added:
- ----- Method: ByteCharacterSet>>wideCharacterMap (in category
'private') -----
- wideCharacterMap
"used for comparing with WideByteCharacterSet"
| wide |
wide := WideByteCharacterSet new.
wide addAll: self.
^wide wideCharacterMap!
Item was changed: ----- Method: CharacterSet class>>crlf (in category 'accessing') ----- crlf
^CrLf ifNil: [ CrLf := ByteCharacterSet with: Character cr with:
Character lf ]!
^CrLf ifNil: [ CrLf := self with: Character cr with: Character lf
]!
Item was changed: ----- Method: CharacterSet class>>empty (in category 'instance creation') ----- empty "return an empty set of characters"
^ByteCharacterSet new!
^self new!
Item was changed: ----- Method: CharacterSet class>>newFrom: (in category 'instance creation') ----- newFrom: aCollection | newCollection |
newCollection := ByteCharacterSet new.
newCollection := self new. newCollection addAll: aCollection. ^newCollection!
Item was removed:
- ----- Method: CharacterSetComplement>>findFirstInByteString:startingAt:
(in category 'enumerating') -----
- findFirstInByteString: aByteString startingAt: startIndex
"Double dispatching: since we know this is a ByteString, we can
use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
^ByteString
findFirstInString: aByteString
inSet: self byteArrayMap
startingAt: startIndex!
Item was removed:
- ----- Method: WideCharacterSet>>findFirstInByteString:startingAt: (in
category 'enumerating') -----
- findFirstInByteString: aByteString startingAt: startIndex
"Double dispatching: since we know this is a ByteString, we can
use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
^ByteString
findFirstInString: aByteString
inSet: byteArrayMap
startingAt: startIndex!
Item was removed:
- ----- Method: WideCharacterSet>>species (in category 'comparing') -----
- species
^self hasWideCharacters
ifTrue: [WideCharacterSet]
ifFalse: [CharacterSet]!
Item was changed:
- (PackageInfo named: 'Collections') postscript: 'CharacterSet
allInstancesDo: [:e | ByteCharacterSet adoptInstance: e ]'!
- (PackageInfo named: 'Collections') postscript: 'CharacterSet
allInstancesDo: #size'!
squeak-dev@lists.squeakfoundation.org