Nicolas Cellier uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-nice.775.mcz
==================== Summary ====================
Name: Collections-nice.775
Author: nice
Time: 1 December 2017, 1:25:19.882602 am
UUID: 5ef7a80c-6213-4e8e-8f0c-b45e110ce15e
Ancestors: Collections-nice.774
Rename CharacterSet -> ByteCharacterSet
This is step 1:
- create a parallel ByteCharacterSet
- then mutate CharacterSet instances -> ByteCharacterSet in postscript
=============== Diff against Collections-nice.774 ===============
Item was added:
+ ----- Method: AbstractCharacterSet>>species (in category 'private') -----
+ species
+ ^CharacterSet!
Item was added:
+ Collection subclass: #ByteCharacterSet
+ instanceVariableNames: 'byteArrayMap tally'
+ classVariableNames: 'CrLf NonSeparators Separators'
+ poolDictionaries: ''
+ category: 'Collections-Support'!
+
+ !ByteCharacterSet commentStamp: '<historical>' prior: 0!
+ A set of characters. Lookups for inclusion are very fast.!
Item was added:
+ ----- Method: ByteCharacterSet class>>allCharacters (in category 'instance creation') -----
+ allCharacters
+ "return a set containing all characters"
+
+ | set |
+ set := self empty.
+ 0 to: 255 do: [ :ascii | set add: (Character value: ascii) ].
+ ^set!
Item was added:
+ ----- Method: ByteCharacterSet class>>cleanUp: (in category 'initialize-release') -----
+ cleanUp: aggressive
+
+ CrLf := NonSeparators := Separators := nil!
Item was added:
+ ----- Method: ByteCharacterSet class>>crlf (in category 'accessing') -----
+ crlf
+
+ ^CrLf ifNil: [ CrLf := self with: Character cr with: Character lf ]!
Item was added:
+ ----- Method: ByteCharacterSet class>>empty (in category 'instance creation') -----
+ empty
+ "return an empty set of characters"
+ ^self new!
Item was added:
+ ----- Method: ByteCharacterSet class>>newFrom: (in category 'instance creation') -----
+ newFrom: aCollection
+ | newCollection |
+ newCollection := self new.
+ newCollection addAll: aCollection.
+ ^newCollection!
Item was added:
+ ----- Method: ByteCharacterSet class>>nonSeparators (in category 'accessing') -----
+ nonSeparators
+ "return a set containing everything but the whitespace characters"
+
+ ^NonSeparators ifNil: [
+ NonSeparators := self separators complement ]!
Item was added:
+ ----- Method: ByteCharacterSet class>>separators (in category 'accessing') -----
+ separators
+ "return a set containing just the whitespace characters"
+
+ ^Separators ifNil: [ Separators := self newFrom: Character separators ]!
Item was added:
+ ----- Method: ByteCharacterSet class>>withAll: (in category 'instance creation') -----
+ withAll: aCollection
+ "Create a new ByteCharacterSet containing all the characters from aCollection."
+
+ ^self newFrom: aCollection!
Item was added:
+ ----- Method: ByteCharacterSet>>= (in category 'comparing') -----
+ = anObject
+
+ self species == anObject species ifFalse: [ ^false ].
+ anObject size = tally ifFalse: [ ^false ].
+ ^self byteArrayMap = anObject byteArrayMap!
Item was added:
+ ----- Method: ByteCharacterSet>>add: (in category 'adding') -----
+ add: aCharacter
+ "I automatically become a WideCharacterSet if you add a wide character to myself"
+
+ | index |
+ (index := aCharacter asInteger + 1) <= 256 ifFalse: [
+ | wide |
+ wide := WideCharacterSet new.
+ wide addAll: self.
+ wide add: aCharacter.
+ self becomeForward: wide.
+ ^aCharacter ].
+ (byteArrayMap at: index) = 1 ifFalse: [
+ byteArrayMap at: index put: 1.
+ tally := tally + 1 ].
+ ^aCharacter!
Item was added:
+ ----- Method: ByteCharacterSet>>asString (in category 'conversion') -----
+ asString
+ "Convert the receiver into a String"
+
+ ^String new: self size streamContents:[:s|
+ self do:[:ch| s nextPut: ch].
+ ].!
Item was added:
+ ----- Method: ByteCharacterSet>>byteArrayMap (in category 'private') -----
+ byteArrayMap
+ "return a ByteArray mapping each ascii value to a 1 if that ascii value is in the set, and a 0 if it isn't. Intended for use by primitives only"
+ ^byteArrayMap!
Item was added:
+ ----- Method: ByteCharacterSet>>byteComplement (in category 'conversion') -----
+ byteComplement
+ "return a character set containing precisely the single byte characters the receiver does not"
+
+ | set |
+ set := ByteCharacterSet allCharacters.
+ self do: [ :c | set remove: c ].
+ ^set!
Item was added:
+ ----- Method: ByteCharacterSet>>complement (in category 'conversion') -----
+ complement
+ "return a character set containing precisely the characters the receiver does not"
+
+ ^CharacterSetComplement of: self copy!
Item was added:
+ ----- Method: ByteCharacterSet>>do: (in category 'enumerating') -----
+ do: aBlock
+ "evaluate aBlock with each character in the set"
+
+ | index |
+ tally >= 128 ifTrue: [ "dense"
+ index := 0.
+ [ (index := index + 1) <= 256 ] whileTrue: [
+ (byteArrayMap at: index) = 1 ifTrue: [
+ aBlock value: (Character value: index - 1) ] ].
+ ^self ].
+ "sparse"
+ index := 0.
+ [ (index := byteArrayMap indexOf: 1 startingAt: index + 1) = 0 ] whileFalse: [
+ aBlock value: (Character value: index - 1) ].
+ !
Item was added:
+ ----- Method: ByteCharacterSet>>findFirstInByteString:startingAt: (in category 'zap me later') -----
+ findFirstInByteString: aByteString startingAt: startIndex
+ "Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
+ ^ByteString
+ findFirstInString: aByteString
+ inSet: self byteArrayMap
+ startingAt: startIndex!
Item was added:
+ ----- Method: ByteCharacterSet>>hasWideCharacters (in category 'testing') -----
+ hasWideCharacters
+ ^false!
Item was added:
+ ----- Method: ByteCharacterSet>>hash (in category 'comparing') -----
+ hash
+ ^self byteArrayMap hash!
Item was added:
+ ----- Method: ByteCharacterSet>>includes: (in category 'testing') -----
+ includes: anObject
+
+ | index |
+ anObject isCharacter ifFalse: [ ^false ].
+ (index := anObject asInteger + 1) > 256 ifTrue: [ ^false ].
+ ^(byteArrayMap at: index) > 0!
Item was added:
+ ----- Method: ByteCharacterSet>>initialize (in category 'private') -----
+ initialize
+
+ byteArrayMap := ByteArray new: 256.
+ tally := 0!
Item was added:
+ ----- Method: ByteCharacterSet>>isEmpty (in category 'testing') -----
+ isEmpty
+ ^tally = 0!
Item was added:
+ ----- Method: ByteCharacterSet>>occurrencesOf: (in category 'zap me later') -----
+ occurrencesOf: anObject
+ "Answer how many of the receiver's elements are equal to anObject. Optimized version."
+
+ (self includes: anObject) ifTrue: [ ^1 ].
+ ^0!
Item was added:
+ ----- Method: ByteCharacterSet>>postCopy (in category 'copying') -----
+ postCopy
+ super postCopy.
+ byteArrayMap := byteArrayMap copy!
Item was added:
+ ----- Method: ByteCharacterSet>>remove: (in category 'removing') -----
+ remove: aCharacter
+
+ ^self remove: aCharacter ifAbsent: aCharacter!
Item was added:
+ ----- Method: ByteCharacterSet>>remove:ifAbsent: (in category 'removing') -----
+ remove: aCharacter ifAbsent: aBlock
+
+ | index |
+ (index := aCharacter asciiValue + 1) <= 256 ifFalse: [ ^aBlock value ].
+ (byteArrayMap at: index) = 0 ifTrue: [ ^aBlock value ].
+ byteArrayMap at: index put: 0.
+ tally := tally - 1.
+ ^aCharacter!
Item was added:
+ ----- Method: ByteCharacterSet>>removeAll (in category 'removing') -----
+ removeAll
+
+ byteArrayMap atAllPut: 0.
+ tally := 0!
Item was added:
+ ----- Method: ByteCharacterSet>>size (in category 'accessing') -----
+ size
+
+ ^tally!
Item was added:
+ ----- Method: ByteCharacterSet>>species (in category 'zap me later') -----
+ species
+ ^CharacterSet!
Item was added:
+ ----- Method: ByteCharacterSet>>union: (in category 'enumerating') -----
+ union: aCollection
+ (self species = aCollection species or: [aCollection isString or: [aCollection allSatisfy: [:e | e isCharacter]]]) ifFalse: [^super union: aCollection].
+ (self species = aCollection species and: [self class ~= aCollection class]) ifTrue: [^aCollection union: self].
+ ^self copy addAll: aCollection; yourself!
Item was added:
+ ----- Method: ByteCharacterSet>>wideCharacterMap (in category 'private') -----
+ wideCharacterMap
+ "used for comparing with WideCharacterSet"
+
+ | wide |
+ wide := WideCharacterSet new.
+ wide addAll: self.
+ ^wide wideCharacterMap!
Item was changed:
----- Method: CharacterSet class>>crlf (in category 'accessing') -----
crlf
+ ^CrLf ifNil: [ CrLf := ByteCharacterSet with: Character cr with: Character lf ]!
- ^CrLf ifNil: [ CrLf := self with: Character cr with: Character lf ]!
Item was changed:
----- Method: CharacterSet class>>empty (in category 'instance creation') -----
empty
"return an empty set of characters"
+ ^ByteCharacterSet new!
- ^self new!
Item was changed:
----- Method: CharacterSet class>>newFrom: (in category 'instance creation') -----
newFrom: aCollection
| newCollection |
+ newCollection := ByteCharacterSet new.
- newCollection := self new.
newCollection addAll: aCollection.
^newCollection!
Item was removed:
- ----- Method: CharacterSetComplement>>findFirstInByteString:startingAt: (in category 'enumerating') -----
- findFirstInByteString: aByteString startingAt: startIndex
- "Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
- ^ByteString
- findFirstInString: aByteString
- inSet: self byteArrayMap
- startingAt: startIndex!
Item was removed:
- ----- Method: WideCharacterSet>>findFirstInByteString:startingAt: (in category 'enumerating') -----
- findFirstInByteString: aByteString startingAt: startIndex
- "Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
-
- ^ByteString
- findFirstInString: aByteString
- inSet: byteArrayMap
- startingAt: startIndex!
Item was removed:
- ----- Method: WideCharacterSet>>species (in category 'comparing') -----
- species
- ^self hasWideCharacters
- ifTrue: [WideCharacterSet]
- ifFalse: [CharacterSet]!
Item was changed:
+ (PackageInfo named: 'Collections') postscript: 'CharacterSet allInstancesDo: [:e | ByteCharacterSet adoptInstance: e ]'!
- (PackageInfo named: 'Collections') postscript: 'CharacterSet allInstancesDo: #size'!