[squeak-dev] The Trunk: Collections-nice.775.mcz

commits at source.squeak.org commits at source.squeak.org
Fri Dec 1 00:25:40 UTC 2017


Nicolas Cellier uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-nice.775.mcz

==================== Summary ====================

Name: Collections-nice.775
Author: nice
Time: 1 December 2017, 1:25:19.882602 am
UUID: 5ef7a80c-6213-4e8e-8f0c-b45e110ce15e
Ancestors: Collections-nice.774

Rename CharacterSet -> ByteCharacterSet

This is step 1:
- create a parallel ByteCharacterSet
- then mutate CharacterSet instances -> ByteCharacterSet in postscript

=============== Diff against Collections-nice.774 ===============

Item was added:
+ ----- Method: AbstractCharacterSet>>species (in category 'private') -----
+ species
+ 	^CharacterSet!

Item was added:
+ Collection subclass: #ByteCharacterSet
+ 	instanceVariableNames: 'byteArrayMap tally'
+ 	classVariableNames: 'CrLf NonSeparators Separators'
+ 	poolDictionaries: ''
+ 	category: 'Collections-Support'!
+ 
+ !ByteCharacterSet commentStamp: '<historical>' prior: 0!
+ A set of characters.  Lookups for inclusion are very fast.!

Item was added:
+ ----- Method: ByteCharacterSet class>>allCharacters (in category 'instance creation') -----
+ allCharacters
+ 	"return a set containing all characters"
+ 
+ 	| set |
+ 	set := self empty.
+ 	0 to: 255 do: [ :ascii | set add: (Character value: ascii) ].
+ 	^set!

Item was added:
+ ----- Method: ByteCharacterSet class>>cleanUp: (in category 'initialize-release') -----
+ cleanUp: aggressive
+ 
+ 	CrLf := NonSeparators := Separators := nil!

Item was added:
+ ----- Method: ByteCharacterSet class>>crlf (in category 'accessing') -----
+ crlf
+ 
+ 	^CrLf ifNil: [ CrLf := self with: Character cr with: Character lf ]!

Item was added:
+ ----- Method: ByteCharacterSet class>>empty (in category 'instance creation') -----
+ empty
+  	"return an empty set of characters"
+ 	^self new!

Item was added:
+ ----- Method: ByteCharacterSet class>>newFrom: (in category 'instance creation') -----
+ newFrom: aCollection
+ 	| newCollection |
+ 	newCollection := self new.
+ 	newCollection addAll: aCollection.
+ 	^newCollection!

Item was added:
+ ----- Method: ByteCharacterSet class>>nonSeparators (in category 'accessing') -----
+ nonSeparators
+ 	"return a set containing everything but the whitespace characters"
+ 
+ 	^NonSeparators ifNil: [
+ 		NonSeparators := self separators complement ]!

Item was added:
+ ----- Method: ByteCharacterSet class>>separators (in category 'accessing') -----
+ separators
+ 	"return a set containing just the whitespace characters"
+ 
+ 	^Separators ifNil: [ Separators := self newFrom: Character separators ]!

Item was added:
+ ----- Method: ByteCharacterSet class>>withAll: (in category 'instance creation') -----
+ withAll: aCollection
+ 	"Create a new ByteCharacterSet containing all the characters from aCollection."
+ 
+ 	^self newFrom: aCollection!

Item was added:
+ ----- Method: ByteCharacterSet>>= (in category 'comparing') -----
+ = anObject
+ 	
+ 	self species == anObject species ifFalse: [ ^false ].
+ 	anObject size = tally ifFalse: [ ^false ].
+ 	^self byteArrayMap = anObject byteArrayMap!

Item was added:
+ ----- Method: ByteCharacterSet>>add: (in category 'adding') -----
+ add: aCharacter
+ 	"I automatically become a WideCharacterSet if you add a wide character to myself"
+ 	
+ 	| index |
+ 	(index := aCharacter asInteger + 1) <= 256 ifFalse: [
+ 		| wide |
+ 		wide := WideCharacterSet new.
+ 		wide addAll: self.
+ 		wide add: aCharacter.
+ 		self becomeForward: wide.
+ 		^aCharacter ].
+ 	(byteArrayMap at: index) = 1 ifFalse: [
+ 		byteArrayMap at: index put: 1.
+ 		tally := tally + 1 ].
+ 	^aCharacter!

Item was added:
+ ----- Method: ByteCharacterSet>>asString (in category 'conversion') -----
+ asString
+ 	"Convert the receiver into a String"
+ 
+ 	^String new: self size streamContents:[:s|
+ 		self do:[:ch| s nextPut: ch].
+ 	].!

Item was added:
+ ----- Method: ByteCharacterSet>>byteArrayMap (in category 'private') -----
+ byteArrayMap
+ 	"return a ByteArray mapping each ascii value to a 1 if that ascii value is in the set, and a 0 if it isn't.  Intended for use by primitives only"
+ 	^byteArrayMap!

Item was added:
+ ----- Method: ByteCharacterSet>>byteComplement (in category 'conversion') -----
+ byteComplement
+ 	"return a character set containing precisely the single byte characters the receiver does not"
+ 	
+ 	| set |
+ 	set := ByteCharacterSet allCharacters.
+ 	self do: [ :c | set remove: c ].
+ 	^set!

Item was added:
+ ----- Method: ByteCharacterSet>>complement (in category 'conversion') -----
+ complement
+ 	"return a character set containing precisely the characters the receiver does not"
+ 	
+ 	^CharacterSetComplement of: self copy!

Item was added:
+ ----- Method: ByteCharacterSet>>do: (in category 'enumerating') -----
+ do: aBlock
+ 	"evaluate aBlock with each character in the set"
+ 
+ 	| index |
+ 	tally >= 128 ifTrue: [ "dense"
+ 		index := 0.
+ 		[ (index := index + 1) <= 256 ] whileTrue: [
+ 			(byteArrayMap at: index) = 1 ifTrue: [
+ 				aBlock value: (Character value: index - 1) ] ].
+ 		^self ].
+ 	"sparse"
+ 	index := 0.
+ 	[ (index := byteArrayMap indexOf: 1 startingAt: index + 1) = 0 ] whileFalse: [
+ 		aBlock value: (Character value: index - 1) ].
+ 	!

Item was added:
+ ----- Method: ByteCharacterSet>>findFirstInByteString:startingAt: (in category 'zap me later') -----
+ findFirstInByteString: aByteString startingAt: startIndex
+ 	"Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
+ 	^ByteString
+ 		findFirstInString: aByteString
+ 		inSet: self byteArrayMap
+ 		startingAt: startIndex!

Item was added:
+ ----- Method: ByteCharacterSet>>hasWideCharacters (in category 'testing') -----
+ hasWideCharacters
+ 	^false!

Item was added:
+ ----- Method: ByteCharacterSet>>hash (in category 'comparing') -----
+ hash
+ 	^self byteArrayMap hash!

Item was added:
+ ----- Method: ByteCharacterSet>>includes: (in category 'testing') -----
+ includes: anObject
+ 
+ 	| index |
+ 	anObject isCharacter ifFalse: [ ^false ].
+ 	(index := anObject asInteger + 1) > 256 ifTrue: [ ^false ].
+ 	^(byteArrayMap at: index) > 0!

Item was added:
+ ----- Method: ByteCharacterSet>>initialize (in category 'private') -----
+ initialize
+ 
+ 	byteArrayMap := ByteArray new: 256.
+ 	tally := 0!

Item was added:
+ ----- Method: ByteCharacterSet>>isEmpty (in category 'testing') -----
+ isEmpty
+ 	^tally = 0!

Item was added:
+ ----- Method: ByteCharacterSet>>occurrencesOf: (in category 'zap me later') -----
+ occurrencesOf: anObject
+ 	"Answer how many of the receiver's elements are equal to anObject. Optimized version."
+ 
+ 	(self includes: anObject) ifTrue: [ ^1 ].
+ 	^0!

Item was added:
+ ----- Method: ByteCharacterSet>>postCopy (in category 'copying') -----
+ postCopy
+ 	super postCopy.
+ 	byteArrayMap := byteArrayMap copy!

Item was added:
+ ----- Method: ByteCharacterSet>>remove: (in category 'removing') -----
+ remove: aCharacter
+ 
+ 	^self remove: aCharacter ifAbsent: aCharacter!

Item was added:
+ ----- Method: ByteCharacterSet>>remove:ifAbsent: (in category 'removing') -----
+ remove: aCharacter ifAbsent: aBlock
+ 
+ 	| index |
+ 	(index := aCharacter asciiValue + 1) <= 256 ifFalse: [ ^aBlock value ].
+ 	(byteArrayMap at: index) = 0 ifTrue: [ ^aBlock value ].
+ 	byteArrayMap at: index put: 0.
+ 	tally := tally - 1.
+ 	^aCharacter!

Item was added:
+ ----- Method: ByteCharacterSet>>removeAll (in category 'removing') -----
+ removeAll
+ 
+ 	byteArrayMap atAllPut: 0.
+ 	tally := 0!

Item was added:
+ ----- Method: ByteCharacterSet>>size (in category 'accessing') -----
+ size
+ 
+ 	^tally!

Item was added:
+ ----- Method: ByteCharacterSet>>species (in category 'zap me later') -----
+ species
+ 	^CharacterSet!

Item was added:
+ ----- Method: ByteCharacterSet>>union: (in category 'enumerating') -----
+ union: aCollection
+ 	(self species = aCollection species or: [aCollection isString or: [aCollection allSatisfy: [:e | e isCharacter]]]) ifFalse: [^super union: aCollection].
+ 	(self species = aCollection species and: [self class ~= aCollection class]) ifTrue: [^aCollection union: self].
+ 	^self copy addAll: aCollection; yourself!

Item was added:
+ ----- Method: ByteCharacterSet>>wideCharacterMap (in category 'private') -----
+ wideCharacterMap
+ 	"used for comparing with WideCharacterSet"
+ 	
+ 	| wide |
+ 	wide := WideCharacterSet new.
+ 	wide addAll: self.
+ 	^wide wideCharacterMap!

Item was changed:
  ----- Method: CharacterSet class>>crlf (in category 'accessing') -----
  crlf
  
+ 	^CrLf ifNil: [ CrLf := ByteCharacterSet with: Character cr with: Character lf ]!
- 	^CrLf ifNil: [ CrLf := self with: Character cr with: Character lf ]!

Item was changed:
  ----- Method: CharacterSet class>>empty (in category 'instance creation') -----
  empty
   	"return an empty set of characters"
+ 	^ByteCharacterSet new!
- 	^self new!

Item was changed:
  ----- Method: CharacterSet class>>newFrom: (in category 'instance creation') -----
  newFrom: aCollection
  	| newCollection |
+ 	newCollection := ByteCharacterSet new.
- 	newCollection := self new.
  	newCollection addAll: aCollection.
  	^newCollection!

Item was removed:
- ----- Method: CharacterSetComplement>>findFirstInByteString:startingAt: (in category 'enumerating') -----
- findFirstInByteString: aByteString startingAt: startIndex
- 	"Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
- 	^ByteString
- 		findFirstInString: aByteString
- 		inSet: self byteArrayMap
- 		startingAt: startIndex!

Item was removed:
- ----- Method: WideCharacterSet>>findFirstInByteString:startingAt: (in category 'enumerating') -----
- findFirstInByteString: aByteString startingAt: startIndex
- 	"Double dispatching: since we know this is a ByteString, we can use a superfast primitive using a ByteArray map with 0 slots for byte characters not included and 1 for byte characters included in the receiver."
- 
- 	^ByteString
- 		findFirstInString: aByteString
- 		inSet: byteArrayMap
- 		startingAt: startIndex!

Item was removed:
- ----- Method: WideCharacterSet>>species (in category 'comparing') -----
- species
- 	^self hasWideCharacters
- 		ifTrue: [WideCharacterSet]
- 		ifFalse: [CharacterSet]!

Item was changed:
+ (PackageInfo named: 'Collections') postscript: 'CharacterSet allInstancesDo: [:e | ByteCharacterSet adoptInstance: e ]'!
- (PackageInfo named: 'Collections') postscript: 'CharacterSet allInstancesDo: #size'!
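
The new postscript can be tried on a single object in a workspace. The snippet below is only a sketch and not part of the commit: it assumes an image that has loaded this version, so that CharacterSet and ByteCharacterSet both exist with the same instance variables (byteArrayMap, tally), and the variable name leftover is made up for illustration.

```smalltalk
"Workspace sketch, not part of the commit."
| leftover |
leftover := CharacterSet new.              "stands in for a pre-existing CharacterSet instance"
leftover add: $a.
ByteCharacterSet adoptInstance: leftover.  "the same call the postscript makes for each instance"
leftover class.                            "expected to answer ByteCharacterSet"
leftover includes: $a.                     "contents should be unchanged by the class switch"
```

Because adoptInstance: changes the class in place rather than copying, existing references to the object (for example the cached CrLf and Separators class variables) keep pointing at the same, now migrated, set.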


