[squeak-dev] The Trunk: Collections-nice.246.mcz

commits at source.squeak.org commits at source.squeak.org
Wed Dec 9 20:03:34 UTC 2009


Nicolas Cellier uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-nice.246.mcz

==================== Summary ====================

Name: Collections-nice.246
Author: nice
Time: 9 December 2009, 9:03:30 am
UUID: 93e084cc-d836-1146-a61d-5ce2a8972cba
Ancestors: Collections-nice.245

After an idea from Henrik Johansen, speed-up WideCharacterSet enumeration
- 1) do not use a WordArray but a ByteArray to avoid large integers
- 2) inline enumeration of bits in a byte with optimistic subtraction of highest bit (work best if bits pattern is sparse)

=============== Diff against Collections-nice.245 ===============

Item was changed:
  ----- Method: WideCharacterSet>>setBitmap:at: (in category 'private') -----
  setBitmap: aMap at: shortInteger
  	"set a single bit in aMap.
  	shortInteger should be between: 0 and: 16rFFFF"
  	
  	| collecIndex bitIndex |
+ 	collecIndex := shortInteger bitShift: -3.
+ 	bitIndex := shortInteger bitAnd: 7.
- 	collecIndex := shortInteger bitShift: -5.
- 	bitIndex := shortInteger bitAnd: 16r1F.
  	^aMap at: collecIndex + 1 put: ((aMap at: collecIndex + 1) bitOr: (1 bitShift: bitIndex))!

Item was added:
+ ----- Method: WideCharacterSet>>convertInternalRepresentationToByteArray (in category 'private') -----
+ convertInternalRepresentationToByteArray
+ 	| oldMap |
+ 	(map anySatisfy: [:e | e class == WordArray]) ifTrue: [
+ 		oldMap := map.
+ 		map := Dictionary new.
+ 		oldMap keysAndValuesDo: [:i :words |
+ 			self wordBitmap: words do: [:code | self add: (Character value: (i bitShift: 16) + code)]]].!

Item was changed:
  ----- Method: WideCharacterSet>>add: (in category 'collection ops') -----
  add: aCharacter 
  	| val high low lowmap |
  	val := aCharacter asciiValue.
  	high := val bitShift: -16.
  	low := val bitAnd: 16rFFFF.
+ 	lowmap := map at: high ifAbsentPut: ["create a chunk of 65536=8192*8 bits"
+ 		ByteArray new: 8192].
- 	lowmap := map at: high ifAbsentPut: [WordArray new: 2048].
  	self setBitmap: lowmap at: low.
  	^ aCharacter!

Item was changed:
  ----- Method: WideCharacterSet>>byteArrayMap (in category 'comparing') -----
  byteArrayMap
  	"return a ByteArray mapping each ascii value to a 1 if that ascii value is in the set, and a 0 if it isn't.
  	Intended for use by primitives only. (and comparison)
  	This version will answer a subset with only byte characters"
  	
  	| aMap lowmap |
  	aMap := ByteArray new: 256.
  	lowmap := map at: 0 ifAbsent: [^aMap].
+ 	lowmap := lowmap copyFrom: 1 to: 32. "Keep first 8*32=256 bits..."
- 	lowmap := lowmap copyFrom: 1 to: 8. "Keep first 8*32=256 bits..."
  	self bitmap: lowmap do: [:code | aMap at: code + 1 put: 1].
  	^aMap!

Item was changed:
  ----- Method: WideCharacterSet>>bitmap:at: (in category 'private') -----
  bitmap: aMap at: shortInteger
  	"access a single bit in aMap.
  	shortInteger should be between: 0 and: 16rFFFF"
  	
  	| collecIndex bitIndex |
+ 	collecIndex := shortInteger bitShift: -3.
+ 	bitIndex := shortInteger bitAnd: 7.
- 	collecIndex := shortInteger bitShift: -5.
- 	bitIndex := shortInteger bitAnd: 16r1F.
  	^(aMap at: collecIndex + 1) bitAnd: (1 bitShift: bitIndex)!

Item was added:
+ ----- Method: WideCharacterSet class>>initialize (in category 'class initialization') -----
+ initialize
+ 	"Old representation use WordArray.
+ 	Accessing a WordArray can create slow LargeInteger and is inefficient.
+ 	This is a temporary hack to mutate internal representation to faster ByteArray"
+ 	self allInstancesDo: [:e | e convertInternalRepresentationToByteArray]!

Item was changed:
  ----- Method: WideCharacterSet>>bitmap:do: (in category 'private') -----
  bitmap: aMap do: aBlock
+ 	"Execute a block with each value (0 based) corresponding to set bits.
+ 	Implementation: this version works best for sparse maps.
+ 	The powers of two and highBit tables are inlined for speed"
- 	"Execute a block with each value (0 based) corresponding to set bits"
  	
+ 	| byte bitOffset byteOffset powersOf2 highBits |
+ 	powersOf2 := #[1 2 4 8 16 32 64 128].
+ 	highBits := #[0 1 1 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7].
+ 	1 to: aMap size do: [:i | 
+ 		(byte := aMap at: i) = 0 ifFalse: [
+ 			byteOffset := (i - 1) bitShift: 3.
+ 			[aBlock value: (byteOffset + (bitOffset := highBits at: byte)).
+ 			(byte := byte - (powersOf2 at: 1 + bitOffset)) = 0] whileFalse]]!
- 	0 to: 31 do: [:shift |
- 		| mask |
- 		mask := 1 bitShift: shift.
- 		1 to: aMap size do: [:i | 
- 			((aMap at: i) bitAnd: mask) isZero ifFalse: [aBlock value: ((i - 1 bitShift: 5) bitOr: shift)]]]!

Item was changed:
  ----- Method: WideCharacterSet>>clearBitmap:at: (in category 'private') -----
  clearBitmap: aMap at: shortInteger
  	"clear a single bit in aMap.
  	shortInteger should be between: 0 and: 16rFFFF"
  	
  	| collecIndex bitIndex |
+ 	collecIndex := shortInteger bitShift: -3.
+ 	bitIndex := shortInteger bitAnd: 7.
- 	collecIndex := shortInteger bitShift: -5.
- 	bitIndex := shortInteger bitAnd: 16r1F.
  	^aMap at: collecIndex + 1 put: ((aMap at: collecIndex + 1) bitClear: (1 bitShift: bitIndex))!

Item was added:
+ ----- Method: WideCharacterSet>>wordBitmap:do: (in category 'private') -----
+ wordBitmap: aMap do: aBlock
+ 	"Execute a block with each value (0 based) corresponding to set bits"
+ 	
+ 	0 to: 31 do: [:shift |
+ 		| mask |
+ 		mask := 1 bitShift: shift.
+ 		1 to: aMap size do: [:i | 
+ 			((aMap at: i) bitAnd: mask) isZero ifFalse: [aBlock value: ((i - 1 bitShift: 5) bitOr: shift)]]]!




More information about the Squeak-dev mailing list