[squeak-dev] The Trunk: Collections-nice.572.mcz

commits at source.squeak.org
Thu May 29 13:10:40 UTC 2014


Nicolas Cellier uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-nice.572.mcz

==================== Summary ====================

Name: Collections-nice.572
Author: nice
Time: 29 May 2014, 3:09:56.261 pm
UUID: 387e24c8-d4ec-4a93-8bbd-cb72a293fb0b
Ancestors: Collections-eem.571

Let asUppercase and asLowercase use the Unicode tables for wide strings/characters.
Care is also taken to correctly handle characters with East Asian encodings, but I'm not sure how healthy this support is in trunk...

Remove Character>>basicSqueakToIso, which is totally obsolete (it does not do the right thing) and is not sent.

=============== Diff against Collections-eem.571 ===============

Item was changed:
  ----- Method: Character>>asLowercase (in category 'converting') -----
  asLowercase
  	"If the receiver is uppercase, answer its matching lowercase Character."
  	"A tentative implementation.  Eventually this should consult the Unicode table."
  
  	| v |
  	v := self charCode.
  	(((8r101 <= v and: [v <= 8r132]) or: [16rC0 <= v and: [v <= 16rD6]]) or: [16rD8 <= v and: [v <= 16rDE]])
+ 		ifTrue: [^ Character value: v + 8r40].
+ 	v < 256 ifTrue: [^self].
+ 	^self class value: ((value < 16r400000
+ 		ifTrue: [Unicode]
+ 		ifFalse: [(EncodedCharSet charsetAt: self leadingChar) charsetClass])
+ 			toLowercaseCode: v)!
- 		ifTrue: [^ Character value: value + 8r40]
- 		ifFalse: [^ self]!

Item was changed:
  ----- Method: Character>>asUnicode (in category 'converting') -----
  asUnicode
+ 	"Answer the unicode encoding of the receiver"
- 	| table charset v |
  	self leadingChar = 0 ifTrue: [^ value].
+ 	^(EncodedCharSet charsetAt: self leadingChar) charsetClass convertToUnicode: self charCode
- 	(charset := EncodedCharSet charsetAt: self leadingChar)
- 		isCharset ifFalse: [^ self charCode].
- 	(table := charset ucsTable)
- 		ifNil: [^ 16rFFFD].
- 	(v := table at: 1 + self charCode)
- 		= -1 ifTrue: [^ 16rFFFD].
- 	^ v.
  !

Item was changed:
  ----- Method: Character>>asUppercase (in category 'converting') -----
  asUppercase
  	"If the receiver is lowercase, answer its matching uppercase Character."
  	"A tentative implementation.  Eventually this should consult the Unicode table."	
  
  	| v |
  	v := self charCode.
  	(((8r141 <= v and: [v <= 8r172]) or: [16rE0 <= v and: [v <= 16rF6]]) or: [16rF8 <= v and: [v <= 16rFE]])
+ 		ifTrue: [^ Character value: v - 8r40].
+ 	v < 256 ifTrue: [^self].
+ 	^self class value: ((value < 16r400000
+ 		ifTrue: [Unicode]
+ 		ifFalse: [(EncodedCharSet charsetAt: self leadingChar) charsetClass])
+ 			toUppercaseCode: v)!
- 		ifTrue: [^ Character value: value - 8r40]
- 		ifFalse: [^ self]
- !

Item was removed:
- ----- Method: Character>>basicSqueakToIso (in category 'converting') -----
- basicSqueakToIso
- 	| asciiValue |
- 
- 	value < 128 ifTrue: [^ self].
- 	value > 255 ifTrue: [^ self].
- 	asciiValue := #(196 197 199 201 209 214 220 225 224 226 228 227 229 231 233 232 234 235 237 236 238 239 241 243 242 244 246 245 250 249 251 252 134 176 162 163 167 149 182 223 174 169 153 180 168 128 198 216 129 177 138 141 165 181 142 143 144 154 157 170 186 158 230 248 191 161 172 166 131 173 178 171 187 133 160 192 195 213 140 156 150 151 147 148 145 146 247 179 253 159 185 164 139 155 188 189 135 183 130 132 137 194 202 193 203 200 205 206 207 204 211 212 190 210 218 219 217 208 136 152 175 215 221 222 184 240 254 255 256 ) at: self asciiValue - 127.
- 	^ Character value: asciiValue.
- !

Item was added:
+ ----- Method: WideString>>asLowercase (in category 'converting') -----
+ asLowercase
+ 	^self collect: [:e | e asLowercase]!

Item was added:
+ ----- Method: WideString>>asUppercase (in category 'converting') -----
+ asUppercase
+ 	^self collect: [:e | e asUppercase]!
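
A minimal workspace sketch of how the changed methods might be exercised, assuming an image with this version loaded. The Latin-1 results follow directly from the range tests and the 8r40 (32 decimal) offset in the diff. The wide-character result is an assumption: it depends on the Unicode case tables being initialized in the image. The temporary names zhe and wide are illustrative only and do not appear in the commit.

```smalltalk
"Evaluate the whole snippet, or print each expression in turn."
| zhe wide |

"Latin-1 fast path: upper and lower case differ by 8r40 (32 decimal)."
$A asLowercase.                                  "$a"
(Character value: 16rC0) asLowercase charCode.   "224, i.e. 16rE0"

"16rD7 (multiplication sign) lies outside both accented ranges,
 and is below 256, so the receiver is answered unchanged."
(Character value: 16rD7) asLowercase charCode.   "215"

"Beyond Latin-1 the conversion is delegated to Unicode toLowercaseCode:
 because the value is below 16r400000 (leadingChar = 0).
 Assumption: with the case tables initialized this should answer
 16r0436, the lowercase of CYRILLIC CAPITAL LETTER ZHE."
zhe := Character value: 16r0416.
zhe asLowercase charCode.

"WideString>>asLowercase simply collects the per-character conversion."
wide := WideString with: zhe with: $A.
wide asLowercase.
```

Characters with a non-zero leadingChar take the other branch and ask the charset class registered in EncodedCharSet instead of Unicode; that path is not exercised here.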


