[Pkg] The Trunk: Multilingual-nice.123.mcz

commits at source.squeak.org
Wed Jul 14 11:17:30 UTC 2010


Nicolas Cellier uploaded a new version of Multilingual to project The Trunk:
http://source.squeak.org/trunk/Multilingual-nice.123.mcz

==================== Summary ====================

Name: Multilingual-nice.123
Author: nice
Time: 14 July 2010, 1:17:02.219 pm
UUID: ec8f05b8-78a6-4496-aca9-8f9b2e54823d
Ancestors: Multilingual-ul.122

1) Simplify a case of the at:ifAbsentPut: pattern in SparseXTable.
2) Provide a simple mapping of Unicode upper/lower case characters, as described at http://unicode.org/reports/tr21/tr21-5.html
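
For readers unfamiliar with the pattern in item 1, here is a minimal workspace sketch of the two equivalent forms. It assumes a plain Dictionary stands in for xTables (the actual class of that variable is not shown in this commit), and it uses small arrays and made-up keys purely for illustration.

```smalltalk
| tables first second |
"Assumption: a Dictionary stands in for the xTables instance variable."
tables := Dictionary new.

"Long form: look up the key and, on a miss, create, store and answer the value."
first := tables at: 0 ifAbsent: [
	| table |
	table := Array new: 4 withAll: 0.
	tables at: 0 put: table.
	table].

"Short form: at:ifAbsentPut: stores the block's value itself and answers it."
second := tables at: 1 ifAbsentPut: [Array new: 4 withAll: 0].
```

Both forms answer the stored table, so the short form can replace the long one without changing what callers see.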
		
Note 1: The Unicode class now provides two utilities that transform the case of a String rather than of a single Character. Working on whole strings leaves room for future enhancements, such as handling the special casings where changing case changes the number of characters.

Note 2: No automatic initialization is performed yet. Before using the utilities above, you have to evaluate:
Unicode initializeCaseMappings.
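
A workspace session using the new utilities might look like the sketch below. The sample strings are arbitrary, and the initialization step downloads the case folding data over HTTP, so it needs network access.

```smalltalk
"One-time setup: fetches the case folding data and fills ToUpper and ToLower."
Unicode initializeCaseMappings.

"Afterwards the two class-side utilities can be applied to strings."
Unicode toUppercase: 'hello world'.
Unicode toLowercase: 'HELLO WORLD'.
```

Evaluate the last two lines with "print it" to inspect the results. Only the simple one-to-one mappings are applied, so the result always has the same number of characters as the argument.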

This is only an unoptimized first attempt. Comments and changes are of course welcome.

=============== Diff against Multilingual-ul.122 ===============

Item was changed:
  ----- Method: SparseXTable>>tableFor: (in category 'accessing') -----
  tableFor: code
  
  	| div |
  	div := code // 65536.
+ 	^xTables at: div ifAbsentPut:
+ 		[Array new: 65536 withAll: 0].
- 	^xTables at: div ifAbsent: [
- 		| table |
- 		table := Array new: 65536 withAll: 0.
- 		xTables at: div put: table.
- 		table].
  !

Item was added:
+ ----- Method: Unicode class>>initializeCaseMappings (in category 'casing') -----
+ initializeCaseMappings
+ 	"Unicode initializeCaseMappings"
+ 	ToUpper := IdentityDictionary new.
+ 	ToLower := IdentityDictionary new.
+ 	UIManager default informUserDuring: [:bar|
+ 		| stream |
+ 		bar value: 'Downloading Unicode data'.
+ 		stream := HTTPClient httpGet: 'http://www.unicode.org/Public/3.2-Update/CaseFolding-3.2.0.txt'.
+ 		(stream isKindOf: RWBinaryOrTextStream) ifFalse:[^self error: 'Download failed'].
+ 		stream reset.
+ 		bar value: 'Updating Case Mappings'.
+ 		self parseCaseMappingFrom: stream.
+ 	].!

Item was changed:
  EncodedCharSet subclass: #Unicode
  	instanceVariableNames: ''
+ 	classVariableNames: 'Cc Cf Cn Co Compositions Cs DecimalProperty Decompositions GeneralCategory Ll Lm Lo Lt Lu Mc Me Mn Nd Nl No Pc Pd Pe Pf Pi Po Ps Sc Sk Sm So ToLower ToUpper Zl Zp Zs'
- 	classVariableNames: 'Cc Cf Cn Co Compositions Cs DecimalProperty Decompositions GeneralCategory Ll Lm Lo Lt Lu Mc Me Mn Nd Nl No Pc Pd Pe Pf Pi Po Ps Sc Sk Sm So Zl Zp Zs'
  	poolDictionaries: ''
  	category: 'Multilingual-Encodings'!
  
  !Unicode commentStamp: 'yo 10/19/2004 20:44' prior: 0!
  This class holds the entry points for the utility functions around characters.
  !

Item was added:
+ ----- Method: Unicode class>>toUppercase: (in category 'casing') -----
+ toUppercase: aWideString
+ 	"Transform a Wide String into uppercase.
+ 	This does not handle special cases where number of characters could change.
+ 	The algorithm would work for ByteString, however it's far from the most efficient."
+ 	
+ 	^aWideString collect: [:e |
+ 		(ToUpper at: e charCode ifAbsent: [nil])
+ 			ifNil: [e]
+ 			ifNotNil: [:up | self value: up]]!

Item was added:
+ ----- Method: Unicode class>>toLowercase: (in category 'casing') -----
+ toLowercase: aWideString
+ 	"Transform a Wide String into lowercase.
+ 	This does not handle special cases where number of characters could change.
+ 	The algorithm would work for ByteString, however it's far from the most efficient."
+ 	
+ 	^aWideString collect: [:e |
+ 		(ToLower at: e charCode ifAbsent: [nil])
+ 			ifNil: [e]
+ 			ifNotNil: [:low | self value: low]]!

Item was added:
+ ----- Method: Unicode class>>parseCaseMappingFrom: (in category 'casing') -----
+ parseCaseMappingFrom: stream
+ 	"Parse the Unicode casing mappings from the given stream.
+ 	Handle only the simple mappings"
+ 	"
+ 		Unicode initializeCaseMappings.
+ 	"
+ 	| fields line lowerCode upperCode |
+ 
+ 	ToUpper := IdentityDictionary new: 4096.
+ 	ToLower := IdentityDictionary new: 4096.
+ 
+ 	[stream atEnd] whileFalse:[
+ 		line := stream nextLine copyUpTo: $#.
+ 		fields := line withBlanksTrimmed findTokens: $;.
+ 		(fields size > 2 and: [#('C' 'S') includes: (fields at: 2) withBlanksTrimmed]) ifTrue:[
+ 			upperCode := Integer readFrom: (fields at: 1) withBlanksTrimmed base: 16.
+ 			lowerCode := Integer readFrom: (fields at: 3) withBlanksTrimmed base: 16.
+ 			ToUpper at: lowerCode put: upperCode.
+ 			ToLower at: upperCode put: lowerCode.
+ 		].
+ 	].
+ !


