[squeak-dev] The Trunk: Multilingual-topa.239.mcz

commits at source.squeak.org
Wed Sep 12 13:26:01 UTC 2018


Tobias Pape uploaded a new version of Multilingual to project The Trunk:
http://source.squeak.org/trunk/Multilingual-topa.239.mcz

==================== Summary ====================

Name: Multilingual-topa.239
Author: topa
Time: 12 September 2018, 3:25:37.647469 pm
UUID: 9a2616f0-677b-4d2b-9a60-4495a0fae606
Ancestors: Multilingual-ul.238

Fix and unify Unicode data downloading

(The old URL was stable, but the newer one is preferred.)

=============== Diff against Multilingual-ul.238 ===============

Item was changed:
+ ----- Method: Unicode class>>caseFoldingData (in category 'unicode data') -----
- ----- Method: Unicode class>>caseFoldingData (in category 'casing') -----
  caseFoldingData
  	
+ 	 ^ self fetch: 'CaseFolding Unicode data' fromUnicodeData: 'CaseFolding.txt'
+ !
- 	UIManager default informUserDuring: [ :bar |
- 		| stream |
- 		bar value: 'Downloading CaseFolding Unicode data'.
- 		stream := HTTPClient httpGet: 'http://www.unicode.org/Public/UNIDATA/CaseFolding.txt'.
- 		(stream isKindOf: RWBinaryOrTextStream) ifFalse: [
- 			^self error: 'Download failed' ].
- 		^stream reset; contents ]!

Item was added:
+ ----- Method: Unicode class>>fetch:fromUnicodeData: (in category 'unicode data') -----
+ fetch: what fromUnicodeData: fileName
+ 	| unicodeLocation |
+ 	unicodeLocation := 'https://www.unicode.org/Public/UCD/latest/ucd/'.
+ 	UIManager default informUser: 'Downloading ', what  during: 
+ 		[ | response|
+ 		response := WebClient httpGet: unicodeLocation, fileName.
+ 		^ response isSuccess
+ 			ifFalse: [self error: 'Download failed']
+ 			ifTrue: [response content]].
+ 		
+ 		 !

Item was changed:
+ ----- Method: Unicode class>>generalCategory (in category 'unicode data') -----
- ----- Method: Unicode class>>generalCategory (in category 'class methods') -----
  generalCategory
  
  	^ GeneralCategory.
  
  !

Item was changed:
+ ----- Method: Unicode class>>generalCategoryComment (in category 'unicode data') -----
- ----- Method: Unicode class>>generalCategoryComment (in category 'class methods') -----
  generalCategoryComment
  "
  Lu Letter, Uppercase 
  Ll Letter, Lowercase 
  Lt Letter, Titlecase 
  Lm Letter, Modifier 
  Lo Letter, Other 
  Mn Mark, Non-Spacing 
  Mc Mark, Spacing Combining 
  Me Mark, Enclosing 
  Nd Number, Decimal 
  Nl Number, Letter 
  No Number, Other 
  Pc Punctuation, Connector 
  Pd Punctuation, Dash 
  Ps Punctuation, Open 
  Pe Punctuation, Close 
  Pi Punctuation, Initial quote (may behave like Ps or Pe depending on usage) 
  Pf Punctuation, Final quote (may behave like Ps or Pe depending on usage) 
  Po Punctuation, Other 
  Sm Symbol, Math 
  Sc Symbol, Currency 
  Sk Symbol, Modifier 
  So Symbol, Other 
  Zs Separator, Space 
  Zl Separator, Line 
  Zp Separator, Paragraph 
  Cc Other, Control 
  Cf Other, Format 
  Cs Other, Surrogate 
  Co Other, Private Use 
  Cn Other, Not Assigned (no characters in the file have this property) 
  "!

Item was changed:
+ ----- Method: Unicode class>>parseUnicodeDataFrom: (in category 'unicode data') -----
- ----- Method: Unicode class>>parseUnicodeDataFrom: (in category 'class methods') -----
  parseUnicodeDataFrom: stream
  "
  	self halt.
  	self parseUnicodeDataFile
  "
  
  	| line fieldEnd point fieldStart toNumber generalCategory decimalProperty |
  
  	toNumber := [:quad | ('16r', quad) asNumber].
  
  	GeneralCategory := SparseLargeTable new: 16rE0080 chunkSize: 1024 arrayClass: Array base: 1 defaultValue:  'Cn'.
  	DecimalProperty := SparseLargeTable new: 16rE0080 chunkSize: 32 arrayClass: Array base: 1 defaultValue: -1.
  
  	16r3400 to: 16r4DB5 do: [:i | GeneralCategory at: i+1 put: 'Lo'].
  	16r4E00 to: 16r9FA5 do: [:i | GeneralCategory at: i+1 put: 'Lo'].
  	16rAC00 to: 16rD7FF do: [:i | GeneralCategory at: i+1 put: 'Lo'].
  
  	[(line := stream nextLine) size > 0] whileTrue: [
  		fieldEnd := line indexOf: $; startingAt: 1.
  		point := toNumber value: (line copyFrom: 1 to: fieldEnd - 1).
  		point > 16rE007F ifTrue: [
  			GeneralCategory zapDefaultOnlyEntries.
  			DecimalProperty zapDefaultOnlyEntries.
  			^ self].
  		2 to: 3 do: [:i |
  			fieldStart := fieldEnd + 1.
  			fieldEnd := line indexOf: $; startingAt: fieldStart.
  		].
  		generalCategory := line copyFrom: fieldStart to: fieldEnd - 1.
  		GeneralCategory at: point+1 put: generalCategory.
  		generalCategory = 'Nd' ifTrue: [
  			4 to: 7 do: [:i |
  				fieldStart := fieldEnd + 1.
  				fieldEnd := line indexOf: $; startingAt: fieldStart.
  			].
  			decimalProperty :=  line copyFrom: fieldStart to: fieldEnd - 1.
  			DecimalProperty at: point+1 put: decimalProperty asNumber.
  		].
  	].
  	GeneralCategory zapDefaultOnlyEntries.
  	DecimalProperty zapDefaultOnlyEntries.
  !

Item was changed:
+ ----- Method: Unicode class>>unicodeData (in category 'unicode data') -----
- ----- Method: Unicode class>>unicodeData (in category 'composing') -----
  unicodeData
  	
+ 	^ self fetch: 'Unicode Data' fromUnicodeData: 'UnicodeData.txt'
+ !
- 	UIManager default informUserDuring: [ :bar |
- 		| stream |
- 		bar value: 'Downloading Unicode data'.
- 		stream := HTTPClient httpGet: 'http://www.unicode.org/Public/UNIDATA/UnicodeData.txt'.
- 		(stream isKindOf: RWBinaryOrTextStream) ifFalse: [
- 			^self error: 'Download failed' ].
- 		^stream reset; contents ]!
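
For readers who want to try the refactored accessors, here is a minimal workspace sketch. It is not part of the commit. It assumes an image with Multilingual-topa.239 loaded, working network access, and that the downloaded content arrives as a String that can be wrapped in a read stream; the variable name data is only illustrative.

```smalltalk
"Workspace sketch, not part of the commit. Assumes network access
 and that Multilingual-topa.239 is loaded in the image."
| data |

"Both accessors now go through fetch:fromUnicodeData:, which signals
 an error ('Download failed') if the HTTP response is not a success."
data := Unicode unicodeData.

"parseUnicodeDataFrom: expects a stream, so wrap the downloaded text."
Unicode parseUnicodeDataFrom: data readStream.

"The tables are indexed by code point + 1, as in the parser above.
 The codes returned are the ones listed in generalCategoryComment."
Unicode generalCategory at: $A asInteger + 1.
```

The same pattern should apply to caseFoldingData, which fetches CaseFolding.txt from the same base location.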


