[squeak-dev] The Trunk: Multilingual-pre.228.mcz

commits at source.squeak.org
Thu Jun 8 07:40:20 UTC 2017


Patrick Rein uploaded a new version of Multilingual to project The Trunk:
http://source.squeak.org/trunk/Multilingual-pre.228.mcz

==================== Summary ====================

Name: Multilingual-pre.228
Author: pre
Time: 8 June 2017, 9:40:08.341697 am
UUID: cb64d235-8b3f-a140-b34f-4695b78dd94e
Ancestors: Multilingual-pre.227

Adds a UTF32 TextConverter. Updates the class comments of some TextConverter subclasses. Updates the encoding names of UTF16TextConverter.

=============== Diff against Multilingual-pre.227 ===============

Item was changed:
  ISO8859TextConverter subclass: #ISO88592TextConverter
  	instanceVariableNames: ''
  	classVariableNames: ''
  	poolDictionaries: ''
  	category: 'Multilingual-TextConversion'!
+ 
+ !ISO88592TextConverter commentStamp: '<historical>' prior: 0!
+ Text converter for ISO 8859-2.  An international encoding used in Eastern Europe.!

Item was changed:
  ISO8859TextConverter subclass: #ISO88597TextConverter
  	instanceVariableNames: ''
  	classVariableNames: ''
  	poolDictionaries: ''
  	category: 'Multilingual-TextConversion'!
+ 
+ !ISO88597TextConverter commentStamp: '<historical>' prior: 0!
+ Text converter for ISO 8859-7.  An international encoding used for Greek.!

Item was changed:
  ISO88591TextConverter subclass: #Latin1TextConverter
  	instanceVariableNames: ''
  	classVariableNames: ''
  	poolDictionaries: ''
  	category: 'Multilingual-TextConversion'!
+ 
+ !Latin1TextConverter commentStamp: '<historical>' prior: 0!
+ Text converter for ISO 8859-1.  An international encoding used in Western Europe.!

Item was changed:
  ISO885915TextConverter subclass: #Latin9TextConverter
  	instanceVariableNames: ''
  	classVariableNames: ''
  	poolDictionaries: ''
  	category: 'Multilingual-TextConversion'!
+ 
+ !Latin9TextConverter commentStamp: 'pre 4/21/2017 11:40' prior: 0!
+ Text converter for ISO 8859-15.  An international encoding also used in Western Europe.!

Item was changed:
  ----- Method: UTF16TextConverter class>>encodingNames (in category 'utilities') -----
  encodingNames
  
+ 	^ #('utf-16' 'utf16' 'utf-16-le' 'utf-16-be' 'utf-16be' 'utf-16le') copy.
- 	^ #('utf-16' 'utf16' 'utf-16-le' 'utf-16-be') copy.
  !

Item was added:
+ TextConverter subclass: #UTF32TextConverter
+ 	instanceVariableNames: 'useLittleEndian useByteOrderMark byteOrderMarkDone'
+ 	classVariableNames: ''
+ 	poolDictionaries: ''
+ 	category: 'Multilingual-TextConversion'!
+ 
+ !UTF32TextConverter commentStamp: 'pre 6/7/2017 17:55' prior: 0!
+ Text converter for UTF-32.  It supports both big-endian and little-endian byte order and an optional byte order mark.!

Item was added:
+ ----- Method: UTF32TextConverter class>>encodingNames (in category 'utilities') -----
+ encodingNames
+ 
+ 	^ #( 'utf32' 'utf32be' 'utf32le' 'utf-32' 'utf-32be' 'utf-32le' 'ucs4' 'ucs4be' 'ucs4le') copy
+ !
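
The alias lists above are what allow a converter to be requested by encoding name. As a rough illustration only, matching a requested name against the lists from this commit might look like the Python sketch below. The registry and the function `converter_for` are hypothetical and are not Squeak's lookup code; the case-insensitive match is an assumption.

```python
# Hypothetical registry; the alias lists are copied from the
# encodingNames methods shown in this diff.
ENCODING_NAMES = {
    "UTF16TextConverter": [
        "utf-16", "utf16", "utf-16-le", "utf-16-be", "utf-16be", "utf-16le",
    ],
    "UTF32TextConverter": [
        "utf32", "utf32be", "utf32le", "utf-32", "utf-32be", "utf-32le",
        "ucs4", "ucs4be", "ucs4le",
    ],
}


def converter_for(name):
    """Return the converter class name whose alias list contains name, else None."""
    wanted = name.lower()  # assumption: the lookup ignores case
    for converter, aliases in ENCODING_NAMES.items():
        if wanted in aliases:
            return converter
    return None
```

Nothing in this diff shows an alias such as 'utf-32le' selecting a byte order, so the sketch only picks the class.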

Item was added:
+ ----- Method: UTF32TextConverter class>>initializeLatin1MapAndEncodings (in category 'utilities') -----
+ initializeLatin1MapAndEncodings
+ 	"Initialize the latin1Map and latin1Encodings.
+ 	These variables ensure that conversions from latin1 ByteStrings are reasonably fast"
+ 	
+ 	latin1Map := (ByteArray new: 256) atAllPut: 1.
+ 	latin1Encodings := (0 to: 255) collect: [:i | (ByteArray newFrom: {0 . 0 . 0 . i}) asString]!
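
The table built above caches the four-byte big-endian form of every Latin-1 code point, so the common case can skip the shift-and-mask work. A minimal Python sketch of the same idea follows; the names `LATIN1_ENCODINGS` and `encode_latin1_fast` are illustrative and not part of the commit.

```python
# Precompute the big-endian UTF-32 bytes for code points 0..255,
# mirroring {0 . 0 . 0 . i} in initializeLatin1MapAndEncodings.
LATIN1_ENCODINGS = [bytes([0, 0, 0, i]) for i in range(256)]


def encode_latin1_fast(text):
    """Encode a Latin-1-only string by table lookup (big-endian, no BOM)."""
    return b"".join(LATIN1_ENCODINGS[ord(ch)] for ch in text)
```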

Item was added:
+ ----- Method: UTF32TextConverter>>initialize (in category 'initialize-release') -----
+ initialize
+ 
+ 	super initialize.
+ 	useLittleEndian := useByteOrderMark := byteOrderMarkDone := false!

Item was added:
+ ----- Method: UTF32TextConverter>>next16BitValue:toStream: (in category 'private') -----
+ next16BitValue: value toStream: aStream
+ 
+ 	| v1 v2 |
+ 	v1 := (value bitShift: -8) bitAnd: 16rFF.
+ 	v2 := value bitAnd: 16rFF.
+ 	useLittleEndian
+ 		ifTrue: [
+ 			aStream 
+ 				basicNextPut: (Character value: v2);
+ 				basicNextPut: (Character value: v1) ]
+ 		ifFalse: [
+ 			aStream
+ 				basicNextPut: (Character value: v1);
+ 				basicNextPut: (Character value: v2) ].
+ !

Item was added:
+ ----- Method: UTF32TextConverter>>next32BitValue:toStream: (in category 'private') -----
+ next32BitValue: value toStream: aStream
+ 
+ 	| v1 v2 v3 v4 |
+ 	v1 := (value bitShift: -24) bitAnd: 16rFF.
+ 	v2 := (value bitShift: -16) bitAnd: 16rFF.
+ 	v3 := (value bitShift: -8) bitAnd: 16rFF.
+ 	v4 := (value bitShift: 0) bitAnd: 16rFF.
+ 	useLittleEndian
+ 		ifTrue: [
+ 			aStream 
+ 				basicNextPut: (Character value: v4);
+ 				basicNextPut: (Character value: v3);
+ 				basicNextPut: (Character value: v2);
+ 				basicNextPut: (Character value: v1) ]
+ 		ifFalse: [
+ 			aStream
+ 				basicNextPut: (Character value: v1);
+ 				basicNextPut: (Character value: v2);
+ 				basicNextPut: (Character value: v3);
+ 				basicNextPut: (Character value: v4) ].
+ !
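
The method above splits a 32-bit value into four bytes and emits them most significant first (big-endian) or least significant first (little-endian). The same arithmetic, as a small Python sketch with a hypothetical function name:

```python
def split_32bit(value, little_endian=False):
    """Split a 32-bit value into four bytes in the requested byte order."""
    v1 = (value >> 24) & 0xFF  # most significant byte
    v2 = (value >> 16) & 0xFF
    v3 = (value >> 8) & 0xFF
    v4 = value & 0xFF          # least significant byte
    if little_endian:
        return bytes([v4, v3, v2, v1])
    return bytes([v1, v2, v3, v4])
```

Applied to the byte order mark 16r0000FEFF, this gives 00 00 FE FF in big-endian order and FF FE 00 00 in little-endian order, which are the two patterns the reading side looks for.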

Item was added:
+ ----- Method: UTF32TextConverter>>nextFromStream: (in category 'conversion') -----
+ nextFromStream: aStream
+ 
+ 	| character1 character2 readBOM charValue character3 character4 |
+ 	aStream isBinary ifTrue: [ ^aStream basicNext ].
+ 	character1 := aStream basicNext ifNil: [ ^nil ].
+ 	character2 := aStream basicNext ifNil: [ ^nil ].
+ 	character3 := aStream basicNext ifNil: [ ^nil ].
+ 	character4 := aStream basicNext ifNil: [ ^nil ].
+ 	
+ 	readBOM := false.
+ 	(character1 asciiValue = 16rFF and: [character2 asciiValue = 16rFE]) ifTrue: [
+ 		self
+ 			useByteOrderMark: true;
+ 			useLittleEndian: true.
+ 		readBOM := true ].
+ 	
+ 	((character1 asciiValue = 0 and: [character2 asciiValue = 0]) 
+ 	and: [character3 asciiValue = 16rFE and: [character4 asciiValue = 16rFF]]) ifTrue: [
+ 		self
+ 			useByteOrderMark: true;
+ 			useLittleEndian: false.
+ 		readBOM := true ].
+ 
+ 	readBOM ifTrue: [
+ 		"Re-initialize character variables if they contain BOM"
+ 		character1 := aStream basicNext ifNil: [ ^nil ].
+ 		character2 := aStream basicNext ifNil: [ ^nil ].
+ 		character3 := aStream basicNext ifNil: [ ^nil ].
+ 		character4 := aStream basicNext ifNil: [ ^nil ]. ].
+ 
+ 	useLittleEndian 
+ 		ifTrue: [ charValue := (character4 charCode bitShift: 24) + (character3 charCode bitShift: 16) + (character2 charCode bitShift: 8) + character1 charCode ]
+ 		ifFalse: [ charValue := (character1 charCode bitShift: 24) + (character2 charCode bitShift: 16) + (character3 charCode bitShift: 8) + character4 charCode ].
+ 
+ 	^ Unicode value: charValue!
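
The read path above takes four bytes at a time, switches byte order when it sees a byte order mark, and then combines the bytes into one code point. The Python sketch below illustrates that flow under two stated differences: it looks for a mark only at the start of the input, and it compares all four bytes of the little-endian mark (FF FE 00 00), whereas the method above tests only the first two. The function name is hypothetical.

```python
def decode_utf32(data, little_endian=False):
    """Decode UTF-32 bytes, honoring a leading byte order mark if present."""
    if data[:4] == b"\xff\xfe\x00\x00":      # little-endian BOM
        little_endian, data = True, data[4:]
    elif data[:4] == b"\x00\x00\xfe\xff":    # big-endian BOM
        little_endian, data = False, data[4:]
    order = "little" if little_endian else "big"
    # Ignore a trailing partial group, comparable to the ^nil exits above.
    usable = len(data) - len(data) % 4
    return "".join(
        chr(int.from_bytes(data[i:i + 4], order)) for i in range(0, usable, 4)
    )
```

Big-endian is the default here because the converter's initialize method sets useLittleEndian to false.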

Item was added:
+ ----- Method: UTF32TextConverter>>nextPut:toStream: (in category 'conversion') -----
+ nextPut: aCharacter toStream: aStream
+ 
+ 	| charCode |
+ 	aStream isBinary ifTrue: [ ^aCharacter storeBinaryOn: aStream ].
+ 	(useByteOrderMark and: [ byteOrderMarkDone not ]) ifTrue: [
+ 		self next32BitValue: 16r0000FEFF toStream: aStream.
+ 		byteOrderMarkDone := true ].
+ 	(charCode := aCharacter charCode) < 256
+ 		ifTrue: [
+ 			(latin1Encodings at: charCode + 1)
+ 				ifNil: [ self next32BitValue: charCode toStream: aStream ]
+ 				ifNotNil: [ :encodedString | aStream basicNextPutAll: encodedString ] ]
+ 		ifFalse: [
+ 			self next32BitValue: charCode toStream: aStream ].
+ 	^aCharacter!
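
On the write side, the byte order mark is emitted once, before the first character, and only when useByteOrderMark is set. A minimal Python sketch of that bookkeeping follows; the class `Utf32Writer` is a hypothetical stand-in, and it omits the Latin-1 table shortcut.

```python
class Utf32Writer:
    """Hypothetical stand-in for the converter's write side."""

    def __init__(self, little_endian=False, use_bom=False):
        self.order = "little" if little_endian else "big"
        self.use_bom = use_bom
        self.bom_done = False  # plays the role of byteOrderMarkDone

    def encode_char(self, ch):
        """Return the bytes written for one character."""
        out = b""
        if self.use_bom and not self.bom_done:
            # The mark goes out once, in the chosen byte order.
            out += (0xFEFF).to_bytes(4, self.order)
            self.bom_done = True
        return out + ord(ch).to_bytes(4, self.order)
```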

Item was added:
+ ----- Method: UTF32TextConverter>>swapLatin1EncodingByteOrder (in category 'private') -----
+ swapLatin1EncodingByteOrder
+ 	latin1Encodings := latin1Encodings collect: [:each | 
+ 		each ifNotNil: [each reverse]]!

Item was added:
+ ----- Method: UTF32TextConverter>>useByteOrderMark (in category 'accessing') -----
+ useByteOrderMark
+ 
+ 	^useByteOrderMark
+ !

Item was added:
+ ----- Method: UTF32TextConverter>>useByteOrderMark: (in category 'accessing') -----
+ useByteOrderMark: aBoolean
+ 
+ 	useByteOrderMark := aBoolean.
+ !

Item was added:
+ ----- Method: UTF32TextConverter>>useLittleEndian (in category 'accessing') -----
+ useLittleEndian
+ 
+ 	^useLittleEndian
+ !

Item was added:
+ ----- Method: UTF32TextConverter>>useLittleEndian: (in category 'accessing') -----
+ useLittleEndian: aBoolean
+ 
+ 	aBoolean = useLittleEndian ifFalse: [ self swapLatin1EncodingByteOrder ].
+ 	useLittleEndian := aBoolean.
+ !
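
The setter above reverses the cached Latin-1 entries only when the flag actually changes, so setting the same value twice does not swap the table back. A small Python sketch of that guard, with a hypothetical class name:

```python
class ByteOrderCache:
    """Hypothetical model of the cached Latin-1 table and its byte order flag."""

    def __init__(self):
        self.little_endian = False
        self.table = [bytes([0, 0, 0, i]) for i in range(256)]

    def set_little_endian(self, flag):
        # Reverse each cached entry only on an actual change of byte order.
        if flag != self.little_endian:
            self.table = [entry[::-1] for entry in self.table]
        self.little_endian = flag
```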


