[squeak-dev] The Trunk: Multilingual-ul.85.mcz

commits at source.squeak.org
Sat Feb 6 19:44:24 UTC 2010


Andreas Raab uploaded a new version of Multilingual to project The Trunk:
http://source.squeak.org/trunk/Multilingual-ul.85.mcz

==================== Summary ====================

Name: Multilingual-ul.85
Author: ul
Time: 6 February 2010, 12:37:47.216 am
UUID: 47b3a790-8f3b-6341-abf3-21bf95892821
Ancestors: Multilingual-nice.84

- fix #basicNext: and #basicUpTo: in MultiByteFileStream
- add chunk reading capabilities to TextConverter
- assume that MultiByteFileStream's converter is properly initialized in #next
- MultiByteFileStream >> #nextChunk uses its converter's chunk reading capabilities; this gives a >3x speedup if the file has UTF-8 encoding
- fix: MultiByteFileStream lost its position if the ! character was encoded to more than a single byte (e.g. UTF-16)
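
For readers unfamiliar with the chunk format that #nextChunk parses, here is a minimal workspace sketch. It assumes a plain in-memory ReadStream rather than a MultiByteFileStream, so no converter is involved; the sample string and the variable names are made up for illustration.

```smalltalk
"Chunk format sketch: a lone $! terminates a chunk, and a doubled
terminator stands for one embedded $! character.
Assumption: a plain ReadStream, not a MultiByteFileStream."
| input first second |
input := ReadStream on: 'Transcript show: ''hi!!''! 3 + 4!'.
first := input nextChunk.	"should answer the text up to the first lone terminator, with the doubled terminator collapsed to one"
second := input nextChunk.	"leading separators are skipped first, so this should answer '3 + 4'"
```

The position fix listed above matters for the step after a lone terminator: the reader has already consumed one character too many while checking for a doubled terminator, and stepping back one byte is only correct when that character occupies exactly one byte in the file.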

=============== Diff against Multilingual-nice.84 ===============

Item was changed:
  ----- Method: MultiByteFileStream>>nextChunk (in category 'fileIn/Out') -----
  nextChunk
  	"Answer the contents of the receiver, up to the next terminator character. Doubled terminators indicate an embedded terminator character."
  
+ 	^converter nextChunkFromStream: self!
- 	self skipSeparators.
- 	^self parseLangTagFor: (
- 		String new: 1000 streamContents: [ :stream |
- 			| character |
- 			[ 
- 				(character := self next) == nil or: [ 
- 					character == $!! and: [ 
- 						self next ~~ $!! ] ] ] 
- 				whileFalse: [ stream nextPut: character ].
- 			character ifNotNil: [ self skip: -1 ] ])!

Item was added:
+ ----- Method: TextConverter>>nextChunkFromStream: (in category 'fileIn/Out') -----
+ nextChunkFromStream: input
+ 	"Answer the contents of input, up to the next terminator character. Doubled terminators indicate an embedded terminator character."
+ 	
+ 	input skipSeparators.
+ 	^input parseLangTagFor: (
+ 		String new: 1000 streamContents: [ :output |
+ 			| character state |
+ 			[ 
+ 				(character := self nextFromStream: input) == nil or: [ 
+ 					character == $!! and: [ 
+ 						state := self saveStateOf: input.
+ 						(self nextFromStream: input) ~~ $!! ] ] ] 
+ 				whileFalse: [ output nextPut: character ].
+ 			character ifNotNil: [ 
+ 				self restoreStateOf: input with: state ] ])!

Item was added:
+ ----- Method: UTF8TextConverter>>nextChunkFromStream: (in category 'fileIn/Out') -----
+ nextChunkFromStream: input
+ 	"Answer the contents of input, up to the next terminator character. Doubled terminators indicate an embedded terminator character."
+ 	
+ 	input skipSeparators.
+ 	^input parseLangTagFor: (
+ 		String new: 1000 streamContents: [ :stream |
+ 			[
+ 				stream nextPutAll: (input basicUpTo: $!!).
+ 				input basicNext == $!! ]
+ 					whileTrue: [ 
+ 						stream nextPut: $!! ].
+ 			input atEnd ifFalse: [ input skip: -1 ] ]) utf8ToSqueak!

Item was changed:
  ----- Method: MultiByteFileStream>>next (in category 'public') -----
  next
  
  	| char secondChar state |
+ 	char := converter nextFromStream: self.
- 	char := (converter ifNil: [ self converter ]) nextFromStream: self.
  	(wantsLineEndConversion == true and: [ lineEndConvention notNil ]) "#doConversion is inlined here"
  		 ifTrue: [
  			char == Cr ifTrue: [
  				state := converter saveStateOf: self.
  				secondChar := self bareNext.
  				secondChar ifNotNil: [
  					secondChar == Lf ifFalse: [ converter restoreStateOf: self with: state ] ].
  				^Cr ].
  			char == Lf ifTrue: [
  				^Cr ] ].
  	^char.
  
  !

Item was changed:
  ----- Method: MultiByteFileStream>>basicUpTo: (in category 'private basic') -----
+ basicUpTo: delim 
+ 	"Fast version to speed up nextChunk"
+ 	| pos buffer count |
+ 	collection ifNotNil: [
+ 		(position < readLimit and: [
+ 			(pos := collection indexOf: delim startingAt: position + 1) <= readLimit and: [
+ 				pos > 0 ] ]) ifTrue: [
+ 					^collection copyFrom: position + 1 to: (position := pos) - 1 ] ].
+ 	pos := self position.
+ 	buffer := self basicNext: 2000.
+ 	(count := buffer indexOf: delim) > 0 ifTrue: 
+ 		["Found the delimiter part way into buffer"
+ 		self position: pos + count.
+ 		^ buffer copyFrom: 1 to: count - 1].
+ 	self atEnd ifTrue:
+ 		["Never found it, and hit end of file"
+ 		^ buffer].
+ 	"Never found it, but there's more..."
+ 	^ buffer , (self basicUpTo: delim)!
- basicUpTo: delim
- 
- 	^ super upTo: delim.
- !

Item was changed:
  ----- Method: MultiByteFileStream>>basicNext: (in category 'private basic') -----
  basicNext: anInteger
  
+ 	^self basicNextInto: (self collectionSpecies new: anInteger)!
- 	^ super next: anInteger.
- !



