[squeak-dev] The Trunk: System-nice.539.mcz

commits at source.squeak.org commits at source.squeak.org
Thu May 30 22:55:41 UTC 2013


Nicolas Cellier uploaded a new version of System to project The Trunk:
http://source.squeak.org/trunk/System-nice.539.mcz

==================== Summary ====================

Name: System-nice.539
Author: nice
Time: 31 May 2013, 12:54:45.518 am
UUID: 54f88b1d-e023-472f-924e-cec4ca5fb3d8
Ancestors: System-fbs.538

Now that snapshot/source.st inside .mcz files is encoded in UTF-8,
let MczInstaller decode UTF-8.

Previously, snapshot/source.st was encoded in Latin-1 (ISO-8859-1) for ByteString and UTF-32BE for WideString.
It was always decoded as MacRoman by MczInstaller (or as UTF-8 if a BOM was present, but there never was one).
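
To make the old mismatch concrete, here is a minimal sketch in Python rather than Squeak. Python's `latin-1` and `mac_roman` codecs stand in for the Squeak converters, and the sample string is hypothetical; this only models the write-as-Latin-1, read-as-MacRoman behaviour described above and is not the MczInstaller code.

```python
# Illustrative model only: Python codecs stand in for Squeak's text converters.
text = "café"  # hypothetical source text containing one non-ASCII character

# The old writer stored ByteString sources as Latin-1 bytes.
stored = text.encode("latin-1")

# The old reader interpreted those same bytes as MacRoman.
misread = stored.decode("mac_roman")

# ASCII characters survive, but bytes >= 0x80 come back as different characters.
print(repr(text), "was read back as", repr(misread))
```

Pure ASCII sources were unaffected by the mismatch, since Latin-1 and MacRoman agree on the ASCII range.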

Note that .mcz files do not use a BOM, for compatibility with legacy code.
Backward compatibility with older files is therefore handled by catching the InvalidUTF8 exception.
The rationale is that the previous interpretation of snapshot/source.st was completely broken and presumably unused,
so the added complexity of BOM-based compatibility is not really worth it.
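
The fallback strategy can be modelled outside Squeak as follows. This is a sketch in Python under stated assumptions: `decode_source` is a hypothetical helper name, `UnicodeDecodeError` plays the role of `InvalidUTF8`, and Python's built-in codecs stand in for `utf8ToSqueak` and `WideString fromByteArray:`. It mirrors the order of checks in the diff below but is not the actual implementation.

```python
def decode_source(raw: bytes) -> str:
    """Hypothetical helper: try UTF-8 first, then fall back to legacy encodings."""
    try:
        # New-style files: plain UTF-8 without a BOM.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Legacy file. A leading null byte plus a length divisible by 4
        # suggests a WideString that was stored as UTF-32BE.
        if raw.startswith(b"\x00") and len(raw) % 4 == 0:
            return raw.decode("utf-32-be")
        # Otherwise assume a ByteString that was stored as Latin-1.
        return raw.decode("latin-1")


# Usage with hypothetical sample contents for each of the three cases.
print(decode_source("café".encode("utf-8")))
print(decode_source("café".encode("latin-1")))
print(decode_source("a€".encode("utf-32-be")))
```

The legacy branches are reached only when the bytes are not valid UTF-8, so the heuristic costs nothing for files written in the new encoding.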

=============== Diff against System-fbs.538 ===============

Item was changed:
  ----- Method: MczInstaller>>installMember: (in category 'installation') -----
  installMember: member
  	 
  	self useNewChangeSetDuring:
+ 		[ | str |
+ 		str := member contentStream text readStream contents.
+ 		str := [str utf8ToSqueak] on: InvalidUTF8 do: [:exc |
+ 			"Case of legacy encoding, presumably it is latin-1 and we do not need to do anything
+ 			But if contents starts with a null character, it might be a case of WideString encoded in UTF-32BE"
+ 			exc return: (((str beginsWith: Character null asString) and: [ str size \\ 4 = 0 ])
+ 				ifTrue: [WideString fromByteArray: str asByteArray]
+ 				ifFalse: [str])].
+ 		str readStream fileInAnnouncing: 'loading ', member fileName ]!
- 		[ | str |str := member contentStream text.
- 		str setConverterForCode.
- 		str fileInAnnouncing: 'loading ', member fileName]!


