[squeak-dev] Inboard Squeak Sources

David T. Lewis lewis at mail.msen.com
Sat Nov 19 16:06:57 UTC 2022

Attached is an update to the Internalize changes file change set, notes below.

On Fri, Oct 21, 2022 at 09:43:31PM -0400, David T. Lewis wrote:
> On Mon, Sep 26, 2022 at 08:06:31PM -0400, David T. Lewis wrote:
> > Attached is a change set that allows the changes log to be kept in the image,
> > rather than in an exteral file. Along with our existing "Cache sources file"
> > preference, the "Cache changes file" preference allows the image to be moved,
> > copied, and run independent of any external sources and changes files.
> > 
> > This is (sort of, and rather belatedly) a follow-up from this email from
> > Dan Ingalls back in 1998:
> > 
> >   http://lists.squeakfoundation.org/pipermail/squeak-dev/1998-March/007670.html
> > 
> > Dan was referring to Scott Wallace's inplementation from 1996, which can be
> > seen in early Squeak images. For example, go to http://try.squeak.org and try
> > running the Squeak 1.13 under SqueakJS in your web browser. Scott's original
> > implementation is in SystemDictionary>>internalizeChangeLog and
> > SystemDictionary>>internalizeSources.
> > 
> > This change set is the trivial re-implementation of moving the changes log
> > into the image. It allows myimage.image to be saved with the sources and
> > changes files internalized, then started and run on another computer with
> > no sources file or changes file present.
> > 
> > I have tried to do sensible things with respect to writing the changes log
> > back out to the file system when the preference setting is changed. I've
> > been playing with this for a couple of days and it seems to work well so
> > far, although I have tested only on Linux. The basic file policy is:
> > 
> > - When internalizing the changes log, leave the existing changes file
> > alone, do not delete it.
> > 
> > - When externalizing the changes log, check first if a changes file exists,
> > if yes then give the user the option of overwriting it (or doing nothing).
> > If no changes file is present, create a new one.
> > 
> > Dave
> Attached is an update to the change set that fixes an issue that I found
> after using it for a while. When internalizing the changes log, we should
> read the changes file with a StandardFileStream to avoid accidental
> conversion of the file contents to WideString (thereby consuming 4X the
> memory space, not what we are looking for here).
> This is a small but useful change set. I will live with it a while longer,
> then add it to trunk if no further issues.

The 'Internalize changes file.17.cs' change set adds UTF8 conversion
for source chunks that contain Japanese characters or other wide
characters. The internal changes log is represented as ChangeLogStream,
which works as an extension of ReadWriteStream on a ByteStream.
Positioning in the stream is by byte position, and reading and
writing source chunks is done with a UTF8TextConverter that handles
four byte characters in the same manner as a MultiByteFileStream on
a changes file.

In addition to handling source text conversion properly for wide
characters, I also found an interesting (and to me unexpected)
behavior. If we have a ReadWriteStream on a ByteString, and then
put a wide character to the stream, the ByteString automatically
becomes a WideString. This is four times the size of the original
string (4 bytes per character), which is definitely not what we
want for a changes log stored in the image.

I have been using this change set in a couple of working images,
and it seems to be safe for general use at this point. If anyone
else is using it, please say if there are any issues or problems.

I will plan to put this update into trunk in another week or so.



-------------- next part --------------
���'From Squeak6.1alpha of 18 October 2022 [latest update: #22263] on 6 November 2022 at 12:55:35 pm'!
"Change Set:		Internalize changes file
Date:			6 November 2022
Author:			David T. Lewis

Allow the changes log to be maintained within the image rather than as an external file. Provide a preference for changing the configuration. UTF8 encoding is assumed, and several conversion methods are duplicated from MultiByteFileStream.

Inspired by the original Scott Wallace implementation circa 1996, see SystemDictionary>>internalizeChangeLog and SystemDictionary>>internalizeSources in early Squeak images."!

ReadWriteStream subclass: #ChangeLogStream
	instanceVariableNames: 'TextConverter'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Files-System'!

!ChangeLogStream commentStamp: 'dtl 11/5/2022 16:22' prior: 0!
A ChangeLogStream is a memory resident changes log, equivalent to the traditional file based changes but not stored on the external file system. UTF8 encoding is assumed for all sources and a UTF8TextConverter is used for reading and writing to the stream.

SequenceableCollection subclass: #SourceFileArray
	instanceVariableNames: ''
	classVariableNames: 'CachedChanges '
	poolDictionaries: ''
	category: 'Files-System'!

!ChangeLogStream methodsFor: 'converting' stamp: 'dtl 11/5/2022 16:09'!
	"Assume UTF8 for all sources"

	^TextConverter ifNil: [TextConverter := UTF8TextConverter new].! !

!ChangeLogStream methodsFor: 'converting' stamp: 'dtl 9/25/2022 16:56'!
	^ self! !

!ChangeLogStream methodsFor: 'converting' stamp: 'dtl 9/25/2022 10:53'!
	^self! !

!ChangeLogStream methodsFor: 'fileIn/Out' stamp: 'dtl 11/6/2022 12:48'!
	"Deliver the next chunk as a Text.  Decode the following ]style[ chunk if present.  Position at start of next real chunk."
	^self converter nextChunkTextFromStream: self! !

!ChangeLogStream methodsFor: 'testing' stamp: 'dtl 9/24/2022 19:14'!
	^false! !

!ChangeLogStream methodsFor: 'accessing' stamp: 'dtl 11/5/2022 16:17'!
nextPut: aCharacter

	^self converter nextPut: aCharacter toStream: self! !

!ChangeLogStream methodsFor: 'accessing' stamp: 'dtl 11/5/2022 16:13'!
nextPutAll: aCollection

	^self converter nextPutAll: aCollection toStream: self! !

!ChangeLogStream methodsFor: 'private basic' stamp: 'ul 1/31/2011 02:05'!
basicNext: anInteger putAll: aCollection startingAt: startIndex

	^super next: anInteger putAll: aCollection startingAt: startIndex! !

!ChangeLogStream methodsFor: 'private basic' stamp: 'yo 8/28/2002 11:07'!
basicNextPut: char

	^ super nextPut: char.
! !

!ChangeLogStream methodsFor: 'private basic' stamp: 'yo 8/28/2002 11:07'!
basicNextPutAll: aString

	^ super nextPutAll: aString.
! !

!SmalltalkImage methodsFor: 'image, changes names' stamp: 'dtl 9/25/2022 16:15'!
openSourcesAndChanges: changesName forImage: imageName 
	"Open the changes and sources files and install them in SourceFiles. Inform the user of problems regarding write permissions or CR/CRLF mixups."
	"Note: SourcesName and imageName are full paths; changesName is a  
	local name."
	| sources changes msg wmsg |
	msg := 'Squeak cannot locate {1}.

Please check that the file is named properly and is in the same directory as this image.'.
	wmsg := 'Squeak cannot write to {1}.

Please check that you have write permission for this file.

You won''t be able to save this image correctly until you fix this.'.

	sources := Smalltalk openSources.
	sources ifNotNil: [sources setConverterForCode].
	changes := SourceFileArray cachedChanges
		ifNil: [Smalltalk openChanges: changesName forImage: imageName].
	changes ifNotNil: [changes setConverterForCode].
	((sources == nil or: [sources atEnd])
			and: [Preferences valueOfFlag: #warnIfNoSourcesFile])
		ifTrue: [Smalltalk platformName = 'Mac OS'
				ifTrue: [msg := msg , '
Make sure the sources file is not an Alias.'].
self inform: (msg format: { 'the sources file named ' , self sourcesName })].

	(changes == nil
			and: [Preferences valueOfFlag: #warnIfNoChangesFile])
		ifTrue: [self inform: (msg format: { 'the changes file named ' , changesName })].

	((Preferences valueOfFlag: #warnIfNoChangesFile) and: [changes notNil])
		ifTrue: [changes isReadOnly
				ifTrue: [self inform: (wmsg format: { 'the changes file named ' , changesName })].

			((changes next: 200)
					includesSubstring: String crlf)
				ifTrue: [self inform: ('The changes file named <b>{1}</b> has been injured by an unpacking utility. Line endings were changed from Cr to CrLf.<br><br>Please set the preferences in your decompressing program to 
<b>do not convert text files</b> and unpack the system again.' translated format: { changesName }) asTextFromHtml]].

	SourceFiles := Array with: sources with: changes! !

!SmalltalkImage methodsFor: 'sources, changes log' stamp: 'dtl 9/26/2022 11:22'!
saveAs: newName
	"Save the image  under that new name."
	newName ifNil:[^ self].

	(SourceFileArray internalizeChanges or: (SourceFiles at: 2) isNil) ifFalse:
		[self closeSourceFiles; "so copying the changes file will always work"
			 saveChangesInFileNamed: (self fullNameForChangesNamed: newName)].
	self saveImageInFileNamed: (self fullNameForImageNamed: newName)! !

!SmalltalkImage methodsFor: 'sources, changes log' stamp: 'dtl 9/26/2022 11:25'!
	"Save the image/changes using the next available version number."
	"Smalltalk saveAsNewVersion"
	| newName changesName aName anIndex |
	aName := FileDirectory baseNameFor: (FileDirectory default localNameFor: self imageName).
	anIndex := aName lastIndexOf: FileDirectory dot asCharacter ifAbsent: [nil].
	(anIndex notNil and: [(aName copyFrom: anIndex + 1 to: aName size) isAllDigits])
			[aName := aName copyFrom: 1 to: anIndex - 1].

	newName := FileDirectory default nextNameFor: aName extension: FileDirectory imageSuffix.
	changesName := self fullNameForChangesNamed: newName.

	(SourceFileArray internalizeChanges or: (SourceFiles at: 2) isNil) ifFalse: [
		"Check to see if there is a .changes file that would cause a problem if we saved a new .image file with the new version number"
		(FileDirectory default fileOrDirectoryExists: changesName)
				[^ self inform:
'There is already .changes file of the desired name,
', newName, '
curiously already present, even though there is
no corresponding .image file.   Please remedy
manually and then repeat your request.'].

		self closeSourceFiles; "so copying the changes file will always work"
			saveChangesInFileNamed: (self fullNameForChangesNamed: newName)
	self saveImageInFileNamed: (self fullNameForImageNamed: newName)

! !

!SourceFileArray class methodsFor: 'internalize changes' stamp: 'dtl 9/25/2022 17:57'!
	"When present, CachedChanges replaces the traditional external changes
	file with an internal stream in the image."
	^CachedChanges! !

!SourceFileArray class methodsFor: 'internalize changes' stamp: 'dtl 10/19/2022 19:03'!
	| changes fs contents |
	changes := SourceFiles at: 2.
	(changes isKindOf: FileStream)
		ifTrue: [[fs := StandardFileStream "Use StandardFileStream, not FileStream concreteStream, to avoid WideString conversion"
				readOnlyFileNamed: changes name.
			fs binary.
			contents := fs upToEnd]
				ensure: [fs close].
			"Smalltalk openSourceFiles."
			^ contents asString]
		ifFalse: [^ changes contents]! !

!SourceFileArray class methodsFor: 'internalize changes' stamp: 'dtl 11/5/2022 15:06'!
	"Move the changes log to an external file. This is the traditional packaging, in
	which the sources file and changes file are maintained separately from the
	image file."
	self safeToExportChanges
		ifTrue: [ | fs |
			[fs := FileStream fileNamed: Smalltalk changesName.
			fs binary.
			fs nextPutAll: CachedChanges contents.
			CachedChanges := nil]
				ensure: [ fs close ].
			Smalltalk openSourceFiles].
! !

!SourceFileArray class methodsFor: 'internalize changes' stamp: 'dtl 11/5/2022 15:47'!
	"Move the change log to an internal stream and maintain it within the image.
	This configuration may be risky because the record of recently logged changes
	is no longer available in a separate external file. If the image file itself becomes
	unusable, the recent changes log file will not be available."
	(ChangeLogStream with: self changeLogContents)
		ifNotNil: [:strm | 
			CachedChanges := strm.
			SourceFiles at: 2 put: CachedChanges]! !

!SourceFileArray class methodsFor: 'internalize changes' stamp: 'dtl 9/25/2022 17:35'!

	<preference: 'Cache changes file'
	category: 'Files'
	description: 'Log changes to internal stream rather than external file. External changes file will not be available for error recovery.'
	type: #Boolean>
	^CachedChanges notNil
! !

!SourceFileArray class methodsFor: 'internalize changes' stamp: 'dtl 9/25/2022 16:43'!
internalizeChanges: internalize

		ifTrue: [ self internalizeChangeLog ]
		ifFalse: [self externalizeChangeLog ].! !

!SourceFileArray class methodsFor: 'internalize changes' stamp: 'dtl 9/25/2022 17:32'!
	"Changes log is currently internal, and there is no external changes file with
	the name that would be used for saving externally or the user confirms that
	it safe to overwrite that existing file. Existing changes file will be removed if
	user confirms."

	^ CachedChanges
		ifNil: [ false ]
		ifNotNil: [ (FileDirectory default fileExists: Smalltalk changesName)
			ifTrue: [ | response |
				(response :=self confirm: Smalltalk changesName , ' exists, overwrite?')
					ifTrue: [[ FileDirectory default deleteFileNamed: Smalltalk changesName ]
						on: Error do: [ self notify: 'cannot delete ', Smalltalk changesName. ^false ]].
				^ response ]
			ifFalse: [true]].

! !

SequenceableCollection subclass: #SourceFileArray
	instanceVariableNames: ''
	classVariableNames: 'CachedChanges'
	poolDictionaries: ''
	category: 'Files-System'!

More information about the Squeak-dev mailing list