[squeak-dev] Reassessing the hack in WriteStream>>nextChunkPut: (was: Squeak 4.6 release update)

Levente Uzonyi leves at elte.hu
Sun Mar 8 19:03:54 UTC 2015


I checked the code, and came to the conclusion that it's not an OS bug. 
The cause of the problem is that we're writing the file (the changes file 
in this case) using one file descriptor, and try to read its contents 
using other descriptors (the read-only copies of the source files).
But the written bytes will only become visible to other file 
descriptors of the same file after calling the fflush() function (which is 
what does #flush do).

Here are some snippets showing how it works:

"This one should fail, because the bytes are not flushed."
StandardFileStream newFileNamed: 'test.txt' do: [ :file |
 	| readContents |
 	file nextPutAll: 'test'.
 	readContents := StandardFileStream readOnlyFileNamed: 'test.txt' do: [ :file2 |
 		file2 contents ].
 	self assert: readContents = 'test' ].

"Sending #flush will make it work."
StandardFileStream newFileNamed: 'test.txt' do: [ :file |
 	| readContents |
 	file nextPutAll: 'test'; flush.
 	readContents := StandardFileStream readOnlyFileNamed: 'test.txt' do: [ :file2 |
 		file2 contents ].
 	self assert: readContents = 'test' ].

"Reading from the same file descriptor always works."
StandardFileStream newFileNamed: 'test.txt' do: [ :file |
 	| readContents |
 	file nextPutAll: 'test'.
 	readContents := file reset; next: 4.
 	self assert: readContents = 'test' ]

The reason why the old code used to work, is because there was only one 
file descriptor used to read and write the changes file.

#flush is pretty costly, and IMO it's called way too often if it's in
#nextChunkPut:. The most common workaround to avoid frequent calls is to 
use InMidstOfFileinNotification to check if it's a bulk write, and flush 
only once in those cases. For some reason this technique is not used in 
case of class comments.
I changed the last lines of ClassDescription >> #classComment:stamp: in my 
image to

 	self organization classComment: (RemoteString newString: aString onFileNumber: 2) stamp: aStamp.
 	InMidstOfFileinNotification signal ifFalse: [file flush].
 	SystemChangeNotifier uniqueInstance classCommented: self.

Then removed the #flush from WriteStream >> #nextChunkPut:, and it seems 
to me that the problem is gone.

Levente

On Sun, 8 Mar 2015, David T. Lewis wrote:

> On Sun, Mar 08, 2015 at 11:29:18AM -0400, David T. Lewis wrote:
>> On Sun, Mar 08, 2015 at 11:44:18AM +0100, Tobias Pape wrote:
>>> Hi,
>>>
>>> On 08.03.2015, at 04:18, Levente Uzonyi <leves at elte.hu> wrote:
>>>
>>>> On Sat, 7 Mar 2015, David T. Lewis wrote:
>>>>
>>>>> On Sat, Mar 07, 2015 at 08:28:55PM +0100, Tobias Pape wrote:
>>>>>> Hey,
>>>>>> On 07.03.2015, at 19:01, gettimothy <gettimothy at zoho.com> wrote:
>>>>>>
>>>>>>> The image fixes the WriteStream>>NextChunkPut error we where getting when trying to modify comments which makes me happy (:
>>>>>>
>>>>>> to which fix do you refer?
>>>>>> There is a hack out there calling flush every now and then,
>>>>>> but this all calls for a principled solution for streams that
>>>>>> read _and_ write.
>>>>>
>>>>> You are right that it is a hack, but the actual problem is a defect
>>>>> in some C runtime libraries, notably on Ubuntu but possibly others
>>>>> as well.
>>>>
>>>> Are you sure it's a bug in the OS? Isn't it just the allocate-on-flush behavior?https://en.wikipedia.org/wiki/Allocate-on-flush
>>
>> No, I am not sure. I had convinced myself that it was a libc issue, but I may be wrong.
>>
>
> To check this, I tried running an image from a thumb drive with vfat file
> system, which presumably does not have the allocate-on-flush feature. The
> bug is still present (confirmed by reverting WriteStream>>nextChunkPut: to
> the earlier version). So the problem does not appear to be associated with
> the file system.
>
> Dave
>
>
>
>>>
>>> I too doubt an OS bug.
>>> Someone (i don't remember) said to me, that the behavior of the c calls
>>> make clear not to rely on certain thing.
>>> I plan to investigate the issue soonish.
>>>
>>> best
>>> 	-tobias
>>
>> That would be great.
>>
>> The workaround that "fixes" the comment problem is in WriteStream>>nextChunkPut:,
>> so if you remove the #flush at the end of this method, the problem can be
>> reproduced. At the time, I was not able to come up with a unit test that would
>> reproduce the problem, but my best guess as to what was happening is in the
>> update comment:
>>
>>
>>   Name: Collections-dtl.568
>>   Author: dtl
>>   Time: 5 May 2014, 12:39:30.026 pm
>>
>>   Add a flush to WriteStream>>nextChunkPut:
>>
>>   This is a workaround for a bug in the runtime library of some versions of
>>   Ubuntu. The symptom is that creation of a class comment for a class that
>>   previously had no comment leads to a file size error in the new RemoteStream
>>   that points to the class comment. Actual file size and contents of the
>>   changes file are not affected by this bug, and the error occurs when reading
>>   contents of the changes file immediately following the initial save, Flushing
>>   the stream after writing a chunk to the changes file prevents the problem.
>>
>> Dave
>>
>
>


More information about the Squeak-dev mailing list