file size primitive bug?

Julian Fitzell julian at beta4.com
Fri Feb 27 02:20:35 UTC 2004


So the fix I originally had seems problematic.  I noticed it when filing 
in.  I think what's going on is that for each method change:

1. we call #setToEnd
2. #setToEnd gets the file size and sets the position to that
3. we write out data

The problem is that by the time we call #setToEnd the next time, the 
data may not have been flushed to the disk yet.  Previously when writing 
out data, the VM was updating the cached file size, so the next call to 
#setToEnd would go to that point (what FileStream wants the end of the 
file to be).  By changing the file size primitive to return the actual 
file size, we now get (not surprisingly) the size on disk, which in this 
case is earlier in the file.
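
To make the effect above concrete, here is a minimal C sketch, not the VM 
code itself.  It assumes a POSIX system; on_disk_size() and 
demo_stale_size() are hypothetical names made up for illustration.  It 
writes through a fully buffered stdio stream and asks fstat() for the 
size before and after fflush():

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file as the OS currently sees it, obtained with fstat()
   on the stream's descriptor.  Returns -1 on error. */
static long on_disk_size(FILE *f)
{
    struct stat st;
    if (fstat(fileno(f), &st) != 0)
        return -1;
    return (long)st.st_size;
}

/* Hypothetical demo: write 5 bytes to a temporary file and report the
   fstat() size before and after the stdio buffer is flushed.
   Returns 0 on success, -1 on error. */
static int demo_stale_size(long *before, long *after)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;

    /* Ask for full buffering so the small write stays in memory. */
    setvbuf(f, NULL, _IOFBF, BUFSIZ);

    if (fwrite("hello", 1, 5, f) != 5) {
        fclose(f);
        return -1;
    }
    *before = on_disk_size(f);   /* buffered data is usually not counted yet */

    if (fflush(f) != 0) {
        fclose(f);
        return -1;
    }
    *after = on_disk_size(f);    /* now includes the 5 bytes just written */

    fclose(f);
    return 0;
}
```

On a typical libc the "before" size lags behind what was written, which 
is the kind of gap #setToEnd runs into when it trusts the size reported 
by the OS while data is still sitting in a buffer.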

The best solution Avi and I could come up with to keep me going in the 
short term is to change the flush primitive to update the cached file 
size.  This way, you can call flush before you call #setToEnd if you are 
expecting another process to have modified the file.  Taking the broad 
interpretation of flush as meaning "I want my file stream to be in synch 
with the file on disk" this makes a certain amount of sense, but I'm not 
sure it's something we actually want to put in the VM (seems a tad hackish).
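
A rough sketch of the idea above, assuming a POSIX system.  The 
CachedFile struct and the cached_* functions are hypothetical stand-ins, 
not the VM's actual file structure or primitives.  The second stream in 
the demo plays the role of another process appending to the same file:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/stat.h>

/* Hypothetical stand-in for the VM's per-file record: a stream plus a
   cached size that is refreshed when the file is flushed. */
typedef struct {
    FILE *file;
    long  cached_size;
} CachedFile;

/* Flush our buffered writes, then re-read the real size from the OS. */
static int cached_flush(CachedFile *cf)
{
    struct stat st;
    if (fflush(cf->file) != 0)
        return -1;
    if (fstat(fileno(cf->file), &st) != 0)
        return -1;
    cf->cached_size = (long)st.st_size;
    return 0;
}

/* Rough analogue of #setToEnd: seek to the cached end of file. */
static int cached_set_to_end(CachedFile *cf)
{
    return fseek(cf->file, cf->cached_size, SEEK_SET);
}

/* Hypothetical demo: another writer appends 8 bytes; our cached size
   stays stale until we flush.  Returns 0 on success, -1 on error. */
static int demo_flush_resync(long *stale, long *fresh)
{
    char path[] = "/tmp/sqsize-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    close(fd);

    CachedFile ours = { fopen(path, "r+"), 0 };
    FILE *other = fopen(path, "a");      /* the "other process" */
    if (ours.file == NULL || other == NULL) {
        if (ours.file) fclose(ours.file);
        if (other) fclose(other);
        unlink(path);
        return -1;
    }

    fputs("appended", other);            /* 8 bytes */
    fflush(other);

    *stale = ours.cached_size;           /* still the size we opened with */
    int rc = cached_flush(&ours);        /* resync with the file on disk */
    if (rc == 0)
        rc = cached_set_to_end(&ours);
    *fresh = ours.cached_size;

    fclose(other);
    fclose(ours.file);
    unlink(path);
    return rc;
}
```

Calling the flush before the seek is what keeps the two views in step 
here; without it, the seek would go to the stale cached size.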

So maybe Flow is the right solution -- I don't know -- but I don't think 
I'll submit this fix right now, and that's about all the time I have to 
spend on this problem at the moment.  Others should be aware though 
that, until we come up with a better solution, there is risk of data 
corruption if you are appending to a file from two processes (yes, even 
if you use locking).

Julian

tim Rowledge wrote:
> Julian Fitzell wrote:
> 
>> I've encountered a problem, while using OmniBase, in that Squeak 
>> returns an incorrect file size if the size has changed since it was 
>> opened. This appears to be because the file size is returned from a 
>> rarely-updated file structure.  I have solved this by changing 
>> sqFileSize() to get a file descriptor with "fileno(f->file)" and then 
>> using fstat() to get the file size.
> 
> 
> It's pretty horrible isn't it. I've never been able to work out why it 
> would be done like that. The size value is only increased by writing to 
> the file; thus if you have two accessors on the same file (haven't I 
> complained about this before?) it is quite possible for one to claim the 
> file has reached its end whilst the other is happily writing well beyond 
> that point.
> 
> Yet again we see that it is time to replace the lot.
> 
>> 1) Is there some reason I shouldn't be doing this in my own VM?
>> 2) Should I be preparing this to submit as a fix? 
> 
> 
> 
> Ooh, yes please. A complete redesign would be a good start. Or a 
> proposal to incorporate Flow.
> 
> tim
> 
> 


