[squeak-dev] new VM appears not to be flushing

Beckmann, Tom Tom.Beckmann at student.hpi.uni-potsdam.de
Tue Jan 14 08:38:18 UTC 2020

A random thought without proper understanding of the problem: could the problem or change in behavior be related to the Linux build recently switching to clang rather than GCC (at least I believe it used to be GCC)? These low-level primitives would likely be in the libc that the compiler links to, no?

From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Eliot Miranda <eliot.miranda at gmail.com>
Sent: Tuesday, January 14, 2020 9:37:03 AM
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] new VM appears not to be flushing

On Jan 14, 2020, at 12:07 AM, Phil B <pbpublist at gmail.com> wrote:

Has anything changed in the VM related to process scheduling?  Assuming you're doing your data store writes and transaction logging via processes, that would be one of the first places I'd look.

No, the scheduler/interrupt code is unchanged.  Let’s let Chris stabilize his tests and then find out whether they are affected, and by which versions of the VM.  I’m not convinced there’s a signal there yet.

On Mon, Jan 13, 2020 at 11:57 PM Chris Muller <asqueaker at gmail.com> wrote:
Oh, okay, #sync does what I *thought* #flush did.  I would never discover some of these things if you weren't so generous with your expertise, thanks.  I think #sync is what I want since I want it to be safe from sudden power outage, not just a process kill.

I did try changing it to #sync, but it still failed the same way.  I then went back and tried my #reopen a second time, this time it failed in the same way too!  So either it got lucky before, or I erred in my testing somehow, but I'm actually glad that it seems to be failing consistently now.

There's definitely something different with the new VM, but it's not flush.  Sorry for the false alarm.  I'll have to keep digging.

 - Chris

On Mon, Jan 13, 2020 at 4:39 PM Levente Uzonyi <leves at caesar.elte.hu> wrote:
Hi Chris,

#flush calls fflush(), #sync calls fflush() and then fsync().
The former does not write data to disk, the latter does. And the latter is
obviously a lot slower. If that explanation is not clear, this might be
better: https://stackoverflow.com/a/2340641
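
For illustration, here is a minimal C sketch of the two-step sequence described above, assuming a POSIX system. It is not the VM's actual primitive code; the helper name append_durably and the file name in the usage note are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fileno() and fsync() */
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the VM sources): append `data` to the
 * file at `path` and do not return until the kernel reports that the
 * bytes have been handed to the storage device.
 * Returns 0 on success, -1 on any failure. */
int append_durably(const char *path, const char *data)
{
    FILE *f = fopen(path, "a");
    if (f == NULL)
        return -1;

    if (fputs(data, f) == EOF) {
        fclose(f);
        return -1;
    }

    /* Step 1, what #flush is said to do: fflush() moves the bytes from
     * the stdio buffer in this process into the kernel. They survive a
     * kill of the process, but not necessarily a power failure. */
    if (fflush(f) != 0) {
        fclose(f);
        return -1;
    }

    /* Step 2, what #sync is said to add: fsync() asks the kernel to
     * write its cached copy of the file out to the device. */
    if (fsync(fileno(f)) != 0) {
        fclose(f);
        return -1;
    }

    return fclose(f) == 0 ? 0 : -1;
}
```

A call such as append_durably("recovery.log", "record 1\n") returns only after both steps have succeeded. Skipping step 2 is faster, but leaves the data in the kernel's cache for some time.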

On Mon, 13 Jan 2020, Chris Muller wrote:

> The test case proves that it does.

Can you give us a small snippet which reproduces the bug and can be
executed in a Trunk image?

> The comment of the method is,
>     "When writing, flush the current buffer out to disk."

That comment is wrong. That might have been true 21 years ago, even
though that's not very likely either.
The comment on StandardFileStream >> #flush is correct, but could be more
verbose: "Flush pending changes".


> I know what filesystem I deploy to, but Squeak appears to be silently ignoring this (rather important) expectation about #flush, that its own comment presents.
>   - Chris
> On Sun, Jan 12, 2020 at 6:15 PM Levente Uzonyi <leves at caesar.elte.hu> wrote:
>       Hi Chris,
>       Do you expect #flush to write the changes to disk?
>       Levente
>       On Sun, 12 Jan 2020, Chris Muller wrote:
>       > Magma has been stable in 5.2 for a long time under an older VM, all tests pass.  But by changing ONLY the VM (not the image) to the new release-candidate, it fails the forward-recovery test.  This test tests the scenario of a server failure during mid-write.  Unless I change StandardFileStream>>#flush as in Files-cmm.182, the recovery data which Magma relies on #flush to ensure is preserved is, in fact, not preserved.  It appears to be a breakage of the contract which causes the test to fail.  This functionality is important to avoid corrupting databases.
>       > I saw a discussion on the Cuis list in which someone was asserting that flush is no longer necessary(!!), and made a vague reference to a "thread on squeak-dev" which I never found.
>       >
>       > I hope this is just an oversight, otherwise I'll have to rely on something like Files-cmm.182, which is half the speed of the old #flush.
>       >
>       > Best,
>       >   Chris
>       >
>       >
