[squeak-dev] new VM appears not to be flushing

Chris Muller asqueaker at gmail.com
Thu Jan 16 00:04:04 UTC 2020


After a two-day rabbit-hole adventure, I finally found the cause of the
failing test.  I'll spare myself having to type the long story and you
having to see it and just say, it came down to

     NetNameResolver localHostName

having been fixed in the VM: the new VM reports the hostname, whereas the
old one reported the IP string.  This caused the test to take a
different path which I've now accounted for.  The Magma suite now passes in
5.2 with the new VM.  Now I can begin debugging the issues it's having in
Squeak 5.3!   :/
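
A minimal sketch of the kind of check this implies, written in Python rather
than Squeak and not taken from the actual Magma change: code that consumes the
local host name may receive either a hostname or an IP string, so it can test
which one it got instead of assuming. The function names `is_ip_literal` and
`describe_local_name` are hypothetical.

```python
import ipaddress


def is_ip_literal(name):
    """Return True if name parses as an IPv4 or IPv6 address literal."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        # Not an address literal, so treat it as a hostname.
        return False


def describe_local_name(name):
    """Classify a local-host string so callers can branch explicitly."""
    return "ip" if is_ip_literal(name) else "hostname"
```

A test written this way would take the same path regardless of which form the
VM reports.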

Thanks again for all y'all's help.

 - Chris

On Tue, Jan 14, 2020 at 2:38 AM Beckmann, Tom <
Tom.Beckmann at student.hpi.uni-potsdam.de> wrote:

> A random thought without proper understanding of the problem: could the
> problem or change in behavior be related to the linux build recently
> switching to clang rather than GCC (at least I believe it used to be GCC)?
> These lowlevel primitives would likely be in the libc that the compiler
> links to, no?
>
> Best,
> Tom
> ________________________________________
> From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on
> behalf of Eliot Miranda <eliot.miranda at gmail.com>
> Sent: Tuesday, January 14, 2020 9:37:03 AM
> To: The general-purpose Squeak developers list
> Subject: Re: [squeak-dev] new VM appears not to be flushing
>
> On Jan 14, 2020, at 12:07 AM, Phil B <pbpublist at gmail.com> wrote:
>
> 
> Has anything changed in the VM related to process scheduling?  Assuming
> you're doing your data store writes and transaction logging via processes,
> that would be one of the first places I'd look.
>
> No, the scheduler/interrupt code is unchanged.  Let’s let Chris stabilize
> his tests and then find out whether, and by which versions of the VM, they
> are affected.  I’m not convinced there’s a signal there yet.
>
> On Mon, Jan 13, 2020 at 11:57 PM Chris Muller <asqueaker at gmail.com> wrote:
> Oh, okay, #sync does what I *thought* #flush did.  I would never discover
> some of these things if you weren't so generous with your expertise,
> thanks.  I think #sync is what I want since I want it to be safe from
> sudden power outage, not just a process kill.
>
> I did try changing it to #sync, but it still failed the same way.  I then
> went back and tried my #reopen a second time, and this time it failed in the
> same way too!  So either it got lucky before, or I erred in my testing
> somehow, but I'm actually glad that it seems to be failing consistently now.
>
> There's definitely something different with the new VM, but it's not
> flush.  Sorry for the false alarm.  I'll have to keep digging.
>
>  - Chris
>
> On Mon, Jan 13, 2020 at 4:39 PM Levente Uzonyi <leves at caesar.elte.hu> wrote:
> Hi Chris,
>
> #flush calls fflush(); #sync calls fflush() and then fsync().
> The former does not write data to disk, the latter does. And the latter is
> obviously a lot slower. If that explanation is not clear, this might be
> better: https://stackoverflow.com/a/2340641
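
The distinction described above can be sketched in Python, whose `flush()` and
`os.fsync()` play roles analogous to the C calls: `flush()` hands buffered data
to the operating system, and `os.fsync()` asks the operating system to commit
it to the storage device. This is an illustration under those assumptions, not
the Squeak implementation; `append_record` and the `durable` flag are
hypothetical names.

```python
import os
import tempfile


def append_record(path, payload, durable):
    """Append payload to path and return the number of bytes written.

    With durable=False the data only reaches the OS (survives a process
    kill). With durable=True the OS is also asked to write it to the
    device (intended to survive a power loss), which is slower.
    """
    with open(path, "ab") as f:
        written = f.write(payload)
        f.flush()                  # user-space buffer -> operating system
        if durable:
            os.fsync(f.fileno())   # operating system -> storage device
    return written
```

For a recovery log that must survive a sudden power outage, the `durable=True`
path corresponds to what #sync is described as doing.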
>
> On Mon, 13 Jan 2020, Chris Muller wrote:
>
> > The test case proves that it does.
>
> Can you give us a small snippet which reproduces the bug and can be
> executed in a Trunk image?
>
> >
> > The comment of the method is,
> >
> >     "When writing, flush the current buffer out to disk."
>
> That comment is wrong. That might have been true 21 years ago, even
> though that's not very likely either.
> The comment on StandardFileStream >> #flush is correct, but could be more
> verbose: "Flush pending changes".
>
>
> Levente
>
> >
> > I know what filesystem I deploy to, but Squeak appears to be silently
> > ignoring this (rather important) expectation about #flush, that its own
> > comment presents.
> >
> >   - Chris
> >
> >
> >
> > On Sun, Jan 12, 2020 at 6:15 PM Levente Uzonyi <leves at caesar.elte.hu> wrote:
> >       Hi Chris,
> >
> >       Do you expect #flush to write the changes to disk?
> >
> >       Levente
> >
> >       On Sun, 12 Jan 2020, Chris Muller wrote:
> >
> >       > Magma has been stable in 5.2 for a long time under an older VM;
> >       > all tests pass.  But by changing ONLY the VM (not the image) to the
> >       > new release-candidate, it fails the forward-recovery test.  This
> >       > test covers the scenario of a server failure mid-write.  Unless I
> >       > change StandardFileStream>>#flush as in Files-cmm.182, the recovery
> >       > data which Magma relies on #flush to preserve is, in fact, not
> >       > preserved.  It appears to be a breakage of the contract, which
> >       > causes the test to fail.  This functionality is important to avoid
> >       > corrupting databases.
> >       > I saw a discussion on the Cuis list in which someone was asserting
> >       > that flush is no longer necessary(!!), and made a vague reference
> >       > to a "thread on squeak-dev" which I never found.
> >       >
> >       > I hope this is just an oversight, otherwise I'll have to rely on
> >       > something like Files-cmm.182, which is half the speed of the old
> >       > #flush.
> >       >
> >       > Best,
> >       >   Chris
> >       >
> >       >
> >
> >
> >
>
>
>
>