[SqueakSource] SSFilesystem woes

Philippe Marschall philippe.marschall at gmail.com
Fri Aug 3 07:25:45 UTC 2007


2007/8/3, Andreas Raab <andreas.raab at gmx.de>:
> Hi -
>
> [I couldn't find a mailing list dedicated to SqueakSource discussions so
> I'm abusing Squeak-dev here. If there is one, please point me to it]
>
> We've been using SqueakSource at Qwaq for our internal projects and
> unfortunately it works only so-so. Mostly it works but every other day
> it basically spins out at 100% CPU and needs to be killed for good.
> Since this usually loses the last checkin(s) it's a major annoyance
> which we work around by sending an email message which includes a
> portion of the log so that at least you have a chance to see if it was
> "your" checkin that was just lost.
>
> That was until two days ago. About a week or so ago, we ran out of disk
> space on the box and after restoring it the server was working quite
> well until it again spun out at 100% CPU. After the restart we noticed that
> we hadn't lost just one but several dozen checkins - basically
> everything that happened after we ran out of disk space didn't show up.
>
> Since this smelled like a major disaster I actually dug into the
> SqueakSource code to see what can be done to restore our data
> (fortunately, I could see that all of the data was actually on the
> server). This immediately showed a couple of major issues:
>
> 1) When SSFilesystem saves a repository it uses a mutex to serialize
> access but it doesn't protect a client from modifying the repository
> *while* it is saving. Since this is a process running in background
> priority, two saves in quick succession will lead to the second save
> modifying the repository that the first one is trying to write on disk.
> And indeed, looking at our problems, many of them show a pattern of two
> commits closely together like here:
>
> 2007-07-10T21:09:57+00:00 PUT /Qwaq/QwaqForums-1.0.42.mcm (qwaq)
> 2007-07-10T21:09:57+00:00 MODIFIED by SSSession>>putRequest:
> 2007-07-10T21:09:57+00:00 BEGIN SAVING
> 2007-07-10T21:11:03+00:00 PUT /Qwaq/QwaqForums-1.0.41.mcm (qwaq)
> 2007-07-10T21:11:03+00:00 MODIFIED by SSSession>>putRequest:
>
> (note that the "END SAVING" is missing before the second put) So it
> seems like one of the failure modes is that the repository is being
> modified *while* it is being saved. In addition, I think that one of the
> reasons why so many of the saved snapshots are "kaputt" is simply that
> they are broken by the same concurrent modification.
>
> I'd appreciate some insight from the authors (or anyone else
> knowledgeable) on what the right fix for this problem might be. I have no
> idea how Seaside in general deals with these concurrency issues but it
> seems pretty clear that SSFilesystem is *not* safe in the face of
> concurrent modifications of the repository.
>
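One possible shape for a fix to the problem in 1) is to copy the repository state while holding the same lock that writers use, and then write only that copy to disk. The following is a minimal sketch of that idea in Python, not SqueakSource or Smalltalk code; the Repository class, its put and save methods, and the JSON snapshot format are all hypothetical names chosen for illustration.

```python
import copy
import json
import os
import tempfile
import threading


class Repository:
    """Hypothetical in-memory repository; not the SqueakSource model."""

    def __init__(self, path):
        self._path = path                    # snapshot file location
        self._state_lock = threading.Lock()  # guards _versions
        self._save_lock = threading.Lock()   # serializes whole saves
        self._versions = {}                  # file name -> metadata

    def put(self, name, metadata):
        # Writers take the same lock the saver uses while copying state.
        with self._state_lock:
            self._versions[name] = metadata

    def save(self):
        with self._save_lock:
            # Copy under the lock so a later put cannot alter what is written.
            with self._state_lock:
                frozen = copy.deepcopy(self._versions)
            # Write the copy to a temp file, then rename it into place, so a
            # failed write leaves the previous snapshot untouched.
            directory = os.path.dirname(os.path.abspath(self._path))
            fd, tmp = tempfile.mkstemp(dir=directory)
            with os.fdopen(fd, "w") as handle:
                json.dump(frozen, handle)
            os.replace(tmp, self._path)
            return frozen
```

A put that arrives during the slow write simply waits for the next save; it cannot change the snapshot that is already being written. The cost is one in-memory copy of the state per save.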
> 2) Much to my surprise I found that SSFilesystem actually *has* code
> that can be used to recover versions if any of the above happens
> (SSFilesystem>>importVersionsFor:) but it seems to be pretty much unused
> and affected by some bit rot. One of the things I did for our version is
> to hook this code up with the case that the last snapshot is kaputt, so
> that if there is a broken snapshot SSFilesystem automatically imports
> all the versions that aren't currently present in the repository. I'm
> attaching the recovery code in case anyone else has had similar problems.
>
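The recovery idea in 2) amounts to comparing the version files on disk with what the loaded snapshot knows about and importing the difference. A minimal sketch in Python follows; it does not reproduce SSFilesystem>>importVersionsFor:, and the function names, the file suffixes, and the import_version callback are assumptions made for illustration.

```python
import os


def missing_versions(directory, known_names, suffixes=(".mcz", ".mcm")):
    """Return version files on disk that the snapshot does not list."""
    on_disk = sorted(
        name for name in os.listdir(directory)
        if name.endswith(suffixes)
    )
    return [name for name in on_disk if name not in known_names]


def recover(directory, known_names, import_version):
    """Import each file the snapshot is missing; return the imported names."""
    imported = []
    for name in missing_versions(directory, known_names):
        # import_version is a hypothetical callback that reads one file
        # and registers it with the repository.
        import_version(os.path.join(directory, name))
        imported.append(name)
    return imported
```

Running this whenever the latest snapshot fails to load would mirror the hook described above: the files on disk are treated as the source of truth and the snapshot as a cache that can be rebuilt.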
> Question: Does anyone use similar/other changes like those? If so I'd be
> interested in learning about them.
>
> 3) The speed (and snapshot size) of SSFilesystem is pretty abysmal (on
> our box a repository snapshot is about 4mins and about 4MB each).
> Looking at what it's writing it seems that most of it is information
> that is easily available from the .MCZs and really doesn't need to be
> kept in the snapshot.
>
> Question: Is anyone using alternative storage mechanisms (lightweight &
> fast perhaps)? If so, what do you use and how does it work out?
> Generally speaking, what *do* people use for SqueakSource storage given
> that SSFilesystem is generally quite unreliable?

There is a Magma version in the Impara repository. It has also been
ported to Gemstone/S. I have never used either of them or heard any
usage reports.

Cheers
Philippe

> I'd appreciate any help on the above issues.
>
> Cheers,
>    - Andreas


