[squeak-dev] FileStreams Limit

Chris Muller asqueaker at gmail.com
Sat Feb 19 05:00:17 UTC 2022


Hi Jörg,

> My problem is simply that I need to leave the streams open coz reopening
> for every write is too slow.
>

I'm all too familiar with this challenge!  For Magma's file
storage, depending on the app's design and client behavior, there is no
theoretical upper limit on the number of files utilized.  As you can
imagine, it didn't take long for a medium-sized domain to run into the
upper limit of simultaneously-open files, and affect the server (this was
back in 2005).  I realized that, to be a resilient server, Magma would
*have* to be designed to operate within all manner of resource limits.

How does it solve this particular one?  It defaults to a maximum of only
180 files open at a given time (low enough for almost any environment, but
large enough to be able to perform), which can be adjusted up or down by
the app administrator according to their knowledge of their VM and OS
environment.  Internally, Magma adds opened FileStreams into a fixed-size
LRU cache.  Upon any access, a FileStream is "renewed" back to the top,
and as more streams are opened beyond the set capacity, the least recently
used ones are closed just before being pushed off the bottom.
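
The cache described above can be sketched roughly as follows.  This is a
minimal Python illustration, not Magma's actual Smalltalk code: the class
name OpenFileCache, its methods, and the choice of append mode are all
assumptions made for the example.  Only the default of 180 and the
renew/evict behavior come from the description.

```python
from collections import OrderedDict


class OpenFileCache:
    """Keep at most `capacity` files open; close the least recently used.

    Hypothetical sketch of the strategy described in the text, not Magma's API.
    """

    def __init__(self, capacity=180):
        self.capacity = capacity
        # Maps path -> open file object, ordered oldest access first.
        self._open = OrderedDict()

    def get(self, path):
        """Return an open file for `path`, opening it if necessary."""
        f = self._open.get(path)
        if f is not None:
            # "Renew" the stream: mark it as the most recently used.
            self._open.move_to_end(path)
            return f
        # At capacity: close the least recently used stream(s) first,
        # so the number of open files never exceeds the limit.
        while len(self._open) >= self.capacity:
            _, oldest = self._open.popitem(last=False)
            oldest.close()  # close() also flushes buffered writes
        # Append mode (an assumption) so a reopened file continues where it ended.
        f = open(path, "ab")
        self._open[path] = f
        return f

    def close_all(self):
        """Close every cached file, e.g. at shutdown."""
        while self._open:
            _, f = self._open.popitem()
            f.close()
```

A caller would always go through cache.get(path).write(...) rather than
holding on to the returned file object, since any cached file may be closed
by a later eviction.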

It's a strategy that has worked remarkably well over the years.

> I have realtime data coming through a socket in nanosecond precision and
> the file handling must be very fast. Currently I have 120 nanosecond
> realtime streams and 2645 minute-based streams.
>

Magma is fast enough for well-designed applications that can tolerate
sub-second response times, but not sub-nanosecond requirements.  To do the kind
of real time application you mentioned in Squeak, I think you would have to
just dump it to a file that is consumed separately, or make some sort of
implementation dedicated to that use-case.  Darn, I hate to have to say
that, sorry.


> As I use now binary format instead of the previous CSV format, I cannot
> read the plain data files anyway, so maybe I will give Magma a try. It does
> not matter if I can’t read binary files or can’t read Magma files with a
> text editor :-)
>

The entire database can be browsed by simply opening the #root in an
Explorer window and navigating down.  Even if the total model is terabytes
in size, apart from a pause of a few milliseconds when opening big branches,
the experience is indistinguishable from exploring a local object.

 - Chris

PS -- Incidentally, the number of simultaneously-open FileStreams is not
the only constrained resource.  Depending on the host platform, there may
be limitations on maximum file size, too.  The same FileStream subclass
for Magma solves this as well: it uses a default max size of 1.8GB per
physical file, with .2., .3., etc. created and accessed transparently, as
if it were one big file...
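
The arithmetic behind treating several physical files as one big file can be
sketched like this.  It is a rough Python illustration under stated
assumptions, not Magma's implementation: the function names, the reading of
1.8GB as 1.8e9 bytes, and the exact file-naming pattern are all hypothetical.

```python
import os

# Assumption: "1.8GB" taken as 1.8e9 bytes; the real limit may be defined differently.
MAX_SEGMENT_BYTES = 1_800_000_000


def locate(offset, segment_bytes=MAX_SEGMENT_BYTES):
    """Map a logical offset to (segment number, offset within that segment).

    Segment 1 is the original file; segments 2, 3, ... are the overflow files.
    """
    index, local = divmod(offset, segment_bytes)
    return index + 1, local


def segment_name(base, segment):
    """Hypothetical naming: data.dat, data.2.dat, data.3.dat, ..."""
    if segment == 1:
        return base
    stem, ext = os.path.splitext(base)
    return f"{stem}.{segment}{ext}"


def spans(offset, length, segment_bytes=MAX_SEGMENT_BYTES):
    """Split a logical byte range into (segment, local offset, length) pieces.

    A read or write that crosses a segment boundary becomes several pieces,
    one per physical file.
    """
    pieces = []
    while length > 0:
        segment, local = locate(offset, segment_bytes)
        chunk = min(length, segment_bytes - local)
        pieces.append((segment, local, chunk))
        offset += chunk
        length -= chunk
    return pieces
```

A stream wrapper would call spans() for each read or write and apply every
piece to the file named by segment_name(), so callers only ever see one
continuous range of logical offsets.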