Next steps

Chris Muller chris at funkyobjects.org
Fri Jan 13 05:10:16 UTC 2006


Hey Göran, I don't have the insight into your domain
that you have, nor experience with Seaside.
Nevertheless, my strong intuition suggests we should
step back and again consider having one Magma session
per Seaside session.

I am not sure whether you are trying to optimize for
speed or memory consumption, but I think that this 1:1
approach is good for both.
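
To make the 1:1 idea concrete, here is a rough sketch
of the kind of thing I have in mind - a Seaside
session subclass that lazily opens its own
MagmaSession.  The class name, host/port, and user
name are placeholders for your own setup:

  WASession subclass: #MyAppSession
      instanceVariableNames: 'magmaSession'
      classVariableNames: ''
      poolDictionaries: ''
      category: 'MyApp'.

  MyAppSession>>magmaSession
      "Lazily open and connect one MagmaSession for this
       Seaside session."
      magmaSession ifNil:
          [magmaSession := MagmaSession
              hostAddress: (NetNameResolver addressForName: 'localhost')
              port: 51969.
           magmaSession connectAs: 'goran'].
      ^magmaSession

Disconnect the MagmaSession from whatever
session-expiry hook your Seaside version provides, so
connections don't pile up.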

> > Still, it is probably good to try to keep the
> > readSet as small as possible.
> 
> Well, I find this recommendation slightly odd *in
> general*. I understand how it makes each transaction
> faster - but on the other hand you lose the caching
> benefit. For example, in this app I want a significant
> part of the model to be cached at all times - the meta
> model. It will not be large (so I can afford to cache
> it, even in several sessions), but it will be heavily
> used so I don't want to end up reading it over and
> over.

It's ok.  Go ahead and cache your meta-model in each
session if it's not so big, but seriously, let
everything else be read dynamically as-needed.  Let
every session have only a very small portion of the
domain cached, and keep it small via #stubOut: -
roughly as in the sketch below.
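
In code, the pattern I mean is roughly this (a sketch
only: the root structure, #metaModel, #cases,
caseNumber and renderCase: are placeholders, and I'm
assuming #stubOut: is sent to the MagmaSession with
the object whose sub-graph you want released):

  "Keep the small, hot meta-model strongly referenced
   for the life of the session..."
  metaModel := magmaSession root at: #metaModel.

  "...but let big case graphs materialize on demand and
   then be released again."
  aCase := (magmaSession root at: #cases) at: caseNumber.
  self renderCase: aCase.
  magmaSession stubOut: aCase.
      "back to a proxy, so the session cache stays small"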

Reads (proxy materializations) are one of the fastest
things Magma does.  You are supposed to *enjoy* the
transparency, not have to worry about such complex
ways to circumvent it.

ReadStrategies and #stubOut: are intended to optimize
read-performance and memory consumption, respectively.
If these are not sufficient, and assuming the
uni-session approach (all Seaside sessions sharing one
MagmaSession and one copy of the domain) is not
either, *then* these other, more complex alternatives
should be considered.  It's not easy for me to say,
but I have to face the truth: if the intended
transparency of Magma cannot be enjoyed anyway, then
lots of other, equally less-transparent options become
just as worth considering.

> As a reminder - the reason for my discussion on this
> topic is that I feel that the "simplistic approach" of
> simply using a single MagmaSession for each Seaside
> session doesn't scale that well. I am looking at a
> possible 100 concurrent users (in the end, not from the
> start) using an object model with at least say 50000
> cases - which of course each consists of a number of
> other objects. Sure, I can use some kind of multi-image
> clustering with round-robin Apache in front etc, but
> still.

Well, it may scale better than you think.  Peak
single-object read rate is 3149 per second on my slow
laptop, and reads of one thousand objects at a time
run at 7.15 per second (see
http://minnow.cc.gatech.edu/squeak/5606 or run your
own MagmaBenchmarker).  So if a Case consists of 1000
objects and 100 users all request a case at exactly
the same time, the longest delay would be on the order
of ten seconds (assuming you're not serving with my
slow, circa-2004 laptop).  Optimizing the ReadStrategy
for a Case would allow better performance.
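
Taking those figures at face value, the
back-of-the-envelope arithmetic is just a workspace
doodle:

  1000 / 3149.0.  "~0.32 seconds to read one 1000-object
                   Case at the single-object rate"
  100 / 7.15.     "about 14 seconds before the last of 100
                   simultaneous case requests is served -
                   the rough figure above"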

Any single-image Seaside server where you want to
cache a whole bunch of stuff is going to have this
sort of scalability issue, no matter what DB is used,
right?  Remember, you could also use the many:1
approach (all Seaside sessions sharing one Magma
session and a single copy of the domain) - how would
this differ from any other solution?

The 1:1 design, OTOH, is what makes multi-image
clustering possible, so it reduces risk on that front.
That's the one I would try very hard to make work
before abandoning TSTTCPW.

> As a sidenote, GemStone has a "shared page cache" so
> that multiple sessions actually share a cache of
> objects in RAM.

That's in the server-side GemStone-Smalltalk image
memory though, isn't it?  Magma doesn't do that.

> Could we possibly contemplate some way of having
> sessions share a cache? Yes, complex stuff I know. Btw,
> could you perhaps explain how the caching works today?
> Do you have some kind of low level cache on the file
> level for example?

I'm open to ideas.  The caching is very simple right
now; it just uses WeakIdentityDictionaries to hold the
objects that have been read.
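
Roughly along these lines - this is only an
illustration of the idea, not the actual Magma code:

  | cache obj |
  cache := WeakIdentityKeyDictionary new.
  obj := OrderedCollection new.  "stands in for a freshly
                                  materialized object"
  cache at: obj put: 12345.      "12345 stands in for its oid"
  "As soon as the application drops its last strong
   reference..."
  obj := nil.
  "...the garbage collector is free to reclaim the object,
   and the cache entry goes with it.  A later access would
   have to re-read (materialize) it from the repository."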

> > A commit is pretty cheap with a small readSet.  With
> > a large readSet, WriteBarrier will definitely improve
> > it dramatically.
> 
> I kinda guessed. Otherwise you keep an original
> duplicate of all cached objects, right? So WriteBarrier
> also improves on memory consumption I guess.

No to the first question, yes to the second (IIRC). 
It doesn't keep an original "duplicate", just the
original buffer that was read.
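
For reference, the commit itself stays ordinary Magma
usage; something like this (the root structure and
newCase are placeholders):

  magmaSession commit:
      [magmaSession root
          at: newCase caseNumber
          put: newCase].
  "Changes made inside the block are detected against the
   readSet (or tracked by the WriteBarrier, when enabled)
   and then written out."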

> Very good. :) And also - do you have any clue on how
> the performance is affected by using the various
> security parts?

Authorizing every request seems to have imposed about
a 10% penalty.  #cryptInFiles is hard to measure,
since writes occur in the background anyway.
#cryptOnNetwork definitely slows down network
transmissions considerably; only use it if you have
to.

Regards,
  Chris


