Next steps

goran at krampe.se
Thu Jan 12 06:42:04 UTC 2006


Hi Chris and all others!

Chris Muller <chris at funkyobjects.org> wrote:
> Hi Göran, what a dream project you have started!  The

Yes, indeed. And it is going great so far. Seaside is really nice - but
we all knew that of course. A tad short on class comments (as is Magma
<cough>), but it has examples - unfortunately not too many for the new
Canvas API - though it is pretty easy to "dig out" how to do things.

And Magma is just churning along so far, really neat. I am using the
ConnectionPool thingy that Cees has in Kilauea, with a few tweaks of my
own. I am still a bit undecided on how to deal with Magma sessions vs
Seaside sessions; right now I use the pool and allocate/release on each
request - but that is actually rather unpleasant, because my Seaside
components can't hold onto persistent domain objects that way, ouff!
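
To make that concrete, each request currently goes through roughly this
(the pool protocol comes from the tweaked Kilauea code, so the selectors
here are approximate, not real Magma or Kilauea API):

	handleRequest: aRequest
		| session |
		session := self sessionPool allocateSession.
		[session abort.	"refresh our view of the repository"
		self process: aRequest inSession: session]
			ensure: [self sessionPool releaseSession: session]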

As I wrote, I was toying with the idea of having the Seaside sessions
share a single "readonly" Magma session by default (greatly increasing
scalability), and then whenever they need to commit something they
would allocate a session from the pool (the pool holding a number of
already connected sessions), commit, and release the session.
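
Something like this, with the selectors made up for illustration (only
commit: is actual Magma protocol, as far as I know):

	readSession
		"all Seaside sessions share this one for browsing"
		^self sharedReadOnlySession

	modify: aBlock
		"grab a pooled session only for the duration of a commit"
		| session |
		session := self sessionPool allocateSession.
		[session commit: aBlock]
			ensure: [self sessionPool releaseSession: session]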

Unfortunately I would then typically be holding persistent objects from
the readonly session while wanting to modify them and commit in another
session... Is there any facility for dealing with that scenario? In
other words, given object x from session A, can I easily obtain the
same object x attached to session B? I assume not, given that it would
need the full backward chain of objects up to the root.

The alternative is to bite the bullet and simply re-traverse down to the
desired object x in session B when I decide I want to send modifying
messages to it and commit, but that will make the code uglier.
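
So instead of reusing the reference obtained in session A, I would do
something along these lines (the domain accessors are of course mine,
just for illustration):

	| x |
	x := (sessionB root at: #cases) at: aCaseId.	"re-traverse from the root"
	sessionB commit: [x name: 'New name']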

As a reminder, the reason for my discussion on this topic is that I
feel that the "simplistic approach" of simply using a single
MagmaSession for each Seaside session doesn't scale that well. I am
looking at possibly 100 concurrent users (in the end, not from the
start) using an object model with at least, say, 50000 cases - each of
which of course consists of a number of other objects. Sure, I can use
some kind of multi-image clustering with round-robin Apache in front
etc, but still.

Having 100 MagmaSessions all using their own copy of that model sounds
tough.

As a sidenote, GemStone has a "shared page cache" so that multiple
sessions actually share a cache of objects in RAM. Could we possibly
contemplate some way of having Magma sessions share a cache? Yes,
complex stuff, I know. Btw, could you perhaps explain how the caching
works today? Do you have some kind of low-level cache at the file
level, for example?

> very best projects are the few forging new territory
> and leading by example by progressive thinkers like
> yourself and progressive technologies, way to go.

In this project it feels like a prerequisite for success. No chance I
will be able to do this in the planned short time period otherwise.
Magma is essential, and so is Seaside. And Squeak of course.

> (I'd like to respond to your earlier note as well, sorry
> for the delay).
> 
> > 3. The new system is intended to support offline
> > operation, meaning that
> > users will be able to make a standalone installation
> > on their laptops
> > and then replicate a portion of the model to it,
> > work offline, and then
> > sync up later. I will be using a "prevaylerish"
> > pattern to accomplish
> > that, or call it a Command pattern if you like. I
> > also note the
> > existence of Magma's forwarding proxies (might get
> > handy). So yes, the
> > laptops will in that case run a local Magma with a
> > subset of the full
> > db.
> 
> This function will be built into Magma.  "1.1" has
> security, "1.2" will have security and "import/export"
> of large chunks of the persistent model.  1.2 is the
> very next thing I intend to work on as soon as the
> 1.1 and KryptOn are stabilized.

Yes, I saw that.

> I would like to share a few more thoughts about this
> function.  The idea is that Magma is too centralized. 
> There needs to be a way to accmoplish exactly what you
> said, for someone to be able to "download" a chunk of
> the model for their own offline work (i.e., on a
> plane) and then, later, 'sync-up'.
> 
> I also intend for this to serve as the basis for "long
> transactions."
> 
> Now, I want to try to avoid the notion of a "master"
> and "replicate".  Instead, any repository can simply
> be a conglomeration of objects from many other
> repositories, and the repository knows from whence
> each object "originated" to support the sync-up.

This is actually the same idea I want to use in the future architecture
of SM - which perhaps could turn out to be Magma based, who knows.

> If there is a commit-conflict during the sync-up, the
> committer can only get through that by bringing down
> the objects in conflict into their own repository,
> reapplying their updates, and then try to commit
> again.
> 
> Bottom line, you can download your own copy of the
> model and that copy is "yours" (you could host it, for
> example).  But the one you copied from is not yours,
> therefore the burden of commit-conflict reconciliation
> is always on the committer.
> 
> This function, combined with the ForwardingProxies, I
> hope will be sufficient for collaboration on
> large-scale domain models in a distributed fashion.

Right. In my current scenario I will still (for several reasons linked
to the requirements) want to "deal" with it using the Command pattern.
But I think you are on the right track with that focus. The other thing
I really would like to have is a damn good free text engine built in,
but hey, I will simply have to use something on the side. :)
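
For clarity, a minimal sketch of what I mean with the Command pattern
(class and selectors invented for the example):

	Object subclass: #RenameCaseCommand
		instanceVariableNames: 'caseId newName'
		classVariableNames: ''
		poolDictionaries: ''
		category: 'MyApp-Commands'

	RenameCaseCommand>>executeIn: aMagmaSession
		| case |
		case := (aMagmaSession root at: #cases) at: caseId.
		case name: newName

The laptop records the commands it executes against its local Magma
while offline; at sync time the same commands are replayed in order
against the central repository, so conflicts get resolved at the
command level rather than at the object level.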

> > 1. Regarding wrapping each request in a commit - how
> > costly is that? The
> > abort/begin is of course needed (and how much does
> > an abort cost if no
> > modifications have been made?), but how much does
> > the commit cost if I
> > say do no modifications but have a large "read set"?
> > I am guessing this
> > is much cheaper if I use WriteBarrier? Is the
> > WriteBarrier code fine to
> > use?
> 
> A commit is pretty cheap with a small readSet.  With a
> large readSet, WriteBarrier will definitely improve it
> dramatically.

I kinda guessed. Otherwise you would have to keep an original duplicate
of all cached objects, right? So WriteBarrier also improves memory
consumption, I guess.

> WriteBarrier is still supported, but I haven't tested
> it in a while.  WriteBarrier itself also has at least
> one bug related to changing the class-model while
> objects are behind a WriteBarrier.  Therefore, you
> should never use WriteBarrier in a development
> environment where classes will be recompiled.

No problem, as long as I can switch it on for deployment. :)
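
(I am assuming, from memory, that switching it on is a one-liner along
the lines of "MagmaSession allowWriteBarrier: true" - the exact selector
may well differ.)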

> Still, it is probably good to try to keep the readSet
> as small as possible.

Well, I find this recommendation slightly odd *in general*. I understand
how it makes each transaction faster - but on the other hand you lose
the caching benefit. For example, in this app I want a significant part
of the model - the meta model - to be cached at all times. It will not
be large (so I can afford to cache it, even in several sessions), but it
will be heavily used, so I don't want to end up reading it over and over.
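
In code terms, what I would like between requests is closer to this
(my own root keys, and my possibly naive understanding of the cache):

	session abort.	"refresh stale objects but keep the cache warm"
	metaModel := session root at: #metaModel.	"ideally answered from cache"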

[SNIP]
> Sure, all of the security can be essentially disabled.
>  The choice is yours.

Very good. :) And also - do you have any clue how performance is
affected by using the various security parts?

regards, Göran


