[Seaside] Re: Seaside and GOODS

Thu Mar 17 15:12:48 CET 2005

(copying the list after an off-list discussion on strategies for using GOODS)

On Thu, 17 Mar 2005 08:31:17 -0500, Daniel Salama <dsalama at user.net> wrote:

> What I mean is that I was thinking/hoping to simply use the standard
> collection classes to store my objects, but had my reservations in
> terms of how I can use that data efficiently. For example, if I store a
> zip code lookup "database" in a Dictionary (where the zip code is the
> key), I should be able to search by the zip code. However, if I wanted
> to present the user with a list of zip codes sorted by city name, I
> thought I would have to extract the subset of data I wished to present
> and then sort it internally in whichever way I wanted.
> 
> What I understand you're suggesting, and I guess is mainly for
> performance purposes, is that I use BTrees or TSTrees instead of the
> Dictionary, for example.

Right; the standard collection classes assume that the entire
collection is resident in memory, and optimize accordingly.  BTree and
TSTree assume that the collection is mostly resident on some slower
medium (like disk or over the network) and so try to minimize how much
of it needs to be loaded in for any given request.

A 5-digit zip code would probably want to be stored as a SmallInteger
key in a high-order BTree (IIRC, something like "BTree order: 20").

> I guess, internally, I may be thinking too much with a RDBMS mentality,
> thinking that I could simply ask GOODS for a list of zip codes in the
> state of Florida sorted by city name, as an example.

I would expect this to look something like

"sort by city name; assumes City responds to #name"
((self root stateAt: 'FL') cities
    asSortedCollection: [:a :b | a name <= b name])
        gather: [:ea | ea allZipCodes]

That is, the root would have an index of states; the State object
would have a list of its cities; the City would have a list of its zip
codes.  After the initial lookup of 'FL' it's all direct pointer
references.
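
To make the traversal concrete, here is a rough workspace sketch that
uses plain Dictionaries and Arrays as stand-ins for the State and City
objects.  The city names and zip codes are made up for illustration,
and in a real model these would be proper classes reachable from the
GOODS root.  It uses #inject:into: rather than #gather: so that it
should run in a stock image:

```smalltalk
| aventura miami florida root sortedCities zips |
"Stand-ins for City objects: a name plus the zip codes in that city."
aventura := Dictionary new.
aventura at: #name put: 'Aventura'; at: #zipCodes put: #(33180).
miami := Dictionary new.
miami at: #name put: 'Miami'; at: #zipCodes put: #(33101 33125).

"Stand-in for a State object: it holds direct references to its cities."
florida := Dictionary new.
florida at: #cities put: (OrderedCollection with: miami with: aventura).

"The root only indexes states by the key the user supplies."
root := Dictionary new.
root at: 'FL' put: florida.

"One keyed lookup, then pointer traversal: sort cities by name, collect zips."
sortedCities := ((root at: 'FL') at: #cities)
    asSortedCollection: [:a :b | (a at: #name) <= (b at: #name)].
zips := sortedCities
    inject: OrderedCollection new
    into: [:all :city | all addAll: (city at: #zipCodes); yourself].
zips "an OrderedCollection(33180 33101 33125)"
```

Only the 'FL' lookup goes through a key; everything after that follows
references the objects already hold.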

> Then, the other problem that comes to mind is, if I have a person
> object which stores a reference to a list of addresses, where each
> address stores a reference to the zip code, how can this be
> represented? 

The address object just stores a reference to the zip code object.

> In my mind, I would try to apply the data normalization
> rules I normally apply when designing a DB schema in an RDBMS, where
> there would be a bunch of foreign keys in, at least, three different
> tables (i.e. people, addresses, zip_codes). Would I or should I use a
> similar concept when using GOODS?

No. :)  Don't look stuff up by keys except when the keys are user
input (like 'FL' above, and even there, ideally they picked it from a
list which in Seaside means you should have a direct reference to the
object anyway).  The equivalent of a foreign key column is simply a
pointer reference (in an instance variable).  If you need to go in the
other (to-many) direction, store a collection.  You'll sometimes only
need one or the other, and sometimes need both - in which case you
have to be careful to maintain the invariants:

State>>addCity: aCity
    aCity setState: self.
    ^ cities add: aCity

Generally, I recommend making the collection side of this the public
interface, and the backlink private (#setState: vs. #state:) and never
sent directly.  Or the other way around, but be consistent about which
you use.
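
The other half of that invariant might look roughly like the
following.  This is only a sketch: the instance variable names (state,
cities) and the #removeCity: selector are assumptions for illustration,
not anything GOODS requires.

```smalltalk
City>>setState: aState
    "Private backlink setter; sent only by State>>addCity: and State>>removeCity:."
    state := aState

City>>state
    "Public read accessor for the backlink."
    ^ state

State>>removeCity: aCity
    "Remove from the collection first (this signals an error if aCity is
     absent), then clear the backlink so both sides stay in sync."
    cities remove: aCity.
    aCity setState: nil.
    ^ aCity
```

Client code only ever sends #addCity: and #removeCity: to the State, so
the two sides cannot drift apart.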

Many-to-many relations get a little more complex; I usually model this
a little more "relationally", and keep a separate object that acts as
the join table and keeps two dictionaries: one mapping X to sets of Y,
the other mapping Y to sets of X, both always updated in sync.

> I guess, in particular, is it smart to create an object for, say, zip
> code 33180, then store that in a root object in GOODS where I keep all
> my zip codes (the analogy of a zip_codes table) and at the same time
> store it in the address that belongs to the person? Will GOODS store a
> single instance of that zip code object and make references to the zip
> codes root object as well as the address or addresses referencing it,
> or would it store multiple copies of the same object (one in the zip
> codes root, and another in, say for example, a people root)?

It will only store a single instance of the zip code object.  It's
fine (and ideal) to have references to it from multiple places.

> Since you seem to have more experience than me in this arena, if you
> were to develop a "simple" CRM web-based tool using Seaside, would you
> consider GOODS or would you simply use a RDBMS, keeping in mind that
> normally CRM tools offer plenty of "ad-hoc" queries to analyze contact
> information?

There are very few situations where I will use an RDBMS by choice. 
Unless there's legacy data to deal with, or the client specifically
requests it, I would much rather use some form of object database and
something like XML-RPC for interop.  So, yes, I would consider GOODS.

The main reason at this point I would opt not to use GOODS for a
particular project would be performance concerns - having to do
everything over the network, combined with a lack of server-side
querying, can be a real problem for some applications.  Local-disk
solutions like OmniBase have a real advantage here.  I do think,
however, that it would be possible to do some aggressive caching in
GOODS to improve the situation - I just haven't had time to work on
that yet.

As for "ad-hoc" queries: you'll have to be more specific.  The only
way that relational databases offer ad-hoc querying is if the user
types in SQL themselves.  Is that what you would do?  Otherwise,
you're providing a structured UI for the user to enter their query,
and so you know ahead of time what kinds of queries might be used and
can structure your object model and your indices accordingly.

Avi