My first Magma experience (Now including GOODS)

Mon Apr 4 15:48:49 UTC 2005

Daniel Salama wrote:
> 
> Thanks Yanni. I will contact you off the list and maybe you can publish
> your results back to the list for everyone's benefit.
> 
> On Apr 2, 2005, at 10:14 PM, Yanni Chiu wrote:
> 
> > If you send it to me (offlist), then I'll try to run it
> > in four configurations:
> >
> > 1. Read into memory and a no-op save (to provide a baseline time)
> > 2. Read and save into PostgreSQL.
> > 3. Read and save using VW and GemStone.
> > 4. Read and save into GOODS.
> >
> > I've had GOODS running at some point, but it's not
> > something I'm set up to do right now (it's been a TODO).

Here are the results for #1-#3 (#4 will have to wait).
Daniel's code, after reading and parsing the data file,
stored the result in a class instance variable. So I ran
this code first, then ran further tests using this data
from within the image.

I created "Header" and "Entry" objects using my framework.
The fields are pretty much the same as Daniel's original,
except for the part that handles the 1-M relationship
between the Header and its many Entry objects.

The timing results are:

2(a) 7.443 seconds to load into Memory in 89 commit(s) [Squeak]
2(b) 203.113 seconds to load into PostgreSQL in 89 commit(s) [Squeak, seqno, insert, update]
2(c) 581.57 seconds to load into PostgreSQL [Squeak, commit one entry at a time, some errors]
3(a) 0.891 seconds to load into Memory in 89 commit(s) [VW]
3(b) 72.344 seconds to load into GemStoneAccess in 89 commit(s) [VW/GemStone]

There were 8784 entry records, committed in batches of 100.
(That should take 88 commits, so the reported count of 89 seems
off by one.) However, in 2(c) the Entry objects were committed
one at a time, and some duplicate key errors occurred. I fixed
the duplicate key problems for 2(b), but never re-ran the test
in autocommit mode.
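
As a quick check on the commit count, the arithmetic can be sketched
in Python. The function name is only illustrative; none of this comes
from the actual test code.

```python
import math

def expected_commits(record_count, batch_size):
    """Commits needed if each batch is committed exactly once."""
    return math.ceil(record_count / batch_size)

entries = 8784      # entry records reported in this post
batch_size = 100    # batch size reported in this post
reported = 89       # commit count shown in the timing results

expected = expected_commits(entries, batch_size)
print("expected commits:", expected)
print("difference from reported:", reported - expected)
```

That is 87 full batches plus a final batch of 84 entries. The
arithmetic alone does not say where the extra commit comes from.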

Note that config. 2(a) is what I called config. 1 in my
original posting. The purpose of running 2(a) and 3(a) is
to provide a baseline time for the cost of the framework
and of creating the objects from the in-memory, pre-parsed data.

2(a) vs. 2(b) is kind of an apples-to-oranges comparison
because different objects and collections are being created
and held in the image. However, in 3(a) vs. 3(b) the only
difference is that the in-image objects and collections
are persisted.
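
For a rough sense of scale, the baseline can be subtracted from the
persisted run and divided by the number of entries. The sketch below
uses only the timings listed above; the helper name is made up, and
the Squeak figure inherits the apples-to-oranges caveat.

```python
ENTRIES = 8784  # entry records in the test data

# Elapsed seconds, copied from the timing results in this post.
timings = {
    "2(a)": 7.443,    # Squeak, memory only
    "2(b)": 203.113,  # Squeak, PostgreSQL
    "3(a)": 0.891,    # VW, memory only
    "3(b)": 72.344,   # VW, GemStone
}

def per_entry_overhead_ms(baseline_key, persisted_key):
    """Extra milliseconds per entry once the baseline time is subtracted."""
    extra_seconds = timings[persisted_key] - timings[baseline_key]
    return extra_seconds / ENTRIES * 1000.0

squeak_pg = per_entry_overhead_ms("2(a)", "2(b)")
vw_gs = per_entry_overhead_ms("3(a)", "3(b)")
print(f"Squeak/PostgreSQL: {squeak_pg:.2f} ms per entry")
print(f"VW/GemStone:       {vw_gs:.2f} ms per entry")
```

This works out to roughly 22 ms per entry for Squeak/PostgreSQL and
roughly 8 ms per entry for VW/GemStone. Only the second number is a
clean measure of persistence cost.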

In 2(b), each Entry resulted in three SQL statements (due
to my framework). I had to use a sequence number because
I could not find a unique key in the data. First, a SELECT
fetched the seqno. Then an INSERT added the record with
only the parsed data values. Finally, an UPDATE set the field
that refers to the Header record.
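
The three-statement pattern can be sketched as follows. This is not
the framework's code: it uses Python's built-in sqlite3 module rather
than PostgreSQL, and the table names, column names, and store_entry
helper are all hypothetical. SQLite has no sequences, so a one-row
counter table stands in for one; in PostgreSQL the first step would
presumably be something like SELECT nextval(...) on a sequence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema; the real tables and columns are not shown in this post.
cur.executescript("""
    CREATE TABLE header (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entry (
        seqno INTEGER PRIMARY KEY,
        amount TEXT,
        header_id INTEGER REFERENCES header(id)
    );
    -- Stand-in for a database sequence (an assumption of this sketch).
    CREATE TABLE entry_seq (next_value INTEGER NOT NULL);
    INSERT INTO entry_seq VALUES (1);
    INSERT INTO header (id, name) VALUES (1, 'example header');
""")

def store_entry(cur, amount, header_id):
    # 1. SELECT: fetch the next sequence number, then advance the counter.
    seqno = cur.execute("SELECT next_value FROM entry_seq").fetchone()[0]
    cur.execute("UPDATE entry_seq SET next_value = next_value + 1")
    # 2. INSERT: store the record with only the parsed data values.
    cur.execute("INSERT INTO entry (seqno, amount) VALUES (?, ?)", (seqno, amount))
    # 3. UPDATE: point the new record at its Header.
    cur.execute("UPDATE entry SET header_id = ? WHERE seqno = ?", (header_id, seqno))
    return seqno

for amount in ["10.00", "20.00", "30.00"]:
    store_entry(cur, amount, header_id=1)
conn.commit()

rows = cur.execute(
    "SELECT seqno, amount, header_id FROM entry ORDER BY seqno"
).fetchall()
print(rows)
```

With 8784 entries, a pattern like this means over 26,000 statements
for the Entry records alone, which likely accounts for a good part of
the 2(b) time.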

Hardware/Software:

Squeak 3.5.1/Tea 1.9 VM
Squeak 3.6 #5424
WinXP Pro
Toshiba Satellite Pro, Intel Pentium III 1GHz, 240 MB RAM