My first Magma experience (Now including GOODS)

Mon Apr 4 15:49:52 UTC 2005

Yanni,

Impressive results. I agree there could be "bugs" in my code. This was 
a first quick-and-dirty attempt for me to gauge where I'm standing with 
this whole idea of using Squeak et al. And on that note, Yanni, would 
you mind emailing me back the modified code for me to study?

Now, there is a huge difference in performance between PostgreSQL and 
GemStone, even though I'm impressed with both of them. I wonder why 
there is such a big performance difference with Magma and GOODS. At this 
point, I'd like to blame it on my code until I get a copy from Yanni and 
run it again in my environment.

For practical purposes, I don't think I would be doing 2(c) in real-life 
apps, but it's good to know how it performed in your tests.

Thanks for the great effort you've put into this as well.

On Apr 4, 2005, at 11:48 AM, Yanni Chiu wrote:

> Daniel Salama wrote:
>>
>> Thanks Yanni. I will contact you off the list and maybe you can
>> publish your results back to the list for everyone's benefit.
>>
>> On Apr 2, 2005, at 10:14 PM, Yanni Chiu wrote:
>>
>>> If you send it to me (offlist), then I'll try to run it
>>> in four configurations:
>>>
>>> 1. Read into memory and a no-op save (to provide a baseline time)
>>> 2. Read and save into PostgreSQL.
>>> 3. Read and save using VW and GemStone.
>>> 4. Read and save into GOODS.
>>>
>>> I've had GOODS running at some point, but it's not
>>> something I'm set up to do right now (it's been a TODO).
>
> Here are the results for #1-#3 (#4 will have to wait).
> Daniel's code, after reading and parsing the data file,
> stored the result in a class instance variable. So I ran
> this code first, then ran further tests using this data
> from within the image.
>
> I created "Header" and "Entry" objects using my framework.
> The fields are pretty much the same as Daniel's original,
> except for the part that handles the 1-M relationship
> between the Header and its many Entry objects.
>
> The timing results are:
>
> 2(a) 7.443 seconds to load into Memory in 89 commit(s) [Squeak]
> 2(b) 203.113 seconds to load into PostgreSQL in 89 commit(s)
>      [Squeak, seqno, insert, update]
> 2(c) 581.57 seconds to load into PostgreSQL
>      [Squeak, commit one entry at a time, some errors]
> 3(a) 0.891 seconds to load into Memory in 89 commit(s) [VW]
> 3(b) 72.344 seconds to load into GemStoneAccess in 89 commit(s)
>      [VW/GemStone]
>
> There were 8784 entry records that were committed in batches
> of 100. (The commit count seems off by one.) However, in 2(c)
> the Entry objects were committed one at a time, and some
> duplicate-key errors occurred. I fixed the duplicate-key problems
> for 2(b), but never re-ran the test in autocommit mode.
>
> Note that config. 2(a) is what I'd called config. 1 in my
> original posting. The purpose of running 2(a) and 3(a) is
> to provide a baseline time for the cost of the framework
> and creating the objects from the in-memory, pre-parsed data.
>
> 2(a) vs. 2(b) is kind of an apples-to-oranges comparison
> because different objects and collections are being created
> and held in the image. However, in 3(a) vs. 3(b) the only
> difference is that the in-image objects and collections
> are persisted.
>
> In 2(b), each Entry resulted in three SQL statements (due
> to my framework). I had to use a sequence number because
> I could not find a unique key in the data. A SELECT was run
> to get the seqno. Then the INSERT inserted the record with
> only parsed data values. Finally, the UPDATE set the field
> to refer to the Header record.
>
> Hardware/Software:
>
> Squeak 3.5.1/Tea 1.9 VM
> Squeak 3.6 #5424
> WinXP Pro
> Toshiba Satellite Pro, Intel Pentium III 1GHz, 240 MB RAM
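
For anyone who wants to sanity-check the figures quoted above, here is a
rough Python sketch of the arithmetic. It only uses the record count, batch
size and elapsed times from Yanni's message. The function and variable names
are made up for illustration and are not part of anyone's framework. The
2(c) run had some errors, so its rate should be read as approximate.

```python
import math

ENTRY_COUNT = 8784   # entry records, as reported in the message
BATCH_SIZE = 100     # entries per commit, as reported

# Elapsed seconds copied from the quoted timing results.
TIMINGS = {
    "2(a) Squeak, memory": 7.443,
    "2(b) Squeak, PostgreSQL": 203.113,
    "2(c) Squeak, PostgreSQL, one entry per commit": 581.57,
    "3(a) VW, memory": 0.891,
    "3(b) VW, GemStone": 72.344,
}


def expected_commits(records: int, batch_size: int) -> int:
    """Number of commits needed if every commit holds up to batch_size records."""
    return math.ceil(records / batch_size)


def entries_per_second(records: int, seconds: float) -> float:
    """Average throughput over the whole run."""
    return records / seconds


# Compare this with the 89 commits reported in the results.
print("expected commits:", expected_commits(ENTRY_COUNT, BATCH_SIZE))

for label, seconds in TIMINGS.items():
    rate = entries_per_second(ENTRY_COUNT, seconds)
    print(f"{label}: {rate:.1f} entries/sec")
```

Dividing 8784 entries into batches of 100 needs 88 commits, which matches
the "off by one" remark about the reported 89. The per-second rates just
restate the timings in a form that is easier to compare across runs.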