[squeak-dev] Looking for real-world Magma application

Tue Nov 2 16:06:16 UTC 2010

On 2 November 2010 15:40, Norberto Manzanos <nmanzanos at gmail.com> wrote:
> The question is "Could Magma be used in a real-world application?"
> I think not.

I think otherwise. Magma should strive to remove any obstacles.

> I used Magma a couple of years ago. I had to migrate 13000 records from a
> database, build objects from the data and find duplicates. The whole
> migration to Magma took more than two weeks. Yes, weeks! 15
> days! And I had to do a lot of strange things to reduce the time. I ran a
> periodic clean-up process in the middle of the migration to
> improve performance. The first attempts never seemed to finish; I got it down
> to two weeks.
> The application was in production for two years, with a lot of performance
> problems. Finally we had to throw it away.
> Now I'm trying again. Now I use very simple objects (just a couple of string
> variables) with only one index, on one of these strings.
> But the problem is once again the same: each time I add an object to a
> MagmaCollection I have to look for duplicates and, if found, merge them.
> I'm adding objects to 4 MagmaCollections, with an index of size 64. (I need
> more, but once again, performance...)
> Look at these numbers.
> I tried with 2000 records (the total is 76000).
> If I just add the objects without searching, it takes 2 minutes.
> If I search for the objects, it takes 16 minutes!
> If I search for the objects and then iterate over the results to make a finer
> comparison, it takes 20 minutes!
>
> In a linear progression, the total time for 76000 records would be 12 hours,
> which is not very good. But time doesn't grow linearly but exponentially.
> I tried with 12000 records: 5 hours 30 minutes. How long should I suppose it
> will take with 76000 records? 2 days? 3?
> These are not acceptable times for a process I will surely perform again as
> my model grows.
> The numbers tell, once again, that the problem is not adding the objects,
> merging or materializing them, but searching a MagmaCollection. Time goes
> by and this problem is not solved.
>
> I'm desperate.
> I have no way to persist my objects if I want to avoid everybody laughing at
> me.
>

Norberto, I suspect that you are doing something wrong. Scanning 76000
records by iterating through the entire collection
when looking for duplicates is indeed slow. And sure enough, the time to
add a new item will keep growing as the collection grows (linearly per
insert with a full scan, so roughly quadratically for the whole load).
And it doesn't really matter what kind of DB you use. It will be
slow everywhere if you need to scan the entire dataset before adding a new
item to it.
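
To illustrate the point, here is a rough sketch in plain Python (not
Magma code; the function names and the sample data are made up for the
example). It counts how much work a duplicate check costs when every
insert scans everything stored so far, compared with asking a keyed
lookup structure (a hash set here, standing in for an index).

```python
def dedup_by_scan(records):
    """Compare each new record against every stored record (full scan per insert)."""
    stored, comparisons = [], 0
    for rec in records:
        duplicate = False
        for existing in stored:
            comparisons += 1
            if existing == rec:
                duplicate = True
                break
        if not duplicate:
            stored.append(rec)
    return stored, comparisons


def dedup_by_key(records):
    """Ask a keyed structure (a set, standing in for an index) once per insert."""
    stored, seen, lookups = [], set(), 0
    for rec in records:
        lookups += 1
        if rec not in seen:
            seen.add(rec)
            stored.append(rec)
    return stored, lookups


if __name__ == "__main__":
    # Hypothetical data: 2000 records, 1500 of them distinct.
    records = [f"name-{i % 1500}" for i in range(2000)]
    _, comparisons = dedup_by_scan(records)
    _, lookups = dedup_by_key(records)
    print("full scan comparisons:", comparisons)
    print("keyed lookups:        ", lookups)
```

Both functions keep the same records; only the number of comparisons
differs, and for the scanning version that number grows roughly with the
square of the number of records.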

So I suspect that there is something wrong with the way you are using
MagmaCollection,
and because of that, performance degrades from O(log(n)) down to O(n)
for each operation.
Sure, it's hard to say anything without looking at the actual code.
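
For a sense of what O(log(n)) versus O(n) means at this size, here is
another small Python sketch (again not Magma code; a sorted list with a
hand-written binary search is only an assumed stand-in for an ordered
index, and the key format is invented). It counts the probes needed to
find one key among 76000.

```python
def linear_probes(keys, target):
    """Scan from the start; return (probes, found)."""
    probes = 0
    for key in keys:
        probes += 1
        if key == target:
            return probes, True
    return probes, False


def binary_probes(sorted_keys, target):
    """Binary search over a sorted list; return (probes, found)."""
    lo, hi = 0, len(sorted_keys)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == target
    return probes, found


if __name__ == "__main__":
    # Zero-padded keys so that string order matches numeric order.
    keys = [f"key-{i:06d}" for i in range(76000)]
    target = keys[-1]  # worst case for the linear scan
    print("linear probes:", linear_probes(keys, target)[0])
    print("binary probes:", binary_probes(keys, target)[0])
```

With 76000 keys the linear scan may need up to 76000 probes for a single
lookup, while the binary search never needs more than 17. Repeated for
every added record, that difference is what separates minutes from days.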

> Norberto
>

-- 
Best regards,
Igor Stasenko AKA sig.