[Seaside] GOODS best practice data storage

Avi Bryant avi at beta4.com
Thu May 20 21:42:40 CEST 2004


On May 20, 2004, at 12:28 PM, Sebastián Sastre wrote:

> I think the way the objects are stored is the key.

Yes, you're exactly right.  Storing 13000 objects in an 
OrderedCollection is a really bad idea: every time the collection 
grows, its entire underlying array gets recreated, and GOODS then has 
to write the whole collection back to the database on the next 
commit.  That makes commits extremely slow even when you're only 
adding a small number of items to an already large collection.  If 
you use a doubly linked list or a BTree (I have a BTree implementation 
on SqueakMap) you'll get much better results, because adding an item 
only touches a few small objects.
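
For example, instead of nesting a Dictionary under the root as in your 
snippet below, you could set things up roughly like this (untested, 
just a sketch; it assumes the Dictionary-like #at:put: protocol of the 
BTree package):

	| db |
	db := KKDatabase onHost: 'voyager' port: 6101.
	db root: Dictionary new; commit.
	"A BTree only rewrites the handful of nodes along the
	 insertion path when it grows, not the whole collection."
	db root at: #items put: BTree new.
	db commit.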

I'd also suggest that when you're doing the migration, you try to keep 
the number of commits small - one commit per item is going to be slow, 
because there's a fair amount of overhead on each commit.  One commit 
per 100 items is probably more reasonable.  You should also send 
#flushAll to the database after every commit: once those objects have 
been committed, there's no need for them to stick around in the 
client-side cache.
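
Putting both together, and assuming the root is set up with a BTree as 
above, the migration loop could look something like this (again 
untested - adapt to taste):

	| db items count |
	items := RDBMSDatabase allItems.
	db := KKDatabase onHost: 'voyager' port: 6101.
	count := 0.
	items do: [:item |
		(db root at: #items) at: item identifierCodeString put: item.
		count := count + 1.
		"Commit every 100 items, then flush the committed
		 objects out of the client-side cache."
		count \\ 100 = 0 ifTrue: [db commit; flushAll]].
	"Commit whatever is left in the final partial batch."
	db commit; flushAll.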

A while ago I helped Ken Causey with some benchmarking around importing 
BFAV posts into GOODS.  He has some graphs and numbers up at 
http://kencausey.com/goodsperf.png .  You'll notice that using BTrees 
(or TreeSets, which are also included in my BTree package) gives a 
much flatter curve than OrderedCollection or Dictionary.  The other 
variable he experimented with was disconnecting and reconnecting to 
the DB every n commits, rather than sending #flushAll.  IIRC with the 
latest GOODS releases this doesn't make any difference, but it might 
still be worth looking at.
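
If you do want to try that variant, in the loop above you'd replace 
the #flushAll with something like the following (this assumes 
KKDatabase understands #logout, which the Squeak client should):

	count \\ 1000 = 0 ifTrue: [
		"Drop the connection and open a fresh one so the
		 client-side cache starts out empty."
		db logout.
		db := KKDatabase onHost: 'voyager' port: 6101]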

Hope this helps,

Avi

> 	items := RDBMSDatabase allItems.
> 	db := KKDatabase onHost:'voyager' port:6101.
> 	db root: Dictionary new;commit.
> 	db root at:#items put:Dictionary new.
> 	db commit.
> 	1 to: items size do:[:i|
> 		item := items at:i.
> 		(db root at:#items) at: item identifierCodeString put: item.
> 		db commit].


