This will probably sound like a cop-out, but are you sure you need to be loading hundreds of thousands of rows? If you are using an RDBMS anyway, I would move as much processing as possible to the DB.
I don't know that you are doing this, but in my professional experience I see a lot of people pulling large numbers of rows like this and then doing all kinds of post-processing on them in the client. If one is going to do that, then the overhead of having an RDBMS isn't worth it. There are plenty of other ways to persist data.
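To make the "move the processing to the DB" point concrete, here is a minimal sketch. It is in Python with the standard sqlite3 module purely as a stand-in for SQL Server over ODBC, and the `orders` table, its columns, and the function names are all hypothetical. The first function allocates one row object per record on the client, which is the pattern that keeps a garbage collector busy. The second lets the database aggregate and returns only one row per customer.

```python
import sqlite3


def build_demo_db(n_rows=100_000):
    """Create an in-memory database with a hypothetical 'orders' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        ((i % 100, 1.0) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def totals_client_side(conn):
    """Pull every row to the client and aggregate there.

    One tuple is allocated per row, so memory churn grows with table size.
    """
    totals = {}
    for customer_id, amount in conn.execute(
        "SELECT customer_id, amount FROM orders"
    ):
        totals[customer_id] = totals.get(customer_id, 0.0) + amount
    return totals


def totals_in_database(conn):
    """Let the database aggregate; only one row per customer reaches the client."""
    return dict(
        conn.execute(
            "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
        )
    )
```

Both functions return the same mapping, but the second transfers and allocates only as many rows as there are distinct customers. Whether this applies to your case depends on what you do with the rows after loading them.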
On Nov 21, 2007 9:23 PM, alain rastoul <alr.dev@free.fr> wrote:
Hi
I'm trying to load data from a SQL Server database (hundreds of thousands of rows) into Heaps with ODBC/FFI, and I noticed that most of the time is spent in incremental garbage collection (about 80 to 90% of the running time of the load process!). I will look at the ODBCResultSet implementation to limit IdentityDictionary/Row allocation by working with preallocated arrays, but this will solve only one of my problems.
I was wondering whether there is a way to limit incremental collections by running them only after a certain amount of memory has been allocated. I found setGCBiasToGrowGCLimit: in SystemDictionary (Smalltalk setGCBiasToGrowGCLimit: 16*1024*1024), but it doesn't work and pops up an "a primitive has failed" error. Is it the right method?
My other question about garbage collection concerns the overhead that the loaded data (hundreds of MB of objects) imposes on the VM: is there a way to know whether incremental collection is bogged down by that data, or to know when it is moved to old space?
Any pointers, ideas or links are welcome.
Thanks
Regards, Alain