Q: incremental garbage collection overhead

Jason Johnson jason.johnson.081 at gmail.com
Thu Nov 22 18:00:40 UTC 2007


This will probably sound like a cop-out, but are you sure you need to
be loading hundreds of thousands of rows?  If you are using an RDBMS
anyway, I would move as much processing as possible to the DB.

I don't know whether you are doing this, but in my professional
experience I see a lot of people pulling lots of rows like this and
then doing all kinds of post-processing on them.  If one is going to
do that, the overhead of having an RDBMS isn't worth it; there are
lots of other ways to persist data.

On Nov 21, 2007 9:23 PM, alain rastoul <alr.dev at free.fr> wrote:
> Hi
>
> I'm trying to load data from a SQL Server database (hundreds of thousands
> of rows) into Heaps with ODBC/FFI, and I noticed that most of the time is
> spent in incremental garbage collection (about 80 to 90% of the running
> time of the load process!).
> I will look at the ODBCResultSet implementation to limit
> IdentityDictionary/Row allocation by working with preallocated arrays, but
> this will only solve one of my problems.
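
Re the preallocation idea: one way to cut the per-row garbage is to fill
a single reusable buffer instead of allocating a fresh
Row/IdentityDictionary for every fetch.  A minimal sketch; #fetchRowInto:
and #numColumns are made-up selectors, standing in for whatever in-place
fetch the ODBCResultSet implementation ends up exposing:

    | buffer |
    buffer := Array new: resultSet numColumns.  "allocated once, reused for every row"
    [resultSet atEnd] whileFalse: [
        resultSet fetchRowInto: buffer.         "hypothetical: fill the array in place"
        self processRow: buffer].

The loop then allocates nothing per iteration beyond whatever
#processRow: itself creates, which keeps the young space quiet.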
>
> I was wondering if there is a way to limit incremental collections by
> running them only after a certain amount of memory has been allocated. I
> found setGCBiasToGrowGCLimit in SystemDictionary (Smalltalk
> setGCBiasToGrowGCLimit: 16*1024*1024), but it doesn't work and pops up a
> "a primitive has failed" error. Is it the right method?
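
For what it's worth: as I understand it, the limit only has an effect
once the bias-to-grow feature itself is enabled via setGCBiasToGrow:,
and both primitives simply fail on VMs that predate the feature.  A
sketch, assuming a recent enough VM, with the primitive failure caught
as an ordinary Error:

    [Smalltalk setGCBiasToGrow: 1.
     Smalltalk setGCBiasToGrowGCLimit: 16 * 1024 * 1024]
        on: Error
        do: [:e | Transcript show: 'bias-to-grow not supported by this VM'; cr].

So the "a primitive has failed" popup most likely means the VM is too
old for the primitive, not that you picked the wrong method.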
>
> Another question about garbage collection is the overhead of the loaded
> data for the VM (hundreds of MB): is there a way to know whether the
> incremental collector is bogged down by that data, or to know when it is
> moved to old space?
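
The VM statistics answer part of this: the incremental GC counts and
times, and the number of objects tenured to old space, are readable
through vmParameterAt:, and the allocation and tenuring thresholds are
writable.  A sketch, assuming the classic interpreter VM's parameter
numbering; the threshold values are illustrative, not recommendations:

    "Measure: time spent in incremental GC, and how much tenuring happens."
    Transcript
        show: 'incremental GCs: ', (Smalltalk vmParameterAt: 9) printString; cr;
        show: 'ms in incremental GC: ', (Smalltalk vmParameterAt: 10) printString; cr;
        show: 'objects tenured: ', (Smalltalk vmParameterAt: 11) printString; cr.

    "Tune: run the incremental GC less often, and tenure survivors sooner
     so long-lived rows stop being re-scanned on every pass."
    Smalltalk vmParameterAt: 5 put: 40000.  "allocations between incremental GCs"
    Smalltalk vmParameterAt: 6 put: 4000.   "survivor count that triggers tenuring"

Watching parameter 11 grow during the load tells you when your rows are
being moved to old space; once tenured, they should stop weighing on the
incremental collector.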
>
> Any pointers, ideas or links are welcome
>
> Thanks
>
> Regards,
> Alain


