Q: incremental garbage collection overhead

Jason Johnson jason.johnson.081 at gmail.com
Sun Nov 25 17:29:58 UTC 2007


OK, it sounds like you know what you're doing then.  In that case, yes,
it would be good for prototyping and so on.  I didn't mean my response
to be unhelpful, but I also didn't want to be person (a) from this
question:

http://weblogs.asp.net/alex_papadimoulis/archive/2005/05/25/408925.aspx

On Nov 25, 2007 5:37 PM, alain rastoul <alr.dev at free.fr> wrote:
> Hi Jason
>
> I've been working with SQL Server every day for years; part of my job
> includes helping developers and consultants rewrite badly performing
> queries (query plans, etc.). For some customers we set up cubes with
> Analysis Services rather than querying the RDBMS directly; doing that
> directly would put too much load on it and be unusable.
>
> And about your question: yes, of course I'm sure I need to load a lot
> of rows. In fact I hope to load not just hundreds of thousands but
> millions of rows... (one hundred million would be fine :) ). I don't
> know whether that will be possible with Squeak without tackling some
> issues, but today I find it good for quick prototyping and exploration
> (in this case about hashing, cardinalities and computations...); it's
> not at all about how to persist data.
>
> In any case, thank you for taking the time to answer.
>
> Regards
> Alain
>
> "Jason Johnson" <jason.johnson.081 at gmail.com> a écrit dans le message de
> news: aa22f0200711221000x25c03bf8neeb1fa560efceccf at mail.gmail.com...
>
> > This will probably sound like a cop-out, but are you sure you need to
> > be loading hundreds of thousands of rows?  If you are using an RDBMS
> > anyway, I would move as much processing as possible to the DB.
> >
> > I don't know whether you are doing this, but in my professional
> > experience I see a lot of people pulling lots of rows like this and
> > then doing all kinds of post-processing on them.  If one is going to
> > do that, then the overhead of having an RDBMS isn't worth it.  There
> > are lots of other ways to persist data.
> >
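> > For example, rather than fetching every sale and summing it in the
> > image, you can often let the server do the aggregation.  A sketch
> > (the #query: selector and the 'connection' variable are illustrative,
> > not the exact Squeak ODBC package API):
> >
> >     "Illustrative only: push the aggregation into SQL so that just
> >      the summary rows cross the wire.  'connection' stands for an
> >      already-open ODBC connection."
> >     | summary |
> >     summary := connection query:
> >         'SELECT customerId, COUNT(*) AS n, SUM(amount) AS total
> >          FROM Sales
> >          GROUP BY customerId'.
> >
> > One row per customer comes back instead of every sale, which also
> > means far fewer allocations (and far less incremental GC work) on
> > the Squeak side.
> >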
> > On Nov 21, 2007 9:23 PM, alain rastoul <alr.dev at free.fr> wrote:
> >> Hi
> >>
> >> I'm trying to load data from a SQL Server database (hundreds of
> >> thousands of rows) into Heaps with ODBC/FFI, and I noticed that most
> >> of the time is spent in incremental garbage collection (about 80 to
> >> 90% of the running time of the load process!).
> >> I will look at the ODBCResultSet implementation to limit
> >> IdentityDictionary/Row allocation by working with preallocated
> >> arrays (roughly as sketched below), but this will solve only one of
> >> my problems.
> >>
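> >> What I have in mind is roughly the following (a sketch only:
> >> #nextRowInto: and #processRow: are made-up selectors here, not part
> >> of today's ODBCResultSet):
> >>
> >>     "Reuse one preallocated buffer for every fetched row instead of
> >>      allocating a fresh IdentityDictionary/Row per row.
> >>      columnCount = the number of columns in the result set."
> >>     | buffer |
> >>     buffer := Array new: columnCount.
> >>     [resultSet nextRowInto: buffer] whileTrue:
> >>         [self processRow: buffer]
> >>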
> >> I was wondering if there is a way to limit incremental collections
> >> by running them only after a certain amount of memory has been
> >> allocated. I found setGCBiasToGrowGCLimit in SystemDictionary
> >> (Smalltalk setGCBiasToGrowGCLimit: 16*1024*1024), but it doesn't
> >> work and pops up an "a primitive has failed" error. Is it the right
> >> method?
> >>
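> >> For reference, here is exactly what I evaluated, plus a guess at
> >> what might be missing (maybe the bias has to be switched on first,
> >> or my VM simply does not implement the primitives):
> >>
> >>     "What I tried; this fails with 'a primitive has failed':"
> >>     Smalltalk setGCBiasToGrowGCLimit: 16 * 1024 * 1024.
> >>
> >>     "Guess: enable bias-to-grow before setting the limit."
> >>     Smalltalk setGCBiasToGrow: 1.
> >>     Smalltalk setGCBiasToGrowGCLimit: 16 * 1024 * 1024.
> >>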
> >> Another question about garbage collection is the overhead for the
> >> VM of the loaded data held in objects (hundreds of MB): is there a
> >> way to know whether incremental collection is being bloated by that
> >> data, or to know when the objects are moved to old space?
> >>
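> >> The closest I have found so far is sampling the VM's GC counters
> >> around the load, along these lines (the indices follow the comment
> >> in SystemDictionary>>vmParameterAt:; I am assuming they hold for my
> >> VM):
> >>
> >>     "Sample GC statistics before and after the load.
> >>      9 = incremental GCs since startup, 10 = ms spent in them,
> >>      11 = tenures of surviving objects since startup."
> >>     | igcs ms tenures |
> >>     igcs := Smalltalk vmParameterAt: 9.
> >>     ms := Smalltalk vmParameterAt: 10.
> >>     tenures := Smalltalk vmParameterAt: 11.
> >>     self loadRows.   "loadRows stands for the actual load code"
> >>     Transcript
> >>         show: 'incremental GCs: ',
> >>             ((Smalltalk vmParameterAt: 9) - igcs) printString; cr;
> >>         show: 'ms in IGC: ',
> >>             ((Smalltalk vmParameterAt: 10) - ms) printString; cr;
> >>         show: 'tenures: ',
> >>             ((Smalltalk vmParameterAt: 11) - tenures) printString; cr
> >>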
> >> Any pointers, ideas or links are welcome
> >>
> >> Thanks
> >>
> >> Regards,
> >> Alain
> >>


