---------- Forwarded message ----------
From: Igor Stasenko <siguctua@gmail.com>
Date: 25 Sep 2007 19:29
Subject: Re: [ANN] Magma 1.0r40
To: florian.minjat@emn.fr
I placed a line in removeGarbageCollectedObjectEntries:

  Transcript show: 'Oids count: ', self oidCount asString; cr.

While running Florian's update it printed (among other lines):

  Oids count: 29923
  Oids count: 45449
  Oids count: 49919
  Oids count: 66399
  Oids count: 76492

So it keeps growing and growing. The only way to get rid of the dead keys is to call MaOidManager>>finalizeOids, but I can't find any code that leads to a call of this method. Is it designed to be managed by the user directly, via MagmaSession>>finalizeOids? I think removeGarbageCollectedObjectEntries should send finalizeValues to the dicts, so that both dicts, holding the object>id and id>object associations, get cleaned and rehashed.
Or, since Magma uses its own version(s) of weak-identity-key dicts, the implementation of #at:put: in the weak-identity-key dict could be changed to silently reuse entries whose key is nil. Then there would be much less need for rehashing.
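A minimal sketch of that idea, assuming a simple probe-based weak dictionary (the `scanFor:`, `array`, and `tally` names are illustrative, not actual Magma code):

```smalltalk
at: key put: value
	"Sketch: when the probe lands on an association whose weak key
	has already been reclaimed (key = nil), reuse that slot in place
	instead of adding a new entry, postponing the next rehash."
	| index assoc |
	index := self scanFor: key.	"existing key, a dead slot, or an empty slot"
	assoc := array at: index.
	assoc isNil
		ifTrue: [
			array at: index put: (WeakKeyAssociation key: key value: value).
			tally := tally + 1 ]
		ifFalse: [
			assoc key isNil ifTrue: [ assoc key: key ].	"reuse a dead entry"
			assoc value: value ].
	^ value
```

The crucial part is that `scanFor:` must be willing to return a nil-keyed slot when the key is not already present.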
And it seems I was wrong in saying that the weak finalization process can interfere with the oid manager's weak dicts. Since the oid manager never registers its own dicts with the finalization registry, they should not be affected by it. But then I wonder why putting big pauses in #reject: (or not using #reject: at all, as in the new case) helped to avoid the errors. The only way an oid can be lost is, for instance: I add a new oid to the dict and then use it somewhere, while at the same time another (background) process is rehashing and/or otherwise manipulating the dict; after the rehash the new entry is lost, because it was put into a slot which had already been scanned by the copying loop that moves associations from the old array to the new one.
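Schematically, the lost-entry scenario looks like this (purely illustrative code, not Magma's actual implementation; `rehashInto:association:` is a made-up helper):

```smalltalk
"Process A (user code) adds an entry; it lands in slot i of the old array:"
oidDict at: newObject put: newOid.

"Process B (background) is rehashing at the same time:"
newArray := Array new: oldArray size * 2.
oldArray withIndexDo: [ :assoc :i |
	"if A wrote slot i after B's copying loop had already passed it,
	A's association never makes it into newArray"
	assoc ifNotNil: [ self rehashInto: newArray association: assoc ] ].
array := newArray.	"A's new entry is silently dropped"
```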
Interesting: how well does Magma handle the following code?

  session commit: [
    [ big long huge loop ] fork.
    [ big long huge loop 2 ] fork.
    ... ]

Does it allow spawning different processes inside a commit? Florian, I think you should check whether your code contains such things, or whether you use code which leads to them.
I am now trying to run the update with all methods in MagmaOidManager enclosed in a mutex.
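Enclosing every public method in a mutex is straightforward in Squeak; a sketch, with `accessProtect` as a hypothetical instance variable initialized to `Semaphore forMutualExclusion` (the other method and variable names are illustrative too):

```smalltalk
oidForObject: anObject
	"All reads and writes of the oid dictionaries go through one mutex."
	^ accessProtect critical: [ self basicOidForObject: anObject ]

basicOidForObject: anObject
	"The original, unprotected lookup."
	^ objectsToOids at: anObject ifAbsent: [ self assignNewOidFor: anObject ]
```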
I placed a line in removeGarbageCollectedObjectEntries:

  Transcript show: 'Oids count: ', self oidCount asString; cr.

While running Florian's update it printed (among other lines):

  Oids count: 29923
  Oids count: 45449
  Oids count: 49919
  Oids count: 66399
  Oids count: 76492

So it keeps growing and growing. The only way to get rid of the dead keys is to call MaOidManager>>finalizeOids, but I can't find any code that leads to a call of this method. Is it designed to be managed by the user directly, via MagmaSession>>finalizeOids?
Yes, as documented in the performance tuning page
http://wiki.squeak.org/squeak/2985
at this time finalizeOids is intended to be called by the user,
I think removeGarbageCollectedObjectEntries should send finalizeValues to the dicts, so that both dicts, holding the object>id and id>object associations, get cleaned and rehashed.
but that sounds like a good idea. I know these two cleaning mechanisms were tacked on years apart from each other, so maybe that's why they ended up separate mechanisms.
Of course, all this is just to accommodate the less-than-ideal weak Dictionaries included with Squeak. I'd much rather fix or replace the Dictionaries. Florian, have you tried any of the alternative weak Dictionaries available? Be sure to check out Martin Loewis' and Sig's. (I've posted the link to Martin's several times in the last few months; it's on Mantis.)
Or, since Magma uses its own version(s) of weak-identity-key dicts, the implementation of #at:put: in the weak-identity-key dict could be changed to silently reuse entries whose key is nil. Then there would be much less need for rehashing.
MaDictionary is a dead-simple hack around the 12-bit identity hash limitation. Whenever a standard WeakIdentityDictionary gets over 4K elements its performance suffers tremendously, so MaDictionary makes the degradation more linear by managing a collection of sub-4K-sized Dictionaries.

I really don't want to make the Ma Dictionaries more complicated. I would much rather replace them with something built from the ground up to scale beyond 4K elements.
But then I wonder why putting big pauses in #reject: (or not using #reject: at all, as in the new case) helped to avoid the errors.
I have never seen the errors myself. Do they happen with the stock Magma MaDictionaries? If so, how can I reproduce them?
Interesting: how well does Magma handle the following code?

  session commit: [
    [ big long huge loop ] fork.
    [ big long huge loop 2 ] fork.
Do NOT do that. However, you should be able to do this:
[ big long huge loop with multiple commits ] fork. [ big long huge loop 2 with multiple commits ] fork.
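In other words, the forks go around the commits, not the other way round. Schematically (assuming here that each forked process works against its own MagmaSession, and with `updateBatch:` standing in for real application code):

```smalltalk
"Each process runs its own sequence of commits; nothing forks
inside a commit: block."
[ 1 to: batchCount do: [ :i |
	sessionA commit: [ self updateBatch: i ] ] ] fork.
[ 1 to: batchCount do: [ :i |
	sessionB commit: [ self otherUpdateBatch: i ] ] ] fork.
```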
Does it allow spawning different processes inside a commit?
No.
I am now trying to run the update with all methods in MagmaOidManager enclosed in a mutex.
Thanks Sig, I hope we can find a lean, mean, fast Dictionary machine for the next Magma.
- Chris
On 26/09/2007, Chris Muller asqueaker@gmail.com wrote:
I placed a line in removeGarbageCollectedObjectEntries:

  Transcript show: 'Oids count: ', self oidCount asString; cr.

While running Florian's update it printed (among other lines):

  Oids count: 29923
  Oids count: 45449
  Oids count: 49919
  Oids count: 66399
  Oids count: 76492

So it keeps growing and growing. The only way to get rid of the dead keys is to call MaOidManager>>finalizeOids, but I can't find any code that leads to a call of this method. Is it designed to be managed by the user directly, via MagmaSession>>finalizeOids?
Yes, as documented in the performance tuning page
http://wiki.squeak.org/squeak/2985
at this time finalizeOids is intended to be called by the user,
I think removeGarbageCollectedObjectEntries should send finalizeValues to the dicts, so that both dicts, holding the object>id and id>object associations, get cleaned and rehashed.
but that sounds like a good idea. I know these two cleaning mechanisms were tacked on years apart from each other, so maybe that's why they ended up separate mechanisms.
Of course, all this is just to accommodate the less-than-ideal weak Dictionaries included with Squeak. I'd much rather fix or replace the Dictionaries. Florian, have you tried any of the alternative weak Dictionaries available? Be sure to check out Martin Loewis' and Sig's. (I've posted the link to Martin's several times in the last few months; it's on Mantis.)
Or, since Magma uses its own version(s) of weak-identity-key dicts, the implementation of #at:put: in the weak-identity-key dict could be changed to silently reuse entries whose key is nil. Then there would be much less need for rehashing.
MaDictionary is a dead-simple hack around the 12-bit identity hash limitation. Whenever a standard WeakIdentityDictionary gets over 4K elements its performance suffers tremendously, so MaDictionary makes the degradation more linear by managing a collection of sub-4K-sized Dictionaries.

I really don't want to make the Ma Dictionaries more complicated. I would much rather replace them with something built from the ground up to scale beyond 4K elements.
But then I wonder why putting big pauses in #reject: (or not using #reject: at all, as in the new case) helped to avoid the errors.
I have never seen the errors myself. Do they happen with the stock Magma MaDictionaries? If so, how can I reproduce them?
Interesting: how well does Magma handle the following code?

  session commit: [
    [ big long huge loop ] fork.
    [ big long huge loop 2 ] fork.
Do NOT do that. However, you should be able to do this:
[ big long huge loop with multiple commits ] fork. [ big long huge loop 2 with multiple commits ] fork.
Does it allow spawning different processes inside a commit?
No.
So, my suggestion was correct. Since we can end up using different classes/packages which interact with Magma, and they in turn can use third-party packages, you never really know whether such "malicious" code is executing or not.
I am now trying to run the update with all methods in MagmaOidManager enclosed in a mutex.
Yesterday I sent Florian a changed MagmaOidManager with a mutex, and he reported that he was able to run his update (twice in a row) without any errors.

Also, I tried to run the update on the image Florian gave me a month ago, with the mutexes, and stumbled upon an error saying that something was changing object(s) while they were being serialized. This is another sign that there is an unknown entity doing something wrong with the data, most probably running in parallel. I can't say what it is, or why, or where.

Personally, I don't like the idea of having a mutex in MagmaOidManager, because it slows down code which is used very frequently. I am just worried that this may be the only way to keep things safe. Of course it is a question what Florian does in his update process that leads to such errors, and what he can do to avoid them without using mutexes.
Thanks Sig, I hope we can find a lean, mean, fast Dictionary machine for the next Magma.
- Chris
http://lists.squeakfoundation.org/pipermail/magma/2006-March/000207.html
On 9/26/07, Igor Stasenko siguctua@gmail.com wrote:
On 26/09/2007, Chris Muller asqueaker@gmail.com wrote:
I placed a line in removeGarbageCollectedObjectEntries:

  Transcript show: 'Oids count: ', self oidCount asString; cr.

While running Florian's update it printed (among other lines):

  Oids count: 29923
  Oids count: 45449
  Oids count: 49919
  Oids count: 66399
...
I found code which uses MagmaOidManager from two different processes. Instead of the mutex, I placed a check in it which throws an error if any process other than the first one to access the manager tries to deal with oids.
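The check can be as simple as remembering the first process that touches the manager (a sketch of the idea; the attached code may differ, and `owningProcess` is a hypothetical instance variable):

```smalltalk
checkAccess
	"Signal an error as soon as a second process touches the oid manager."
	owningProcess isNil
		ifTrue: [ owningProcess := Processor activeProcess ]
		ifFalse: [
			Processor activeProcess == owningProcess
				ifFalse: [ self error: 'oid manager accessed from a second process' ] ]
```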
After running the update, I found that the only process accessing the oid manager, apart from the one running the update, is run by Magma itself:
See fork in MaObjectRepository>>flushCacheSoon.
After changing this method to:
flushCacheSoon
	self flushCritical: [
		self commitCritical: [
			self flushCache.
			applyProcess _ nil ] ]
there are no other processes attempting to use the oid manager except the one running the update.

So the conclusion is obvious: the flush is causing the problems.

Chris, please check the #flushCritical: code. I think it should not run inside a commit, or in parallel with commits, if we want to run it in a standalone process. Otherwise Magma disregards its own rule, which says that the following is not allowed:

  session commit: [ [ some code ] fork ]
Also, I am glad to note that with the change in #flushCacheSoon, Florian is able to run his update without errors, regardless of whether he uses my dictionaries or the stock classes.

But the timings are different:
- out-of-the-box Magma: 18 minutes
- using my dictionaries: 8 minutes

Quite impressive for a real-world application, isn't it? :)

Chris, in the attachment I placed the MagmaOidManager code which checks the process, just in case you want to see it yourself.
Chris Muller wrote:
I placed a line in removeGarbageCollectedObjectEntries:

  Transcript show: 'Oids count: ', self oidCount asString; cr.

While running Florian's update it printed (among other lines):

  Oids count: 29923
  Oids count: 45449
  Oids count: 49919
  Oids count: 66399
  Oids count: 76492

So it keeps growing and growing. The only way to get rid of the dead keys is to call MaOidManager>>finalizeOids, but I can't find any code that leads to a call of this method. Is it designed to be managed by the user directly, via MagmaSession>>finalizeOids?
Yes, as documented in the performance tuning page
http://wiki.squeak.org/squeak/2985
at this time finalizeOids is intended to be called by the user,
I make a call to finalizeOids after every commit in my application, but it must not be enough.
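For reference, the pattern being described is simply this, with `applyChanges` standing in for the application's real commit block:

```smalltalk
session commit: [ self applyChanges ].
session finalizeOids.	"as recommended on the performance-tuning page"
```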
I think removeGarbageCollectedObjectEntries should send finalizeValues to the dicts, so that both dicts, holding the object>id and id>object associations, get cleaned and rehashed.
but that sounds like a good idea. I know these two cleaning mechanisms were tacked on years apart from each other, so maybe that's why they ended up separate mechanisms.
Of course, all this is just to accommodate the less-than-ideal weak Dictionaries included with Squeak. I'd much rather fix or replace the Dictionaries. Florian, have you tried any of the alternative weak Dictionaries available? Be sure to check out Martin Loewis' and Sig's. (I've posted the link to Martin's several times in the last few months; it's on Mantis.)
I tried Sig's optimisation and it was a lot quicker than the normal one. The problem was that the race condition we are dealing with arrived quicker too. A simple but bad solution was to add a delay in #reject:. I don't know if Sig has a better solution yet.

I'll try Martin Loewis's optimisation to compare the two.
Florian
Florian Minjat wrote:
I tried Sig's optimisation and it was a lot quicker than the normal one. The problem was that the race condition we are dealing with arrived quicker too. A simple but bad solution was to add a delay in #reject:. I don't know if Sig has a better solution yet.

I'll try Martin Loewis's optimisation to compare the two.
OK, I just tried it, by installing the fix from Mantis and evaluating:

  MagmaPreferences weakIdentityKeyDictionaryClass: WeakIdentityKeyDictionary.

By the way, there should be an equivalent for weakValueDictionaryClass in order to really optimize something. I don't know where it's used in the Magma code, so I can't do that...

So, after that I launched my update process, and after 20 seconds got an error 'could not find an empty slot.' in WeakKeyDictionary>>noCheckAdd:.
Florian
magma@lists.squeakfoundation.org