Hi, I am trying to transfer a big object hierarchy to Magma: 110100 objects in 9004144 bytes. The basic structure is composed of ~70 Players with a lot of stuff inside. Each Player can have some Letters, which reference other Players as sender or recipients. The problem is that I can't submit one Player to an empty Magma repository without letting Magma insert all the referenced Players. The resulting transaction is too big and causes an error. So I separated all the Letters from the Players, inserted the Players (~50 min) and then tried to insert the Letters. But I closed the connection to the Magma repository between the two steps, so Magma tried again to load all the referenced Players. So here is my question: how does Magma identify a given object as already inside a repository? Does it compare the hashes of the objects, so that I just need to redefine the hash method of Players? Each try takes quite long, so I would like to understand a little more of the inner mechanisms of Magma instead of trying numerous times.
Florian
Hi Florian, this is an interesting problem.
First to answer your question, Magma identifies an object as already in the repository if it has a "permanent" oid. This is an oid range between MaOidCalculator #firstUserObjectOid and #lastUserObjectOid. The #hash of an object never comes into play to determine this, or anything in Magma for that matter.
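As a rough sketch of that test (not Magma's actual implementation): an oid counts as permanent when it falls inside the user-object range. The snippet assumes #firstUserObjectOid and #lastUserObjectOid are class-side messages on MaOidCalculator, and anOid stands for an integer oid obtained elsewhere; both are assumptions made for illustration.

```smalltalk
"Sketch only. Assumes both selectors are answered by the MaOidCalculator class itself."
| isPermanentOid |
isPermanentOid := [:anOid |
	anOid
		between: MaOidCalculator firstUserObjectOid
		and: MaOidCalculator lastUserObjectOid].
```

Note that nothing in this check looks at #hash or #=, which is why redefining #hash on Player would not change what Magma considers already persistent.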
One way to get your large object-model into Magma would be to change the method MaObjectBuffer>>#ensureSpaceFor: to allow serializations larger than 10 megabytes. I suggest this only because it is an arbitrary limit and it looks like your model is right there, just over the limit. Change it to 50 megabytes on the fastest computer you have; it should work. Submit the commit and let it run all night.
Another option would be to temporarily change your Players to reference the other players only logically; by some temporary id or something. Once you get them loaded into Magma, then reset the references back to the actual objects.
Note #maTransientVariables is available to implement on Player so it will not serialize those variables, so you could actually ADD a new logical reference, if necessary, and make the hard player reference transient (temporarily).
Please let me know if these ideas are feasible.
- Chris
On 6/10/07, Florian Minjat florian.minjat@emn.fr wrote:
Florian _______________________________________________ Magma mailing list Magma@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/magma
Chris Muller wrote:
Hi Florian, this is an interesting problem.
First to answer your question, Magma identifies an object as already in the repository if it has a "permanent" oid. This is an oid range between MaOidCalculator #firstUserObjectOid and #lastUserObjectOid. The #hash of an object never comes into play to determine this, or anything in Magma for that matter.
One way to get your large object-model into Magma would be to change the method MaObjectBuffer>>#ensureSpaceFor: to allow serializations larger than 10 megabytes. I suggest this only because it is an arbitrary limit and it looks like your model is right there, just over the limit. Change it to 50 megabytes on the fastest computer you have; it should work. Submit the commit and let it run all night.
Great ! That should work very well.
Another option would be to temporarily change your Players to reference the other players only logically; by some temporary id or something. Once you get them loaded into Magma, then reset the references back to the actual objects.
This could be a good way too, but I don't know whether it will take longer or not. Yesterday I tried again: I saved all the Letters in a dictionary keyed by Player login, added all the Players to Magma, then put the Letters back inside the Players. But the last part was so slow that I had to stop the whole process after about one hour. I don't understand this slowness, because those Letters only hold references to Players that should already be inside Magma.
The resulting code gives something like this (a mailbox is a collection of Letters):
mailboxes := Dictionary new.
players do: [:p |
    mailboxes at: (p login) put: (p mailbox).
    p mailbox: nil].
session begin.
players do: [:p |
    session root players add: p.
    session commitAndBegin].
players do: [:p | |magmaPlayer|
    magmaPlayer := session players where: [:r | r login equals: (p login)].
    magmaPlayer mailbox: (mailboxes at: (p login)).
    session commitAndBegin].
Perhaps the Players referenced by the Letters in the last block are detected by Magma as not yet inserted, so it inserts them again?
Note #maTransientVariables is available to implement on Player so it will not serialize those variables, so you could actually ADD a new logical reference, if necessary, and make the hard player reference transient (temporarily).
Do you have an example of the use of a transient variable? The way I understand it, I make the Letters transient so they are not serialized into Magma, then I remove the transient property to load the Letters in a second pass?
Thanks for your answer !
Florian
Hi Florian,
Ok, thanks for the code, it looks real good. Except one thing that should be done to speed it up is add more than one player per commit. Add 100 or 1000 per commit. That should speed up the first part a lot.
Same comment for the second part. Instead of commitAndBegin after each one, do it every 100 or 1000. You could also choose to commit "every five seconds". I know it makes the code less clean, sorry, but this is one-time "bulk load" code anyway.
You should also send #finalizeOids to the session after each commit.
If it is still slow after doing all that, if you post a MessageTally spy it will help a lot.
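For reference, a profile of this kind can be captured in Squeak with MessageTally class>>spyOn:, which runs a block and then opens a report of where the time went. In the sketch below, #bulkLoad is a hypothetical stand-in for whatever method performs the load; it is not a name from this thread.

```smalltalk
"Profile the whole bulk load. #bulkLoad is a hypothetical selector standing in for the loading code."
MessageTally spyOn: [self bulkLoad].
```

The report lists a call tree followed by the leaf methods, each with its share of the total time, which is the format of the tally posted later in this thread.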
Perhaps the Players referenced by the Letters in the last block are detected by Magma as not yet inserted, so it inserts them again?
No, the object is available in the collection as soon as it returns from the commit.
Do you have an example of the use of a transient variable? The way I understand it, I make the Letters transient so they are not serialized into Magma, then I remove the transient property to load the Letters in a second pass?
Yes, the idea is to implement #maTransientVariables on the referencing class:
maTransientVariables
    ^ super maTransientVariables, #('letters')
My idea was to add a new variable temporarily, 'letterIds' or whatever. Populate this with some uniquely-identifying string or number for the letters. Commit it.
THEN, remove the #maTransientVariables method and enumerate them again, initializing each from the letterIds. Finally, remove the letterIds variable.
This is a cumbersome approach, hopefully the first one will work.
Regards, Chris
Hi again Chris,
Chris Muller wrote:
Ok, thanks for the code, it looks real good. Except one thing that should be done to speed it up is add more than one player per commit. Add 100 or 1000 per commit. That should speed up the first part a lot.
Same comment for the second part. Instead of commitAndBegin after each one, do it every 100 or 1000. You could also choose to commit "every five seconds". I know it makes the code less clean, sorry, but this is one-time "bulk load" code anyway.
You should also send #finalizeOids to the session after each commit.
If it is still slow after doing all that, if you post a MessageTally spy it will help a lot.
In fact there are only 59 Players ^^. And the whole dump takes more than one hour, so I don't think it would speed things up that much. I put 5 players per commit to test.
Here is the new code (sorry for the length) :
mailboxes := Dictionary new.
players do: [:p |
    mailboxes at: (p login) put: (p mailbox).
    p mailbox: nil].
commitCount := 0.
players do: [:p |
    session root players add: p.
    (commitCount = 5)
        ifTrue: [
            self commitAndBegin.
            self magmaSession finalizeOids.
            commitCount := 0]
        ifFalse: [commitCount := commitCount + 1]].

commitCount := 0.
players do: [:p | |magmaPlayer|
    magmaPlayer := (session players where: [:r | r login equals: (p login)]) first.
    magmaPlayer mailbox: (mailboxes at: (p login)).
    (commitCount = 5)
        ifTrue: [
            self commitAndBegin.
            self magmaSession finalizeOids.
            commitCount := 0]
        ifFalse: [commitCount := commitCount + 1]].
You can find the MessageTally at the end. It is much quicker (36 min) with the finalizeOids and five adds per commit.
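One detail of the counter logic is worth spelling out. As posted, the test fires on every sixth player, and the trailing partial batch stays uncommitted unless a commit follows the loop elsewhere. A minimal sketch of the same batching that also flushes the remainder is shown below; it uses only the messages already seen in this thread (#commitAndBegin, #finalizeOids, `session root players add:`), and batchSize and pending are illustrative names.

```smalltalk
"Sketch: commit every batchSize additions, then once more for the remainder.
 Assumes session and players are set up as in the code above."
| batchSize pending |
batchSize := 5.
pending := 0.
players do: [:p |
	session root players add: p.
	pending := pending + 1.
	pending = batchSize ifTrue: [
		session commitAndBegin.
		session finalizeOids.
		pending := 0]].
"Commit whatever is left over from the last, partial batch."
pending > 0 ifTrue: [
	session commitAndBegin.
	session finalizeOids].
```

With 59 players and a batch size of 5, this performs eleven full commits and one final commit for the remaining four.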
Perhaps the Players referenced by the Letters in the last block are detected by Magma as not yet inserted, so it inserts them again?
No, the object is available in the collection as soon as it returns from the commit.
I tried changing my method to replace the Players by their logins before the first save, then iterating through the Letters to put back the references in place of the logins. I am sure the references in the Letters come from persistent Players. But it was slower because there are many more Magma queries. Given your answer it is useless, so I went back to the first version I showed you.
Do you have an example of the use of a transient variable? The way I understand it, I make the Letters transient so they are not serialized into Magma, then I remove the transient property to load the Letters in a second pass?
Yes, the idea is to implement #maTransientVariables on the referencing class:
maTransientVariables
    ^ super maTransientVariables, #('letters')
My idea was to add a new variable temporarily, 'letterIds' or whatever. Populate this with some uniquely-identifying string or number for the letters. Commit it.
THEN, remove the #maTransientVariables method and enumerate them again, initializing each from the letterIds. Finally, remove the letterIds variable.
This is a cumbersome approach, hopefully the first one will work.
This method is a little more complex than the first one. As I need only to do it once and for all to transfer the database, I'll use the first approach.
Regards, Chris
Thanks again !
Florian
------------------------------------------------------------
 - 2077529 tallies, 2205004 msec.

**Tree**
98.4% {2169724ms} DOLBDD>>populateCollectionsFromBDDInSqueak
96.3% {2123419ms} DOLBDD>>commitAndBegin
96.3% {2123419ms} MagmaSession>>commitAndBegin
96.3% {2123419ms} MagmaSession>>commitAndBegin:
71.2% {1569963ms} MagmaSession>>newCommitPackageFor:
|64.4% {1420023ms} MaTransaction>>changedObjects
| |64.4% {1420023ms} MaTransaction>>addChangesFromReadSet
| | 64.2% {1415613ms} MaTransaction>>didChange:from:
| | 42.0% {926102ms} MaVariableObjectBuffer(MaVariableBuffer)>>isDifferent:using:
| | |38.5% {848927ms} Array>>maIsChangedFrom:using:
| | | |38.0% {837902ms} Array(Object)>>maIsChangedFrom:using:
| | | | 21.9% {482896ms} MaVariableObjectBuffer(MaObjectBuffer)>>maInstVarAt:
| | | | |20.4% {449821ms} MaVariableObjectBuffer(MaObjectBuffer)>>uint:at:
| | | | | 20.2% {445411ms} ByteArray>>maUint:at:
| | | | | 20.0% {441001ms} ByteArray>>maUnsigned48At:
| | | | | 11.1% {244755ms} LargePositiveInteger>>+
| | | | | |8.4% {185220ms} LargePositiveInteger(Integer)>>+
| | | | | | |8.3% {183015ms} primitives
| | | | | |2.7% {59535ms} primitives
| | | | | 8.7% {191835ms} SmallInteger>>bitShift:
| | | | | 8.1% {178605ms} SmallInteger(Integer)>>bitShift:
| | | | | 7.9% {174195ms} primitives
| | | | 13.6% {299881ms} MaObjectSerializer>>oidFor:
| | | | 13.2% {291061ms} MagmaOidManager(MaOidManager)>>oidFor:
| | | | 13.1% {288856ms} MagmaOidManager>>oidFor:ifAbsent:
| | | | 9.9% {218295ms} MagmaOidManager(MaOidManager)>>oidFor:ifAbsent:
| | | | |8.4% {185220ms} SmallInteger(Integer)>>maOid
| | | | | 8.3% {183015ms} MaOidCalculator class>>oidForInteger:
| | | | | 8.0% {176400ms} SmallInteger>>+
| | | | | 7.8% {171990ms} SmallInteger(Integer)>>+
| | | | | 5.9% {130095ms} primitives
| | | | 2.6% {57330ms} primitives
| | |3.2% {70560ms} Dictionary>>maIsChangedFrom:using:
| | | 2.1% {46305ms} MaObjectSerializer>>oidFor:
| | | 2.0% {44100ms} MagmaOidManager(MaOidManager)>>oidFor:
| | | 2.0% {44100ms} MagmaOidManager>>oidFor:ifAbsent:
| | 19.9% {438796ms} MaFixedObjectBuffer>>isDifferent:using:
| | |9.7% {213885ms} MaObjectSerializer>>oidFor:
| | | |9.6% {211680ms} MagmaOidManager(MaOidManager)>>oidFor:
| | | | 9.6% {211680ms} MagmaOidManager>>oidFor:ifAbsent:
| | | | 6.4% {141120ms} MagmaOidManager(MaOidManager)>>oidFor:ifAbsent:
| | | | |2.7% {59535ms} WeakIdentityKeyDictionary(Dictionary)>>maAt:ifPresent:ifAbsent:
| | | | | |2.3% {50715ms} WeakIdentityKeyDictionary(Dictionary)>>at:ifAbsent:
| | | | |2.3% {50715ms} SmallInteger(Integer)>>maOid
| | | | | 2.2% {48510ms} MaOidCalculator class>>oidForInteger:
| | | | 2.2% {48510ms} WeakIdentityKeyDictionary(Dictionary)>>maAt:ifPresent:ifAbsent:
| | |6.5% {143325ms} MaFixedObjectBuffer(MaObjectBuffer)>>maInstVarAt:
| | | 6.0% {132300ms} MaFixedObjectBuffer(MaObjectBuffer)>>uint:at:
| | | 5.9% {130095ms} ByteArray>>maUint:at:
| | | 5.9% {130095ms} ByteArray>>maUnsigned48At:
| | | 3.3% {72765ms} LargePositiveInteger>>+
| | | |2.5% {55125ms} LargePositiveInteger(Integer)>>+
| | | | 2.4% {52920ms} primitives
| | | 2.5% {55125ms} SmallInteger>>bitShift:
| | | 2.3% {50715ms} SmallInteger(Integer)>>bitShift:
| | | 2.2% {48510ms} primitives
| | 2.2% {48510ms} MaTransaction>>useWriteBarrierOn:
|4.5% {99225ms} MaCommitPackage>>serializeObjectsUsing:
| |4.4% {97020ms} MaObjectSerializer>>serializeGraph:do:
| | 4.4% {97020ms} MaObjectSerializer>>appendGraph:do:
| | 3.8% {83790ms} MaObjectSerializer>>append:
| | 3.7% {81585ms} MaObjectSerializer>>bufferFor:storageObject:startingAt:
|2.4% {52920ms} MagmaSession>>newCommitPackageFor:
17.5% {385876ms} MagmaSession>>refreshViewUsing:
|17.5% {385876ms} MaCommitResult>>refresh:
| 16.0% {352801ms} MagmaSession>>assignPermanentOidsFrom:
| 13.9% {306496ms} MaObjectSerializer>>oidOf:is:
| |13.9% {306496ms} MagmaOidManager>>oidOf:is:
| | 13.9% {306496ms} MagmaOidManager(MaOidManager)>>oidOf:is:
| | 13.4% {295471ms} MaWeakValueDictionary(MaDictionary)>>at:put:
| | 12.7% {280036ms} WeakValueDictionary(Dictionary)>>includesKey:
| | 12.6% {277831ms} WeakValueDictionary(Dictionary)>>at:ifAbsent:
| | 12.2% {269010ms} WeakValueDictionary(Set)>>findElementOrNil:
| | 11.8% {260190ms} WeakValueDictionary(Dictionary)>>scanFor:
| 2.0% {44100ms} MaObjectSerializer>>objectWithOid:ifFound:ifAbsent:
| 2.0% {44100ms} MagmaOidManager>>objectWithOid:ifFound:ifAbsent:
7.6% {167580ms} MagmaSession>>submit:
7.6% {167580ms} MaLocalServerLink>>submit:
7.6% {167580ms} MaLocalRequestServer(MaRequestServer)>>processRequest:
7.6% {167580ms} MagmaRepositoryController>>value:
7.6% {167580ms} MagmaRepositoryController>>processRequest:
7.6% {167580ms} MaWriteRequest>>process
7.6% {167580ms} MaObjectRepository>>submitAll:for:beginAnother:
7.5% {165375ms} MaObjectRepository>>write:
4.5% {99225ms} MaRecoveryManager>>log:flush:
4.0% {88200ms} MaObjectSerializer>>serializeGraph:
4.0% {88200ms} MaObjectSerializer>>serializeGraph:do:
4.0% {88200ms} MaObjectSerializer>>appendGraph:do:
3.1% {68355ms} MaObjectSerializer>>append:
3.0% {66150ms} MaObjectSerializer>>bufferFor:storageObject:startingAt:
2.2% {48510ms} MaVariableObjectBuffer>>populateBodyFor:using:
2.2% {48510ms} Dictionary>>maStreamVariablyInto:for:
2.0% {44100ms} MaObjectSerializer>>oidFor:
2.0% {44100ms} MaOidManager>>oidFor:

**Leaves**
19.3% {425566ms} SmallInteger(Integer)>>+
18.1% {399106ms} Dictionary>>scanFor:
10.5% {231525ms} SmallInteger(Integer)>>bitShift:
3.9% {85995ms} MagmaOidManager>>oidFor:ifAbsent:
3.7% {81585ms} LargePositiveInteger>>+
3.5% {77175ms} Dictionary>>at:ifAbsent:
3.3% {72765ms} SmallInteger(Number)>>negative
2.4% {52920ms} WeakKeyAssociation>>key

**Memory**
old +44,200,708 bytes
young +2,281,748 bytes
used +46,482,456 bytes
free -2,281,036 bytes

**GCs**
full 66 totalling 38,913ms (2.0% uptime), avg 590.0ms
incr 144172 totalling 1,192,826ms (54.0% uptime), avg 8.0ms
tenures 1,426 (avg 101 GCs/tenure)
root table 0 overflows
Hi again. Well, as usual the MessageTally is very revealing. It should have hit me before, but I finally realize now you have all of what you want to commit to the database in memory (right?) and that is why tremendous amounts of time are spent to determine which objects changed. As more and more objects become persisted in the database, Magma must scan the ever-larger Dictionary of objects on subsequent commits to see if they changed.
WriteBarrier can virtually eliminate this. It detects when instance variables change and marks those objects dirty as the changes happen, allowing the readSet to remain very small.
But really, databases are used to hold object models larger than memory, and Magma wants to be a database. It is designed for OLTP-style programs that work with a relatively small chunk of the model at a time, throw it away, then retrieve and work with another chunk; i.e., it should perform well working with a "screen at a time". Having a large portion of the model in memory will cause change detection to slow down significantly (unless WriteBarrier is used), and will also slow down refreshes of the model (changes by other users) when transaction boundaries are crossed.
For your problem, though, you have a large inter-connected object graph in memory that you want to get into the database. You can't do it piecemeal because everything is linked together, right? Therefore, the fastest way to do that is with a single commit. Did you try changing that method (MaObjectBuffer>>#ensureSpaceFor:) to allow 50-meg and letting it run all night?
Regards, Chris
On 6/14/07, Florian Minjat florian.minjat@emn.fr wrote:
Chris Muller wrote:
Hi again. Well, as usual the MessageTally is very revealing. It should have hit me before, but I finally realize now you have all of what you want to commit to the database in memory (right?) and that is why tremendous amounts of time are spent to determine which objects changed. As more and more objects become persisted in the database, Magma must scan the ever-larger Dictionary of objects on subsequent commits to see if they changed.
Yes, you are right. I used GOODS, which was not reliable at that time. Then I used a database inside the Squeak image and saved the db with fileOuts. But that is not stable at all because of the over-2 GB memory problem: the saves randomly hung the image. So now I am struggling to use Magma, which seems far more reliable but needs more tuning :).
WriteBarrier can virtually eliminate this. It detects when instance variables change and marks those objects dirty as the changes happen, allowing the readSet to remain very small.
But really, databases are used to have object models larger than memory, and Magma wants to be a database. It is designed for OLTP-style programs that work with a relatively small chunk of the model at a time, throw it away, then retrieve and work with another chunk; i.e., it should perform well working with a "screen at a time". Having a large portion of the model in memory will cause change detection to slow down significantly (unless WriteBarrier is used), as well as slowing down refreshes of the model (changes by other users) when transaction boundaries are crossed.
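As a rough illustration of the WriteBarrier remark, a minimal sketch follows. It assumes `session` is a connected MagmaSession and that the installed Magma version understands #allowWriteBarrier: (the selector should be checked in your own image); `aPlayer` and `newMailbox` are hypothetical names used only for this example.

```smalltalk
"Sketch only. Assumes `session` is a connected MagmaSession and that this
 Magma version understands #allowWriteBarrier: (verify in your image).
 `aPlayer` and `newMailbox` are hypothetical."
session allowWriteBarrier: true.

"With the barrier active, changed objects are marked dirty as they change,
 so the commit should not need to compare every object in the readSet."
session commit: [aPlayer mailbox: newMailbox].
```

Whether this helps in practice depends on how much of the model is held in memory; a MessageTally spy before and after is the way to find out.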
For your problem, though, you have a large inter-connected object graph in memory that you want to get into the database. You can't do it piecemeal because everything is linked together, right? Therefore, the fastest way to do that is with a single commit. Did you try changing that method (MaObjectBuffer>>#ensureSpaceFor:) to allow 50-meg and letting it run all night?
I just tried it. It took at least 40 min. But I am not sure because I paused the process to use the computer and it disturbed MessageTally which gave me an error at the end. I will do it again when I have some more time. But as it is almost the same length of time, I think I won't use the system with removal/insertion of the Letters anymore.
One thing I am worried about, though : every 12 hours my app runs a process that updates all the players, which changes a lot of things. Will that take as much time, or is it only the insertion that is so slow ?
Florian
Regards, Chris
On 6/14/07, Florian Minjat florian.minjat@emn.fr wrote:
Hi again Chris,
Chris Muller wrote:
Hi Florian,
Yesterday I tried again to save all the Letters in a dictionary keyed by Player, add all the Players to Magma, then put the Letters back inside the Players. But the last part was so slow that I had to stop the whole process after about one hour. I don't understand this slowness, because those Letters only hold references to Players, which should already be inside Magma.
The resulting code gives something like this (a mailbox is a collection of Letters):
    mailboxes := Dictionary new.
    players do: [:p |
        mailboxes at: (p login) put: (p mailbox).
        p mailbox: nil].
    session begin.
    players do: [:p |
        session root players add: p.
        session commitAndBegin].
    players do: [:p | | magmaPlayer |
        magmaPlayer := session players where: [:r | r login equals: (p login)].
        magmaPlayer mailbox: (mailboxes at: (p login)).
        session commitAndBegin]
Ok, thanks for the code, it looks real good. Except one thing that should be done to speed it up is add more than one player per commit. Add 100 or 1000 per commit. That should speed up the first part a lot.
Same comment for the second part. Instead of commitAndBegin after each one, do it every 100 or 1000. You could also choose to commit "every five seconds". I know it makes the code less clean, sorry, but this is one-time "bulk load" code anyway.
You should also send #finalizeOids to the session after each commit.
If it is still slow after doing all that, if you post a MessageTally spy it will help a lot.
In fact there are only 59 Players ^^. And the whole dump takes more than one hour, so I don't think it would speed things up that much. I put 5 players per commit to test.
Here is the new code (sorry for the length) :
    mailboxes := Dictionary new.
    players do: [:p |
        mailboxes at: (p login) put: (p mailbox).
        p mailbox: nil].
    commitCount := 0.
    players do: [:p |
        session root players add: p.
        (commitCount = 5)
            ifTrue: [
                self commitAndBegin.
                self magmaSession finalizeOids.
                commitCount := 0]
            ifFalse: [commitCount := commitCount + 1]].

    commitCount := 0.
    players do: [:p | | magmaPlayer |
        magmaPlayer := (session players where: [:r | r login equals: (p login)]) first.
        magmaPlayer mailbox: (mailboxes at: (p login)).
        (commitCount = 5)
            ifTrue: [
                self commitAndBegin.
                self magmaSession finalizeOids.
                commitCount := 0]
            ifFalse: [commitCount := commitCount + 1]].
You can find the MessageTally at the end. It is much quicker (36min) with the finalizeOids and five adds by commit.
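One detail of the batching loops above: they commit only when the counter reaches its threshold, so objects added after the last full batch are committed only if another commit follows the loop. A possible variant that flushes the final partial batch is sketched below. It uses only #commitAndBegin and #finalizeOids as they appear in this thread; `batchSize` and `pending` are hypothetical names, and `session` is assumed to be a connected MagmaSession already inside a transaction.

```smalltalk
"Sketch only: `session` is assumed to be a connected MagmaSession that has
 already been sent #begin; `players` is the in-memory collection.
 `batchSize` and `pending` are hypothetical names."
| batchSize pending |
batchSize := 5.    "tune by experiment"
pending := 0.
players do: [:p |
    session root players add: p.
    pending := pending + 1.
    pending >= batchSize ifTrue: [
        session commitAndBegin.
        session finalizeOids.
        pending := 0]].

"Commit whatever remains from the last, partial batch."
pending > 0 ifTrue: [
    session commitAndBegin.
    session finalizeOids].
```

The same shape would apply to the second loop that puts the mailboxes back.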
Perhaps the Players referenced by the Letters in the last block are detected by Magma as not yet inserted, and it inserts them again ?
No, the object is available in the collection as soon as it returns from the commit.
I tried changing my method to replace the Players by their logins before the first save, then iterating through the Letters to put the references back in place of the logins. I am sure the references in the Letters come from persistent players. But it was slower because there are many more Magma queries. Given your answer it is useless, so I went back to the first version I showed you.
Do you have an example of the use of a transient variable ? The way I understand it, I make the Letters transient so they are not serialized into Magma, then I remove the transient property to load the Letters in a second pass ?
Yes, the idea is to implement #maTransientVariables on the referencing class:
    maTransientVariables
        ^ super maTransientVariables, #('letters')
My idea was to add a new variable temporarily, 'letterIds' or whatever. Populate this with some uniquely-identifying string or number for the letters. Commit it.
THEN, remove the #maTransientVariables method and enumerate them again, initializing each from the letterIds. Finally, remove the letterIds variable.
This is a cumbersome approach, hopefully the first one will work.
This method is a little more complex than the first one. As I need only to do it once and for all to transfer the database, I'll use the first approach.
Regards, Chris
Thanks again !
Florian
I have another question : if I ask Magma for a Player object which references other Players through its Letters, will Magma deserialize all the referenced players ? If that is the case, is there a way to ask for a 'half-deserialization' ? By that I mean, for example, getting the referenced Players with all their primitive objects but none of the objects they reference ? Otherwise I will need to seriously rethink the Mailbox/Letter model of my application.
Florian
On 6/10/07, Florian Minjat florian.minjat@emn.fr wrote:
Hi, I am trying to transfer a big object hierarchy to Magma : 110100 objects in 9004144 bytes. The basic structure is composed of ~70 Players with a lot of stuff inside. Each players can have some Letters which references other Players as sender or recipients. The problem is that I can't submit one player to an empty Magma repository without letting Magma insert all the references Players. And the resulting transaction is too big, causing an error. So I separated all the Letters from the Players, inserted the players (~50min) and then tried to insert the Letters. But I close the connection to the magma repository between the two, so Magma tried again to load all the references Players. So here is my question : how does Magma identify a given object as already inside a repository ? Does it compare the hash of the objects and I just need to redefine the hash method of Players ? Each try is quite long so I would like to understand a little more of the inner mechanisms of Magma instead of trying numerous times.
Florian
Yes, I think you are asking about ReadStrategies. They are crucial to getting proper performance out of Magma. Here is the information about ReadStrategies you must understand:
http://wiki.squeak.org/squeak/2638
On 6/12/07, Florian Minjat florian.minjat@emn.fr wrote:
I have another question : if I ask Magma for a Player object which references other Players through its Letters, will Magma deserialize all the referenced players ? If that is the case, is there a way to ask for a 'half-deserialization' ? By that I mean, for example, getting the referenced Players with all their primitive objects but none of the objects they reference ? Otherwise I will need to seriously rethink the Mailbox/Letter model of my application.
Florian
Great ! I knew there was already something like that in Magma. I will test it soon. It should speed up the listing of the players considerably. Thanks !
Florian
Hi Chris, I read the ReadStrategy FAQ and it's quite interesting. I have some questions, though. First of all, how long does a read strategy last for a given session once I do 'mySession readStrategy: myReadStrategy' ? Do I have to revert to a default strategy after my specific queries ? Then, is it possible to be more specific ? For example : my Players have a reference to a Dungeon, which references a Desk, which has two OrderedCollections referencing Letters. If I want to send a Letter to 10 Players, I need to add a Letter to each of their Desks, which I think results in a huge query. Is there a way to be that specific with a ReadStrategy ? By doing such a query, my understanding is that the other objects referenced by the Dungeons won't be materialized. Am I right ?
Florian
First of all, how long does a read strategy last for a given session once I do 'mySession readStrategy: myReadStrategy' ?
It will last until you replace it with another ReadStrategy.
Do I have to revert it back to a default strategy after my specific queries ?
Yes. If you have a specific query you need to optimise by changing the depths of certain variables of certain classes, you will probably want to revert to your "default" strategy afterward, otherwise the specific-query strategy will continue to be used.
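One way to make the revert hard to forget is to wrap the specific work in an #ensure: block, sketched below. This is only an illustration: it assumes `session` is a connected MagmaSession, `myReadStrategy` is the query-specific strategy discussed here, and that the session answers its current strategy to a #readStrategy getter (an assumption; only the setter appears in this thread).

```smalltalk
"Sketch only: `session` and `myReadStrategy` are as in the thread.
 The #readStrategy getter is assumed, not confirmed by the thread."
| defaultStrategy |
defaultStrategy := session readStrategy.
session readStrategy: myReadStrategy.
[
    "... run the specific query and the work that needs the deeper read ..."
] ensure: [
    "Runs even if the block above signals an error."
    session readStrategy: defaultStrategy].
```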
For example : my Players have a reference to a Dungeon, which references a Desk, which has two OrderedCollections referencing Letters. If I want to send a Letter to 10 Players, I need to add a Letter to each of their Desks, which I think results in a huge query. Is there a way to be that specific with a ReadStrategy ?
Let's see here.
Player --->(1) Dungeon --->(1) Desk --->(*) Letter
So, for this operation (sending a single Letter to 10 Players), use a ReadStrategy, something like this:
    (myReadStrategy minimumDepth: 1)
        forVariableNamed: 'dungeon' onAny: Player readToDepth: 1;
        forVariableNamed: 'desk' onAny: Dungeon readToDepth: 1;
        forVariableNamed: 'letters1' onAny: Desk readToDepth: 2;
        forVariableNamed: 'letters2' onAny: Desk readToDepth: 2.
This will cause the Player-->Dungeon-->Desk-->OC of Letters to be grabbed in a single server-trip.
The 'letters1' OC specifies 2 levels because there is the intermediate 'array' variable. But you'll have to experiment to get it fully optimised. The way to do this is to be careful when opening inspectors, because clicking an instVar in an inspector invokes #printString, which causes proxy materializations. So what you do is just open the *inspector* (NOT an explorer) on your Desk and ask the instVar for its class before inspecting it:
letters1 class "MagmaMutatingProxy"
then you know you need one level deeper, because you want it to say "OrderedCollection". When you fix the ReadStrategy to do that, try it again but still do NOT click on the 'letters1' variable because the inspector will try to enumerate the values. Instead:
letters1 basicInspect
and then, in the basic-inspector:
array class "Array" or "MagmaMutatingProxy"
and so on.
In this way you can see exactly how it works and, knowing what you need, be able to craft a fully-optimised ReadStrategy for each use-case.
Regards, Chris
On 6/14/07, Florian Minjat florian.minjat@emn.fr wrote:
Hi Chris, I read the ReadStrategy FAQ and it's quite interresting. I have some question though. First of all, how long will the read strategy last for a given session by doing mySession 'session readStrategy: myReadStrategy.' ? Do I have to revert it back to a default strategy after my specific queries ? Then is it possible to be more specific. For example : my Players have a reference to a Dungeon which references a Desk which has two OrderedCollection referencing Letters. If I want to send a Letter to 10 Players, I need to had a Letter in each of their Desk, resulting of a huge query I think. Is there a way to be so much specific with a ReadStrategy ? By doing such a query, my understanding is that the other referenced objects of the Dungeons won't be materialized. Am I right ?
Florian
Florian Minjat wrote:
Great ! I knew there were already something like that in Magma. I will test it soon. It should speed up the listing of the players considerably. Thanks !
Florian
Chris Muller wrote:
Yes, I think you are asking about ReadStrategy's. They are crucial to getting proper performance out of Magma. Here is the information about ReadStrategy's you must understand:
http://wiki.squeak.org/squeak/2638
On 6/12/07, Florian Minjat florian.minjat@emn.fr wrote:
I have another question : if I ask magma for an object Player which references others Players in its Letters, will magma deserialize all the referenced players ? If it is the case, is there a way to ask for an 'half-deserialization' ? By that I mean for example getting the referenced Players with all their primitive objects but no referenced object ? Otherwise I will need to seriously rethink the Mailbox/Letter model of my application.
Florian
Chris Muller wrote:
Hi Florian, this is an interesting problem.
First to answer your question, Magma identifies an object as already in the repository if it has a "permanent" oid. This is an oid range between MaOidCalculator #firstUserObjectOid and #lastUserObjectOid. The #hash of an object never comes into play to determine this, or anything in Magma for that matter.
One way to get your large object-model into Magma would be to change the method MaObjectBuffer>>#ensureSpaceFor: to allow more than 10 megabyte serializations. I only suggest this only because it is an arbitrary limit and it looks like your model is right there just over the limit. Change it to 50-meg on the fastest computer you have, It should work. Submit the commit and let it run all night.
Another option would be to temporarily change your Players to reference the other players only logically; by some temporary id or something. Once you get them loaded into Magma, then reset the references back to the actual objects.
Note #maTransientVariables is available to implement on Player so it will not serialize those variables, so you could actually ADD a new logical reference, if necessary, and make the hard player reference transient (temporarily).
Please let me know if these ideas are feasible.
- Chris
On 6/10/07, Florian Minjat florian.minjat@emn.fr wrote:
Hi, I am trying to transfer a big object hierarchy to Magma : 110100 objects in 9004144 bytes. The basic structure is composed of ~70 Players with a lot of stuff inside. Each players can have some Letters which references other Players as sender or recipients. The problem is that I can't submit one player to an empty Magma repository without letting Magma insert all the references Players. And the resulting transaction is too big, causing an error. So I separated all the Letters from the Players, inserted the players (~50min) and then tried to insert the Letters. But I close the connection to the magma repository between the two, so Magma tried again to load all the references Players. So here is my question : how does Magma identify a given object as already inside a repository ? Does it compare the hash of the objects and I just need to redefine the hash method of Players ? Each try is quite long so I would like to understand a little more of the inner mechanisms of Magma instead of trying numerous times.
Florian _______________________________________________ Magma mailing list Magma@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/magma
Thanks a lot for this great tutorial Chris ! It will help me a lot to optimise my application. I will inform you if I have some difficulties or questions. Thanks again for Magma :)
Florian
Chris Muller wrote:
> First of all, how long will the read strategy last for a given session by doing 'mySession readStrategy: myReadStrategy.'?
It will last until you replace it with another ReadStrategy.
> Do I have to revert it back to a default strategy after my specific queries?
Yes. If you have a specific query you need to optimise by changing the depths of certain variables of certain classes, you will probably want to revert to your "default" strategy afterward, otherwise the specific-query strategy will continue to be used.
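A minimal sketch of switching strategies around one query, assuming 'session' is an already-connected MagmaSession. Both strategies and their depth values are placeholders, not recommendations.

```smalltalk
"Sketch only: 'session' is assumed to be connected; the depths are placeholders."
| defaultStrategy specialStrategy |
defaultStrategy := MaReadStrategy minimumDepth: 1.
specialStrategy := MaReadStrategy minimumDepth: 2.

"Install the application's default strategy once, e.g. right after connecting."
session readStrategy: defaultStrategy.

"Switch to the special strategy for one query, then always switch back."
session readStrategy: specialStrategy.
[ "... run the specific query here ..." ]
    ensure: [ session readStrategy: defaultStrategy ]
```

Using #ensure: restores the default strategy even if the query signals an error, so the special strategy cannot stay in effect by accident.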
> For example: my Players have a reference to a Dungeon, which references a Desk, which has two OrderedCollections referencing Letters. If I want to send a Letter to 10 Players, I need to add a Letter to each of their Desks, resulting in a huge query, I think. Is there a way to be that specific with a ReadStrategy?
Let's see here.
Player --->(1) Dungeon --->(1) Desk --->(*) Letter
So, for this operation (sending a single Letter to 10 Players), use a ReadStrategy, something like this:
(myReadStrategy minimumDepth: 1)
    forVariableNamed: 'dungeon' onAny: Player readToDepth: 1;
    forVariableNamed: 'desk' onAny: Dungeon readToDepth: 1;
    forVariableNamed: 'letters1' onAny: Desk readToDepth: 2;
    forVariableNamed: 'letters2' onAny: Desk readToDepth: 2.
This will cause the Player-->Dungeon-->Desk-->OC of Letters to be grabbed in a single server-trip.
The 'letters1' OC specifies 2 levels because there is the intermediate 'array' variable. But you'll have to experiment to get it fully optimised. The way to do this is to be careful when opening inspectors, because clicking an instVar in an inspector invokes #printString, which causes proxy materializations. So what you do is open just the *inspector* (NOT an explorer) on your Desk and ask the instVar for its class before inspecting it:
letters1 class "MagmaMutatingProxy"
then you know you need one level deeper, because you want it to say "OrderedCollection". When you fix the ReadStrategy to do that, try it again but still do NOT click on the 'letters1' variable because the inspector will try to enumerate the values. Instead:
letters1 basicInspect
and then, in the basic-inspector:
array class "Array" or "MagmaMutatingProxy"
and so on.
In this way you can see exactly how it works and, knowing what you need, be able to craft a fully-optimised ReadStrategy for each use-case.
Regards, Chris
On 6/14/07, Florian Minjat florian.minjat@emn.fr wrote:
Hi Chris, I read the ReadStrategy FAQ and it's quite interesting. I have some questions though. First of all, how long will the read strategy last for a given session by doing 'mySession readStrategy: myReadStrategy.'? Do I have to revert it back to a default strategy after my specific queries? Then, is it possible to be more specific? For example: my Players have a reference to a Dungeon, which references a Desk, which has two OrderedCollections referencing Letters. If I want to send a Letter to 10 Players, I need to add a Letter to each of their Desks, resulting in a huge query, I think. Is there a way to be that specific with a ReadStrategy? By doing such a query, my understanding is that the other referenced objects of the Dungeons won't be materialized. Am I right?
Florian
Florian Minjat wrote:
Great! I knew there was already something like that in Magma. I will test it soon. It should speed up the listing of the players considerably. Thanks!
Florian
Chris Muller wrote:
Yes, I think you are asking about ReadStrategies. They are crucial to getting proper performance out of Magma. Here is the information about ReadStrategies that you must understand:
http://wiki.squeak.org/squeak/2638
On 6/12/07, Florian Minjat florian.minjat@emn.fr wrote:
I have another question: if I ask Magma for a Player object which references other Players in its Letters, will Magma deserialize all the referenced Players? If that is the case, is there a way to ask for a 'half-deserialization'? By that I mean, for example, getting the referenced Players with all their primitive objects but no referenced objects. Otherwise I will need to seriously rethink the Mailbox/Letter model of my application.
Florian
Hi again Chris, I have a few questions about ReadStrategies after playing around with them. Let's say my Players have an id which is an integer and a login which is a String. Setting minimumDepth: 0 gives me the id reified and a proxy for the login. But setting the minimum depth to 1 gives me the exact same thing. I need to set the depth to 2 to get the String. If that's normal, what is the difference between 0 and 1?
Another thing. As you saw, I have a Desk with letters which refer to other Players. I don't want these Players to be reified, but I do want deeper objects to be. So I tried something like this: session readStrategy: ((MaReadStrategy minimumDepth: 1) onAny: Desk readToDepth: 0). But it didn't seem to work. This expression answers OrderedCollection: ((player instVarNamed: 'myDesk') instVarNamed: 'inbox') class. So how can I make the rule on Desk take precedence over the more global rule?
A very simple browser to go through the hierarchy and see what is reified and what isn't would be great :).
Florian
magma@lists.squeakfoundation.org