In Squeak 3.8 (and thus OLPC Squeak), we changed Squeak to call #initialize by default on #new.
The attached changeset fixes some instance creation methods (e.g.of Rectangle, LookupKey, Point) to not call #new (as there is no initialize method), but use #basicNew instead, thus saving a lookup and call to the empty #initialize method.
Thanks again, Marcus,
I'll measure the performance on Monday (it should give some improvement, but) and push it to the stream.
-- Yoshiki
This brings up the idea of avoiding nil-out some of the memory area when it is not necessary...
Here's the last batch of performance changes backported from 3.9.
These seem to not give too much of a performance boost, but they should not hurt either.
"Change Set: RemoveFlag-md Date: 16 October 2006 Author: Marcus Denker
This just removed a self flag: #bob
from WorldState>>doOneCycleNowFor:. There are other calls to #flag done in the main rendering loop, these are not fixed with the changeset."
Change Set: SubMorphCopy-md Date: 16 October 2006 Author: Marcus Denker
This is from
Change Set: MorphicPerformance1-efc Date: 6 August 2005 Author: Eddie Cottongim
-Change Morph>>submorphs to be non-copying. This seems to be the concensus on the squeak-dev list of the right thing to do. It doesn't provide a noticable boost in normal circumstances but with very large numbers of morphs it prevents a lot of scaling problems.
I did not include the other changes, as there are clashes with other changes in OLPC"
Change Set: FasterMorphicProperties Date: 16 October 2006 Author: Marcus Denker
This changeset optimized Morphic property code for message sends by not using accessors"!
Marcus,
Here's the last batch of performance changes backported from 3.9.
These seem to not give too much of a performance boost, but they should not hurt either.
It seems that they do give some performance improvement. Again, I made up two images based on #1072 image. One is without these changes (i.e., vanilla 1072) and another is with all changes in you #2 and #3 emails.
without: '#1072 memory=19365744 display=1012x700x16 painter=1439.4 viewer=1693.2 scriptor=1426.0 category=2444.6 ' '#1072 memory=19287180 display=1012x700x16 painter=1436.4 viewer=1692.8 scriptor=1406.6 category=2442.2 ' '#1072 memory=19175904 display=1012x700x16 painter=1456.0 viewer=1691.2 scriptor=1443.6 category=2452.4 '
with: '#1072 memory=19381884 display=1012x700x16 painter=1414.4 viewer=1580.0 scriptor=1346.2 category=2280.8 ' '#1072 memory=19299216 display=1012x700x16 painter=1414.4 viewer=1582.2 scriptor=1332.0 category=2283.8 ' '#1072 memory=19179744 display=1012x700x16 painter=1465.6 viewer=1587.8 scriptor=1315.2 category=2283.8 '
Again, it is over all 7% or such improvement! It is rather exciting, actually.
The following is the results from the #1059 image without all of these changes. The improvements seems to be "linear". It is really good.
without: '#1059 memory=19310712 display=1012x700x16 painter=1483.2 viewer=1802.4 scriptor=1487.2 category=2628.8 ' '#1059 memory=19228056 display=1012x700x16 painter=1478.4 viewer=1796.8 scriptor=1470.4 category=2625.2 ' '#1059 memory=19022612 display=1012x700x16 painter=1565.2 viewer=1803.8 scriptor=1468.4 category=2627.0 '
"Change Set: RemoveFlag-md Date: 16 October 2006 Author: Marcus Denker
This just removed a self flag: #bob
Oh, boy. Even this wouldn't give us any improvement, I love this one.
-- Yoshiki
Marcus,
I forgot to mention, but
Change Set: SubMorphCopy-md Date: 16 October 2006 Author: Marcus Denker
This one was considered "it seems to be ok. We don't find any code that changes the result". I don't remember the further discussion, but was there some analysis, or it is just seems to be ok as it is long time in 3.9 line images?
Also, we've changed the MorphExtension a bit, I put a subsequent fix for it.
Thank you!
-- Yoshiki
On 16.10.2006, at 19:46, Yoshiki Ohshima wrote:
Marcus,
I forgot to mention, but
Change Set: SubMorphCopy-md Date: 16 October 2006 Author: Marcus Denker
This one was considered "it seems to be ok. We don't find any code that changes the result". I don't remember the further discussion, but was there some analysis, or it is just seems to be ok as it is long time in 3.9 line images?
It's in 3.9 for some time now, yes. But it should not be really perfomance critical, so you might not want to add it.
The changeset that contained this had some optimization that helped a lot with text rendering:
" -StrikeFontSet has been optimized to use the bitblt text primitive for bytestrings. For the multibyte path, it was still possible to save a few sends for a noticable improvement. " http://bugs.impara.de/view.php?id=1628 http://bugs.impara.de/view.php?id=1672
But you worked in that direction, so it might be already in or hard to merge.
Another change that I did not send is the Gradient Cache (Eddie Cottongim) http://bugs.impara.de/view.php?id=1759 I tested it on the OLPC image, and there are not many gradients left, so it will not have any positive effect, I think.
One changeset you might consider to load is Andreas' version of the ByteSymbol primitive fix: http://bugs.impara.de/view.php?id=1382
You fixed already the hashing primitive call, so this won't have any too big impact, but it's nice to have the best solution for that (I think this fix is in 3.8.1, too).
Marcus
Marcus,
This one was considered "it seems to be ok. We don't find any code that changes the result". I don't remember the further discussion, but was there some analysis, or it is just seems to be ok as it is long time in 3.9 line images?
It's in 3.9 for some time now, yes. But it should not be really perfomance critical, so you might not want to add it.
It does make difference, though. I like it in the image. There seems to be small small chance that some area breaks, but it should be something we can fix later.
#copy to Array, String, etc. nils-out memory redundantly (and send #postCopy for nothing), and these memory scanning is slow on such platforms. I'll experiment that.
-StrikeFontSet has been optimized to use the bitblt text primitive for bytestrings. For the multibyte path, it was still possible to save a few sends for a noticable improvement. " http://bugs.impara.de/view.php?id=1628 http://bugs.impara.de/view.php?id=1672
But you worked in that direction, so it might be already in or hard to merge.
I'll check that.
Another change that I did not send is the Gradient Cache (Eddie Cottongim) http://bugs.impara.de/view.php?id=1759 I tested it on the OLPC image, and there are not many gradients left, so it will not have any positive effect, I think.
Ah, we should think about the memory usage too (we generally favor speed over space), and as you say, we are trying to minimize the use of gradient, so probably we should wait on this one.
One changeset you might consider to load is Andreas' version of the ByteSymbol primitive fix: http://bugs.impara.de/view.php?id=1382
You fixed already the hashing primitive call, so this won't have any too big impact, but it's nice to have the best solution for that (I think this fix is in 3.8.1, too).
And, there is a good reason to be in sync somewhat.
Thanks!
-- Yoshiki
Here are a couple of issues arising from these mostly very welcome changes:
(1) I happened to notice that "Color fromUser" had become broken in OLPC after loading the performance improvements. I eventually tracked its cause down to update 1074ShortCutNew-md. Because of the change to Rectangle class origin:corner: in that update, #initialize is no longer sent to a new Quadrangle created via Quadrangle>>origin:corner, with the result that Quadrangles created with this idiom are incompletely initialized, with nil in their borderWidth, borderColor, and insideColor instance variables; this is what broke "Color fromUser".
(BTW CharacterBlock, another Rectangle subclass which has its own inst var additions, might also have been broken by the ShortCutNew change -- someone ought to check.)
When I evaluated "Color fromUser" in a 3.9 image, it didn't cause the error I saw in olpc. Investigating further, I found that this was because in 3.9 a custom @origin:corner: had been added to Quadrangle class-side. But that addition did not make it into the bundle that got sent to us. [BTW it appears that while 3.9 has the fix for Quadrangle>>origin:corner:, it still has the bug for Quadrangle>>origin:extent:]
I've pushed an update to olpc that fixes the two bugs I know of that appeared in Quadrangle as a consequence of update 1074, so I mention the issue here only as a cautionary tale. Partial backports can be dangerous, sometimes in quite obscure ways...
(2) I have reservations about the SubMorphCopy change-set (1075SubMorphCopy-md).
A well-known feature of Morphic's api was that #submorphs returned a shallow copy, rather than handing out the submorphs object itself. The SubMorphCopy update rescinds Morphic's long-standing (> 10 years) promise that #submorphs would return a new collection.
Many of us have had experience with bugs caused by iterating over a collection when the code run within the iteration might change the actual collection being iterated over. Thus, for example, as a safety measure, we often use the idiom "foo copy do:", as opposed to "foo do:", in situations where the actual contents of the foo collection might get changed by the do-block, thus compromising the integrity of the iteration itself. However, knowing that #submorphs always returned a *copy*, it has (until now) always been reasonable -- indeed optimal -- to omit a redundant #copy call in such circumstances.
Now it's *possible* that no methods linger anywhere in the image, nor in any package likely to be loaded by any user of an olpc image, which count on the promise that #submorphs would return a copy. But the only experience in the field with the non-copying #submorphs implementation has been in 3.9, and the total amount of etoy activity by people using 3.9 is approximately zero, whereas on the olpc nearly all Squeak activity will relate to etoys.
The likelihood that something actually got broken by switching to the non-copying #submorphs may be small, but IMO it's decidedly nonzero; and my concern is that if there *are* some breakages caused by this change, they might be extremely hard to track down.
So I wonder about the advisability of making a fundamental semantic change to a fundamental Morphic protocol at this stage, even if it appears that they've "gotten away with it" in 3.9...
Cheers,
-- Scott
On Oct 16, 2006, at 12:20 PM, Yoshiki Ohshima wrote:
Marcus,
Change Set: SubMorphCopy-md This one was considered "it seems to be ok. We don't find any code
that changes the result". I don't remember the further discussion, but was there some analysis, or it is just seems to be ok as it is long time in 3.9 line images?
It's in 3.9 for some time now, yes. But it should not be really performance critical, so you might not want to add it.
It does make difference, though. I like it in the image. There seems to be small small chance that some area breaks, but it should be something we can fix later.
Scott,
I've pushed an update to olpc that fixes the two bugs I know of that appeared in Quadrangle as a consequence of update 1074, so I mention the issue here only as a cautionary tale. Partial backports can be dangerous, sometimes in quite obscure ways...
Thanks!
(2) I have reservations about the SubMorphCopy change-set (1075SubMorphCopy-md).
Hmm. We can revert this. Array>>copy is two times faster, it will offset this. After B-test, we can get it in again.
-- Yoshiki
Scott,
(2) I have reservations about the SubMorphCopy change-set (1075SubMorphCopy-md).
Hmm. We can revert this. Array>>copy is two times faster, it will offset this. After B-test, we can get it in again.
Thank you for putting the change.
BTW, this is the first time to see the word "rescind". I submit a patch to rescind another issue came up with this morning.
-- Yoshiki
etoys-dev@lists.squeakfoundation.org