Hi lists (crossposting to Etoys and Squeak dev due to relevance for both),
One can say that I am sort of taking over from Luke (Gorrie) as OLE Nepal's Etoys performance optimizer (optimizator?). And I'd like to give a status update, but I would especially like some advice on some issues.
And the sooner the better, 'cause we at OLE Nepal are in dire straits: we've got major performance issues, some of which are Etoys related and some of which are Squeak related. The reason for the hurry is that we need a working XO build in a few days, 'cause we have to train teachers, and in the beginning of April we're gonna flash a build to the machines of the children of the pilot schools, who will use it in the classrooms. So without further ado, except for this redundant sentence:
project loading
To start with the known, I've been looking into the project loading times, and have managed to cut them in half by removing the gzipping on the project files and just using the .pr file instead of the zip which claims to be a .pr. Now this might of course not be the perfect solution for all, but I did uncover some practices on which I'd like either the opinion of Squeak people in general or that of the Etoys people in particular.
First, I noticed (together with Luke, who showed me the debugging ropes) that in the method asUnzippedStream, on ReadWriteStream, the method upToEndWithProgressBar uses '/' an awful lot to give the right approximation to the progress bar. It sucked up about 20 percent of the total loading time of an activity. So I changed upToEndWithProgressBar to upToEnd, which seems like a more reasonable default 'cause I'd guess that usually one won't actually use the progress bar functionality.
But that's not the end of it Etoys-wise, 'cause it turns out an Etoys project file is gzipped twice. First the individual file in 'writeForExportWithSources:inDirectory:' and then the containing function writeForExportWithSources:inDirectory: as part of the .pr bundle. And this for a file that already has lots of its contents compressed, seeing that most of the project file is taken up by JPEGs. And this would ideally also be the case for sound files, but more on that later. The size decrease due to zipping in general is not more than 10%. So my question on this topic is: is there an argument for retaining the double zipping? And would there be a general need for a 'just save an uncompressed .pr file, instead of a bundle' option/patch?
project deletion/memory growth
Our second issue is related to memory growth within the Etoys image that's causing problems for the, let us say, memory-challenged XO. First of all we've got a, say, root project, which enumerates the activities which it reads from a few directories. In that root project we've got a script running that deletes child projects. Or it should anyway. The key method here is okToChange, which I guess is a bit of a misnomer. That is, it's got a bunch of code in it that should remove a project and its content from the image, but it doesn't. To make it concrete: we've got this script running in the root project:
unloadActivitiesToFreeSpace
    Project current children do: [:t1 | t1 okToChange]
    "or okToChangeSilently, but okToChange is nice for debugging"
But after removing a project, executing 'Project allInstances' shows that there is still a reference to the project and when one keeps an eye on the memory usage (with an OSX utility), we see that no memory is deallocated. So the image grows and grows.
So my concrete question is: how can one delete a project instance once and for all? I was hoping to at least find some generic deleteInstance method in the image, but I couldn't find one (I'm a bit of a Squeak newbie).
Then it seems to me that the image grows quite fast in general. Is there some known garbage collection problem concerning Squeak or Etoys that I should know about? And related: when I'm gonna try and trim down the image size, any suggestions on where to start? Is there any known fat waiting to be carved off?
Sound format
Another strategy for trimming down the image size is using a compressed sound format. At the moment we use wav for sound, right? If I'm wrong, excuse me, but I haven't had time to investigate the matter too thoroughly. Anyway, since a number of activities have quite a few samples in them, ranging from short utterances to long sentences, it would be very worthwhile if we could use a compressed format. On the mailing list I read rumours about an Ogg plugin and a gstreamer plugin... Is it already possible to use Ogg files in Etoys? If so, how? I'd already be happy with general directions.
Well that's perhaps a bit more than quite enough questions for one post. I'll save some for later. It goes without saying that any advice on these topics would be greatly appreciated.
/Ties
On Mar 25, 2008, at 10:46 AM, Ties Stuij wrote:
To start with the known, I've been looking into the project loading times, and have managed to cut them in half by removing the gzipping on the project files and just using the .pr file in stead of the zip which claims to be a .pr. Now this might of course not be the perfect solution for all, I did uncover some practices on which I'd like either the opinion of Squeak people in general or that of the Etoys people in particular.
When I was building the Sophie storage subsystem for OLPC I noticed the OLPC "hard disk" uses a compressed file system. There are some technical notes on this suggesting developers do NOT zip files, because compressing a file and then storing it on a compressed file system is pointless. On read you then spend CPU time decompressing at the file system level and again decompressing the file itself. Waste of CPU time.
As a compromise, Sophie documents contain, depending on the platform, a zip file which holds uncompressed images and compressed text, or uncompressed images and text if we know the target platform is OLPC.
However I was not sure of the network impact of then passing uncompressed files around.
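The cost of stacking compression layers can be checked with a small experiment. The sketch below is in Python rather than Squeak, and it uses pseudo-random bytes as a stand-in for media that is already compressed (JPEG, Ogg); the function name and payload size are illustrative assumptions, not anything taken from Etoys or Sophie.

```python
import random
import zlib


def layered_sizes(payload, layers=2):
    """Return the size after each successive zlib pass over payload."""
    sizes = []
    data = payload
    for _ in range(layers):
        data = zlib.compress(data)
        sizes.append(len(data))
    return sizes


# Pseudo-random bytes stand in for already-compressed media (assumption).
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(200_000))

sizes = layered_sizes(payload, layers=2)
print("raw:", len(payload), "after each layer:", sizes)
```

On input like this an extra layer typically saves nothing, while every load still has to pay for undoing it.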
First, I noticed (together with Luke, who showed me the debugging ropes) that in the method asUnzippedStream, on ReadWriteStream, the method upToEndWithProgressBar uses '/' an awful lot to give the the right approximation to the progress bar. It sucked about 20 percent of the total loading time of an activity. So I changed upToEndWithProgressBar for upToEnd. Which seems like a more reasonable default 'cause I'd guess that usually one won't actually use the progress bar functionality.
The progress bars interfere with the actual time needed to perform the action. We had loading progress bars for Sophie documents, but found that for small documents the act of showing the progress bar and updating it could take 80% of the load time.
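One possible middle ground between a per-chunk progress bar and no bar at all is to report progress only when the displayed value would actually change. The following Python sketch illustrates that idea under stated assumptions; it is not the Squeak upToEndWithProgressBar code, and `read_with_progress`, its parameters, and the whole-percent step are hypothetical.

```python
import io


def read_with_progress(stream, total, on_progress=None, chunk_size=4096):
    """Read stream to the end, reporting only whole-percent progress changes."""
    chunks = []
    done = 0
    last_percent = -1
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
        done += len(chunk)
        if on_progress is not None and total > 0:
            percent = done * 100 // total  # integer arithmetic only
            if percent != last_percent:
                last_percent = percent
                on_progress(percent)
    return b"".join(chunks)


updates = []
data = bytes(1_000_000)
result = read_with_progress(io.BytesIO(data), len(data), updates.append, chunk_size=1024)
print(len(result), "bytes read with", len(updates), "progress updates")
```

Passing `on_progress=None` gives the plain read-to-end behaviour, so the bar stays optional for callers that do not need it.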
But thats not the end of it Etoys-wise. 'Cause it turns out an Etoys project file is gzipped twice. First the individual file in 'writeForExportWithSources:inDirectory:' and then the containing function writeForExportWithSources:inDirectory: as part of the .pr bundle. And this for a file that already has lots of it's contents compressed, seeing that most of the project file is taken up by jpegs. And this would ideally also be the case for sound files, but more on that later. The size decrease due to zipping in general is not more than 10%. So my question on this topic is: is there an argument for retaining the double zipping? And would there be a general need for a 'just save an uncompressed .pr file, instead of a bundle' option/patch?
See comment above about not zipping JPEGs/image data that is included in the ZIP file. We chose to store media (sound/video) in a separate media folder because it was pointless to unpack it from a zip file when needed. However, this might not be possible when you want all the data in the zip file; still, you could alter the code so that sound/video files aren't compressed by the zip logic when they are added to the zip file.
In Sophie we also have a ZIP file subclass because we found the ZIP storage class was rather inefficient for ZIP files with many members. For example, when you look up and manipulate a zip file member it scans an ordered list looking for the member by name. We altered that to use a dictionary.
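The difference between the two lookup strategies can be sketched as follows. This is Python, not the Sophie or Squeak ZIP classes; both class names are hypothetical, and member names are assumed to be unique.

```python
class MemberList:
    """Archive members looked up by scanning an ordered list (linear time)."""

    def __init__(self):
        self._members = []  # (name, data) pairs in insertion order

    def add(self, name, data):
        self._members.append((name, data))

    def find(self, name):
        for member_name, data in self._members:
            if member_name == name:
                return data
        return None


class IndexedMemberList(MemberList):
    """Same interface, plus a name -> data dictionary for direct lookup."""

    def __init__(self):
        super().__init__()
        self._by_name = {}

    def add(self, name, data):
        super().add(name, data)      # keep the ordered list for iteration
        self._by_name[name] = data   # index by name (names assumed unique)

    def find(self, name):
        return self._by_name.get(name)


plain, indexed = MemberList(), IndexedMemberList()
for i in range(10_000):
    for archive in (plain, indexed):
        archive.add("member%d" % i, i)

print(plain.find("member9999"), indexed.find("member9999"))
```

The scan visits every earlier member before it reaches the last one, while the dictionary version goes straight to it, which matters once an archive holds many members.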
So my concrete question is: how can one for once and for all delete a project instance. I was hoping to at least find some generic deleteInstance method in the image, but I couldn't find one (I'm a bit of a Squeak newbie).
There are programming tools in the image that allow you to see who is holding onto a project; I'll let someone else comment.
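The underlying principle is that an object cannot be reclaimed while anything still refers to it, so "deleting" it means finding and dropping the last holder. The sketch below shows that principle in Python, not with the Squeak tools mentioned above; the `Project` class and the `registry` list are hypothetical stand-ins.

```python
import gc
import weakref


class Project:
    """Hypothetical stand-in for a project object; not the Squeak class."""

    def __init__(self, name):
        self.name = name


registry = []  # some long-lived structure that still points at the project

project = Project("activity-1")
registry.append(project)
probe = weakref.ref(project)  # observes the object without keeping it alive

del project
gc.collect()
alive_while_registered = probe() is not None  # the registry still holds it

# Ask the collector which tracked containers refer to the object.
holders = [r for r in gc.get_referrers(probe()) if r is registry]

registry.clear()  # drop the last strong reference
gc.collect()
alive_after_clear = probe() is not None

print(alive_while_registered, len(holders), alive_after_clear)
```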
Sound format
Another strategy to trimming down the image size is using a compressed sound format. At the moment we use wav for sound right? If I'm wrong, excuse me, but I haven't had time to investigate the matter to thourougly. Anyways, since a number of activities have quite a few samples in them, ranging from short utterings to long sentences, it would be very worth while if we could use a compressed format. From the mailinglist I read rumours about Ogg plugin and a gstreamer plugin... Is it already possible to use Ogg files in Etoys? If so how? I'd already be happy with general directions.
http://tinlizzie.org/olpc/OggPlugin/
I am currently working on a gstreamer plugin for OLPC via funding from Viewpoints Research Institute, and have a target completion date for the end of the month. This plugin will let you build gstreamer pipeline chains and run them. Right now you can, for example, build a pipeline to decode audio/video Ogg files and send the video to an X11 window and the audio to the hardware sound player.
I also hope to have 'sinks' which let you import audio and video into Squeak.
--
John M. McIntosh johnmci@smalltalkconsulting.com
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
On Wed, Mar 26, 2008 at 12:32 AM, John M McIntosh johnmci@smalltalkconsulting.com wrote:
On Mar 25, 2008, at 10:46 AM, Ties Stuij wrote:
When I was building the Sophie storage subsystem for OLPC I noticed the OLPC "hard disk" uses a compressed file system. There is some technical notes on this suggesting developers do NOT zip files because this is pointless: Compress the file, then store on a compressed file system. On read then you use cpu time to uncompress the compressed file, then uncompress the file. Waste of CPU time.
Of course, I remember now I read that somewhere, so that's yet another layer of compression; nice. And yes, for our own projects I already removed the zipping.
See comment above about not zipping JPEGS/image data that is included in the ZIP file. We choose to store media (sound/video) in a separate media folder because it was pointless to unpack from a zip file on need. However this might not be possible when you want all the data in the zip file, still you could alter the code so that sound/video files aren't compressed by the zip logic when they are added to the zip file.
Yes, well as mentioned above, the not zipping isn't a problem. Also 'cause we distribute Epaati in an .xo bundle we've already got a single file for distribution; which is, I'd like to mention just for poetic reasons, yet another layer of compression! And an Etoys project file already houses all the content needed for your project, all of which is loaded at startup. There might indeed be something to say for loading on need though...
There are programming tools in the image that allow you to see who is holding onto a project, I'll let someone else comment.
Ah, I'd very much like to know! Anyone?
http://tinlizzie.org/olpc/OggPlugin/
I am currently working on a gstreamer plugin for OLPC via funding from Viewpoints Research Institute, and have a target completion date for the end of the month. This plugin will let you build gstreamer pipeline chains and run them, right now you can build a pipleline to decode audio/video ogg files for example and send the video to a X11 window and the audio to the hardware sound player.
I also hope to have 'sinks' which let you import audio and video into Squeak.
Excellent, thanks.
/Ties
On Mar 26, 2008, at 6:39 , Ties Stuij wrote:
On Wed, Mar 26, 2008 at 12:32 AM, John M McIntosh johnmci@smalltalkconsulting.com wrote:
There are programming tools in the image that allow you to see who is holding onto a project, I'll let someone else comment.
Ah, I'd very much like to know! Anyone?
Here's a recipe (though in a different context):
http://wiki.squeak.org/squeak/ObsoleteClasses
- Bert -
Hello Ties,
Your email is very interesting. And I am wondering: are JPEG files that are inserted into a Squeak image and then saved in a .pr file stored there as JPEG data, or just as 24-bit RGB form data? I am inclined to think it is the second option; if so, the size impact is huge.
With your project growing, you may want to stop using .pr files and use external files to describe the activity (xml files) plus media data (jpeg, ogg, etc.). I am betting you will see a huge performance boost. But then you are cut off from Etoys.
I am curious to read our Squeak friends' advice.
Hilaire
On Wed, Mar 26, 2008 at 12:43 AM, Hilaire Fernandes hilaire@ofset.org wrote:
Hello Tie,
Your email is very interesting. And I am wondering: does jpeg file inserted in a Squeak image then saved in a .pr file are saved in the .pr file as jpeg data or just 24bits RBG form data? I am enclined to think this is the second option, if so the size impact is hudge.
Hmm, yes, interesting question. I didn't investigate, but judging from the number of pictures compared to the file size, I guessed they were stored compressed in one way or another. Perhaps somebody with knowledge could shed some light?
With your project growing, you may want to stop using .pr file and use external file to describe the activity (xml files) plus media data (jpeg, ogg, etc.). I am betting you will see a hudge performence boost. But then you are cut from Etoys..
I am currious to read Squeak friends advices.
As am I,
/Ties
Ties Stuij wrote:
On Wed, Mar 26, 2008 at 12:43 AM, Hilaire Fernandes hilaire@ofset.org wrote:
Hello Tie,
Your email is very interesting. And I am wondering: does jpeg file inserted in a Squeak image then saved in a .pr file are saved in the .pr file as jpeg data or just 24bits RBG form data? I am enclined to think this is the second option, if so the size impact is hudge.
Hmm, yes interesting question. I didn't investigate, but judging from the amount of pictures compared to the file size, I guessed they were stored compressed in one way or another. Perhaps somebody with knowledge could shed some light?
It's compressed. When you load a form via Form>>fromFileNamed: the project's resource manager will remember the original bits (the jpeg file). Only if the reference gets lost (the file deleted, the image moved) will it use the uncompressed bits.
With your project growing, you may want to stop using .pr file and use external file to describe the activity (xml files) plus media data (jpeg, ogg, etc.). I am betting you will see a hudge performence boost. But then you are cut from Etoys..
I am currious to read Squeak friends advices.
As am I,
I would not recommend going down this path if you're new to Squeak and don't know what you are buying into. The .pr files work because half a dozen people spent a couple of years to make all of this stuff work. You'd be pretty much on your own recreating this effort so unless you have sufficient resources just try to deal with the inefficiencies and leave the architecture alone.
Coincidentally, in your previous message you noticed the double-compression problem. What I would do is simply change the code slightly such that it uses zip with no compression (which is one option with zip files). This allows you to reuse the whole infrastructure without paying the price of compressing/decompressing everything.
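For illustration, here is what "zip with no compression" looks like with Python's standard zipfile module rather than the Squeak ZIP classes. The member names below are hypothetical; a real .pr bundle may be laid out differently.

```python
import io
import zipfile


def write_stored_bundle(members):
    """Build a zip archive in memory whose members are stored, not deflated."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as bundle:
        for name, data in members.items():
            bundle.writestr(name, data)
    return buffer.getvalue()


# Hypothetical member names and contents, for demonstration only.
members = {"project.pr": b"segment bytes " * 1000, "manifest.txt": b"name: demo\n"}
archive = write_stored_bundle(members)

# Any ordinary zip reader can still open the result.
with zipfile.ZipFile(io.BytesIO(archive)) as bundle:
    for info in bundle.infolist():
        print(info.filename, info.compress_type == zipfile.ZIP_STORED,
              info.file_size, info.compress_size)
```

The container format and everything that reads it stay the same; only the per-member compression method changes, so loading no longer has to inflate anything.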
Cheers, - Andreas
2008/3/26, Andreas Raab andreas.raab@gmx.de:
Ties Stuij wrote:
On Wed, Mar 26, 2008 at 12:43 AM, Hilaire Fernandes hilaire@ofset.org wrote:
Hello Tie,
Your email is very interesting. And I am wondering: does jpeg file inserted in a Squeak image then saved in a .pr file are saved in the .pr file as jpeg data or just 24bits RBG form data? I am enclined to think this is the second option, if so the size impact is hudge.
Hmm, yes interesting question. I didn't investigate, but judging from the amount of pictures compared to the file size, I guessed they were stored compressed in one way or another. Perhaps somebody with knowledge could shed some light?
It's compressed. When you load a form via Form>>fromFileNamed: the project's resource manager will remember the original bits (the jpeg file). Only if the reference gets lost (the file deleted, the image moved) it will use the uncompressed bits.
To be precise, what happens if one drags and drops a jpeg file into an image, then saves a project file, and moves the project to another computer without the original jpeg file? Is the jpeg file saved in the .pr zip file as well?
As a concrete simple example, how can I examine the content of this project file http://swiki.ofset.org:8000/super/36 (produced with a 3.8 image)?
It uses 3 graphics and I am really curious to know what is inside the .pr file.
The info I got from the .pr is:
hilaire@tice:/tmp$ file ENGRENAGE4.001.pr
ENGRENAGE4.001.pr: gzip compressed data, from FAT filesystem (MS-DOS, OS/2, NT)
Once gunzipped, I can't examine the content...
Hilaire
On Mar 26, 2008, at 9:28 , Hilaire Fernandes wrote:
As a concrete simple example, how can I examine the content of this project file http://swiki.ofset.org:8000/super/36 (produced with a 3.8 image)?
It use 3 graphics and I am really curious to know what inside the .pr file.
The info I got from the .pr is: hilaire@tice:/tmp$ file ENGRENAGE4.001.pr ENGRENAGE4.001.pr: gzip compressed data, from FAT filesystem (MS- DOS, OS/2, NT)
Once gunzipped, I can't examine the content...
Why not? After gunzipping, you can inspect what is in it:
(FileStream readOnlyFileNamed:'ENGRENAGE4.001') fileInObjectAndCode
You get an ImageSegment containing 9195 objects with 250 outpointers. The objects have these classes:
(arrayOfRoots collect: [:each | each class]) asBag sortedCounts
a SortedCollection(1176->Point 1146->Association 1121->Array 908->Rectangle 589->MorphExtension 555->Color 519->IdentityDictionary 434->Bitmap 342->TableLayout 342->TableLayoutProperties 328->Float 326->ByteString 167->SimpleBorder 141->Morph 139->ImageMorph 134->StringMorph 90->EventHandler 78->TileMorph 72->TilePadMorph 51->AlignmentMorph 42->PhraseTileMorph 39->SketchMorph 38->OrderedCollection 36->IconicButton 33->UpdatingStringMorph 32->CompiledMethod 24->SimpleButtonMorph 21->AssignmentTileMorph 18->DiskProxy 17->WeakArray 13->StepMessage 12->ScriptEditorMorph 12->ScriptInstantiation 12->TickIndicatorMorph 12->LinedTTCFont 12->UpdatingSimpleButtonMorph 12->UpdatingThreePhaseButtonMorph 12->UniclassScript 12->ScriptStatusControl 11->ScriptActivationButton 9->SlotInformation 9->ByteArray 6->Form 5->TTCFont 4->TranslucentColor 4->ShortPointArray 4->ActorState 4->TTGlyph 4->PasteUpMorph 3->MorphicTransform 3->Metaclass 3->MethodDictionary 3->ColorForm 3->TransformationMorph 3->ScriptNameTile 3->ColorArray 3->ClassOrganizer 3->ThumbnailMorph 3->NumericReadoutTile 3->ViewerFlapTab 3->LayoutProperties 3->WatcherWrapper 2->MethodContext 2->Heap 2->Dictionary 2->DamageRecorder 2->HandMorph 2->BlockContext 1->Player78 1->LocaleID 1->Player79 class 1->ChangeSet 1->Player78 class 1->MouseEvent 1->WorldState 1->TranscriptStream 1->Project 1->Set 1->UnscriptedPlayer 1->LargePositiveInteger 1->Player77 class 1->Player77 1->MacRomanInputInterpreter 1->Presenter 1->Player79)
Or you could have a look which forms are in the segment:
arrayOfRoots select: [:each | each isKindOf: Form]
#(Form(197x132x16) ColorForm(443x444x8) ColorForm(152x152x8) Form(287x287x16) ColorForm(287x287x8) Form(152x152x16) Form(9x11x16) Form(443x444x16) Form(9x11x16))
These appear to be your 3 gears as originally imported (the 8-bit forms) plus the same but with a painted dot on top (the 16-bit forms).
And so on.
- Bert -
Hilaire,
To be accurate, what happen if one drag and drop a jpeg file in an image, then save a project file, move the project in another computer without the original jpeg file? Is the jpeg file saved in the .pr zip file as well?
Currently, it forgets the original format and all are treated as the same Form.
I have considered keeping the original format, and if it was JPEG, we could even keep the original bytes of the JPEG data. The catch is that if the user has painted on the Sketch created from the JPEG, determining whether it makes sense to store the result as JPEG is not trivial.
-- Yoshiki
2008/3/27, Yoshiki Ohshima yoshiki@vpri.org:
Hilaire,
To be accurate, what happen if one drag and drop a jpeg file in an image, then save a project file, move the project in another computer without the original jpeg file? Is the jpeg file saved in the .pr zip file as well?
Currently, it forgets the original format and all are treated as the same Form.
So we clearly have an issue there!
I contemplated that we can keep the original format, and if it was JPEG, we could even keep the original bytes of JPEG data. The catch is that if the user painted on the Sketch created from JPEG, determining whether to make sense to store the result in JPEG or not is not trivial.
Indeed not trivial. A simple and helpful first step would be to keep unmodified images (JPEG, PNG, GIF) in their original format. It is very likely that other users of Squeak will complain about this problem (.pr files getting big pretty quickly).
Hilaire
On Mar 26, 2008, at 7:59 , Andreas Raab wrote:
Ties Stuij wrote:
On Wed, Mar 26, 2008 at 12:43 AM, Hilaire Fernandes <hilaire@ofset.org> wrote:
Hello Tie,
Your email is very interesting. And I am wondering: does jpeg file inserted in a Squeak image then saved in a .pr file are saved in the .pr file as jpeg data or just 24bits RBG form data? I am enclined to think this is the second option, if so the size impact is hudge.
Hmm, yes interesting question. I didn't investigate, but judging from the amount of pictures compared to the file size, I guessed they were stored compressed in one way or another. Perhaps somebody with knowledge could shed some light?
It's compressed. When you load a form via Form>>fromFileNamed: the project's resource manager will remember the original bits (the jpeg file). Only if the reference gets lost (the file deleted, the image moved) it will use the uncompressed bits.
I'm almost sure (though not 100%) that this code is not active anymore; forms are nowadays always stored as plain objects. It's easy to test though: make a project, load a jpeg, save the project, unzip it. If the code were still active, there would have to be a jpeg in the zip.
With your project growing, you may want to stop using .pr file and use external file to describe the activity (xml files) plus media data (jpeg, ogg, etc.). I am betting you will see a hudge performence boost. But then you are cut from Etoys..
I am currious to read Squeak friends advices.
As am I,
I would not recommend going down this path if you're new to Squeak and don't know what you are buying into. The .pr files work because half a dozen people spent a couple of years to make all of this stuff work. You'd be pretty much on your own recreating this effort so unless you have sufficient resources just try to deal with the inefficiencies and leave the architecture alone.
Actually, Yoshiki's new s-expression based project format does pretty much exactly that.
- Bert -
Bert Freudenberg wrote:
It's compressed. When you load a form via Form>>fromFileNamed: the project's resource manager will remember the original bits (the jpeg file). Only if the reference gets lost (the file deleted, the image moved) it will use the uncompressed bits.
I'm almost sure (though not 100%) that this code is not active anymore, forms are nowadays alway stored as plain objects. It's easy to test though - make a project, load a jpeg, save the project, unzip it - there would have to be a jpeg in the zip.
Oops. Shows how up to date I am on this code base ;-)
I would not recommend going down this path if you're new to Squeak and don't know what you are buying into. The .pr files work because half a dozen people spent a couple of years to make all of this stuff work. You'd be pretty much on your own recreating this effort so unless you have sufficient resources just try to deal with the inefficiencies and leave the architecture alone.
Actually, Yoshiki's new s-expression based project format does pretty much exactly that.
Yes, but I would still not recommend anyone start such a project until they really understand what they're getting themselves into ;-)
Cheers, - Andreas
On Mar 26, 2008, at 6:42 , Ties Stuij wrote:
On Wed, Mar 26, 2008 at 12:43 AM, Hilaire Fernandes hilaire@ofset.org wrote:
Hello Tie,
Your email is very interesting. And I am wondering: does jpeg file inserted in a Squeak image then saved in a .pr file are saved in the .pr file as jpeg data or just 24bits RBG form data? I am enclined to think this is the second option, if so the size impact is hudge.
Hmm, yes interesting question. I didn't investigate, but judging from the amount of pictures compared to the file size, I guessed they were stored compressed in one way or another. Perhaps somebody with knowledge could shed some light?
As far as I know the forms are hibernated (that is, the form's Bitmap is run-length encoded into a ByteArray) and stored as-is in the image segment, which is then zipped. Depending on the picture contents this compresses quite well.
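Why the result depends so much on picture contents can be seen with a generic run-length encoder. The Python sketch below uses a simple (run length, value) byte-pair scheme chosen for illustration; it is an assumption and not the actual encoding Squeak uses when hibernating a Form.

```python
def rle_encode(data):
    """Encode data as (run_length, value) byte pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        out += bytes((run, value))
        i += run
    return bytes(out)


def rle_decode(encoded):
    """Invert rle_encode."""
    out = bytearray()
    for j in range(0, len(encoded), 2):
        run, value = encoded[j], encoded[j + 1]
        out += bytes([value]) * run
    return bytes(out)


flat = bytes([7]) * 1000 + bytes([9]) * 24  # large uniform areas
noisy = bytes(range(256)) * 4               # no two equal neighbours
for name, pixels in (("flat", flat), ("noisy", noisy)):
    packed = rle_encode(pixels)
    assert rle_decode(packed) == pixels
    print(name, len(pixels), "->", len(packed))
```

Flat, drawing-like pixel data shrinks dramatically, while photo-like data with no repeated neighbours can even grow under a scheme this naive.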
- Bert -
Ties Stuij wrote:
He lists (crossposting to Etoys and Squeak dev due to relevance for both),
One can say that I am sort of taking over from Luke (Gorrie) as OLE Nepal's Etoys performance optimizer (optimizator?). And I'd like to give a status update, but I would especially like some advice on some issues.
And the sooner the better. Cause we at OLE Nepal are in dire straits; we've got major performance issues, some of which are Etoys related, and some of which are Squeak related. And the reason for the speed is that we're gonna have to have a working XO build in a few days cause we have to train teachers, and in te beginning of April we're gonna flash a build to the machines of the children of the pilot schools, who will use it in the classrooms. So without further ado, except for this redundant sentence:
project loading
To start with the known, I've been looking into the project loading times, and have managed to cut them in half by removing the gzipping on the project files and just using the .pr file in stead of the zip which claims to be a .pr. Now this might of course not be the perfect solution for all, I did uncover some practices on which I'd like either the opinion of Squeak people in general or that of the Etoys people in particular.
First, I noticed (together with Luke, who showed me the debugging ropes) that in the method asUnzippedStream, on ReadWriteStream, the method upToEndWithProgressBar uses '/' an awful lot to give the the right approximation to the progress bar. It sucked about 20 percent of the total loading time of an activity. So I changed upToEndWithProgressBar for upToEnd. Which seems like a more reasonable default 'cause I'd guess that usually one won't actually use the progress bar functionality.
But that's not the end of it Etoys-wise. 'Cause it turns out an Etoys project file is gzipped twice. First the individual file in 'writeForExportWithSources:inDirectory:' and then the containing function writeForExportWithSources:inDirectory: as part of the .pr bundle. And this for a file that already has lots of its contents compressed, seeing that most of the project file is taken up by jpegs. And this would ideally also be the case for sound files, but more on that later. The size decrease due to zipping in general is not more than 10%. So my question on this topic is: is there an argument for retaining the double zipping? And would there be a general need for a 'just save an uncompressed .pr file, instead of a bundle' option/patch?
Sounds like you are onto something. There are some dark corners in the project saving and loading classes, and they need some TLC.
project deletion/memory growth
Our second issue is related to memory growth within the Etoys image that's causing problems for the, let us say, memory-challenged XO. First of all we've got a, say, root project, which enumerates the activities which it reads from a few directories. In that root project we've got a script running that deletes child projects. Or it should anyway. The key method here is okToChange, which I guess is a bit of a misnomer. That is, it's got a bunch of code in it that should remove a project and its content from the image, but it doesn't. To make it concrete: we've got this script running in the root project:
unloadActivitiesToFreeSpace
    Project current children do: [:t1 | t1 okToChange]
    "or okToChangeSilently, but okToChange is nice for debugging"
But after removing a project, executing 'Project allInstances' shows that there is still a reference to the project and when one keeps an eye on the memory usage (with an OSX utility), we see that no memory is deallocated. So the image grows and grows.
So my concrete question is: how can one delete a project instance once and for all? I was hoping to at least find some generic deleteInstance method in the image, but I couldn't find one (I'm a bit of a Squeak newbie).
Then it seems to me that the image grows quite fast in general. Is there some known garbage collection problem concerning Squeak or Etoys that I should know about? And related: when I'm gonna try and trim down the image size, any suggestions on where to start? Is there any known fat waiting to be carved off?
Sound format
Another strategy for trimming down the image size is using a compressed sound format. At the moment we use wav for sound, right? If I'm wrong, excuse me, but I haven't had time to investigate the matter too thoroughly. Anyways, since a number of activities have quite a few samples in them, ranging from short utterances to long sentences, it would be very worthwhile if we could use a compressed format. On the mailing list I read rumours about an Ogg plugin and a gstreamer plugin... Is it already possible to use Ogg files in Etoys? If so, how? I'd already be happy with general directions.
Ogg and Speex should work. There is an Ogg plugin and supporting classes for handling the sounds. I made a sound library tool that lets you compress the sounds with different codecs once they are imported into Etoys.
Karl
Well that's perhaps a bit more than quite enough questions for one post. I'll save some for later. It goes without saying that any advice on these topics would be greatly appreciated.
/Ties
On Wed, Mar 26, 2008 at 1:44 AM, karl karl.ramberg@comhem.se wrote:
Sounds like you are onto something. There are some dark corners in the project saving and loading classes, and they need some TLC.
Could you elaborate on this?
/Ties
Ties Stuij wrote:
Sound format
Another strategy for trimming down the image size is using a compressed sound format. At the moment we use wav for sound, right? If I'm wrong, excuse me, but I haven't had time to investigate the matter too thoroughly. Anyways, since a number of activities have quite a few samples in them, ranging from short utterances to long sentences, it would be very worthwhile if we could use a compressed format. On the mailing list I read rumours about an Ogg plugin and a gstreamer plugin... Is it already possible to use Ogg files in Etoys? If so, how? I'd already be happy with general directions.
You have to load this fix to get the SoundLibraryTool access to file compression: https://dev.laptop.org/attachment/ticket/5353/SoundLibraryCompress.10.cs
Select the sound in the tool, bring up the halo menu, and compress it with the GSM, Ogg or Speex codec.
Karl
On Wed, Mar 26, 2008 at 2:10 AM, karl karl.ramberg@comhem.se wrote:
You have to load this fix to get the SoundLibraryTool access to file compression: https://dev.laptop.org/attachment/ticket/5353/SoundLibraryCompress.10.cs
Select the sound in the tool, bring up the halo menu, and compress it with the GSM, Ogg or Speex codec.
Karl
Thanks!
/Ties
Another strategy for trimming down the image size is using a compressed sound format. At the moment we use wav for sound, right? If I'm wrong, excuse me, but I haven't had time to investigate the matter too thoroughly. Anyways, since a number of activities have quite a few samples in them, ranging from short utterances to long sentences, it would be very worthwhile if we could use a compressed format. On the mailing list I read rumours about an Ogg plugin and a gstreamer plugin... Is it already possible to use Ogg files in Etoys? If so, how? I'd already be happy with general directions.
You have to load this fix to get the SoundLibraryTool access to file compression: https://dev.laptop.org/attachment/ticket/5353/SoundLibraryCompress.10.cs
Select the sound in the tool, bring up the halo menu, and compress it with the GSM, Ogg or Speex codec.
Sorry, I wasn't following the ticket closely. This seems to be good for inclusion. Also, you can choose the compression when recording from the sound recorder.
-- Yoshiki
On Mar 25, 2008, at 18:46 , Ties Stuij wrote:
So my question on this topic is: is there an argument for retaining the double zipping?
Don't think so.
And would there be a general need for a 'just save an uncompressed .pr file, instead of a bundle' option/patch?
Possibly. Changesets welcome :)
project deletion/memory growth
Our second issue is related to memory growth within the Etoys image that's causing problems for the, let us say, memory-challenged XO. First of all we've got a, say, root project, which enumerates the activities which it reads from a few directories. In that root project we've got a script running that deletes child projects. Or it should anyway. The key method here is okToChange, which I guess is a bit of a misnomer. That is, it's got a bunch of code in it that should remove a project and its content from the image, but it doesn't. To make it concrete: we've got this script running in the root project:
unloadActivitiesToFreeSpace
    Project current children do: [:t1 | t1 okToChange]
    "or okToChangeSilently, but okToChange is nice for debugging"
But after removing a project, executing 'Project allInstances' shows that there is still a reference to the project and when one keeps an eye on the memory usage (with an OSX utility), we see that no memory is deallocated. So the image grows and grows.
Well, you also have to invoke the garbage collector (it will not kick in if there is loads of memory still usable). And the OSX utility cannot tell which memory is actually used. Look at the VM statistics (in the help menu).
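As a minimal workspace sketch of that advice (not taken from the thread; the variable names are illustrative), one could force a full collection after running the unload script and compare the number of surviving Project instances. It uses only `Smalltalk garbageCollect` and the `Project allInstances` expression already mentioned above:

```smalltalk
"Workspace sketch: run the unload script first, then evaluate this."
| before after |
before := Project allInstances size.
Smalltalk garbageCollect.    "force a full GC instead of waiting for memory pressure"
after := Project allInstances size.
Transcript
    show: 'Project instances before GC: ', before printString,
        ', after GC: ', after printString;
    cr.
```

If the count does not drop, something in the image is still referencing the removed projects, and no amount of collecting will free them.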
So my concrete question is: how can one delete a project instance once and for all? I was hoping to at least find some generic deleteInstance method in the image, but I couldn't find one (I'm a bit of a Squeak newbie).
You need to find out what's holding onto the project, break that reference, then the garbage collector will do its job.
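One possible way to look for that reference is sketched below. It assumes that the image contains the PointerFinder tool and that `PointerFinder on:` opens it on the given object; that should be checked in the image at hand, since it is not confirmed anywhere in this thread. The project name 'SomeActivity' is a hypothetical placeholder.

```smalltalk
"Sketch: find a project that should already be gone and chase what points to it.
 'SomeActivity' is a placeholder name; PointerFinder on: is an assumption."
| leftover |
Smalltalk garbageCollect.
leftover := Project allInstances
    detect: [:each | each name = 'SomeActivity']
    ifNone: [nil].
leftover ifNotNil: [PointerFinder on: leftover].
```

Note that the workspace variable and the tool itself also hold on to the project while they are open, so they need to be closed or cleared before checking again.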
Then it seems to me that the image grows quite fast in general. Is there some known garbage collection problem concerning Squeak or Etoys that I should know about? And related: when I'm gonna try and trim down the image size, any suggestions on where to start? Is there any known fat waiting to be carved off?
Certainly - though image stripping is an art that tends to be forgotten with today's machines equipped with gigabytes of memory.
You should start by looking at what is making your image larger than expected - SpaceTally>>printSpaceAnalysis might help.
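A minimal sketch of how that method might be invoked from a workspace follows. In the Squeak images I am aware of it writes a per-class report to a text file next to the image rather than to the Transcript, but the exact output location is an assumption to verify locally.

```smalltalk
"Sketch: collect garbage first so the report only counts live objects,
 then produce the per-class space analysis."
Smalltalk garbageCollect.
SpaceTally new printSpaceAnalysis.
```

Running it once in a fresh image and once after loading and unloading a few activities gives two reports that can be compared class by class.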
- Bert -
Bert Freudenberg wrote:
On Mar 25, 2008, at 18:46 , Ties Stuij wrote:
So my question on this topic is: is there an argument for retaining the double zipping?
Don't think so.
But some internal compression could help.
Here are two test cases:
I made a test project, added a 255k JPEG image to it, and saved it. The project was 1.6M.
I made another test project, added a 1.7M AIFF file, and compressed the file with Ogg. The saved project was 86k.
Could compressing the forms inside the image, like we compress sounds, help?
Karl
Hello,
I published a few changes in the last couple of weeks; I just looked at some inefficiencies and avoided them.
I compared the project loading time in Etoys 2.3 (before Luke's method dictionary patch) and Etoys 3.0 #1957. I loaded all projects in the ExampleEtoys directory like this:
-------------------
| dir entries proj |
dir _ FileDirectory on: '/usr/share/etoys/ExampleEtoys'.
entries _ FileList2 projectOnlySelectionMethod: dir entries.
entries _ entries collect: [:each | Project parseProjectFileName: each first].
entries do: [:each |
    Transcript show: '\classic: ' withCRs , each first, ' ',
        ([ProjectLoading
            loadFromDir: '/usr/share/etoys/ExampleEtoys'
            projectName: each first] timeToRun) printString.
    proj _ (Project named: each first).
    proj ifNotNil: [proj okToChangeSilently].
].
-------------------
I got the following results (times are in milliseconds, averaged over three runs):
Project                 Etoys 2.3   Etoys 3.0 #1957   Ratio
BallDropAnalysis1       13850.3     9971              1.38906
BetterMovieUI           9662.67     6145              1.57244
BouncingBallAnimation   4825.33     3151.67           1.53104
CarAndPen               5676.33     3504.67           1.61965
ComputerLogicGame       14902.7     7080.33           2.1048
DemonCastle1            24442.7     11497.7           2.12588
EtoysChallenge          19955.7     10435.7           1.91226
FishAndPlankton         12526       9573.33           1.30843
FollowRoad              5532.33     3869.33           1.42979
JustPaintedCar          3777        2237              1.68842
LunarLanderGame         7730        5286.67           1.46217
MakeAMovie              8909.67     5718              1.55818
MiddleOfRoad            6056        3554              1.704
ParticlesDyeInWater     11483.3     5672.33           2.02445
ParticlesEpidemic       9368.67     5116.33           1.83113
ParticlesGasModel       10688       4668.67           2.2893
RandomRacing            7013.33     4263.33           1.64504
SalmonSniff             5943        3912.33           1.51904
SimpleSprings           5563        4566.67           1.21818
SpeedAcceleration       6150.67     4421.33           1.39113
StartOfDTPDocument      4577.67     3090.67           1.48113
SteeringTheCar          6285.67     4388.67           1.43225
TurtleGeometry          6117.67     3903.33           1.56729
Welcome                 22904       11830.3           1.93604
I haven't done anything about the double compression problem (putting media files aside), nor have I looked at the saving side. But project loading is now 20%-100% faster than before, and bigger projects seem to benefit more.
I have been experimenting with another new format based on S-expressions, but found that for bigger projects the new format is much slower. Unless there is a way to optimize it, or some other reason to switch, we will probably stick with the old format for now...
I'll try to modify the code so that the .pr file does not have to be compressed and see how much we gain.
-- Yoshiki
Yoshiki Ohshima wrote:
I haven't done anything about the double compression problem (putting media files aside), nor have I looked at the saving side. But project loading is now 20%-100% faster than before, and bigger projects seem to benefit more.
Great progress. Do the big projects contain lots of forms? Sound instances are compressed very efficiently with the Ogg plugin, and maybe something similar would be possible for forms.
Karl
I have been experimenting with another new format based on S-expressions, but found that for bigger projects the new format is much slower. Unless there is a way to optimize it, or some other reason to switch, we will probably stick with the old format for now...
I'll try to modify the code so that the .pr file does not have to be compressed and see how much we gain.
-- Yoshiki
_______________________________________________
Etoys mailing list
Etoys@lists.laptop.org
http://lists.laptop.org/listinfo/etoys
etoys-dev@lists.squeakfoundation.org