Hi all,
I am responsible for the re-implementation of GIFReadWriter in Pharo from a few years ago (see this PR https://github.com/pharo-project/pharo/pull/1666). Earlier this week I was able to successfully port most of that work to Cuis, so I thought I'd take a stab at Squeak too.
Unlike Pharo and Cuis, however, Squeak already has a class `AnimatedImageMorph`. I wanted to discuss my proposed changes here for this reason, prior to submitting any formal changes to the inbox.
So here's what I've done so far:
First, as with Pharo and Cuis, `GIFReadWriter` has been completely refactored to have a more clear and more complete implementation of the GIF format. This includes separating out the compression activity into two helper classes (`LzwGifEncoder` and `LzwGifDecoder`), as well as completely removing the existing `AnimatedGIFReadWriter` class. I have also added the `AnimatedImageFrame` class for composing GIF frame information that can be used by other classes -- both for writing and for displaying.
Second, I have refactored the existing `AnimatedImageMorph` in Squeak to better deal with displaying real-life animated GIFs. Prior to this, many GIFs would display incorrectly due to the nature of GIF frame compositing and disposal. All of the test GIFs I've tried so far (and there have been a lot) seem to be working well with this class.
Third, I have created an alternative class -- presumptuously named `BetterAnimatedImageMorph` -- that has some optimizations. The regular `AnimatedImageMorph` does the GIF frame/form compositing "on the fly," which results in a lot of extra computation each time the animation loops. The upshot here is that we cannot achieve real framerates for some GIFs -- they always play slower in Squeak than they would in a browser regardless of how low one sets the stepTime/delay. `BetterAnimatedImageMorph` takes a different approach: it loops through all the GIFReadWriter's read-in frames and first creates a collection of fully composited Forms. It then simply changes the current form to be displayed at each step. In my tests this has resulted in much more performant playback, of which I have provided examples (see below).
One issue with `BetterAnimatedImageMorph` is that the original information about the constituent forms of the GIF is "lost". Attempting to write it back to a GIF file will necessarily result in a larger file than the original, because most animated GIFs are made of frames that are "diffed" in a sense. There are a couple of ways to deal with this though, such as caching the original frame data etc.
My concrete proposals, then, would be the following: 1. Remove AnimatedGIFReadWriter and replace GIFReadWriter with my updated version (along with helper classes) 2. Replace AnimatedImageMorph with my new BetterAnimatedImageMorph
If you've read this far and have the patience to try it out, I have my working code up online at https://gitlab.com/darth-cheney/squeak-animated-gif
Load this code in Squot and check it out. Try some of the examples in the FunGIFExamples class (on the class side). In particular, any class method starting with `example` should be good fun.
For a live comparison of the performance between my `BetterAnimatedImageMorph` and the current `AnimatedImageMorph`, do `FunGIFExamples animatedImageMorphComparison` and you should be able to see the issue right away.
Any feedback is greatly appreciated here, and hopefully we can get this or something like it into the main image. GIFs are too fun to leave out!
Hi Eric,
I cannot speak for the community, but I am very glad that you are addressing this domain - GIFs, especially animated ones, are a great application and demonstration of the Morphic values, and there have been several issues with the old implementation mentioned on this list during the last years. I have been using AnimatedImageMorph myself recently for TelegramBot and was a bit sad about how many GIFs could not be processed properly. I have tried out your changes and they look very beautiful, most impressively #animatedImageMorphComparison!
Here is a naive list of things that need to be done to get your changes into the Trunk:
* compatibility checks - are there any breaking changes in terms of behavior, or has the public protocol been changed? Maybe it would be worth improving compatibility, adding a few "deprecated" methods, autc. For instance, probably we should re-add an empty subclass AnimatedGIFReadWriter in the 60Deprecated/61Deprecated package to not break existing code that relies on it. Another example, #openGIFInWindow:/#fromStream: are not understood by BetterAnimatedImageMorph class. * release scheduling - this looks like a bigger (though very valuable) change. We should discuss whether we want to integrate the patch before or after the upcoming 6.0 release. * code review - not sure whether there is anyone who would like to have a detailed look at your patch before it is integrated. But as you say it is "tested by the crowd" on two different Smalltalk systems, my personal opinion would be that this should not be a reason to block great contributions.
@all trunk maintainers: What do you think? Did I miss any other step? :-)
Best, Christoph
PS: Have you or someone else already tried to optimize the new GIFReadWriter for performance? I am having a 36 MB large GIFhttps://github.com/LinqLover/TelegramSmalltalkBot/blob/master/img/screencast.gif here which takes eternities (> 10 minutes) to file in on my machine. With a 9 MB small GIF, it still takes a few minutes. And the RAM consumption of my image is noticeably high. But after that, I can play the GIF with ca. 25 FPS. However, the old implementation was not faster either.
________________________________ Von: Squeak-dev squeak-dev-bounces@lists.squeakfoundation.org im Auftrag von Eric Gade eric.gade@gmail.com Gesendet: Samstag, 27. November 2021 19:02 Uhr An: The general-purpose Squeak developers list Betreff: [squeak-dev] [Discussion] GIF and Animated GIF Refactoring
Hi all,
I am responsible for the re-implementation of GIFReadWriter in Pharo from a few years ago (see this PRhttps://github.com/pharo-project/pharo/pull/1666). Earlier this week I was able to successfully port most of that work to Cuis, so I thought I'd take a stab at Squeak too.
Unlike Pharo and Cuis, however, Squeak already has a class `AnimatedImageMorph`. I wanted to discuss my proposed changes here for this reason, prior to submitting any formal changes to the inbox.
So here's what I've done so far:
First, as with Pharo and Cuis, `GIFReadWriter` has been completely refactored to have a more clear and more complete implementation of the GIF format. This includes separating out the compression activity into two helper classes (`LzwGifEncoder` and `LzwGifDecoder`), as well as completely removing the existing `AnimatedGIFReadWriter` class. I have also added the `AnimatedImageFrame` class for composing GIF frame information that can be used by other classes -- both for writing and for displaying.
Second, I have refactored the existing `AnimatedImageMorph` in Squeak to better deal with displaying real-life animated GIFs. Prior to this, many GIFs would display incorrectly due to the nature of GIF frame compositing and disposal. All of the test GIFs I've tried so far (and there have been a lot) seem to be working well with this class.
Third, I have created an alternative class -- presumptuously named `BetterAnimatedImageMorph` -- that has some optimizations. The regular `AnimatedImageMorph` does the GIF frame/form compositing "on the fly," which results in a lot of extra computation each time the animation loops. The upshot here is that we cannot achieve real framerates for some GIFs -- they always play slower in Squeak than they would in a browser regardless of how low one sets the stepTime/delay. `BetterAnimatedImageMorph` takes a different approach: it loops through all the GIFReadWriter's read-in frames and first creates a collection of fully composited Forms. It then simply changes the current form to be displayed at each step. In my tests this has resulted in much more performant playback, of which I have provided examples (see below).
One issue with `BetterAnimatedImageMorph` is that the original information about the constituent forms of the GIF is "lost". Attempting to write it back to a GIF file will necessarily result in a larger file than the original, because most animated GIFs are made of frames that are "diffed" in a sense. There are a couple of ways to deal with this though, such as caching the original frame data etc.
My concrete proposals, then, would be the following: 1. Remove AnimatedGIFReadWriter and replace GIFReadWriter with my updated version (along with helper classes) 2. Replace AnimatedImageMorph with my new BetterAnimatedImageMorph
If you've read this far and have the patience to try it out, I have my working code up online at https://gitlab.com/darth-cheney/squeak-animated-gif
Load this code in Squot and check it out. Try some of the examples in the FunGIFExamples class (on the class side). In particular, any class method starting with `example` should be good fun.
For a live comparison of the performance between my `BetterAnimatedImageMorph` and the current `AnimatedImageMorph`, do `FunGIFExamples animatedImageMorphComparison` and you should be able to see the issue right away.
Any feedback is greatly appreciated here, and hopefully we can get this or something like it into the main image. GIFs are too fun to leave out!
-- Eric
Hi Christoph,
On Sat, Nov 27, 2021 at 2:12 PM Thiede, Christoph < Christoph.Thiede@student.hpi.uni-potsdam.de> wrote:
- compatibility checks - are there any breaking changes in terms of
behavior, or has the public protocol been changed? Maybe it would be worth improving compatibility, adding a few "deprecated" methods, autc. For instance, probably we should re-add an empty subclass AnimatedGIFReadWriter in the 60Deprecated/61Deprecated package to not break existing code that relies on it. Another example, #openGIFInWindow:/#fromStream: are not understood by BetterAnimatedImageMorph class.
I agree. None of this replacement should happen without preserving that polymorphism, at least for the time being. I should probably write a few tests along the way (I have only a handful from previous ports)
- code review - not sure whether there is anyone who would like to
have a detailed look at your patch before it is integrated. But as you say it is "tested by the crowd" on two different Smalltalk systems, my personal opinion would be that this should not be a reason to block great contributions.
I believe I have some tests in ports to the other Smalltalks but I'll probably have to refactor them. In particular, there is and should be a test for reading a GIF, writing it out, reading it back in, and having the correct pixels at expected locations.
PS: Have you or someone else already tried to optimize the new
GIFReadWriter for performance? I am having a 36 MB large GIF https://github.com/LinqLover/TelegramSmalltalkBot/blob/master/img/screencast.gif here which takes eternities (> 10 minutes) to file in on my machine. With a 9 MB small GIF, it still takes a few minutes. And the RAM consumption of my image is noticeably high. But after that, I can play the GIF with ca. 25 FPS. However, the old implementation was not faster either.
It's actually worse than you think -- I cannot spy on the process of parsing your large GIF file without my Squeak process eventually killing itself, probably due to memory consumption. As it stands, I don't know the exact reason the memory consumption is so high. There *must* be some places for optimization in the decoding. When I rewrote the LZW encoding/decoding portions of this, my immediate goal was to understand what was going on -- therefore being verbose and using more objects. I am very likely using inappropriate classes or data structures for some of the decoding. Worse case scenario: we offload the encoding/decoding of LZW data to a plugin.
There is also the matter of working with the Bitmap and Form objects at a low level, which I must admit I still don't fully grasp. I could be missing something there.
That said, are there other packages available for profiling etc, or is MessageTally class >> spyOn: the bread and butter for that kind of thing?
On a related note, I did notice that during GIF frame composition there was what seemed to me a lot of time spent on symbol/string comparison at one point. (`9.5% {192ms} ByteSymbol(Symbol)>>=`). I don't quite understand why that should be the case.
Hi Eric,
thanks for the reply. Of course, every test will be very valuable!
Regarding efficiency and optimization: There are other people who might be able to give you detailed information, but maybe I can add a few basic notes:
I am very likely using inappropriate classes or data structures for some of the decoding.
Hypothetically any subclass of RawBitsArray (maybe even create a new one?) could be your friend (this hierarchy has been reworked recently, so maybe it takes some time to find the right structure.)
Worse case scenario: we offload the encoding/decoding of LZW data to a plugin.
Or use FFI instead. But whenever possible, I would myself prefer a transparent implementation inside the image.
That said, are there other packages available for profiling etc, or is MessageTally class >> spyOn: the bread and butter for that kind of thing?
There is the InteractiveProfilingTool https://github.com/hpi-swa-teaching/InteractiveProfilingTool which aggregates the recorded data a bit more conveniently, or you could evaluate the following: [self doSomeOperation] timeProfile. But it's not a completely new approach, just some slight refinements. Apart from that, we have SpaceTally for measuring object sizes (see the class comment), but I have not yet used it myself. If you want to analyze the usage of different VM primitives and modules, you could try out my new BenchmarkSimulator from SimulationStudio https://github.com/LinqLover/SimulationStudio (experimental), but it is itself very slow.
On a related note, I did notice that during GIF frame composition there was what seemed to me a lot of time spent on symbol/string comparison at one point. (`9.5% {192ms} ByteSymbol(Symbol)>>=`). I don't quite understand why that should be the case.
For symbols, you can use #== instead which is compiled as an inlined message send (-> faster).
Have a nice weekend!
Best, Christoph
________________________________ Von: Squeak-dev squeak-dev-bounces@lists.squeakfoundation.org im Auftrag von Eric Gade eric.gade@gmail.com Gesendet: Sonntag, 28. November 2021 17:31:17 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] [Discussion] GIF and Animated GIF Refactoring
Hi Christoph,
On Sat, Nov 27, 2021 at 2:12 PM Thiede, Christoph <Christoph.Thiede@student.hpi.uni-potsdam.demailto:Christoph.Thiede@student.hpi.uni-potsdam.de> wrote:
* compatibility checks - are there any breaking changes in terms of behavior, or has the public protocol been changed? Maybe it would be worth improving compatibility, adding a few "deprecated" methods, autc. For instance, probably we should re-add an empty subclass AnimatedGIFReadWriter in the 60Deprecated/61Deprecated package to not break existing code that relies on it. Another example, #openGIFInWindow:/#fromStream: are not understood by BetterAnimatedImageMorph class.
I agree. None of this replacement should happen without preserving that polymorphism, at least for the time being. I should probably write a few tests along the way (I have only a handful from previous ports)
* code review - not sure whether there is anyone who would like to have a detailed look at your patch before it is integrated. But as you say it is "tested by the crowd" on two different Smalltalk systems, my personal opinion would be that this should not be a reason to block great contributions.
I believe I have some tests in ports to the other Smalltalks but I'll probably have to refactor them. In particular, there is and should be a test for reading a GIF, writing it out, reading it back in, and having the correct pixels at expected locations.
PS: Have you or someone else already tried to optimize the new GIFReadWriter for performance? I am having a 36 MB large GIFhttps://github.com/LinqLover/TelegramSmalltalkBot/blob/master/img/screencast.gif here which takes eternities (> 10 minutes) to file in on my machine. With a 9 MB small GIF, it still takes a few minutes. And the RAM consumption of my image is noticeably high. But after that, I can play the GIF with ca. 25 FPS. However, the old implementation was not faster either.
It's actually worse than you think -- I cannot spy on the process of parsing your large GIF file without my Squeak process eventually killing itself, probably due to memory consumption. As it stands, I don't know the exact reason the memory consumption is so high. There *must* be some places for optimization in the decoding. When I rewrote the LZW encoding/decoding portions of this, my immediate goal was to understand what was going on -- therefore being verbose and using more objects. I am very likely using inappropriate classes or data structures for some of the decoding. Worse case scenario: we offload the encoding/decoding of LZW data to a plugin.
There is also the matter of working with the Bitmap and Form objects at a low level, which I must admit I still don't fully grasp. I could be missing something there.
That said, are there other packages available for profiling etc, or is MessageTally class >> spyOn: the bread and butter for that kind of thing?
On a related note, I did notice that during GIF frame composition there was what seemed to me a lot of time spent on symbol/string comparison at one point. (`9.5% {192ms} ByteSymbol(Symbol)>>=`). I don't quite understand why that should be the case.
-- Eric
Hi Eric --
Yes, I think it is worth improving Squeak's support for animated GIFs. Thanks!
My concrete proposals, then, would be the following:
- Remove AnimatedGIFReadWriter and replace GIFReadWriter with my updated version (along with helper classes)
- Replace AnimatedImageMorph with my new BetterAnimatedImageMorph
Well, for the sake of backwards compatibility, maybe you could do instead:
1*) Improve GIFReadWriter while ensuring that the obviously public protocol stays functional. Empty out AnimatedGIFReadWriter but leave it there. 2*) Improve AnimatedImageMorph while ensuring that the obviously public protocol stays functional. No new class BetterAnimatedImageMorph.
Regarding the process: - If single package: Inbox - If multiple packages: Change set via squeak-dev
:-)
Best,
Marcel Am 28.11.2021 21:10:36 schrieb Thiede, Christoph christoph.thiede@student.hpi.uni-potsdam.de: Hi Eric,
thanks for the reply. Of course, every test will be very valuable!
Regarding efficiency and optimization: There are other people who might be able to give you detailed information, but maybe I can add a few basic notes:
I am very likely using inappropriate classes or data structures for some of the decoding.
Hypothetically any subclass of RawBitsArray (maybe even create a new one?) could be your friend (this hierarchy has been reworked recently, so maybe it takes some time to find the right structure.)
Worse case scenario: we offload the encoding/decoding of LZW data to a plugin.
Or use FFI instead. But whenever possible, I would myself prefer a transparent implementation inside the image.
That said, are there other packages available for profiling etc, or is MessageTally class >> spyOn: the bread and butter for that kind of thing?
There is the InteractiveProfilingTool <https://github.com/hpi-swa-teaching/InteractiveProfilingTool [https://github.com/hpi-swa-teaching/InteractiveProfilingTool%5D%3E which aggregates the recorded data a bit more conveniently, or you could evaluate the following: [self doSomeOperation] timeProfile. But it's not a completely new approach, just some slight refinements. Apart from that, we have SpaceTally for measuring object sizes (see the class comment), but I have not yet used it myself. If you want to analyze the usage of different VM primitives and modules, you could try out my new BenchmarkSimulator from SimulationStudio <https://github.com/LinqLover/SimulationStudio [https://github.com/LinqLover/SimulationStudio%5D%3E (experimental), but it is itself very slow.
On a related note, I did notice that during GIF frame composition there was what seemed to me a lot of time spent on symbol/string comparison at one point. (`9.5% {192ms} ByteSymbol(Symbol)>>=`). I don't quite understand why that should be the case.
For symbols, you can use #== instead which is compiled as an inlined message send (-> faster).
Have a nice weekend!
Best, Christoph Von: Squeak-dev squeak-dev-bounces@lists.squeakfoundation.org im Auftrag von Eric Gade eric.gade@gmail.com Gesendet: Sonntag, 28. November 2021 17:31:17 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] [Discussion] GIF and Animated GIF Refactoring Hi Christoph,
On Sat, Nov 27, 2021 at 2:12 PM Thiede, Christoph <Christoph.Thiede@student.hpi.uni-potsdam.de [mailto:Christoph.Thiede@student.hpi.uni-potsdam.de]> wrote: * compatibility checks - are there any breaking changes in terms of behavior, or has the public protocol been changed? Maybe it would be worth improving compatibility, adding a few "deprecated" methods, autc. For instance, probably we should re-add an empty subclass AnimatedGIFReadWriter in the 60Deprecated/61Deprecated package to not break existing code that relies on it. Another example, #openGIFInWindow:/#fromStream: are not understood by BetterAnimatedImageMorph class.
I agree. None of this replacement should happen without preserving that polymorphism, at least for the time being. I should probably write a few tests along the way (I have only a handful from previous ports) * code review - not sure whether there is anyone who would like to have a detailed look at your patch before it is integrated. But as you say it is "tested by the crowd" on two different Smalltalk systems, my personal opinion would be that this should not be a reason to block great contributions.
I believe I have some tests in ports to the other Smalltalks but I'll probably have to refactor them. In particular, there is and should be a test for reading a GIF, writing it out, reading it back in, and having the correct pixels at expected locations.
PS: Have you or someone else already tried to optimize the new GIFReadWriter for performance? I am having a 36 MB large GIF [https://github.com/LinqLover/TelegramSmalltalkBot/blob/master/img/screencast...] here which takes eternities (> 10 minutes) to file in on my machine. With a 9 MB small GIF, it still takes a few minutes. And the RAM consumption of my image is noticeably high. But after that, I can play the GIF with ca. 25 FPS. However, the old implementation was not faster either.
It's actually worse than you think -- I cannot spy on the process of parsing your large GIF file without my Squeak process eventually killing itself, probably due to memory consumption. As it stands, I don't know the exact reason the memory consumption is so high. There *must* be some places for optimization in the decoding. When I rewrote the LZW encoding/decoding portions of this, my immediate goal was to understand what was going on -- therefore being verbose and using more objects. I am very likely using inappropriate classes or data structures for some of the decoding. Worse case scenario: we offload the encoding/decoding of LZW data to a plugin. There is also the matter of working with the Bitmap and Form objects at a low level, which I must admit I still don't fully grasp. I could be missing something there.
That said, are there other packages available for profiling etc, or is MessageTally class >> spyOn: the bread and butter for that kind of thing? On a related note, I did notice that during GIF frame composition there was what seemed to me a lot of time spent on symbol/string comparison at one point. (`9.5% {192ms} ByteSymbol(Symbol)>>=`). I don't quite understand why that should be the case.
--
Eric
Hi Marcel (and others)
On Mon, Nov 29, 2021 at 5:45 AM Marcel Taeumel marcel.taeumel@hpi.de wrote:
Well, for the sake of backwards compatibility, maybe you could do instead:
1*) Improve GIFReadWriter while ensuring that the obviously public protocol stays functional. Empty out AnimatedGIFReadWriter but leave it there. 2*) Improve AnimatedImageMorph while ensuring that the obviously public protocol stays functional. No new class BetterAnimatedImageMorph.
Regarding the process:
- If single package: Inbox
- If multiple packages: Change set via squeak-dev
After some wrangling between Squot and Changesets (yikes!), I was able to come up with the following changeset (attached).
Here is what I did: 1) Updated AnimatedImageMorph to have the new/updated functionality from BetterAnimatedImageMorph, while maintaining the same methods and deprecating as needed; 2) Added the classes needed for new GIF parsing 3) Updated GIFReadWriter to use the new parsing methods 4) Deprecated the AnimatedGIFReadWriter class using the technique of adding #deprecated: to #basicNew 5) I have added a GIFReadWriterTest class that tests both the regular and animated GIF reading/writing at the same time. There are issues (see below).
If you would like to quickly test that things are working, do the following in a workspace: ``` GIFReadWriter exampleAnim. (AnimatedImageMorph new fromGIFFileNamed: 'anim.gif') openInHand. ```
You should see an animated, growing red circle on a while background. See the class method comment.
Note on tests: I copied the GIFReadWriterTest class over from my Pharo implementation with minimal (file api related) modification. The tests are not passing and the reason is a bit confusing. Here's what's up: in these tests I write out a Form that has different colors in each quadrant, read it back in, and make sure the color values are the same. But apparently in Squeak when you use `FormCanvas fillColor: Color black`, the color that is put into the Form is not, in fact, equivalent to Color black. It is instead something like rgb(0,0,0.04). Is this intentional? If so, do you recommend that I simply update the tests to match on this specific color?
This is my first submission via changeset so definitely let me know if I messed up. Thanks!
Whoops, sorry all! The changeset was incorrect. Please use the revised changeset attached to this message. The content of the previous message still applies.
On Sat, Dec 4, 2021 at 1:18 PM Eric Gade eric.gade@gmail.com wrote:
Hi Marcel (and others)
On Mon, Nov 29, 2021 at 5:45 AM Marcel Taeumel marcel.taeumel@hpi.de wrote:
Well, for the sake of backwards compatibility, maybe you could do instead:
1*) Improve GIFReadWriter while ensuring that the obviously public protocol stays functional. Empty out AnimatedGIFReadWriter but leave it there. 2*) Improve AnimatedImageMorph while ensuring that the obviously public protocol stays functional. No new class BetterAnimatedImageMorph.
Regarding the process:
- If single package: Inbox
- If multiple packages: Change set via squeak-dev
After some wrangling between Squot and Changesets (yikes!), I was able to come up with the following changeset (attached).
Here is what I did:
- Updated AnimatedImageMorph to have the new/updated functionality from
BetterAnimatedImageMorph, while maintaining the same methods and deprecating as needed; 2) Added the classes needed for new GIF parsing 3) Updated GIFReadWriter to use the new parsing methods 4) Deprecated the AnimatedGIFReadWriter class using the technique of adding #deprecated: to #basicNew 5) I have added a GIFReadWriterTest class that tests both the regular and animated GIF reading/writing at the same time. There are issues (see below).
If you would like to quickly test that things are working, do the following in a workspace:
GIFReadWriter exampleAnim. (AnimatedImageMorph new fromGIFFileNamed: 'anim.gif') openInHand.
You should see an animated, growing red circle on a while background. See the class method comment.
Note on tests: I copied the GIFReadWriterTest class over from my Pharo implementation with minimal (file api related) modification. The tests are not passing and the reason is a bit confusing. Here's what's up: in these tests I write out a Form that has different colors in each quadrant, read it back in, and make sure the color values are the same. But apparently in Squeak when you use `FormCanvas fillColor: Color black`, the color that is put into the Form is not, in fact, equivalent to Color black. It is instead something like rgb(0,0,0.04). Is this intentional? If so, do you recommend that I simply update the tests to match on this specific color?
This is my first submission via changeset so definitely let me know if I messed up. Thanks! -- Eric
Hi Eric --
Thanks for doing this. I am looking at it now. Could it be that you dropped all "deleted" requests from the change set? There seems to be a lot of obsolete methods remaining in GIFReadWriter:
... GIFReadWriter >> readBitData GIFReadWriter >> writeCodeAndCheckCodeSize: ...
Most of that seems to be extracted to LzwGif(Decoder|Encoder).
I am trying to dissect, what I see. Maybe you could just "file-out" your version of GIFReadWriter and AnimatedImageMorph from the image directly?
Best, Marcel Am 04.12.2021 20:46:56 schrieb Eric Gade eric.gade@gmail.com: Whoops, sorry all! The changeset was incorrect. Please use the revised changeset attached to this message. The content of the previous message still applies.
On Sat, Dec 4, 2021 at 1:18 PM Eric Gade <eric.gade@gmail.com [mailto:eric.gade@gmail.com]> wrote:
Hi Marcel (and others)
On Mon, Nov 29, 2021 at 5:45 AM Marcel Taeumel <marcel.taeumel@hpi.de [mailto:marcel.taeumel@hpi.de]> wrote:
Well, for the sake of backwards compatibility, maybe you could do instead:
1*) Improve GIFReadWriter while ensuring that the obviously public protocol stays functional. Empty out AnimatedGIFReadWriter but leave it there. 2*) Improve AnimatedImageMorph while ensuring that the obviously public protocol stays functional. No new class BetterAnimatedImageMorph.
Regarding the process: - If single package: Inbox - If multiple packages: Change set via squeak-dev
After some wrangling between Squot and Changesets (yikes!), I was able to come up with the following changeset (attached).
Here is what I did: 1) Updated AnimatedImageMorph to have the new/updated functionality from BetterAnimatedImageMorph, while maintaining the same methods and deprecating as needed; 2) Added the classes needed for new GIF parsing 3) Updated GIFReadWriter to use the new parsing methods 4) Deprecated the AnimatedGIFReadWriter class using the technique of adding #deprecated: to #basicNew 5) I have added a GIFReadWriterTest class that tests both the regular and animated GIF reading/writing at the same time. There are issues (see below).
If you would like to quickly test that things are working, do the following in a workspace: ``` GIFReadWriter exampleAnim. (AnimatedImageMorph new fromGIFFileNamed: 'anim.gif') openInHand. ```
You should see an animated, growing red circle on a while background. See the class method comment.
Note on tests: I copied the GIFReadWriterTest class over from my Pharo implementation with minimal (file api related) modification. The tests are not passing and the reason is a bit confusing. Here's what's up: in these tests I write out a Form that has different colors in each quadrant, read it back in, and make sure the color values are the same. But apparently in Squeak when you use `FormCanvas fillColor: Color black`, the color that is put into the Form is not, in fact, equivalent to Color black. It is instead something like rgb(0,0,0.04). Is this intentional? If so, do you recommend that I simply update the tests to match on this specific color?
This is my first submission via changeset so definitely let me know if I messed up. Thanks!
--
Eric
--
Eric
Hi Eric --
I think this is works. I tested a bunch of GIFs ranging from a few KiB up to 4 MiB. All of them used to also load and play fine before.
I noticed that the loading times are about 5 times slower than before and that the GC runs runs longer. Hmm...
For that 4 MiB file it compares as follows:
BEFORE: '0.867 per second. 1.15 seconds per run. 3.71012 % GC time.' AFTER: '0.181 per second. 5.52 seconds per run. 58.18083 % GC time.'
One can notice this slow-down especially in the FileList tool. Maybe we can make it faster by avoiding some GC time?
Best, Marcel Am 07.12.2021 13:41:13 schrieb Marcel Taeumel marcel.taeumel@hpi.de: Hi Eric --
Thanks for doing this. I am looking at it now. Could it be that you dropped all "deleted" requests from the change set? There seems to be a lot of obsolete methods remaining in GIFReadWriter:
... GIFReadWriter >> readBitData GIFReadWriter >> writeCodeAndCheckCodeSize: ...
Most of that seems to be extracted to LzwGif(Decoder|Encoder).
I am trying to dissect, what I see. Maybe you could just "file-out" your version of GIFReadWriter and AnimatedImageMorph from the image directly?
Best, Marcel Am 04.12.2021 20:46:56 schrieb Eric Gade eric.gade@gmail.com: Whoops, sorry all! The changeset was incorrect. Please use the revised changeset attached to this message. The content of the previous message still applies.
On Sat, Dec 4, 2021 at 1:18 PM Eric Gade <eric.gade@gmail.com [mailto:eric.gade@gmail.com]> wrote:
Hi Marcel (and others)
On Mon, Nov 29, 2021 at 5:45 AM Marcel Taeumel <marcel.taeumel@hpi.de [mailto:marcel.taeumel@hpi.de]> wrote:
Well, for the sake of backwards compatibility, maybe you could do instead:
1*) Improve GIFReadWriter while ensuring that the obviously public protocol stays functional. Empty out AnimatedGIFReadWriter but leave it there. 2*) Improve AnimatedImageMorph while ensuring that the obviously public protocol stays functional. No new class BetterAnimatedImageMorph.
Regarding the process: - If single package: Inbox - If multiple packages: Change set via squeak-dev
After some wrangling between Squot and Changesets (yikes!), I was able to come up with the following changeset (attached).
Here is what I did: 1) Updated AnimatedImageMorph to have the new/updated functionality from BetterAnimatedImageMorph, while maintaining the same methods and deprecating as needed; 2) Added the classes needed for new GIF parsing 3) Updated GIFReadWriter to use the new parsing methods 4) Deprecated the AnimatedGIFReadWriter class using the technique of adding #deprecated: to #basicNew 5) I have added a GIFReadWriterTest class that tests both the regular and animated GIF reading/writing at the same time. There are issues (see below).
If you would like to quickly test that things are working, do the following in a workspace: ``` GIFReadWriter exampleAnim. (AnimatedImageMorph new fromGIFFileNamed: 'anim.gif') openInHand. ```
You should see an animated, growing red circle on a while background. See the class method comment.
Note on tests: I copied the GIFReadWriterTest class over from my Pharo implementation with minimal (file api related) modification. The tests are not passing and the reason is a bit confusing. Here's what's up: in these tests I write out a Form that has different colors in each quadrant, read it back in, and make sure the color values are the same. But apparently in Squeak when you use `FormCanvas fillColor: Color black`, the color that is put into the Form is not, in fact, equivalent to Color black. It is instead something like rgb(0,0,0.04). Is this intentional? If so, do you recommend that I simply update the tests to match on this specific color?
This is my first submission via changeset so definitely let me know if I messed up. Thanks!
--
Eric
--
Eric
Hi Marcel,
On Tue, Dec 7, 2021 at 8:23 AM Marcel Taeumel marcel.taeumel@hpi.de wrote:
I noticed that the loading times are about 5 times slower than before and that the GC runs runs longer. Hmm...
For that 4 MiB file it compares as follows:
BEFORE: '0.867 per second. 1.15 seconds per run. 3.71012 % GC time.' AFTER: '0.181 per second. 5.52 seconds per run. 58.18083 % GC time.'
One can notice this slow-down especially in the FileList tool. Maybe we can make it faster by avoiding some GC time?
It's definitely slower, there's no doubt. As far as I can tell, most of the slowdown is happening in `LzwGIFDecoder >> #handleCode:withPreviousCode:on:` , which is a method that deals with the "meat" of the LZW decompression algorithm. It could be that creating tons of OrderedCollections in the process of building LZW bit-code "stacks" is what's slowing things down. Would it be better to use one collection and then clear it out each time? To be honest, it's been a couple of years since I had to implement LZW here so I'm not sure what type of data structure to use etc. What are some good (general) strategies for avoiding GC time besides the unnecessary creation of extra objects?
Some kind of optimization clearly needs to happen here. In earlier messages, Christoph supplied a 36MB GIF that was taking ages to decode on his end. When I tried it on my machine the image would eventually blow up in memory and crash. I'm still trying to figure out the best way to profile this sort of thing.
However, I will say that while the old class can decode faster, it is fundamentally incorrect and cannot read a lot of the complex GIFs that are more common these days. For example, if you try to open this one https://media.giphy.com/media/bzUwzbxcvJ3XQlcnoi/giphy.gif or this one https://media.giphy.com/media/Byq2w1rdk2uXbWxbIx/giphy.gif, you will notice that the old AnimatedGIFReadWriter will bug-out with a subscript bounds error. This is because something about the LZW was not implemented correctly. (Giphy.com is generally a good source for test images).
Hi all,
I'm attaching an updated changeset that should by itself have everything you need. It includes a couple of changes that should help with performance: 1) The use of ByteArrays instead of generic OrderedCollections or Arrays where we are definitively dealing with byte values; 2) Removal of unnecessary buffering of data blocks. The GIF standard -- which is from 1989 -- described data blocks that are max 255 bytes large. I had been taking the standard as literally as possible, so was essentially refreshing the buffer every 255 bytes. With this changeset we are now buffering all data bytes ahead of time into one single ByteArray, which shouldn't be an issue for each image frame.
I have noticed a GIF reading speed improvement of >50% due to these changes. However, larger GIFs can still take quite a bit of time.
Additionally, I have removed and/or deprecated methods that are no longer being used.
Does anyone have additional thoughts about the tests and how Color black does not equal the pixel value of a Form that was filled in with Color black?
Hi Eric,
I think the code can be optimized further and I think it's possible to make it faster than the current Trunk version (which is a mess btw). But I wouldn't bother with that at this point. I suggest upgrading the changeset to have proper method stamps instead of "no timeStamp · unknown author", converting it into mcz files, and pushing it into the Trunk. That way others can review the code more easily and will have a chance to contribute as well.
Levente
On Thu, 9 Dec 2021, Eric Gade wrote:
Hi all,
I'm attaching an updated changeset that should by itself have everything you need. It includes a couple of changes that should help with performance:
- The use of ByteArrays instead of generic OrderedCollections or Arrays where we are definitively dealing with byte values;
- Removal of unnecessary buffering of data blocks. The GIF standard -- which is from 1989 -- described data blocks that are max 255 bytes large. I had been taking the standard as literally as possible, so was essentially
refreshing the buffer every 255 bytes. With this changeset we are now buffering all data bytes ahead of time into one single ByteArray, which shouldn't be an issue for each image frame. I have noticed a GIF reading speed improvement of >50% due to these changes. However, larger GIFs can still take quite a bit of time. Additionally, I have removed and/or deprecated methods that are no longer being used.
Does anyone have additional thoughts about the tests and how Color black does not equal the pixel value of a Form that was filled in with Color black?
Eric
Maybe change the missing stamps to "Eric Gade"? :-) I know that he did not touch all the code but most of it?
Best, Marcel Am 10.12.2021 13:49:53 schrieb Levente Uzonyi leves@caesar.elte.hu: Hi Eric,
I think the code can be optimized further and I think it's possible to make it faster than the current Trunk version (which is a mess btw). But I wouldn't bother with that at this point. I suggest upgrading the changeset to have proper method stamps instead of "no timeStamp · unknown author", converting it into mcz files, and pushing it into the Trunk. That way others can review the code more easily and will have a chance to contribute as well.
Levente
On Thu, 9 Dec 2021, Eric Gade wrote:
Hi all,
I'm attaching an updated changeset that should by itself have everything you need. It includes a couple of changes that should help with performance:
- The use of ByteArrays instead of generic OrderedCollections or Arrays where we are definitively dealing with byte values;
- Removal of unnecessary buffering of data blocks. The GIF standard -- which is from 1989 -- described data blocks that are max 255 bytes large. I had been taking the standard as literally as possible, so was essentially
refreshing the buffer every 255 bytes. With this changeset we are now buffering all data bytes ahead of time into one single ByteArray, which shouldn't be an issue for each image frame.
I have noticed a GIF reading speed improvement of >50% due to these changes. However, larger GIFs can still take quite a bit of time.
Additionally, I have removed and/or deprecated methods that are no longer being used.
Does anyone have additional thoughts about the tests and how Color black does not equal the pixel value of a Form that was filled in with Color black?
Eric
Hey all,
Sorry I'm a little out of my depth here. Did I screw up the timestamps somehow? What's the best way to update them? Also my understanding is that if I push to Trunk, I will have to do a different push for each package that has changed, is that correct?
On Fri, Dec 10, 2021 at 8:19 AM Marcel Taeumel marcel.taeumel@hpi.de wrote:
Maybe change the missing stamps to "Eric Gade"? :-) I know that he did not touch all the code but most of it?
Best, Marcel
Am 10.12.2021 13:49:53 schrieb Levente Uzonyi leves@caesar.elte.hu: Hi Eric,
I think the code can be optimized further and I think it's possible to make it faster than the current Trunk version (which is a mess btw). But I wouldn't bother with that at this point. I suggest upgrading the changeset to have proper method stamps instead of "no timeStamp · unknown author", converting it into mcz files, and pushing it into the Trunk. That way others can review the code more easily and will have a chance to contribute as well.
Levente
On Thu, 9 Dec 2021, Eric Gade wrote:
Hi all,
I'm attaching an updated changeset that should by itself have everything
you need. It includes a couple of changes that should help with performance:
- The use of ByteArrays instead of generic OrderedCollections or Arrays
where we are definitively dealing with byte values;
- Removal of unnecessary buffering of data blocks. The GIF standard --
which is from 1989 -- described data blocks that are max 255 bytes large. I had been taking the standard as literally as possible, so was essentially
refreshing the buffer every 255 bytes. With this changeset we are now
buffering all data bytes ahead of time into one single ByteArray, which shouldn't be an issue for each image frame.
I have noticed a GIF reading speed improvement of >50% due to these
changes. However, larger GIFs can still take quite a bit of time.
Additionally, I have removed and/or deprecated methods that are no
longer being used.
Does anyone have additional thoughts about the tests and how Color black
does not equal the pixel value of a Form that was filled in with Color black?
-- Eric
Hi Eric,
On Fri, 10 Dec 2021, Eric Gade wrote:
Hey all,
Sorry I'm a little out of my depth here. Did I screw up the timestamps somehow? What's the best way to update them? Also my understanding is that if I push to Trunk, I will have to do a different push for each package that has changed, is that correct?
I suppose you used a (git-related) tool which dropped the method timestamps. If you changed a method, update the timestamp with your initials, if not then try to keep the original timestamp and initials.
Yes, you still need to create an mcz for each affected package.
Levente
Hi Levente,
On Tue, Dec 21, 2021 at 10:18 AM Levente Uzonyi leves@caesar.elte.hu wrote:
I suppose you used a (git-related) tool which dropped the method timestamps. If you changed a method, update the timestamp with your initials, if not then try to keep the original timestamp and initials.
If I load my posted changeset, I see timestamps with my initials, but with "unknown author" as well, for example: "EG 12/4/2021 12:26 · unknown author · image reading/writing · 2 implementors ".
I don't actually know how to update the "unknown author" part for the affected methods. Is there a contextual menu item I am missing? All other methods that don't have my initials appear to have normal timestamp information as they did before.
Once I get that cleaned up, my plan would be to push to the Inbox.
Hi All,
Just wanted to drop a line and say that I had some time around Christmas to push these changes to the Inbox. Aside from the weirdness with Forms and Color black, I think it should be good to go? Perhaps there is a chance it could make it into the upcoming release?
squeak-dev@lists.squeakfoundation.org