Hello Trey, Dick & All
Trey said
There are a total of three copies of the Squeak display:
1) Squeak's big-endian, internally allocated bitmap
2) The host system's intermediate bitmap, in the same pixel configuration as the host display
3) The host display adapter (possibly a partial copy, if the Squeak window is obscured in any way).
Data flow is in one direction only, 1->2->3
In the OS/2 version of Squeak, buffer 2) has the same display depth as Squeak, so this copy only involves the endian conversion. (This is not true for depths of 1, 2 and 4 bits in Squeak, but these are easily expanded to 8 bits.) The pixel depth conversion in the 2->3 operation is done automatically by the DIVE DLL I told you about. It even does color dithering if appropriate! I didn't try to make it better, because the results are quite good (even on my old 486). Try it, or try the Windows version; their performance is similar (but turn off 'Defer Display Update' in the Windows version for realistic performance). Besides, I did not want to need a special image for OS/2.
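As an illustration of the expansion of 1, 2 and 4 bit depths to 8 bits mentioned above, here is a minimal sketch in C. It is not the OS/2 VM code. The function name `expand_to_8bit` and its parameters are hypothetical, and it assumes that the Squeak bitmap is read as 32-bit words with the leftmost pixel in the most significant bits, and that the pixel values can be kept unchanged as 8-bit palette indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand packed pixels of depth 1, 2 or 4 into one byte per pixel.
   src: 32-bit words, leftmost pixel in the most significant bits
        (assumed layout of the Squeak bitmap).
   dst: receives pixel_count bytes, in left-to-right order.
   Hypothetical helper for illustration only. */
void expand_to_8bit(const uint32_t *src, uint8_t *dst,
                    size_t pixel_count, unsigned depth)
{
    assert(depth == 1 || depth == 2 || depth == 4);
    unsigned per_word = 32 / depth;          /* pixels in one word */
    uint32_t mask = (1u << depth) - 1u;      /* low 'depth' bits set */

    for (size_t i = 0; i < pixel_count; i++) {
        uint32_t word = src[i / per_word];
        /* Pixel 0 of a word sits in the top bits, the last one at bit 0. */
        unsigned shift = 32 - depth * (unsigned)(i % per_word + 1);
        dst[i] = (uint8_t)((word >> shift) & mask);
    }
}
```

Because the destination is one byte per pixel, the byte order question disappears for these depths; only the order in which pixels are pulled out of each word matters.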
Anyway, it would be a big improvement to get Squeak to write directly to 2). In my particular implementation, besides needing Squeak to manage little-endian display formats, it is necessary to receive an external pointer to where the display buffer is allocated. (This is necessary to use hardware acceleration.) This is probably incompatible with the way Squeak manages the memory for its objects.
Regarding Tim Rowledge's work on little-endian BitBlt, I would only have DisplayScreen in little-endian mode. Reversing all the forms in the image seems to break the consistency across all platforms.
So we would need:
1) Little-endian BitBlt, only when blitting to DisplayScreen
2) Have DisplayScreen memory outside the object memory
3) A primitive to automatically select the blitting endianness (similar to the way the directory separator character is handled)
4) Integrate all this stuff in the official Squeak release, to keep complete portability.
5) Convince Andreas that the Windows version would also benefit from this.
If all this is possible, I will happily convert the OS/2 Squeak version to this approach.
Juan Manuel Vuletich SW Export IGS- IBM Argentina Phone Nbr : 54-1-313-0014 ext. 5207 Tie Line : 840-5207 e-mail: vuletich@ar.ibm.com
On Tue 29 Dec, vuletich@ar.ibm.com wrote:
Anyway, it would be a big improvement to get Squeak to write directly to 2). In my particular implementation, besides needing Squeak to manage little-endian display formats, it is necessary to receive an external pointer to where the display buffer is allocated. (This is necessary to use hardware acceleration.) This is probably incompatible with the way Squeak manages the memory for its objects.
This is almost exactly the problem I had to solve on the Acorn. Look at the instvar 'displayBits' in ObjectMemory and its uses - these are the basics of how to have a Display Bitmap outside of normal object space. Look also at the implementation of ioShowDisplay for the Acorn, where there is a way to do the setup. It sounds like you need much the same stuff.
Regarding Tim Rowledge's work on little-endian BitBlt, I would only have DisplayScreen in little-endian mode. Reversing all the forms in the image seems to break the consistency across all platforms.
Only because nobody is willing to do a very small bit of work in the image code to handle this. Having just the DisplayScreen in opposite-endian form seems to me to be a bit tricky. You need to detect the destination and use a conversion loop when writing to the Display, and you need to check the source as well, since you might well be reading the Display. Of course, it has to work when you are reading and writing the Display in one operation as well.
So we would need:
- Little endian BitBlt, only when blitting to DisplayScreen
Simplest to have a fast pixel-reversing loop that can be called in addition to the 'normal' loop; otherwise you have to duplicate the entire core loop and the merge rule functions. Don't forget that you have to _pixel_ reverse, not byte reverse. Call this in front of the main loop for source == Display and after the main loop for target == Display.
- Have DisplayScreen memory outside the object memory
See above and Acorn source files.
- A primitive to automatically select the blitting endianness (Similar to the way to handle the directory separator character)
Don't think so. Only matters to the VM. Keep it there.
- Integrate all this stuff in the official Squeak release, to keep complete portability.
Ditto.
- Convince Andreas that the Windows version would also benefit from this.
He wasn't convinced by two years of my arguments, so good luck :-)
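For the first item, the separate pixel-reversing pass could be sketched roughly as follows. This is an illustration only, not code from the Acorn or any other Squeak VM. The names `reverse_pixels_in_word` and `reverse_pixels` are hypothetical, and the sketch assumes 32-bit bitmap words and pixel depths of 1, 2, 4, 8, 16 or 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the order of the pixels packed in one 32-bit word.
   This is a pixel reverse, not a byte reverse: at depth 4 the eight
   nibbles swap places, at depth 1 the whole word is bit-reversed.
   Hypothetical helper for illustration only. */
uint32_t reverse_pixels_in_word(uint32_t word, unsigned depth)
{
    assert(depth == 1 || depth == 2 || depth == 4 ||
           depth == 8 || depth == 16 || depth == 32);
    if (depth == 32)
        return word;                       /* one pixel per word */

    uint32_t mask = (1u << depth) - 1u;
    uint32_t out = 0;
    for (unsigned done = 0; done < 32; done += depth) {
        out = (out << depth) | (word & mask);  /* move lowest pixel up */
        word >>= depth;
    }
    return out;
}

/* Reverse every word of a run of bitmap words in place. */
void reverse_pixels(uint32_t *words, size_t count, unsigned depth)
{
    for (size_t i = 0; i < count; i++)
        words[i] = reverse_pixels_in_word(words[i], depth);
}
```

Only at depths of 8 bits and above does this coincide with a byte or halfword swap; at depth 4, for example, `0x12345678` becomes `0x87654321`, which a plain byte swap would not produce. A production loop would presumably use table lookups or shift-and-mask tricks instead of the per-pixel loop shown here.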
tim
- Convince Andreas that the Windows version would also benefit from this.
He wasn't convinced by two years of my arguments, so good luck :-)
C'mon Tim - I just never felt that the BB byte sex is worth the effort. Since you've to color-convert the stuff anyways you can simply add the byte reversal there without any significant overhead.
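To illustrate the point about folding the reversal into the color conversion, here is a minimal C sketch. It is not the Windows VM code. It assumes a 16-bit Squeak display with two 5-5-5 RGB pixels per 32-bit word (leftmost pixel in the high half) and a hypothetical host format of 0x00RRGGBB in 32 bits; the function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand one 5-5-5 RGB pixel to 0x00RRGGBB (assumed host format). */
uint32_t rgb555_to_rgb888(uint16_t p)
{
    uint32_t r = (p >> 10) & 0x1Fu;
    uint32_t g = (p >> 5) & 0x1Fu;
    uint32_t b = p & 0x1Fu;
    /* Replicate the top bits so that 0x1F maps to 0xFF. */
    r = (r << 3) | (r >> 2);
    g = (g << 3) | (g >> 2);
    b = (b << 3) | (b >> 2);
    return (r << 16) | (g << 8) | b;
}

/* Convert a row of 16-bit pixels to 32-bit host pixels.
   The pixels are picked out of each source word in left-to-right
   order as part of the conversion, so no separate reversal pass
   over the bitmap is needed. */
void convert_row_16_to_32(const uint32_t *src, uint32_t *dst,
                          size_t pixel_count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < pixel_count; i++) {
        uint32_t word = src[i / 2];
        /* Even pixels live in the high half, odd pixels in the low half. */
        uint16_t p = (uint16_t)((i % 2 == 0) ? (word >> 16)
                                             : (word & 0xFFFFu));
        dst[i] = rgb555_to_rgb888(p);
    }
}
```

Since every pixel has to be read and rewritten for the depth change anyway, choosing which half of the word to read first adds essentially nothing to the cost of the loop.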
Andreas
I forgot an important paragraph in my last note on this subject - and one that might just help the BeOS folks work out what is slowing them down.
The reason that having the Display be in little-endian form and doing a pixel reverse for any blit where the Display is source and/or target is a poor choice is that the Display is involved in a _lot_ of blits. I forget the precise details, but at one time I had to trace something that led me to notice that something like 20-50 times more blits were to the screen than there were display update cycles.
On Acorn and Windows (and possibly X11, though I haven't read the code enough to be sure) the Display is only copied to the glass when the appropriate OS event is received. At that time, and that time only, is the Display bitmap pixel reversing done, which is why Andreas is quite right that the overhead is fairly low: as long as the OS merges small rectangles etc. effectively, and as long as we have to handle these events relatively rarely (~100 times/sec max), all is OK. If I understand the code correctly, the Mac updates the glass every time (?).
If you are doing a pixel reverse _every_ time you write to the Display, not to mention actually forcing a copy to the glass, then performance would probably suffer. Perhaps, just perhaps, this is what the BeOS code is doing?
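The difference between the two approaches can be sketched in C as below. This is only an illustration of the deferred scheme, not code from any of the VMs mentioned. The `DirtyRect` type and the `dirty_*` functions are hypothetical. Each blit to the Display only widens a bounding box, and the expensive pixel reverse and copy to the glass would happen once per OS update event, over whatever area has accumulated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bounding box of everything drawn since the last copy to the glass.
   right and bottom are exclusive. Hypothetical type for illustration. */
typedef struct {
    int left, top, right, bottom;
    bool empty;
} DirtyRect;

void dirty_reset(DirtyRect *d)
{
    d->left = d->top = d->right = d->bottom = 0;
    d->empty = true;
}

/* Called after every blit to the Display: cheap, no pixel work. */
void dirty_add(DirtyRect *d, int left, int top, int right, int bottom)
{
    if (left >= right || top >= bottom)
        return;                              /* nothing was drawn */
    if (d->empty) {
        d->left = left;   d->top = top;
        d->right = right; d->bottom = bottom;
        d->empty = false;
        return;
    }
    if (left < d->left)     d->left = left;
    if (top < d->top)       d->top = top;
    if (right > d->right)   d->right = right;
    if (bottom > d->bottom) d->bottom = bottom;
}

/* Called from the OS update event: hands back the area that needs to
   be pixel reversed and copied, then starts a new cycle.
   Returns false when nothing was drawn since the last call. */
bool dirty_take(DirtyRect *d, DirtyRect *out)
{
    assert(d != NULL && out != NULL);
    if (d->empty)
        return false;
    *out = *d;
    dirty_reset(d);
    return true;
}
```

With 20-50 blits per update cycle, this scheme does one conversion pass per cycle over the merged area, whereas reversing on every write to the Display would touch the pixels once per blit.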
tim
squeak-dev@lists.squeakfoundation.org