In order to make the Raspberry Pi Squeak faster we've built a fairly large chunk of ARM-specific code to implement a good fraction of the BitBlt plugin. It *ought* to speed up some common blits by around 5-10X once it is all fully functioning.
My current problem is integrating it nicely so it doesn't get in the way of 'normal' code. I have fudged the BitBltSimulation code a little to make the relevant connections with, I hope, minimal intrusion and maintained the compile-time configuration that should support Jenkins builds etc. A single -DARM_FAST_BLT in CFLAGS triggers #including a header and taking a branch in copyBits to the specialised code. It's not the most elegant code in the world but the best I could come up with without rewriting everything.
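A minimal sketch of the kind of compile-time switch described above, in C. Only the -DARM_FAST_BLT flag and the copyBits name come from this message; copyBitsFastPath, copyBitsGeneric and the return-code convention are hypothetical stand-ins, not the actual plugin code.

```c
/* Sketch only: the real header and fast-path entry point are not shown in
 * this thread, so copyBitsFastPath() and copyBitsGeneric() are hypothetical. */

#ifdef ARM_FAST_BLT
/* In the real tree this is where the ARM header would be #included. */
static int copyBitsFastPath(void)
{
    return 1;               /* pretend the specialised code handled the blit */
}
#endif

static void copyBitsGeneric(void)
{
    /* stands in for the existing portable BitBlt loop */
}

/* Returns 1 if the specialised path ran, 0 if the generic path ran. */
int copyBits(void)
{
#ifdef ARM_FAST_BLT
    if (copyBitsFastPath())
        return 1;
#endif
    copyBitsGeneric();
    return 0;
}
```

Built without the flag, the fast path is never compiled and copyBits() always takes the generic route, which is the property that should keep other platforms unaffected.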
That all works to generate suitable code for the Pi and seems not to mangle anything for other platforms. I'll make sure it works for RISC OS sometime soon, too.
The other issue is where to store the rest of the hand-written code. Right now it's mostly in platforms/Cross/plugins/BitBltPlugin which doesn't really seem right. It's cross platform in the sense of RISC OS & any ARM linux but hardly universally cross-platform. Would it annoy anyone if it stays there? Would naming it BitBltPluginARM make it nicer? There are a couple of other files that *are* platform specific and implement cpu discovery to configure the tricks used in the rest of the code. One is currently in unix/src/vm/intplugins/BitBltPlugin, which certainly isn't suitable. There's also a config.cmake fragment currently in unix/plugins/BitBltPlugin which clearly would need some work to handle non-ARM builds.
Oh, the other, other issue is integrating it into the make world for both plain interp and stack/cog. Right now I have the ST code integrated into the cog vmmaker but not the plain vmmaker, but the makefile stuff is for the CMake used in the plain interp branch. Sigh. I'll move the code across to the plain vmmaker today and that ought to result in something that would build clean on an ARM unix but likely not non-ARM. Changing the cog makefiles to incorporate the new code…. yuck. autoconf. Help.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Put a lens in each ear and you've got a telescope.
On Tue, May 28, 2013 at 11:49:17AM -0700, tim Rowledge wrote:
In order to make the Raspberry Pi Squeak faster we've built a fairly large chunk of ARM-specific code to implement a good fraction of the BitBlt plugin. It *ought* to speed up some common blits by around 5-10X once it is all fully functioning.
My current problem is integrating it nicely so it doesn't get in the way of 'normal' code. I have fudged the BitBltSimulation code a little to make the relevant connections with, I hope, minimal intrusion and maintained the compile-time configuration that should support Jenkins builds etc. A single -DARM_FAST_BLT in CFLAGS triggers #including a header and taking a branch in copyBits to the specialised code. It's not the most elegant code in the world but the best I could come up with without rewriting everything.
That all works to generate suitable code for the Pi and seems not to mangle anything for other platforms. I'll make sure it works for RISC OS sometime soon, too.
The other issue is where to store the rest of the hand-written code. Right now it's mostly in platforms/Cross/plugins/BitBltPlugin which doesn't really seem right. It's cross platform in the sense of RISC OS & any ARM linux but hardly universally cross-platform. Would it annoy anyone if it stays there? Would naming it BitBltPluginARM make it nicer? There are a couple of other files that *are* platform specific and implement cpu discovery to configure the tricks used in the rest of the code. One is currently in unix/src/vm/intplugins/BitBltPlugin, which certainly isn't suitable. There's also a config.cmake fragment currently in unix/plugins/BitBltPlugin which clearly would need some work to handle non-ARM builds.
Good question. We do not really have a convention for things that work on multiple operating system platforms but that are not general enough to be considered "Cross". As a result, we have tended to accumulate copies of things in the platform directories that might better be shared across several platforms. For example, Ian's unix/vm/aio.c appears also as iOS/vm/Common/aio.c, and there are various plugins that simply make copies of code borrowed from other platform trees.
It might be a good idea to invent some directory structure to accommodate this, but I don't know if it's really necessary. In the case of the ARM specific code, I think the way to find out if it matters is simply to plug the code into the platforms/Cross directories of both oscog and trunk branches (on your own local machine), and see if it breaks the existing build procedures.
If you can do this with the trunk sources, and find that Ian's CMake build still happily produces a Unix VM, and if you can also do it on the branch/oscog sources and find that Eliot's build still works, then it's a good bet that you can safely add the code to Cross and work out the build system annoyances later on.
On the other hand, if either Ian's or Eliot's existing build procedures crash and burn after you add the new Cross/ARM code, then we should probably give the matter a bit more thought.
If you want me to try it with trunk/platforms/Cross for a Unix build, send a copy of the code and I'll let you know if it works.
Oh, the other, other issue is integrating it into the make world for both plain interp and stack/cog. Right now I have the ST code integrated into the cog vmmaker but not the plain vmmaker, but the makefile stuff is for the CMake used in the plain interp branch. Sigh. I'll move the code across to the plain vmmaker today and that ought to result in something that would build clean on an ARM unix but likely not non-ARM. Changing the cog makefiles to incorporate the new code…. yuck. autoconf. Help.
Yes please. I would really, really, really appreciate it if you could keep the trunk files in sync with the oscog files :)
r.e. autoconf, I can offer sympathy but not much in the way of help. I'm not a huge fan of CMake either, but it seems to be less horrible than autoconf was. The only thing I know of that is worse than autoconf or CMake is the obvious - maintaining both of them at the same time ;-)
Dave
tim
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Put a lens in each ear and you've got a telescope.
On 28-05-2013, at 5:03 PM, "David T. Lewis" lewis@mail.msen.com wrote:
If you want me to try it with trunk/platforms/Cross for a Unix build, send a copy of the code and I'll let you know if it works.
OK, here is a zip of a) the files from Ben - it should be fairly obvious where to put them since there is a Cross dir and a platforms/unix dir. There is also a dir currently for the intplugins/BitBltPlugin dir but that needs to move somewhere more sensible. I just can't come up with a place at this time of day. b) my changes to the latest vmmaker-dtl.130(?)
b) also adds a copy of Eliot's hack for making it possible to add
    #ifdef wotsit
    #include "lunacy.h"
    #endif
Add -DARM_FAST_BLT to the CFLAGS to get it to try to use the new code, don't to not. It appears to work for me but I'd be much happier to have external confirmation.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: PBT: Prune Binary Tree
On Tue, May 28, 2013 at 08:43:18PM -0700, tim Rowledge wrote:
On 28-05-2013, at 5:03 PM, "David T. Lewis" lewis@mail.msen.com wrote:
If you want me to try it with trunk/platforms/Cross for a Unix build, send a copy of the code and I'll let you know if it works.
OK, here is a zip of a) the files from Ben - it should be fairly obvious where to put them since there is a Cross dir and a platforms/unix dir. There is also a dir currently for the intplugins/BitBltPlugin dir but that needs to move somewhere more sensible. I just can't come up with a place at this time of day. b) my changes to the latest vmmaker-dtl.130(?)
b) also adds a copy of Eliot's hack for making it possible to add
    #ifdef wotsit
    #include "lunacy.h"
    #endif
Add -DARM_FAST_BLT to the CFLAGS to get it to try to use the new code, don't to not. It appears to work for me but I'd be much happier to have external confirmation.
I started with a fresh platforms tree and VMMaker image, then added Ben's Cross/plugins/BitBltPlugin/* and unix/plugins/BitBltPlugin/config.cmake. I also filed in your BitBltSimulation changes.
I did not file in the code generator changes, because VMMaker already has the following C preprocessor methods (http://bugs.squeak.org/view.php?id=5238):
    Object>>isDefined:inSmalltalk:comment:ifTrue:
    Object>>isDefined:inSmalltalk:comment:ifTrue:ifFalse:
    Object>>isDefinedTrueExpression:inSmalltalk:comment:ifTrue:ifFalse:
    Object>>preprocessorExpression:
    Object>>cPreprocessorDirective:
I compiled a VM for x86_64 Linux from this, both with and without the -DARM_FAST_BLT in CFLAGS. In both cases I got a working VM, and in both cases the code in Cross/plugins/BitBltPlugin was not compiled.
The configure and make commands that I used were:
    $ cd build
    $ ../platforms/unix/cmake/configure --src=../src --CFLAGS='-DARM_FAST_BLT'
    $ make install
Conclusion: For Linux on a non-ARM platform, nothing bad happened when the code was added.
I think that Cross/plugins/BitBltPlugin cries out for some naming convention or new directory structure to distinguish ARM from other platforms. I'm not sure what to suggest just now...
Regarding the code, is this available under MIT license? And who is this Ben guy anyhow ;-)
Dave
On 29-05-2013, at 3:57 AM, "David T. Lewis" lewis@mail.msen.com wrote:
I started with a fresh platforms tree and VMMaker image, then added Ben's Cross/plugins/BitBltPlugin/* and unix/plugins/BitBltPlugin/config.cmake. I also filed in your BitBltSimulation changes.
I did not file in the code generator changes, because VMMaker already has the following C preprocessor methods (http://bugs.squeak.org/view.php?id=5238):
Ah, excellent. I'll rewrite to make use of these. Um, so did you change the code to make use of them already?
    Object>>isDefined:inSmalltalk:comment:ifTrue:
    Object>>isDefined:inSmalltalk:comment:ifTrue:ifFalse:
    Object>>isDefinedTrueExpression:inSmalltalk:comment:ifTrue:ifFalse:
    Object>>preprocessorExpression:
    Object>>cPreprocessorDirective:
I compiled a VM for x86_64 Linux from this, both with and without the -DARM_FAST_BLT in CFLAGS. In both cases I got a working VM, and in both cases the code in Cross/plugins/BitBltPlugin was not compiled.
I think that is correct since the cmake fragment appears to explicitly check for arm
The configure and make commands that I used were:
    $ cd build
    $ ../platforms/unix/cmake/configure --src=../src --CFLAGS='-DARM_FAST_BLT'
    $ make install
Conclusion: For Linux on a non-ARM platform, nothing bad happened when the code was added.
Sounds like a good start then.
I think that Cross/plugins/BitBltPlugin cries out for some naming convention or new directory structure to distinguish ARM from other platforms. I'm not sure what to suggest just now…
Me neither. Suggestions from the peanut gallery welcomed.
Regarding the code, is this available under MIT license? And who is this Ben guy anyhow ;-)
It's certainly intended to be acceptable to the main tree and the license verbiage looks MIT-ish to me. If it isn't adequate, I doubt there will be more than a five second conversation required to change it.
Ben is an ex-Acorn guy in the UK that happened to write the pixman Pi optimisations as well as quite a bit of RISC OS. As you will notice if you read the code, he is rather good at neat but devious code that does clever things to ARMs. For some reason despite living in Cambridge (the real one, not the one in the US) he works on PDT. He has a slightly dodgy beard.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: IIB: Ignore Inquiry and Branch anyway
On Wed, May 29, 2013 at 11:26:50AM -0700, tim Rowledge wrote:
On 29-05-2013, at 3:57 AM, "David T. Lewis" lewis@mail.msen.com wrote:
I started with a fresh platforms tree and VMMaker image, then added Ben's Cross/plugins/BitBltPlugin/* and unix/plugins/BitBltPlugin/config.cmake. I also filed in your BitBltSimulation changes.
I did not file in the code generator changes, because VMMaker already has the following C preprocessor methods (http://bugs.squeak.org/view.php?id=5238):
Ah, excellent. I'll rewrite to make use of these. Um, so did you change the code to make use of them already?
No, I did not change the code at all. FYI there is an equivalent but different implementation of some of these directives in the Cog branch, so we'll need to modify accordingly (no big deal, just letting you know).
    Object>>isDefined:inSmalltalk:comment:ifTrue:
    Object>>isDefined:inSmalltalk:comment:ifTrue:ifFalse:
    Object>>isDefinedTrueExpression:inSmalltalk:comment:ifTrue:ifFalse:
    Object>>preprocessorExpression:
    Object>>cPreprocessorDirective:
I compiled a VM for x86_64 Linux from this, both with and without the -DARM_FAST_BLT in CFLAGS. In both cases I got a working VM, and in both cases the code in Cross/plugins/BitBltPlugin was not compiled.
I think that is correct since the cmake fragment appears to explicitly check for arm
The configure and make commands that I used were:
    $ cd build
    $ ../platforms/unix/cmake/configure --src=../src --CFLAGS='-DARM_FAST_BLT'
    $ make install
Conclusion: For Linux on a non-ARM platform, nothing bad happened when the code was added.
Sounds like a good start then.
I think that Cross/plugins/BitBltPlugin cries out for some naming convention or new directory structure to distinguish ARM from other platforms. I'm not sure what to suggest just now…
Me neither. Suggestions from the peanut gallery welcomed.
Yes please ... ideas anyone?
Regarding the code, is this available under MIT license? And who is this Ben guy anyhow ;-)
It's certainly intended to be acceptable to the main tree and the license verbiage looks MIT-ish to me. If it isn't adequate, I doubt there will be more than a five second conversation required to change it.
Ben is an ex-Acorn guy in the UK that happened to write the pixman Pi optimisations as well as quite a bit of RISC OS. As you will notice if you read the code, he is rather good at neat but devious code that does clever things to ARMs. For some reason despite living in Cambridge (the real one, not the one in the US) he works on PDT. He has a slightly dodgy beard.
Great :)
I think that we do need an explicit statement that code is MIT licensed. The people who care about that stuff care about it a lot, and it messes things up for e.g. Linux distro maintainers if we don't have explicit MIT licensing declared.
Dave
On 29-05-2013, at 1:31 PM, "David T. Lewis" lewis@mail.msen.com wrote:
I think that we do need an explicit statement that code is MIT licensed. The people who care about that stuff care about it a lot, and it messes things up for e.g. Linux distro maintainers if we don't have explicit MIT licensing declared.
I was taking a look around the code currently in the tree and a lot of it has no mention of licensing, or somewhat variant looking licensy words. My RISC OS files appear to be the only ones with consistent mentions of MIT-L and that's only because I ran through them all earlier this year!
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful Latin Phrases:- Braccae illae virides cum subucula rosea et tunica Caledonia-quam elenganter concinnatur! = Those green pants go so well with that pink shirt and the plaid jacket!
On Wed, May 29, 2013 at 02:50:28PM -0700, tim Rowledge wrote:
On 29-05-2013, at 1:31 PM, "David T. Lewis" lewis@mail.msen.com wrote:
I think that we do need an explicit statement that code is MIT licensed. The people who care about that stuff care about it a lot, and it messes things up for e.g. Linux distro maintainers if we don't have explicit MIT licensing declared.
I was taking a look around the code currently in the tree and a lot of it has no mention of licensing, or somewhat variant looking licensy words. My RISC OS files appear to be the only ones with consistent mentions of MIT-L and that's only because I ran through them all earlier this year!
There is no need for license statements in the code files themselves, but we do need to have a statement (in an email) from the author and/or copyright holder to the effect that "this code is released under MIT license".
The past may have been murky, but the situation now is clear - if it is going into the code base it needs to be MIT licensed, and we need to be able to plausibly claim that this is so. I know that lots of folks (myself included) don't much care about this, but out of regard for the folks who do care, we need to be good custodians of the license integrity.
Please ask Ben ____ to send you an email to the effect that "I wrote this stuff and I hereby release it under MIT license for use in the VM". Forward the message to the list, and we'll call it a good day.
Dave
On 30-05-2013, at 1:30 AM, Steve Rees squeak-vm-dev@vimes.worldonline.co.uk wrote:
On 29/05/2013 19:26, tim Rowledge wrote:
For some reason despite living in Cambridge (the real one, not the one in the US) he works on PDT. He has a slightly dodgy beard.
Is there any other kind?
Yes. The *very* dodgy kind. Those silly little halfway between stubble and a beard, shaved at the edges into odd pseudo-gang sign shapes. They remind me somehow of… java. Not quite one thing nor another, a failed attempt at coolness that falls short in every important dimension
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Thinks everyone else is entitled to his opinion, like it or not.
Mine actually grows like that :(
On Thu, May 30, 2013 at 9:50 AM, tim Rowledge tim@rowledge.org wrote:
On 30-05-2013, at 1:30 AM, Steve Rees squeak-vm-dev@vimes.worldonline.co.uk wrote:
On 29/05/2013 19:26, tim Rowledge wrote:
For some reason despite living in Cambridge (the real one, not the one in the US) he works on PDT. He has a slightly dodgy beard.
Is there any other kind?
Yes. The *very* dodgy kind. Those silly little halfway between stubble and a beard, shaved at the edges into odd pseudo-gang sign shapes. They remind me somehow of… java. Not quite one thing nor another, a failed attempt at coolness that falls short in every important dimension
tim
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Thinks everyone else is entitled to his opinion, like it or not.
Oh my, this is turning out to be fun with a capital 'A'.
I'm not at all sure Dave's tests the other day tell us anything much; it turned out that Ben forgot to tell me of the need to call an initialiser function! Without that the code would always use the fallback code and thus ought to work on non-ARM, so simply checking that it didn't blow up doesn't tell us so much. Sigh.
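To illustrate why a missing initialiser call can go unnoticed, here is a small C sketch of a dispatcher that defaults to the fallback routine until it is initialised. It is an assumption about the general shape of such code, not Ben's implementation; every name in it (blitFallback, blitFast, initialiseFastBlit, doBlit) is hypothetical.

```c
/* Sketch only: all names here are hypothetical. The point is that a
 * dispatcher left uninitialised keeps working, just slowly. */

typedef int (*blit_fn)(void);

static int blitFallback(void) { return 0; }   /* portable path */
static int blitFast(void)     { return 1; }   /* specialised path */

/* Until the initialiser runs, every blit is routed to the fallback. */
static blit_fn currentBlit = blitFallback;

/* cpuHasFastPath would come from the platform's CPU discovery code. */
void initialiseFastBlit(int cpuHasFastPath)
{
    currentBlit = cpuHasFastPath ? blitFast : blitFallback;
}

/* Returns 1 if the fast path was used, 0 if the fallback was used. */
int doBlit(void)
{
    return currentBlit();
}
```

With this shape, a build that never calls initialiseFastBlit() behaves exactly like a build without the new code, so "it didn't blow up" says nothing about the fast path itself.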
Now I have the initialiser set up there are other problems like the code not actually doing quite what we thought. That's going to be fun to fix.
Integrating into the BitBltPlugin code has been interesting. We need to selectively:
a) include a header file,
b) choose a branch to the new code or not,
c) hide or expose references to a struct defined in the new header file,
d) use or not use the make-related fragments,
e) use the appropriate file for getting the OS to tell us about the architecture in use.
a) might possibly be best done by simply adding the header file, including it always and relying on the fact that the relevant code simply won't be called, and would likely get dropped when the unix libtool does its work. That has the attraction of not needing an ugly hack that lets me add a header file named
    '#ifdef ARM_FAST_BLT
    #include "BitBltDispatch.h"
    #else
    // to handle the unavoidable decl in the spec of copyBitsFallback();
    #define operator_t void
    #endif'
That 'operator_t' is the struct mentioned in c). We could wrap much of the header file contents in a suitable #ifdef ARM_FAST_BLT of course.
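A rough C sketch of the "always include the header, guard its contents" option. The operator_t name, the void stand-in and copyBitsFallback come from the text above; the struct field, the pointer parameter and the call counter are assumptions made up for illustration, since the real BitBltDispatch.h is not shown here.

```c
/* ---- what an always-included header might contain (sketch) ---- */
#ifndef BITBLT_DISPATCH_SKETCH_H
#define BITBLT_DISPATCH_SKETCH_H

#ifdef ARM_FAST_BLT
typedef struct {
    int combinationRule;    /* hypothetical field, for illustration only */
} operator_t;
#else
/* Stand-in so declarations that mention operator_t still compile. */
#define operator_t void
#endif

#endif /* BITBLT_DISPATCH_SKETCH_H */

/* ---- what the generated plugin code might then look like ---- */
static int fallbackCalls = 0;   /* hypothetical counter, for the example */

/* Assumed signature: the thread only says the declaration mentions operator_t. */
void copyBitsFallback(operator_t *op)
{
    (void)op;               /* unused in this sketch */
    fallbackCalls++;
}
```

Either way the declaration of copyBitsFallback compiles, so the generated code would not need its own #ifdef around the include.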
d) is completely outside my knowledge. If we add a cmake fragment for bitblt and don't need some of it when there is no ARM_FAST_BLT, what do we need to do?
e) ought to be handled by putting the BitBltArmLinux.c file in the unix platform branch as usual and a derivative of BitBltArmOther.c in suitable places. I imagine this code will be of use should someone do a WindowsRT port, too.
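For readers wondering what "getting the OS to tell us about the architecture" can look like on Linux, here is a hedged C sketch that scans /proc/cpuinfo-style text for a named entry on the "Features" line. This is an assumption about one possible approach, not a description of what BitBltArmLinux.c actually does; featuresLineHas and cpuHasFeature are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if `feature` appears as a whole word on a "Features : ..." line. */
int featuresLineHas(const char *line, const char *feature)
{
    size_t flen = strlen(feature);
    const char *p;

    if (flen == 0 || strncmp(line, "Features", 8) != 0)
        return 0;
    p = strchr(line, ':');
    if (p == NULL)
        return 0;
    p++;
    while (*p != '\0') {
        size_t len;
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;                        /* skip separators */
        len = strcspn(p, " \t\n");      /* length of the next word */
        if (len == flen && strncmp(p, feature, flen) == 0)
            return 1;
        p += len;
    }
    return 0;
}

/* Scans a cpuinfo-style file; returns 0 if it cannot be read. */
int cpuHasFeature(const char *path, const char *feature)
{
    char line[512];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return 0;
    while (!found && fgets(line, sizeof line, f) != NULL)
        found = featuresLineHas(line, feature);
    fclose(f);
    return found;
}
```

A caller on ARM Linux might use cpuHasFeature("/proc/cpuinfo", "vfp") and feed the result to whatever initialiser configures the fast paths; other operating systems would need their own equivalent, which is why this piece belongs in the per-platform trees.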
Other fun issues- I've had to make quite separate versions for trunk and Cog trees. Eliot uses cppIf:ifTrue: etc and there is no support for that in the trunk VMMaker code. I don't have time to fix that right now. Eliot has things set up to generate code as generally as possible (still can't do RISC OS suitable code though!) so that choices of platform and global-struct or not etc can be made at compile time. It doesn't look like the same is true of trunk.
Current changes from trunk VMMaker - I'm really not sure it is ready to commit yet. I don't want to break everyone else's builds...
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Computer possessed? Try DEVICE=C:\EXOR.SYS
Looking to see why an important method is being removed from the translated code and I note, with due irony, that CCodeGenerator>>unreachableMethods has no senders….
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Fractured Idiom:- VISA LA FRANCE - Don't leave chateau without it
On 29-05-2013, at 3:57 AM, "David T. Lewis" lewis@mail.msen.com wrote:
    Object>>isDefined:inSmalltalk:comment:ifTrue:
    Object>>isDefined:inSmalltalk:comment:ifTrue:ifFalse:
    Object>>isDefinedTrueExpression:inSmalltalk:comment:ifTrue:ifFalse:
    Object>>preprocessorExpression:
    Object>>cPreprocessorDirective:
Those last two seem to rather pointlessly do almost exactly the same thing, differing only in whether the initial $# is provided by the user or the system. Cog-world seems to have only the cPreprocessorDirective: version and no uses of it; plain-interp-world has both and has no uses of cPre… and a single 'real' use of preprocessorExpression: and a single 'testing' usage.
There's also a bunch of related code in FT2PluginCodeGenerator where it adds its own translation fooblies.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: PBF: Pay Bus Fare
On Wed, May 29, 2013 at 01:07:59PM -0700, tim Rowledge wrote:
On 29-05-2013, at 3:57 AM, "David T. Lewis" lewis@mail.msen.com wrote:
    Object>>isDefined:inSmalltalk:comment:ifTrue:
    Object>>isDefined:inSmalltalk:comment:ifTrue:ifFalse:
    Object>>isDefinedTrueExpression:inSmalltalk:comment:ifTrue:ifFalse:
    Object>>preprocessorExpression:
    Object>>cPreprocessorDirective:
Those last two seem to rather pointlessly do almost exactly the same thing, differing only in whether the initial $# is provided by the user or the system. Cog-world seems to have only the cPreprocessorDirective: version and no uses of it; plain-interp-world has both and has no uses of cPre… and a single 'real' use of preprocessorExpression: and a single 'testing' usage.
Indeed they are pointlessly different. The last one (#cPreprocessorDirective:) is one of Eliot's variants that I put into trunk for convenience in merging code from oscog into trunk.
There's also a bunch of related code in FT2PluginCodeGenerator where it adds its own translation fooblies.
I think that comes from an externally managed repository (not VMM trunk/oscog). I have not paid much attention to it, but as far as I know the added foibles seem to be working.
Dave
tim
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: PBF: Pay Bus Fare
vm-dev@lists.squeakfoundation.org