Integrating faster bitblt into iOS StackVM WAS: [Vm-dev] unit code for new faster bitblt

tim Rowledge tim at
Wed Jul 17 16:54:39 UTC 2013

On 17-07-2013, at 6:36 AM, Esteban Lorenzano <estebanlm at> wrote:

> Hi Tim,
> I'm trying to integrate your code and I have some doubts (very simple ones, as you will see :). 

Excellent; Eliot & I were just trying to do some of this on Monday. Clearly we should try to get together.

> So far, I have integrated your code into CogVM and StackVM by enabling ENABLE_FAST_BLT and adding BitBltGeneric.c and BitBltDispatch.c to the code base.
> (btw, it compiles, but I really don't know whether something is missing or how to measure the improvement).

For non-ARM platforms I have no idea whether there will be any benefit to using ENABLE_FAST_BLT. I rather suspect that modern desktop & laptop machines have such large caches and such wide memory buses that there isn't any meaningful bottleneck to work around; by contrast the Raspberry Pi has a narrow and slow memory bus and a whole, massive, amazing 32 KB or so of cache. Pre-fetching cache lines whilst doing other stuff can pay off handsomely.
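To illustrate the pre-fetching idea in portable terms, here is a minimal C sketch. It is not taken from the fast bitblt sources (the real work there is done in the ARM assembler files); the function name copy_words_prefetch and the look-ahead distance are hypothetical, and it assumes a GCC- or Clang-compatible compiler that provides __builtin_prefetch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only; not part of the fast bitblt code. */
enum { PREFETCH_AHEAD_WORDS = 16 };  /* assumed look-ahead: 64 bytes */

/* Copy n 32-bit words, hinting the cache about source words needed soon. */
void copy_words_prefetch(uint32_t *dst, const uint32_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        /* A hint, not a load: it asks for a later cache line while the
           current word is being copied. Args: address, 0 = read, 0 = low
           temporal locality (the data is used once). */
        if (i + PREFETCH_AHEAD_WORDS < n)
            __builtin_prefetch(&src[i + PREFETCH_AHEAD_WORDS], 0, 0);
        dst[i] = src[i];
    }
}
```

Whether a hint like this helps depends entirely on the memory system; on a machine with large caches and hardware prefetchers it may make no measurable difference, which is the point made above.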

BitBltGeneric and BitBltDispatch are generic (duh) C files that ought to work ok on any machine. They really ought to be compiled only when the fastblt is enabled when running configure. They shouldn't cause a problem even if compiled, since the linking/library phase should just drop everything that is unused. Do all library/linking programs do it perfectly? Only in Harry Potter and the Perfect Linker.
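One way to keep the fast path out of builds that did not ask for it is a preprocessor guard on ENABLE_FAST_BLT. The sketch below only shows the shape of such a guard; every function and type name in it is a hypothetical stand-in, not the real entry points in BitBltDispatch.c or the plugin.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real bitblt entry points. */
enum blt_path { BLT_PATH_PLAIN = 0, BLT_PATH_FAST = 1 };

enum blt_path plain_copy_bits(void) { return BLT_PATH_PLAIN; }

#ifdef ENABLE_FAST_BLT
/* Compiled only when configure defined ENABLE_FAST_BLT. */
enum blt_path fast_copy_bits(void) { return BLT_PATH_FAST; }
#endif

/* Reports whether the fast path was built in. */
int fast_blt_compiled_in(void)
{
#ifdef ENABLE_FAST_BLT
    return 1;
#else
    return 0;
#endif
}

/* Single entry point: the choice of path is made at compile time. */
enum blt_path copy_bits_entry(void)
{
#ifdef ENABLE_FAST_BLT
    return fast_copy_bits();
#else
    return plain_copy_bits();
#endif
}
```

With a guard like this nothing relies on the linker discarding unused objects, because the fast code is never compiled in the first place.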

BitBltArm.c ought not to be used unless the target cpu is an ARM. BitBltArmLinux.c is for Linux/ARM machines, to discover exactly which version of an ARM it is at runtime. BitBltArmOther.c is a stub for non-Linux systems; fill it out as needed.

The .s files need to be assembled with 'asasm', NOT gas. There is certainly an argument for converting them to be gas compatible but the original author ain't interested; he prefers the cleaner syntax and better macros, among other reasons. asasm is open source and available, so it shouldn't be any more of a problem than the fact that you need to load cmake for the plain interpreter or autoconf for cog.

> Now, I want to extend the code to the iOS vm's (for example, for DrGeo2 would be great), and since iOS is a FreeBSD over an ARM, I do not see why it should not work more or less out of the box :)

Except for needing to get asasm, it should be no problem. You may need to do some fiddling to replicate the detectCpuFeatures() function.
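Since an iOS build targets a known set of CPUs, one possible starting point for a BitBltArmIOS.c is to derive the feature flags at compile time instead of probing at runtime as the Linux file does. The sketch below is an assumption-laden illustration: the flag names and the function detect_cpu_features_static are hypothetical, and the real detectCpuFeatures() signature and flag set are not shown in this thread. The only things it relies on are the compiler-predefined macros __ARM_ARCH and __ARM_NEON (or the older __ARM_NEON__).

```c
#include <assert.h>

/* Hypothetical flag values; the real fast bitblt code defines its own. */
enum {
    CPU_FEATURE_ARMV6 = 1u << 0,
    CPU_FEATURE_NEON  = 1u << 1
};

/* Compile-time stand-in for a runtime probe: returns the features the
   compiler was told the target supports. Returns 0 on non-ARM targets. */
unsigned detect_cpu_features_static(void)
{
    unsigned flags = 0u;
#if defined(__ARM_ARCH) && (__ARM_ARCH >= 6)
    flags |= CPU_FEATURE_ARMV6;
#endif
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    flags |= CPU_FEATURE_NEON;
#endif
    return flags;
}
```

A runtime query would still be needed if one binary has to run on devices with different capabilities; this version only reflects what the build was compiled for.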

> And well... my doubt is which files should I include, since there are various there :)

You'll need all of them except BitBltArmLinux & BitBltArmOther, plus you'll need to write BitBltArmIOS.c. And if you get really ambitious you might want to add still more special cases. It also seems likely to me that some large chunk of the marshalling of parameters for the bitblt could be done better and faster; for small-area bitblts I suspect that it overwhelms the time taken to move bits around.

> (btw.. since they are not really "cross", shouldn't they be in other platform directories?)

It's a tricky one. We don't have any clean way to handle cpu differentiation in the tree.

What *might* be a good idea would be to have a Cross/plugins/FastARMBitBltPlugin directory and make things work such that it is used instead of the plain bitblt code when configured.

tim Rowledge; tim at;
Objects are closer than they appear.
