[squeak-dev] Max source method length? Max string length? Max change set size?

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Wed May 18 17:18:02 UTC 2011


I implemented some of these changes in VW 20 years ago, because I
was generating code from symbolic expressions (computer algebra
system).
But it was easier because a BlockClosure is a literal in VW, so you
just have to turn off the optimisation of long blocks.
For literals and temps, I created Arrays of literals and temps as
Bert suggested. That means that some message sends were replaced with
perform: operations.
But there is more: even the integer index used to access the
literal/temp Array can itself be a literal (the limit depends on
the bytecode set).
One trick is to generate Arrays of Arrays of Arrays ... each small
enough to be indexed with a literal-free integer (BEWARE: at the
expense of stack depth when it comes to evaluating).
I chose another way: generate an expression computing literal indices
from byte-code encoded smaller integers. Funny.
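
To illustrate the last idea, here is a minimal Python sketch (not the
original VW code). It assumes a hypothetical limit MAX_INLINE = 127 for
the largest integer the bytecode set can push without using a literal
slot; the real limit depends on the bytecode set. The function name
index_expression is made up for this example. The expression is fully
parenthesised, so it reads the same under Smalltalk's left-to-right
binary message evaluation.

```python
# Hypothetical limit: largest integer encodable without a literal slot.
MAX_INLINE = 127


def index_expression(n, max_inline=MAX_INLINE):
    """Return an arithmetic expression (as text) that evaluates to n
    while mentioning only integers in the range 0..max_inline."""
    if n < 0:
        raise ValueError("index must be non-negative")
    if max_inline < 2:
        raise ValueError("max_inline must be at least 2")
    if n <= max_inline:
        # Small enough to be pushed directly, no literal needed.
        return str(n)
    # Decompose n = quotient * max_inline + remainder and recurse on
    # the quotient, which may itself be too large to push directly.
    quotient, remainder = divmod(n, max_inline)
    head = f"({index_expression(quotient, max_inline)} * {max_inline})"
    if remainder == 0:
        return head
    return f"({head} + {remainder})"
```

With these assumptions, index_expression(1000) gives
"((7 * 127) + 111)". The cost is a few extra arithmetic sends per
access instead of one more literal slot.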

For the number of arguments I also passed an Array of arguments
(a generalisation of the temps trick). I have a patch pending in Mantis
which can be applied to Squeak.
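
The temps/arguments trick can be sketched as a slot allocation plan.
This is only an illustration in Python, not the Mantis patch: the name
plan_slots and the ("slot", i) / ("spill", j) tags are invented here,
and the default of 64 slots is taken from the temp limit Bert mentions
below. When there are too many variables, the last real slot is
reserved for an Array and the overflow variables become 1-based
indices into it.

```python
def plan_slots(names, max_slots=64):
    """Map each variable name to ("slot", i) or ("spill", j).

    ("slot", i)  -> the i-th real temp/argument slot (1-based)
    ("spill", j) -> element j (1-based) of the spill Array, which
                    itself lives in the last real slot (max_slots)
    Names are assumed to be unique.
    """
    if max_slots < 1:
        raise ValueError("max_slots must be at least 1")
    if len(names) <= max_slots:
        # Everything fits: no spill Array, no penalty.
        return {name: ("slot", i) for i, name in enumerate(names, start=1)}
    # Reserve the last real slot for the spill Array.
    direct = max_slots - 1
    plan = {name: ("slot", i)
            for i, name in enumerate(names[:direct], start=1)}
    for j, name in enumerate(names[direct:], start=1):
        plan[name] = ("spill", j)
    return plan
```

A code generator would then emit a plain temp access for "slot"
entries and an at:/at:put: on the spill Array for "spill" entries.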

For stack depth, I don't remember if I ever hit the limit, nor what
this limit is in VW. That's IMO a big problem in current Squeak.

All in all, hacking the Compiler is doable. But the bad news is that
you will also have to hack the Decompiler and the Debugger... Much harder
in the current Squeak architecture (maybe worth a full rewrite in this
case).

Nicolas

2011/5/18 Bert Freudenberg <bert at freudenbergs.de>:
> On 18.05.2011, at 05:28, Igor Stasenko wrote:
>
>> Aha, so you're talking not about code directly authored by humans but
>> rather about indirectly/automatically generated code,
>> which, like I said, is a form of abuse because it actually turns
>> source code into data storage (and of course sometimes
>> it is hard to invent something better, but that doesn't make it any less of an abuse ;).
>>
>> I would not bother about limits, because it is not a big deal: in your
>> framework you could always detect if a method's size surpasses a certain
>> reasonable limit, then simply split it into a number of smaller
>> methods and generate a root method to invoke them one by one in
>> order to initialize things in a specific order.
>
> Of course you can do that, it's just more work.
>
> I ran into the limits of the max jump distance being 1024 bytecodes, and the number of temps being limited to 64. That meant I had to split into different methods. That meant I had to figure out how to pass the intermediates into the next method. Also, both cases of a conditional must be in a single method, so each branch needs to spill over separately. Etc.
>
> If I had created the code generator from scratch I would have designed it to not run into the limits in the first place. But instead I modified a code generator that used to output C. And a C function has virtually no limits, at least compared to a Squeak method.
>
> I wonder if compiler magic could remove the limits without having to change the VM. Like, if there are more than 63 temps, make the 64th temp be an array to hold the spillover temps. Same for literals, method arguments, block arguments, inst vars. For large jumps, do a series of unconditional 1024 byte jumps. The most severe limit might be stack depth, it might have to reify the stack into an array. The hard limit on CompiledMethod size would be 1 GB, because the PC is a positive SmallInt?
>
> The benefit would be that for reasonably sized methods there would be no penalty at all, but there would not be artificially low limits either when you happen to do something unreasonable :)
>
> I can see how this would work for the interpreter. And since bytecode semantics would be unchanged, it should even work for Cog? Of course, it rarely translates large methods anyway.
>
> - Bert -
>
>
>


