[squeak-dev] Max source method length? Max string length? Max change set size?

Wed May 18 09:41:08 UTC 2011

On 18.05.2011, at 05:28, Igor Stasenko wrote:

> Aha, so you're talking not about code directly authored by humans but
> rather about indirectly/automatically generated code,
> which, like I said, is a form of abuse because it actually turns
> source code into data storage (and of course sometimes
> it is hard to invent something better, but that doesn't make it any less of an abuse ;).
> 
> I would not bother about limits, because it is not a big deal: in your
> framework you could always detect if a method's size surpasses a certain
> reasonable limit, then simply split it into a number of smaller
> methods and generate a root method that invokes them one by one to
> initialize things in a specific order.

Of course you can do that, it's just more work.

I ran into two limits: the max jump distance of 1024 bytecodes, and the number of temps being limited to 64. That meant I had to split the generated code into different methods, which in turn meant I had to figure out how to pass the intermediates into the next method. Also, both cases of a conditional must be in a single method, so each branch needs to spill over separately. Etc.
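
To make the splitting approach concrete, here is a minimal sketch in Python that chunks a flat list of generated statements into several Smalltalk methods plus a root method that calls them in order. It assumes the statements only touch instance variables, so no intermediates need to be passed between chunks, which is exactly the part that gets hard in practice. The name `split_into_methods`, the `initPart` prefix, and the use of a statement count as a stand-in for the real bytecode limits are all hypothetical.

```python
def split_into_methods(statements, max_per_method, prefix="initPart"):
    """Return a dict mapping selector -> Smalltalk method source.

    Hypothetical sketch: a statement count stands in for the real
    bytecode and temp limits, and the statements are assumed to share
    state only through instance variables.
    """
    # Cut the flat statement list into chunks of at most max_per_method.
    chunks = [statements[i:i + max_per_method]
              for i in range(0, len(statements), max_per_method)]

    methods = {}
    for n, chunk in enumerate(chunks, start=1):
        selector = f"{prefix}{n}"
        body = "\n".join(f"\t{stmt}." for stmt in chunk)
        methods[selector] = f"{selector}\n{body}"

    # The root method invokes the chunk methods one by one, in order.
    root_body = "\n".join(f"\tself {selector}." for selector in methods)
    methods["initialize"] = f"initialize\n{root_body}"
    return methods


if __name__ == "__main__":
    generated = ["a := 1", "b := 2", "c := a + b"]
    for source in split_into_methods(generated, 2).values():
        print(source)
        print()
```

Once a chunk needs a temp computed in an earlier chunk, or a conditional straddles a chunk boundary, this simple scheme no longer suffices, and that is where the extra work comes from.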

If I had created the code generator from scratch I would have designed it to not run into the limits in the first place. But instead I modified a code generator that used to output C. And a C function has virtually no limits, at least compared to a Squeak method.

I wonder if compiler magic could remove the limits without having to change the VM. For example, if there are more than 63 temps, make the 64th temp an array that holds the spillover temps. The same could work for literals, method arguments, block arguments, and inst vars. For large jumps, do a series of unconditional 1024-byte jumps. The most severe limit might be stack depth; the compiler might have to reify the stack into an array. The hard limit on CompiledMethod size would then be 1 GB, because the PC is a positive SmallInteger?
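
The temp spillover idea could be sketched roughly like this, in Python for brevity. This is only an illustration of the slot assignment, not compiler code: `allocate_temps` and the `("temp", i)` / `("spill", j)` tuples are hypothetical, indices are 1-based as in a Squeak Array, and the sketch only starts spilling once the count actually exceeds the limit.

```python
MAX_TEMPS = 64  # per-method temp limit as quoted in this message


def allocate_temps(names, max_temps=MAX_TEMPS):
    """Map each temp name to ("temp", slot) or ("spill", index).

    Hypothetical sketch: when everything fits, each temp gets its own
    slot. Otherwise the last slot is reserved for a spill array and
    the overflow temps become 1-based indices into that array.
    """
    if len(names) <= max_temps:
        # Reasonably sized method: no spill array, no penalty.
        return {name: ("temp", i) for i, name in enumerate(names, start=1)}

    direct = max_temps - 1  # the last slot holds the spill array itself
    layout = {}
    for i, name in enumerate(names, start=1):
        if i <= direct:
            layout[name] = ("temp", i)
        else:
            layout[name] = ("spill", i - direct)
    return layout


if __name__ == "__main__":
    names = [f"t{i}" for i in range(1, 67)]  # 66 temps, two over the limit
    layout = allocate_temps(names)
    print(layout["t63"], layout["t64"], layout["t66"])
```

A compiler doing this would then turn every access to a spilled temp into an `at:` or `at:put:` on the array held in the last slot, so only oversized methods pay for the indirection.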

The benefit would be that for reasonably sized methods there would be no penalty at all, but there would not be artificially low limits either when you happen to do something unreasonable :)

I can see how this would work for the interpreter. And since bytecode semantics would be unchanged, it should even work for Cog? Of course, Cog rarely translates large methods anyway.

- Bert -