[Vm-dev] Imminent change to Spur image format

Mon Aug 11 19:08:08 UTC 2014

2014-08-11 19:59 GMT+02:00 Eliot Miranda <eliot.miranda at gmail.com>:

>
> Hi Bert,
>
> On Mon, Aug 11, 2014 at 10:41 AM, Bert Freudenberg <bert at freudenbergs.de>
> wrote:
>
>>
>> On 11.08.2014, at 18:16, Clément Bera <bera.clement at gmail.com> wrote:
>>
>> 2014-08-11 17:37 GMT+02:00 Bert Freudenberg <bert at freudenbergs.de>:
>>
>>>
>>> On 09.08.2014, at 00:46, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>>>
>>> > Hi All,
>>> >
>>> >     as part of the Newspeak infrastructure we use at Cadence I
>>> implemented multiple bytecode set support and a lifting of the limits in a
>>> method on the number of literals and the span of branches about two years
>>> ago.  This work involved adding a second interpretation to the bits in a
>>> method header, providing 16 bits of literal count.  This was done by moving
>>> the primitive number out of the method header and into an optional
>>> callPrimitive bytecode, being the first bytecode of methods that have
>>> primitives.
>>> >
>>> > Now in Spur I have the opportunity to use this expanded format for the
>>> existing bytecode set as well.  The SqueakV3 set does not use bytecode 139,
>>> which is convenient to use for its callPrimitiveBytecode.  The advantage is
>>> that when and if a new bytecode set is added, as is planned for the Sista
>>> VMs, the VM will not have to test method headers to decide which format
>>> they're in, because there will only be one.
>>>
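As a rough illustration of the scheme described above (an editorial sketch, not VM code): once the primitive number lives in a leading callPrimitive bytecode, recovering it means looking at the first bytes of the method rather than at the header. The message only says that callPrimitive is bytecode 139 and is the first bytecode of methods that have primitives. The function name `primitive_of`, the `has_primitive` argument, and the operand layout (two bytes after the opcode, low byte first) are assumptions made for this example.

```python
CALL_PRIMITIVE = 139  # bytecode unused by the SqueakV3 set, per the message


def primitive_of(has_primitive, bytecodes):
    """Return the primitive number of a method, or 0 if it has none.

    Assumption (not stated in the message): the callPrimitive bytecode is
    followed by two operand bytes holding the primitive number, low byte first.
    """
    if not has_primitive:
        return 0
    if len(bytecodes) < 3 or bytecodes[0] != CALL_PRIMITIVE:
        raise ValueError("a primitive method must start with callPrimitive")
    return bytecodes[1] | (bytecodes[2] << 8)


# Hypothetical method body: callPrimitive 600, followed by one more bytecode.
method_bytes = bytes([139, 0x58, 0x02, 0x7C])
print(primitive_of(True, method_bytes))
```

With the primitive number out of the header, the 9+1 bits it used to occupy are free, which is what makes room for the 16-bit literal count.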
>>> Just curious: how does the VM know which bytecode set to use for a given
>>> method?
>>>
>>> A bit is set or not in the compiled method header.
>>
>>
>> But Eliot wrote "the VM will not have to test method headers"?
>>
>
> Right.  For a while the Newspeak VMs have supported two bytecode sets with
> two different header formats, the old format:
>
> sign bit 0, header >= 0:
>   (index 0)      9 bits: main part of primitive number (#primitive)
>   (index 9)      8 bits: number of literals (#numLiterals)
>   (index 17)     1 bit:  whether a large frame size is needed (#frameSize)
>   (index 18)     6 bits: number of temporary variables (#numTemps)
>   (index 24)     4 bits: number of arguments to the method (#numArgs)
>   (index 28)     1 bit:  high-bit of primitive number (#primitive)
>   (index 29)     1 bit:  flag bit, ignored by the VM (#flag)
>   (index 30/63)  sign bit: 0 selects the Primary instruction set (#signFlag)
> sign bit 1, header < 0:
>   (index 0)     16 bits: number of literals (#numLiterals)
>   (index 16)     1 bit:  has primitive
>   (index 17)     1 bit:  whether a large frame size is needed (#frameSize)
>   (index 18)     6 bits: number of temporary variables (#numTemps)
>   (index 24)     4 bits: number of arguments to the method (#numArgs)
>   (index 28)     2 bits: reserved for an access modifier
>                          (00-unused, 01-private, 10-protected, 11-public)
>   (index 30/63)  sign bit: 1 selects the Secondary instruction set
>                          (e.g. NewsqueakV4) (#signFlag)
> i.e. the Secondary Bytecode Set expands the number of literals to 65535 by
> assuming a CallPrimitive bytecode.
>
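The two layouts above can be read off with plain shifts and masks. The following is a minimal sketch, assuming the header is available as a signed integer (negative when the sign bit is set) and that the "high-bit of primitive number" is bit 9 of the primitive. The names `field` and `decode_header` and the dictionary keys are made up for the example and are not VM identifiers.

```python
def field(header, index, width):
    """Extract `width` bits starting at bit `index` (also valid for negative ints)."""
    return (header >> index) & ((1 << width) - 1)


def decode_header(header):
    """Decode a method header given as a signed integer."""
    # Bits 17..27 have the same meaning in both formats.
    fields = {
        "largeFrame": bool(field(header, 17, 1)),
        "numTemps": field(header, 18, 6),
        "numArgs": field(header, 24, 4),
    }
    if header < 0:
        # Sign bit 1: expanded format, primitive moved to a callPrimitive bytecode.
        fields.update(
            bytecodeSet="secondary",
            numLiterals=field(header, 0, 16),
            hasPrimitive=bool(field(header, 16, 1)),
            accessModifier=field(header, 28, 2),
        )
    else:
        # Sign bit 0: old format, primitive number split across bits 0..8 and 28.
        fields.update(
            bytecodeSet="primary",
            numLiterals=field(header, 9, 8),
            primitive=field(header, 0, 9) | (field(header, 28, 1) << 9),
            flag=bool(field(header, 29, 1)),
        )
    return fields


# Hypothetical headers: primitive 600, 5 literals / 40000 literals with a primitive.
old_header = 88 | (5 << 9) | (3 << 18) | (2 << 24) | (1 << 28)
new_header = (40000 | (1 << 16) | (3 << 18) | (2 << 24)) - (1 << 30)
print(decode_header(old_header))
print(decode_header(new_header))
```

Subtracting `1 << 30` in the second example models setting the sign bit of a 31-bit signed value. Python's arithmetic shift on negative integers keeps the low bits intact, so the same `field` helper serves both formats.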
>
> So whenever the VM needed to know the numLiterals (e.g. the GC in visiting
> pointer fields in methods) it had to switch-hit:
>
> StackInterpreter>>literalCountOfHeader: headerPointer
>     <api>
>     "We support two method header formats, as selected by the sign flag.  Even
>      if the VM only has one bytecode set, supporting the two formats here allows
>      for instantiating methods in the other format for testing, etc."
>     ^(self headerIndicatesAlternateBytecodeSet: headerPointer)
>         ifTrue: [self literalCountOfAlternateHeader: headerPointer]
>         ifFalse: [self literalCountOfOriginalHeader: headerPointer]
>
> StackInterpreter>>headerIndicatesAlternateBytecodeSet: methodHeader
>     <api>
>     <inline: true>
>     "A negative header selects the alternate bytecode set."
>     ^methodHeader signedIntFromLong < 0
>
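For readers who do not read Slang, the same "switch-hit" can be sketched in Python. The function names mirror the Smalltalk selectors; the bodies of the two per-format helpers are not shown in the message and are inferred here from the bit layouts listed earlier, so treat them as assumptions.

```python
def header_indicates_alternate_bytecode_set(header):
    """A negative header selects the alternate bytecode set."""
    return header < 0


def literal_count_of_original_header(header):
    # Old format: 8 bits of literal count starting at bit 9 (inferred from the layout).
    return (header >> 9) & 0xFF


def literal_count_of_alternate_header(header):
    # Expanded format: 16 bits of literal count starting at bit 0 (inferred).
    return header & 0xFFFF


def literal_count_of_header(header):
    """Dispatch on the sign of the header, as the two-format VM has to."""
    if header_indicates_alternate_bytecode_set(header):
        return literal_count_of_alternate_header(header)
    return literal_count_of_original_header(header)


# Hypothetical headers carrying only a literal count.
print(literal_count_of_header(5 << 9))
print(literal_count_of_header(40000 - (1 << 30)))
```

With a single header format the sign test and one of the two helpers disappear, which is the simplification being argued for.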
> It's not a lot of work since the header has to be fetched anyway.  But
> it's complexity, and things should be as simple as possible but no simpler
> ;-)
>
> So since Spur is a chance for a fresh start it seemed like a good time to
> move to a single method header format, while keeping the ability to
> support multiple bytecode sets.
>
>> Also, with a single bit, how can there be more than one alternative
>> bytecode set?
>>
>
> One could I suppose put bits in a bytecode, just like the primitive is
> encoded in a callPrimitive: bytecode.  But I don't like that.  Instead, if
> one needs more, just add another bit.  As described above there's room for a
> two bit field in the sign and flag bits combined.
>
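A purely illustrative sketch of the "two bit field" idea: if the flag bit (index 29) and the sign bit (index 30) were read together, a header could select one of four bytecode sets. Nothing in the message says the VM does this today. The function name `bytecode_set_index` and the numbering of the sets are hypothetical, and in the expanded format bit 29 is currently listed as part of the access-modifier field, so the layout would need adjusting first.

```python
def bytecode_set_index(header):
    """Read bits 29..30 of a signed header as a 2-bit bytecode-set selector.

    Hypothetical encoding: 0 and 1 have the sign bit clear, 2 and 3 have it set.
    """
    return (header >> 29) & 0b11


# The four possible combinations of flag and sign bits.
examples = {
    "neither bit": 0,
    "flag only": 1 << 29,
    "sign only": -(1 << 30),
    "sign and flag": -(1 << 30) + (1 << 29),
}
for label, header in examples.items():
    print(label, bytecode_set_index(header))
```

The design point is that the selector stays in the header, where the VM already looks, rather than in an extra bytecode.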
> One might take the view that only one additional set is needed.  One can
> develop it, test it, then move to it by recompiling everything to it and
> switching.  e.g. a snapshot operation that flips the sign bit on save could
> simply move methods from one set to another.
>
> However, if, as the Smalltalk-X folks do, one wants to directly support
> the bytecode of another language (Java, Python?) then having, say, 4 sets
> to choose from might be nice.
>
> Right now the pressing need is to support the Sista bytecode set.  Clément
> is making strong progress and I've implemented a fair amount of the VM
> support.  We have the bytecode set defined, the inline primitives defined.
>  I'll leave it to Clément to describe the optimizer status.
>

This is difficult to say. I believe I will have lots of time from November
1st to mid-December to make lots of progress on the optimizer. It should be
stable enough for some benchmarks by Christmas.

Now there are other points, such as lazy deoptimization, discarding
optimized methods, fixing the settings to reach maximum performance,
uncommon bugs found on regression tests, debugging and inspecting context
transparently, stack replacement for on-the-fly optimization (for dynamic
deoptimization it is done and easier) that may or may not take time.

A prototype should be there for Christmas, but production will take more
time.

>  In the VM we have the bytecode set implemented, the performance counters
> on conditional branches that call-back into the image when they trip, and
> the class trap bytecode, used to check that objects are of the required
> classes before entering unsafe optimized code.  That leaves the inlined
> primitives of which I've implemented a handful of the simpler ones.  So
> we're on course to have a prototype by Christmas ;-)
>
> At a later stage we'll add an OptimizedContext (which in the VM will have
> an associated new frame format) that will have a pointer stack (as the
> current Context does) and a byte stack for handling raw data such as
> floating-point.  The JIT will map (at least some) memory locations in the
> byte stack onto the floating-point registers for much improved
> floating-point performance.
> --
> best,
> Eliot
>
>