[Vm-dev] Re: goto instruction with Cog VM

Fri Nov 7 22:25:31 UTC 2014

Hi Ralph,

On Tue, Nov 4, 2014 at 9:09 PM, Ralph Boland <rpboland at gmail.com> wrote:

>
> > Hi Ralph,
> >
> > On Sun, Nov 2, 2014 at 9:48 PM, Ralph Boland <rpboland at gmail.com> wrote:
> >
> > >
> > > >>
> > > >> I am working on a parser generator tool (a replacement for SmaCC)
> and
> > > >> one of the things I a m interested in is the direct translation of
> > > >> language specifications into the virtual machine code (SmaCC and my
> > > >> current version of it use Squeak as the target language).
> > > >>
> > >
> > > > First, a different approach than compiling to Smalltalk is to
> compile to
> > > a
> > > > parse tree.  We do this in the pseudo JavaScript compiler we've
> > > implemented
> > > > at Cadence, translating an AST into a Squeak compiler parse tree for
> code
> > > > generation.  Targeting a parse tree gives you much more freedom; you
> can
> > > > express things that aren't expressible in Smalltalk.  And if you
> target
> > > > bytecodes you can do even more.
> > >
> > >
> > > I never considered using a parse tree as the target.  An interesting
> idea
> > > which in many instances may be the best approach.  But for my regular
> > > expression
> > > example I would still want to generate byte codes.  In any case I
> wouldn't
> > > want to
> > > restrict users of my parser generator tool to any one of the three
> options
> > > (Smalltalk code,
> > > parse tree, byte code).  It is my responsibility to make all three
> options
> > > as easy and
> > > efficient as reasonably possible for users of the parser generator
> tool.
> > > Haven't put
> > > much thought into this yet though.   So far Smalltalk (Squeak) is the
> only
> > > option.
> > >
> > >
> > > >> One of the problems I have is that, for some languages, the natural
> > > >> translation
> > > >> into VM code uses computed gotos.
> > > >> There are two scenarios here:
> > > >>
> > > >>      1) goto X  where X is a variable.
> > > >>      2) goto  (coll at: y)  where coll is a Collection.
> > > >>
> > >
> > > > There are several ways of implementing this without computed
> bytecodes in
> > > > the instruction set, but there is also the possibility of
> implementing it
> > > > directly in the instruction set.
> > >
> > > > Off the top of my head one could
> > >
> > > > - map to perform: using some mangled selector.  Yes this is
> problematic
> > > > because one has to split the scope between the two methods, so in
> general
> > > > it's not a solution
> > >
> > > Doesn't appeal to me.
> > >
> > > > - map to a case statement, which is what Squeak does. Just map it to
> a
> > > > sequence of compare and branches.  Or better still, to a binary tree.
> > > > Coincidentally this is used by the JIT to implement block dispatch in
> > > > methods that contain more than one block.  I know of other VM
> > > > implementations using it for PIC dispatch with really good
> performance.
> > >
> > > I don't know what you mean my Squeak mapping to a case statement since
> > > there is no case statement in  Squeak/Smalltalk and I can't off hand
> think
> > > of
> > > where one is needed (some Squeak users might feel they need one but
> that
> > > is a
> > > different matter).
> > >
> >
> > But there is.  And it is very convenient when one doesn't want to spread
> a
> > case over different classes, or when the cases distribute over values of
> > the same class (e.g. integer values).  People often claim that a case
> > statement isn't necessary because one has polymorphism, but unless the
> > language supports singletons (which of course Smalltalk does not) a case
> > statement is much more readable than e.g. an if-then-else tree or a
> > "Dictionary mapping values to blocks or selectors" solution.
> >
> > Since you're unaware of the case statement I suggest you browse senders
> of
> > caseOf: and caseOf:otherwise: in Squeak trunk (which includes selectors
> of
> > optimized selectors that end up not being sent) or caseError, which is
> sent
> > when there's no otherwise clause.  Here's an example:
> >
> > menuHook: aMenu named: aSymbol shifted: aBool
> > "Enhance aMenu with registered services."
> > aSymbol
> > caseOf:
> > { [ #classListMenu ] -> [ ServiceGui browser: self classMenu: aMenu ].
> > [ #codePaneMenu ] -> [ ServiceGui browser: self codePaneMenu: aMenu ].
> > [ #messageCategoryMenu] -> [ ServiceGui browser: self
> messageCategoryMenu:
> > aMenu ].
> > [ #messageListMenu ] -> [ ServiceGui browser: self messageListMenu:
> aMenu ].
> > [ #systemCategoryMenu ] -> [ ServiceGui browser: self classCategoryMenu:
> > aMenu ] }
> > otherwise: [ "do nothing" ]
> >
> > This compiles down to a sequence of comparisons.
>
> I was aware of caseOf: in Squeak.  I always found it awkward to use and
> felt a true case statement would be simpler.  Alas, it's impossible to
> have a true case statement added to Smalltalk now I think.
>

So what's a "true" case statement?  For me, at least, the Squeak one *is*,
and is more general than one limited to purely integer keys, as for example
is C's switch statement.  A number of languages provide case statements
that are like Squeak's.  What do you consider a "true" case statement?

> caseOf: is user code and thus, in principle, outside the scope of the VM.
> For example, if I didn't want the sequential search of caseOf:  I could
> implement a method  doBlockAt:  method in class Dictionary that assumes
> the values stored in the dictionary are blocks and invokes the block
> found by performing at: on the arg of doBlockAt:.
> But the code for both caseOf: and doBlockAt: are written by users so the
> VM can't optimize it, is forced to have the overhead of using blocks,
> and must be locked in to the search algorithm the user has chosen (for
> better or for worse).
>
> > > The use of compare and branches might be OK in some cases
> > > but a mess for the finite state machines generated from regular
> > > expressions.
> > > Actually, even with computed gotos  FSMs are somewhat messy but without
> > > them
> > > it's worse.  I don't know what  'PIC dispatch' is.
> > >
> >
> > Polymorphic inline cache dispatch.  In JIT VMs such as Cog polymorphic
> send
> > sites are optimized using jump tables, one jump for each class
> encountered
> > at the send site, up tio some small limit such as 6 cases.  Right now in
> > Cog these jump tables are  sequential comparisons.  But in some VMs,
> where
> > the degree of polymorphism may be much higher (think prototype languages
> > such as JavaScript) a binary tree may be much miore efficient, if harder
> to
> > organize.
> >
> > To use a binary tree don't I need some kind of computed goto for when I
> > > reach
> > > a leaf of the tree????
> > >
> >
> > No.  Once you're at the leaf you just include the bytecodes.  So for
> > example take the following case statement:
> >
> > quickPrimitiveGeneratorFor: aQuickPrimitiveIndex
> > <api>
> > <returnTypeC: 'int (*quickPrimitiveGeneratorFor(sqInt
> > aQuickPrimitiveIndex))(void)'>
> > ^aQuickPrimitiveIndex
> > caseOf: {
> > [256] -> [#genQuickReturnSelf].
> > [257] -> [#genQuickReturnConstNil].
> > [258] -> [#genQuickReturnConstTrue].
> > [259] -> [#genQuickReturnConstFalse].
> > [260] -> [#genQuickReturnConstMinusOne].
> > [261] -> [#genQuickReturnConstZero].
> > [262] -> [#genQuickReturnConstOne].
> > [263] -> [#genQuickReturnConstTwo] }
> > otherwise: [#genQuickReturnInstVar]
> >
> > This is compiled in the current Squeak compiler as
> >
> > 61 <10> pushTemp: 0
> > 62 <88> dup
> > 63 <2A> pushConstant: 256
> > 64 <B6> send: =
> > 65 <9B> jumpFalse: 70
> > 66 <87> pop
> > 67 <29> pushConstant: #genQuickReturnSelf
> > 68 <A4 35> jumpTo: 123
> > 70 <88> dup
> > 71 <28> pushConstant: 257
> > 72 <B6> send: =
> > 73 <9B> jumpFalse: 78
> > 74 <87> pop
> > 75 <21> pushConstant: #genQuickReturnConstNil
> > 76 <A4 2D> jumpTo: 123
> > ...
> > 117 <22> pushConstant: 263
> > 118 <B6> send: =
> > 119 <99> jumpFalse: 122
> > 120 <21> pushConstant: #genQuickReturnConstTwo
> > 121 <90> jumpTo: 123
> > 122 <20> pushConstant: #genQuickReturnInstVar
> > 123 <7C> returnTop
> >
> > but it could be more efficient if the code were
> >
> >     pushTemp: 0
> >     dup
> >     pushConstant: 256
> >     send: <
> >     jumpFalse: L1
> >     dup
> >     pushConstant: 260
> >     send: <
> >     jumpFalse: L2
> >     dup
> >     pushConstant: 258
> >     send: <
> >     jumpFalse: L3
> >     dup
> >     pushConstant: 257
> >     send: <
> >     jumpFalse: L4
> >     pushConstant: #genQuickReturnSelf
> >     jumpTo: L0
> >     pushConstant: #genQuickReturnConstNil
> >     jumpTo: L0
> >     ...
> >    L1:
> >     pushConstant: #genQuickReturnInstVar
> >    L0:
> >     returnTop
>
> I have no problem with this code per say; if Squeak, the language,
> had a case statement it could be implemented this way.
>

You keep on saying it doesn't have one.  I disagree.

> But I wouldn't want to be forced to implement my FSMs this way.
> It might be acceptable for small FSMs.
> I want to avoid sequential search and
>  even binary search might be rather expensive.
> I look at computed gotos as the solution but,
> as you pointed out, computed gotos pose problems for JIT.
> Admittedly, for large FSM's, it might be best or necessary to
> use a FSM simulator anyway, as I do now.
>

Nah.  One should always be able to map it down somehow.  Tis will be easier
with the Spur instruction set which lifts number of literals and length of
branches limits.

>
> > > - use thisContext pc: value.
> > >
> > > This would be a possibility for me to experiment with for now.  When I
> > > have a working
> > > parser generator tool I could campaign for my computed goto
> instructions
> > > to be added
> > > to the VM.
> > >
> > > > This /should/ be fine in the stack VM, but
> > > > slooooow in the JIT because internally mapping bytecode pcs to
> machine
> > > code
> > > > pcs is slow, and currently slower still because the frame will be
> > > converted
> > > > to a pure context and then converted back into a frame on the return
> from
> > > > pc:.  But this solution isn't to be rejected out-of-hand.  It can be
> > > > optimized to avoid the frame conversion and the JIT might be able to
> > > > optimize it.
> > >
> > > I assume that if computed gotos were used the translation to machine
> code
> > > would require a direct
> > > mapping of (virtually labeled) bytecode locations to machine code
> > > locations.  I think this can be done
> > > in a reasonable amount of time but others such as yourself clearly
> > > understand the issues far better than
> > > I do.  The dirty solution to start would be to simply not JIT the code
> > > that uses computed gotos.
> > >
> >
> > Yes, it could be.  The JIT would generate a jump table from the literal
> > containing the bytecoded pcs, and there would be no conversion; only
> > indexing the jump table and jumping.  Again, organizing that table as a
> > binary switch may well be faster on modern architectures.  Indirect jumps
> > typically involve pipeline stalls, whereas binary jump trees don't.
> >
> > > The main problem is the compiler has no support for labels so
> > > > there would be work here.
> > >
> > > I don't mind doing the work but to my way of thinking  "goto X" is
> pretty
> > > basic
> > > and is thus best handled at the VM/byte code level.  Anything else is
> doing
> > > in a complicated way something that is fairly simple.  Of course
> changing
> > > the VM/byte codes by even a single byte code is a major deal unless
> done
> > > when the VM/byte codes are initially created.  Alas I must deal with
> what
> > > already exists.  Even so, my preference is to work with the VM if at
> all
> > > possible.
> > >
> > >
> > > >> For example, one such language is that of regular expressions,
> which I
> > > >> wish to translate into finite state machines implemented in VM code.
> > > >> In this case I need case 2) gotos where coll is a collection of
> > > >> associations, possibly a
> > > >> Dictionary. I also plan to write a debugger for this (and other
> > > languages)
> > > >> but that is another story.
> > > >>
> > > >> I realize that the Cog VM is being built for Smalltalk (Squeak?
> Pharo?)
> > > >> for which the goto instructions are not needed and thus I assume
> > > >> unavailable. But there is something to
> > > >> viewing a virtual machine as general purpose and thus the target of
> > > >> multiple languages as is
> > > >> the case for the Java virtual machine.
> > > >> If the Cog VM is viewed this way then I argue there is a need for my
> > > goto
> > > >> instructions
> > > >> because some languages have need for them.
> > > >> For example, many languages have case statements.  (I am all for
> object
> > > >> oriented
> > > >> but I would be willing to accept a case statement in Smalltalk
> too;  the
> > > >> Squeak code
> > > >> implemented one in Squeak doesn't cut it).
> > > >
> > >
> > > > I've occasionally thought about this for many years.  A computed jump
> > > might
> > > > be nice.  Eg index an Array literal of pcs with the integer on top of
> > > > stack, falling through on bad type or out of range.
> > >
> > > This is the way I am thinking.  If there are other reasons for a
> computed
> > > jumpTo
> > > as well all the better.
> > >
> > > >> Anyway, I am not arguing to Change Squeak or Smalltalk but I am
> arguing
> > > >> to have my goto instructions in Cog VM. Is there any chance of
> this?????
> > > >>
> > >
> > > > There's no chance of me spending time implementing this any time
> soon.  I
> > > > have too much high-priority tasks to tackle this.  But I want to
> > > encourage
> > > > you or others to have a go implementing it.  It's fun!
> > >
> > > I understand and am willing to be the one to add one or more computed
> jump
> > > instructions, including working on the JIT code generator if needed.
> > > As you say it should be fun (and also educational).  But
> > >    1)  I am pretty busy now too and probably won't get to this for a
> year.
> > >    2)  If I am to do this it would be great if someone can write a
> > > specification as to
> > >         what is to be done.  If someone can write this now that would
> be
> > > great but
> > >         if they write it when I post that I am ready to do the work
> that
> > > would also
> > >         be fine.
> > >    3)  I don't want to just have my own private VM/byte codes.  I want
> > > users of my
> > >         parser generator tool to be able to load it into a standard
> > > version of Squeak
> > >         and run it there including the possible generation of compilers
> > > for compiling
> > >         their domain specific language programs into byte codes if
> desired.
> > >
> >
> > Sure.  Get back to me when you're ready to work on it and I'll write the
> > specification, and help you with whatever info you need.  There should be
> > room in the bytecode sets we'll likely be using next year.
>
>
> WILL DO.  SHOULD BE FUN.
> WILL DO.  SHOULD BE FUN.
> WILL DO.  SHOULD BE FUN.
>

Great, thanks!!

>
> > >
> > > >> I don't know the Squeak VM or the Cog VM either but I assume these
> > > >> instructions don't exist because I see no need of them when the
> source
> > > >> language is
> > > >> Squeak or any version of Smalltalk for that matter. I also assume
> that
> > > >> there is already
> > > >> a full list of 256 instructions in the Cog VM and thus no room for
> my
> > > goto
> > > >> instructions
> > > >> unless some instructions are removed.
> > > >>
> > > >> Are there Cog VM instructions that are so rarely used that they
> could be
> > > >> removed without
> > > >> unreasonably slowing down the Cog VM interpretation of byte codes
> > > >> generated from Squeak source code?????
> > > >>
> > >
> > > > The current set has 3 unused bytecodes, one of which Spur uses, so
> > > > effectively there are two unused bytecodes.
> > >
> > > Levente Uzonyi  in his posting pointed out that only one instruction is
> > > needed.
> > > I don't like having to push the address to jump to onto the stack,
> > > preferring a byte
> > > code with an argument, but I could live with his solution if that is
> what
> > > is decided.
> > > In the case of  goto  coll at: X  the address is likely to end up on
> top
> > > of the stack
> > > anyway so Levente's  jumpToTop instruction looks good in any case.
> > >
> >
> > If the bytecode is one that takes an integer on top of stack, and an
> Array
> > literal containing bytecode pcs, falling through on out of range, then
> > nothing other than the index need be pushed on top of stack.  That would
> be
> > my preference.
> >
>
> Again, for my FSM, case this would often be considered to be good.
> But if the state transition tables are sparse then Dictionaries
> might be preferable to Arrays.
>

Yes, but getting to the limit of what the VM can reasonably interpret.
Better would be an Array of value. pc pairs, where the keys are the values
the switch bytecode compares top of stack against, and the pcs are where to
jump to on a match.  The JIT can therefore implement the table as it sees
fit, whereas the interpreter can just do a linear search through the Array.

My expection is that  at:  be sent to the collection object
>  to get the address to go to.  Knowing that the collection
> is an array though makes it easier for the compiler/VM to
> ensure that the addresses stored in the collection are valid.
> Actually, the compiler will be generating the addresses.
> Does the VM have absolute trust in the compiler to generate valid
> addresses?
>

Yes.  Generate bad bytecode and the VM crashes.

> What if the compiler is generated by my parser generator tool?
>

Same.  Once the tool is generating correct bytecode the VM should stop
crashing ;-)

>
>
> > > The Cog VMs support multiple bytecode sets.  If you look at the
> > > > BytecodeSets package on VMMaker you can read the class comments of
> the
> > > > BytecodeEncoder subclasses such as EncoderForSistaV1.  These bytecode
> > > sets
> > > > have a few more unused bytecodes.  This multiple bytecode set
> support is
> > > > better implemented in Spur where there is only one compiled method
> header
> > > > format and support for 64k literals.  So let me encourage you to
> move to
> > > > Spur and to look at the Sista set.  The class comment of each encoder
> > > class
> > > > specifies the instruction set it targets.
> > >
> > > I am prepared to work with Spur and the Sista set.  I am looking for
> > > someone to
> > > say that if I do this work that incorporating the work into Spur will
> be
> > > seriously
> > > considered.
> > >
> >
> > I can say that, yes.  I will seriously consider adding it.  I think its a
> > useful thing in the bytecode set, precisely to allow compiling other
> > languages to the VM.
>
>
> GLAD TO HEAR THIS.
>

You're welcome.

>
>
> >
> >
> >
> > >
> > > Ralph Boland
> > >
> > >
> > >
> >
> >
> > --
> > best,
> > Eliot
>
>
> Ralph
>
> On 3 November 2014 12:34, <vm-dev-request at lists.squeakfoundation.org>
> wrote:
>
>> Send Vm-dev mailing list submissions to
>>         vm-dev at lists.squeakfoundation.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>         http://lists.squeakfoundation.org/mailman/listinfo/vm-dev
>> or, via email, send a message with subject or body 'help' to
>>         vm-dev-request at lists.squeakfoundation.org
>>
>> You can reach the person managing the list at
>>         vm-dev-owner at lists.squeakfoundation.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Vm-dev digest..."
>>
>>
>> Today's Topics:
>>
>>    1. Re: goto instruction with Cog VM (Eliot Miranda)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Mon, 3 Nov 2014 11:34:48 -0800
>> From: Eliot Miranda <eliot.miranda at gmail.com>
>> Subject: Re: [Vm-dev] goto instruction with Cog VM
>> To: Squeak Virtual Machine Development Discussion
>>         <vm-dev at lists.squeakfoundation.org>
>> Message-ID:
>>         <
>> CAC20JE1RduKQW_EdMsfah156QnTbR8USt9dEb4mj1rb15rndqA at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hi Ralph,
>>
>> On Sun, Nov 2, 2014 at 9:48 PM, Ralph Boland <rpboland at gmail.com> wrote:
>>
>> >
>> > >>
>> > >> I am working on a parser generator tool (a replacement for SmaCC) and
>> > >> one of the things I a m interested in is the direct translation of
>> > >> language specifications into the virtual machine code (SmaCC and my
>> > >> current version of it use Squeak as the target language).
>> > >>
>> >
>> > > First, a different approach than compiling to Smalltalk is to compile
>> to
>> > a
>> > > parse tree.  We do this in the pseudo JavaScript compiler we've
>> > implemented
>> > > at Cadence, translating an AST into a Squeak compiler parse tree for
>> code
>> > > generation.  Targeting a parse tree gives you much more freedom; you
>> can
>> > > express things that aren't expressible in Smalltalk.  And if you
>> target
>> > > bytecodes you can do even more.
>> >
>> >
>> > I never considered using a parse tree as the target.  An interesting
>> idea
>> > which in many instances may be the best approach.  But for my regular
>> > expression
>> > example I would still want to generate byte codes.  In any case I
>> wouldn't
>> > want to
>> > restrict users of my parser generator tool to any one of the three
>> options
>> > (Smalltalk code,
>> > parse tree, byte code).  It is my responsibility to make all three
>> options
>> > as easy and
>> > efficient as reasonably possible for users of the parser generator tool.
>> > Haven't put
>> > much thought into this yet though.   So far Smalltalk (Squeak) is the
>> only
>> > option.
>> >
>> >
>> > >> One of the problems I have is that, for some languages, the natural
>> > >> translation
>> > >> into VM code uses computed gotos.
>> > >> There are two scenarios here:
>> > >>
>> > >>      1) goto X  where X is a variable.
>> > >>      2) goto  (coll at: y)  where coll is a Collection.
>> > >>
>> >
>> > > There are several ways of implementing this without computed
>> bytecodes in
>> > > the instruction set, but there is also the possibility of
>> implementing it
>> > > directly in the instruction set.
>> >
>> > > Off the top of my head one could
>> >
>> > > - map to perform: using some mangled selector.  Yes this is
>> problematic
>> > > because one has to split the scope between the two methods, so in
>> general
>> > > it's not a solution
>> >
>> > Doesn't appeal to me.
>> >
>> > > - map to a case statement, which is what Squeak does. Just map it to a
>> > > sequence of compare and branches.  Or better still, to a binary tree.
>> > > Coincidentally this is used by the JIT to implement block dispatch in
>> > > methods that contain more than one block.  I know of other VM
>> > > implementations using it for PIC dispatch with really good
>> performance.
>> >
>> > I don't know what you mean my Squeak mapping to a case statement since
>> > there is no case statement in  Squeak/Smalltalk and I can't off hand
>> think
>> > of
>> > where one is needed (some Squeak users might feel they need one but that
>> > is a
>> > different matter).
>> >
>>
>> But there is.  And it is very convenient when one doesn't want to spread a
>> case over different classes, or when the cases distribute over values of
>> the same class (e.g. integer values).  People often claim that a case
>> statement isn't necessary because one has polymorphism, but unless the
>> language supports singletons (which of course Smalltalk does not) a case
>> statement is much more readable than e.g. an if-then-else tree or a
>> "Dictionary mapping values to blocks or selectors" solution.
>>
>> Since you're unaware of the case statement I suggest you browse senders of
>> caseOf: and caseOf:otherwise: in Squeak trunk (which includes selectors of
>> optimized selectors that end up not being sent) or caseError, which is
>> sent
>> when there's no otherwise clause.  Here's an example:
>>
>> menuHook: aMenu named: aSymbol shifted: aBool
>> "Enhance aMenu with registered services."
>> aSymbol
>> caseOf:
>> { [ #classListMenu ] -> [ ServiceGui browser: self classMenu: aMenu ].
>> [ #codePaneMenu ] -> [ ServiceGui browser: self codePaneMenu: aMenu ].
>> [ #messageCategoryMenu] -> [ ServiceGui browser: self messageCategoryMenu:
>> aMenu ].
>> [ #messageListMenu ] -> [ ServiceGui browser: self messageListMenu: aMenu
>> ].
>> [ #systemCategoryMenu ] -> [ ServiceGui browser: self classCategoryMenu:
>> aMenu ] }
>> otherwise: [ "do nothing" ]
>>
>> This compiles down to a sequence of comparisons.
>>
>> The use of compare and branches might be OK in some cases
>> > but a mess for the finite state machines generated from regular
>> > expressions.
>> > Actually, even with computed gotos  FSMs are somewhat messy but without
>> > them
>> > it's worse.  I don't know what  'PIC dispatch' is.
>> >
>>
>> Polymorphic inline cache dispatch.  In JIT VMs such as Cog polymorphic
>> send
>> sites are optimized using jump tables, one jump for each class encountered
>> at the send site, up tio some small limit such as 6 cases.  Right now in
>> Cog these jump tables are  sequential comparisons.  But in some VMs, where
>> the degree of polymorphism may be much higher (think prototype languages
>> such as JavaScript) a binary tree may be much miore efficient, if harder
>> to
>> organize.
>>
>> To use a binary tree don't I need some kind of computed goto for when I
>> > reach
>> > a leaf of the tree????
>> >
>>
>> No.  Once you're at the leaf you just include the bytecodes.  So for
>> example take the following case statement:
>>
>> quickPrimitiveGeneratorFor: aQuickPrimitiveIndex
>> <api>
>> <returnTypeC: 'int (*quickPrimitiveGeneratorFor(sqInt
>> aQuickPrimitiveIndex))(void)'>
>> ^aQuickPrimitiveIndex
>> caseOf: {
>> [256] -> [#genQuickReturnSelf].
>> [257] -> [#genQuickReturnConstNil].
>> [258] -> [#genQuickReturnConstTrue].
>> [259] -> [#genQuickReturnConstFalse].
>> [260] -> [#genQuickReturnConstMinusOne].
>> [261] -> [#genQuickReturnConstZero].
>> [262] -> [#genQuickReturnConstOne].
>> [263] -> [#genQuickReturnConstTwo] }
>> otherwise: [#genQuickReturnInstVar]
>>
>> This is compiled in the current Squeak compiler as
>>
>> 61 <10> pushTemp: 0
>> 62 <88> dup
>> 63 <2A> pushConstant: 256
>> 64 <B6> send: =
>> 65 <9B> jumpFalse: 70
>> 66 <87> pop
>> 67 <29> pushConstant: #genQuickReturnSelf
>> 68 <A4 35> jumpTo: 123
>> 70 <88> dup
>> 71 <28> pushConstant: 257
>> 72 <B6> send: =
>> 73 <9B> jumpFalse: 78
>> 74 <87> pop
>> 75 <21> pushConstant: #genQuickReturnConstNil
>> 76 <A4 2D> jumpTo: 123
>> ...
>> 117 <22> pushConstant: 263
>> 118 <B6> send: =
>> 119 <99> jumpFalse: 122
>> 120 <21> pushConstant: #genQuickReturnConstTwo
>> 121 <90> jumpTo: 123
>> 122 <20> pushConstant: #genQuickReturnInstVar
>> 123 <7C> returnTop
>>
>> but it could be more efficient if the code were
>>
>>     pushTemp: 0
>>     dup
>>     pushConstant: 256
>>     send: <
>>     jumpFalse: L1
>>     dup
>>     pushConstant: 260
>>     send: <
>>     jumpFalse: L2
>>     dup
>>     pushConstant: 258
>>     send: <
>>     jumpFalse: L3
>>     dup
>>     pushConstant: 257
>>     send: <
>>     jumpFalse: L4
>>     pushConstant: #genQuickReturnSelf
>>     jumpTo: L0
>>     pushConstant: #genQuickReturnConstNil
>>     jumpTo: L0
>>     ...
>>    L1:
>>     pushConstant: #genQuickReturnInstVar
>>    L0:
>>     returnTop
>>
>> > - use thisContext pc: value.
>> >
>> > This would be a possibility for me to experiment with for now.  When I
>> > have a working
>> > parser generator tool I could campaign for my computed goto instructions
>> > to be added
>> > to the VM.
>> >
>> > > This /should/ be fine in the stack VM, but
>> > > slooooow in the JIT because internally mapping bytecode pcs to machine
>> > code
>> > > pcs is slow, and currently slower still because the frame will be
>> > converted
>> > > to a pure context and then converted back into a frame on the return
>> from
>> > > pc:.  But this solution isn't to be rejected out-of-hand.  It can be
>> > > optimized to avoid the frame conversion and the JIT might be able to
>> > > optimize it.
>> >
>> > I assume that if computed gotos were used the translation to machine
>> code
>> > would require a direct
>> > mapping of (virtually labeled) bytecode locations to machine code
>> > locations.  I think this can be done
>> > in a reasonable amount of time but others such as yourself clearly
>> > understand the issues far better than
>> > I do.  The dirty solution to start would be to simply not JIT the code
>> > that uses computed gotos.
>> >
>>
>> Yes, it could be.  The JIT would generate a jump table from the literal
>> containing the bytecoded pcs, and there would be no conversion; only
>> indexing the jump table and jumping.  Again, organizing that table as a
>> binary switch may well be faster on modern architectures.  Indirect jumps
>> typically involve pipeline stalls, whereas binary jump trees don't.
>>
>> > The main problem is the compiler has no support for labels so
>> > > there would be work here.
>> >
>> > I don't mind doing the work but to my way of thinking  "goto X" is
>> pretty
>> > basic
>> > and is thus best handled at the VM/byte code level.  Anything else is
>> doing
>> > in a complicated way something that is fairly simple.  Of course
>> changing
>> > the VM/byte codes by even a single byte code is a major deal unless done
>> > when the VM/byte codes are initially created.  Alas I must deal with
>> what
>> > already exists.  Even so, my preference is to work with the VM if at all
>> > possible.
>> >
>> >
>> > >> For example, one such language is that of regular expressions, which
>> I
>> > >> wish to translate into finite state machines implemented in VM code.
>> > >> In this case I need case 2) gotos where coll is a collection of
>> > >> associations, possibly a
>> > >> Dictionary. I also plan to write a debugger for this (and other
>> > languages)
>> > >> but that is another story.
>> > >>
>> > >> I realize that the Cog VM is being built for Smalltalk (Squeak?
>> Pharo?)
>> > >> for which the goto instructions are not needed and thus I assume
>> > >> unavailable. But there is something to
>> > >> viewing a virtual machine as general purpose and thus the target of
>> > >> multiple languages as is
>> > >> the case for the Java virtual machine.
>> > >> If the Cog VM is viewed this way then I argue there is a need for my
>> > goto
>> > >> instructions
>> > >> because some languages have need for them.
>> > >> For example, many languages have case statements.  (I am all for
>> object
>> > >> oriented
>> > >> but I would be willing to accept a case statement in Smalltalk too;
>> the
>> > >> Squeak code
>> > >> implemented one in Squeak doesn't cut it).
>> > >
>> >
>> > > I've occasionally thought about this for many years.  A computed jump
>> > might
>> > > be nice.  Eg index an Array literal of pcs with the integer on top of
>> > > stack, falling through on bad type or out of range.
>> >
>> > This is the way I am thinking.  If there are other reasons for a
>> computed
>> > jumpTo
>> > as well all the better.
>> >
>> > >> Anyway, I am not arguing to Change Squeak or Smalltalk but I am
>> arguing
>> > >> to have my goto instructions in Cog VM. Is there any chance of
>> this?????
>> > >>
>> >
>> > > There's no chance of me spending time implementing this any time
>> soon.  I
>> > > have too much high-priority tasks to tackle this.  But I want to
>> > encourage
>> > > you or others to have a go implementing it.  It's fun!
>> >
>> > I understand and am willing to be the one to add one or more computed
>> jump
>> > instructions, including working on the JIT code generator if needed.
>> > As you say it should be fun (and also educational).  But
>> >    1)  I am pretty busy now too and probably won't get to this for a
>> year.
>> >    2)  If I am to do this it would be great if someone can write a
>> > specification as to
>> >         what is to be done.  If someone can write this now that would be
>> > great but
>> >         if they write it when I post that I am ready to do the work that
>> > would also
>> >         be fine.
>> >    3)  I don't want to just have my own private VM/byte codes.  I want
>> > users of my
>> >         parser generator tool to be able to load it into a standard
>> > version of Squeak
>> >         and run it there including the possible generation of compilers
>> > for compiling
>> >         their domain specific language programs into byte codes if
>> desired.
>> >
>>
>> Sure.  Get back to me when you're ready to work on it and I'll write the
>> specification, and help you with whatever info you need.  There should be
>> room in the bytecode sets we'll likely be using next year.
>>
>>
>> >
>> > >> I don't know the Squeak VM or the Cog VM either but I assume these
>> > >> instructions don't exist because I see no need of them when the
>> source
>> > >> language is
>> > >> Squeak or any version of Smalltalk for that matter. I also assume
>> that
>> > >> there is already
>> > >> a full list of 256 instructions in the Cog VM and thus no room for my
>> > goto
>> > >> instructions
>> > >> unless some instructions are removed.
>> > >>
>> > >> Are there Cog VM instructions that are so rarely used that they
>> could be
>> > >> removed without
>> > >> unreasonably slowing down the Cog VM interpretation of byte codes
>> > >> generated from Squeak source code?????
>> > >>
>> >
>> > > The current set has 3 unused bytecodes, one of which Spur uses, so
>> > > effectively there are two unused bytecodes.
>> >
>> > Levente Uzonyi  in his posting pointed out that only one instruction is
>> > needed.
>> > I don't like having to push the address to jump to onto the stack,
>> > preferring a byte
>> > code with an argument, but I could live with his solution if that is
>> what
>> > is decided.
>> > In the case of  goto  coll at: X  the address is likely to end up on top
>> > of the stack
>> > anyway so Levente's  jumpToTop instruction looks good in any case.
>> >
>>
>> If the bytecode is one that takes an integer on top of stack, and an Array
>> literal containing bytecode pcs, falling through on out of range, then
>> nothing other than the index need be pushed on top of stack.  That would
>> be
>> my preference.
>>
>> > The Cog VMs support multiple bytecode sets.  If you look at the
>> > > BytecodeSets package on VMMaker you can read the class comments of the
>> > > BytecodeEncoder subclasses such as EncoderForSistaV1.  These bytecode
>> > sets
>> > > have a few more unused bytecodes.  This multiple bytecode set support
>> is
>> > > better implemented in Spur where there is only one compiled method
>> header
>> > > format and support for 64k literals.  So let me encourage you to move
>> to
>> > > Spur and to look at the Sista set.  The class comment of each encoder
>> > class
>> > > specifies the instruction set it targets.
>> >
>> > I am prepared to work with Spur and the Sista set.  I am looking for
>> > someone to
>> > say that if I do this work that incorporating the work into Spur will be
>> > seriously
>> > considered.
>> >
>>
>> I can say that, yes.  I will seriously consider adding it.  I think its a
>> useful thing in the bytecode set, precisely to allow compiling other
>> languages to the VM.
>>
>>
>>
>> >
>> > Ralph Boland
>> >
>> >
>> >
>>
>>
>> --
>> best,
>> Eliot
>> -------------- next part --------------
>> An HTML attachment was scrubbed...
>> URL:
>> http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20141103/469369c9/attachment.htm
>>
>> ------------------------------
>>
>> _______________________________________________
>> Vm-dev mailing list
>> Vm-dev at lists.squeakfoundation.org
>> http://lists.squeakfoundation.org/mailman/listinfo/vm-dev
>>
>>
>> End of Vm-dev Digest, Vol 101, Issue 7
>> **************************************
>>
>
>
>
> --
> -- Those who can make you believe absurdities, can make you commit
> atrocities -- Voltaire
>
>

-- 
best,
Eliot
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20141107/d7aea646/attachment-0001.htm