[Vm-dev] Re: goto instruction with Cog VM

Ben Coman btc at openInWorld.com
Sat Nov 8 23:35:37 UTC 2014


Eliot Miranda wrote:
> On Sat, Nov 8, 2014 at 11:21 AM, Ralph Boland <rpboland at gmail.com> wrote:
> 
>      
>      > Hi Ralph,
>     ...
> 
>      > >
>      > > I was aware of caseOf: in Squeak.  I always found it awkward to
>      > > use and felt a true case statement would be simpler.  Alas, it's
>      > > impossible to have a true case statement added to Smalltalk now,
>      > > I think.
> 
>      > So what's a "true" case statement?  For me, at least, the Squeak
>      > one *is*, and is more general than one limited to purely integer
>      > keys, as for example is C's switch statement.  A number of
>      > languages provide case statements that are like Squeak's.  What do
>      > you consider a "true" case statement?
> 
>     I mean that caseOf: is not part of the language itself but rather
>     part of the standard library or set of packages that one finds in
>     the IDE.  To be part of the language it would need to be something
>     the compiler is aware of.
> 
> 
> Ah OK.  I see what you mean. But you're wrong on a few counts.  First, 
> there are *no* control structures in the language beyond closures and 
> polymorphism.  ifTrue:, to:do:, and: whileTrue: et al are all defined in 
> the library, not by the compiler.  Second, these structures, /including/
> caseOf: are understood by the compiler and compiled to 
> non-message-sending code.  So none of the blocks in caseOf:, ifTrue: 
> and: whileTrue: et al, the optimized selectors, are created and all are 
> inlined by the compiler.  So a) by your criterion of being in the 
> compiler caseOf: is in the language, but b) all control structures in
> Smalltalk are defined in the library, and some are optimized by the 
> compiler.
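
To make the caseOf: discussion concrete, here is a minimal workspace
sketch.  The variable names and strings are made up for illustration.
Each key is a block whose value is compared with = against the receiver,
in order.  As I understand it, the compiler only inlines the send when
the argument is written as a literal brace array of block -> block
associations like this.

```smalltalk
| state description |
state := 2.
description := state
    caseOf: {
        [1] -> ['start'].
        [2] -> ['running'].
        [3] -> ['done'] }
    otherwise: ['unknown'].
description   "answers 'running'"
```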

Reviewing the code for the following is enlightening:
True>>ifTrue:
True>>ifFalse:
False>>ifTrue:
False>>ifFalse:
They show the original implementation, but remember that as an
optimization these sends are inlined by the compiler, so that code is
normally not executed.
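
For reference, in a typical Squeak image those methods look roughly like
this.  They are shown as they would appear in a browser and written from
memory, so the comments and argument names may differ in your image.

```smalltalk
"In class True"
ifTrue: alternativeBlock
    "The receiver is true, so evaluate the block and answer its value."
    ^alternativeBlock value

ifFalse: alternativeBlock
    "The receiver is true, so ignore the block and answer nil."
    ^nil

"In class False"
ifTrue: alternativeBlock
    "The receiver is false, so ignore the block and answer nil."
    ^nil

ifFalse: alternativeBlock
    "The receiver is false, so evaluate the block and answer its value."
    ^alternativeBlock value
```

A send the compiler cannot inline still reaches them.  For example,
true perform: #ifTrue: with: [42] answers 42 via True>>ifTrue:.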

Eliot, would I be right to presume that the Interpreter does execute
those methods without optimisation?

cheers -ben

> 
>     That is to say, the Smalltalk language is not very much.  Smalltalk
>     (Squeak) the language would not include Sets or Dictionaries but
>     would include (some) Array classes because some aspects of Arrays
>     are dealt with directly by the compiler.
> 
> 
> There is a syntactic form for creating Array, but really the notion that 
> the Smalltalk compiler defines the language is a limited one.  It's fair 
> to say that the language is defined by a small set of variables, return,
> blocks, an object representation (ability to create classes that define 
> a sequence of named inst vars and inherit from other classes), and 
> message lookup rules (normal sends and super sends), and a small number 
> of literal forms (Array, Integer, Float, Fraction, ByteArray, String and 
> Symbol literals), and a method syntax.  The rest is in the library.  
> What this really means is that Smalltalk can't be reduced to a language, 
> because the language doesn't define enough.  Instead it is a small language
> and a large library.
> 
>     Selectors such as  ifTrue: and  to:do:  are part of the language
>     because they are inlined by the compiler.
> 
> 
> No.  One can change the compiler to not inline them.  This is merely an 
> optimization.
>  
> 
>     Put another way, if I could get my doBlockAt: method incorporated
>     into the Squeak IDE it would nevertheless NOT be part of Squeak the
>     language.  The consequence of caseOf: not being part of the language
>     is that the compiler/VM cannot perform optimizations when caseOf: is
>     run into but must treat it as user written code.
> 
>     Squeak's caseOf: is more general than C's switch statement but it
>     could be more general, in that there is a hard coded message (=).  I
>     would like to be able to replace the '=' message by an arbitrary
>     binary operator such as includes: or '>'.
> 
>     I have to backtrack here:  I looked at the code and it looks like
>     the compiler inlines caseOf: and caseOf:otherwise:.  If so then
>     these selectors are part of the language by my definition.
> 
> 
> Well, live and learn :-)
>  
> 
> 
>     ...
> 
>      > > But I wouldn't want to be forced to implement my FSMs this way.
>      > > It might be acceptable for small FSMs.
>      > > I want to avoid sequential search and
>      > >  even binary search might be rather expensive.
>      > > I look at computed gotos as the solution but,
>      > > as you pointed out, computed gotos pose problems for JIT.
>      > > Admittedly, for large FSM's, it might be best or necessary to
>      > > use a FSM simulator anyway, as I do now.
> 
> 
>      > Nah.  One should always be able to map it down somehow.  This will
>      > be easier with the Spur instruction set, which lifts the limits on
>      > number of literals and length of branches.
> 
>     Good to hear.
> 
> 
>      > > Again, for my FSM case, this would often be considered to be good.
>      > > But if the state transition tables are sparse then Dictionaries
>      > > might be preferable to Arrays.
>      >
> 
>      > Yes, but getting to the limit of what the VM can reasonably
>      > interpret.  Better would be an Array of value, pc pairs, where the
>      > keys are the values the switch bytecode compares top of stack
>      > against, and the pcs are where to jump to on a match.  The JIT can
>      > therefore implement the table as it sees fit, whereas the
>      > interpreter can just do a linear search through the Array.
> 
>     I am looking at this from the point of view of a compiler
>     writer/generator and consider your proposal as inadequate for my
>     needs.  You, I think, are looking at this from the point of view of
>     a VM writer and what can reasonably be delivered.  I don't think
>     what I want is overly difficult for the interpreter to deliver but,
>     as you pointed out, and you know much better than I, what I want
>     causes serious problems for the VM.
> 
>      > > My expectation is that  at:  be sent to the collection object
>      > >  to get the address to go to.  Knowing that the collection
>      > > is an array though makes it easier for the compiler/VM to
>      > > ensure that the addresses stored in the collection are valid.
>      > > Actually, the compiler will be generating the addresses.
>      > > Does the VM have absolute trust in the compiler to generate valid
>      > > addresses?
> 
> 
>      > Yes.  Generate bad bytecode and the VM crashes.
>      
>     This is what I expected to hear but wanted it to be clear for
>     compilers generated by my parser generator tool as you did.
> 
>     Ralph
> 
> 
> 
> 
> -- 
> best,
> Eliot


