[Vm-dev] Advice Required - Live Typing - Saving local vars types

Wed Jan 2 03:54:32 UTC 2019

Hi Hernan,

On Tue, Jan 1, 2019 at 6:30 PM Hernan Wilkinson <
hernan.wilkinson at 10pines.com> wrote:

>
> Hi all!
>  I need some advice to solve a problem for the Live Typing functionality
> I'm working on (Live Typing, previously called Dynamic Type Information,
> saves the class of the object every time it is assigned to a variable, used
> as a return value, etc. That info is accessible from the image and helps build
> better tools. More info at:
> https://github.com/hernanwilkinson/Cuis-Smalltalk-DynamicTypeInformation
> For the sake of simplicity I use class and type interchangeably)
>
>  The problem is related to saving the type of parameters and temporaries
> of closures (BlockClosure). Parameters and temporaries (locals from now on)
> of methods work well (with some minor issues not important now)
>

Hence the solution to your problem is to use FullBlocks.  The SistaV1
bytecode set supports full blocks, which means it is possible to use a
separate CompiledBlock for every block in a method. In Squeak one has to
set the Preferred bytecode set encoder class preference to
EncoderForSistaV1 and recompile the system (the system supports up to two
bytecode sets).

The support is in Squeak and fully tested there (my work image uses the
SistaV1 bytecode set).  Some of the support is in Pharo 7, but there is no
preference and so no easy way to force the use of SistaV1 and Full Blocks.
I don't know if the support has been ported to Cuis from Squeak, but it
should not be difficult to do.

>  To save classes of method locals I added an array in
> AdditionalMethodState whose size equals the number of locals. Each element of
> that array points to another array that holds the classes of the objects
> assigned to a variable (the local var index is used as index in the first
> array).
>  Saving the types of closure's locals is not that simple, but I solved the
> "structural" part by adding a new indirection (as usual). So now
> AdditionalMethodState has an array whose size equals "1 + the number of
> closures the method has", that is, one element per "closure". Each element
> of that array points to the array that in turn points to the arrays of types
> per variable. I think an example will help:
>
> m1: p1
>    | t1 |
>
>    t1 := 0.   "<-- it will save SmallInteger in (method additionalState
> contextTypesAt: 1) at: 2."
>    [ | t2 | t2 := 'hello' ] value. "<-- it will save String in (method
> additionalState contextTypesAt: 2) at: 1"
>
>    [ | t3 | t3 := 3.14 ] value. "<-- it will save Float in (method
> additionalState contextTypesAt: 3) at: 1"
>
> As we can see, index 1 is used for the method's context when 0 is assigned
> to t1. Because it is the second local (the first one is p1), the index 2 is
> used to save the type of 0 (SmallInteger).
>  Index 2 is used when 'hello' is assigned to t2 because t2 is defined in
> the first closure. Because it is the first block local, index 1 is used to
> access t2 types array, and so on.
>
>  I did not mention it, but the problem resides in the fact that the same
> bytecode is used to assign an object to a var, no matter if it is inside a
> closure or not.
>  So the problem I have to solve is how the bytecode's code can know in
> which context types array to save the assigned object's class. That is, for
> the same bytecode, for example "<69> popIntoTemp: 1", I have to decide the
> array to use.
>  I hope I've been clear, it is difficult to explain...
>
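
To make the quoted layout concrete, here is a small workspace sketch of the
two-level structure for m1:. It is only a model under assumptions: the names
contextTypes and record are hypothetical, and nothing here is the actual Live
Typing or AdditionalMethodState code.

```smalltalk
"Workspace sketch only; contextTypes and record are hypothetical names,
 not part of Live Typing or AdditionalMethodState."
| contextTypes record |
"Outer array: 1 = the method m1: itself, 2 = first block, 3 = second block.
 Inner arrays: one Set of classes per local of that context."
contextTypes := Array
    with: (Array with: Set new with: Set new)   "p1, t1"
    with: (Array with: Set new)                 "t2"
    with: (Array with: Set new).                "t3"
"Record the class of anObject for the given context and local index."
record := [:contextIndex :localIndex :anObject |
    ((contextTypes at: contextIndex) at: localIndex) add: anObject class].
record value: 1 value: 2 value: 0.          "t1 := 0"
record value: 2 value: 1 value: 'hello'.    "t2 := 'hello'"
record value: 3 value: 1 value: 3.14.       "t3 := 3.14"
contextTypes
```

Inspecting contextTypes afterwards should show one recorded class per assigned
local. The exact classes depend on the image (for instance, a string literal's
class is ByteString in Squeak rather than String).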

Well it is difficult to explain.  But a key insight is that the Decompiler
solves exactly this problem in mapping from bytecodes back to source, and
that the Debugger solves this problem in displaying the temporary variables
that are in scope.  So if you have a look at the code in DebuggerMethodMap
you should find an API that provides what you want.  For example, these
Context methods all use the relevant API for linearizing temps:

Context methods for debugger access

namedTempAt: index
    "Answer the value of the temp at index in the receiver's sequence of
     tempNames."
    ^self debuggerMap namedTempAt: index in: self

namedTempAt: index put: aValue
    "Set the value of the temp at index in the receiver's sequence of
     tempNames.
     (Note that if the value is a copied value it is also set out along the
     lexical chain,
      but alas not in along the lexical chain.)."
    ^self debuggerMap namedTempAt: index put: aValue in: self

tempNames
    "Answer a SequenceableCollection of the names of the receiver's temporary
     variables, which are strings."

    ^ self debuggerMap tempNamesForContext: self

tempsAndValues
    "Return a string of the temporary variables and their current values"
    ^self debuggerMap tempsAndValuesForContext: self

So for example if I evaluate this:

    (1 to: 2) inject: 0 into: [:a :b| ^thisContext sender tempsAndValues]

I get this:

'thisValue: 0
binaryBlock: [closure] in Context class>>DoIt
nextValue: 0
each: 1
'

So the simplified API is provided by DebuggerMethodMap.

But you want to do this in the VM.  That's a lot harder, and it makes the
VM implementation very dependent on the specific compiler implementation.
If you go with Full Blocks things may not be too bad.  But moving things up
to the image level will make everything a lot easier, faster to develop,
and extensible.

>  After a lot of ideas and possibilities, I found that the method/closure
> start pc could be used to decide the array's index to use (based on
> something similar to what CompiledMethod>>#startpcsToBlockExtents returns).
>  So, every time a new activation context is created (for example
> StackInterpreter>>#activateNewClosure:outer:method:numArgs:mayContextSwitch:,
> StackInterpreter>>#internalActivateNewMethod and so on) I can use the start
> pc to calculate the array's index.
>  So, let's say I have that solved too, now the problem is how can I access
> that index from the bytecode's code? I have the following ideas:
> 1) Add an inst. var. to MethodContext that will have the index (or even
> better, the local's types array). So every time a new context is created, I
> calculate the index based on the PC and set that inst. var.
> 2) Do the same as in 1) but adding an inst. var. to BlockClosure (better
> than 1 because the closure is created once while the method context could
> be created more than once for the same closure)
> 3) Push the calculated index in the stack (as the IP, SP, etc. are
> pushed). Based on the num. args + num. temps., calculate the position in
> the stack of that index every time a type has to be saved.
> 4) Have an interpreter variable like 'method' but called, let's say,
> 'contextVarsTypes' that is set every time a new activation is created. The
> previous contextVarsTypes value is pushed on the stack and restored from it
> when exiting a context.
>
> The problem with 1) and 2) is that MethodContext and BlockClosure cannot
> be modified (at least not easily, a new image format would be needed, etc),
> but the advantage is that I don't have to worry about that value when a
> context is left.
> Between 3) and 4) I think 4) is faster, but I'm not sure that if a GC is
> executed and the array moved (let's say contextVarsTypes points directly to
> the types array), contextVarsTypes will point to the new array's
> position in memory (will that happen? how is 'method' changed if a GC is
> executed?)
>
> Which one do you think is better/faster/possible?
> Any advice/comment on this matter will be appreciated.
> If there is an easier/different way to solve this problem, please help me
> :-)
>
> Thanks!
> Hernan
>
>
> --
>
> Hernán Wilkinson
> Agile Software Development, Teaching & Coaching
> Phone: +54-011-4893-2057
> Twitter: @HernanWilkinson
> site: http://www.10Pines.com
> Address: Alem 896, Floor 6, Buenos Aires, Argentina
>
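
The start-pc idea quoted above can be tried in the image before touching the
VM. A minimal sketch, assuming the start pcs are kept in a fixed order (the
method's own first, then one per block in source order); the numbers and the
name contextIndexFor are made up for illustration:

```smalltalk
"Workspace sketch; the start pcs below are made-up values,
 not taken from a real method."
| startpcs contextIndexFor |
"Assumed order: the method's own start pc first, then one per block."
startpcs := #(41 57 73).
"Map a start pc to the index of its slot in the per-context types array."
contextIndexFor := [:startpc | startpcs indexOf: startpc].
contextIndexFor value: 57    "answers 2, the first block's slot"
```

Doing this lookup once when a closure is created, rather than on every
activation, is the motivation behind option 2) above.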

-- 
_,,,^..^,,,_
best, Eliot