[Vm-dev] tempVectors use case and current issues

Eliot Miranda eliot.miranda at gmail.com
Thu Mar 28 23:29:07 UTC 2019


Hi Denis,

On Thu, Mar 28, 2019 at 2:36 PM Denis Kudriashov <dionisiydk at gmail.com>
wrote:

>
> Hi Nicolas.
>
> Thu, Mar 28, 2019 at 19:44, Nicolas Cellier <
> nicolas.cellier.aka.nice at gmail.com>:
>
>>
>> Hi Denis,
>> Special bytecodes don't have to be changed: just don't use them, and
>> replace them with regular sends at bytecode generation (with a special
>> compiler, or some IR translator).
>>
>
> Sure, bytecode transformation will work. But it would be quite tricky to
> apply in a live execution context. It would require fixing the context stack
> to take the updated method bytecode into account.
> Notice that I am not looking for a global setting to recompile all methods in
> the image. I want this logic only for a concrete method/block activation. In
> my scenario the block is serialized and transferred together with the current
> context. So on the remote side I need to do something with the materialized
> objects to maintain normal block semantics.
>
>
>> All can be done at image side then. Or did I miss something?
>>
>
> I think my examples show a security hole in the VM execution logic which
> allows memory bounds to be violated from the image side.
>

It is no different from using an inst var access bytecode on an object
which doesn't have enough inst vars.  It is not so much a security hole as
something the system must use correctly to avoid crashes.  The same
can be done by e.g.

    thisContext swapSender: Point basicNew

There are many such "security holes".  And if you want the VM to plug them
all then the VM will become very much slower.


> I did not get a segfault, but I would not be surprised if it happened in
> some complex real-life scenario. Maybe it looks like a specially invented
> case, but I think it is quite easy to hit when using or developing a
> low-level serialization library: as soon as you, by mistake or intentionally,
> serialize context objects with some substitution logic.
> And considering that this hole needs to be closed, it would be a good
> opportunity to add another hook in the execution engine which could be used
> as in my remote scenario. So, back to the proposal in my first mail.
>

If you want to solve this, then build a transformation for the block method
when you remote a block.  As others have suggested (Levente) you can
transform the bytecodes into normal sends (my blog post on the entire
scheme starts with implementing it using at: and at:put: before the special
bytecodes are added).  But making a change to all blocks breaks much of the
Sista adaptive optimizer.  We have to have the freedom to access indirect
temp vectors via special case bytecodes if we are to be able to
aggressively optimize code.  If indirect temp vectors are to be treated as
general purpose objects, then we are prevented from making many significant
optimizations.
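
As a rough illustration of that transformation (a hand-written, source-level
sketch, not actual compiler output; the variable name tempVector is only
illustrative), Denis's example would behave as if it had been written with an
explicit one-element Array and ordinary sends:

```smalltalk
"The closed-over temp lives in slot 1 of an explicit Array and is read and
 written with regular at: / at:put: sends rather than the special
 remote-temp bytecodes."
| tempVector result |
tempVector := Array new: 1.
tempVector at: 1 put: 10.
[tempVector at: 1 put: (tempVector at: 1) + 1] value.
result := tempVector at: 1.  "11"
```

In this form any object that understands at: and at:put: could stand in for
the Array, which is what a remoting proxy would need.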

So, as the doctor said, "don't do that".


>
>
>>
>> On Thu, Mar 28, 2019 at 20:05, Denis Kudriashov <dionisiydk at gmail.com>
>> wrote:
>>
>>>
>>> Hi.
>>>
>>> I found an interesting case where tempVectors can be used in remote
>>> scenarios. The store into a remote temp can be really remote (not just
>>> about the outer context).
>>> I played with the following example:
>>>
>>> | temp |
>>> temp := 10.
>>> remote evaluate: [temp := temp + 1].
>>> temp.
>>>
>>>
>>> For the moment forget about the remote part and look at it as a normal
>>> local case:
>>> the temp var here is managed indirectly through a tempVector. You can see
>>> it by evaluating this expression after the first assignment:
>>>
>>> thisContext at: 1 "=>#(10)"
>>>
>>>
>>> So the value is in fact stored in an Array instance and read from it.
>>> But because of the optimization this happens outside the array's control.
>>> No #at: and #at:put: messages are sent by this code. The VM magically
>>> changes the state of this array (there are special bytecodes for this).
>>>
>>> Now my remote use case. Imagine that the VM actually sent #at: and
>>> #at:put: messages to the tempVector. Then a remoting engine could transfer
>>> the temp vector (as part of the context) as a proxy. So on the remote side
>>> the block [temp := temp + 1] would actually ask the sender (client) for the
>>> value and for the storage. So all block semantics would be supported. The
>>> temp in the remote outer context would be modified. I think it would be
>>> super cool if such transparency were possible.
>>>
>>> I played with this example using Seamless in Pharo. It already works in
>>> the way I described, but due to the VM optimization it does not provide
>>> the expected behavior. Worse, it actually corrupts the transferred proxy,
>>> because a proxy instance is materialized in place of the array.
>>>
>>> This leads us to the issue of the safety of tempVector operations.
>>> The following example shows how we can affect the state of a tempVector
>>> using reflection:
>>>
>>> | temp |
>>> temp := 10.
>>> (thisContext at: 1) at: 1 put: 50.
>>> [temp := temp + 1] value.
>>> temp. "==>51"
>>>
>>> It is cool that we can do it. But there is no safety check at the VM
>>> level on the tempVector object:
>>>
>>> | temp |
>>> temp := 10.
>>> thisContext at: 1 put: Object new.
>>> [temp := temp + 1] value.
>>> temp.
>>>
>>>
>>> It breaks with a DNU: #+ is sent to nil. Temp became nil.
>>>
>>>
>>> | temp |
>>> temp := 10.
>>> thisContext at: 1 put: #() copy.
>>> [temp := temp + 1] value.
>>> temp.
>>>
>>>
>>> Sometimes it breaks with the same error. Sometimes it returns a random
>>> number. I guess in these cases the VM goes past the memory bounds of the
>>> tempVector.
>>>
>>> And two exotic cases:
>>>
>>>
>>> | temp |
>>> temp := 10.
>>> (thisContext at: 1) beReadOnlyObject.
>>> [temp := temp + 1] value.
>>> temp.
>>>
>>>
>>> It silently returns 11. It does not break the read-only protection. But
>>> no error is signalled.
>>>
>>> | temp |
>>> temp := 10.
>>> (thisContext at: 1) become: #() copy.
>>> [temp := temp + 1] value.
>>> temp.
>>>
>>>
>>> It returns #().  (In Pharo  #() + 1 = #()  ).
>>> I use become: to check how forwarding works in that case. (It works
>>> fine when the array has the correct size.)
>>>
>>> How can we improve this behavior? How would it affect performance?
>>> My proposal is to send real messages to the tempVector when it is not an
>>> Array instance. Then the image will decide what to do.
>>>
>>> Best regards,
>>> Denis
>>>

-- 
_,,,^..^,,,_
best, Eliot