[squeak-dev] Re: A few more arguments to instantiating object memory based on another one

Fri Aug 15 23:29:46 UTC 2008

2008/8/15 Klaus D. Witzel <klaus.witzel at cobss.com>:
> On Fri, 15 Aug 2008 19:24:02 +0200, Joshua Gargus wrote:
>
>> Klaus D. Witzel wrote:
>
> ...
>>>
>>> The main attention got GC, which has (among others) these aspects:
>>>
> ...
>>
>> That's all very interesting (especially the measurements about which
>> objects a Process references).  I'll reluctantly resist the temptation to
>> take the conversation in 10 different directions :-)
>
> Feel free to send them by email :)
>
>> (big snip)
>>
>>>> The technical details of your approach sound good to me (without having
>>>> thought deeply enough to provide truly constructive criticism).  However...
>>>>
>>>> My main concern is that your argument against separate images is
>>>> disingenuous.  They won't be slower if you store them as ByteArrays within
>>>> the main image.
>>>
>>> But then they are always in the way when GC comes around :( This would
>>> invalidate all the pointers of the parallel thread and require global
>>> synchronization :(
>>>
>>> Not a good idea :( we want things to run in parallel independent of each
>>> other's GC.
>>>
>> I think that there is a misunderstanding.  I'm saying that you can store a
>> prototype of an image as a ByteArray in the main image, but you wouldn't
>> actually run a spawned interpreter using the ByteArray as the object memory!
>>  You would use it to populate a separate, newly-spawned HydraVM object
>> memory.
>
> I thought about that but didn't find it interesting; this is what already
> happens when a snapshot is written and read in again. InterpreterSimulator,
> which holds the bytearray that you want (in a Bitmap) does this with help of
> its "real work" superclasses. No need to develop that again, IMHO.
>
>> It would actually be pretty funny to implement it the way you thought I
>> meant, in the same way that Intercal and Lolcode are funny (except this
>> would be more of an inside joke).  But certainly not practical!
>>
>> (hmm, maybe we could combine them... you could spawn a new interpreter
>> with the command "I can has new interpreter?"... what do you think?)
>
> Snapshit can't baby has? More humor and more imperatives, please :) Lolcode
> and Intercal are not easy for people sans English mother tongue :)
>
>>>> In fact, I believe that the opposite would be true; don't you agree?
>>>>  From a performance standpoint, it seems like separate images are the better
>>>> option.
>>>
>>> When creation of bytearray versus creation of separate heap can be
>>> ignored, there would be no difference in terms of performance (it's all oops
>>> all the way down, anyways). Only that bytearrays are not usable for parallel
>>> processing.
>>
>> Now that the confusion above has been cleared up...
>>
>> Wouldn't it be faster to spawn a new object-memory from an image in a
>> ByteArray (which requires a memcpy() and a single pass through the image to
>> relocate oops by a fixed amount)
>> compared to the scheme that Igor describes?
>
> No, there are a lot of disadvantages with this. Let's say that computing the
> desired object graph takes a minute, +100 milliseconds for your single pass.
> And thereafter your ByteArray is unusable, because for every change (or is
> it bug free? and maintenance free?) you have to go through the whole process
> again.
>
> So what is wrong with holding the desired object graph in an array
> (sometimes two arrays)? If you really want bytes (a Bitmap) out of this then
> you can put things into a new and idle Hydra thread+heap and push the button
> with the "snapshot" label on it, there you go.
>
> Perhaps we misunderstand each other on what the content of the object graph
> / your bytearray is?
>

My 2 cents. The main difference is in how you treat an object memory:
- do you want to treat it as a big blob of dead bytes,
- or do you want to treat it as a collection of live objects
interconnected with each other?

The first approach gives us little freedom in how we can operate on
it: it's just a bunch of dead bytes, which can come alive only in a
running interpreter.
With the second, you stay with objects all the time. You don't need to
care about object formats, file formats, etc.; that is the VM's
responsibility, and the VM gives us an abstraction layer so that we
don't need to care about it anymore. So with proper tools written, you
will have full control over what is going on and how to form a new
object memory, without going deep into understanding VM internals.
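
To make the contrast concrete, here is a rough sketch in Python (not
Squeak or HydraVM code). Everything in it is an assumption made for
illustration: the names relocate_blob, Obj and reachable are
hypothetical, the words are taken to be 32-bit little-endian, and the
tagging rule (low bit 1 = immediate integer, low bit 0 = pointer) is a
toy simplification. A real image also has object headers and raw-byte
objects, so a real relocation pass has to parse the format.

```python
import struct

WORD = 4  # assumed word size in bytes (toy 32-bit image)


def relocate_blob(blob: bytes, old_base: int, new_base: int) -> bytes:
    """Blob view: copy the bytes and shift every pointer word by a fixed delta.

    Toy rule (an assumption): a word with low bit 1 is an immediate
    integer and is left alone; any other word is an absolute pointer.
    """
    delta = new_base - old_base
    count = len(blob) // WORD
    words = struct.unpack(f"<{count}I", blob)
    moved = [w if w & 1 else w + delta for w in words]
    return struct.pack(f"<{count}I", *moved)


class Obj:
    """Live-object view: a named object holding references to other objects."""

    def __init__(self, name):
        self.name = name
        self.slots = []


def reachable(roots):
    """Collect every object reachable from the roots, visiting each once."""
    seen = {}
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen[id(obj)] = obj
        stack.extend(obj.slots)
    return list(seen.values())
```

In the blob view the code has to know the memory layout in order to do
anything at all. In the object view, choosing what goes into a new
object memory is plain graph traversal, and the layout stays the VM's
problem.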

To some people my words may sound like heresy, but I think I'm not
alone in this POV: why operate on dead, rigid and hard-to-maintain
data placed in files when we have the much more beautiful and
powerful concepts found in Smalltalk?

-- 
Best regards,
Igor Stasenko AKA sig.