[squeak-dev] Re: A few more arguments to instantiating object memory based on another one

Sat Aug 16 09:39:43 UTC 2008

On Sat, 16 Aug 2008 09:46:51 +0200, Joshua Gargus wrote:

> Klaus D. Witzel wrote:
>> On Fri, 15 Aug 2008 19:24:02 +0200, Joshua Gargus wrote:
>>> I think that there is a misunderstanding.  I'm saying that you can  
>>> store a prototype of an image as a ByteArray in the main image, but  
>>> you wouldn't actually run a spawned interpreter using the ByteArray as  
>>> the object memory!  You would use it to populate a separate,  
>>> newly-spawned HydraVM object memory.
>>
>> I thought about that but didn't find it interesting; this is what  
>> already happens when a snapshot is written and read in again.  
>> InterpreterSimulator, which holds the bytearray that you want (in a  
>> Bitmap) does this with help of its "real work" superclasses. No need to  
>> develop that again, IMHO.
>>
> Not develop it again, just use it.

But the source code in Interpreter+ObjectMemory is not usable for that  
unless you pass it an OS specific file handle.

Therefore this (passing of a bytearray which represents some .image) is  
not of interest to me.

>
>>> It would actually be pretty funny to implement it the way you thought  
>>> I meant, in the same way that Intercal and Lolcode are funny (except  
>>> this would be more of an inside joke).  But certainly not practical!
...
>>>>> In fact, I believe that the opposite would be true; don't you  
>>>>> agree?  From a performance standpoint, it seems like separate images  
>>>>> are the better option.
>>>>
>>>> When creation of bytearray versus creation of separate heap can be  
>>>> ignored, there would be no difference in terms of performance (it's  
>>>> all oops all the way down, anyways). Only that bytearrays are not  
>>>> usable for parallel processing.
>>> Now that the confusion above has been cleared up...
>>>
>>> Wouldn't it be faster to spawn a new object-memory from an image in a  
>>> ByteArray (which requires a memcpy() and a single pass through the  
>>> image to relocate oops by a fixed amount)
>>> compared to the scheme that Igor describes?
>>
>> No, there are a lot of disadvantages with this. Let's say that computing
>> the desired object graph takes a minute, +100 milliseconds for your
>> single pass. And thereafter your ByteArray is unusable, because for
>> every change (or is it bug-free? and maintenance-free?) you have to go
>> through the whole process again.
>>
> I'm trying to be clear that I think that Igor's idea is a promising way  
> to develop and create new object-memories.  I'm simply suggesting that  
> once you've created an object memory (using Igor's method, or via a  
> declarative specification, or whatever), then it is possibly better to  
> have production code spawn new interpreters from a "snapshot image".

Yes; to be clear, we both mean spawning from within the running .image.

The only difference we have is about the format of things passed to the  
routine which populates the new heap.

>> So what is wrong with holding the desired object graph in an array  
>> (sometimes two arrays)?
> Nothing, when you're developing.  But when I have a running production  
> system and I decide that I need to spawn a new interpreter, I want it to  
> happen as fast as possible.

Sure, but the input to that imaginary routine cannot be accepted
unverified, and verifying it cannot take less work than allocating
objects in the new heap (in fact: cloning ;)

I've taken the best parts out of ObjectMemory>>#clone: and  
ObjectMemory>>#allocateChunk: for that, with the following assumptions:

o fromOop is valid oop
o heap was allocated sufficiently large
o freeBlock is valid
o headerTypeBytes is valid
o lastHash is valid

Essentially this does: (freeBlock := freeBlock + numBytes "occupied by the  
clone") and copies the words (newOop[i] := fromOop[i]) thereby mapping  
references to new locations.

But this routine doesn't have to validate any oop, since the fromOop's are  
still alive and in good shape, and participating in cell division happily  
;)
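
A minimal sketch of that cloning step, in Python rather than Slang, on a
toy heap. Everything here is an assumption made for illustration: the
object layout (one header word holding the slot count, then the slots),
the 4-byte word, the low-bit tag for SmallIntegers, and the names Heap,
allocate and clone_graph. It is not the code taken from
ObjectMemory>>#clone: or ObjectMemory>>#allocateChunk:.

```python
WORD = 4  # bytes per word in this toy heap (assumption)


def is_small_int(word):
    """Tagged immediate: low bit set, so it is not a reference."""
    return word & 1 == 1


class Heap:
    """Toy heap: a list of words plus a bump pointer (free_block)."""

    def __init__(self, n_words):
        self.words = [0] * n_words
        self.free_block = 0  # byte address of the first free word


def allocate(heap, n_slots):
    """Bump-allocate one object: free_block := free_block + num_bytes."""
    num_bytes = (1 + n_slots) * WORD
    # Stated assumption from the post: the heap is sufficiently large.
    assert heap.free_block + num_bytes <= len(heap.words) * WORD
    oop = heap.free_block
    heap.free_block += num_bytes
    heap.words[oop // WORD] = n_slots  # toy header: just the slot count
    return oop


def clone_graph(src, root, dst):
    """Clone every object reachable from root into dst; return the new root."""
    mapping = {}  # from_oop -> new_oop
    stack = [root]
    # Pass 1: reserve space in dst for each live object reachable from root.
    while stack:
        from_oop = stack.pop()
        if from_oop in mapping:
            continue
        base = from_oop // WORD
        n_slots = src.words[base]
        mapping[from_oop] = allocate(dst, n_slots)
        for i in range(1, n_slots + 1):
            word = src.words[base + i]
            if not is_small_int(word):
                stack.append(word)
    # Pass 2: copy the words, mapping references to their new locations.
    for from_oop, new_oop in mapping.items():
        src_base, dst_base = from_oop // WORD, new_oop // WORD
        for i in range(1, src.words[src_base] + 1):
            word = src.words[src_base + i]
            dst.words[dst_base + i] = word if is_small_int(word) else mapping[word]
    return mapping[root]
```

Note that nothing in clone_graph checks whether a word really is an oop:
it relies on the source objects being alive and well-formed, which is the
point made above.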

> I don't want to be doing fancy traversals of an object graph,

O.k., NP. Once you have the object graph, it can be used time and again,
like a blueprint for a new thing.

> I'd rather memcpy() a pre-created image and run through it once to fix  
> offsets.

Please don't. Please do a typical Smalltalk sanity check and validate the
input (your bytearray) as best you can. It will pay off (Murphy says:
sooner ;) or later.

And if validation fails, inform the user instead of crashing the system  
(please).

Believe me, proper validation of unknown oops cannot be faster than proper
cloning of living objects ;)
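
To make the cost of that validation concrete, here is a rough Python
sketch of the kind of check a bytearray snapshot would need before it is
trusted. The image format is a toy and entirely assumed (little-endian
32-bit words, a header word holding the slot count, SmallIntegers tagged
with the low bit, oops as byte offsets from the start of the image), and
validate_snapshot is a hypothetical name, not something in the VM.

```python
import struct

WORD = 4  # bytes per word in this toy image format (assumption)


def validate_snapshot(image_bytes):
    """Return a list of problems; an empty list means the image looks sane."""
    if len(image_bytes) % WORD != 0:
        return ["image length %d is not a multiple of %d"
                % (len(image_bytes), WORD)]
    words = struct.unpack("<%dI" % (len(image_bytes) // WORD), image_bytes)

    # Pass 1: walk the headers and record where every object starts.
    starts = set()
    i = 0
    while i < len(words):
        n_slots = words[i]
        if i + 1 + n_slots > len(words):
            return ["object at byte %d claims %d slots and runs past the end"
                    % (i * WORD, n_slots)]
        starts.add(i * WORD)
        i += 1 + n_slots

    # Pass 2: every untagged slot must point at the start of some object.
    problems = []
    for start in sorted(starts):
        base = start // WORD
        for k in range(1, words[base] + 1):
            word = words[base + k]
            if word & 1:
                continue  # tagged SmallInteger, nothing to check
            if word not in starts:
                problems.append("slot %d of object at byte %d holds invalid oop %d"
                                % (k, start, word))
    return problems
```

A caller would show the returned messages to the user and refuse to spawn
the interpreter if the list is non-empty. Even in this toy form the check
touches every word of the image and needs a set of all object starts,
which is already about as much work as the cloning itself.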

>> If you really want bytes (a Bitmap) out of this then you can put
>> things into a new and idle Hydra thread+heap and push the button with the
>> "snapshot" label on it, there you go.
> Sure, that's a fine way to implement it.  I don't care how it's  
> implemented, although simpler and reusing existing code is better.

I take your word that you "don't care" ;) and propose to go the way that  
was suggested earlier :)

>>
>> Perhaps we misunderstand each other on what the content of the object  
>> graph / your bytearray is?
> I think that you misunderstand the contents of my bytearray; it's just a  
> "snapshot image" (which could, as you suggest, be created by pushing the  
> "snapshot" button).
>
> It's possible that I'm overestimating the speed advantage of starting up  
> from a snapshot compared to creating an object-memory via Igor's method,

My gut feeling is that it looks to be the same (#reverseBytesInImage is not
needed because endianness is the same; also what #adjustAllOopsBy: would
have to do).
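
For comparison, the memcpy()-plus-one-pass variant could look roughly like
this in Python. It uses an assumed toy layout (header word = slot count,
SmallIntegers tagged with the low bit, oops as byte addresses). The
function adjust_all_oops_by is only named after #adjustAllOopsBy: and is
not a port of it; spawn_from_snapshot, old_base and new_base are
hypothetical names.

```python
WORD = 4  # bytes per word in this toy heap (assumption)


def adjust_all_oops_by(words, delta):
    """In place: add delta (in bytes) to every reference slot.

    Headers and tagged SmallIntegers (low bit set) are left alone.
    """
    assert delta % WORD == 0, "oops must stay word-aligned"
    i = 0
    while i < len(words):
        n_slots = words[i]  # toy header: just the slot count
        for k in range(1, n_slots + 1):
            if words[i + k] & 1 == 0:
                words[i + k] += delta
        i += 1 + n_slots
    return words


def spawn_from_snapshot(snapshot_words, old_base, new_base):
    """Copy the snapshot, then relocate all oops by a fixed amount."""
    heap = list(snapshot_words)  # stands in for the memcpy()
    return adjust_all_oops_by(heap, new_base - old_base)
```

The pass is a single linear walk over the heap, the same order of work as
copying words while cloning. It also trusts every header and every slot
blindly, so it is only safe on input that has already been validated.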

But extra validation and crash prevention will account for the difference,
IMO.

> but I believe that I understand his general approach.

Okay, happiness :)

/Klaus

> Cheers,
> Josh
>>
>> /Klaus
>>
>>> (snip the rest, where we are in agreement)
>>>
>>> Cheers,
>>> Josh
>>>
>>
>