[Vm-dev] Questions about Cog internals
Eliot Miranda
eliot.miranda at gmail.com
Tue May 3 17:17:55 UTC 2011
On Tue, May 3, 2011 at 4:49 AM, Mariano Martinez Peck <marianopeck at gmail.com> wrote:
>
> Hi Eliot. I am really trying (with all my lack of knowledge) to understand
> a little about how Cog works internally. I am also reading your posts, and I
> have a couple of (probably newbie) questions. If any of them are answered in
> the blog, please point me to them (I couldn't read all of them yet):
>
> 1) Suppose you have a CompiledMethod XXX that you JIT, and you get a
> CogMethod YYY. While doing the GC (#lastPointersOf:,
> #lastPointerWhileForwarding:, etc), you need to check whether XXX is a
> CogMethodReference because if so, you need to fetch XXX header from YYY.
> Perfect. To keep the GC from looking into the CogMethod "objects" you put a header
> with the special format for empty objects, hence the GC doesn't follow the
> non-existent "instVars" of CogMethod. Perfect. CogMethod has a pointer to
> its original CM (back-pointer), called 'methodObject'. In this case, YYY has
> a pointer to XXX. So... my question is: during a GC compaction or a #become,
> where the address of XXX changes, how do you update YYY so that it points
> to the new address of XXX? Because if you flag YYY as an empty object, then
> the GC doesn't update it.
>
The garbage collector uses NewObjectMemory>>#mapPointersInObjectsFrom:to: to
update pointers for compactions and becomes. This always invokes
CoInterpreter>>mapInterpreterOops which always invokes
CoInterpreter>>mapMachineCode, which always invokes
Cogit>>mapObjectReferencesInMachineCode:. That splits into either
Cogit>>mapObjectReferencesInMachineCodeForFullGC or
Cogit>>mapObjectReferencesInMachineCodeForIncrementalGC, depending on this
being an incremental GC or not. The CogMethodZone maintains a list of Cog
methods containing young references so in an incremental GC only these
methods are scanned.
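
As a rough illustration only (a toy Python model, not the VM's Slang code; every class and field name below is made up for the sketch), the difference between the full and the incremental scan might look like this:

```python
# Toy model: every name here is hypothetical, not Cog's actual data structure.
class ToyCogMethod:
    def __init__(self, method_object, literals):
        self.method_object = method_object   # back-pointer to the bytecoded method
        self.literals = list(literals)       # object references embedded in the code


class ToyMethodZone:
    def __init__(self):
        self.methods = []
        self.young_referrers = []            # methods known to reference young objects

    def add(self, cog_method, refers_to_young=False):
        self.methods.append(cog_method)
        if refers_to_young:
            self.young_referrers.append(cog_method)

    def map_object_references(self, forwarding, full_gc):
        """Rewrite moved addresses; return how many methods were scanned."""
        to_scan = self.methods if full_gc else self.young_referrers
        for m in to_scan:
            m.method_object = forwarding.get(m.method_object, m.method_object)
            m.literals = [forwarding.get(oop, oop) for oop in m.literals]
        return len(to_scan)


zone = ToyMethodZone()
old_only = ToyCogMethod(method_object=0x1000, literals=[0x2000])
has_young = ToyCogMethod(method_object=0x3000, literals=[0x9000])
zone.add(old_only)
zone.add(has_young, refers_to_young=True)

# An incremental GC moves only the young object at 0x9000.
scanned = zone.map_object_references({0x9000: 0x4000}, full_gc=False)
```

Only the method on the young-referrers list is visited in the incremental case; a full GC would walk every method in the zone.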
>
> 2) As far as I understand, CogMethod doesn't "store/duplicate" the literals
> of the CompiledMethod. Hence, even when you have a jitted method, when you
> need a special literal, you ask the CM for it, using the backPointer
> 'methodObject'. Is this correct?
>
That's not correct. Literals are embedded in machine code, both in inline
caches (selectors and classes) and in literal references. See
Cogit>>annotate:objRef:.
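
To make "embedded in machine code" concrete, here is a minimal Python sketch. It assumes 32-bit little-endian references and a plain list of byte offsets standing in for the annotations; both are simplifications of mine, not how Cogit actually encodes them.

```python
import struct

# Toy model: "machine code" is a bytearray, and the annotation list records
# the byte offsets at which 32-bit object references are embedded.
code = bytearray(16)
annotations = []   # hypothetical stand-in for the method's annotations


def embed_literal(offset, oop):
    """Write an object reference into the code and remember where it lives."""
    struct.pack_into("<I", code, offset, oop)
    annotations.append(offset)


def literals_in_code():
    """Read back every annotated reference, in annotation order."""
    return [struct.unpack_from("<I", code, off)[0] for off in annotations]


def remap_literals(forwarding):
    """Update each embedded reference whose object has moved."""
    for off in annotations:
        (oop,) = struct.unpack_from("<I", code, off)
        struct.pack_into("<I", code, off, forwarding.get(oop, oop))


embed_literal(4, 0x2000)    # e.g. a literal reference
embed_literal(12, 0x3000)   # e.g. a selector in an inline cache
remap_literals({0x2000: 0x2800})
```

The point of the annotations is that the collector can find and update references inside code without decoding the instructions around them.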
>
> 3) This is the most stupid question, but I don't see WHERE the machine code
> is kept. When I jit a method, I get a structure CogMethod, perfect. But
> where is the generated machine code? Where is it kept? How can I know from a
> CogMethod which is the associated machine code?
>
Look at CoInterpreter>>readImageFromFile:HeapSize:StartingAt: (for the real
VM) and CogVMSimulator>>openOn:extraMemory: (for the simulator). These
set up the memory via the variable memory (in the real VM) or 0 (in the
simulator, where the heap starts at address 0), and cogCodeSize. Then see
Cogit>>initializeCodeZoneFrom:upTo: for initialization. The CogMethodZone
is at the start of the heap.
>
> 4) I guess that my thought in 2) is not correct, because otherwise I don't
> understand why you need CoInterpreter>>markAndTraceOrFreeMachineCode:. The
> comment says "Deal with a fullGC's effects on machine code. Either mark and
> trace oops in machine code or free machine-code methods that refer to
> freed oops. The stack pages have already been traced so any methods of
> live stack activations have already been marked and traced."
>
> Which oops do you mean by "oops in machine code"? Literals? The
> back-pointer to the CM?
>
Both, and oops in inline caches.
>
> And by "free machine-code methods that refer to freed oops" what do you
> mean? Literals, or oops such as the back-pointer? I think you mean the
> back-pointer, since the original CM could have been garbage collected and
> since you flag the CogMethod as empty...
>
This is the tracing step that marks live objects. It must identify all
object references in a Cog method. But if the Cog method's bytecoded method
isn't marked it frees the Cog method. See
Cogit>>markAndTraceOrFreeCogMethod:firstVisit:.
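
A minimal sketch of that mark-or-free decision, in Python and with hypothetical names (oops are plain integers and a Cog method is a dict here, which is not how the VM represents them):

```python
# Toy model with hypothetical names; "oops" are just integers here.
def mark_and_trace_or_free(cog_methods, marked, trace):
    """Keep Cog methods whose bytecoded method is marked, tracing their
    embedded references; free (drop) the rest. Returns the survivors."""
    survivors = []
    for m in cog_methods:
        if m["method_object"] in marked:
            for oop in m["refs"]:
                trace(oop)
            survivors.append(m)
        # Otherwise the bytecoded method is unreachable, so the Cog method
        # is freed and its embedded references are never traced.
    return survivors


marked = {0x1000}   # oops already marked by the collector
live = {"method_object": 0x1000, "refs": [0x2000, 0x3000]}
dead = {"method_object": 0x5000, "refs": [0x6000]}

survivors = mark_and_trace_or_free([live, dead], marked, marked.add)
```

Note that the references held only by the freed method (0x6000 here) stay unmarked, so machine code alone does not keep them alive.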
>
> 5) This is not a question; rather, I would like to know whether I
> understood correctly or not. You JIT a method the second time it is used, that
> is, when you find it in the cache. To know how to generate the machine code
> for a particular bytecode, you check the table that you generate with
> #initializeBytecodeTableForClosureV3, where you basically map bytecodes to
> methods that generate the machine code for each bytecode. If it is a
> primitive you use #compilePrimitive instead, which checks a similar table,
> but for primitives, which is set in #initializePrimitiveTableForSqueakV3.
>
Methods are jitted either when found in the cache, or when a block is
invoked in the same method twice in a row (on the second block invocation)
or on the Nth backward jump in a loop or when a method is evaluated via
withArgs:executeMethod: (a doit). Look for transitive senders of
Cogit>>cog:selector:.
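
The first of those triggers (jit when found in the cache) can be sketched in Python as below. This is a toy with hypothetical names: the cache is a plain set keyed by selector, and the block, backward-jump and doit triggers are not modelled.

```python
# Toy model; the cache and the jitted set are hypothetical simplifications.
method_cache = set()
jitted = set()


def send(selector):
    """Return how the send was run: 'interpreted' or 'jitted'."""
    if selector in jitted:
        return "jitted"
    if selector in method_cache:
        jitted.add(selector)       # found in the cache: compile it now
        return "jitted"
    method_cache.add(selector)     # first use: full lookup, then cache it
    return "interpreted"


history = [send("foo"), send("foo"), send("bar"), send("foo")]
```

The effect is that a method used only once never pays the compilation cost.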
>
> Now, I have compiled method XXX (selector xxx) which sends #foo. XXX was
> jitted to CogMethod YYY (selector yyy). When xxx is executed, YYY is
> executed. When YYY was jitted, the table defined in
> #initializeBytecodeTableForClosureV3 determined which specific method
> generates the code for each bytecode; for normal messages this is, in the
> end, #genSend:numArgs:. The machine code generated by that method includes
> the "trampoline" (which is looked up in 'sendTrampolines';
> #generateSendTrampolines shows how you map from one to the other) and sends
> the associated message, in this case #ceSend:super:to:numArgs:. So... the
> #foo will finally be "handled" in ceSend:super:to:numArgs:. This is ONLY
> true if the send is "unlinked". If #foo was in fact also jitted, then you
> try to link it (to avoid searching the cache next time???). Suppose you
> could link both of them, so next time YYY is executed, it will call the
> CogMethod of #foo DIRECTLY. In this case, the method to be executed in the
> VM is #executeCogMethodFromLinkedSend:withReceiver: instead of
> #ceSend:super:to:numArgs:
>
> So... am I delirious, or is that more or less correct?
>
More or less. Yes. Have you read
http://www.mirandabanda.org/cogblog/2011/03/01/build-me-a-jit-as-fast-as-you-can/?
It covers ceSend:... in detail.
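
For orientation, the unlinked-then-linked behaviour can be modelled in a few lines of Python. This is only a sketch under my own assumptions: the names are hypothetical, the receiver-class check that a real inline cache performs is omitted, and the interpreter fallback is not modelled.

```python
# Toy model of an unlinked send becoming a linked send; all names hypothetical.
class CallSite:
    def __init__(self, selector):
        self.selector = selector
        self.target = None    # None means "unlinked": go via the slow path


lookups = []                  # records each slow-path lookup
jitted_methods = {"foo": lambda receiver: f"{receiver} foo"}


def slow_send(site, receiver):
    """Stand-in for the trampoline's slow path: look up, then link if jitted."""
    lookups.append(site.selector)
    target = jitted_methods.get(site.selector)
    if target is None:
        raise LookupError(site.selector)   # interpreter path is not modelled
    site.target = target                   # link: later sends skip the lookup
    return target(receiver)


def send(site, receiver):
    if site.target is None:
        return slow_send(site, receiver)
    return site.target(receiver)           # linked: call the target directly


site = CallSite("foo")
first = send(site, "obj")
second = send(site, "obj")
```

Only the first send goes through the lookup; the second one calls the linked target directly.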
>
> Thanks a lot in advance,
>
you're welcome.
>
> --
> Mariano
> http://marianopeck.wordpress.com
>
>
>