[Vm-dev] Exploring the simulator (was Re: REPL image for simulation)

Clément Bera bera.clement at gmail.com
Mon May 30 14:12:43 UTC 2016


Hi !

On Mon, May 30, 2016 at 2:39 PM, Ben Coman <btc at openinworld.com> wrote:

>
> On Sun, May 29, 2016 at 10:14 AM, Ben Coman <btc at openinworld.com> wrote:
> > Hi Clement, Thanks for your detailed reply.  I particularly liked your
> > warm up exercises.  Goal directed learning is better than general
> > browsing.
> >
> > On Tue, May 24, 2016 at 1:29 AM, Clément Bera <bera.clement at gmail.com>
> wrote:
> >>
> >> Hi Ben,
> >>
> >> The REPL image expects chunk format. Hence you need to write "3 + 4 !"
> >>
> >> To get warmed-up:
> >> 1) Inspect the object memory, then look for the first class table page
> instance variable. It's an oop referencing an array, try in the simulator
> to "printOop:" the address of the first class table page that you found. It
> should print it in the Transcript, the first entries are immediate, in
> Spur32 SmallInteger/Character/SmallInteger.
> >
> > The inspector showed a Spur32MMLECoSimulator and classTableFirstPage
> > held 16r5311F8. Plugging that into [print oop...] showed...
>
>   16r5311F8: a(n) Array
>    16r52D108 nil  16r15C3A50 class SmallInteger   16r878D70 class
> Character  16r15C3A50 class SmallInteger
>   16r1111DC0 class SmallFloat64   16r52D108 nil   16r52D108 nil
>  16r52D108 nil
>    16r52D108 nil   16r52D108 nil   16r52D108 nil   16r52D108 nil
>    16r52D108 nil   16r52D108 nil   16r52D108 nil   16r52D108 nil
>    16r87AE60 class Array   16r52D108 nil   16r52D108 nil   16r52D108 nil
>    16r52D108 nil   16r52D108 nil   16r52D108 nil   16r52D108 nil
>    16r52D108 nil   16r52D108 nil   16r52D108 nil   16r52D108 nil
>    16r52D108 nil   16r52D108 nil   16r52D108 nil   16r52D108 nil
>    16r878C58 class LargeNegativeInteger   16r878C90 class
> LargePositiveInteger  16r10AEAE8 class BoxedFloat64   16r879438 class
> Message


> All the nils I guess are due to the class table being a hash map?
>
> Is there some way from within the simulation to reference an object by
> its hex number.  For example, to use the size of that array from
> within the simulation, something like...
>
>    classTableSize := 16r5311F8 objectFromHex size
>

You got the right result. The class table is a linked list of pages,
each page being an array. The first page, shown here, is reserved for
frequently used classes.

Indexes 0-15 are reserved for tagged objects.
Indexes 16-31 are reserved for hidden classes. Typically the class table
pages are instances of Array, but they use index 16 so the VM knows they are
hidden.
The rest is for real classes that are frequently used. There are many nils
so we have free space for new features. It's not a hash map.
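
To make the layout of the first page concrete, here is a small Python
sketch (not VM code) that classifies an index into the three regions
described above. The function name and region labels are made up for
illustration, and the 16-31 boundary for hidden classes is an assumption
read off the dump quoted above, where index 32 already holds
LargeNegativeInteger.

```python
def class_table_region(index):
    """Classify an index of the first class table page (illustrative only)."""
    if index < 0:
        raise ValueError("class index must be non-negative")
    if index <= 15:
        return "tagged"    # reserved for tagged (immediate) objects
    if index <= 31:
        return "hidden"    # assumption: hidden classes stop at 31
    return "regular"       # frequently used real classes

# Non-nil entries visible in the dump quoted above, by position.
known = {
    1: "SmallInteger", 2: "Character", 3: "SmallInteger", 4: "SmallFloat64",
    16: "Array", 32: "LargeNegativeInteger", 33: "LargePositiveInteger",
    34: "BoxedFloat64", 35: "Message",
}
for index, name in sorted(known.items()):
    print(index, name, class_table_region(index))
```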

I don't think anything like that exists:
   classTableSize := 16r5311F8 objectFromHex size
For oops, debugging features are currently tied to printing through the
simulator instance. However, there is something like that in the JIT. In
the machine code zone we can access part of the bytes as a
CogMethodSurrogate (or one of its subclasses), and in the stack we can do
the same for stack pages with the corresponding surrogate. In this case one
can do something like:
CogMethodSurrogate at: 16r51578 objectMemory: objectMemory cogit: cogit
And then one can ask the surrogate things like:
surrogate cmRefersToYoung
It then reads the correct bytes for you, in this case answering whether the
cog method has a reference to a young object.
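
The surrogate idea (an object that remembers an address and reads fields
out of raw memory on demand) can be sketched in Python as follows. This is
only an illustration of the pattern: the class name, the field offset and
the flag bit are hypothetical and do not reflect the real Cog method
header layout.

```python
import struct

class MethodSurrogateSketch:
    """Typed view over raw bytes at an address (hypothetical layout)."""

    FLAGS_OFFSET = 4            # assumption: 32-bit flags word at offset 4
    REFERS_TO_YOUNG_BIT = 0x1   # assumption: bit 0 marks a young reference

    def __init__(self, address, memory):
        self.address = address
        self.memory = memory

    def _flags(self):
        # Read a little-endian 32-bit word straight from the backing memory.
        (flags,) = struct.unpack_from(
            "<I", self.memory, self.address + self.FLAGS_OFFSET
        )
        return flags

    def refers_to_young(self):
        return bool(self._flags() & self.REFERS_TO_YOUNG_BIT)

# Fake "machine code zone": 64 zeroed bytes with one flag set at address 16.
memory = bytearray(64)
struct.pack_into("<I", memory, 16 + MethodSurrogateSketch.FLAGS_OFFSET, 0x1)

surrogate = MethodSurrogateSketch(16, memory)
print(surrogate.refers_to_young())
```

The point of the design is that the surrogate holds no copy of the data, so
it always reflects the current contents of the simulated memory.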


> >
> >
> >> 2) print the active stack, look for the method's address. Try to print
> it as an oop, and if it tells you "address in the machine code zone", print
> the cog method and its machine code instead.
>
> I presume is the active stack is
> [print call stack] which produces...
>
>   16r1012F8 M MultiByteFileStream(StandardFileStream)>basicNext
> 16r1E7408: a(n) MultiByteFileStream
>   16r101334 M UTF8TextConverter>nextFromStream: 16r1EA418: a(n)
> UTF8TextConverter
>   16r10135C M MultiByteFileStream>next 16r1E7408: a(n) MultiByteFileStream
>   16r10138C I MultiByteFileStream(PositionableStream)>nextChunkNoTag
> 16r1E7408: a(n) MultiByteFileStream
>   16r1013B0 I StdioListener>run 16r1E7C98: a(n) StdioListener
>   16r1013D0 I [] in UndefinedObject>(nil) 16r52D108: a(n) UndefinedObject
>   16r1013F0 I [] in BlockClosure>newProcess 16r1E7E00: a(n) BlockClosure
> ---------
>
> [print oop...] 16r1012F8   tells me...
>     16r1012F8 is in the stack zone
>
> [print cog method for...] 16r1012F8    tells me...
>     not a method
>
> [print mc/cog frame]   says...
> Assertion failed
> with debugger at CogVMSimulatorLSD(CoInterpreter)>>isMachineCodeFrame:
>
> So I seem to be missing something.
>
>
> I restarted the simulator and this time...
> [print call stack...]
>   16r1012F8 M MultiByteFileStream(StandardFileStream)>basicNext
> 16r2A1BA8: a(n) MultiByteFileStream
>   16r101334 M UTF8TextConverter>nextFromStream: 16r2A2188: a(n)
> UTF8TextConverter
>   16r10135C M MultiByteFileStream>next 16r2A1BA8: a(n) MultiByteFileStream
>   16r10138C I MultiByteFileStream(PositionableStream)>nextChunkNoTag
> 16r2A1BA8: a(n) MultiByteFileStream
>   16r1013B0 I StdioListener>run 16r2A1B00: a(n) StdioListener
>   16r1013D0 I [] in UndefinedObject>(nil) 16r52D108: a(n) UndefinedObject
>   16r1013F0 I [] in BlockClosure>newProcess 16r2A1690: a(n) BlockClosure
> ----------
>
> [print oop...] 16r1012F8
>   16r1012F8 is in the stack zone
>
> [print oop...] 16r2A1BA8
>   16r2A1BA8: a(n) MultiByteFileStream
>    16r2A2740
> '????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????'
>        16r1 =0 (16r0)        16r1 =0 (16r0)   16r52D108 nil
>    16r52D108 nil   16r52D118 false   16r9AA1E8 #stdin   16r24CC18 a
> ByteArray
>    16r2A2728 '?'   16r52D108 nil   16r2A2188 an UTF8TextConverter
> 16r6DF1D8 #lf
>    16r52D128 true
> -----------
>
> Now after doing 3+4! several times,
> [print call stack...] produces...
>   16r101300 M MultiByteFileStream(StandardFileStream)>basicNext
> 16r2A1BA8: a(n) MultiByteFileStream
>   16r10133C M UTF8TextConverter>nextFromStream: 16r2A2188: a(n)
> UTF8TextConverter
>   16r101364 M MultiByteFileStream>next 16r2A1BA8: a(n) MultiByteFileStream
>   16r10138C M MultiByteFileStream(PositionableStream)>nextChunkNoTag
> 16r2A1BA8: a(n) MultiByteFileStream
>   16r1013B0 I StdioListener>run 16r2A1B00: a(n) StdioListener
>   16r1013D0 I [] in UndefinedObject>(nil) 16r52D108: a(n) UndefinedObject
>   16r1013F0 I [] in BlockClosure>newProcess 16r2A1690: a(n) BlockClosure
>
> btw, What is the meaning of the M and I in the second column?  I
> notice that 16r10138C has changed from an I to a M.
>
> Also the address associated with basicNext changed from 16r1012F8 to
> 16r101300. Can some meaning be inferred from that?
>

Some explanations are needed here :-)

The M or I at the beginning of each printed line stands for 'Machine code
frame' or 'Interpreted frame', respectively.

When you do [print call stack], you print the list of stack frames in the
current stack. For example,
16r101300 M MultiByteFileStream(StandardFileStream)>basicNext
means that:
- the stack frame address in the stack zone is 16r101300
- the machine code version of the method is executed in this frame (M and
not I).
- the receiver is an instance of MultiByteFileStream
- the stack frame on top of the stack is the activation for the method
StandardFileStream>>basicNext
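
The reading of such a line given above can be sketched as a small Python
parser. It only mirrors the textual format seen in this thread. The regular
expression, the function name and the dictionary keys are my own, and block
frames ("[] in ...") are deliberately left out.

```python
import re

# Matches plain method frames such as
#   16r101300 M MultiByteFileStream(StandardFileStream)>basicNext
# Anything after the selector (receiver oop, etc.) is ignored.
FRAME_LINE = re.compile(
    r"^\s*16r(?P<frame>[0-9A-Fa-f]+)\s+(?P<kind>[MI])\s+"
    r"(?P<receiver>\w+)(?:\((?P<definer>\w+)\))?>(?P<selector>\S+)"
)

KINDS = {"M": "machine code", "I": "interpreted"}

def parse_frame_line(line):
    """Return the parts of a call stack line, or None if it does not match."""
    match = FRAME_LINE.match(line)
    if match is None:
        return None
    receiver = match.group("receiver")
    return {
        "frame_address": int(match.group("frame"), 16),
        "kind": KINDS[match.group("kind")],
        "receiver_class": receiver,
        # Without parentheses the method is defined in the receiver's class.
        "defining_class": match.group("definer") or receiver,
        "selector": match.group("selector"),
    }

info = parse_frame_line(
    "16r101300 M MultiByteFileStream(StandardFileStream)>basicNext"
)
print(info)
```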

Now what you tried to do is to print the frame as a method, and that won't
work (it's not obvious, and my exercise was not very precise, sorry).

You can use [print frame ...] and enter the frame's hex address to print it.
Alternatively, as you usually want the top (a.k.a. head) frame, you can
directly use [print ext head frame] if it's a machine code frame.

That should print something like this (I print a random frame here):
 16r103160:        arg1:     16r8239 =16668(16r411C)
  16r10315C:        arg2:     16r825D =16686(16r412E)
  16r103158:        arg3:   16r2E21C0 =a(n) Point
  16r103154:        arg4:   16r9BC480 =a(n) StrikeFont
  16r103150:        arg5:        16r1 =0(16r0)
  16r10314C:   caller ip:    16r564E0=353504
  16r103148:    saved fp:   16r103184=1061252
  16r103144:      method:    16r51578 16r102BDD0 16r102BDD0: a(n)
CompiledMethod
  16r103144: mcfrm flags:        16r0  numArgs: 6 noContext notBlock
  16r103140:     context:   16r52D108 nil
  16r10313C:    receiver:   16r2E0078 a GrafPort
  16r103138:        stck:   16r2E0078 a GrafPort
  16r103134:        stck:  16r1A115D0
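
A side note on the arg lines in this dump: values such as
16r8239 =16668(16r411C) are tagged SmallIntegers. The Python sketch below
decodes them under the assumption (consistent with the numbers printed
here) that in Spur32 a SmallInteger oop has its lowest bit set and carries
the value in the remaining 31 bits. The function name is made up for
illustration.

```python
def decode_small_integer(oop):
    """Decode a 32-bit tagged SmallInteger oop (assumed: low bit is the tag)."""
    if oop & 1 != 1:
        raise ValueError("not a SmallInteger oop: low tag bit is clear")
    if oop >= 1 << 31:
        oop -= 1 << 32      # reinterpret the 32-bit word as signed
    return oop >> 1         # arithmetic shift drops the tag bit

# Raw words taken from the frame dump above.
for oop in (0x8239, 0x825D, 0x1):
    print(hex(oop), "=", decode_small_integer(oop))
```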

Now that you've printed the frame, you can see the method addresses in this
line:
16r103144:      method:    16r51578  16r102BDD0 16r102BDD0: a(n)
CompiledMethod.
This is a machine code frame, so the method has two addresses:
16r51578 => in the machine code zone. This is the generated (machine code)
version of the method, so you need to use [disassembleMethod/trampoline...]
and enter the hex address to see the disassembly. (Toggle Transcript first
and open a large Transcript if you do that.)
16r102BDD0 => in the heap. This is the bytecode version of the method. You
can print it using [print oop...]


Ok ! One last warm-up exercise:

3) When the simulator has started and the REPL window has popped up, select
[single step]. Then enter something in the REPL window and execute it. Once
done, do [report recent instructions]. You should be able to see in the
Transcript the last 100 machine instructions, with the register state
in between each instruction.
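
To picture what a "last 100 instructions" report implies, here is a minimal
Python sketch of a bounded history that keeps each instruction together
with a snapshot of the registers. It is not how the simulator implements
the feature. The class name, the register names and the recorded data are
all made up for illustration.

```python
from collections import deque

class RecentInstructions:
    """Keep only the most recent `limit` executed instructions."""

    def __init__(self, limit=100):
        # A deque with maxlen silently drops the oldest entry when full.
        self._history = deque(maxlen=limit)

    def record(self, address, mnemonic, registers):
        # Copy the registers so later changes do not alter the snapshot.
        self._history.append((address, mnemonic, dict(registers)))

    def report(self):
        return list(self._history)

log = RecentInstructions(limit=100)
for step in range(250):
    # Made-up instruction stream: only the last 100 steps are retained.
    log.record(0x51578 + 4 * step, "nop", {"eax": step})

recent = log.report()
print(len(recent), recent[0][2], recent[-1][2])
```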

I think I should write up these exercises as a blog post...





> cheers -ben
>