[Vm-dev] What are classes defined with variableSubclass?
eliot.miranda at gmail.com
Thu Jun 9 18:34:53 UTC 2011
On Wed, Jun 8, 2011 at 6:41 PM, Javier Pimás <elpochodelagente at gmail.com> wrote:
> Thanks to both Mariano and you for the quick answers, they helped a lot. Now
> let's get our hands dirty: I want to understand the internals of these tiny
> animals a bit better. I have a MethodDictionary, with tally=1 and an array
> of size 32, filled with all nils except for the last position. Looking at it
> with gdb, including the header, I get
> 0x78c68dd4: 0x779cadf5 0x1848038d 0x00000003 0x78c68e64
> 0x78c68de4: 0x77831004 0x77831004 0x77831004 0x77831004
> 0x78c68df4: 0x77831004 0x77831004 0x77831004 0x77831004
> 0x78c68e04: 0x77831004 0x77831004 0x77831004 0x77831004
> 0x78c68e14: 0x77831004 0x77831004 0x77831004 0x77831004
> 0x78c68e24: 0x77831004 0x77831004 0x77831004 0x77831004
> 0x78c68e34: 0x77831004 0x77831004 0x77831004 0x77831004
> 0x78c68e44: 0x77831004 0x77831004 0x77831004 0x77831004
> 0x78c68e54: 0x77831004 0x77831004 0x77831004 0x778cb5d4
> 0x78c68e64: 0x00da3287 0x77831004 0x77831004 0x77831004
> where the second int (0x1848038d) is the base header (the oop points to
> 0x78c68dd8). The first field is a SmallInteger for 1, I guess.
Instead of using raw gdb, you could try using gdb plus the debug routines
included in the VM (or use the VM simulator). E.g. set a breakpoint in
interpret, run your favourite image (in this case an updated trunk
Squeak 4.2) and examine it immediately after loading:
Breakpoint 1, interpret () at
(gdb) call printOop(nilObj)
0x141fc004: a(n) UndefinedObject
find its class UndefinedObject:
(gdb) call printOop(fetchClassOf(nilObj))
0x146f0c34: a(n) UndefinedObject class
0x147ff778 0x14d5cdd8 0x5 0x141fc004 0x14255af0
0x141fc004 0x1438e6e0 0x14696580 0x141fc004 0x141fc004
print its method dictionary, the second inst var, 0x14d5cdd8
(gdb) call printOop(0x14d5cdd8)
0x14d5cdd8: a(n) MethodDictionary
0x55 0x14d5ceec 0x141fc004 0x141fc004 0x141fc004
0x1438fdac 0x141fc004 0x141fc004 0x141fc004 0x141fc004
0x141fc004 0x141fc004 0x141fc004 0x143a1448 0x141fc004
0x141fc004 0x141fc004 0x143c8290 0x141fc004 0x14386320
0x1511d0ec 0x1438fc90 0x141fc004 0x14386a70 0x1439d52c
0x143b53c8 0x1438db24 0x1483bc30 0x1439d53c 0x143a07c4
0x143c9634 0x143e6ca4 0x144095e4 0x14386730 0x14390080
0x1438ff04 0x14424624 0x14402638 0x1511d004 0x143869c4
0x14386530 0x1439f5d4 0x1438ffb4 0x143864b8 0x14386a08
0x143c20d4 0x14386a18 0x1438647c 0x143868cc 0x141fc004
0x1438f800 0x143aba20 0x141fc004 0x141fc004 0x14424608
0x1438b2f0 0x1438d1b0 0x144095fc 0x141fc004 0x141fc004
0x141fc004 0x143a4ce4 0x143c44e4 0x143c82a4
and hex 55 is 2 * 42 + 1, so the dictionary has 42 entries (how appropriate)
and, by counting, 64 slots.
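The tag arithmetic above can be sketched in a few lines. This is only an illustration, assuming 32-bit oops in which a set low bit marks an immediate SmallInteger; the function names are made up for the example and are not VM routines.

```python
def is_small_integer(oop: int) -> bool:
    """Assumption: a 32-bit oop with the low bit set is a SmallInteger."""
    return (oop & 1) == 1


def small_integer_value(oop: int) -> int:
    """Undo the 2 * value + 1 encoding of a tagged 32-bit oop."""
    if not is_small_integer(oop):
        raise ValueError(f"{oop:#x} is an object pointer, not a SmallInteger")
    # Reinterpret the word as signed 32-bit first, so that negative
    # SmallIntegers survive the arithmetic shift.
    signed = oop - (1 << 32) if oop & 0x80000000 else oop
    return signed >> 1


tally = small_integer_value(0x55)      # tally word from the printOop output
raw_tally = small_integer_value(0x3)   # tally word from the quoted raw dump
print(tally, raw_tally)
```

The same decoding applied to the 0x00000003 in the raw gdb dump gives the tally of 1 that was guessed there.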
(gdb) call printOop(0x14d5ceec)
0x14d5ceec: a(n) Array
0x141fc004 0x141fc004 0x141fc004 0x14661cdc 0x141fc004
0x141fc004 0x141fc004 0x141fc004 0x141fc004 0x141fc004
0x141fc004 0x14661cf4 0x141fc004 0x141fc004 0x141fc004
0x14661d0c 0x141fc004 0x1486564c 0x1513faa0 0x14661d28
0x141fc004 0x14661d4c 0x14661d70 0x14661d88 0x14661da0
0x1483bcb4 0x14661dc0 0x14661dd8 0x14661df0 0x14661e28
0x14661e40 0x14661e58 0x14661e6c 0x14661ea4 0x14661ebc
0x14661ed4 0x1513fa80 0x14661ef4 0x14661f18 0x14661f2c
0x14661f40 0x14661f80 0x14661fa4 0x14661fb8 0x14661fd0
0x14661fe4 0x14661ff8 0x141fc004 0x1466200c 0x1466202c
0x141fc004 0x141fc004 0x14662048 0x146620cc 0x14662114
0x1466212c 0x141fc004 0x141fc004 0x141fc004 0x14865690
0x14662150 0x14662164 0x141fc004 0x14662180
> Now the questions: in the header, size bits are 100011, why?
Does it? Anyway, the header constants are set forth in ObjectMemory
class>>initializeObjectHeaderConstants, and used in ObjectMemory's header
access protocol in methods such as ObjectMemory>>sizeBitsOf:.
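As a rough sketch of what those constants describe, here is the quoted base header word taken apart. The field positions are an assumption from memory of the 32-bit (pre-Spur) layout: 2 type bits, 6 size bits counting 32-bit words including the base header, 4 format bits, 5 compact class index bits, 12 hash bits, 3 GC bits. Check them against initializeObjectHeaderConstants before relying on them; decode_base_header is a made-up name.

```python
def decode_base_header(header: int) -> dict:
    """Split a 32-bit base header word into fields (assumed layout)."""
    return {
        "type": header & 0x3,                   # bits 0-1: header type
        "size_words": (header >> 2) & 0x3F,     # bits 2-7: words, incl. header
        "format": (header >> 8) & 0xF,          # bits 8-11: object format
        "compact_class": (header >> 12) & 0x1F, # bits 12-16: 0 = use class word
        "hash": (header >> 17) & 0xFFF,         # bits 17-28: identity hash
        "gc": (header >> 29) & 0x7,             # bits 29-31: GC/mark bits
    }


fields = decode_base_header(0x1848038D)  # base header from the quoted dump
print(fields)

# Under this layout the size field counts the base header word itself,
# the two named inst vars (tally, array) and the 32 indexed slots.
expected_words = 1 + 2 + 32
```

If the assumed layout is right, the size bits 100011 are 35 words, which accounts for the base header plus 2 named and 32 indexed fields, and the format field comes out as the 3 reported earlier in the thread.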
> Also, the second object would be the array oop in case the object weren't
> flat, but in this case it seems to point to the position past the last
> variable field, am I guessing right?
Don't guess :) Read the source and work it out.
> How does the VM manage this flattening of the instance vars?
As far as the GC is concerned there is nothing special about the object; it
is just a vector of object pointers. So the flattening (really just the
stepping over of named inst vars) is handled by the at: and at:put: code,
which is in Interpreter/StackInterpreter's indexing primitive support
protocol in methods stObject:at:, stObject:at:put: and subscript:with:format:.
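To make the "stepping over" concrete, here is a toy model, not the VM code: the object is a plain list of slots, the number of named inst vars is passed in explicitly, and st_at / st_at_put are hypothetical names chosen to echo the methods mentioned above.

```python
def st_at(slots: list, fixed_fields: int, index: int):
    """1-based at: on a flat object; named inst vars are skipped."""
    indexable = len(slots) - fixed_fields
    if not 1 <= index <= indexable:
        raise IndexError(f"index {index} out of bounds 1..{indexable}")
    return slots[fixed_fields + index - 1]


def st_at_put(slots: list, fixed_fields: int, index: int, value):
    """1-based at:put: on a flat object; named inst vars are skipped."""
    indexable = len(slots) - fixed_fields
    if not 1 <= index <= indexable:
        raise IndexError(f"index {index} out of bounds 1..{indexable}")
    slots[fixed_fields + index - 1] = value
    return value


# Model of the MethodDictionary from the question: two named inst vars
# (tally, array) followed by 32 indexed slots, only the last one non-nil.
method_dict = [1, "anArray"] + [None] * 31 + ["aSelector"]
last = st_at(method_dict, 2, 32)
print(last)
```

The GC would see method_dict as 34 pointers with no distinction; only the indexing code needs to know that the first two are named.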
> On Wed, Jun 8, 2011 at 2:01 PM, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>> On Wed, Jun 8, 2011 at 8:35 AM, Javier Pimás <elpochodelagente at gmail.com> wrote:
>>> Hi, this is another simple (I hope) question:
>>> I have an instance of MethodDictionary, which is defined as
>>> Dictionary variableSubclass: #MethodDictionary
>>> instanceVariableNames: ''
>>> classVariableNames: ''
>>> poolDictionaries: ''
>>> category: 'Kernel-Methods'
>>> Looking at the format field, it says 3, which I understand means that
>>> instances have both fixed and indexed oops. The question is, what does that
>>> mean? Or better, why is it that way if the class only defines an array and a
>>> tally as instance variables?
>> An object with both named and indexed inst vars is flat, i.e. is only a
>> single object. For example MethodContext. An object with an array to hold
>> its variable objects, e.g. the current OrderedCollection, is not flat, i.e.
>> two objects. The distinction is to do with the efficiency of implementing
>> become. In the original Smalltalk-80 implementations and in the VisualWorks
>> VM objects in the heap are split into a header and a body, with the header
>> containing a pointer to the body (and references to objects are pointers to
>> object headers). This makes certain algorithms like compaction easy to
>> implement, but it also results in a cheap become.
>> When Squeak was implemented it was decided to use flat objects in the VM,
>> following the lead of David Ungar's Berkeley Smalltalk implementation, and
>> of the subsequent Self implementations, all of which also use flat objects.
>> Flat objects make for faster allocation and faster inst var access (since
>> accessing an inst var doesn't require the double indirection of following
>> the pointer to the header and then the pointer to the body). But it makes
>> become very much more expensive, since in the worst case the VM must scan
>> the entire heap looking for references to the objects in the become
>> operation and replacing them by references to their corresponding objects.
>> The solution David Ungar developed, which was adopted by Squeak, was to
>> unflatten objects that used become to grow, such as OrderedCollection, Set
>> and Dictionary, and use an array to hold their variable part.
>> A key point is that objects such as OrderedCollection, Set and Dictionary
>> encapsulate their state and so are free to grow by allocating a larger array
>> and copying the contents from the old to the new array. There is still an
>> issue with streams, which have also been changed not to use become when
>> growing their collections. It used to be the case that one could use
>> streams to grow objects, since in Smalltalk-80 they used become. But this
>> was not used very often and is easily worked around.
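The cost difference in the quoted explanation can be illustrated with a toy model. Nothing here is VM code: ObjectTable and flat_become are hypothetical names, references in the indirect model are just table indices, and the flat "heap" is a Python list of lists.

```python
class ObjectTable:
    """Indirect model: a reference is an index into a table of bodies."""

    def __init__(self):
        self.bodies = []

    def allocate(self, body: list) -> int:
        self.bodies.append(body)
        return len(self.bodies) - 1

    def become(self, a: int, b: int) -> None:
        # Swapping two body pointers is enough; no referrer is touched.
        self.bodies[a], self.bodies[b] = self.bodies[b], self.bodies[a]


def flat_become(heap: list, a: list, b: list) -> int:
    """Flat model: swap every reference to a and b; returns slots scanned."""
    visited = 0
    for obj in heap:
        for i, ref in enumerate(obj):
            visited += 1
            if ref is a:
                obj[i] = b
            elif ref is b:
                obj[i] = a
    return visited


table = ObjectTable()
small = table.allocate([1, 2])
big = table.allocate([1, 2, None, None])
table.become(small, big)   # 'small' now names the larger body

a, b = ["a"], ["b"]
holder1, holder2 = [a, b], [a]
scanned = flat_become([a, b, holder1, holder2], a, b)
print(scanned)
```

In the indirect model the work is constant; in the flat model it grows with the number of slots in the heap, which is why a collection that grows by replacing its own backing array avoids become altogether.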
>>> Also, looking at the header, I see some strange things in the size field
>>> (being bigger than what I'd expect). What is the format of size field in
>>> this case?
>>> Javier Pimás
>>> Ciudad de Buenos Aires
> Javier Pimás
> Ciudad de Buenos Aires