[squeak-dev] Re: image.version0.82.bits.html (was Re: Re: The solution of)

Klaus D. Witzel klaus.witzel at cobss.com
Wed Aug 13 11:54:53 UTC 2008


Hi Tony,

on Wed, 13 Aug 2008 09:47:52 +0200, you wrote:

> Hi Klaus,
>
> Klaus D. Witzel wrote:
>> - http://squeak.cobss.ch/ImageSpecs/image.version0.82.bits.html
>> That page contains a short description and the data, let me know when
>> something is missing or not clear.
>
> I like it a lot!

I'm happy you like it :) Below I attempt to answer your questions
and comment on your suggestions (the message got a bit longish, sorry ;)

> I've had a bit of fun throwing together a wee program
> to explore it (hideous python script attached).

:) You can make (more) use of the integer constants in line zero; I, for
example, have no text/word constants in my spec data reader ;) and the
extra indices appended behind the text in line zero are for convenience
only. Another invariant is the position of the Class row and the MetaClass
row. Together with -1, this paragraph contains all the axioms ;)

Another interesting and important subproblem is oop reconstruction. Igor
advised me to make use of stubs, and I intend to use instances of
Association (class index -> instance index) as stubs at reader time. These
can serve as already perfect oops and then be resolved away in a 2nd
swoop/slide over the rows. That way the time complexity is O(n+m), and the
space complexity as well.
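
A minimal Python sketch of such a two-pass reconstruction follows. It is an illustration only, not the actual reader: the in-memory row layout (a dict from class index to a list of instances, each a list of (class index, instance index) pairs) and the names Stub, Obj, read_rows and resolve are assumptions made for this example.

```python
from typing import NamedTuple


class Stub(NamedTuple):
    """Placeholder for an object not yet built: class index -> instance index."""
    class_index: int
    instance_index: int


class Obj:
    """A reconstructed object: its class index and a list of slots."""
    def __init__(self, class_index, slots):
        self.class_index = class_index
        self.slots = slots


def read_rows(rows):
    """Pass 1: create every object, leaving Stubs in all of its slots."""
    table = {}
    for class_index, instances in rows.items():
        for instance_index, pairs in enumerate(instances):
            slots = [Stub(c, i) for c, i in pairs]
            table[(class_index, instance_index)] = Obj(class_index, slots)
    return table


def resolve(table):
    """Pass 2: replace each Stub by the object it names."""
    for obj in table.values():
        obj.slots = [table[(s.class_index, s.instance_index)] for s in obj.slots]
    return table


# Hypothetical data: two objects that point at each other (and at themselves).
rows = {
    1: [[(2, 0)]],          # class 1, instance 0 -> class 2, instance 0
    2: [[(1, 0), (2, 0)]],  # class 2, instance 0 -> class 1 inst. 0, itself
}
table = resolve(read_rows(rows))
```

In this sketch each object and each slot is touched once per pass, so forward references and cycles cost nothing extra.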

> I have some questions
> for you:
>
> 1. It seems like there are two representations of SmallIntegers, e.g. "0
> 5" and "83 500" for 5 and 500, respectively.

Yes, small integer 5 should be represented by the pair "0 5", and the 500
(and the 300 to the right of it) in the same way. You found a bug (thank
you :) which slipped through until now! See also the next paragraph.

> Do I have that right? Why is it done that way?

When filing in the spec data (initially for a Palm iiiC and Tungsten|C),  
class SmallInteger is not needed for creating instances of it, therefore  
"0 5". And all classes must be initialized with oops for their slots, so  
this recursion is not needed for "0 5" pairs+friends. But the bug slipped  
through because at the time "83 500" is resolved, class SmallInteger  
already exists. The bug will be fixed with the next update :)

Note that SmallInteger was renamed from its original SmallInt (not a good  
idea, never change history!), change will be reverted soon. But the "typo"  
fix will be kept in Magnatude's name ;) and Number class' name change will  
also be kept as the convention of the original author wants it for all the  
meta classes.

> 2. Why do sometimes the number of instance variables not line up with
> the number declared in an object's class? For instance,

Let's s/instvars/slots/ to avoid confusion at this level, thank you.

>   MetaApplication has 6 instvars, but Class claims its instances have 5
>   MetaColor has 6 instvars, but Class claims its instances have 5
>   MetaImage has 6 instvars, but Class claims its instances have 5
>   MetaPoint has 6 instvars, but Class claims its instances have 5
>   String has 6 instvars, but MetaString claims its instances have 5

Compute (aClass class instanceVariables size) for the correct number of  
slots as intended by the original author of the system. Method  
#instanceVariables goes up the superclass ladder as usual and sums the #  
of variables declared by the contributing entities. And class variables  
are instance variables in a class' class, in this system.
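
The summing up the superclass ladder can be sketched in Python like this. The class ClassDesc, the hierarchy and the variable names are invented for illustration and do not come from the spec data.

```python
class ClassDesc:
    """Minimal stand-in for a class object: a name, the variables it
    declares itself, and an optional superclass."""

    def __init__(self, name, variables, superclass=None):
        self.name = name
        self.variables = list(variables)
        self.superclass = superclass

    def instance_variables(self):
        """Walk up the superclass ladder and collect every declared
        variable, inherited ones first."""
        inherited = (self.superclass.instance_variables()
                     if self.superclass is not None else [])
        return inherited + self.variables


# Hypothetical hierarchy: MetaThing plays the role of a class' class, so
# the variable it declares acts as a class variable of its sole instance.
base = ClassDesc("Base", ["name", "size"])
meta_thing = ClassDesc("MetaThing", ["counter"], superclass=base)

slot_count = len(meta_thing.instance_variables())
```

Here slot_count is the number of slots an instance of MetaThing would need: two inherited from Base plus one declared by MetaThing itself.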

You are correct with the other part of your observation: class String
indeed has 6 slots but has or inherits no class variables, and similarly
for the others that you mention. This has historical reasons: the original
class builder was a human being who has a method for adding variables but
no method for removing them. So you see some unused (usually nil'ed) slots
which could be reactivated by increasing the value in the 'size' slot and
appending an item to the 'variables' slot's oop (an array).

> 3. Why are Chars represented specially? Couldn't they be represented in
> the same style as Methods?

For a similar reason as with SmallInts: the spec data of a Char's code
point doesn't need to mention the "primordial" code point's class.

The same holds for ByteArrays and would hold for Floats.

> 4. Why are False, True and Undefined represented specially? Couldn't
> they, too, be in the same style as Methods?

There was no intention to represent them specially; it's a consequence of
this: the instances of False, True and Undefined are not transferred, they
are created upon arrival. When -1 was created for "nothing more of this
instance data can come from this row" it became systematically used for
indicating the presence of *instances* and, as a consequence, the absence
of transferred *data* for instances of False, True and Undefined.

> 5. Why don't all the rows have "0 -1" as the final two columns, if they
> have no special instances?

The -1 signals absence of *more* instance *data*, but also that one or  
more instances *must* be created on arrival. The cases you mention have no  
instances to be created on arrival (for the system to get going properly).

> And finally, I have a suggestion: if each row had an indication of how
> many fields to expect before the human-readable name and the instance
> data, that'd make parsing simpler.

The rows are parsed by a flat loop, like "2 to: remainingFileLength by: 2  
do:", with a single #peek (or equiv.) in it for recognizing the human  
readable, redundant name item (the code here is conceptual level).
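
A rough Python rendering of such a flat loop is shown below. The row layout is an assumption made for this sketch (whitespace-separated integer pairs with one non-numeric name token in between), and the sample row is made up, so treat it as an illustration of the single peek rather than as a reader for the real spec data.

```python
def is_int(token):
    """Return True if the token parses as a (possibly negative) integer."""
    try:
        int(token)
        return True
    except ValueError:
        return False


def parse_row(line):
    """Walk a row two tokens at a time; a single peek spots the name item."""
    tokens = line.split()
    pairs, name = [], None
    pos = 0
    while pos < len(tokens):
        if not is_int(tokens[pos]):      # the peek: human-readable name
            name = tokens[pos]
            pos += 1
            continue
        if pos + 1 >= len(tokens):
            raise ValueError("dangling integer without a partner")
        pairs.append((int(tokens[pos]), int(tokens[pos + 1])))
        pos += 2
    return name, pairs


# Made-up row: two pairs, the redundant name, then a trailing pair.
name, pairs = parse_row("0 5 12 3 Point 0 -1")
```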

In a class based system the nature of what is parsed from spec data must  
tell an oop of another class and its instance -- most of that material is  
found only in other rows (String's class name slot is an exception, as are
SmallInt's size slot and Array's methods slot ;)

There are no other delimiters needed for fully (self-)describing the spec  
data with its own integer pointers, IMO. But if there is something simpler  
(for representation and for parsing) then I'd love to make use of it :)

> Alternatively, removing the
> human-readable text and simply putting ":" or something to delimit the
> instance data from the other data would be an improvement.

And I thought that text/word would make the spec data an interesting read
;) Well, that text/word is supposed to be there for the convenience of the
human being whose job title is debugger ;) I wouldn't want to disappoint
that already hard working person by eliminating ever more clues ;) I hope
for your understanding :)

> What led you to produce this artifact? It's very interesting.

When the original .image data file was found on the web together with Tim  
Budd's source code for interpreting it (does it in javaneese), the first  
idea that came up was to recompile the system with inserted "I'm here" and  
"now I do that" snippets.

But at that time, no tool-chain was available any longer for producing a
code/VM compatible *executable* for reading the data created by the same
tool-chain vendors' own serialization facility: they had "serialization
data version mismatch" :( And looking for dubious backup copies from sites
specialized in keeping "copied" material downloadable was ruled out
because of the commercial status of the tool-chain/VM vendor.

So, luckily the original system does and is Smalltalk and a tracer*) was  
written in it for creating a usable data file with the .image spec, and a  
new #primitive was then written for parsing the spec data from its  
non-textual binary representation. This without basing anything on any  
tool-chain/VM vendors' "version mismatch" options any longer.

The result was then transferred to my Palm iiiC and Tungsten|C to see it
in action, with additions for colors, a browser with #compile, tabbed
GUI (workspace, browser panes), text selection #doIt and #printInt, on
those small devices :)

The more recent .image is an update that makes use of the original
author's new GUI primitives, but still uses the tool-chain/VM vendors'
serialized representation (it's easy, when it works). Making spec data out
of this updated .image was not a problem with a tracer who was born in the
older .image ;)

> Regards,
>    Tony

*) this was quite an experience, max 16 literals per method (including  
selectors, but super selectors could be placed past the 16th literal), max  
16 temps (incl. *all* blocks' args), max 256-1 jump target locations (big  
#while*: loop anybody? nesting blocks a bit deeper anybody? single #if*  
around the remaining method body anybody?)

And, it was and is f-u-n :)

/Klaus



