The Mosner bit

Mats Nygren nygren at sics.se
Fri Sep 1 11:34:53 UTC 2000


Hi,

I found on the swiki a reference to Hans-Martin Mosner's extra bit for
tagging object pointers.

  http://www.heeg.de/~hmm/squeak/2tagbits/

This must have been discussed in the past. I wish to renew that
discussion as I think it is an exceptionally good idea. Perhaps H-M M
himself will give his current position on this. The following is what
I make of it.

Description:

The details can be worked out in different ways. Here is one (showing
the two least significant bits):

10 - small integer
00 - pointer
01 - special i
11 - special ii

Small integers work as before, but with one bit less. This will
probably cause some problems, but solvable ones.
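
A minimal sketch of the two-bit encoding above, written in Python purely
for illustration. All names are hypothetical (none come from the Squeak
VM), and it assumes a 32-bit word, so that "one bit less" means a signed
30-bit small integer:

```python
# Sketch of the two-bit tag scheme on a 32-bit word.
# All names are hypothetical; they are not taken from the Squeak VM.

WORD_MASK = 0xFFFFFFFF

TAG_POINTER = 0b00
TAG_SPECIAL_I = 0b01
TAG_SMALL_INT = 0b10
TAG_SPECIAL_II = 0b11

SMALL_INT_BITS = 30  # assumption: 32-bit word minus two tag bits
SMALL_INT_MIN = -(1 << (SMALL_INT_BITS - 1))
SMALL_INT_MAX = (1 << (SMALL_INT_BITS - 1)) - 1


def low_tag(word):
    """Return the two least significant bits of a word."""
    return word & 0b11


def encode_small_int(value):
    """Pack a signed 30-bit integer into a 32-bit word tagged 10."""
    if not SMALL_INT_MIN <= value <= SMALL_INT_MAX:
        raise OverflowError("value does not fit in 30 bits")
    return ((value << 2) | TAG_SMALL_INT) & WORD_MASK


def decode_small_int(word):
    """Recover the signed integer from a word tagged 10."""
    if low_tag(word) != TAG_SMALL_INT:
        raise ValueError("not a small integer")
    payload = word >> 2                        # unsigned 30-bit payload
    if payload & (1 << (SMALL_INT_BITS - 1)):  # sign bit set
        payload -= 1 << SMALL_INT_BITS
    return payload
```

Under these assumptions the small integer range is -2^29 .. 2^29 - 1,
half of what a single tag bit leaves.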

The interesting part is special i/ii. I propose that they are used as
follows:

byte3 byte2 byte1 - together form 24 bits, giving 16M values.
byte0 - bit 0 is constant = 1; the 7 bits left give 128 values, used as
tags.

So we have 128 tags. Used wisely, that opens up a lot of possibilities.
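
The byte layout above could be sketched like this (Python, illustration
only). The function names are hypothetical, and placing the tag number in
bits 1..7 of byte0 is an assumption; the layout only fixes bit 0 = 1 and
leaves the other 7 bits for the tag:

```python
# Sketch of a "special" 32-bit word: 24-bit payload in byte3..byte1,
# and byte0 holding a 7-bit tag with bit 0 fixed to 1.
# Names and the exact position of the tag bits are assumptions.

TAG_BITS = 7
PAYLOAD_BITS = 24
TAG_COUNT = 1 << TAG_BITS          # 128 tags
PAYLOAD_COUNT = 1 << PAYLOAD_BITS  # 16M values per tag


def encode_special(tag, payload):
    """Build a special word from a tag (0..127) and a 24-bit payload."""
    if not 0 <= tag < TAG_COUNT:
        raise ValueError("tag must be in 0..127")
    if not 0 <= payload < PAYLOAD_COUNT:
        raise ValueError("payload must fit in 24 bits")
    byte0 = (tag << 1) | 1         # bit 0 is constant = 1
    return (payload << 8) | byte0


def is_special(word):
    """True for both special i (01) and special ii (11)."""
    return (word & 1) == 1


def special_tag(word):
    """Extract the 7-bit tag from byte0."""
    return (word & 0xFF) >> 1


def special_payload(word):
    """Extract the 24-bit payload from byte3..byte1."""
    return (word >> 8) & (PAYLOAD_COUNT - 1)
```

With this particular bit placement, even tag numbers end in 01 (special i)
and odd tag numbers end in 11 (special ii), so a single test of bit 0
separates all specials from pointers and small integers.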

Here are some possibilities:

Characters
  ascii/lf, ascii/cr, ascii/crlf, iso-xxxx-1, utf-8, utf-16, ..,
  home-brew-1, ..
  consider for example one tag meaning the present character set
  (ascii/cr) with extra info for font, style, size, color.

Standard Classes -
  a well chosen set of essential classes, can be easily accessed
  and communicated. This should include Object, Symbol and many
  such and also the ParseNode-hierarchy and similar.
ANSI (and other well established) protocols -
  all standard interfaces can be cataloged in this form and
  easily communicated
bytecodes -
  the normal bytecode set can be treated as numeric
  code and symbol at the same time
widget family
primitive methods
special (simple) methods (projections, many others)
a nomenclature for (C-like) types
tightly packed structures
HTML tags
Prolog-like variables and other things with special "roles in the system"
other important (closed) coding systems: MIDI, VRML
many other possibilities exist

Mosner gives an example that in this version would give 12-bit
coordinates for Point.
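
To make the arithmetic concrete: 24 payload bits split evenly give two
12-bit fields. A minimal sketch in Python, illustration only. The function
names, the choice of unsigned coordinates (0..4095), and putting x in the
upper half are all assumptions, not taken from Mosner's implementation:

```python
# Sketch: two 12-bit coordinates packed into one 24-bit payload.
# Unsigned coordinates and the x-high / y-low order are assumptions.

COORD_BITS = 12
COORD_MASK = (1 << COORD_BITS) - 1  # 4095


def pack_point(x, y):
    """Pack x and y (each 0..4095) into a 24-bit payload."""
    for coord in (x, y):
        if not 0 <= coord <= COORD_MASK:
            raise ValueError("coordinate must fit in 12 bits")
    return (x << COORD_BITS) | y


def unpack_point(payload):
    """Return (x, y) from a 24-bit payload."""
    return (payload >> COORD_BITS) & COORD_MASK, payload & COORD_MASK
```

Such a Point needs no heap object at all, but anything outside the 12-bit
range would still have to fall back to an ordinary Point.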

These can be considered universal (cross-image) pointers: things that
are lifted above gc (global tenure). They are a good help in communicating
with plugins, providing a rich language independent of gc.

Some of the above I have experience with. No doubt others will find
interesting uses of the idea if it becomes available.

The above is a bit cryptic and very incomplete. Bottom line:

This is a good thing, let's reimplement it. If Hans-Martin Mosner will do it,
good; if not, I will. If it is wanted, that is. An immediate gain is for different
character sets, including large ones: with 24 bits (16M values) they are no
problem, whereas the current way of handling characters doesn't scale.

Note also that this idea will be even better on 64-bit machines that will
appear sooner or later. One would then have an immense set of interesting
values liberated from the burden of gc.
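
A rough count of what that means, in Python for illustration only. It
assumes the same one-byte tag field is kept and only the payload grows
with the word size; the function name is hypothetical:

```python
# Sketch: how many payload values each tag gets for a given word size,
# assuming byte0 stays the tag byte and all other bits are payload.

TAG_BYTE_BITS = 8


def values_per_tag(word_bits):
    """Number of distinct payload values per tag for this word size."""
    return 1 << (word_bits - TAG_BYTE_BITS)
```

Under that assumption a 32-bit word gives 2^24 values per tag and a
64-bit word gives 2^56.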

/Mats




