Creating squeak pluggable primitives - a few ideas, some confusion, looking for like minded folks

Andreas Raab Andreas.Raab at gmx.de
Thu Nov 8 02:06:36 UTC 2001


Hi Mark,

> Here's what we learned thus far
>
> 1)  Forget the numbered primitives, as described in Pope's
> paper on the same.

Correct for your kind of system. It's much better (and much easier for you)
to maintain named primitives in separate shared libraries.

> 2)  Don't do built-in plugins for the same reason.  Yeah, I
> guess you could build them in later, we might even do that.
> But not till we exit development, as there's a big difference
> between rebuilding our  2 pages of glue code in a
> dynamic library or the whole system.  Me, I'd pick door # 1
> every time.

Also correct - except that building the primitives internally allows you to
ship a VM without getting into DLL-hell, which is the reason why I am
shipping the Windows VM with all plugins built in. There's nothing to beat
external plugins during development, but they can cause serious
problems when it comes to deploying your stuff. I'd always pick external
plugins up to the point where you ship.

> 3) [Here's where it would be nice if some experts chimed in]
> In looking at implementers and senders of Interpreter and plugin methods,
> there seems to be a few standards drifting around, some of which are
> pretty obviously orphaned.

Please give a few examples of those places. I don't know exactly what
you're talking about, so it's pretty much impossible to "chime in" when
I have no idea what you're actually referring to ;-)

> 4) It seems that native ints are 31 bits ?

Depends on your interpretation of 'native' - SmallIntegers (those being
represented in immediate form) are in fact 31 bits (30 + sign bit).
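
To make the 31-bit figure concrete, here is a minimal C sketch of an immediate integer encoding. It assumes a 32-bit object pointer whose lowest bit is the SmallInteger tag; the names (oop_t, small_int_encode and friends) are hypothetical and are not the VM's own.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 32-bit object pointer ("oop") whose lowest bit is set
   for immediate integers, leaving 31 bits (30 magnitude bits + sign).
   All names in this sketch are hypothetical. */
typedef int32_t oop_t;

#define SMALL_INT_MAX ((int32_t)0x3FFFFFFF)   /*  2^30 - 1 */
#define SMALL_INT_MIN (-SMALL_INT_MAX - 1)    /* -2^30     */

/* Nonzero if v can be represented as an immediate integer. */
int fits_small_int(int32_t v) {
    return v >= SMALL_INT_MIN && v <= SMALL_INT_MAX;
}

/* Nonzero if the oop carries the immediate-integer tag bit. */
int is_small_int(oop_t oop) {
    return (oop & 1) != 0;
}

/* Shift the value up by one bit and set the tag bit. */
oop_t small_int_encode(int32_t v) {
    assert(fits_small_int(v));
    return (oop_t)(v * 2 + 1);   /* cannot overflow for in-range v */
}

/* Drop the tag bit and recover the signed value. */
int32_t small_int_decode(oop_t oop) {
    assert(is_small_int(oop));
    return (oop - 1) / 2;        /* oop - 1 is even, so this is exact */
}
```

Anything outside the range checked by fits_small_int has to be boxed as a large integer object instead, which is why a full 32-bit pointer does not fit in a SmallInteger.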

> I'm not positive, but several minor discoveries about the use of int and
> bitflags relating to returns from primitive functions meant you
> couldn't do the lazy man's integration of treating pointers
> in the client system (V3 in our case) as magic integers in
> Smalltalk.

If done correctly, you could. All you'd need to do is to use
#positive32BitValueOf: for getting the value of the object
(SmallInteger/LargePositiveInteger) and #positive32BitValueFor: for creating
the Squeak integers. In other words, a function returning a "pointer as
integer" should do something like:

	interpreterProxy push: (interpreterProxy positive32BitValueFor: pointer).

and a function accepting an integer as a pointer should do something like:

	pointer := interpreterProxy positive32BitValueOf: someIntegerOop.

Needless to say, you're always living on the edge when treating pointers
and integers interchangeably.
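
One way to make that edge a little less sharp on the C side is to check that a pointer really fits in 32 bits before handing it to the 32-bit conversion. The following is only a sketch under that assumption; pointer_to_u32 and u32_to_pointer are hypothetical helpers, not part of the interpreter proxy.

```c
#include <stdint.h>

/* Hypothetical helper: store the pointer's bits in *out and return 1
   if they fit in an unsigned 32-bit value; return 0 otherwise (as can
   happen on platforms with pointers wider than 32 bits). */
int pointer_to_u32(const void *p, uint32_t *out) {
    uintptr_t bits = (uintptr_t)p;
    if (bits > UINT32_MAX) {
        return 0;                 /* would be truncated: refuse */
    }
    *out = (uint32_t)bits;
    return 1;
}

/* Hypothetical helper: turn a previously exported 32-bit value back
   into a pointer. The caller must know the value came from
   pointer_to_u32 and that the object is still alive. */
void *u32_to_pointer(uint32_t value) {
    return (void *)(uintptr_t)value;
}
```

A primitive would fail rather than push a value when pointer_to_u32 returns 0, so a truncated pointer never reaches the Smalltalk side.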

> 5) A word I've always liked is "thunk", which I first heard
> when msft tried to explain how 16 and 32 bit code could lie
> down like the lion and the lamb.
> And we all know what happened there....

Most people on this list probably don't but I know what you're saying ;-)

> Nevertheless, it does appear that a generalized description
> of the process of interfacing Smalltalk to another system
> has a very clear thunking layer, which exists to bridge the
> differing operational natures of the interfaces between
> the systems.

Yes, it does.

> The clearest indicator of this is in the primitive call
> mechanism itself, which can be generalized as a call of
> a procedure which does not affect the stack but does change
> the execution pointer.
>
> Dig this -
> V3Object>>getName
> 	rawGetName: myInternalV3Object
>
> and this -
> V3Object>>rawGetName " :anObject"
> 	<primitive: 'rawGetName' module: 'purehell'>
> 	self error: 'If you are here you are dead'
>
>
> I believe that in many integration efforts, especially if the
> target system has a clue about objects, that Squeak will end up
> using a lot of the native capabilities of the target, i.e. in
> this case Squeak *would not* maintain a separate copy of the name.
> Way too much potential for problems...  This means
> that there will be this point where one needs to switch from
> the high level squeak object to an internal client system object.
> I would argue that this mechanism could benefit from a formalized
> generalization, as we could then construct various helpful code fragments
> to aid us in these efforts.

I agree. What you're basically after is an "object extension" to the plugin
mechanism which would allow you to call methods on "client objects" more or
less natively. The layers you're describing are exactly what the FFI is
doing dynamically - except that the static compilation solves some of the
uglier parts (like name mangling and calling conventions).

> 	2) We (sigh) use MFC from MSFT cause it was a lot
> cheaper than writing a bunch of inet oriented classes
> ourselves and because it's really hard to keep the
> damn thing out of your builds anyway.  Given this need,
> and our extensive set of compiler configuration flags
> (try using the STL without disabling function length
> warnings) we needed to be able to emit includes before *ANY*
> code was written to the file.

Try CCodeGenerator>>addHeaderFile: - this will emit a bunch of includes
before any code.

> Here are our questions of the moment-
>
> 	1)  Has anyone got notes/doco/whatever on what's
> current and what's not in the Interpreter and plugin
> support classes for passing and obtaining selectors ?
> I see code that is implemented using selectors to
> call plugin and other compiled code, such as the FFI
> interfaces, but when I have a compiled plugin method that
> needs to call another compiled method, I keep having to
> directly manipulate the stack with push params, call function,
> pop params, I can't seem to get selector calls to work reliably.

Are you trying to invoke a method on a Squeak object from the C++ layer?!
This won't work at all. Selectors (Symbols) are not C strings, and in order to
send a message from a primitive you'd first have to get your hands on the
selector itself[**]. After that you'd have to be able to call back into the
interpreter, which is not supported either.

[**] A very few selectors (like #doesNotUnderstand: or #cannotInterpret:)
are kept in a form that allows the VM to send them directly.

> 	2) When we call translateDoInlining on our plugin,
> every once in a while we get an error claiming
> 'undef objects are not indexable' (duh) at the end of the
> codegen process (after file is created).  Rerunning the
> translate does not reproduce the error, it just works on
> the second call.  The problem itself seems intermittent.

Mail out a bug report the next time you run across it - also keep in mind
that chances are that the problem was introduced by the C++ patches (I've
never seen nor heard about this bug).

> 	3) When compiling a number of our primitives, we have
> variables that have to be defined cause we are using them
> to move values to/from squeak.  If we're accomplishing this
> primarily through use of clever cCode: calls, the method compiler keeps
> asking us whether we want to delete these unused variables, or
> do we know we're referring to an uninitialized variable.
> We could really use a method to be able to explain to the code
> generator that it shouldn't worry because it's clueless, we aren't.

Here it is:

	| a b c |
	false ifTrue:[
		a := b := c := nil. a. b. c.
	].

> 	4)  The getModuleName function, automagically generated
> by the translator does not compile in C++, because the moduleName
> is defined as an int, and the function is defined as returning
> a string.

moduleName is defined as "const char *" - at least in a reasonably recent
system ;-)
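
For illustration, with that declaration the variable and the accessor agree on the type, so the pair compiles as either C or C++. This is a minimal sketch of the shape only, not the generated file: any export decoration the real generated code carries is omitted, and 'purehell' is simply the module name from your example.

```c
/* Sketch: the module name and its accessor share one type,
   so there is no int-vs-string mismatch for a C++ compiler to reject. */
static const char *moduleName = "purehell";

const char *getModuleName(void) {
    return moduleName;
}
```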

> 	5) and last and least ====  for those using squeak on mswindows,
> 	   is there an easy way to remap the ALT and CTL keys so I'm not
> 	   constantly ALT copying in word and control copying in squeak.

Go into ParagraphEditor class and change those shortcuts ;-)

Cheers,
  - Andreas
