
$Id: writingPlugins.otl,v 1.4 2003/02/10 19:48:01 ned Exp $


Extending Squeak by Writing Plugins

| by Ned Konz | |
|

| intro Setting the stage | Imagine a world in which we couldn't choose what our programming | languages could talk to; where programmers had to rely on language | vendors for access to libraries, operating system calls, or | devices. | Fortunately, those of us using open source languages generally have | an alternative. | Today's most successful languages are all capable of | being extended more or less easily to integrate with new systems | and devices. | This ease depends on a number of factors, but generally, | dynamic languages (like Smalltalk, Ruby, or Perl) can be harder to | integrate with external libraries than C, C++, or assembly | language. | Most of them provide memory management that is different than the | memory management (if any) of external libraries written in C. | Further, since the execution model is probably different, there is | usually some glue code required between the language and the | extension library. | | I recently decided that I was going to make Squeak Smalltalk work | with an open source package that is available as a static | link library. | It took me a while to learn how to do this correctly, so I thought | I'd share what I learned with you. What is Squeak | Squeak is an open-source Smalltalk | language development system that comes with a powerful development | environment, graphics frameworks, and a number of other tools. | | You can write code to do most of the things you need to do in | Squeak directly in the Smalltalk language. | But it's also possible to extend Squeak using code | that's written in C or another language. What is a primitive | These extensions are called plugins, and contain | primitives, which are named subroutines that can be called | directly from Squeak code. | | This article will explain how to make your own plugin in Squeak, | and will take you through the construction of an example plugin. | There is an appendix at the end for use as a quick | language reference. 
| | I assume that the reader has some familiarity with both Smalltalk | and C syntax, and at least some familiarity with non-blocking file | I/O and the select() runtime library function. +

Why bother writing primitives?

| Why would you choose to write a primitive rather than writing a | method in Smalltalk? faster | One reason is that the primitive will probably run faster than | Smalltalk. | Since Squeak uses a byte code interpreter, individual | instructions run more slowly than native code produced by a good | compiler would. | For some applications -- realtime streaming video or audio, | compression, crypto algorithms, and JPEG decoding, for | instance -- this gain in speed can make the difference between | an application being usable or not. external lib linkage | Another reason to write a primitive is to use the services of a | pre-existing library. | This could be anything | from native OS services (like sockets, asynchronous file support, or | serial port usage), | to extension libraries like zlib (compression) or | pcre (regular expressions), | to interfaces with other programs (like OLE, AppleScript, or X11 | servers). callbacks (of a sort) | Primitives also let you deal with callbacks from external | sources -- somewhat. | Unfortunately, the Squeak interpreter doesn't let you call | Smalltalk code from external code. | Because of this, the usual idiom is to receive the callback in a | routine written in another language, and signal a Squeak | Semaphore to let a Squeak Process continue running to handle the | condition. asynchronous I/O polling | The other important justification for Squeak primitives is to make sure | that Squeak doesn't block while waiting for I/O. | The problem is that Squeak runs in a single OS process, and has | its own multi-tasker internally. | If one Squeak Process blocks at an OS level, no other Squeak | Process can run. | To make non-blocking I/O possible, most ports of Squeak have a | provision for checking I/O events (files or sockets that have | become readable or writable, or sockets that have exceptions) and | calling back to user code in a plugin. | This code then signals a Semaphore as described above in the | discussion of callbacks. 
| Using this scheme, a Squeak Process that needs I/O service can | start the request and block on a Semaphore until the transfer is | complete, letting other Processes run. examples of existing prims | For some examples of existing primitives, you can look at the | classes in the class category VMConstruction-Plugins. | Good examples include: - the DSAPlugin, which is an example of a plugin created for the purpose of speed. It calls no external libraries. - the Mpeg3Plugin, which is an example of a plugin that accesses an external library. - the AsynchFilePlugin, which is an example of calling OS services and using asynchronous notification via semaphore signaling. +
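The callback idiom described in this section can be sketched in C. This is only an illustration under stated assumptions: `signalSemaphoreWithIndex()` is replaced here by a stand-in that records the request (in a real plugin the VM supplies it), and `CallbackState` and `onDataReady()` are hypothetical names invented for the sketch. The point is that the callback does no Smalltalk work itself; it stores its result on the C side and signals the Semaphore by index.

```c
/* Stand-in for the VM service that signals a registered Squeak Semaphore.
   Assumption: in a real plugin the VM provides this; here it only records
   the request so that the sketch can be compiled on its own. */
static int lastSignaledIndex = 0;
static int signalCount = 0;

static int signalSemaphoreWithIndex(int semaIndex)
{
    lastSignaledIndex = semaIndex;
    signalCount++;
    return 1;
}

/* Hypothetical per-request state kept on the C side, outside the object heap. */
typedef struct {
    int semaIndex;   /* index answered when the Semaphore was registered */
    int bytesReady;  /* result left for the Smalltalk side to pick up */
} CallbackState;

/* The routine an external library would call back into.
   It records the result and signals the Semaphore; the Squeak Process
   waiting on that Semaphore then handles the condition in Smalltalk. */
static void onDataReady(void *userData, int byteCount)
{
    CallbackState *state = (CallbackState *)userData;

    state->bytesReady = byteCount;
    signalSemaphoreWithIndex(state->semaIndex);
}
```

Keeping the callback this small means nothing in it touches Squeak objects, which may have moved since the request was started.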

About the Spread plugin

| I can best demonstrate how to write a plugin by showing you a | concrete example: my Spread plugin. | This is a plugin that I made to interface with an external library, | in this case the Spread library libsp. | Let me introduce you to Spread and take you through the process of | writing this plugin. what is Spread | Spread is a group communications | system that allows messaging to groups across the | network. | I want to add Spread capability to Squeak so that I can | experiment with various broadcast, collaboration, and | distributed object schemes. | A Spread system consists of one or more Spread daemons | that receive requests from Spread clients and pass messages | between themselves and between Spread clients. | A Spread client can be in as many groups as it wants, | and it can send messages to as many groups as it wants (even | ones that it doesn't belong to). plugin vs. writing net protocol | After looking at the Spread API documentation and source code, | I saw two choices for connecting Squeak to Spread. | | One was to duplicate all the client logic in Smalltalk, down | to sending packets over the network. | I could read the C or Java implementations and duplicate them | in Smalltalk. | This has the advantage of not requiring a compiled plugin, but | has the serious disadvantage of being a lot of work. | | The other choice that I saw was to hook Squeak up to the | Spread client library libsp, which is written in | C and is available as a static linker library. | Although this choice requires compilation of a plugin, it has | the advantage of being able to track new versions of Spread | easily by a simple re-compile. | But most important for me, it looks like much less work, | so this is the strategy I chose. libsp is cross-platform | libsp is written in reasonably portable C and is | compilable on all the popular desktop platforms that support the | standard Sockets API. 
| This means that my plugin potentially can be used on most of the | computers that run Squeak. simple API | The Spread API itself is quite simple. | At its core, it consists of the following functions: - SP_connect() Connect an application to a Spread daemon. - SP_disconnect() Disconnect an application from a Spread daemon. - SP_join() Add a client to a (possibly newly created) group. - SP_leave() Remove a client from a (possibly nonexistent) group. - SP_multicast() (and its variants) Send a message to all members of one or more groups. The message will be marked as having originated from the sending client. There are six different levels of service that specify different guarantees on message reception and ordering. - SP_receive() Receive the next message on a given connection. Messages are either regular messages that are explicitly sent by a client, or are membership notification messages of various kinds that are sent upon changes to group membership. SP_receive() will block until a message is received. - SP_poll() Return the number of bytes waiting on a particular connection without blocking. I could use this by itself rather than using a Semaphore, but I don't want the additional overhead of having to poll periodically. | Getting Squeak to use these functions seems straightforward | enough, except for SP_receive(). | I don't want to call a blocking function, because if I do, | none of the other Squeak Processes will have a chance to run | until the function completes. | So I'm going to have to avoid calling SP_receive() until I know | that it won't block. | This requires knowing that there are bytes to be received on | the socket that is being used by the Spread client | connection. | Luckily, one of the return values from | SP_connect() is actually a socket file | descriptor (though this isn't documented). 
but there is platform-specific asynchronous work to be done | Using the socket file descriptor to test for readiness to | receive requires Squeak to call the runtime library select() | periodically to test whether the file is ready. | This support has already been built into Squeak for the use of | Squeak's native sockets and asynchronous file support, so I need | to hook into it. | Unfortunately, there isn't yet a standard API for this | select() polling, so there will have to be a | platform-specific portion of my Spread plugin. | | To make the SpreadPlugin easy to port, I should write it so that | the Smalltalk part doesn't have to change for different | platforms. | So it looks like I'll end up with these files: - SpreadPlugin.c Platform-independent code. - SpreadPlugin.h Declarations of types and external functions needed by the SpreadPlugin.c code. - sp{Platform}Spread.c Platform-specific code needed for hooking up the functions in SpreadPlugin.c to the Spread API and the operating system specific Squeak polling mechanism. | I'm writing this first for Linux, so my platform-specific file | will be called spUnixSpread.c . developing Squeak primitives +
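The readiness test that this section relies on can be sketched with the standard `select()` call. This is a minimal, Unix-only sketch and not the Squeak polling mechanism itself: `fdIsReadable()` is a hypothetical helper name, and the zero timeout is what keeps the call from blocking.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Answer 1 if fd has data waiting to be read, 0 if it does not,
   and -1 if select() reports an error.
   A zero timeout makes select() return immediately instead of blocking. */
static int fdIsReadable(int fd)
{
    fd_set readSet;
    struct timeval noWait = { 0, 0 };
    int n;

    FD_ZERO(&readSet);
    FD_SET(fd, &readSet);
    n = select(fd + 1, &readSet, NULL, NULL, &noWait);
    if (n < 0)
        return -1;
    return FD_ISSET(fd, &readSet) ? 1 : 0;
}
```

A blocking call such as SP_receive() would only be made once a test like this answers 1 for the connection's descriptor.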

Anatomy of the plugin

| Now that I've figured out a broad strategy, what do I need to get | this plugin to work? | The required pieces are independent of the way I choose to write | the plugin code itself. | From the top down, I will have: - Squeak client code that uses the plugin. - A Squeak class that supplies the interface for the client code. - Methods in the interface class that call functions within the plugin. - Functions within the plugin that are designed to be called by Squeak. These are the primitives. - Other functions required internally by the plugin. - The Spread API itself. | It probably makes sense to define these from the top down, so that | the client interface is as clear and Squeak-friendly as possible. | So let's go over each of these pieces in order from the top down. Squeak client code | At the highest level, my requirements for using this plugin from | Squeak seem pretty simple: - The users of the plugin shouldn't have to know that they're using a plugin. - I want it to be possible for one or more Processes to be sending messages while another Process is waiting for messages. - Ideally, the different kinds of Spread messages (regular and membership) should be represented by instances of different classes, so I can use polymorphism to dispatch the messages. | There are also some unknowns and code that I don't want to write | right now: - I'm not too sure what kind of message polymorphism I need yet, so I'm going to defer that decision until I've had some experience using the plugin. In other words, I'll start with a single SpreadMessage class that looks very much like the messages I receive from SP_receive(). - I'm not sure how I'm going to handle the probable situation where different Processes want to receive messages that were sent to different groups. I'll assume for now that I'm going to build a dispatcher on top of whatever plugin interface I come up with. 
interface class that communicates with plugin | The next level below the client code is the interface class. | Since all of the operations in the Spread API either require or | return mailboxes (which identify individual connections, | and are represented by socket file descriptors), it makes sense | to have the interface class represent a connection. | I'll call it a SpreadConnection. | It will have to present the appropriate API for client code, of | course, but it will also have to hold whatever data I need to | represent the state of the connection itself for the use of the | plugin code. holds data for plugin: sem indexes, file handles, etc. | This state data includes at least the file descriptor | returned from SP_connect() and the Semaphore | used to block a single Squeak Process while waiting for a | receive. | It might also be nice to maintain whatever data pertaining to | the connection that SP_connect() returns for the | sake of client code, though I might not actually need it. | So I'll add the private group name that is returned by | SP_connect(). | Maybe later I'll also save the name and/or port of the Spread | daemon for error reporting, but not now. Spread API (Smalltalk) | At first, I'll make the interface of the | SpreadConnection class mirror the Spread API. | This will make debugging easier, but may not be appropriate | for final use. | As I discover more about the needs of my programs that use | this plugin, I can add to or change the interface. | | All the Spread API calls return a numeric error code of some | sort; some of the calls also pass back a byte count in the | error code. | I'm going to return the same error code from my Smalltalk | API for the time being, because it makes testing easier. 
| | So my initial interface will be: connectTo:privateName:wantGroupMembershipMessages: - connectTo: daemonName privateName: privateNameOrNil wantGroupMembershipMessages: wantsGroupMembershipMessages Connect to the Spread daemon named by daemonName, with a (daemon-)unique privateName (if nil, one is assigned). wantsGroupMembershipMessages indicates whether or not group membership messages will be sent to me. Answer the Spread error code. disconnect - disconnect Disconnect from Spread. Wake up all Processes that were waiting for data to appear on this connection. Answer the Spread error code. join: - join: aGroup Join the group whose name is aGroup. Answer the Spread error code. leave: - leave: aGroup Leave the group whose name is aGroup. Answer the Spread error code. multicast:messageType:serviceType:groups: - multicast: mess messageType: messType serviceType: serviceType groups: groups Send the message mess, with user message type messType (16 bits), and Spread service type serviceType, to the groups whose names are listed in groups. Answer the Spread error code. poll - poll Answer the number of bytes waiting to be read, without blocking. receive - receive Block the current Process if necessary until data is ready, then answer a single message. Dealing with shutdown/startup | After thinking about it a while, I realize that these | connections will have to be able to withstand an image save | and startup. | However, if part of their state is a file descriptor, that | descriptor will certainly not be valid when the image comes | back up. | I could either disconnect all the connections on a shutdown | using a class shutDown method, or I could just | keep track of whether they're valid somehow. | I think I'll do both, because I also need to be able to close | out connections when they get garbage collected, if someone | forgets to disconnect them. | And I need to tell if a connection that has been closed is | safe to use. 
| So I'm going to add a way to query validity from Smalltalk | (I get to figure out how to do this later): isValid - isValid Answer whether I am a valid connection. If I am not, all other Spread API calls will cause an Error. Squeak code from interface class calls interface methods, usually named primXXX:arg: | Now that I've mapped out the top SpreadConnection | layer, I have to actually call from Squeak to the primitives in | the plugin. | I'll call these interface methods here. | Squeak has a special syntax for these calls. | They look like this: |
		|primIsValid: conn
		|	<primitive: 'primitiveIsValid' module:'SpreadPlugin'>
		|	^ false
		|
these have <primitive:module:> call followed by st code | The first line of these calls looks like the first line of | a normal Smalltalk method, with the name of the method and | arguments, if any. | This is followed by the special syntax
| | <primitive: 'primitiveIsValid' module:'SpreadPlugin'> |
| which calls a named primitive (in this case, | primitiveIsValid()) | in a named plugin (here SpreadPlugin). st code is for when the prim fails | After the primitive call is Smalltalk code. This code is | only run if the primitive call fails for some reason. | Reasons for a primitive failing include not having the proper | plugin, not being able to load it because of library | dependencies, or sending the wrong kind of parameters. | Since the SpreadPlugin requires the use of the | primitives and can't be effectively replaced by Smalltalk | code, all of the plugin's primitive calls, except for | primIsValid: and primConnect:..., | will raise an exception if they fail. Aside: using Smalltalk to translate args/retval for the primitive | Another thing that the interface code can do easily is to | translate and prepare argument data for the primitive, and to | modify the output data from the primitive. | I've found that it's often easier to do this kind of translation in | Smalltalk than down in the primitive code. | For instance, C often wants NUL-terminated strings. | But Squeak's strings have a count and no NUL. | My first draft of a couple of these interface methods passed the primitive | the string and its count, so that I could avoid counting the string | in the primitive. | Some translation that did survive my optimization is the packing | and unpacking of group names. | SP_join() has a list of groups as an input, and | SP_receive() | returns a list of groups. | The C interface to the Spread plugin expects these names to be in | fixed-size arrays, 32 bytes per group name. | However, it's much more natural for Squeak to pass around | collections of group names. | Rather than requiring a specific data type to be passed (say, | requiring an Array of Strings), | I allocate and stuff a ByteArray with the | characters going in to the SP_join() call, | and I allocate a buffer of a nominal size to | return the groups from the SP_receive() call. 
| Luckily, Spread will return an error code telling me if my | buffers are too short, and will also indicate how long they have to | be. | So my Smalltalk code allocates nominally sized buffers, calls the | primitive, and then reallocates buffers and calls the primitive | again until the buffers are big enough. Spread interface methods | Now I'm ready to specify how the interface methods look. | These look very similar to the higher level interface; the | differences are mostly because some of these API calls have | multiple return values that have to be returned in Smalltalk | objects. | The interface section of the SpreadConnection | class looks like this: - primConnect: daemonName privateName: privName groupMembership: wantsMessages privateGroupBuf: groupNameBuf semaIndex: semaIndex An additional input semaIndex was added here to pass down the index of the semaphore to signal when bytes are ready to read. The groupNameBuf is another return value from the SP_connect() call. - primDisconnect - primIsValid - primJoin: aGroup - primLeave: aGroup - primMulticast: message messageType: messageType serviceType: serviceType groups: groups numberOfGroups: numberOfGroups Here the collection of groups passed in to the higher level routine has been copied to an array of fixed-length character arrays, as expected by the C interface to SP_multicast(). - primPoll - primReceive: message dropRecv: drop In this primitive, a message object gets all the return values from SP_receive() except for the error code. These interface methods link to named primitives in plugins, either internal or external | The linkage between Squeak and the named primitives in the | plugin is managed by Squeak, which will load the external | library (or hook up the internal library) when needed and | arrange for the method calls to call the primitives. 
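The fixed-size group name layout mentioned above (32 bytes per name) can be illustrated in C. The article does this packing in Smalltalk; the sketch below only shows the byte layout. `GROUP_NAME_SIZE` and `packGroupNames()` are hypothetical names, and padding each slot with NUL bytes is an assumption of this sketch rather than something the text specifies.

```c
#include <stddef.h>
#include <string.h>

#define GROUP_NAME_SIZE 32   /* bytes per name slot, as described in the text */

/* Copy `count` names into `buf` as consecutive GROUP_NAME_SIZE-byte slots.
   Assumption: each slot is NUL-padded, so a name must leave room for at
   least one terminating NUL.
   Answer the number of names packed, or -1 if the buffer is too small
   or a name does not fit in its slot. */
static int packGroupNames(const char *const names[], int count,
                          char *buf, size_t bufSize)
{
    int i;

    if (count < 0 || (size_t)count * GROUP_NAME_SIZE > bufSize)
        return -1;
    for (i = 0; i < count; i++) {
        size_t len = strlen(names[i]);
        char *slot = buf + (size_t)i * GROUP_NAME_SIZE;

        if (len >= GROUP_NAME_SIZE)
            return -1;
        memset(slot, 0, GROUP_NAME_SIZE);   /* NUL-pad the whole slot */
        memcpy(slot, names[i], len);
    }
    return count;
}
```

Unpacking is the reverse walk: step through the buffer one slot at a time and read each NUL-terminated name.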
args and return on stack | Primitives don't take arguments or return values like, say, C | functions do; | rather, they get inputs from and leave their output on | the Squeak object stack. | Further, the contents of the Squeak stack aren't C objects, | so translation is usually needed before your C code can do | anything with the arguments or receiver of the message. | You also have to translate the C return value back into a | Smalltalk object. could use Slang | I could write my primitives directly in C, but decided | instead to have the C generated by a compiler for a language | called Slang. | This compiler comes standard with Squeak. | Its syntax is that of Smalltalk, but it generates plain old | (non-object) C code. | So method calls become regular function calls, plugin instance | variables become plugin globals, and other Smalltalk | expressions become C expressions. Primitives are usually written in C though could be written in any language | All of the Squeak plugins that I know about have been compiled | from C. | And most (but not all) of this C has been generated from Slang code in | the image. | There's nothing magical about either C or Slang; | since all that is needed is to have functions in a library that | have C calling convention, I could write my primitives in | (non-object) C++, or assembly language, or Delphi, or any other | language that was compatible. have no params other than on stack (except setInterpreter()) | If I were writing in C, I'd declare the primitives as | taking no arguments and returning an (unused) int or | void. | However, I'm using Slang to generate my C, so the Slang compiler will | automatically generate code to deal with the Squeak stack. external modules in DLLs and loaded if/when needed internal modules linked in statically Spread Primitives | Because I'm using Slang, the primitives are declared just | like the interface methods. | However, their exported names are the names given in the | interface methods (i.e. 
primitiveReceive rather | than primitiveReceive:dropRecv:). | They declare these exported names in the Slang code. | | The declaration of the primitives looks familiar: - primitiveConnect:privateName:groupMembership:privateGroupBuf:semaIndex: - primitiveDisconnect - primitiveIsValid - primitiveJoin: - primitiveLeave: - primitiveMulticast:messageType:serviceType:groups:numberOfGroups: - primitivePoll - primitiveReceive:dropRecv: | There are also three methods that will be called in | plugins that define them; their names are fixed by the runtime system. | Since I need both startup and shutdown processing, I define | all of them: - initialiseModule This routine is optional; if a plugin exports this, it will be called immediately after loading the plugin. In my code, I use it to initialize a global data structure. - shutdownModule This routine is optional; if a plugin exports this, it will be called at shutdown time. It will also be called if you manually unload your module, like this:
			  Smalltalk unloadModule: 'SpreadPlugin'.
			  
Manually unloading modules can be handy during module development, as it lets you re-compile an external plugin and test it without leaving Squeak. - setInterpreter() This routine is required, and will be generated automatically by Slang if you use it. If you're writing your primitives directly in C, you have to provide this function. It takes a single int argument, which is actually a pointer to the interpreterProxy. Plugin code saves this in a file-scoped global called interpreterProxy. | Then there's also an internal routine that my plugin code | will use to convert a SpreadConnection Smalltalk object into | a C struct SpreadConnection: - mboxPointerFrom: +
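For readers writing primitives directly in C, the role of setInterpreter() can be sketched as follows. This is a mock, not the VM's real declarations: `MockProxy`, its two entries, `EXPECTED_MAJOR`, and the version check are all assumptions made for illustration, and the sketch takes a typed pointer where the text describes an int that is really a pointer. What it shows is the shape: the plugin saves the proxy in a file-scoped variable, and primitives that take no C arguments work through it.

```c
/* Stand-in for the VM's proxy structure. The real structure has many more
   entries; only two hypothetical ones are modeled here. */
typedef struct MockProxy {
    int (*majorVersion)(void);
    int (*primitiveFail)(void);
} MockProxy;

#define EXPECTED_MAJOR 1   /* hypothetical version number for this sketch */

/* File-scoped pointer, saved once and used by every primitive. */
static MockProxy *interpreterProxy = 0;

/* Mock VM services so the sketch is self-contained. */
static int failureFlag = 0;
static int mockMajorVersion(void)  { return EXPECTED_MAJOR; }
static int mockPrimitiveFail(void) { failureFlag = 1; return 0; }

/* Called by the VM before any primitive runs. Answers nonzero if the
   proxy looks compatible (an assumed check, for illustration only). */
int setInterpreter(MockProxy *proxy)
{
    interpreterProxy = proxy;
    return interpreterProxy->majorVersion() == EXPECTED_MAJOR;
}

/* A primitive takes no C arguments; everything goes through the proxy. */
int primitiveAlwaysFails(void)
{
    return interpreterProxy->primitiveFail();
}
```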

Inside a primitive

| Now that I've mapped out the general structure of my plugin, let me | talk about the primitives themselves. Primitives then call API routines, OS, etc. | The primitives have access to whatever module-wide variables | they declare (which appear as plugin instance variables using | Slang), as well as a global pointer to something called the | interpreterProxy. | This is a C structure that has pointers to many of the | Interpreter's methods, for use by the plugin code. | The pointer is set up by a call to a function within the plugin | called setInterpreter() that is called before any | primitives. | Within a primitive, the "instance variable" | interpreterProxy refers to this C structure. | It is through the interpreterProxy that all stack | access, memory allocation and most object conversion is done. may also call platform-specific code | Since I have to call platform-specific code (in this case, for | the asynchronous notification from select()), the | place to do it is not in the primitive method itself, but in a | separate function called by the primitive. | Remember, I have separated all the platform-specific bits into a separate | file called sqUnixSpread.c. this gives cross-platform capabilities separate platform-specific into separate files keep prims generic receiver and args are pushed on stack in order | So what does the primitive see when it's run? | When a primitive is called, the receiver (in my case a | SpreadConnection object) and any arguments to | the method are pushed on the stack in left-to-right | order. | So slot 0 (the top of stack) is the last argument in a call | with N parameters, | N-1 is the first argument, | and N is the receiver. slot 0 is last arg slot n-1 is first arg (of n) slot n is receiver stack is comprised of oops | The stack holds 32-bit numbers which are called | Oops. | There are two kinds of Oops: pointers to objects, and | SmallIntegers. 
| Since the SmallInteger objects are common, | small, and immutable, not using a pointer saves lots of | space and time. | To tell the difference between a SmallInteger | and a pointer, Squeak uses the low bit of an Oop to mark | it as a SmallInteger. | This leaves SmallIntegers with a 31-bit | range. they call various methods in the interpreter to access Squeak stack they leave return value on stack (unstacking args) or leave stack intact in case of error | Inside the primitives, I convert the arguments into primitive (C) | types, do whatever other preparation is required (like | allocating temporary buffers) and then call the Spread API | functions. | I then convert the return values from these functions (which are numeric error | codes) into Squeak SmallInteger objects, drop all | the arguments and receiver from the stack, and push the | converted error code. | If there is an error at the primitive level that keeps me from | even calling the Spread function, then the primitive leaves the | stack intact (for the Smalltalk cleanup code), and signals its failure | by calling interpreterProxy->primitiveFail() or | interpreterProxy->success(aBoolean) with a false argument. | (interpreterProxy->failed() only reports whether the primitive has already failed.) Can keep pointers during prim | Ordinarily, since Squeak is single-threaded from an OS point of | view, the rest of Squeak doesn't run while a primitive is | being executed. | Because of this, I'm free to use pointers to Squeak | ByteArrays, Strings, | SmallIntegers, etc. in my primitive. but no longer | However, because Squeak moves objects around during garbage | collection, I can't hang on to a pointer to a Squeak object | after the primitive ends. | So buffers and other structures that have to be accessed | asynchronously by C code (in my case, the socket file descriptor | and semaphore index) must be allocated from C code. 
| In my primitiveConnect call, I only need to hang on | to the file descriptor, which is an integer. 
| When I pass this to the asynchronous notification code, that | code copies it into memory it allocated. GC is not running while in prim | I could, if I needed to, also call back to the | interpreterProxy to allocate Smalltalk objects. | This is one alternative for making variant return values from a | primitive. | However, when Squeak is asked to allocate an object, it may decide | to do a garbage collection. | So any pointers to Squeak objects will not necessarily be | valid after allocating a Squeak object. unless you allocate memory so your accesses are safe during prim but if you want to hold on to buffers, etc. you have to copy them. | To deal with the moving objects problem, you can either copy | data into locally allocated memory, or you can take advantage | of a technique for keeping track of objects across a garbage | collection. or use pushRemappableOop: etc. | The interpreter supports a separate stack of Oops that | the garbage collector knows about. | Within a primitive, you can push Squeak Oops onto this stack, | and be guaranteed that if the GC moves those objects, | it will update the Oops on this stack to match. | When you pop an Oop back off the stack, you get the object's | current location, so you must use the popped value rather than | the one you pushed. | So if you're doing allocation of Squeak objects and require | access to the internals of other Squeak objects later, you | should use this technique. | This is how it would be written in Slang (the C version | merely has a different syntax): |
			| interpreterProxy pushRemappableOop: bufferOop.
			| "allocate a Squeak object; this may trigger a garbage collection"
			| returnBufferOop := interpreterProxy instantiateClass: interpreterProxy classByteArray indexableSize: 20.
			| "pop to get bufferOop's current (possibly moved) location"
			| bufferOop := interpreterProxy popRemappableOop.
			| "now do something safely with bufferOop"
			| 
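The stack layout and the SmallInteger tagging described in this section can be modeled in a few lines of C. This is a toy model, not VM code: `MockStack`, `push()`, and the helper names are hypothetical, and `stackValue()` here is a mock that indexes down from the top of stack. It assumes the 32-bit Oops described above, with the low bit set for a SmallInteger and the remaining 31 bits holding the value.

```c
#include <stdint.h>

typedef uint32_t Oop;   /* 32-bit Oops, as described in the text */

/* The low bit marks a SmallInteger; the other 31 bits hold the value. */
static int isSmallInteger(Oop oop) { return (oop & 1u) != 0; }

static Oop smallIntegerOop(int32_t value)
{
    return ((uint32_t)value << 1) | 1u;
}

static int32_t smallIntegerValue(Oop oop)
{
    if (oop & 0x80000000u)                     /* sign bit set: negative */
        return -(int32_t)((~oop >> 1) + 1u);
    return (int32_t)(oop >> 1);
}

/* Toy stack: the receiver is pushed first, then the arguments left to right. */
typedef struct {
    Oop slots[8];
    int depth;
} MockStack;

static void push(MockStack *s, Oop oop) { s->slots[s->depth++] = oop; }

/* Offset 0 is the top of stack (the last argument); with N arguments,
   offset N-1 is the first argument and offset N is the receiver. */
static Oop stackValue(const MockStack *s, int offset)
{
    return s->slots[s->depth - 1 - offset];
}
```

With two arguments, the receiver ends up at offset 2, the first argument at offset 1, and the last argument at offset 0, matching the slot numbering given earlier.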
+

How to make a plugin

data structures needed by plugins
	no asynchronous notification/operation needed
		then all data can be in ST object.
	asynchronous notification needed
		can't use ST object after prim call since it could move
		solution: separately allocated state object
			includes semaphore index, other data (fd, etc.)
			malloc state object
			put state* into ST object
		ex: asynchronous files
			saved fd registered with notifier
storing external data
	WordArray/ByteArray objects
	WordArray/ByteArray in objects
using semaphores
	semaIndex := Smalltalk registerExternalObject: semaphore
	can pass semaIndex (a SmallInteger) to C code (after converting to C int)
		which can call signalSemaphoreWithIndex(semaIndex)
	should be matched with unregisterExternalObject: semaphore
choosing internal or external module construction
	can change easily by setting preproc flags
	internal is easier to debug
	external is easier to add to existing Squeak installations
slang-c translation
	generating sources using VMMaker
	generating sources programmatically
		MyPlugin translateInDirectory: (FileDirectory on: '/tmp') doInlining: false
compiling & linking
	platform specific
	unix example
installing
	unix example
running Squeak with new prims available
	unix example
debugging
	print statements
		printf, fprintf work fine.
	using ddd/gdb with squeak
		detecting load of dynamic module (external)
			XXX howto? dlopen()?
		breakpointing on static method (internal)
			BKPT() defn/usage
	XCFLAGS settings for debugging
		may be different on different platforms
+
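The "separately allocated state object" from the notes above might look something like this in C. It is a sketch under assumptions, not the plugin's actual code: `ConnState` and the three functions are hypothetical names, and a plain byte buffer stands in for the contents of a Squeak ByteArray. The state lives in malloc'd memory that the garbage collector never moves; only its address is copied into the Smalltalk-side bytes.

```c
#include <stdlib.h>
#include <string.h>

/* State that must stay put for asynchronous C code to use. */
typedef struct {
    int fd;          /* file descriptor registered with the notifier */
    int semaIndex;   /* index of the registered Squeak Semaphore */
} ConnState;

/* Allocate the state outside the object heap and copy its address into
   handleBytes (a stand-in for the bytes of a Squeak ByteArray).
   Answer 1 on success, 0 if the handle is too small or malloc fails. */
static int stateCreate(unsigned char *handleBytes, size_t handleSize,
                       int fd, int semaIndex)
{
    ConnState *state;

    if (handleSize < sizeof state)
        return 0;
    state = malloc(sizeof *state);
    if (state == NULL)
        return 0;
    state->fd = fd;
    state->semaIndex = semaIndex;
    memcpy(handleBytes, &state, sizeof state);
    return 1;
}

/* Recover the state pointer from the handle bytes on a later primitive call. */
static ConnState *stateFrom(const unsigned char *handleBytes)
{
    ConnState *state;

    memcpy(&state, handleBytes, sizeof state);
    return state;
}

/* Free the state and clear the handle so it is not reused by accident. */
static void stateDestroy(unsigned char *handleBytes)
{
    free(stateFrom(handleBytes));
    memset(handleBytes, 0, sizeof(ConnState *));
}
```

Because only the pointer is stored in the Smalltalk object, the object itself can move during garbage collection without invalidating anything the asynchronous code holds.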

Using Slang to write your primitives

Slang is a Smalltalk subset that can be compiled to normal C code
	advantages
		source lives in ST, can be dist with CS
		can run in simulation
		TIP does type coercion automatically
	disadvantages
		yet another language to learn
		requires separate translation step
use subclass of InterpreterPlugin
	two subclasses of InterpreterPlugin to use as parent
		InterpreterPlugin
		TestInterpreterPlugin
			more recent, subclass of InterpreterPlugin, provides type coercion.
			written by Andrew Greenberg.
class-side setup
	moduleName
		by default derived from the class name
	simulatorClass
		by default it is the plugin class.
	declareCVarsIn:
		addHeaderFile:
		use var:type:, var:declareC:, or var:type:array: to declare module-globals
	declareHeaderFilesIn:
		just adds ModuleName.h
	hasHeaderFile
		default is false
	requiresCrossPlatformFiles
		default is hasHeaderFile
	requiresPlatformFiles
		default is false
	shouldBeTranslated
writing your primitive in Slang
	many methods in Object cat "translation support"
	also see appendix I
	declaring variables
		var:type:
		var:declareC:
		var:type:array:
			declare and initialize an array variable of a given type.
	declaring the name and parameters of your primitive
		rcvr := primitive: 'primName'
			all params are oops
			rcvr is assumed to be Oop
		rcvr := primitive: 'primName' parameters: #(Type1 Type2)
			must specify all params
			rcvr is assumed to be Oop
		rcvr := primitive: 'primName' parameters: #(Type1 Type2) receiver: #Rcvr
			will create local of Rcvr* and set to firstIndexableField of receiver.
			handy for WordArray or ByteArray subclasses
+

Putting it all together: all the code for one primitive

id=fullExample
| Let's look at all the code that I need to connect a Squeak method
| to a primitive.
| For this example, I chose
| SpreadConnection>>connectTo:privateName:wantGroupMembershipMessages:,
| which is one of the more complicated routines in this plugin.
SpreadConnection>>connectTo: daemonName privateName: privateNameOrNil wantGroupMembershipMessages: wantsGroupMembershipMessages
| At the highest level, I have the Smalltalk method
| SpreadConnection>>connectTo: daemonName
| privateName: privateNameOrNil
| wantGroupMembershipMessages:
| wantsGroupMembershipMessages.
|
| It's responsible for:
- Initializing instance variables. In this case, I have to initialize:
	- wantsGroupMessages
	- privateName
	- semaphore: this gets a new Semaphore that is then registered with the system (via my class-side register: method).
	- mbox: this is initialized to a ByteArray of 12 bytes. mbox is used to communicate the connection state (file descriptor, semaphore index, session ID) to the C code.
- Registering the semaphore. There is a system method called Smalltalk>>registerExternalObject: that takes a Semaphore and saves it in a special table of registered semaphores. It returns a small positive integer that is then used by lower-level code to signal the Semaphore.
- Translating data types as needed for the next layer. Here I translate privateName into a ByteArray, and wantsGroupMembershipMessages into a SmallInteger.
|
connectTo: daemonName privateName: privateNameOrNil wantGroupMembershipMessages: wantsGroupMembershipMessages 
	|	 "Connect to the Spread daemon named by daemonName, 
	|	 with a (daemon-)unique privateName (if nil, one is assigned). 
	|	 wantsGroupMembershipMessages indicates whether or not group 
	|	 membership messages will be sent to me. 
	|	 Answer the Spread error code."
	|	 | semaIndex groupNameBuf retval |
	|	 semaphore := Semaphore new.
	|	 semaIndex := self class register: self.
	|	 semaIndex <= 0
	|		 ifTrue: [^ self error: 'can''t register semaphore'].
	|	 wantsGroupMessages := wantsGroupMembershipMessages.
	|	 privateName := privateNameOrNil.
	|	 groupNameBuf := String new: SpreadMessage maxGroupName.
	|	 mbox := ByteArray new: self class connectionStructSize.
	|	 retval := self
	|				 primConnect: daemonName asByteArray
	|				 privateName: (privateName
	|						 ifNil: ['']) asByteArray
	|				 groupMembership: (wantsGroupMembershipMessages
	|						 ifTrue: [1]
	|						 ifFalse: [0])
	|				 privateGroupBuf: groupNameBuf
	|				 semaIndex: semaIndex.
	|	 retval ~= SpreadMessage acceptSession
	|		 ifTrue: [self class unregister: self.
	|			 semaphore := nil.
	|			 mbox := nil.
	|			 ^ retval].
	|	 privateGroupName := groupNameBuf copyUpTo: (Character value: 0).
	|	 ^ retval
	| 
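| The 12-byte mbox buffer described above can be pictured as a small C
| struct. The following is only a sketch: the field names match the ones
| used by the C code later in this article, but the 32-bit field types,
| the CONNECTION_STRUCT_SIZE constant, and the connectionStore() and
| connectionLoad() helpers are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: field names follow the article's C code, but the
   32-bit field types are an assumption chosen so that the struct
   occupies the 12 bytes allocated on the Smalltalk side. */
typedef struct SpreadConnection {
	int32_t mbox;       /* file descriptor for the Spread connection */
	int32_t semaIndex;  /* index answered by registerExternalObject: */
	int32_t sessionID;  /* session identifier */
} SpreadConnection;

/* Size the Smalltalk side would have to allocate for its ByteArray. */
enum { CONNECTION_STRUCT_SIZE = sizeof(SpreadConnection) };

/* Copy connection state into the raw bytes of a ByteArray body. */
static void connectionStore(unsigned char *bytes, const SpreadConnection *s)
{
	memcpy(bytes, s, sizeof *s);
}

/* Read connection state back out of the raw bytes. */
static SpreadConnection connectionLoad(const unsigned char *bytes)
{
	SpreadConnection s;

	memcpy(&s, bytes, sizeof s);
	return s;
}
```

| Copying with memcpy() avoids any assumption about how the bytes are
| aligned; a plugin could equally cast the first indexable field of the
| ByteArray to a struct pointer if the alignment is known to be suitable.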
SpreadConnection>>primConnect: daemonName privateName: privName groupMembership: wantsMessages privateGroupBuf: groupNameBuf semaIndex: semaIndex
| At the next level down, I have the interface method that calls the
| primitive.
| There's nothing too interesting here, except that if the primitive
| fails I return an error code rather than raising an exception.
| Since this is the first Spread primitive call that will be made,
| this allows testing for the presence of the Spread plugin without
| throwing exceptions.
|
	| primConnect: daemonName privateName: privName groupMembership: wantsMessages
	| 			privateGroupBuf: groupNameBuf semaIndex: semaIndex 
	| 	<primitive: 'primitiveConnect' module:'SpreadPlugin'> "primitiveExternalCall" 
	| 	^ SpreadMessage primitiveConnectFailure 
	| 
SpreadPlugin>>primitiveConnect: daemonName privateName: privateName groupMembership: wantsGroupMsgs privateGroupBuf: groupBuf semaIndex: semaIndex
| Going down to the next level, I have the primitive itself (in the
| SpreadPlugin class).
| This is written in Slang.
| Since I'm using the TestInterpreterPlugin, I can
| declare the types of the method arguments using the
| primitive:parameters: method.
| This way I don't have to remember what slot numbers everything is
| in on the stack.
| The return value of this method is the receiver Oop (called
| connection here), which I pass to the (inline) routine
| mboxPointerFrom: to get the pointer to the struct
| SpreadConnection that the C code will need.
|
| Then I get the lengths of the two strings, and re-check the
| interpreterProxy success flag once more just in case, then call the
| C code function sqSpreadConnect().
| Since this function returns a C int, I have to convert
| it into a SmallInteger Oop.
|
| Note the trick here, which I learned from Andreas Raab's JPEG
| plugin code, of using cCode:inSmalltalk: to fool the
| compiler into thinking I've used the temporary variables.
| Without this, the compiler will complain up to twice for each
| variable when I accept a method change.
| When I compile it, though, the statement generates no C code at
| all (actually it generates a bare semicolon, but I can live
| with that).
|
	| primitiveConnect: daemonName
	| 	privateName: privateName
	| 	groupMembership: wantsGroupMsgs
	| 	privateGroupBuf: groupBuf
	| 	semaIndex: semaIndex 
	| | daemonNameSize privateNameSize s connection |
	|
	| self var: #s type: 'SpreadConnection *'.
	|
	| "The following keeps the compiler from complaining."
	| self cCode: '' inSmalltalk: [ s := nil. privateNameSize := nil. daemonNameSize := nil.
	| 	privateNameSize. daemonNameSize. s. ].
	|
	| connection := self
	| 	primitive: 'primitiveConnect'
	| 	parameters: #(#String #String #SmallInteger #ByteArray #SmallInteger ).
	|
	| s := self mboxPointerFrom: connection.
	|
	| daemonNameSize := interpreterProxy sizeOfSTArrayFromCPrimitive: daemonName.
	|
	| privateNameSize := interpreterProxy sizeOfSTArrayFromCPrimitive: privateName.
	|
	| interpreterProxy failed ifTrue: [ ^nil ].
	|
	| ^(self cCode: 'sqSpreadConnect(s, daemonName, daemonNameSize, privateName,
	| 	privateNameSize, wantsGroupMsgs, groupBuf, semaIndex)') asSmallIntegerObj
	| 
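| The conversion from a C int to a SmallInteger Oop is a matter of
| tagging. The sketch below assumes the classic 32-bit Squeak encoding
| (value shifted left one bit, low bit set as the tag) and a 31-bit
| SmallInteger range; the sqInt typedef and these stand-alone functions
| are stand-ins written for illustration, not the VM's own definitions.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t sqInt;  /* stand-in for the VM's oop-sized integer */

/* Assumed encoding: value shifted left one bit with the tag bit set. */
static sqInt integerObjectOf(sqInt value)
{
	return value * 2 + 1;
}

/* A tagged SmallInteger has its low bit set. */
static int isIntegerObject(sqInt oop)
{
	return (oop & 1) != 0;
}

/* Undo the tagging. The oop is odd, so oop - 1 is even and the
   division is exact, including for negative values. */
static sqInt integerValueOf(sqInt oop)
{
	return (oop - 1) / 2;
}

/* Range check for a 31-bit signed SmallInteger (assumed 32-bit image). */
static int isIntegerValue(sqInt value)
{
	return value >= -1073741824 && value <= 1073741823;
}
```

| A return code that falls outside the SmallInteger range would need a
| different conversion, which is one reason to check isIntegerValue:
| when the C function's result is not known to be small.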
Translated primitiveConnect()
| Here's the C result of the Slang-to-C translation, which is what
| I'd have to write in C by hand if I weren't using Slang.
| This was translated by VMMaker into the file
| src/plugins/SpreadPlugin/SpreadPlugin.c.
| Note that if the interpreterProxy fails, the stack is left as it
| was; it's only on successful exit that the receiver and parameters
| are popped and the return value is pushed on the stack.
|
	| EXPORT(int) primitiveConnect(void) {
	| 	int privateNameSize;
	| 	int daemonNameSize;
	| 	SpreadConnection * s;
	| 	int connection;
	| 	char *daemonName;
	| 	char *privateName;
	| 	int wantsGroupMsgs;
	| 	char *groupBuf;
	| 	int semaIndex;
	| 	int _return_value;
	| 
	| 	interpreterProxy->success(interpreterProxy->isBytes(interpreterProxy->stackValue(4)));
	| 	daemonName = ((char *) (interpreterProxy->firstIndexableField(interpreterProxy->stackValue(4))));
	| 	interpreterProxy->success(interpreterProxy->isBytes(interpreterProxy->stackValue(3)));
	| 	privateName = ((char *) (interpreterProxy->firstIndexableField(interpreterProxy->stackValue(3))));
	| 	wantsGroupMsgs = interpreterProxy->stackIntegerValue(2);
	| 	interpreterProxy->success(interpreterProxy->isBytes(interpreterProxy->stackValue(1)));
	| 	groupBuf = ((char *) (interpreterProxy->firstIndexableField(interpreterProxy->stackValue(1))));
	| 	semaIndex = interpreterProxy->stackIntegerValue(0);
	| 	;
	| 	connection = interpreterProxy->stackValue(5);
	| 	if (interpreterProxy->failed()) {
	| 		return null;
	| 	}
	| 	s = interpreterProxy->fetchArrayofObject(0, connection);
	| 	daemonNameSize = interpreterProxy->sizeOfSTArrayFromCPrimitive(daemonName);
	| 	privateNameSize = interpreterProxy->sizeOfSTArrayFromCPrimitive(privateName);
	| 	if (interpreterProxy->failed()) {
	| 		return null;
	| 	}
	| 	_return_value = interpreterProxy->integerObjectOf((sqSpreadConnect(s, daemonName, 
	| 		daemonNameSize, privateName, privateNameSize, wantsGroupMsgs, groupBuf, semaIndex)));
	| 	if (interpreterProxy->failed()) {
	| 		return null;
	| 	}
	| 	interpreterProxy->popthenPush(6, _return_value);
	| 	return null;
	| }
	| 
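| The stack offsets in the translated code follow from the order in
| which the receiver and arguments are pushed. The toy model below is a
| sketch written for this explanation (ToyStack and its functions are
| hypothetical, not the VM's data structures); it shows why the receiver
| of a five-argument primitive is at stackValue(5) and why six items are
| replaced by the single return value.

```c
#include <assert.h>

enum { STACK_MAX = 16 };

/* Toy model of the interpreter stack; not the VM's real structure. */
typedef struct ToyStack {
	int slots[STACK_MAX];
	int depth;  /* number of values currently on the stack */
} ToyStack;

static void push(ToyStack *st, int oop)
{
	assert(st->depth < STACK_MAX);
	st->slots[st->depth++] = oop;
}

/* stackValue(0) is the top of the stack (the last argument pushed). */
static int stackValue(const ToyStack *st, int offset)
{
	assert(offset >= 0 && offset < st->depth);
	return st->slots[st->depth - 1 - offset];
}

/* Remove nItems values, then push the result in their place. */
static void popThenPush(ToyStack *st, int nItems, int oop)
{
	assert(nItems >= 0 && nItems <= st->depth);
	st->depth -= nItems;
	push(st, oop);
}
```

| With the receiver pushed first and the five arguments after it, the
| last argument (semaIndex) is at offset 0 and the receiver at offset 5.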
Header declaration
| The generated SpreadPlugin.c code includes the cross-platform header file
| platforms/Cross/plugins/SpreadPlugin/SpreadPlugin.h.
| I wrote this by hand along with my sqUnixSpread.c interface code.
| In this header, my C connect routine is declared as:
|
	|/* returns error code, fills in s and groupBuf */
	|int sqSpreadConnect(SpreadConnection *s,        /*OUT*/
	|		const char *daemonName,
	|		int daemonNameSize,
	|		const char *privateName,
	|		int privateNameSize,
	|		int wantsGroupMsgs,
	|		char groupBuf[MAX_GROUP_NAME],  /*OUT*/
	|		int semaIndex);
	|
Unix-specific sqSpreadConnect in platforms/unix/plugins/SpreadPlugin/sqUnixSpread.c
| At the lowest level is the platform-specific function
| sqSpreadConnect(), which lives in the
| platforms/unix/plugins/SpreadPlugin/sqUnixSpread.c file.
| This is where the SP_connect() routine is actually called.
|
| Note that I'm making copies of the passed in strings
| daemonName and
| privateName.
| I'm doing this because I have to provide a NUL-terminated string,
| and the incoming Smalltalk strings aren't large enough for me to
| insert a NUL.
| These get freed right after the SP_connect() call.
|
| If the connect call succeeds, I fill in the fields of
| mbox with the
| file descriptor, session ID, and semaphore index, then register the
| file descriptor with the async IO routines using
| aioEnable() and
| aioHandle().
| I also keep track of which file descriptors this plugin is using so
| that at plugin shutdown time I can tell the aio layer
| not to watch the file descriptors any more.
|
	| int sqSpreadConnect(
	| 	SpreadConnection *s,	/*OUT*/
	| 	const char *daemonName,
	| 	int daemonNameSize,
	| 	const char *privateName,
	| 	int privateNameSize,
	| 	int wantsGroupMsgs,
	| 	char *groupBuf,			/*OUT*/
	| 	int semaIndex)
	| {
	| 	int retval;
	| 	int mbox;
	| 	char *privateNameCopy, *daemonNameCopy;
	| 
	| 	if (s == NULL)
	| 		return ILLEGAL_SESSION;
	| 
	| 	privateNameCopy = strndup(privateName, privateNameSize);
	| 	daemonNameCopy = strndup(daemonName, daemonNameSize);
	| 
	| 	retval = SP_connect(daemonNameCopy, privateNameCopy, 0, wantsGroupMsgs,
	| 			&mbox, groupBuf);
	| 
	| 	free(privateNameCopy);
	| 	free(daemonNameCopy);
	| 
	| 	if (retval == ACCEPT_SESSION)
	| 	{
	| 		s->semaIndex = semaIndex;
	| 		s->mbox = mbox;
	| 		s->sessionID = sessionID;
	| 
	| 		FD_SET(mbox, &fds);
	| 		nfd= max(nfd, mbox + 1);
	| 		aioEnable(mbox, (void*)semaIndex, AIO_EXT);
	| 		aioHandle(mbox, &dataReadyCallback, AIO_R);
	| 	}
	| 
	| 	return retval;
	| }
	| 
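| The string copies above rely on strndup(), which is not available on
| every platform and can return NULL. As a sketch of one alternative,
| the helper below (copyAsCString() is a hypothetical name, not part of
| the plugin) makes a NUL-terminated copy of a counted Smalltalk string
| using only malloc() and memcpy(), and reports failure to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'ed, NUL-terminated copy of the
   first `size` bytes of `bytes`, or NULL if size is negative or the
   allocation fails. The caller frees the result. */
static char *copyAsCString(const char *bytes, int size)
{
	char *copy;

	if (size < 0)
		return NULL;
	copy = malloc((size_t)size + 1);
	if (copy == NULL)
		return NULL;
	memcpy(copy, bytes, (size_t)size);
	copy[size] = '\0';
	return copy;
}
```

| Unlike strndup(), this copies exactly size bytes even if the Smalltalk
| string contains an embedded NUL; the C library on the other side will
| still stop reading at the first NUL it sees.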
+Appendix 1: Slang reference
id=appendix
Smalltalk operators and methods that can be in a Slang method
	taken from initializeCTranslationDictionary text
+

Operators and methods supported by InterpreterPlugin

id=appendixInterpreterPlugin
+translated more or less directly to C operators
	-&
	-(or) |
	-and:
	-or:
	-not
	-+
	-(minus) -
	-(times) *
	-/
	-//
	-\\
	-<<
	->>
	-bitAnd:
	-anyMask:
	-bitOr:
	-bitXor:
	-bitShift:
	-bitInvert32
	-<
	-<=
	-=
	->
	->=
	-~=
	-==
	-~~
	-raisedTo: | calls pow()
	-min: | calls min()
	-max: | calls max()
+comparisons with nilObject
	-isNil
	-notNil
+loop constructs (remember: no real blocks)
	-whileTrue:
	-whileFalse:
	-whileTrue
	-whileFalse
	-to:do:
	-to:by:do:
+conditionals
	-ifTrue:
	-ifFalse:
	-ifTrue:ifFalse:
	-ifFalse:ifTrue:
+indexed access
	-at:
	-at:put:
	-basicAt:
	-basicAt:put:
+integer oop conversion/testing
	-integerValueOf:
	-integerObjectOf:
	-isIntegerObject:
+miscellaneous
	-cCode:
	-cCode:inSmalltalk:
	-cCoerce:to:
	-preIncrement
	-preDecrement
+directives
	-inline:
	-export:
	-returnTypeC:
	-static:
+casts
	-asFloat
	-asInteger
+subroutine calls
	-perform:
	-perform:with:
	-perform:with:with:
	-perform:with:with:with:
	-perform:with:with:with:with:
+

Additional operators and methods supported by TestInterpreterPlugin

id=appendixTestInterpreterPlugin
+automatically provides conversions as needed, makes life simpler.
+conversions oop=>C
	-asCInt
	-asCUnsigned
	-asCBoolean
	-asCDouble
	-asCharPtr
	-asIntPtr
	-asValue: Class
+conversions C=>oop
	-asSmallIntegerObj
	-asPositiveIntegerObj
	-asBooleanObj
	-asFloatObj
	-cPtrAsOop
	-asOop: Class
+named slot access (handy, but no type checking!)
	-asIf: Class var: 'instVarName' | ->fetchPointerOfObject
	-asIf: Class var: 'instVarName' asValue: Type | ->firstIndexableField(fetchPointerOfObject()) | ->(fetchPointerOfObject())>>1
	-asIf:var:put:
+numbered slot access
	-field: fld | ->fetchPointerOfObject(fld,rcvr)
	-field:put:
+array access
	-stSize
	-stAt:
	-stAt:put:
+type testing
	-isFloat
	-isIndexable
	-isIntegerOop
	-isIntegerValue
	-isWords
	-isWordsOrBytes
	-isPointers
	-isNil
	-isMemberOf:
	-isKindOf:
+miscellaneous
	-class | ->fetchClassOf(rcvr)
	-next | *var++
	-fromStack: #(name1 name2) | generates assignments into name1, name2, etc. | must name all stack vars
	-remapOop: #name in: [ block ] | pushes, block, pops
	-debugCode: [ stuff ] | only compiles if in debug mode
+Appendix 2: InterpreterProxy API (given as Smalltalk)
id=appendixInterpreterProxy
| though this is in Smalltalk, you can replace these with the C equivalents by running the selector together:
|
	| interpreterProxy doSomethingTo: x with: y becomes:
	| 	interpreterProxy->doSomethingTowith(x, y)
	| 	
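| The renaming rule is mechanical: the keyword parts of the selector are
| concatenated and the colons are dropped, which is how pop:thenPush:
| appears as popthenPush in the translated code earlier. The function
| below is a small sketch of that rule; selectorToCName() is a
| hypothetical helper written for this appendix, not part of the VM or
| of VMMaker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write the C name for a keyword selector into out by dropping the
   colons. Returns 0 on success, -1 if out is too small. */
static int selectorToCName(const char *selector, char *out, size_t outSize)
{
	size_t n = 0;

	for (; *selector != '\0'; selector++) {
		if (*selector == ':')
			continue;           /* colons do not appear in the C name */
		if (n + 1 >= outSize)
			return -1;          /* no room for this char plus the NUL */
		out[n++] = *selector;
	}
	if (outSize == 0)
		return -1;
	out[n] = '\0';
	return 0;
}
```

| Unary selectors such as failed contain no colons and come through
| unchanged.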
+stack access
	-pop:
	-pop:thenPush:
	-push:
	-pushBool:
	-pushFloat:
	-pushInteger:
	-stackFloatValue:
	-stackIntegerValue:
	-stackObjectValue:
	-stackValue:
+object access
	-argumentCountOf:
	-arrayValueOf:
	-byteSizeOf:
	-fetchArray:ofObject:
	-fetchClassOf:
	-fetchFloat:ofObject:
	-fetchInteger:ofObject:
	-fetchPointer:ofObject:
	-fetchWord:ofObject:
	-firstFixedField:
	-firstIndexableField:
	-literal:ofMethod:
	-literalCountOf:
	-methodArgumentCount
	-methodPrimitiveIndex
	-primitiveIndexOf:
	-primitiveMethod
	-sizeOfSTArrayFromCPrimitive:
	-slotSizeOf:
	-stObject:at:
	-stObject:at:put:
	-stSizeOf:
	-storeInteger:ofObject:withValue:
	-storePointer:ofObject:withValue:
+testing
	-includesBehavior:ThatOf:
	-is:KindOf:
	-is:MemberOf:
	-isBytes:
	-isFloatObject:
	-isIndexable:
	-isIntegerObject:
	-isIntegerValue:
	-isPointers:
	-isWeak:
	-isWords:
	-isWordsOrBytes:
+converting
	-booleanValueOf:
	-checkedIntegerValueOf:
	-floatObjectOf:
	-floatValueOf:
	-integerObjectOf:
	-integerValueOf:
	-positive32BitIntegerFor:
	-positive32BitValueOf:
	-positive64BitIntegerFor:
	-positive64BitValueOf:
	-signed32BitIntegerFor:
	-signed32BitValueOf:
	-signed64BitIntegerFor:
	-signed64BitValueOf:
+special objects
	-characterTable
	-displayObject
	-falseObject
	-nilObject
	-trueObject
+special classes
	-classArray
	-classBitmap
	-classByteArray
	-classCharacter
	-classFloat
	-classLargeNegativeInteger
	-classLargePositiveInteger
	-classPoint
	-classSemaphore
	-classSmallInteger
	-classString
+instance creation
	-clone:
	-instantiateClass:indexableSize:
	-makePointwithxValue:yValue:
	-popRemappableOop
	-pushRemappableOop:
+other
	-minorVersion
	-majorVersion
	-become:with:
	-byteSwapped:
	-failed
	-fullDisplayUpdate
	-fullGC
	-incrementalGC
	-ioMicroMSecs
	-primitiveFail
	-showDisplayBits:Left:Top:Right:Bottom:
	-signalSemaphoreWithIndex:
	-success:
	-superclassOf:
	-suppressFailureGuards: | suppresses interpreterFailed checks in assignments and calls with args XXX detail this
+Other resources
id=otherResources
| There are two papers I know about that deal with writing Squeak
| primitives:
-Andrew Greenberg's Extending the Squeak Virtual Machine
	| A good, though long, discussion of writing named primitives.
	| Unfortunately, it mostly skips discussion of the
	| TestInterpreterPlugin (which Andrew wrote).
	|
-Stephen Pope's The Do-It-Yourself Guide to Squeak Primitives
	| This is a somewhat older document that covers numbered primitives.
	| Some of its discussion (getting to things on the stack, for
	| instance) is still relevant, though the
	| TestInterpreterPlugin improves the situation a good deal.
	|
+Sidebar: more about Spread
id=spreadSidebar
+Sidebar: more about Squeak
id=squeakSidebar
# vim: tw=72 ts=3 sw=3 ft=otl