On Tue, Nov 19, 2013 at 7:00 PM, Andres Valloud <avalloud@smalltalk.comcastbiz.net> wrote:
There are other points of view worth considering.  Let's require that the resulting system works correctly, and backtrack from there to determine how to achieve that goal.

Sometimes, such as with Single Unix Specification / POSIX sockets, it is *impossible* to use an FFI correctly because the standard is such that using an FFI cannot be guaranteed to produce correct results.  Another way of saying the same thing is that you can use an FFI, as long as you don't care about the presence of undefined behavior in the general case.

(note that "undefined behavior" is specification-language shorthand for "execute arbitrary instructions", basically.  Usually this results in a segfault, but data corruption and security holes are possible too)


> Show me how you can replace the SocketPlugin with FFI, and
> I'll consider it. ;)

Specifically, SUS / POSIX sockets rely on partially specified structs whose size, fields, and field order can change from Unix to Unix. Moreover, the functions you'd call using those structs as arguments can be defined as macros.  Even trivial things like malloc() can be macros.  It's impossible to use those kinds of APIs in a sane manner from an FFI.

That's not so. I came up with a scheme and implemented a prototype for VW.  All one need do is generate a wrapper and compile it on the platform.  One can autogenerate and autocompile the wrapper.  The wrapper can either be something that outputs metadata interpreted by the image or something that actually wraps the platform functions.  If it can be called from C then, with a little ingenuity, it can be called through an FFI.  An FFI is not just a marshaller.

I would argue that in fact the best way to deal with differing UNIX implementations is this approach.  For example, ioctl defines, socket constant defines, struct layouts, etc, etc all differ markedly between UNIX implementations, and hence one easy way to extract exact information is to generate, compile and either run or load a program that reveals the implementation details.

 
Theoretically it's conceivable, but at the cost of breaking C's encapsulation mechanism, thus making the whole application non-portable across SUS / POSIX compliant implementations.  If one wanted to go that route, keep in mind the resulting never-ending maintenance homework is extremely time consuming, and the application's behavior cannot ever be proven correct.  In real life, the FFI approach to these APIs means applications are not rationally supportable due to undefined behavior.

Speaking of symlinks, the function-like-things symlink() and stat() can also be macros as per SUS / POSIX.  So, even if there was a function called "symlink" you could find via dlsym() or an equivalent, it's *unsafe* to assume you can use an FFI to call that something called "symlink" and produce the same effect as writing "symlink" in a C source file that is given to a C compiler.

This problem has already been satisfactorily addressed in the form of a C compiler and a properly configured compilation environment producing primitives (or things equivalent to primitives), such that you write something like

        make fooPrimitivesOrBarPlugin

and in O(1 second) you have something that could possibly work correctly.  Note that I mean "correctly" as in

        "if it doesn't work, then it's conceivable you can file a well documented bug report with the maintainer after a modest amount of effort",

as opposed to

        "send the author a circumstantial account to the effect that after looking at random .h files with a random (perhaps human) .h file parser, using binaries compiled with random optimization switches on a random machine, and violating the relevant specification that describes the rational use of the feature in question, the resulting application fails due to an unspecified cause --- help!".

For some reason, code maintainers tend to pay attention to the former and ignore the latter.

In short, an issue with these types of FFIs is that all too often they merely *appear* to work.  The only rational usage model for some (most?) of the APIs mentioned in this thread involves a C compiler, which in practice means a C primitive or a C plugin.

The above points, argued strictly on technical grounds, are not intended to "cause a confrontation" or to "negate benefits of FFIs and plugins".  I just strongly care that applications Work(TM).  That goal sometimes implies dealing with SUS / POSIX (or, gasp, MSDN) and a C compiler. Maybe it's not necessarily the most enjoyable activity, but at least then the C stuff will be used as intended.  The alternative is nonstop stochastic crashes preventing everyone's progress.

... my 2 cents...


On 11/19/13 10:35 , Eliot Miranda wrote:
Hi All,

     this is an important discussion that is taking a religious tone
that we should strive to avoid.  There are good arguments for plugins,
namely security and encapsulation.  There are good arguments for an FFI,
namely extensibility and platform compatibility.

Plugins provide security because they allow the system to control any
and all access to the underlying platform, permitting access only
through plugins.  With an FFI the underlying platform is exposed and one
needs other mechanisms, for example Newspeak mirrors, to prevent
untrusted code from accessing the platform with potentially disastrous
effects (self shell: '/bin/rm -rf /*').

Plugins encapsulate all sorts of details behind a potentially simple
primitive interface.  This can avoid confusing the newcomer (but at the
same time frustrate them by hiding details), provide portability, can
make it easier to determine the extent of work in moving to a new OS
platform, and so on.

An FFI allows immediate extensibility.  External functionality can be
invoked immediately.  With plugins a primitive interface must be
designed and then implemented. With the FFI the API is already defined;
it must "merely" be accessed.  This immediacy can itself provide
simplicity, especially where callbacks and threads are involved.
  Plugins can hide a lot of complexity (e.g. the SocketPlugin
encapsulates platform threads that are waiting on blocking calls so that
Squeak itself is provided with an interrupt-driven interface,
necessitated by the Squeak platform's lack of native thread support).

An FFI allows all underlying functionality to be accessed.  The plugin
approach necessitates defining a lowest common denominator approach to
functionality, especially irksome in some applications where setting the
right flag, e.g. on a socket stream, can have a significant performance
impact.

So there are good arguments either way.  In a system oriented towards
safe play plugins make excellent sense.  In a platform oriented towards
industrial development an FFI is a must-have, and a weak one will really
hurt acceptance.

IMO Squeak needs to have both.  It needs plugins to provide its
hallmarks such as eToys.  But to be a more general platform it needs an
FFI.  Managing this split personality will take work but I don't see any
fundamental issues.  Having a well-factored base into which packages can
be loaded to create different personalities is key, and good work is
being done here.  There may be a half-way house where the FFI is
strictly encapsulated, but this is hypothetical.  I know how to solve
threads, pinning, etc, but I don't know off the top of my head how to
encapsulate the FFI, so I can't propose it as a solution.

A number of straw men have been raised against the FFI in this
discussion.  OK, that's unfair.  A number of important questions have
been asked of the FFI in this discussion.

Levente asks "Show me how you can replace the SocketPlugin with FFI, and
I'll consider it. ;)".
The issue here is threads.  The SocketPlugin encapsulates blocking
calls, spawning hidden OS threads to make these calls and then signal
semaphores when they complete.  To solve this one needs both native
thread support in the VM (and I have a prototype that needs Spur's
facilities to make practicable) and pinning (the ability to stop certain
objects moving).  Spur provides pinning.

David says "I remember when somebody on the Pharo list suggested
reimplementing the
OSProcessPlugin in FFI. I told them it was a really great idea, and they
should give it a try. That settled the matter quite quickly ;-)".  Again
they failed because of the lack of necessary underlying functionality
from the VM.  With threads, pinning and a way of expressing the array of
pointers to strings idiom (a simple extension to marshalling, and/or
pinning, e.g. provide an address of first field primitive) an FFI can do
all the OSProcessPlugin can do, and significantly more simply.

David also says "it is a complete mystery to me why people are willing
to work so hard to avoid writing a VM plugin. VM plugins are reliable,
portable, and debuggable. They work across a range of processors. They
work on 64-bit platforms. So why would someone prefer to switch to a
calling interface that basically only works on 32-bit Intel processors
and that may require low level knowledge of calling conventions, word
alignment, and platform-specific data types?"

This is a non-sequitur.  The sentences beginning "So why would
someone..." don't follow from the first sentences.  Writing the plugin
requires even more knowledge than writing the FFI interface because one
needs to know the VM facilities for mating Squeak objects to plugins.
  Writing plugins /and/ writing interfaces above FFIs are hard.  But in
my experience a powerful FFI provides a faster and easier development
experience.  Both can be difficult to port, but plugins have the
advantage that only the innards have to be ported while facing the C
code face.  My experience in that regard leaves me with a preference for
FFIs.  The lack of a 64-bit FFI is a bad weakness of the Squeak
platform, something Spur again makes easy to rectify.

Bert asks "Suppose we add a new VM platform, like a VM running on
JavaScript in the browser. Do you really want to re-implement all the C
libraries utilized via FFI? Or rather a handful of primitives in your
language of choice?".  First, it is not clear that one *can* implement
these primitives taking either approach.  If the platform, e.g.
JavaScript in a browser, takes the Squeak plugin approach of preventing
access to the platform except through a restricted set of facilities,
then certain functionality will simply be off-limits, whether one has an
FFI or not.  Second, reimplementing all the C libraries isn't
obligatory.  If the platform provides an FFI one simply mates to its FFI
and accesses the underlying libraries.  If it doesn't then that
functionality is off-limits, but that doesn't mean the rest of the
system doesn't work.  It also means that Squeak running in that context
is no less useful than any other platform, because the underlying
platform (just as Squeak does with plugins)

--
best,
Eliot




--
best,
Eliot