[Pharo-project] [Vm-dev] Re: Can OSProcess functionality be implemented using FFI instead of plugin?

Thu Jan 21 12:43:37 UTC 2016

>
>
> Let's say we define a subclass of SharedPool called FFISharedPool.
> FFISharedPool's job is to manage autogenerating a C file, compiling it for
> the platform, and organizing parsing the relevant output.  Let's say we use
> a convention like class-side pragmas to define include files and compiler
> flags.  The VM provides two crucial pieces of information:
>
> 1. the platform name
> 2. the word size
>
> One can't run a Mac OS VM on Linux, and one can't run a 64-bit VM on a
> 32-bit operating system.  So taking this information from the VM accurately
> tells the current system what ABI (application binary interface) to use,
> and that's what's important in generating the right constants.
>
> So we use these two pieces of information to index the method pragmas that
> tell us what specific files to include.
>
> Let's imagine we subclass FFISharedPool to add a shared pool for constants
> for an SQL database.  We might have a class declaration like
>
> FFISharedPool subclass: #MYSQLInterface
> instanceVariableNames: ''
> classVariableNames: 'MYSQL_DEFAULT_AUTH MYSQL_ENABLE_CLEARTEXT_PLUGIN
> MYSQL_INIT_COMMAND MYSQL_OPT_BIND MYSQL_OPT_CAN_HANDLE_EXPIRED_PASSWORDS
> MYSQL_OPT_COMPRESS
> MYSQL_OPT_CONNECT_ATTR_DELETE MYSQL_OPT_CONNECT_ATTR_RESET'
> poolDictionaries: ''
> category: 'MYSQLInterface-Pools'
>
> The job of FFISharedPool is to compute the right values for the class
> variables on every platform we want to deploy the MYSQL interface on.
>
> So we need to know the relevant include files and C flags for each
> platform/word-size combination.  A few of them might look like
>
>
> MYSQLInterface class methods for platform information
> mac32
>     "I describe the include files and C flags to use when developing a
> 32-bit MYSQL FFI interface on Mac OS X"
>     <platformName: 'Mac OS' wordSize: 4>
>     <cFlags: #('-m32') includeFiles: #('/opt/mysql/include32')>
>     ^self "all the info is in the pragmas"
>
> mac64
>     "I describe the include files and C flags to use when developing a
> 64-bit MYSQL FFI interface on Mac OS X"
>     <platformName: 'Mac OS' wordSize: 8>
>     <cFlags: #('-m64') includeFiles: #('/opt/mysql/include64')>
>
> The above might cause FFISharedPool to autogenerate files called
> MYSQLInterface.mac32.c & MYSQLInterface.mac64.c.  And these, when run,
> might output ston notation to MYSQLInterface.mac32.ston &
> MYSQLInterface.mac64.ston (or maybe to stdout which has to be redirected to
> MYSQLInterface.mac32.ston; whatever).
>
> Now, you might use pragmas, or you might answer a Dictionary instance.
> Whatever style pleases you and seems convenient and readable.  But these
> methods define the necessary metadata (C flags, include paths, and ...?)
> for FFISharedPool to autogenerate the C program that, when compiled with
> the supplied C flags and run on the current platform, outputs the values
> for the constants the shared pool wants to define.
>
>
> You can get fancy and have FFISharedPool autogenerate the C programs
> whenever one adds or removes a constant name.  Or you can require the
> programmer run something, e.g. MYSQLInterface generateInterfaces.  It's
> really nice if FFISharedPool submits the file to the C compiler
> automatically, but this can only work for e.g. 32 & 64 bit versions on a
> single platform.  You have to compile the autogenerated program on the
> relevant platform, with the necessary libraries and include files installed.
>
> You could imagine a set of servers for different platforms so one could
> submit the autogenerated program for compilation and execution on each
> platform.  That's a facility I'd make easy to implement.  I could
> imagine that a programmer whose company develops an FFI interface and
> deploys it on a number of platforms would love to be able to automate
> compiling and running the relevant autogenerated code on a set of servers.
> I could imagine the Pharo community providing a set of servers upon which
> lots of software is installed for precisely this purpose. That means that
> people could develop FFI interfaces without even having to have the C
> compiler installed on their platform.
>
> You could also add a C parser to FFISharedPool that parses the
> post-preprocessed code and extracts function declarations.  But the
> important thing is autogenerating the C program so that it generates easily
> parsable output containing the values for the constants.  You can extend
> the system in interesting ways once you have this core functionality
> implemented.
>
> So once the program is autogenerated and compiled for the current
> platform, it is run and its output collected in a file whose name can be
> recognised by FFISharedPool.
>
>
Hi Eliot,

OK, I currently have a very first prototype in which I can autogenerate the C
file from an FFISharedPool subclass, compile it, run it and get the ston
file output. Please read below.
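
For illustration, here is roughly the shape an autogenerated C file could
take. This is a minimal sketch under stated assumptions, not the prototype's
actual output: the names emit_pool and EMIT are made up, the STON-like
{#NAME:value} layout is an assumption, and POSIX constants from <signal.h>,
<fcntl.h> and <sys/wait.h> stand in for the MySQL ones so that the file
compiles without the MySQL headers installed.

```c
/* Hypothetical generator output for a pool of three POSIX constants.
   emit_pool and EMIT are illustrative names, not FFISharedPool API. */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>

/* Print one "#NAME:value" entry.  #name stringifies the macro name
   before it is expanded; (name) expands to the platform's value. */
#define EMIT(out, name) fprintf((out), "#%s:%ld", #name, (long)(name))

/* Write the whole pool as one STON-like map on a single line.
   wordSize is 4 or 8 depending on how the file was compiled. */
static void emit_pool(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "{#wordSize:%ld,", (long)sizeof(void *));
    EMIT(out, SIGKILL);    fputc(',', out);
    EMIT(out, O_NONBLOCK); fputc(',', out);
    EMIT(out, WNOHANG);
    fputs("}\n", out);
}
```

A generated file would presumably call emit_pool(stdout) from main() and be
compiled once per platform method, with the flags from the cFlags pragma
(for example -m32 or -m64), redirecting the output to the matching .ston file.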

>
> Now the class side of FFISharedPool might be declared as
>
> FFISharedPool class
> instanceVariableNames: 'platformName wordSize'
>
> and on start-up FFISharedPool could examine its subclasses, and for each
> whose platformName & wordSize do not match the current platform, search for
> all the matching FOOInterface.plat.ston files, parse them and update the
> subclasses' variables, and update that pool's platformName & wordSize.  It
> could emit a warning on the Transcript or stdout (headful vs headless)
> indicating which subclasses it couldn't find the relevant
> FOOInterface.plat.ston files for.
>
> But the end result is that
>
> a) providing the system is deployed with FOOInterface.plat.ston files for
> each interface and platform used, a cross-platform application can be
> deployed *that does not require a C compiler*.
> b) providing that a system's FOOInterface files have been initialized on
> the intended platform, a platform-specific application can be deployed for
> a single platform *without needing the ston files*.
>
>
I was thinking the following. Having to distribute the FFI wrapper (take as
an example the MySQL wrapper) with the .ston files is a bit of a pain with
MC.  So I was thinking... what if FFISharedPool has all the machinery to
allow the FFI library wrapper developer (the developer of the MySQL wrapper) to
autogenerate the ston file as we said, BUT the ston file is stored as
methods in the MYSQLInterface subclass? Probably under an "autogenerated"
protocol. That way, it's very easy to distribute and, in addition, at system
startup it's easier to "search" for the "ston files".

The only drawback is that for very large ston files MC will suffer a bit...
but...

Thoughts?

> Does this make more sense now?
>
> c) at startup the image checks its current platform.  If the platform is
>>> the same that it was saved on, no action is taken.  But if the platform has
>>> changed then the relevant ston file is selected, parsed, and the values for
>>> the variables in the shared pool updated to reflect the values of the
>>> current platform.
>>>
>>> So the C compiler is only needed when developing the interface, not when
>>> deploying it.
>>>
>>>
>> OK
>>
>>
>>>
>>>> Then Nicolas made a point that if we plan to manage all that complexity
>>>> at the image level it may become a hell too.
>>>>
>>>> So.... what if we take a simpler (probably not better) approach and we
>>>> consider the "c program that exports constants and sizes" a VM Plugin?
>>>> Let's say we have a UnixPreprocessorPlugin (that would work for OSX, Linux
>>>> and other Unixes, I imagine, for the time being) which provides a function
>>>> (that is exported) which answers an array of arrays. For each constant, we
>>>> include the name of the constant, the value, and the sizeof().  Then from
>>>> image side, we simply do one FFI call, we get the large array and we adapt
>>>> it to a SharedPool or whatever kind of object representing that info.
>>>>
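
A minimal sketch of the kind of table such a plugin function could answer,
assuming a POSIX system. The struct pool_entry layout and the names
pool_table and pool_lookup are hypothetical; they are not an existing VM
plugin interface. Each entry carries the constant's name, its value, and the
sizeof() of its C type, as described above.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* One exported entry: constant name, value, and size of its C type. */
struct pool_entry {
    const char *name;
    long        value;
    size_t      size;
};

/* Build an entry from a preprocessor constant (name is stringified). */
#define ENTRY(c) { #c, (long)(c), sizeof(c) }

static const struct pool_entry pool_entries[] = {
    ENTRY(SIGKILL),
    ENTRY(SIGTERM),
    ENTRY(O_NONBLOCK),
};

/* Hypothetical exported function: answers the table and its length. */
const struct pool_entry *pool_table(size_t *count)
{
    assert(count != NULL);
    *count = sizeof pool_entries / sizeof pool_entries[0];
    return pool_entries;
}

/* Find an entry by name; answers NULL when the name is not in the table. */
const struct pool_entry *pool_lookup(const char *name)
{
    size_t i, n;
    const struct pool_entry *table = pool_table(&n);
    for (i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}
```

The image side would then make one call to fetch the table and copy each
name/value pair into a SharedPool, or whatever object represents that info.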
>>>
>>>
>>>
>>> This is what I suggested in the first place.  That what is
>>> autogenerated is a shared object (be it a plugin or a dll doesn't matter, it
>>> is machine code generated by a C compiler from an autogenerated C program
>>> compiled with the platform's C compiler) that can be loaded at run-time and
>>> interrogated to fetch the values of a set of variables
>>>
>>
>> OK, got it. But still, it would be easier if the "platform" in this case
>> is the "machine where we build the VM we will then distribute", right? I
>> mean, I would like to put this in the CI jobs that automatically build the
>> VM, and not build for each platform myself.
>>
>
> NO!  For example, why would a company that has some proprietary arithmetic
> package implemented in its secret labs in C or C++ and accessed through the
> FFI want to have that code on the Pharo community's build servers?
>
>
>>
>> *I mean, my main doubt is whether this job of autogenerating C code, compiling
>> it, running it, exporting a text file, and distributing the text file with the
>> VM could be done as part of the VM building. *
>>
>
> For fuck's sake.  Developing an FFI is not something one does when
> building a VM.  It is something one does when using the system.  If you want
> to do this you *use a plugin*.  The FFI is a different beast.  It is to
> allow programmers to interface to external libraries that are *independent
> from the VM*.
>
> I'm not going to answer this one again.  OK?
>
>
>
>>
>>
>>
>>> .  But I think that the textual notation suggested above is simpler.
>>> The text files are easier to distribute and change.  Shared objects and
>>> plugins have a habit of going stale, and there needs to be metadata in
>>> there to describe the set of constants etc, which is tricky to generate and
>>> parse because it is binary (pointer sizes, etc, etc).  Instead a simple
>>> textual format should be much more robust.  One could even edit by hand to
>>> add new constants.  It would be easy to make the textual file a versioned
>>> file.  Etc, etc.
>>>
>>>
>>
>> OK. Got it. And do you think using X Macros for the autogenerated C (from
>> the SharedPool) is a good idea?
>> And then I simply write a text file out of it.
>>
>>
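
A minimal sketch of what the X macro approach could look like, assuming a
POSIX system. The names POOL_CONSTANTS, COUNT_ONE, PRINT_ONE and write_pool
are hypothetical, and the "NAME value" line format is only one possible text
layout. The idea is that the generator writes the constant list once and the
C preprocessor expands it several ways.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>

/* The single list of constants.  A generator would write this list
   from the pool's class variable names. */
#define POOL_CONSTANTS(X) \
    X(SIGINT)             \
    X(SIGKILL)            \
    X(WNOHANG)

/* Expansion 1: count the entries at compile time (0 + 1 + 1 + 1). */
#define COUNT_ONE(name) + 1
enum { POOL_COUNT = 0 POOL_CONSTANTS(COUNT_ONE) };

/* Expansion 2: one fprintf per entry.  It relies on a FILE *out
   variable being in scope where the list is expanded. */
#define PRINT_ONE(name) fprintf(out, "%s %ld\n", #name, (long)(name));

/* Write one "NAME value" line per constant; answers the entry count. */
static int write_pool(FILE *out)
{
    assert(out != NULL);
    POOL_CONSTANTS(PRINT_ONE)
    return POOL_COUNT;
}
```

Adding a constant then means adding one X(...) line; every expansion picks
it up without further changes.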
>>
>>>
>>>> I know that different users will need different constants. But let's
>>>> say the infrastructure (plugin etc) is already done. And let's say I am a
>>>> user who wants to build something with FFI and needs some constants that
>>>> are not yet defined. Then I can simply add the ones I need in the plugin,
>>>> and the next VM release will have those. If Cog gets moved to Github, then this
>>>> is even easier. Everybody can do a PR with the constants he needs. And in
>>>> fact, if we have the infrastructure in place, I think that if each of us
>>>> spends half an hour, we may have almost everything we need.
>>>>
>>>> For example, I can add myself all those for signals (to use kill() from
>>>> FFI), all those from fcntl (to make non-blocking pipes), all those from
>>>> wait()/waitpid() family (so that I can do a waitpid() with WNOHANG), etc
>>>> etc etc.
>>>>
>>>> I know it's not the best approach but it's something that could be done
>>>> very easily and would allow A LOT of stuff to be moved to FFI just because
>>>> we have no access to preprocessor constants or sizeof() (to know how to
>>>> allocate). I also know this won't cover macros and other stuff. But still.
>>>>
>>>> If you think this is a good idea, I can spend the time to do it.
>>>>
>>>> Cheers,
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, May 10, 2012 at 10:09 AM, Nick Ager <nick.ager at gmail.com>
>>>> wrote:
>>>>
>>>>> <snip>
>>>>> Well, like opendbx, maybe because opengl has quite standard
>>>>> interface...
>>>>> </snip>
>>>>>
>>>>> and
>>>>>
>>>>> <snip>
>>>>> It's not that it's not doable, it's that we're going to reinvent a gas plant
>>>>> and it's going to be so boring...
>>>>> I'd like to see a proof of concept, even if we restrict to libc, libm,
>>>>> kernel.dll, msvcrt.dll ...
>>>>> </snip>
>>>>>
>>>>> <snip>
>>>>> Is the unix style select()
>>>>> ubiquitous or should I use WaitForMultipleObject() on Windows? Are
>>>>> specifications of read/write stream implementations machine independent
>>>>> (bsd/sysv/others...)
>>>>> </snip>
>>>>>
>>>>> Perhaps *a* way forward is to try to find existing projects which have
>>>>> already created cross-platform abstractions for platform specific
>>>>> functionality. Then we can use FFI to access that interface in a similar
>>>>> way to OpenGL and OpenDBX. For example, Node.js works across Unixes; perhaps
>>>>> it has a useful cross-platform abstraction. Boost has abstractions for
>>>>> IPC, etc.
>>>>>
>>>>> Nick
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Mariano
>>>> http://marianopeck.wordpress.com
>>>>
>>>>
>>>
>>>
>>> --
>>> _,,,^..^,,,_
>>> best, Eliot
>>>
>>
>>
>>
>> --
>> Mariano
>> http://marianopeck.wordpress.com
>>
>
>
>
> --
> _,,,^..^,,,_
> best, Eliot
>

-- 
Mariano
http://marianopeck.wordpress.com