Hi all,
Sorry for reviving an old thread, but I thought it was better to continue the discussion here because of the context. As you may have read, the other day I released a first approach to a subset of OSProcess based on FFI (the posix_spawn() family of functions):
https://github.com/marianopeck/OSSubprocess
And with that in mind, I wanted to share a few things with you. The two main problems I found with implementing this with FFI were:
1) We have all already discussed and agreed that fork+exec cannot be done in separate FFI calls. So at the very minimum you need either a plugin method that does the fork()+exec() OR to wrap a library call like posix_spawn().
2) The other main problem, as you all said (mostly Nicolas), is the preprocessor (constants, macros, etc.).
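To illustrate point 1, here is a minimal C sketch of creating a child in a single library call, which is what makes posix_spawn() usable from FFI. The helper name spawn_and_wait is hypothetical and is not taken from OSSubprocess; error handling is kept to the bare minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: spawn `path` with `argv` in ONE call and wait for it.
 * Returns the child's exit code, or -1 on failure or abnormal termination. */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid;
    int status;

    assert(path != NULL && argv != NULL);

    /* posix_spawn() returns 0 on success, or an error number otherwise. */
    if (posix_spawn(&pid, path, NULL, NULL, argv, environ) != 0)
        return -1;

    /* Blocking wait; retrying on EINTR is omitted for brevity. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because fork and exec happen inside the single posix_spawn() call, there is no window in which the image would have to keep running in a half-initialized child.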
With all that said, I was able to get my stuff working. However, I am still using some primitives of OSProcess plugin because of 2).
I read Eliot's idea and what I don't like is the need for a C compiler on the user's machine. I think that's a big constraint. Then Igor suggested that WE (developers and maintainers of a certain tool) are the ones who compile the little C program to extract constant values etc., and then WE provide, as part of our source code, some packages with a SharedPool per platform/OS. Igor's approach looked a bit better to me.
Then Nicolas made a point that if we plan to manage all that complexity at the image level it may become a hell too.
So.... what if we take a simpler (probably not better) approach and we consider the "c program that exports constants and sizes" a VM Plugin? Let's say we have a UnixPreprocessorPlugin (that would work for OSX, Linux and other Unixes, I imagine, for the time being) which provides an exported function that answers an array of arrays. For each constant, we include the name of the constant, the value, and the sizeof(). Then from the image side, we simply do one FFI call, get the large array, and adapt it to a SharedPool or whatever kind of object represents that info.
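A rough C sketch of what the exported table could look like, assuming a plain array of (name, value, sizeof) entries. The struct, the macro and find_constant are hypothetical names used only for illustration; they are not an existing plugin API, and the list of constants is just a sample.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* One exported entry: the constant's name, its value and its size in bytes. */
struct exported_constant {
    const char *name;
    long        value;
    size_t      size;
};

/* #c stringizes the name before expansion; (c) is expanded by the preprocessor. */
#define EXPORT_CONSTANT(c) { #c, (long)(c), sizeof(c) }

static const struct exported_constant exported_constants[] = {
    EXPORT_CONSTANT(SIGKILL),
    EXPORT_CONSTANT(SIGCHLD),
    EXPORT_CONSTANT(F_GETFL),
    EXPORT_CONSTANT(F_SETFL),
    EXPORT_CONSTANT(O_NONBLOCK),
    EXPORT_CONSTANT(WNOHANG),
};

/* Look up a constant by name; answers NULL when it is not in the table. */
const struct exported_constant *find_constant(const char *name)
{
    size_t count = sizeof exported_constants / sizeof exported_constants[0];

    assert(name != NULL);
    for (size_t i = 0; i < count; i++)
        if (strcmp(exported_constants[i].name, name) == 0)
            return &exported_constants[i];
    return NULL;
}
```

The values are resolved by the C compiler of whoever builds the plugin, so the image never needs to parse headers; adding a constant is one more EXPORT_CONSTANT line.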
I know that different users will need different constants. But let's say the infrastructure (plugin etc.) is already done, and let's say I am a user who wants to build something with FFI and I need some constants that I see are not defined. Then I can simply add the ones I need to the plugin, and the next VM release will have those. If Cog gets moved to GitHub, then this is even easier: everybody can do a PR with the constants they need. In fact, if we have the infrastructure in place, I think that if each of us spends half an hour, we may have almost everything we need.
For example, I can myself add all those for signals (to use kill() from FFI), all those from fcntl (to make non-blocking pipes), all those from the wait()/waitpid() family (so that I can do a waitpid() with WNOHANG), etc.
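As an illustration of why those constants matter, here is a hedged C sketch that polls a child with waitpid(..., WNOHANG) and falls back to kill() with SIGKILL if it is still running after a number of polls. The function run_with_polling, the struct run_result and the 50 ms interval are assumptions made for this example only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

extern char **environ;

/* Hypothetical result record for this sketch. */
struct run_result {
    int exited;       /* 1 if the child exited normally */
    int exit_code;    /* valid when exited == 1 */
    int term_signal;  /* signal number if the child was killed, else 0 */
};

/* Spawn `file` (searched on PATH), poll it without blocking, kill it if it
 * outlives `max_polls` polls. Returns 0 on success, -1 on error. */
int run_with_polling(const char *file, char *const argv[], int max_polls,
                     struct run_result *out)
{
    pid_t pid;
    int status = 0;
    int finished = 0;
    struct timespec pause_between_polls = { 0, 50 * 1000 * 1000 }; /* 50 ms */

    assert(file != NULL && argv != NULL && out != NULL);

    if (posix_spawnp(&pid, file, NULL, NULL, argv, environ) != 0)
        return -1;

    for (int i = 0; i < max_polls && !finished; i++) {
        pid_t r = waitpid(pid, &status, WNOHANG);  /* 0 means "still running" */
        if (r == -1)
            return -1;
        if (r == pid)
            finished = 1;
        else
            nanosleep(&pause_between_polls, NULL);
    }

    if (!finished) {
        /* Still running: terminate it, then reap it with a blocking wait. */
        if (kill(pid, SIGKILL) != 0 || waitpid(pid, &status, 0) != pid)
            return -1;
    }

    out->exited = WIFEXITED(status) ? 1 : 0;
    out->exit_code = out->exited ? WEXITSTATUS(status) : 0;
    out->term_signal = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
    return 0;
}
```

From FFI, each of WNOHANG, SIGKILL and the status-decoding macros would have to come from somewhere, which is exactly the gap the constants plugin is meant to fill. Note that WIFEXITED and friends are macros, so they would still need separate treatment.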
I know it's not the best approach, but it's something that could be done very easily and would allow A LOT of stuff to be moved to FFI that currently cannot be, just because we have no access to preprocessor constants or sizeof() (to know how much to allocate). I also know this won't cover macros and other stuff. But still.
If you think this is a good idea, I can spend the time to do it.
Cheers,
On Thu, May 10, 2012 at 10:09 AM, Nick Ager nick.ager@gmail.com wrote:
<snip> Well, like opendbx, maybe because opengl has quite standard interface... </snip>
and
<snip> It's not that it's not doable, it's that we gonna reinvent gaz plant and it gonna be so boring... I'd like to see a proof of concept, even if we restrict to libc, libm, kernel.dll, msvcrt.dll ... </snip>
<snip> Is the unix style select() ubiquitous or should I use WaitForMultipleObject() on Windows? Are specification of read/write streams implementation machine independant (bsd/sysv/others...) </snip>
Perhaps *a* way forward is to try to find existing projects which have already created cross-platform abstractions for platform specific functionality. Then we can use FFI to access that interface in a similar way to OpenGL and OpenDBX. For example NodeJs works across unixes - perhaps they have a useful cross-platform abstraction, boost has abstractions of IPC etc
Nick
On Sat, 16 Jan 2016, Mariano Martinez Peck wrote:
(No quote, thanks google.)
Forking a process is easy compared to communicating with it. How would you do the latter with FFI?
Levente
On Sat, Jan 16, 2016 at 11:33 AM, Levente Uzonyi leves@caesar.elte.hu wrote:
On Sat, 16 Jan 2016, Mariano Martinez Peck wrote:
(No quote, thanks google.)
Forking a process is easy compared to communicating with it. How would you do the latter with FFI?
Hi Levente,
I am not sure I understand the question. For OSSubprocess I use two ways of communicating between the parent and the child, either pipes or regular files, in both cases doing a dup2() of those onto the standard streams. It is the same way OSProcess does it, as far as I know.
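For readers unfamiliar with the technique, a minimal C sketch of the pipe variant follows: the write end of a pipe is dup2()'ed onto the child's stdout through posix_spawn file actions, and the parent reads the other end. The helper capture_stdout is a hypothetical name and this is not OSSubprocess's actual code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run `path` and copy up to cap-1 bytes of its stdout
 * into buf (NUL-terminated). Returns the byte count, or -1 on error. */
ssize_t capture_stdout(const char *path, char *const argv[],
                       char *buf, size_t cap)
{
    int fds[2];
    pid_t pid;
    posix_spawn_file_actions_t actions;
    ssize_t total = 0, n;
    int rc;

    assert(path != NULL && argv != NULL && buf != NULL && cap > 0);

    if (pipe(fds) != 0)
        return -1;

    /* In the child: stdout becomes the pipe's write end, then both
     * original pipe descriptors are closed. */
    posix_spawn_file_actions_init(&actions);
    posix_spawn_file_actions_adddup2(&actions, fds[1], STDOUT_FILENO);
    posix_spawn_file_actions_addclose(&actions, fds[0]);
    posix_spawn_file_actions_addclose(&actions, fds[1]);

    rc = posix_spawn(&pid, path, &actions, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&actions);

    /* The parent must close its copy of the write end to ever see EOF. */
    close(fds[1]);
    if (rc != 0) {
        close(fds[0]);
        return -1;
    }

    while ((size_t)total < cap - 1 &&
           (n = read(fds[0], buf + total, cap - 1 - (size_t)total)) > 0)
        total += n;
    buf[total] = '\0';

    close(fds[0]);
    waitpid(pid, NULL, 0);  /* reap the child */
    return total;
}
```

This version reads with blocking read() calls, which is fine in a standalone C program but is precisely what would stall the image if done through a non-threaded FFI callout.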
Cheers,
On Sat, 16 Jan 2016, Mariano Martinez Peck wrote:
(Still no quote.)
How will you read the output of the process without having your image's process blocked in the FFI callout? How will you make sure that writes to input of the process won't block the FFI callout?
Levente
On Sat, Jan 16, 2016 at 9:37 PM, Levente Uzonyi leves@caesar.elte.hu wrote:
On Sat, 16 Jan 2016, Mariano Martinez Peck wrote:
(Still no quote.)
How will you read the output of the process without having your image's process blocked in the FFI callout? How will you make sure that writes to input of the process won't block the FFI callout
I don't see how your questions are related to what I proposed, namely creating a plugin that exports constants. And I also don't understand what would change in that regard whether it is via FFI or plugin. Could you explain please?
In any case, for the reading, I simply use non-blocking pipes. Something like this:
fcntl(descriptor, F_SETFL, flags | O_NONBLOCK)
That way the read operations won't block and simply answer what is available on the pipe. Then, from the image side, it is up to the user to decide how to get it. It could be a polling loop or whatever.
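Spelled out a bit more, a C sketch of that idea could look like the following. The names set_nonblocking, read_available and READ_WOULD_BLOCK are hypothetical; the point is only that `flags` comes from a prior F_GETFL and that an empty pipe reports EAGAIN instead of blocking.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum { READ_WOULD_BLOCK = -2 };  /* hypothetical marker for "nothing yet" */

/* Add O_NONBLOCK to the descriptor's existing status flags.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Answers the number of bytes read (> 0), 0 at end of file,
 * READ_WOULD_BLOCK when no data is available right now, -1 on other errors. */
ssize_t read_available(int fd, void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return READ_WOULD_BLOCK;
    return n;
}
```

A polling loop on the image side would call the equivalent of read_available repeatedly, treating READ_WOULD_BLOCK as "try again later" and 0 as "the child closed its end".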
Something I want to try one day is to use blocking pipes with threaded FFI callouts. But until that's in place, non-blocking pipes plus image-side polling is what I am using.
Writing to stdin is, I think, blocking. I think non-blocking writing is possible too, but I never tried it.
Does this answer your question?
On Sat, 16 Jan 2016, Mariano Martinez Peck wrote:
(Again no quote.)
My question was obviously unrelated to the plugin with the constants. I was just wondering if there's a reason to use FFI instead of OSProcess. Based on your answers, I assume that
- Your solution is Unix/Mac only, since dup and dup2 don't work on Windows. At least they didn't work when I wrote the ProcessWrapper plugin.
- You have the drawbacks of non-blocking streams: polling, repeated write attempts.
- You have to solve the problem of different platforms (constant values, library names and paths, etc), something that OSProcess handles by default.
All this makes me wonder:
- Why do you want this FFI-based solution at all?
- How will it be better than OSProcess?
- If you decide to make a plugin as well to tackle some of the problems, why make it a mixed solution instead of plugin-only?
Levente
On Sun, Jan 17, 2016 at 12:28 PM, Levente Uzonyi leves@caesar.elte.hu wrote:
On Sat, 16 Jan 2016, Mariano Martinez Peck wrote:
(Again no quote.)
My question was obviously unrelated to the plugin with the constants. I was just wondering if there's a reason to use FFI instead of OSProcess.
Well, the discussion of FFI vs plugin is what Git vs MC is on the Pharo mailing list. It has been discussed a million times and both approaches have pros and cons. If you want, Google a bit, but I won't lose my energy yet again on such a discussion.
Based on your answers, I assume that
- Your solution is Unix/Mac only, since dup and dup2 don't work on
Windows. At least they didn't work when I wrote the ProcessWrapper plugin.
Yes, and this has nothing to do with whether it is FFI or plugin. OSProcess does use dup2 too.
- You have the drawbacks of non-blocking streams: polling, repeated write
attempts.
Yes, but it seems that non-blocking streams with image-based polling are not a big deal for most users. This is confirmed by the survey I ran weeks ago (40 answers so far). In any case, thanks to Eliot, Esteban, and many others, we are hoping to have threaded FFI callouts soon, which will allow blocking threaded callouts. And I wonder... wouldn't this be even easier with FFI than with a plugin? Once the threaded FFI callouts are done, I have nothing special to do. By contrast, the OSProcess plugin would have to understand and know how to manage the threads for that, I imagine, right?
- You have to solve the problem of different platforms (constant values,
library names and paths, etc), something that OSProcess handles by default.
Sure, and that was the whole point of my answer to Eliot's proposal. But you are missing something: if we can make such a solution, it would be general, for every possible FFI tool, NOT ONLY for my OSProcess alternative.
All this makes me wonder:
- Why do you want this FFI-based solution at all? - How will it be better
than OSProcess?
Already answered.
- If you decide to make a plugin as well to tackle some of the problems,
why make it a mixed solution instead of plugin-only?
If we can make this tool for the constants, the only thing I would need from a custom plugin or OSProcess is the custom handler (semaphore) to receive signals. And that's only if you want to catch SIGCHLD in order to collect exit status. If the user chooses to do a polling wait, then that's not even necessary.
On Sat, Jan 16, 2016 at 4:37 PM, Levente Uzonyi leves@caesar.elte.hu wrote:
On Sat, 16 Jan 2016, Mariano Martinez Peck wrote:
(Still no quote.)
How will you read the output of the process without having your image's process blocked in the FFI callout? How will you make sure that writes to input of the process won't block the FFI callout?
This presupposes the threaded FFI. The threaded FFI allows the VM to make any number of blocking calls, adding a new thread to run the VM whenever the VM is stalled when the heartbeat beats. Hence one can freely read from and write to blocking i/o streams (including pipes and sockets), blocking database connexions, etc., all without stating that the FFI call must be done in a special way, since all calls through the FFI can block without blocking the VM.
Note that the scheme is also amenable to plugins, but the plugins must be rewritten to include the release vm/acquire vm calls around a blocking call. With the threaded VM the FFI includes these calls around every FFI call.
HTH _,,,^..^,,,_ best, Eliot
On Sat, 16 Jan 2016, Eliot Miranda wrote:
On Sat, Jan 16, 2016 at 4:37 PM, Levente Uzonyi leves@caesar.elte.hu wrote:
On Sat, 16 Jan 2016, Mariano Martinez Peck wrote: (Still no quote.) How will you read the output of the process without having your image's process blocked in the FFI callout? How will you make sure that writes to input of the process won't block the FFI callout?
This presupposes the threaded FFI. The threaded FFI allows the VM to make any number of blocking calls, adding a new thread to run the VM whenever the VM is stalled when the heartbeat beats. hence one can freely read and write to/from i/o blocking i/o streams (including pipes and sockets) or blocking database connexions, etc, all without stating that the FFI call must be done in a special way, since all calls through the FFI can block without blocking the VM.
I think it was you who said (in a discussion with Craig) that the threaded FFI was not production ready. Is it ready for production now?
Levente
Hi Levente,
On Jan 17, 2016, at 7:30 AM, Levente Uzonyi leves@caesar.elte.hu wrote:
On Sat, 16 Jan 2016, Eliot Miranda wrote:
On Sat, Jan 16, 2016 at 4:37 PM, Levente Uzonyi leves@caesar.elte.hu wrote:
On Sat, 16 Jan 2016, Mariano Martinez Peck wrote: (Still no quote.) How will you read the output of the process without having your image's process blocked in the FFI callout? How will you make sure that writes to input of the process won't block the FFI callout?
This presupposes the threaded FFI. The threaded FFI allows the VM to make any number of blocking calls, adding a new thread to run the VM whenever the VM is stalled when the heartbeat beats. hence one can freely read and write to/from i/o blocking i/o streams (including pipes and sockets) or blocking database connexions, etc, all without stating that the FFI call must be done in a special way, since all calls through the FFI can block without blocking the VM.
I think it was you who said (in a discussion with Craig) that the threaded FFI was not production ready. Is it ready for production now?
No, but I expect this is the year it will be. Spur provides pinning, so the VM infrastructure is there. The Pharo community plus some commercial relationships that have developed are providing funding. Esteban Lorenzano and I want to collaborate on this and I hope to get help from some other people, such as Ronie Salgado. And Mariano is working on an important part of the problem. So I feel there's sufficient momentum for us to realize the threaded FFI this year.
Levente
_,,,^..^,,,_ (phone)
Wow, impressive work guys, that's awesome news.
On Jan 17, 2016, at 8:53 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Levente,
On Jan 17, 2016, at 7:30 AM, Levente Uzonyi leves@caesar.elte.hu wrote:
On Sat, 16 Jan 2016, Eliot Miranda wrote:
On Sat, Jan 16, 2016 at 4:37 PM, Levente Uzonyi leves@caesar.elte.hu wrote:
On Sat, 16 Jan 2016, Mariano Martinez Peck wrote: (Still no quote.) How will you read the output of the process without having your image's process blocked in the FFI callout? How will you make sure that writes to input of the process won't block the FFI callout?
This presupposes the threaded FFI. The threaded FFI allows the VM to make any number of blocking calls, adding a new thread to run the VM whenever the VM is stalled when the heartbeat beats. hence one can freely read and write to/from i/o blocking i/o streams (including pipes and sockets) or blocking database connexions, etc, all without stating that the FFI call must be done in a special way, since all calls through the FFI can block without blocking the VM.
I think it was you who said (in a discussion with Craig) that the threaded FFI was not production ready. Is it ready for production now?
No, but I expect this is the year it will be. Spur provides pinning, so the VM infrastructure is there. The Pharo community plus some commercial relationships that have developed are providing funding. Esteban Lorenzano and I want to collaborate on this and I hope to get help from some other people, such as Ronie Salgado. And Mariano is working on an important part of the problem. So I feel there's sufficient momentum for us to realize the threaded FFI this year.
and when Craig Latta tried to use it late last year it worked up to a point. The thing that didn't work was callbacks from foreign threads. So it looks like the core threading code is not too far away from working.
Another really important part, bigger than threading, is marshaling. Being able to handle the full x86_64 ABI requires a better approach than interpreting type signatures. Igor's NativeBoost gave an example of how to generate marshaling machine code, but alas only for x86. But Sista includes an extensible bytecode set for arbitrary instructions. Sista is close to production, and we know the bytecode set works. So the plan is to use these bytecodes to do the marshaling. That neatly solves the problems of a) associating marshaling machine code with a method and b) marshaling in an interpreted stack VM, since the bytecode set works in any Cog VM. So the plan is to write an ABI compiler from C signatures to marshaling code to replace the interpreted FFI plugin.
So this year I hope we will have an excellent high performance FFI.
Levente
_,,,^..^,,,_ (phone)
On 17 Jan 2016, at 18:04, Eliot Miranda eliot.miranda@gmail.com wrote:
So this year I hope we will have an excellent high performance FFI.
This will be an exciting year. BTW, when Ronie arrives here we will plan some Skype sessions with you.
Stef
On Sun, Jan 17, 2016 at 12:13 PM, stephane ducasse < stephane.ducasse@gmail.com> wrote:
On 17 Jan 2016, at 18:04, Eliot Miranda eliot.miranda@gmail.com wrote:
So this year I hope we will have an excellent high performance FFI.
this will be an exciting year. BTW when ronie will arrive here we will plan some skype session with you.
Yes, if we're to pull this off we all have to be working with a coherent shared plan and keep in communication. This is exciting. I've been dreaming of this FFI for a while now :-).
_,,,^..^,,,_ best, Eliot
...when Craig Latta tried to use [Alien FFI] late last year it worked up to a point. The thing that didn't work was callbacks from foreign threads. So it looks like the core threading code is not too far away from working.
(Yes, it seemed close enough that I spent several hours debugging, trying to get it the rest of the way. I ran out of time, so I wrote a wrapper C library around the one I wanted to use, with threaded C callback functions that signalled Smalltalk semaphores on which my synchronous-FFI Smalltalk process waited. A hack, but it worked fine and was simple.)
-C
-- Craig Latta netjam.org +31 6 2757 7177 (SMS ok) + 1 415 287 3547 (no SMS)
So I assume that means callbacks from inside C threads work fine, which makes it more than enough, at least for now.
Does that mean that the VM will implement a real threading mechanism?
Hi Dimitris,
On Jan 18, 2016, at 3:29 AM, Dimitris Chloupis kilon.alios@gmail.com wrote:
so I assume that means callbacks from inside C threads works fine which make it more than enough at least for now.
"C threads" is a misnomer. Callbacks from native threads that are currently calling out work. I /think/ callbacks from native threads that have previously called out but are not currently calling out work, but am not sure. Callbacks from native threads the VM has not seen before don't yet work; the VM doesn't service them.
Does that mean that the VM will implement a real threading mechanism ?
This is a mechanism that allows one to freely share the VM between arbitrary native threads but only one thread can run the VM at any one time. So it provides true multi threading but it does /not/ provide concurrency.
AFAIK OS threads are capable of being assigned to multiple cores, thus offering real concurrency, so I presume your native threads are not OS threads. So the VM runs on one thread but can communicate with other threads? Does that apply to the multithreaded VM?
Hi Dimitris,
On Jan 18, 2016, at 4:11 AM, Dimitris Chloupis kilon.alios@gmail.com wrote:
AFAIK OS threads are capable of being assigned to multiple cores, thus offering real concurrency, so I presume your native threads are not OS threads. So the VM runs on one thread but can communicate with other threads? Does that apply to the multithreaded VM?
As I said, these /are/ native OS threads. The VM arranges that only one native thread can run the VM at any one time. The owning thread locks out other threads. The VM makes an FFI callout on one OS thread, which unlocks the VM just before calling out, and if that callout takes enough time for the heartbeat thread to beat, then another thread will be released to try to lock and run the VM. If it wins the race, the callout thread will block when trying to reacquire the VM to return its result.
So this is real multi threading without concurrency as I said.
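A toy C model of that ownership scheme, using a pthread mutex as a stand-in for VM ownership. Everything here (vm_acquire, vm_release, call_out, slow_double, run_workers) is a hypothetical name for illustration; the real VM's heartbeat and thread scheduling are not modelled at all.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

static pthread_mutex_t vm_lock = PTHREAD_MUTEX_INITIALIZER;
static int  owners = 0;    /* threads currently "running the VM"; guarded by vm_lock */
static long vm_total = 0;  /* stand-in for VM state; guarded by vm_lock */

static void vm_acquire(void)
{
    pthread_mutex_lock(&vm_lock);
    owners++;
    assert(owners == 1);   /* only one thread may own the VM at a time */
}

static void vm_release(void)
{
    assert(owners == 1);
    owners--;
    pthread_mutex_unlock(&vm_lock);
}

/* A slow "foreign" function: sleeps 20 ms, then answers twice its argument. */
static long slow_double(long x)
{
    struct timespec delay = { 0, 20 * 1000 * 1000 };
    nanosleep(&delay, NULL);
    return 2 * x;
}

/* Release ownership just before calling out, reacquire it to return the result. */
static long call_out(long (*fn)(long), long arg)
{
    long result;

    vm_release();
    result = fn(arg);      /* other threads may own the VM meanwhile */
    vm_acquire();
    return result;
}

static void *worker(void *arg)
{
    long n = *(long *)arg;
    long r;

    vm_acquire();
    r = call_out(slow_double, n);
    vm_total += r;         /* safe: this thread owns the VM again */
    vm_release();
    return NULL;
}

/* Run `count` workers (at most 16); answers the accumulated total or -1. */
long run_workers(int count)
{
    pthread_t threads[16];
    long args[16];
    int started = 0;

    assert(count > 0 && count <= 16);
    vm_total = 0;
    for (int i = 0; i < count; i++) {
        args[i] = i + 1;
        if (pthread_create(&threads[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return started == count ? vm_total : -1;
}
```

The blocking calls overlap in time, yet the assertions in vm_acquire and vm_release hold throughout: several native threads, but never more than one inside the "VM".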
Ah OK, now it's crystal clear, thanks for the explanation.
Hi Eliot,
Ok, I started from the first step and I am able to write a C program that exports the STON I need. This is all hardcoded for the moment and not autogenerated. The question I have now is: do you think I should use "cmake" for managing the compilation of this C program (later to be autogenerated)? If not, what other choices do I have?
Thanks!
On Mon, Jan 18, 2016 at 10:16 AM, Dimitris Chloupis kilon.alios@gmail.com wrote:
ah ok now its crystal clear thanks for the explanation.
On Mon, Jan 18, 2016 at 3:11 PM Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Dimitris,
On Jan 18, 2016, at 4:11 AM, Dimitris Chloupis kilon.alios@gmail.com wrote:
AFAIK OS thread are capable of being assigned to multiple cores thus offering real concurency, thus I presume your native threads are not OS threads. So the VM run on one thread but can communicate with other threads ? Does that apply to multithreaded VM ?
As I said these /are/ native OS threads. The VM arranges that only one native thread can run the VM at any one time. The owning thread locks out other threads. The VM makes an FFI call out on one OS thread which unlocks the VM just before calling out and if that callout takes enough time for the heartbeat thread to beat then another thread will be released to try and lock and run the VM. If it wins the race the other call out thread will block when trying to rhea quire the VM to return it's result.
So this is real multi threading without concurrency as I said.
On Mon, Jan 18, 2016 at 2:02 PM Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Dimitris,
On Jan 18, 2016, at 3:29 AM, Dimitris Chloupis kilon.alios@gmail.com wrote:
so I assume that means callbacks from inside C threads works fine which make it more than enough at least for now.
C threads us a misnomer. Callbacks from native threads that are currently calling out work. I /think/ callbacks from native threads that have previously called out but are not currently calling out work, but am not sure. Callbacks from native threads the VM has not seen before don't yet work; the VM doesn't service them.
Does that mean that the VM will implement a real threading mechanism ?
This is a mechanism that allows one to freely share the VM between arbitrary native threads but only one thread can run the VM at any one time. So it provides true multi threading but it does /not/ provide concurrency.
On Mon, Jan 18, 2016 at 1:11 PM Craig Latta craig@netjam.org wrote:
...when Craig Latta tried to use [Alien FFI] late last year it worked up to a point. The thing that didn't work was callbacks from foreign threads. So it looks like the core threading code is not too far away from working.
(Yes, it seemed close enough that I spent several hours debugging, trying to get it the rest of the way. I ran out of time, so I wrote a wrapper C library around the one I wanted to use, with threaded C callback functions that signalled Smalltalk semaphores on which my synchronous-FFI Smalltalk process waited. A hack, but it worked fine and was simple.)
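The shape of that workaround can be sketched as follows. This is an illustration in Python rather than the actual wrapper library: `foreign_callback`, `library_worker` and `wait_for_events` are hypothetical names, a `threading.Semaphore` plays the role of the Smalltalk semaphore, and a queue carries the callback data across.

```python
import queue
import threading

events = queue.Queue()            # data handed over by the foreign thread
ready = threading.Semaphore(0)    # plays the role of the Smalltalk semaphore

def foreign_callback(value):
    """Runs on the library's own thread; it never enters the VM."""
    events.put(value)
    ready.release()               # signal the waiting process

def library_worker():
    """Stand-in for the wrapped C library firing three callbacks."""
    for i in range(3):
        foreign_callback(i * i)

def wait_for_events(count):
    """VM side: wait on the semaphore, then pick up each event."""
    handled = []
    for _ in range(count):
        ready.acquire()
        handled.append(events.get())
    return handled

worker = threading.Thread(target=library_worker)
worker.start()
handled = wait_for_events(3)
worker.join()
print(handled)   # prints [0, 1, 4]
```

The foreign thread only signals and hands over data; all real work happens on the waiting side, which is why the VM never has to service a callback from a thread it has not seen.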
-C
-- Craig Latta netjam.org +31 6 2757 7177 (SMS ok)
- 1 415 287 3547 (no SMS)
On Wed, Jan 20, 2016 at 10:55 AM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
Hi Eliot,
Ok, I started from the first step and I am able to type a C program that would export the STON I need. This is all hardcoded for the moment and not autogenerated. The question I have now is....do you think I should use "cmake" for managing the compilation of this C program (later to be autogenerated) ??? If not, what other choice I have?
Cmake is overkill. You only have to compile one file with very simple flags. So for now use the old OSProcess (or compile manually) but... once you have your own OSProcess then the right way is to compile it using your new OSProcess. You have to require that the user put the C compiler in their path, but everything else you have in the pragmas. e.g. you're only doing
cc -m32 -o MYSQLInterface.mac32 MYSQLInterface.mac32.c
./MYSQLInterface.mac32
Thanks!
-- Mariano http://marianopeck.wordpress.com
On Wed, Jan 20, 2016 at 5:33 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
On Wed, Jan 20, 2016 at 10:55 AM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
Hi Eliot,
Ok, I started from the first step and I am able to type a C program that would export the STON I need. This is all hardcoded for the moment and not autogenerated. The question I have now is....do you think I should use "cmake" for managing the compilation of this C program (later to be autogenerated) ??? If not, what other choice I have?
Cmake is overkill. You only have to compile one file with very simple flags. So for now use the old OSProcess (or compile manually) but... once you have your own OSProcess then the right way is to compile it using your new OSProcess. You have to require that the user put the C compiler in their path, but everything else you have in the pragmas. e.g. you're only doing
cc -m32 -o MYSQLInterface.mac32 MYSQLInterface.mac32.c
./MYSQLInterface.mac32
Thanks Eliot,
I had the same feeling. Thanks for confirming.
BTW, let me ask... I have already started coding this, so I have to give a name to the project so that I can create a repo and start committing :) (even if in the end this is merged into the FFI package or whatever). Do you have a good name in mind? I would like FFI in its name, but not SharedPool (that's a low-level implementation detail). I don't want to couple it to "Constants" either, as it may help us with other stuff like sizeof() etc. I guess it should be something like FFIPreprocessorInfoExtractor.
BTW2: Yes, I have my own "OSProcess" working, called OSSubprocess: https://github.com/marianopeck/OSSubprocess/
Thanks!!
-- _,,,^..^,,,_ best, Eliot
On Thu, Jan 21, 2016 at 9:07 AM, Mariano Martinez Peck marianopeck@gmail.com wrote:
On Wed, Jan 20, 2016 at 5:33 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
On Wed, Jan 20, 2016 at 10:55 AM, Mariano Martinez Peck marianopeck@gmail.com wrote:
Hi Eliot,
Ok, I started from the first step and I am able to type a C program that would export the STON I need. This is all hardcoded for the moment and not autogenerated. The question I have now is....do you think I should use "cmake" for managing the compilation of this C program (later to be autogenerated) ??? If not, what other choice I have?
Cmake is overkill. You only have to compile one file with very simple flags. So for now use the old OSProcess (or compile manually) but... once you have your own OSProcess then the right way is to compile it using your new OSProcess. You have to require that the user put the C compiler in their path, but everything else you have in the pragmas. e.g. you're only doing
cc -m32 -o MYSQLInterface.mac32 MYSQLInterface.mac32.c
./MYSQLInterface.mac32
Thanks Eliot,
I had the same feeling. Thanks for confirming.
BTW, let me ask... I have already started coding this, so I have to give a name to the project so that I can create a repo and start committing :) (even if in the end this is merged into the FFI package or whatever). Do you have a good name in mind? I would like FFI in its name, but not SharedPool (that's a low-level implementation detail). I don't want to couple it to "Constants" either, as it may help us with other stuff like sizeof() etc. I guess it should be something like FFIPreprocessorInfoExtractor.
How about:
* FFIDefines
* FFIHeaders
* FFIDefs
* FFIDecl
* FFIDeclarations
or do like other language FFIs and distinguish that this is a "C" interface...
* CDeclaration
* FFICDecl
btw, only slightly off topic... I just read something very interesting about using the host** finalization and garbage collector to free guest** malloc'd memory to reduce memory leaks. Scroll down to "Memory management: let the garbage collector do the work" http://book.realworldhaskell.org/read/interfacing-with-c-the-ffi.html (**are these reasonable terms to use here?) Note: I don't actually know Haskell. I only got distracted and bumped into this while looking for other info.
Lua seems to have something similar... cdata = ffi.gc(cdata, finalizer) http://luajit.org/ext_ffi_api.html
https://wiki.haskell.org/HSFFIG/Tutorial * 4.1 Naming conventions
https://wiki.haskell.org/GHC/Using_the_FFI
On 21 Jan 2016, at 2:56 , Ben Coman btc@openInWorld.com wrote:
btw, only slightly off topic... I just read something very interesting about using the host** finalization and garbage collector to free guest** malloc'd memory to reduce memory leaks. Scroll down to "Memory management: let the garbage collector do the work" http://book.realworldhaskell.org/read/interfacing-with-c-the-ffi.html (**are these reasonable terms to use here?) Note: I don't actually know Haskell. I only got distracted and bumped into this while looking for other info.
Lua seems to have something similar... cdata = ffi.gc(cdata, finalizer) http://luajit.org/ext_ffi_api.html
https://wiki.haskell.org/HSFFIG/Tutorial
- 4.1 Naming conventions
https://wiki.haskell.org/GHC/Using_the_FFI
http://luajit.org/ext_ffi_api.html
https://colberg.org/gcc-lua-cdecl/ffi-cdecl.html
NB-FFI does this too, see NBExternalResourceExecutor.
Cheers, Henry
On Sat, Jan 16, 2016 at 11:00:39AM -0300, Mariano Martinez Peck wrote:
So.... what if we take a simpler (probably not better) approach and we consider the "c program that exports constants and sizes" a VM Plugin?
I think that is a good approach. It gives you what you need, and you can probably do it in slang without the need for any external C code.
Dave
I think the most important thing is to create something that is easy to maintain. Also, as much as I appreciate doing as much as possible in the image, I don't underestimate the elegance of C for low-level stuff, nor the fact that it comes with a great deal of documentation on the subject. Doesn't C already offer a solution to this problem?
On Sat, Jan 16, 2016 at 6:28 PM, David T. Lewis lewis@mail.msen.com wrote:
On Sat, Jan 16, 2016 at 11:00:39AM -0300, Mariano Martinez Peck wrote:
So.... what if we take a simpler (probably not better) approach and we consider the "c program that exports constants and sizes" a VM Plugin?
I think that is a good approach. It gives you what you need, and you can probably do it in slang without the need for any external C code.
Hi Dave,
Thanks! I guess I will only be able to give it a try if the rest also agree that it would be worthwhile, and if there is some green light for integration.
BTW, I was investigating a bit over internet and it seems that it may be easier with X Macros?? See this for example:
http://stackoverflow.com/questions/264269/what-is-a-good-reference-documenti...
Cheers!
Hi Mariano,
On Sat, Jan 16, 2016 at 6:00 AM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
Hi all,
Sorry for reviving an old thread but I thought it was better to continue the discussion here because of the context. As you may have read, the other day I released a first approach to a subset of OSProcess based on FFI (posix_spawn() family of functions):
https://github.com/marianopeck/OSSubprocess
And with that in mind, I wanted to share a few things with you. The 2 main problems I found with implementing this with FFI were:
1) We have all already discussed and agreed that fork+exec cannot be done in separate FFI calls. So at the very minimum you need either a plugin method that does the fork()+exec() OR to wrap a library function like posix_spawn().
2) The other main problem, as you all said (and mostly Nicolas), is the preprocessor (constants, macros, etc).
With all that said, I was able to get my stuff working. However, I am still using some primitives of the OSProcess plugin because of 2).
I read Eliot's idea and what I don't like is the need for a C compiler on the user's machine. I think that's a heavy constraint. Then Igor suggested that WE (developers and maintainers of a certain tool) are the ones who compile the little C program to extract constant values etc and then WE provide, as part of our source code, some packages with a SharedPool per platform/OS. And Igor's approach looked a bit better to me.
You misunderstand the proposal. The C compiler is needed /only when changing the set of constants/, i.e. when /developing/ the interface. The C compiler is /not/ needed when deploying.
The idea is to a) at development time, e.g. when a new variable is added to a SharedPool containing platform constants, a C program is autogenerated that outputs in some format a description of the names and values of all the constants defined in the pool. One convenient notation is e.g. STON. For the purposes of this discussion let's assume we're using STON, but any format the image can parse (or indeed a shared object the image can load on the current platform) will do. The output of the autogenerated C program would be called something like <SharedPoolName>.<PlatformName>.ston, e.g. UnixConstants.MacOSX64.ston or UnixConstants.Linux32.ston. The ston files can easily be parsed by facilities in the Smalltalk image.
b) when deploying the system to a set of platforms one includes all the relevant platform-specific ston files.
c) at startup the image checks its current platform. If the platform is the same one that it was saved on, no action is taken. But if the platform has changed then the relevant ston file is selected, parsed, and the values for the variables in the shared pool updated to reflect the values of the current platform.
So the C compiler is only needed when developing the interface, not when deploying it.
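Step a) can be sketched as a small generator. The following is an illustration in Python rather than Smalltalk, and everything in it is an assumption made for the example: the function names `constants_filename` and `generate_c_program`, the choice to cast every value to `long`, and the flat STON-style map `{'NAME':value,...}` as output format.

```python
def constants_filename(pool_name, platform_name):
    """File name in the <SharedPoolName>.<PlatformName>.ston scheme."""
    return f"{pool_name}.{platform_name}.ston"

def generate_c_program(headers, constants):
    """Return C source that prints each named constant as 'NAME':value."""
    lines = ["#include <stdio.h>"]
    lines += [f"#include <{header}>" for header in headers]
    lines += ["", "int main(void)", "{", '    printf("{");']
    for index, name in enumerate(constants):
        separator = "," if index else ""
        # The C preprocessor resolves NAME; the image never sees the header.
        lines.append(
            f'    printf("{separator}\'{name}\':%ld", (long){name});'
        )
    lines += ['    printf("}\\n");', "    return 0;", "}"]
    return "\n".join(lines) + "\n"

source = generate_c_program(["signal.h", "fcntl.h"], ["SIGKILL", "O_NONBLOCK"])
print(constants_filename("UnixConstants", "Linux32"))
print(source)
```

Compiling and running the generated file on each target platform, and saving its output under the name printed first, gives the per-platform files of step b). The header list and constant names would come from the shared pool's definition; here they are passed in by hand.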
Then Nicolas made the point that if we plan to manage all that complexity at the image level it may become hell too.
So.... what if we take a simpler (probably not better) approach and we consider the "C program that exports constants and sizes" a VM plugin? Let's say we have a UnixPreprocessorPlugin (that would work for OSX, Linux and other Unixes, I imagine, for the time being) which provides an exported function that answers an array of arrays. For each constant, we include the name of the constant, the value, and the sizeof(). Then from the image side, we simply do one FFI call, we get the large array and we adapt it to a SharedPool or whatever kind of object represents that info.
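This is what I suggested in the first place: that what is autogenerated is a shared object (be it a plugin or a dll doesn't matter, it is machine code generated by a C compiler from an autogenerated C program compiled with the platform's C compiler) that can be loaded at run-time and interrogated to fetch the values of a set of variables. But I think that the textual notation suggested above is simpler. The text files are easier to distribute and change. Shared objects and plugins have a habit of going stale, and there needs to be metadata in there to describe the set of constants etc, which is tricky to generate and parse because it is binary (pointer sizes, etc, etc). Instead a simple textual format should be much more robust. One could even edit it by hand to add new constants. It would be easy to make the textual file a versioned file. Etc, etc.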
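To show how little machinery the textual route needs, here is a sketch of step c), the startup check, again in Python as a stand-in for image-side code. The names `parse_constants` and `update_pool` are hypothetical, the parser handles only a flat `{'NAME':integer,...}` subset of STON, the files are held in a dictionary instead of on disk, and the numeric values are illustrative.

```python
import re

# Matches one 'NAME':integer entry of the flat map.
ENTRY = re.compile(r"'([A-Za-z_][A-Za-z0-9_]*)'\s*:\s*(-?\d+)")

def parse_constants(text):
    """Parse {'NAME':value,...} into a dict of ints."""
    body = text.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("expected a {...} map")
    return {name: int(value) for name, value in ENTRY.findall(body)}

def update_pool(pool, saved_platform, current_platform, files):
    """Reload the pool only if the image moved to another platform."""
    if current_platform == saved_platform:
        return pool                      # same platform: no action taken
    text = files[f"UnixConstants.{current_platform}.ston"]
    updated = dict(pool)
    updated.update(parse_constants(text))
    return updated

files = {
    "UnixConstants.MacOSX64.ston": "{'O_NONBLOCK':4,'WNOHANG':1}",
    "UnixConstants.Linux32.ston": "{'O_NONBLOCK':2048,'WNOHANG':1}",
}
pool = parse_constants(files["UnixConstants.MacOSX64.ston"])
pool = update_pool(pool, "MacOSX64", "Linux32", files)
print(pool["O_NONBLOCK"])   # prints 2048
```

Adding a constant by hand means adding one `'NAME':value` pair to a text file, with no binary metadata to keep in sync.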
I know that different users will need different constants. But let's say the infrastructure (plugin etc) is already done. And let's say I am a user who wants to build something with FFI and I need some constants that I see are not defined. Then I can simply add the ones I need to the plugin, and the next VM release will have those. If Cog gets moved to GitHub, then this is even easier. Everybody can do a PR with the constants they need. And in fact, if we have the infrastructure in place, I think that if each of us spends half an hour, we may have almost everything we need.
For example, I can add myself all those for signals (to use kill() from FFI), all those from fcntl (to make non-blocking pipes), all those from the wait()/waitpid() family (so that I can do a waitpid() with WNOHANG), etc etc etc.
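For comparison, here is what those three families of constants are used for, sketched in Python, which already ships them in its `os`, `fcntl` and `signal` modules (Unix only). An FFI binding in the image would make the same system calls but would need the numeric values from a shared pool. The helper names are hypothetical, and the child process is simply another Python interpreter that sleeps.

```python
import fcntl
import os
import signal
import sys

def make_nonblocking(fd):
    """Set O_NONBLOCK on fd: needs F_GETFL, F_SETFL and O_NONBLOCK."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

def read_would_block(fd):
    """True if reading from fd right now would block."""
    try:
        os.read(fd, 1)
        return False
    except BlockingIOError:
        return True

read_fd, write_fd = os.pipe()
make_nonblocking(read_fd)
would_block = read_would_block(read_fd)      # the pipe is empty

# Spawn a child with posix_spawn(), poll it, then kill it.
argv = [sys.executable, "-c", "import time; time.sleep(30)"]
pid = os.posix_spawn(sys.executable, argv, os.environ)
polled = os.waitpid(pid, os.WNOHANG)         # needs WNOHANG; does not block
os.kill(pid, signal.SIGKILL)                 # needs SIGKILL
_, status = os.waitpid(pid, 0)               # reap the child
killed = os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL

os.close(read_fd)
os.close(write_fd)
print(would_block, polled, killed)
```

Every flag used here (`O_NONBLOCK`, `WNOHANG`, `SIGKILL`) is a preprocessor constant in C, and `WIFSIGNALED`/`WTERMSIG` are macros, which is exactly the part that cannot be reached through plain FFI calls.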
I know it's not the best approach but it's something that could be done very easily and would allow A LOT of stuff to be moved to FFI that today cannot be, just because we have no access to preprocessor constants or sizeof() (to know how much to allocate). I also know this won't cover macros and other stuff. But still.
If you think this is a good idea, I can spend the time to do it.
Cheers,
On Thu, May 10, 2012 at 10:09 AM, Nick Ager nick.ager@gmail.com wrote:
<snip> Well, like opendbx, maybe because opengl has quite standard interface... </snip>
and
<snip> It's not that it's not doable, it's that we gonna reinvent gaz plant and it gonna be so boring... I'd like to see a proof of concept, even if we restrict to libc, libm, kernel.dll, msvcrt.dll ... </snip>
<snip> Is the unix style select() ubiquitous or should I use WaitForMultipleObject() on Windows? Are specifications of read/write stream implementations machine independent (bsd/sysv/others...) </snip>
Perhaps *a* way forward is to try to find existing projects which have already created cross-platform abstractions for platform specific functionality. Then we can use FFI to access that interface in a similar way to OpenGL and OpenDBX. For example NodeJs works across unixes - perhaps they have a useful cross-platform abstraction, boost has abstractions of IPC etc
Nick
-- Mariano http://marianopeck.wordpress.com
On Sat, Jan 16, 2016 at 11:02 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
You misunderstand the proposal.
I think I did. But let me confirm that below ;)
The C compiler is needed /only when changing the set of constants/, i.e. when /developing/ the interface. The C compiler is /not/ needed when deploying.
The idea is to a) at development time, e.g. when a new variable is added to a SharedPool containing platform constants, a C program is autogenerated that outputs in some format a description of the names and values of all the constants defined in the pool. One convenient notation is e.g. STON. For the purposes of this discussion let's assume we're using STON, but any format the image can parse (or indeed a shared object the image can load on the current platform) will do. The output of the autogenerated C program would be called something like <SharedPoolName>.<PlatformName>.ston, e.g. UnixConstants.MacOSX64.ston or UnixConstants.Linux32.ston. The ston files can easily be parsed by facilities in the Smalltalk image.
b) when deploying the system to a set of platforms one includes all the relevant platform-specific ston files.
OK. But let me ask something. Below you said "be it a plugin or a dll doesn't matter". To autogenerate the C program, I must know which header files to include for each platform and probably a few other things. For example, besides exporting the value, I would also like to export the sizeof(). And that depends on how the VM was compiled, right? So... my question is: if such autogenerated C code could be part of the VM build (taking into account all the settings assumed when building), can't I reuse the knowledge the VM build already has? Like which header files to include, whether it was compiled for 32 bits or 64 bits, which C compiler to use, etc.
What I mean is if it would be easier if I take the SharedPool at VM building time, and from there I autogenerate (and run) the C code that would generate the output. Then, when we "deploy" the VM, we can deploy it with relevant platform specific ston files as you said.
c) at startup the image checks its current platform. If the platform is the same one that it was saved on, no action is taken. But if the platform has changed then the relevant ston file is selected, parsed, and the values for the variables in the shared pool updated to reflect the values of the current platform.
So the C compiler is only needed when developing the interface, not when deploying it.
OK
Then Nicolas made a point that if we plan to manage all that complexity at the image level it may become a hell too.
So.... what if we take a simpler (probably not better) approach and we consider the "c program that exports constants and sizes" a VM Plugin? Let's say we have a UnixPreprocessorPlugin (that would work for OSX, Linux and other's Unix I imagine for the time being) which provides a function (that is exported) which answers an array of arrays. For each constant, we include the name of the constant, the value, and the sizeof(). Then from image side, we simply do one FFI call, we get the large array and we adapt it to a SharedPool or whatever kind of object representing that info.
This is what I suggested in the first place: that what is autogenerated is a shared object (be it a plugin or a dll doesn't matter, it is machine code generated by a C compiler from an autogenerated C program compiled with the platform's C compiler) that can be loaded at run-time and interrogated to fetch the values of a set of variables
OK, got it. But still, it would be easier if the "platform" in this case is the "machine where we build the VM we will then distribute", right? I mean, I would like to put this in the CI jobs that automatically build the VM, rather than building for each platform myself.
*I mean, my main doubt is if this job of autogenerating C code, compile it, run it, export text file, and distribute text file with the VM, could be done as part of the VM building. *
. But I think that the textual notation suggested above is simpler. The test files are easier to distribute and change. Shared objects and plugins have a habit of going stale, and there needs to be metadata in there to describe the set of constants etc, which is tricky to generate and parse because it is binary (pointer sizes, etc, etc). Instead a simple textual format should be much more robust. One could even edit by hand to add new constants. It would be easy to make the textual file a versioned file. Etc, etc.
OK. Got it. And do you think using X Macros for the autogenerated C (from the SharedPool) is a good idea? And then I simply write a text file out of it.
I know that different users will need different constants. But let's say the infrastructure (plugin etc) is already done. And let's say I am a user who wants to build something with FFI and I need some constants that I see are not defined. Then I can simply add the ones I need to the plugin, and the next VM release will have them. If Cog gets moved to Github, then this is even easier. Everybody can do a PR with the constants they need. And in fact, if we have the infrastructure in place, I think that if each of us spends half an hour, we may have almost everything we need.
For example, I can add all those for signals myself (to use kill() from FFI), all those from fcntl (to make non-blocking pipes), all those from the wait()/waitpid() family (so that I can do a waitpid() with WNOHANG), etc.
I know it's not the best approach but it's something that could be done very easily and would allow A LOT of stuff to be moved to FFI that today cannot be, just because we have no access to preprocessor constants or sizeof() (to know how much to allocate). I also know this won't cover macros and other stuff. But still.
If you think this is a good idea, I can spend the time to do it.
Cheers,
On Thu, May 10, 2012 at 10:09 AM, Nick Ager nick.ager@gmail.com wrote:
<snip> Well, like opendbx, maybe because opengl has quite standard interface... </snip>
and
<snip> It's not that it's not doable, it's that we're gonna reinvent a gas plant and it's gonna be so boring... I'd like to see a proof of concept, even if we restrict to libc, libm, kernel.dll, msvcrt.dll ... </snip>
<snip> Is the unix style select() ubiquitous or should I use WaitForMultipleObjects() on Windows? Are the specifications of read/write stream implementations machine independent (bsd/sysv/others...) </snip>
Perhaps *a* way forward is to try to find existing projects which have already created cross-platform abstractions for platform specific functionality. Then we can use FFI to access that interface in a similar way to OpenGL and OpenDBX. For example NodeJs works across unixes - perhaps they have a useful cross-platform abstraction, boost has abstractions of IPC etc
Nick
-- Mariano http://marianopeck.wordpress.com
-- _,,,^..^,,,_ best, Eliot
Hi Mariano,
On Sat, Jan 16, 2016 at 6:25 PM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
On Sat, Jan 16, 2016 at 11:02 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
On Sat, Jan 16, 2016 at 6:00 AM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
Hi all,
Sorry for reviving an old thread but I thought it was better to continue the discussion here because of the context. As you may have read, the other day I released a first approach to a subset of OSProcess based on FFI (the posix_spawn() family of functions):
https://github.com/marianopeck/OSSubprocess
And with that in mind, I wanted to share a few things with you. The 2 main problems I found with implementing this with FFI were:
1) We have all already discussed and agreed that fork+exec cannot be done in separate FFI calls. So at the very minimum you need either a plugin method that does the fork()+exec() OR to wrap a lib like posix_spawn()
2) The other main problem is, as you all said (and mostly Nicolas), the problems with the preprocessor (constants, macros, etc).
With all that said, I was able to get my stuff working. However, I am still using some primitives of OSProcess plugin because of 2).
I read Eliot's idea and what I don't like is the need for a C compiler on the user's machine. I think that's a heavy constraint. Then Igor suggested that WE (developers and maintainers of a certain tool) are the ones who compile the little C program to extract constant values etc and then WE provide, as part of our source code, some packages with a SharedPool per platform/OS. And Igor's approach looked a bit better to me.
You misunderstand the proposal.
I think I did. But let me confirm that below ;)
The C compiler is needed /only when changing the set of constants/, i.e. when /developing/ the interface. The C compiler is /not/ needed when deploying.
The idea is: a) at development time, e.g. when a new variable is added to a SharedPool containing platform constants, a C program is autogenerated that outputs in some format a description of the names and values of all the constants defined in the pool. One convenient notation is e.g. STON. For the purposes of this discussion let's assume we're using ston, but any format the image can parse (or indeed a shared object the image can load on the current platform) will do. The output of the autogenerated C program would be called something like <SharedPoolName>.<PlatformName>.ston, e.g. UnixConstants.MacOSX64.ston or UnixConstants.Linux32.ston. The ston files can easily be parsed by facilities in the Smalltalk image.
b) when deploying the system to a set of platforms one includes all the relevant platform-specific ston files.
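As a rough illustration of step a), the core of such an autogenerated program might look like the sketch below. Everything named here (`struct constant`, `pool`, `format_constant`, `emit_pool`) is hypothetical, the three constants are picked only because standard C guarantees them in `<stdio.h>`, and the `#NAME : value` map layout is an assumption about what a STON reader would accept, not something verified against the STON parser. The generated `main` is left out; it would just call `emit_pool(stdout)`.

```c
#include <assert.h>
#include <stdio.h>

/* One exported constant: its C name and its value on this platform. */
struct constant {
    const char *name;
    long value;
};

/* Hypothetical pool contents. A real generator would list the pool's
   class variables here and #include the headers that define them. */
static const struct constant pool[] = {
    { "EOF", EOF },
    { "BUFSIZ", BUFSIZ },
    { "FOPEN_MAX", FOPEN_MAX },
};

/* Format one entry as "#NAME : value" into buf; returns snprintf's result. */
int format_constant(char *buf, size_t size, const struct constant *c)
{
    assert(buf != NULL && c != NULL);
    return snprintf(buf, size, "#%s : %ld", c->name, c->value);
}

/* Write the whole pool as a STON-like map, one entry per line. */
void emit_pool(FILE *out)
{
    size_t count = sizeof pool / sizeof pool[0];

    fputs("{\n", out);
    for (size_t i = 0; i < count; i++) {
        char line[128];
        format_constant(line, sizeof line, &pool[i]);
        /* Entries are comma-separated; the last one has no trailing comma. */
        fprintf(out, "  %s%s\n", line, i + 1 < count ? "," : "");
    }
    fputs("}\n", out);
}
```

Redirecting the program's stdout to something like UnixConstants.Linux32.ston would then give the file described above. The values come from whatever headers and flags the compiler saw, which is why the program has to be built once per platform and word size.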
OK. But let me ask something. Below you said "be it a plugin or a dll doesn't matter". To autogenerate the C program, I must know which header files to include for each platform and probably a few other things. For example, besides exporting the value, I would also like to export the sizeof(). And that depends on how the VM was compiled, right? So... my question is: if such autogenerated C code could be part of the VM build (considering all the settings assumed when building), can't I reuse the knowledge the VM already has? Like which header files to include, whether it was compiled for 32 or 64 bits, which C compiler to use, etc.
I actually said that using text is easier than a dll. So I'm saying autogenerate a C program that outputs name-value pairs in some convenient textual representation, e.g. ston. But answering your question...
The knowledge in the VM as to what header files are included *applies only to the include files the VM uses*. The VM uses a subset of the platform. It doesn't for example include any headers that define a database interface. It doesn't include header files that define the interface to a UI toolkit such as GTK. Etc, etc. So in fact the VM *doesn't* include the knowledge one needs to determine the set of include files for an arbitrary FFI interface. And even so, the include files that it does use are in the VM's platform source files, and that information is not readily accessible.
Let me summarise. No, the VM cannot be used to determine the set of include files needed to generate constants used in an arbitrary FFI interface.
What I mean is: would it be easier if I took the SharedPool at VM build time and from there autogenerated (and ran) the C code that generates the output? Then, when we "deploy" the VM, we can deploy it with the relevant platform-specific ston files, as you said.
No. The VM is something that provides an FFI. It doesn't *define* an FFI. One must be able to develop an FFI interface without needing to rebuild the VM. So computing the values of constants should be *separate* from building a VM. Now let me give you more of an example.
Let's say we define a subclass of SharedPool called FFISharedPool. FFISharedPool 's job is to manage autogenerating a C file, compiling it for the platform, and organizing parsing the relevant output. Let's say we use a convention like class-side pragmas to define include files, and compiler flags. The VM provides two crucial pieces of information:
1. the platform name
2. the word size
One can't run a Mac OS VM on Linux, and one can't run a 64-bit VM on a 32-bit operating system. So taking this information from the VM accurately tells the current system what ABI (application binary interface) to use, and that's what's important in generating the right constants.
So we use these two pieces of information to index the method pragmas that tell us what specific files to include.
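To make the ABI point concrete, here is a small sketch of how a generated C program could report the same two facts from its own side, so the image could check that a ston file really matches the VM it is running on. The function names and the tag strings are hypothetical; in the scheme described above the authoritative values come from the VM, not from this code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Word size in bytes of the ABI this file was compiled for
   (typically 4 with -m32 and 8 with -m64). */
size_t word_size(void)
{
    return sizeof(void *);
}

/* Coarse platform tag from predefined compiler macros. The tag strings
   are illustrative only; the VM's own platform name is the real key. */
const char *platform_tag(void)
{
#if defined(__APPLE__)
    return "Mac OS";
#elif defined(__linux__)
    return "Linux";
#elif defined(_WIN32)
    return "Win32";
#else
    return "unknown";
#endif
}

/* Print the ABI facts that make constants and sizes differ between builds. */
void emit_abi(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "platform: %s\n", platform_tag());
    fprintf(out, "wordSize: %zu\n", word_size());
    fprintf(out, "sizeof(long): %zu\n", sizeof(long));
}
```

Compiling the same source with different C flags changes what `word_size()` and `sizeof(long)` report, which is the reason the pragmas are indexed by platform name and word size together.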
Let's imagine we subclass FFISharedPool to add a shared pool for constants for an SQL database. We might have a class declaration like
    FFISharedPool subclass: #MYSQLInterface
        instanceVariableNames: ''
        classVariableNames: 'MYSQL_DEFAULT_AUTH MYSQL_ENABLE_CLEARTEXT_PLUGIN MYSQL_INIT_COMMAND MYSQL_OPT_BIND MYSQL_OPT_CAN_HANDLE_EXPIRED_PASSWORDS MYSQL_OPT_COMPRESS MYSQL_OPT_CONNECT_ATTR_DELETE MYSQL_OPT_CONNECT_ATTR_RESET'
        poolDictionaries: ''
        category: 'MYSQLInterface-Pools'
The job of FFISharedPool is to compute the right values for the class variables on every platform we want to deploy the MYSQL interface on.
So we need to know the relevant include files and C flags for each platform/word-size combination. A few of them might look like
MYSQLInterface class methods for platform information

    mac32
        "I describe the include files and C flags to use when developing a 32-bit MYSQL FFI interface on Mac OS X"
        <platformName: 'Mac OS' wordSize: 4>
        <cFlags: #('-m32') includeFiles: #('/opt/mysql/include32')>
        ^self "all the info is in the pragmas"

    mac64
        "I describe the include files and C flags to use when developing a 64-bit MYSQL FFI interface on Mac OS X"
        <platformName: 'Mac OS' wordSize: 8>
        <cFlags: #('-m64') includeFiles: #('/opt/mysql/include64')>
The above might cause FFISharedPool to autogenerate files called MYSQLInterface.mac32.c & MYSQLInterface.mac64.c. And these, when run, might output ston notation to MYSQLInterface.mac32.ston & MYSQLInterface.mac64.ston (or maybe to stdout which has to be redirected to MYSQLInterface.mac32.ston; whatever).
Now, you might use pragmas, or you might answer a Dictionary instance. Whatever style pleases you and seems convenient and readable. But these methods define the necessary metadata (C flags, include paths, and ...?) for FFISharedPool to autogenerate the C program that, when compiled with the supplied C flags and run on the current platform, outputs the values for the constants the shared pool wants to define.
You can get fancy and have FFISharedPool autogenerate the C programs whenever one adds or removes a constant name. Or you can require the programmer run something, e.g. MYSQLInterface generateInterfaces. It's really nice if FFISharedPool submits the file to the C compiler automatically, but this can only work for e.g. 32 & 64 bit versions on a single platform. You have to compile the autogenerated program on the relevant platform, with the necessary libraries and include files installed.
You could imagine a set of servers for different platforms so one could submit the autogenerated program for compilation and execution on each platform. That's a facility I'd make easy to implement. I could imagine that a programmer whose company develops an FFI interface and deploys it on a number of platforms would love to be able to automate compiling and running the relevant autogenerated code on a set of servers. I could imagine the Pharo community providing a set of servers upon which lots of software is installed for precisely this purpose. That means that people could develop FFI interfaces without even having to have the C compiler installed on their platform.
You could also add a C parser to FFISharedPool that parses the post-preprocessed code and extracts function declarations. But the important thing is autogenerating the C program so that it generates easily parsable output containing the values for the constants. You can extend the system in interesting ways once you have this core functionality implemented.
So once the program is autogenerated and compiled for the current platform, it is run and its output collected in a file whose name can be recognised by FFISharedPool.
Now the class side of FFISharedPool might be declared as
    FFISharedPool class instanceVariableNames: 'platformName wordSize'
and on start-up FFISharedPool could examine its subclasses, and for each whose platformName & wordSize do not match the current platform, search for all the matching FOOInterface.plat.ston files, parse them and update the subclasses' variables, and update that pool's platformName & wordSize. It could emit a warning on the Transcript or stdout (headful vs headless) indicating which subclasses it couldn't find the relevant FOOInterface.plat.ston files for.
But the end result is that
a) providing the system is deployed with FOOInterface.plat.ston files for each interface and platform used, a cross-platform application can be deployed *that does not require a C compiler*.
b) providing that a system's FOOInterface files have been initialized on the intended platform, a platform-specific application can be deployed for a single platform *without needing the ston files*.
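The start-up check described above boils down to a comparison and a file-name lookup. In the proposal this logic lives on the class side of FFISharedPool in the image; the C below is only a sketch of the decision, with hypothetical names (`struct pool_state`, `pool_needs_reload`, `ston_file_name`) and an assumed naming scheme that follows the MYSQLInterface.mac64.ston example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* What a pool remembers about the platform its values were computed for. */
struct pool_state {
    const char *platform_name;  /* e.g. "mac" */
    int word_size;              /* in bytes: 4 or 8 */
};

/* True when the saved values were computed for a different platform or
   word size than the one the image is starting up on. */
bool pool_needs_reload(const struct pool_state *saved,
                       const struct pool_state *current)
{
    assert(saved != NULL && current != NULL);
    return saved->word_size != current->word_size
        || strcmp(saved->platform_name, current->platform_name) != 0;
}

/* Build "<Pool>.<plat><bits>.ston", e.g. "MYSQLInterface.mac64.ston".
   Returns snprintf's result. */
int ston_file_name(char *buf, size_t size, const char *pool,
                   const struct pool_state *p)
{
    assert(buf != NULL && pool != NULL && p != NULL);
    return snprintf(buf, size, "%s.%s%d.ston",
                    pool, p->platform_name, p->word_size * 8);
}
```

When `pool_needs_reload` answers false, nothing is parsed at all, which is case b): an image already initialized on its target platform never looks for the ston files.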
Does this make more sense now?
c) at startup the image checks its current platform. If the platform is the same as the one it was saved on, no action is taken. But if the platform has changed then the relevant ston file is selected, parsed, and the values for the variables in the shared pool updated to reflect the values of the current platform.
So the C compiler is only needed when developing the interface, not when deploying it.
OK
Then Nicolas made the point that if we plan to manage all that complexity at the image level it may become hell too.
So... what if we take a simpler (probably not better) approach and consider the "C program that exports constants and sizes" a VM plugin? Let's say we have a UnixPreprocessorPlugin (which would work for OSX, Linux and, I imagine, other Unixes for the time being) that provides an exported function which answers an array of arrays. For each constant, we include the name of the constant, the value, and the sizeof(). Then from the image side, we simply do one FFI call, get the large array and adapt it to a SharedPool or whatever kind of object represents that info.
This is what I suggested in the first place. That what is autogenerated is a shared object (be it a plugin or a dll doesn't matter, it is machine code generated by a C compiler from an autogenerated C program compiled with the platform's C compiler) that can be loaded at run-time and interrogated to fetch the values of a set of variables.
OK, got it. But still, it would be easier if the "platform" in this case is the "machine where we build the VM we will then distribute", right? I mean, I would like to put this in the CI jobs that automatically build the VM, rather than building for each platform myself.
NO! For example, why would a company that has some proprietary arithmetic package implemented in its secret labs in C or C++ and accessed through the FFI want to have that code on the Pharo community's build servers?
*I mean, my main doubt is whether this job of autogenerating C code, compiling it, running it, exporting a text file, and distributing the text file with the VM could be done as part of the VM build.*
For fuck's sake. Developing an FFI is not something one does when building a VM. It is something one does when using the system. If you want to do this you *use a plugin*. The FFI is a different beast. It is to allow programmers to interface to external libraries that are *independent from the VM*.
I'm not going to answer this one again. OK?
Hi Eliot,
Thanks, much clearer now. Sometimes I am slow :) I was confused because I was only thinking of libc-like libs (very close to the kernel and very likely used by the VM). But when you gave the SQL example, I got the general case you were trying to explain. So it's clear now.
I would like to add 2 more comments:
1) Do you agree that besides the name / value it would also help to have the result of sizeof? Otherwise, I may still run into problems when I need to allocate from FFI and the size of a struct is not clear (as was my case some days ago). So in this case, each entry would be a kind of array rather than a key / value pair.
2) As for the autogenerated C file, do you think X Macros is a good idea? See http://stackoverflow.com/questions/264269/what-is-a-good-reference-documenti...
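For reference, an X Macro version of the generated file could look roughly like the sketch below. It also covers comment 1) by keeping a second list of types whose sizeof() is exported. The constants are the ones mentioned earlier in the thread (signals, fcntl, waitpid), so this assumes a POSIX system; the macro and function names (`POOL_CONSTANTS`, `POOL_TYPES`, `lookup`, `emit_all`) and the plain "name value" output layout are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* The single list of constants; add a line here to export another one. */
#define POOL_CONSTANTS(X) \
    X(SIGKILL)            \
    X(O_NONBLOCK)         \
    X(WNOHANG)

/* The single list of types whose sizeof() the image needs for allocation. */
#define POOL_TYPES(X) \
    X(pid_t)          \
    X(sigset_t)

struct entry {
    const char *name;
    long value;  /* the constant's value, or sizeof() for a type */
};

/* #c keeps the spelling of the name; (c) expands to the platform's value. */
#define AS_CONSTANT(c) { #c, (long)(c) },
#define AS_SIZE(t)     { #t, (long)sizeof(t) },

static const struct entry constants[] = { POOL_CONSTANTS(AS_CONSTANT) };
static const struct entry sizes[]     = { POOL_TYPES(AS_SIZE) };

/* Look up a name in a table; returns 1 and stores the value if found. */
int lookup(const struct entry *table, size_t count, const char *name, long *value)
{
    assert(table != NULL && name != NULL && value != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            *value = table[i].value;
            return 1;
        }
    }
    return 0;
}

/* Print "name value" lines: constants first, then sizes. */
void emit_all(FILE *out)
{
    for (size_t i = 0; i < sizeof constants / sizeof constants[0]; i++)
        fprintf(out, "%s %ld\n", constants[i].name, constants[i].value);
    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++)
        fprintf(out, "sizeof(%s) %ld\n", sizes[i].name, sizes[i].value);
}
```

The attraction is that the generator only has to write the two lists; the table and the printing code stay fixed. Whether that is worth it when the whole file is autogenerated anyway is a fair question, since the generator could just as easily write one fprintf per constant.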
Thanks,
On Sun, Jan 17, 2016 at 12:40 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Mariano,
On Sat, Jan 16, 2016 at 6:25 PM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
On Sat, Jan 16, 2016 at 11:02 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
On Sat, Jan 16, 2016 at 6:00 AM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
Hi all,
Sorry for reviving an old thread but I thought it was better to continue the discussion here because of the context. As you may have read, the other day I released a first approeach to a subset of OSProcess based on FFI (posix_spwan() family of functions):
https://github.com/marianopeck/OSSubprocess
And with that in mind, I wanted to share a few things with you. The main 2 problems I found with implementing this with FFI was:
- We have all already agree and discussed that fork+exec cannot be
done in separate FFI calls. So at the very min you need either a plugin method that does the fork()+exec() OR wrapping a lib like posix_spwan()
- The other main problem, is, as you all said (and mostly Nicolas),
is the problems with the preprocessor (constants, macros, etc).
With all that said, I was able to get my stuff working. However, I am still using some primitives of OSProcess plugin because of 2).
I read Eliot idea and what I don't like is the need of a C compiler in the user machine. I think that's a high constrain. Then Igor suggested that WE (developers and maintainers of a certain tool) are the ones that compiles the little C program to extract constant values etc and then WE provide as part of our source code, some packages with some SharedPool depending on the platform/OS. And Igor approach looked a bit better to me.
You misunderstand the proposal.
I think I did. But let me confirm that below ;)
The C compiler is needed /only when changing the set of constants/, i.e. when /developing/ the interface. The C compiler is /not/ needed when deploying.
The idea is to a) at development time, e.g. when a new variable is added to a SharedPool containing platform constants, a C program is autogenerated that outputs in some format a description of the names and values of all the constants defined in the pool. One convenient notation is e.g. STON. For the purposes of this discussion let's assume we're using ston, but any format the image an parse (or indeed a shared object the image can load on teh current pkatform) will do. The output of the autogenerated C program would be called something like <SharedPoolName>.<PlatformName>.ston, e.g. UnixConstants.MacOSX64.ston or UnixConstants.Linux32.ston. The ston files can easily be parsed by facilities in the Smalltalk image.
b) when deploying the system to a set of platforms one includes all the relevant platform-specific ston files.
OK. But let me ask something. Below you said "be it a plugin or a dll doesn't matter". To autogenerate the C program, I must know which header files to include for each platform and probably a few others things. For example, besides exporting the value, I would also like to export the sizeof(). At that depends how was the VM compiled, right? So...my question is...if such a autogenerated C code could be part of the VM building (considering all the settings being assume when building), cannot I reuse the knowledge the VM already has? Like which header files to include, if it was compiled 32 bits or 64 bits, which C compiler to use, etc..
I actually said that using text is easier than a dll. So I'm saying autogenerate a C program that outputs name-value pairs in some convenient textual representation, e.g. ston. But answering your question...
The knowledge in the VM as to what header files are included *applies only to the include files the VM uses*. The VM uses a subset of the platform. It doesn't for example include any headers that define a database interface. It doesn't include header files that define the interface to a UI tooklit such at GTK. Etc, etc. So in fact the VM *doesn't* include the knowledge one needs to determine the set of include files for an arbitrary FFI interface. And even so, the include files that it does use are in the VM's platform source files, and that information is not readily accessible.
Let me summarise. No, the VM cannot be used to determine the set of include files needed to generate constants used in an arbitrary FFI interface.
What I mean is if it would be easier if I take the SharedPool at VM
building time, and from there I autogenerate (and run) the C code that would generate the output. Then, when we "deploy" the VM, we can deploy it with relevant platform specific ston files as you said.
No. The VM is something that provides an FFI. It doesn't *define* an FFI. One must be able to develop an FFI interface without needing to rebuild the VM. So computing the values of constants should be *separate* from building a VM. Now let me give you more of an example.
Let's say we define a subclass of SharedPool called FFISharedPool. FFISharedPool 's job is to manage autogenerating a C file, compiling it for the platform, and organizing parsing the relevant output. Let's say we use a convention like class-side pragmas to define include files, and compiler flags. The VM provides two crucial pieces of information:
- the platform name
- the word size
One can't run a Mac OS VM on Linux, and one can't run a 64-bit VM on a 32-bit operating system. So taking this information from the VM accurately tells the current system what ABI (application binary interface) to use, and that's what's important in generating the right constants.
So we use these two pieces of information to index the method pragmas that tell us what specific files to include.
Let's imagine we subclass FFISharedPool to add a shared pool for constants for an SQL database. We might have a class declaration like
FFISharedPool subclass: #MYSQLInterface instanceVariableNames: '' classVariableNames: 'MYSQL_DEFAULT_AUTH MYSQL_ENABLE_CLEARTEXT_PLUGIN MYSQL_INIT_COMMAND MYSQL_OPT_BIND MYSQL_OPT_CAN_HANDLE_EXPIRED_PASSWORDS MYSQL_OPT_COMPRESS MYSQL_OPT_CONNECT_ATTR_DELETE MYSQL_OPT_CONNECT_ATTR_RESET' poolDictionaries: '' category: 'MYSQLInterface-Pools'
The job of FFISharedPool is to compute the right values for the class variables on every platform we want to deploy the MYSQL interface on.
So we need to know the relevant include files and C flags for each platform/word-size combination. A few of them might look like
MYSQLInterface class methods for platform information mac32 "I describe the include files and C flags to use when developing a 32-bit MYSQL FFI interface on Mac OS X" <platformName: 'Mac OS' wordSize: 4> <cFlags: #('-m32') includeFiles: #('/opt/mysql/include32')> ^self "all the info is in the pragmas"
mac64 "I describe the include files and C flags to use when developing a 64-bit MYSQL FFI interface on Mac OS X" <platformName: 'Mac OS' wordSize: 8> <cFlags: #('-m64') includeFiles: #('/opt/mysql/include64')>
The above might cause FFISharedPool to autogenerate files called MYSQLInterface.mac32.c & MYSQLInterface.mac64.c. And these, when run, might output ston notation to MYSQLInterface.mac32.ston & MYSQLInterface.mac64.ston (or maybe to stdout which has to be redirected to MYSQLInterface.mac32.ston; whatever).
Now, you might use pragmas, or you might answer a Dictionary instance. What ever style pleases you and seems convenient and readable. But these methods define the necessary metadata (C flags, include paths, and ...?) for FFISharedPool to autogenerate the C program that, when compiled with the supplied C flags and run on the current platform, outputs the values for the constants the shared pool wants to define.
You can get fancy and have FFISharedPool autogenerate the C programs whenever one adds or removes a constant name. Or you can require the programmer run something, e.g. MYSQLInterface generateInterfaces. It's really nice if FFISharedPool submits the file to the C compiler automatically, but this can only work for e.g. 32 & 64 bit versions on a single platform. You have to compile the autogenerated program on the relevant platform, with the necessary libraries and include files installed.
You could imagine a set of servers for different platforms so one could submit the autogenerated program for compilation and execution on each platform. That's a facility I'd make it easy to implement. I could imagine that a programmer whose company develops an FFI interface and deploys it on a number of platforms would love to be able to automate compiling and running the relevant autogenerated code on a set of servers. I could imagine the Pharo community providing a set of servers upon which lots of software is installed for precisely this purpose. That means that people could develop FFI interfaces without even having to have the C compiler installed on their platform.
You could also add a C parser to FFISharedPool that parses the post-preprocessed code and extracts function declarations. But the important thing is autogenerating the C program so that it generates easily parsable output containing the values for the constants. You can extend the system in interesting ways once you ave this core functionality implemented.
So once the program is autogenerated and compiled for the current platform, it is run and its output collected in a file whose name can be recognised by FFISharedPool.
Now the class side of FFISharedPool might be declared as
FFIShardPool class instanceVariableNames: 'platformName wordSize'
and on start-up FFIShardPool could examine its subclasses, and for each whose platformName & wordSize do not match the current platform, search for all the matching FOOInterface.plat.ston files, parse them and update the subclasses' variables, and update that pool's platformName & wordSize. It could emit a warning on the Transcript or stdout (headful vs headless) indicating which subclasses it couldn't find the relevant FOOInterface.plat.ston files for.
But the end result is that
a) providing the system is deployed with FOOInterface.plat.ston files for each interface and platform used, a cross-platform application can be deployed *that does not require a C compiler*. b) providing that a system's FOOInterface files have been initialized on the intended platform, a platform-specific application can be deployed for a single platform *without needing the ston files*.
Does this make more sense now?
c) at startup the image checks its current platform. If the platform is
the same that it was saved on, no action is taken. But if the platform as changed then the relevant ston file is selected, parsed, and the values for the variables in the shared pool updated to reflect the values of the current platform.
So the C compiler is only needed when developing the interface, not when deploying it.
OK
Then Nicolas made a point that if we plan to manage all that complexity at the image level it may become a hell too.
So.... what if we take a simpler (probably not better) approach and we consider the "c program that exports constants and sizes" a VM Plugin? Let's say we have a UnixPreprocessorPlugin (that would work for OSX, Linux and other's Unix I imagine for the time being) which provides a function (that is exported) which answers an array of arrays. For each constant, we include the name of the constant, the value, and the sizeof(). Then from image side, we simply do one FFI call, we get the large array and we adapt it to a SharedPool or whatever kind of object representing that info.
This is what I suggestred in teh first place. That what is autogenerated is a shared object (be it a plgin or a dll doesn't matter, it is machine code generated by a C compiler form an autogenerated C program compiled with the platform's C compiler) that can be loaded at run-time and interrogated to fetch the values of a set of variables
OK, got it. But still, it would be easier if the "platform" in this case is the "machine where we build the VM we will then distribute" right? i mean, I would like to put this in the CI jobs that automatically builds the VM, and not myself building for each platform.
NO! For example, why would a company that has some proprietary arithmetic package implemented in its secret labs in C or C++ and accessed through the FFI want to have that code on the Pharo community's build servers?
*I mean, my main doubt is whether this job of autogenerating the C code, compiling it, running it, exporting the text file, and distributing the text file with the VM could be done as part of the VM build.*
For fuck's sake. Developing an FFI is not something one does when building a VM. It is something one does when using the system. If you want to do this you *use a plugin*. The FFI is a different beast. It is to allow programmers to interface to external libraries that are *independent from the VM*.
I'm not going to answer this one again. OK?
But I think that the textual notation suggested above is simpler. The text files are easier to distribute and change. Shared objects and plugins have a habit of going stale, and there needs to be metadata in there to describe the set of constants etc, which is tricky to generate and parse because it is binary (pointer sizes, etc, etc). Instead a simple textual format should be much more robust. One could even edit it by hand to add new constants. It would be easy to make the textual file a versioned file. Etc, etc.
OK. Got it. And do you think using X Macros for the autogenerated C (from the SharedPool) is a good idea? And then I simply write a text file out of it.
I know that different users will need different constants. But let's say the infrastructure (plugin etc) is already done. And let's say I am a user who wants to build something with FFI and I need some constants that I see are not defined. Then I can simply add the ones I need to the plugin, and the next VM release will have those. If Cog gets moved to GitHub, then this is even easier. Everybody can do a PR with the constants they need. And in fact, if we have the infrastructure in place, I think that if each of us spends half an hour, we may have almost everything we need.
For example, I can add all those for signals myself (to use kill() from FFI), all those from fcntl (to make non-blocking pipes), all those from the wait()/waitpid() family (so that I can do a waitpid() with WNOHANG), etc.
I know it's not the best approach but it's something that could be done very easily and would allow A LOT of stuff to be moved to FFI just because we have no access to preprocess constants or sizeof() (to know how to allocate). I also know this won't cover macros and other stuff. But still.
If you think this is a good idea, I can spend the time to do it.
Cheers,
On Thu, May 10, 2012 at 10:09 AM, Nick Ager nick.ager@gmail.com wrote:
<snip> Well, like opendbx, maybe because opengl has quite standard interface... </snip>
and
<snip> It's not that it's not doable, it's that we gonna reinvent gaz plant and it gonna be so boring... I'd like to see a proof of concept, even if we restrict to libc, libm, kernel.dll, msvcrt.dll ... </snip>
<snip> Is the unix style select() ubiquitous or should I use WaitForMultipleObjects() on Windows? Are the specifications of read/write stream implementations machine independent (bsd/sysv/others...)? </snip>
Perhaps *a* way forward is to try to find existing projects which have already created cross-platform abstractions for platform specific functionality. Then we can use FFI to access that interface in a similar way to OpenGL and OpenDBX. For example NodeJs works across unixes - perhaps they have a useful cross-platform abstraction, boost has abstractions of IPC etc
Nick
-- Mariano http://marianopeck.wordpress.com
-- _,,,^..^,,,_ best, Eliot
Let's say we define a subclass of SharedPool called FFISharedPool. FFISharedPool's job is to manage autogenerating a C file, compiling it for the platform, and organizing the parsing of the relevant output. Let's say we use a convention like class-side pragmas to define include files and compiler flags. The VM provides two crucial pieces of information:
- the platform name
- the word size
One can't run a Mac OS VM on Linux, and one can't run a 64-bit VM on a 32-bit operating system. So taking this information from the VM accurately tells the current system what ABI (application binary interface) to use, and that's what's important in generating the right constants.
So we use these two pieces of information to index the method pragmas that tell us what specific files to include.
Let's imagine we subclass FFISharedPool to add a shared pool for constants for an SQL database. We might have a class declaration like
FFISharedPool subclass: #MYSQLInterface
    instanceVariableNames: ''
    classVariableNames: 'MYSQL_DEFAULT_AUTH MYSQL_ENABLE_CLEARTEXT_PLUGIN MYSQL_INIT_COMMAND MYSQL_OPT_BIND MYSQL_OPT_CAN_HANDLE_EXPIRED_PASSWORDS MYSQL_OPT_COMPRESS MYSQL_OPT_CONNECT_ATTR_DELETE MYSQL_OPT_CONNECT_ATTR_RESET'
    poolDictionaries: ''
    category: 'MYSQLInterface-Pools'
The job of FFISharedPool is to compute the right values for the class variables on every platform we want to deploy the MYSQL interface on.
So we need to know the relevant include files and C flags for each platform/word-size combination. A few of them might look like
MYSQLInterface class methods for platform information

mac32
    "I describe the include files and C flags to use when developing a 32-bit MYSQL FFI interface on Mac OS X"
    <platformName: 'Mac OS' wordSize: 4>
    <cFlags: #('-m32') includeFiles: #('/opt/mysql/include32')>
    ^self "all the info is in the pragmas"

mac64
    "I describe the include files and C flags to use when developing a 64-bit MYSQL FFI interface on Mac OS X"
    <platformName: 'Mac OS' wordSize: 8>
    <cFlags: #('-m64') includeFiles: #('/opt/mysql/include64')>
The above might cause FFISharedPool to autogenerate files called MYSQLInterface.mac32.c & MYSQLInterface.mac64.c. And these, when run, might output ston notation to MYSQLInterface.mac32.ston & MYSQLInterface.mac64.ston (or maybe to stdout which has to be redirected to MYSQLInterface.mac32.ston; whatever).
Now, you might use pragmas, or you might answer a Dictionary instance. What ever style pleases you and seems convenient and readable. But these methods define the necessary metadata (C flags, include paths, and ...?) for FFISharedPool to autogenerate the C program that, when compiled with the supplied C flags and run on the current platform, outputs the values for the constants the shared pool wants to define.
You can get fancy and have FFISharedPool autogenerate the C programs whenever one adds or removes a constant name. Or you can require the programmer run something, e.g. MYSQLInterface generateInterfaces. It's really nice if FFISharedPool submits the file to the C compiler automatically, but this can only work for e.g. 32 & 64 bit versions on a single platform. You have to compile the autogenerated program on the relevant platform, with the necessary libraries and include files installed.
You could imagine a set of servers for different platforms so one could submit the autogenerated program for compilation and execution on each platform. That's a facility I'd make easy to implement. I could imagine that a programmer whose company develops an FFI interface and deploys it on a number of platforms would love to be able to automate compiling and running the relevant autogenerated code on a set of servers. I could imagine the Pharo community providing a set of servers upon which lots of software is installed for precisely this purpose. That means that people could develop FFI interfaces without even having to have the C compiler installed on their platform.
You could also add a C parser to FFISharedPool that parses the post-preprocessed code and extracts function declarations. But the important thing is autogenerating the C program so that it generates easily parsable output containing the values for the constants. You can extend the system in interesting ways once you have this core functionality implemented.
So once the program is autogenerated and compiled for the current platform, it is run and its output collected in a file whose name can be recognised by FFISharedPool.
Hi Eliot,
OK, I have currently a very first prototype where I can autogenerate the C file from a FFISharedPool subclass, compile it, run it and get the ston file output. Please, read below.
Now the class side of FFISharedPool might be declared as
FFISharedPool class instanceVariableNames: 'platformName wordSize'
and on start-up FFISharedPool could examine its subclasses, and for each whose platformName & wordSize do not match the current platform, search for all the matching FOOInterface.plat.ston files, parse them and update the subclasses' variables, and update that pool's platformName & wordSize. It could emit a warning on the Transcript or stdout (headful vs headless) indicating which subclasses it couldn't find the relevant FOOInterface.plat.ston files for.
But the end result is that
a) providing the system is deployed with FOOInterface.plat.ston files for each interface and platform used, a cross-platform application can be deployed *that does not require a C compiler*.
b) providing that a system's FOOInterface files have been initialized on the intended platform, a platform-specific application can be deployed for a single platform *without needing the ston files*.
I was thinking the following. Having to distribute the FFI wrapper (take as an example the MySQL wrapper) with the .ston files is a bit of a pain with MC. So I was thinking... what if FFISharedPool has all the machinery to allow the developer of an FFI library wrapper (the developer of the MySQL wrapper) to autogenerate the ston file as we said, BUT the ston file is stored as methods in the MYSQLInterface subclass? Probably under an "autogenerated" protocol. That way, it's very easy to distribute and in addition, at system startup it's easier to "search" for the "ston files".
The only drawback is that for very large ston files MC will suffer a bit.. but..
Thoughts?
Does this make more sense now?
c) at startup the image checks its current platform. If the platform is the same as the one it was saved on, no action is taken. But if the platform has changed, then the relevant ston file is selected and parsed, and the values of the variables in the shared pool are updated to reflect the values for the current platform.
So the C compiler is only needed when developing the interface, not when deploying it.
OK
Hi
2016-01-21 13:43 GMT+01:00 Mariano Martinez Peck marianopeck@gmail.com:
I was thinking the following. Having to distribute the FFI wrapper (take as an example the MySQL wrapper) with the .ston files is a bit of a pain with MC. So I was thinking... what if FFISharedPool has all the machinery to allow the developer of an FFI library wrapper (the developer of the MySQL wrapper) to autogenerate the ston file as we said, BUT the ston file is stored as methods in the MYSQLInterface subclass? Probably under an "autogenerated" protocol. That way, it's very easy to distribute and in addition, at system startup it's easier to "search" for the "ston files".
The only drawback is that for very large ston files MC will suffer a bit.. but..
Thoughts?
After reading this thread I still do not understand why the "platform constants" information should be distributed as ston files. Why not generate Smalltalk classes or methods for each platform? They would initialize the pool variables directly when a platform change happens. And there would be no problems with their distribution.
2016-01-21 15:18 GMT+01:00 Denis Kudriashov dionisiydk@gmail.com:
In your example it can be methods on the MYSQLInterface class side, like:
MYSQLInterface class>>initWindows64Declarations
    ConstantA := 1. ConstantB := 5
MYSQLInterface class>>initWindows32Declarations
    ConstantA := 2. ConstantB := 6
MYSQLInterface class>>initUnix32Declarations
    ConstantA := 3. ConstantB := 7
And these methods can be published in separate packages if needed
On Thu, Jan 21, 2016 at 6:24 AM, Denis Kudriashov dionisiydk@gmail.com wrote:
In your example it can be methods on the MYSQLInterface class side, like:
MYSQLInterface class>>initWindows64Declarations
    ConstantA := 1. ConstantB := 5
MYSQLInterface class>>initWindows32Declarations
    ConstantA := 2. ConstantB := 6
MYSQLInterface class>>initUnix32Declarations
    ConstantA := 3. ConstantB := 7
And these methods can be published in separate packages if needed
Let's measure this. Let's say we have 8 platforms (that's an underestimate, because different Linux distributions may have different values for certain constants), but 8, which is 4 basic platforms times 32- & 64-bits. We have Mac x86 32-bit, Mac x64 64-bit, Windows x86 32-bit, Windows x64 64-bit, Linux x86 32-bit, Linux ARM 32-bit, Linux x64 64-bit, and soon enough there will be more. Further, there may be different versions over time.
So each of those initialization methods has
- 1 slot for the global variable to be assigned
- 1 slot for the literal value to assign to it
- 3 bytes of bytecode per initialization for small methods, 4 for large methods. Let's say 4.
So the overhead in 32-bits is 12 bytes per constant (two 4-byte slots plus 4 bytes of bytecode), and in 64-bits is 20 bytes (two 8-byte slots plus 4 bytes of bytecode). So the overhead per constant across all 8 platforms is 96 bytes in 32-bits and 160 bytes in 64-bits. A full system with sockets, files, a database connexion etc could easily exceed 100 constants. I think it would be nearer 1000. So the overheads are in the 10- to 100-k byte range (100k ~= 0.5% of the image) on 32-bits. That's low but it's also pure overhead. Every GC has to visit them. Every senders and implementors search has to visit them, but they offer nothing of value. Whereas the small parser for whatever notation is used to store the constants externally (if they are needed in a given deployment) has a small constant overhead; it's simple code.
Further, you still need the machinery to export the constants to be able to generate these initialization methods. If you've got the machinery and you don't need the methods why bother to generate the methods?
As the Scots say, many a mickle makes a muckle. _,,,^..^,,,_ best, Eliot
2016-01-22 22:35 GMT+01:00 Eliot Miranda eliot.miranda@gmail.com:
Let's measure this. Let's say we have 8 platforms (that's an underestimate, because different Linux distributions may have different values for certain constants), but 8, which is 4 basic platforms times 32- & 64-bits. We have Mac x86 32-bit, Mac x64 64-bit, Windows x86 32-bit, Windows x64 64-bit, Linux x86 32-bit, Linux ARM 32-bit, Linux x64 64-bit, and soon enough there will be more. Further, there may be different versions over time.
So each of those initialization methods has
- 1 slot for the global variable to be assigned
- 1 slot for the literal value to assign to it
- 3 bytes of bytecode per initialization for small methods, 4 for large methods. Let's say 4.
So the overhead in 32-bits is 12 bytes per constant (two 4-byte slots plus 4 bytes of bytecode), and in 64-bits is 20 bytes (two 8-byte slots plus 4 bytes of bytecode). So the overhead per constant across all 8 platforms is 96 bytes in 32-bits and 160 bytes in 64-bits. A full system with sockets, files, a database connexion etc could easily exceed 100 constants. I think it would be nearer 1000. So the overheads are in the 10- to 100-k byte range (100k ~= 0.5% of the image) on 32-bits. That's low but it's also pure overhead. Every GC has to visit them. Every senders and implementors search has to visit them, but they offer nothing of value. Whereas the small parser for whatever notation is used to store the constants externally (if they are needed in a given deployment) has a small constant overhead; it's simple code.
Further, you still need the machinery to export the constants to be able to generate these initialization methods. If you've got the machinery and you don't need the methods why bother to generate the methods?
As the Scots say, many a mickle makes a muckle.
Thanks, Eliot, for such a detailed explanation. It makes sense. But personally I prefer the Smalltalk solution, although Smalltalk itself is pure overhead compared to C.
My question was raised by Mariano's idea to save the ston files in methods. I think it can reduce the problems you described. But then literal array syntax may be more suitable than ston.
Hi Denis,
On Jan 23, 2016, at 7:30 AM, Denis Kudriashov dionisiydk@gmail.com wrote:
Thanks, Eliot, for such a detailed explanation. It makes sense. But personally I prefer the Smalltalk solution, although Smalltalk itself is pure overhead compared to C.
I can see the draw of pure Smalltalk. Simplicity and browsability. But imagine a tiny headless image deployed on containers, say 2 MB. Now 100 KB of initialization code doesn't look so good :-). Anyway I'm beating a dead horse. Mariano is generating initialization methods.
My question was raised by Mariano's idea to save the ston files in methods. I think it can reduce the problems you described. But then literal array syntax may be more suitable than ston.
I just want to be clear, I'm neutral about the notation used to export info from the C file. Literal array syntax, chunk source format, ston, xml. It doesn't matter as long as it's convenient for expressing an attribute dictionary from names to attributes such as value, size, offset. Don't get hung up on the specific notation. If one were to go with the external file the only real requirements are that it be reasonably compact and quick to parse. That might kill xml but leave plenty of other candidates.
_,,,^..^,,,_ (phone)
On Thu, Jan 21, 2016 at 6:18 AM, Denis Kudriashov dionisiydk@gmail.com wrote:
Hi
After reading this thread I still do not understand why the "platform constants" information should be distributed as ston files. Why not generate Smalltalk classes or methods for each platform? They would initialize the pool variables directly when a platform change happens. And there would be no problems with their distribution.
Because one doesn't need the methods that initialize them, one only needs the values. The initialization methods are overhead. We discussed the issue earlier in the thread. See Thierry's message and my response. _,,,^..^,,,_ best, Eliot
vm-dev@lists.squeakfoundation.org