[squeak-dev] Re: [Vm-dev] [OSProcess] forking and file descriptors
maxleske at gmail.com
Fri Jan 9 08:32:53 UTC 2015
(Resending with proper subject…)
> On 08 Jan 2015, at 20:48, squeak-dev-request at lists.squeakfoundation.org wrote:
> Date: Thu, 8 Jan 2015 16:56:30 +0100
> From: Henrik Johansen <henrik.s.johansen at veloxit.no>
> Subject: [squeak-dev] Re: [Vm-dev] [OSProcess] forking and file
>> On 08 Jan 2015, at 11:37, Max Leske <maxleske at gmail.com> wrote:
>> We currently use ImageSegment to create snapshots of our object graphs. To ensure consistency (and for performance reasons) we create a fork of the image and then run the segment creation in the fork. We’ve always had minor issues with TCP sockets but they are pretty rare and have never corrupted any data (we close the TCP connections in the child).
>> Recently, however, we created a new application which also makes heavy use of a database, and now it seems that forking creates a real problem. In anticipation of possible problems I opted to destroy all sockets (with Socket>>destroy) in the fork, thinking that, since all file descriptors are copies of the ones in the parent process, the sockets in the parent process should be unaffected (see the fork(2) and clone(2) man pages linked below).
>> With that mechanism in place, however, we are seeing very weird things, such as multiple sockets in the parent (!) having the same file handle (which leads to the wrong data being read from the database and, in turn, corrupt objects).
>> AFAICT, the OSProcess plugin doesn’t offer any way of dealing with such problems so I was wondering if anybody has had any experience with these kinds of issues and whether there is some kind of best practice.
>> I am aware that the simplest option is to close the sockets in the parent before forking, but that would mean waiting for all database connections to finish executing and then blocking them to prevent new connections to the database. Depending on the time a query takes (which may well be a couple of seconds in our case), clients would need to wait for quite a long time before their request can be answered (and this scenario of course assumes that we only close the database sockets and leave the TCP sockets open…).
>> So under the condition that I need to fork that image, what is the best way to deal with open file descriptors?
>> Thanks for your time.
>> http://man7.org/linux/man-pages/man2/fork.2.html
>> http://man7.org/linux/man-pages/man2/clone.2.html
> If I understand the source correctly (at least on Unix, https://github.com/pharo-project/pharo-vm/blob/master/platforms/unix/plugins/SocketPlugin/sqUnixSocket.c):
> The socketHandle in a Socket instance is a pointer to a private (platform-specific) struct.
> That struct again has a handle to the native socket, which I assume is what gets copied when you fork a process?
> Socket >> primDestroySocket frees the memory pointed to by socketHandle.
Hm… that gives me an idea: assume that everything works as advertised and that the child process gets copies of the socket descriptors (which can be closed safely without interfering with the parent). If I’m right, the Socket instances in the image hold on to the address of the *parent* handle (in an inst var). So now, when I close a socket with #primSocketDestroy:, the handle passed to the plugin will be the handle of the parent socket (although it sounds strange that the child should be able to close a file descriptor of its parent…).
That would mean that I must not close any sockets in the child. One option, it seems to me, is to suspend all processes that use sockets. Terminating them might pose another problem: if socket destruction is part of an unwind block in one of the processes (e.g. TCP connections in Seaside), then sockets will be destroyed during termination.
Another option: set all the socket handles to nil, then terminate the processes (yes ugly, but it might just work…).
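
The hypothesis above can be checked at the C level with a small sketch. It assumes a plain fork() and uses a hypothetical struct, private_socket, as a stand-in for the plugin's private per-socket struct (the real layout in sqUnixSocket.c is not reproduced here). After fork() the child sees the same pointer value, but it refers to the child's own copy-on-write copy of the memory, so overwriting or freeing it in the child should leave the parent's struct untouched.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the plugin's private per-socket struct. */
struct private_socket {
    int fd;     /* would hold the native descriptor; 42 is just a marker here */
    int state;  /* placeholder field */
};

/* Returns 1 if the parent's struct is unchanged after the child overwrote
 * and freed its own copy, 0 if it changed, -1 on error. */
int parent_struct_survives_child_free(void)
{
    struct private_socket *handle = malloc(sizeof *handle);
    if (handle == NULL)
        return -1;
    handle->fd = 42;
    handle->state = 1;

    pid_t pid = fork();
    if (pid < 0) {
        free(handle);
        return -1;
    }
    if (pid == 0) {
        /* Child: same address, but a private copy of the memory. */
        handle->fd = -1;
        handle->state = 0;
        free(handle);
        _exit(0);
    }

    /* Parent: wait until the child is done, then inspect our own struct. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(handle);
        return -1;
    }
    int unchanged = (handle->fd == 42 && handle->state == 1);
    free(handle);
    return unchanged;
}
```

Calling parent_struct_survives_child_free() from a test program should return 1 on a system with ordinary fork() semantics. This only covers process memory; it says nothing about what the plugin does with the descriptor stored in the struct.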
> So, are you using clone or fork to create a fork of the image?
OSProcess uses plain fork() (in forkSqueak()). That’s what I use from the image.
> If their memory is shared (clone) instead of copied (fork), you might be kicking the feet out from under the parent image as well, so to speak…
From my understanding the file descriptors should be copied (fork). So that shouldn’t happen (but see above…).
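
A minimal sketch of that expectation, assuming a plain fork() and using a socketpair() as a stand-in for a real database connection (the function name is made up for illustration): the child closes its inherited descriptors, and the parent then checks that its own descriptors still carry data.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent can still send and receive on its sockets after
 * the child closed its inherited copies, 0 if not, -1 on setup error. */
int parent_socket_survives_child_close(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: close() only drops the child's own descriptor table entries. */
        close(sv[0]);
        close(sv[1]);
        _exit(0);
    }

    /* Parent: make sure the child has finished closing before testing. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    const char msg[] = "ping";
    char buf[sizeof msg] = {0};
    int ok = write(sv[0], msg, sizeof msg) == (ssize_t)sizeof msg
          && read(sv[1], buf, sizeof buf) == (ssize_t)sizeof msg
          && memcmp(buf, msg, sizeof msg) == 0;

    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

With ordinary fork() semantics this should return 1. Note that parent and child descriptors still refer to the same open file description, so a call that acts on the connection itself rather than on the descriptor, such as shutdown(), would be visible in the parent too. Whether Socket>>destroy ends up doing anything beyond close() is not something this sketch answers.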