[Vm-dev] Interrupted system call?

Andrew Gaylard ag at computer.org
Wed Feb 11 09:56:27 UTC 2009


Ian: I assume you mean this:

http://www.stanford.edu/~stinson/cs240/cs240_1/WIB.txt

Thanks -- it was an interesting read!

On Wed, Feb 11, 2009 at 6:11 AM, Andreas Raab <andreas.raab at gmx.de> wrote:

> There is obviously a lot I don't understand about interrupt handling on Unix
> since your description (and the stuff that I found looking for the PC-losering
> problem) don't make much sense to me ;-)
>
> If I understand you correctly, then the program is in a system call, then an
> interrupt happens and for some inexplicable reason that means the OS has to
> back out of the system call. Why is that? Wouldn't it be more sensible to
> just delay delivering the interrupt up to the point where the syscall
> returns? Yes, it doesn't guarantee real-time response but then there is
> probably more than one process running at any given time anyway so I
> wouldn't expect interrupts to be delivered real-time to user land anyway.
> And I *really* can't fathom the thought that any interrupt that happens for
> a process within a syscall somehow auto-magically leads to the kernel
> forgetting the state associated with the call ;-)

Andreas: I'll try to clarify this for you.  The situation is this:

a. a user-level process makes a kernel call which might take a while,
typically for I/O.

b. an interrupt arrives and is delivered to the user-level process.  In
the Unix world, this is a software interrupt, and is called a "signal";
hardware interrupts are handled by the kernel and are not visible
to user-level processes.

c. when delivering the signal to the user-level process, the kernel
needs to make a choice: should it
(1) call the signal handler and, when it returns, cancel the I/O
operation and return an error (in this case, EINTR); or
(2) call the signal handler and, when it returns, restart/resume
the I/O operation?

d. once the signal handler has run, should the kernel
(3) leave it up to the user-level process to re-instate the handler; or
(4) re-instate the signal handler itself?

All Unices with signal handling support options (1) and (3).
Modern Unices (since at least 12 years ago) also support options
(2) and (4).

From what I can tell, Squeak assumes a bit of both models.
- it does not specify that system calls should be restarted.
- it does not re-instate the handler (i.e. it assumes that the kernel will).
- it does not in every case do the manual check for EINTR and restart,
as John mentioned in his post.
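
For reference, the manual check in the last point usually takes the form
of a retry loop around each blocking call.  A minimal sketch, assuming a
plain read() on a file descriptor; the helper name read_retrying is
hypothetical and not taken from the Squeak sources:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not from the Squeak sources): call read() and
 * retry for as long as it fails only because a signal interrupted it. */
static ssize_t read_retrying(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n < 0 && errno == EINTR);   /* EINTR: interrupted, try again */

    return n;   /* bytes read, 0 at end of file, or -1 on a real error */
}
```

Every blocking call (read, write, accept, and so on) would need the same
treatment, which is why this approach is easy to get wrong in a large
codebase.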

My approach is to fix the first two, thus avoiding having to fix the
third.  It should be simple, given that there are only a few calls to
signal() in the codebase: sqUnixMain.c, aio.c, UnixOSProcessPlugin.c.
It should be possible to replace each signal() call with a sigaction()
call that sets SA_RESTART, fixing both the syscall restarting problem and
the handler-reinstating problem (sigaction() leaves the handler installed
unless SA_RESETHAND is requested).  This way we don't have to go through
the entire codebase looking for I/O operations and checking for EINTR,
restarting, etc.  And we don't have to remember to add it to any
new code we might write in the future.
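
A minimal sketch of such a replacement, under the assumption that each
existing call has the shape signal(sig, handler).  The names
install_handler, on_signal and got_signal are hypothetical and are not
taken from the Squeak sources:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical example handler: just records which signal arrived. */
static volatile sig_atomic_t got_signal;
static void on_signal(int sig) { got_signal = sig; }

/* Hypothetical stand-in for signal(sig, handler).
 * Returns 0 on success, -1 on failure (errno set by sigaction). */
static int install_handler(int sig, void (*handler)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);    /* block no extra signals in the handler */
    sa.sa_flags = SA_RESTART;    /* restart interrupted system calls */
    /* SA_RESETHAND is deliberately not set, so the handler stays
     * installed after it has run. */
    return sigaction(sig, &sa, NULL);
}
```

A call such as install_handler(SIGUSR1, on_signal) would then take the
place of signal(SIGUSR1, on_signal).  Note that a few calls (select() and
poll(), for example, on Linux) can still return EINTR even with
SA_RESTART set, so this may not remove every manual check.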

This whole issue is complicated further in that signal() on certain Unices
(including FreeBSD, Linux, and MacOS) will restart syscalls automatically,
and on others it won't (System V, including Solaris and -- I think -- HP-UX).
And certain Unices will re-instate the handler automatically, and others
won't.  So the existing code may be broken yet appear to work, if you
happen to be on the "right" platform.
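
The "wrong" platform's behaviour can be reproduced anywhere with
sigaction().  The sketch below assumes, following the Linux signal(2)
manual page, that System V signal() semantics correspond to
sa_flags = SA_RESETHAND | SA_NODEFER (and BSD semantics to SA_RESTART).
The function and variable names are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical example handler: counts deliveries. */
static volatile sig_atomic_t hits;
static void count_hit(int sig) { (void)sig; hits++; }

/* Install a handler with System V style semantics: the disposition is
 * reset to SIG_DFL when the signal is delivered, and interrupted system
 * calls are not restarted (no SA_RESTART). */
static int install_sysv_style(int sig, void (*handler)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND | SA_NODEFER;
    return sigaction(sig, &sa, NULL);
}

/* Return 1 if `handler` is currently installed for `sig`, else 0. */
static int handler_still_installed(int sig, void (*handler)(int))
{
    struct sigaction cur;

    if (sigaction(sig, NULL, &cur) != 0)
        return 0;
    return cur.sa_handler == handler;
}
```

After one delivery of the signal, handler_still_installed() reports 0 for
a handler installed this way, so a second signal gets the default action
unless the handler re-instates itself.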

Another thing: signals may arrive from external processes (e.g. the kill
command) or from Squeak itself.  aio.c asks the kernel to notify it when
there is I/O available for reading/writing/etc.  When there's I/O ready,
the kernel sends SIGIO; when SIGIO arrives, Squeak jumps to
forceInterruptCheck.

- Andrew

